Definitions
• Data mining (knowledge discovery in databases):
– Extraction of interesting (non-trivial, implicit, previously unknown and potentially useful) information or patterns from data in large databases
• Data mining helps end users extract useful business information from large databases
• Data mining is the exploration and analysis of large quantities of data in order to discover meaningful patterns and rules.
• The goal of data mining may be to allow a corporation to improve its marketing, sales, and customer support operations through a better understanding of its customers.
Lecture 8
2
Intro to Data Mining
Definitions cont’d
• The non-trivial extraction of novel, implicit, and actionable knowledge from large datasets.
– Extremely large datasets
– Discovery of the non-obvious
– Useful knowledge that can improve processes
– Cannot be done manually
• Technology to enable data exploration, data analysis, and data visualisation of very large databases at a high level of abstraction, without a specific hypothesis in mind.
What is Data Mining and its purpose?
• Search for relationships and global patterns that exist in large databases but are hidden in the vast amounts of data.
• Analyst combines knowledge of data and machine learning technologies to discover nuggets of knowledge hidden in the data.
• Serendipity to science.
• Easier and more effective when the organization has accumulated as much data as possible, such as with a data warehouse.
• A data warehouse is not a prerequisite to data mining.
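As a rough illustration of "searching for relationships hidden in the data", the sketch below counts how often pairs of items appear together in a handful of transactions and keeps the pairs that co-occur often. The transactions, the function name pair_counts, and the threshold of 3 are all invented for this example; real data mining tools work on far larger data and use more refined measures.

```python
from collections import Counter
from itertools import combinations

# Hypothetical toy transactions, invented for illustration only.
transactions = [
    {"bread", "milk"},
    {"bread", "butter", "milk"},
    {"beer", "bread"},
    {"butter", "milk"},
    {"bread", "butter", "milk"},
]

def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # Sort so each pair is always stored in the same order.
        for pair in combinations(sorted(basket), 2):
            counts[pair] += 1
    return counts

# Assumed threshold: keep pairs seen in at least 3 of the 5 baskets.
MIN_COUNT = 3
counts = pair_counts(transactions)
frequent = {pair: c for pair, c in counts.items() if c >= MIN_COUNT}

for pair, c in sorted(frequent.items()):
    print(pair, c)
```

On this toy data the pairs that survive the threshold are (bread, milk) and (butter, milk), a small-scale version of the kind of pattern an analyst would then examine for business relevance.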
Data Mining and Other Disciplines
Sample Data Mining Applications
• Commercial:
– Fraud detection: Identify fraudulent transactions
– Loan approval: Establish the creditworthiness of a customer requesting a loan
– Investment analysis: Predict a portfolio's return on investment
–