IT433 Data Warehousing and Data Mining - Data Preprocessing - 1 Data Preprocessing • Why preprocess the data? • Descriptive data summarization • Data cleaning • Data integration and transformation • Data reduction • Discretization and concept hierarchy generation • Summary 2 Why Data Preprocessing? • Data in the real world is dirty – incomplete: lacking attribute values, lacking certain attributes of interest, or containing only aggregate data • e.g., occupation=“ ”
Premium Data analysis Data management Data mining
University CS 450 Data Mining, Fall 2014 Take-Home Test N#1 Date: September 22nd, 2014 Final deadline for submission: September 29th, 2014 Weighting: 5% Total number of points: 100 Instructions: 1. Attempt all questions. 2. This is an individual test. No collaboration is permitted for assessment items. All submitted materials must be a result of your own work. Part I Question 1 [20 points] Discuss whether or not each of the following activities is a data mining task.
Premium Data mining Data analysis Data
Handling Consumer Data Introduction When I visit my local Caltex Woolworths petrol station on “cheap fuel Wednesday” to cash in the 8c per litre credit that my wife earned the previous Friday buying the groceries with our “Everyday Rewards” card, I did not, until researching this report, have any clue as to the contribution I was making to a database of frightening proportions and possibilities… nor that, when I also “decide” to pick up the on-sale, strategically-placed 600mL choc-milk, I am
Premium Marketing Loyalty program
Data Projectors Amy Shipman $50-$66,525 What is a data projector? It is “a device that projects computer output onto a white or silver fabric screen that is wall, ceiling or tripod mounted.” The three most common types of data projectors are LCD, DLP, and LCoS. Each type of projector will project your audio and video; they just process the output in different ways. DLP stands for Digital Light Processing. This type of data projector has a light that
Premium Liquid crystal display Digital Light Processing Video projector
Data warehousing is the process of collecting data in raw form for analyzing trends. The benefits of data warehousing are improved end-user access, increased data consistency, the ability to produce various kinds of reports from the collected data, the gathering of data from separate sources into a common place, and additional documentation of data. Other potential benefits are lower computing costs, increased productivity, the ability of end users to query the database without adding overhead to the operational systems, and the creation of an infrastructure
Premium Data warehouse Data mining Database management system
Data Collection QNT/351 July 10, 2014 There are many times when companies have to collect data to come to a conclusion about an issue. The data may be collected from their employers, their competition or their consumers. BIMS saw that its average turnover was larger than what the company had seen in the past. Human Resources decided to conduct a survey to see what had changed in the company from the employees’ point of view. They attached
Premium Qualitative research Level of measurement Scientific method
Data Gathering ➢ used to discover business information details to define the information structure ➢ helps to establish the priorities of the information needs ➢ further leads to opportunities to highlight key issues which may cross functional boundaries or may touch on policies or the organization itself ➢ highlighting systems or enhancements that can quickly satisfy cross-functional information needs ➢ a complicated task especially in a large and complex system ➢ must
Free Interview Semi-structured interview Documentary film techniques
Components of DSS (Decision Support System): Data Store – The DSS Database; Data Extraction and Filtering; End-User Query Tool; End-User Presentation Tools. Operational data: stored in a normalized relational database; supports transactions that represent daily operations (not query friendly). Differences with DSS: 3 main differences (time span, granularity, dimensionality). Operational vs. DSS. Time span: real time vs. historic; current transaction; short time frame vs. long time frame; specific data facts vs. patterns. Granularity: Specific
Premium Data warehouse
Increase Your Data Center Energy Efficiency Key Best Practices Optimize the Central Plant Quick Start Guide to Increase Data Center Energy How To Start A Problem That You Can Fix Data Center energy efficiency is derived from addressing BOTH your hardware equipment AND your infrastructure. Commit to Improved Design and Operations
Premium Data center
Turning data into information © Copyright IBM Corporation 2007 Course materials may not be reproduced in whole or in part without the prior written permission of IBM. 4.0.3 Unit objectives After completing this unit, you should be able to: Explain how Business and Data are correlated Discuss the concept of turning data into information Describe the relationships between DW, BI, and Data Insight Identify the components of a DW architecture Summarize the Insight requirements and goals of
Premium Data warehouse Business intelligence Data management