Omm Data Cleaning

Data Cleansing/Scrubbing
To improve the quality of organizational information, and thus the effectiveness of decision making, businesses must formulate a strategy to keep information clean. Information cleansing, or scrubbing, is a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information.
Specialized software tools use sophisticated algorithms to parse, standardize, correct, match and consolidate data warehouse information. This is vitally important because data warehouses often contain information from several different databases, some of which can be external to the organization.
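As a rough illustration of the standardize, match, and consolidate steps, the sketch below merges customer rows from two hypothetical source databases. The field names (name, email, phone), the sample rows, and the choice of email as the matching key are assumptions made for this example; commercial cleansing tools use far more sophisticated matching.

```python
import re


def standardize(record):
    """Return a copy of the record with consistently formatted fields."""
    name = " ".join(record["name"].split()).title()     # collapse spaces, fix case
    email = record["email"].strip().lower()             # emails compared case-insensitively
    phone = re.sub(r"\D", "", record.get("phone", ""))  # keep digits only
    return {"name": name, "email": email, "phone": phone}


def consolidate(records):
    """Match records on standardized email and merge duplicates,
    filling empty fields from later records."""
    merged = {}
    for raw in records:
        rec = standardize(raw)
        key = rec["email"]
        if key not in merged:
            merged[key] = rec
            continue
        for field, value in rec.items():
            if not merged[key][field]:
                merged[key][field] = value
    return list(merged.values())


# Hypothetical rows from two source databases (assumed for illustration)
crm = [{"name": "  ada   LOVELACE", "email": "Ada@Example.com ", "phone": ""}]
billing = [{"name": "Ada Lovelace", "email": "ada@example.com", "phone": "(555) 010-0199"}]

clean = consolidate(crm + billing)
print(clean)
```

Matching on a single exact key keeps the example short. Real warehouse data usually needs fuzzier matching, because the same customer can appear under different emails or misspelled names.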
In a data warehouse, information cleansing occurs first during the ETL process and second on the information once it is in the data warehouse. Companies can choose information cleansing software from several different vendors, including Oracle, SAS, Ascential Software, and Group1 Software. Ideally, scrubbed information is error-free and consistent.

Textbook: Business Driven Technology, Baltzan/Phillips, pages 100-101

Definition: Data Cleaning

A process used to identify inaccurate, incomplete, or unreasonable data and then improve its quality by correcting the detected errors and omissions. The process may include format checks, completeness checks, reasonableness checks, limit checks, review of the data to identify outliers (geographic, statistical, temporal, or environmental) or other errors, and assessment of the data by subject area experts (e.g., taxonomic specialists). These processes usually result in flagging, documenting, and subsequently checking and correcting suspect records. Validation checks may also involve checking for compliance with applicable standards, rules, and conventions.
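The checks named in this definition can be sketched in a few lines of code. The example below is a minimal illustration, not a prescribed method: the record fields (id, collected_on, latitude), the sample rows, and the reference date are all assumptions. It flags suspect records rather than changing them, so that they can be documented and reviewed afterwards.

```python
from datetime import date


def check_record(rec, today):
    """Return a list of flags describing suspect fields in one record."""
    flags = []

    # Completeness check: required fields must be present and non-empty
    for field in ("id", "collected_on", "latitude"):
        if rec.get(field) in (None, ""):
            flags.append(f"missing {field}")

    # Format check: the date must be a valid ISO date (YYYY-MM-DD)
    collected = None
    try:
        collected = date.fromisoformat(rec.get("collected_on") or "")
    except (TypeError, ValueError):
        if rec.get("collected_on"):
            flags.append("bad date format")

    # Reasonableness check: a collection date cannot lie in the future
    if collected is not None and collected > today:
        flags.append("date in the future")

    # Limit check: latitude must fall between -90 and 90 degrees
    lat = rec.get("latitude")
    if isinstance(lat, (int, float)) and not -90 <= lat <= 90:
        flags.append("latitude out of range")

    return flags


# Hypothetical records (assumed for illustration)
records = [
    {"id": 1, "collected_on": "2023-06-14", "latitude": 41.2},
    {"id": 2, "collected_on": "14/06/2023", "latitude": 95.0},
    {"id": 3, "collected_on": "", "latitude": 10.0},
    {"id": 4, "collected_on": "2030-01-01", "latitude": -12.5},
]

today = date(2024, 1, 1)  # fixed reference date so the result is repeatable
report = {rec["id"]: check_record(rec, today) for rec in records}

for rec_id, flags in report.items():
    print(rec_id, flags or "ok")
```

Keeping the flags separate from the data means nothing is corrected silently. A reviewer or subject area expert can decide how each suspect record should be handled.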
The general framework for data cleaning (after Maletic & Marcus 2000) is: Define and determine error types; Search and identify error instances; Correct the errors; Document error instances and error
