#1 - Introduction
Once or twice every decade, the IT marketplace experiences a major innovation that shakes the entire IT industry. In recent years, Apache Hadoop has done the same thing by infusing data centres with new infrastructure
By giving the power of parallel processing to the programmer Hadoop is on such an exponential rise in adoption and its ecosystem is expanding in both depth and breadth, it is natural to ask whether Hadoop’s is going to replace traditional Data Warehouse.
Let’s see what Alasdair Anderson (Executive Vice President at Nordea ) said at a Hadoop Summit about this hot topic in the town.
"There's no relationship between the EDW and Hadoop right now — they are going to be complementary. It's NOT about rip and replaces: …show more content…
A data warehouse is a relational database that is designed for query and analysis data. It usually contains historical data derived from transaction data, but it can include data from other sources as well. It separates analysis workload from transaction workload and enables an organization to consolidate data from several sources.
In addition to a relational database, a data warehouse environment includes an extraction, transportation, transformation, and loading (ETL) solution, an online analytical processing (OLAP) engine, client analysis tools, and other applications that manage the process of gathering data and delivering it to business users.
Let’s summarize what data warehouse is -
1. Subject-oriented
A data warehouse can be used to analyse a particular subject area like sales, finance, and inventory. Each subject area contains detailed data.
2. Integrated
A data warehouse integrates data from multiple data sources. For example, dates are in the same format, male/female codes are consistent. In a data warehouse, there will be only a single way of identifying a product and they use the same customer record, not copies
3.