Mirjana Mazuran mazuran@elet.polimi.it
December 15, 2009
1/18
Outline
Data Warehouse logical design
ROLAP model star schema snowflake schema
Exercise 1: wine company Exercise 2: real estate agency
2/18
Introduction
Logical design
Starting from the conceptual design it is necessary to determin the logical schema of data We use ROLAP (Relational On-Line Analytical Processing) model to represent multidimensional data ROLAP uses the relational data model, which means that data is stored in relations Given the DFM representation of multidimensional data, two schemas are used: star schema snowflake schema
3/18
ROLAP
Star schema
Each dimension is represented by a relation such that: the primary key of the relation is the primary key of the dimension the attributes of the relation describe all aggregation levels of the dimension
A fact is represented by a relation such that: the primary key of the relation is the set of primary keys imported from all the dimension tables the attributes of the relation are the measures of the fact
Pros and Cons few joins are needed during query execution dimension tables are denormalized denormalization introduces redundancy
4/18
ROLAP
Star schema: example
MaxAmount Amount Year Month Accident Day NrOfAccidents Cost IDCustomer IdPolicy Motivation Class
Time IdTime Day Month Year Accident IdTime IdPolicy IdCustomer IdMotivation NrOfAccidents Cost Customer IdCustomer Birthday Gender City Region
Policy IdPolicy Class Amount MaxAmount
BYear Sex
City Region
Motivation IdMotivation Motivation
5/18
ROLAP
Snowflake schema
Each (primary) dimension is represented by a relation: the primary key of the relation is the primary key of the dimension the attributes of the relation directly depend by the primary key a set of foreign keys is used to access information at different levels of aggregation. Such information is part of the secondary