Question 1: Assume a base cuboid of 10 dimensions contains only three base cells: (1) (a1, b2, c3, d4; ..., d9, d10), (2) (a1, c2, b3, d4, ..., d9, d10), and (3) (b1, c2, b3, d4, ..., d9, d10), where a_i != b_i, b_i != c_i, etc. The measure of the cube is count. 1, How many nonempty cuboids will a full data cube contain? Answer: 210 = 1024 2, How many nonempty aggregate (i.e., non-base) cells will a full cube contain? Answer: There will be 3 ∗ 210 − 6 ∗ 27 − 3 = 2301 nonempty aggregate cells in the full cube. The number of cells overlapping twice is 27 while the number of cells overlapping once is 4 ∗ 27 . So the final calculation is 3 ∗ 210 − 2 ∗ 27 − 1 ∗ 4 ∗ 27 − 3, which yields the result. 3, How many nonempty aggregate cells will an iceberg cube contain if the condition of the 4, iceberg cube is "count >= 2"? Answer: There are in total 5 ∗ 27 = 640 nonempty aggregate cells in the iceberg cube. To calculate the result: fix the first three dimensions as (***), (a1**), (*c1*), (**b3) or (*c1b3), and vary the rest seven ones. 4, How many closed cells are in the full cube? Answer: There’re 6 closed cells in the full cube: 3 base cells; (a1, *, *, d4, …, d10); (*, c2, b3, d4, …, d10) : count 2; (*, *, *, d4, .., d10): count 3. Question 2: (Half open questions, make sure your algorithm and assumptions are correct, no need to be very specific) Suppose a base cuboid has the following tuples:
A B C D Count Sales a1 b1 c1 d1 1 a1 b2 c2 d1 1 a1 b3 c1 d2 1 a2 b4 c1 d2 1 a2 b3 c2 d3 1 6 4 2 10 12
1, Show the representative steps to demonstrate how a complete data cube (with Count and SUM(Sales) as measures) is computed by the multiway array aggregation algorithm; Answer (from fang2): Suppose dimensions A, B, C, D are organized into 2, 4, 2, 3 partitions respectively. So in total there are 2*4*2*3 = 48 chunks. The cardinality of dimensions A, B, C, D is 2, 4, 2, 3 respectively, i.e. A and C have the smallest size, followed by D, and lastly B has