OUTLINE CHAPTER 10 Recursive Hierarchies Table of contents - - PowerPoint PPT Presentation

outline chapter 10
SMART_READER_LITE
LIVE PREVIEW

OUTLINE CHAPTER 10 Recursive Hierarchies Table of contents - - PowerPoint PPT Presentation

OUTLINE CHAPTER 10 Recursive Hierarchies Table of contents Recursive Hierarchies and Bridges Reporting Challenge Flattening a Recursive Hierarchy The Hierarchy Bridge Double Counting Resolving Many to Many Dinesh Maharjan


slide-1
SLIDE 1

CHAPTER 10 Recursive Hierarchies and Bridges

Dinesh Maharjan G20169202 Intelligent Information System Lab University of Seoul 2017.05.08

Table of contents

Recursive Hierarchies Reporting Challenge Flattening a Recursive Hierarchy The Hierarchy Bridge Double Counting Resolving Many to Many Impact of Changes to Hierarchical bridge Ripple Effect Variation of Hierarchical Bridge

2

OUTLINE

Table of contents

3

Recursive Hierarchies

Attribute Hierarchies describes relationships between columns in a dimension table. For eg: in product dimension, each product have brand, brand have category. Such hierarchy can be defined within the table or separate table. But Instance Hierarchy cannot be defined within single table. Instance Hierarchy describes relationships between rows of a dimension

Table of contents

4

Rows Referring to other Rows

Instance hierarchy is a set of relationship among individual rows of a table. This is recursive, because it is self-referring relationship For eg one company may own another company, and another company may own other companies.

slide-2
SLIDE 2

Table of contents

5

Recursive Hierarchy

Table of contents

6

Balanced Vs Unbalanced Hierarchy

Balanced contains fixed number of levels All attribute hierarchies are balanced For eg: product dimension has fixed level of

  • hierarchies. Product, brand, category

Instance hierarchies are unbalanced hierarchy as it rarely contains fixed number of levels For eg: Company dimension has no fixed level

  • f hierarchy.

Table of contents

7

The Reporting Challenge

Looking Down: to summarize the facts grouping particular member in recursive dimension Looking Up: summarizes the facts up the recursive dimension using certain member For this traversing the recursive dimension is required which is difficult

Table of contents

8

The Reporting Challenge

For eg: if we want to generate total sales report at company 2 i.e. Total sales made by 2,3,4 Cannot use SQL to answer this question. GROUP BY function is difficult to be used as traversal of series of recursive relationships is required to summarize the fact

slide-3
SLIDE 3

Table of contents

9

The Reporting Challenge Table of contents

10

Flattening a Recursive Hierarchy

Flattened hierarchy looks and behaves like attribute hierarchy Requires creation of new attributes that represents fixed number of levels This is done during ETL process None need to traverse the recursive relationship Facts can be summarized at various levels by grouping at desired level

Table of contents

11

Flattening a Recursive Hierarchy

Table of contents

12

Drawbacks of Flattening

Flattening creates artificial attribute hierarchy Backfilling is Necessary by its own identity It allows query to group data without losing any transaction Looking up and Looking down is still difficult Fixed number of levels only If company 8 acquires another company, cannot accommodate

slide-4
SLIDE 4

Table of contents

13

Hierarchy Bridge

Allows to aggregate the facts at any point of hierarchy No need of subqueries Looking up and Looking down within the recursive hierarchy Captures recursive instance relationship instead of dimension table

Table of contents

14

Structure of Hierarchy Bridge

two foreign keys, one superior_company_key represents upper level company Subordinate_company_key represents lower level company Optional column levels removed describes number of levels separated

Table of contents

15

Structure of Hierarchy Bridge

Table of contents

16

Content of Hierarchical Bridge

Captures relationship between rows in dimension table Done during ETL process One row for each of its direct subordinates One row for each of its indirect subordinates One row for itself

slide-5
SLIDE 5

Table of contents

17

Content of Hierarchical Bridge

Table of contents

18

Looking Down

Aggregates at its level and levels below it Subordinate key will join fact table Superior key join the dimension table For eg if we want to aggregate order down the company 5 Select sum(order_facts.order_dollars) Where bridge.superior=5 and ……….

Table of contents

19

Looking Down

Table of contents

20

Looking Up

Aggregates at its level and levels up it Subordinate key will join dimension table Superior key join the fact table For eg if we want to aggregate order up the company 6 Select sum(order_facts.order_dollars) Where bridge.subordinate=6 and ……

slide-6
SLIDE 6

Table of contents

21

Looking Up

Table of contents

22

Double Counting

Many to Many relationship between fact and bridge table So possibility of double counting For eg, we want to aggregate orders at or below 5, then companies 7 and 8 are subordinates for both 5 and 6. so they will be counted twice. Miscalculation occurs Looking down- avoided by selecting a single upper level member

  • r by grouping results by upper level members. ( for 6 look down

group by 5 and 1)

Table of contents

23

Double Counting

Looking Up- avoided by selecting a single lower level member or by grouping results by lower level members For eg: if we want to look up at 6 then put constraint where subordinate company key=6 or Put Group by 7, 8

Table of contents

24

Double Counting

slide-7
SLIDE 7

Table of contents

25

Hiding the Bridge from Novice

High probability of bad queries by novice user To avoid this, bridge table can be withhold from such users. One to many relationship is created between dimension and fact table Can generate reports without hierarchy

Table of contents

26

Hierarchy Bridge to RDBMS

Many to many relationship between facts and bridge table Most of RDBMS cant accommodate such relationship. So, to deal with such condition, dimension table is kept between facts table and hierarchical bridge table One to many relationship is created between dimension and facts table using primary key at dimension and foreign at fact Similarly one to many between dimension and bridge

Table of contents

27

Hierarchy Bridge to RDBMS

This has still problem in hierarchical report generation

Table of contents

28

Hierarchy Bridge to Business Intelligence

Business intelligence requires, meaningful query results from SQL For this we alias company table twice One for superior company and one for subordinate company Subordinate kept between fact and bridge while superior kept beside bridge Here superior alias is subject of query i.e. point to start query Superior company is linked with superior key of bridge Subordinate company is linked with subordinate key of bridge

slide-8
SLIDE 8

Table of contents

29

Hierarchy Bridge to Business Intelligence

Table of contents

30

Hierarchy Bridge to Business Intelligence

Table of contents

31

Looking Up without many to many

Superior company is kept between bridge and fact table Subordinate is kept beside bridge Superior company is linked with superior key of bridge and fact Subordinate company is linked with subordinate key of bridge Here subordinate company represents subject of query or point from which query starts

Table of contents

32

Looking Up without many to many

slide-9
SLIDE 9

Table of contents

33

Type 1 Change in Dimension

Bridge organization must respond to changes in data sources Otherwise, it wont generate correct and updated reports Type 1 change does not require to preserve historic context When type 1 occurs, simply update the dimension row There is not impact of type 1 change on bridge and query techniques

Table of contents

34

Type 1 Change in Hierarchy

If change in hierarchy occurs, delete all the rows in the bridge table relating to the change And replace them with a new set of rows

Table of contents

35

Type 2 Change in Dimension

Response to the type 2 change creates a version of history It means there will be row for old data and new row for new data. A query can ignore the history version by using natural key Ripple effect occurs. It means change in one member of hierarchy affects all members of hierarchy

Table of contents

36

Mechanism of Type 2 change Response

Create new row in dimension table for changed member Create new row for all other members in hierarchy Create new rows in bridge table for all these new dimension rows For eg: company E has moved its headquarter location

slide-10
SLIDE 10

Table of contents

37

Mechanism of Type 2 Response

Even though E only changed, two version of companies exists.

Table of contents

38

Reason for Ripple Effect

If we only change E to E-1 and want to look down from E, then we will count orders of the subordinates F, G, H for both E and E-1 twice and occurs double count

Table of contents

39

Type 2 Changes to Hierarchy

When hierarchy changes, create new rows in dimension table for all the members Create new rows in bridge table reflecting new status of hierarchy

Table of contents

40

Reason for Ripple Effect

Suppose company H is sold and if we still associate company H-1 with same hierarchy Then looking up from H, can provide the transaction mad by A,E,F that occurred after H is sold When we only want orders made by H-1, we cannot avoid the

  • rders of H as they both use same natural key
slide-11
SLIDE 11

Table of contents

41

Reason for Ripple Effect

Table of contents

42

Reason for Ripple Effect

Here, even though we use natural key for H, the facts after change will not be counted The orders after change will be associated with new members

Table of contents

43

Variations of Hierarchy Bridge

Variation of bridge can be constructed by addition effective and expiration dates Additional flags We can design bridge with multiple parents These all are called embellishing of bridge

Table of contents

44

Effective and Expiration date

Two columns for effective and expiration data can be added in bridge table It is similar to time stamped dimensions It allows to study hierarchy analysis over time

slide-12
SLIDE 12

Table of contents

45

Adding Flags

Allows to describe hierarchy further It tells whether the superior company is at top or bottom of hierarchy It also tells where the subordinate company is at top or bottom of company

Table of contents

46

Adding Flags

Table of contents

47

Multiple Parents

A node in hierarchy may have more than one parent If two company owns same company then the shared company have two parents We can use same looking up and looking down techniques But levels removed column should be removed

Table of contents

48

Multiple Hierarchies

Some member of dimension can participate in more than one hierarchy For eg: department may have budgetary hierarchy and also chain of command hierarchy In such case separete bridge table must be created for each hierarchy During report generation, appropriate bridge is joined with appropriate dimension and fact table

slide-13
SLIDE 13

Table of contents

49

THANK YOU