FedDW Global Schema Architect UML-based Design Tool for the - - PowerPoint PPT Presentation

feddw global schema architect
SMART_READER_LITE
LIVE PREVIEW

FedDW Global Schema Architect UML-based Design Tool for the - - PowerPoint PPT Presentation

FedDW Global Schema Architect UML-based Design Tool for the Integration of Data Mart Schemas Dr. Stefan Berger Department of Business Informatics Data & Knowledge Engineering Johannes Kepler University Linz ACM 15 th DOLAP 12


slide-1
SLIDE 1

FedDW Global Schema Architect

UML-based Design Tool for the Integration of Data Mart Schemas

  • Dr. Stefan Berger

Department of Business Informatics – Data & Knowledge Engineering Johannes Kepler University Linz

ACM 15th DOLAP ’12 — November 2, 2012

slide-2
SLIDE 2

FedDW Approach Tool Support: FedDW Tool Suite

Outline

1

FedDW Approach

2

Tool Support: FedDW Tool Suite

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 2 / 31

slide-3
SLIDE 3

FedDW Approach Tool Support: FedDW Tool Suite

1

FedDW Approach General overview of FedDW Integrating heterogeneous multidimensional schemata

2

Tool Support: FedDW Tool Suite

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 3 / 31

slide-4
SLIDE 4

FedDW Approach Tool Support: FedDW Tool Suite General overview of FedDW

Problem definition; our contribution

Problem: similar autonomous data marts/DWs, but heterogeneous schemata and/or data Business collaboration Mergers and acquisitions ⇒ Preexisting DW data across autonomous organizations Contribution: comprehensive tool suite for integration of autonomous data marts/DWs Visual integration of multidimensional schemas OLAP front-end prototype, based on SQL-MDi [Berger and Schrefl, 2006]

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 4 / 31

slide-5
SLIDE 5

FedDW Approach Tool Support: FedDW Tool Suite General overview of FedDW

Problem definition; our contribution

Problem: similar autonomous data marts/DWs, but heterogeneous schemata and/or data Business collaboration Mergers and acquisitions ⇒ Preexisting DW data across autonomous organizations Contribution: comprehensive tool suite for integration of autonomous data marts/DWs Visual integration of multidimensional schemas OLAP front-end prototype, based on SQL-MDi [Berger and Schrefl, 2006]

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 4 / 31

slide-6
SLIDE 6

FedDW Approach Tool Support: FedDW Tool Suite General overview of FedDW

Motivating example

Telecommunications sector—sample, heterogeneous conceptual data mart schemas:

customer p_name date month year product prod_name regular_fee

duration tn_tel tn_misc connections

category

dur_min turnover connections

month quarter year customer products customers dates

red blue

date date/hr contract_type base_fee date customer cust_name contract_type base_fee age_grp product prod_name products promo promotion promo_type

Dimensionality (extra dimension blue.promotion) Hierarchy of date dimensions Decorations of product dimensions Measures of connections facts

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 5 / 31

slide-7
SLIDE 7

FedDW Approach Tool Support: FedDW Tool Suite General overview of FedDW

Motivating example

Telecommunications sector—sample, heterogeneous conceptual data mart schemas:

customer p_name date month year product prod_name regular_fee

duration tn_tel tn_misc connections

category

dur_min turnover connections

month quarter year customer products customers dates

red blue

date date/hr contract_type base_fee date customer cust_name contract_type base_fee age_grp product prod_name products promo promotion promo_type

Dimensionality (extra dimension blue.promotion) Hierarchy of date dimensions Decorations of product dimensions Measures of connections facts

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 5 / 31

slide-8
SLIDE 8

FedDW Approach Tool Support: FedDW Tool Suite Integrating heterogeneous multidimensional schemata

Conflict classification I

re

Modeling Scope

re

stance Dimension Instance („Members“) Cube Instance („Cells“) Ins („ ) Conflicts („ ) Conflicts Schema- Instance Conflicts Model Entity Schema Dimension Schema Conflicts Cube Schema Conflicts

ts

Model Entity Dimension Cube

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 6 / 31

slide-9
SLIDE 9

FedDW Approach Tool Support: FedDW Tool Suite Integrating heterogeneous multidimensional schemata

Conflict classification II

Facts: conflicts Relevant operator of FedDW Merge measures: PIVOT MEASURES (Fact) Schema-instance Split measures: PIVOT SPLIT MEASURES (Fact) Dimensionality Choose attributes: add DIM reference (Cube) Different measures Choose measures: add MEASURE reference (Cube) Domain (measures) Convert domain: CONVERT MEASURES APPLY ... (Measure) Naming of attributes Rename attributes: operator “–> ...” (Measure, Dimension) Base levels Roll-up dimension attributes: ROLLUP TO LEVEL ... (Dimension) Cube cells (fact extensions) Join cubes: MERGE CUBES (n-ary) Derive measure values: AGGREGATE MEASURE (n-ary)

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 7 / 31

slide-10
SLIDE 10

FedDW Approach Tool Support: FedDW Tool Suite Integrating heterogeneous multidimensional schemata

Conflict classification III

Dimensions: conflicts Relevant operator of FedDW Hierarchies Map corresponding levels: add level reference [...] (Dimension) Domain (levels / decorations) Convert domain: CONVERT ATTRIBUTES APPLY ... (Dimension) Naming (levels) Rename attributes: operator “–> ...” (Level) Naming (decorations) Map decorations: MATCH ATTRIBUTES (under Merge Dimensions—n-ary) Members (dim. extensions) Merge sets of members: MERGE DIMENSIONS (n-ary) Roll-up functions Overwrite hierarchies: RELATE Expression (under Merge Dimensions clause—n-ary) Decoration values Correct values: add RENAME function (under Merge Dimensions clause—n-ary)

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 8 / 31

slide-11
SLIDE 11

FedDW Approach Tool Support: FedDW Tool Suite Integrating heterogeneous multidimensional schemata

Integration workflow

Establish a federation of autonomous data marts:

1

Import data mart schemas (CWM supported)

(Optional: enrich roll-up hierarchies ⇒ minimum match integration strategy)

2

Design global multidimensional schema (canonical model)

3

Define semantic mappings – both-as-view paradigm [see McBrien and Poulovassilis, 2003]

(a) Resolve schema–instance conflicts (b) Intensional integration – map conceptual schemata

Fact tables Dimension tables + hierarchies

(c) Extensional integration – consolidate data

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 9 / 31

slide-12
SLIDE 12

FedDW Approach Tool Support: FedDW Tool Suite

1

FedDW Approach

2

Tool Support: FedDW Tool Suite FedDW Global Schema Architect FedDW Query Tool

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 10 / 31

slide-13
SLIDE 13

FedDW Approach Tool Support: FedDW Tool Suite

Overview of FedDW tool support I

Java- and Eclipse-based interactive tool suite (EMF, GMF, UML2) Visual data mart integration: FedDW Global Schema Architect (GSA) OLAP front-end prototype: FedDW Query Tool [Berger and Schrefl, 2009] Auxiliary components: Metadata Dictionary, Dimension Repository

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 11 / 31

slide-14
SLIDE 14

FedDW Approach Tool Support: FedDW Tool Suite

Overview of FedDW tool support II

OLAP Application DM n

  • Dimension

repository Federated DW System

?

User Query (SQL)

DM 2 DM 1 Query Tool Import Schemas Global schema Mappings (SQL-MDi)

Global Schema Architect

Meta-data dictionary SQL-MDi Parser SQL-MDi Processor Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 12 / 31

slide-15
SLIDE 15

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

Overview of FedDW GSA

Visual design environment for multidimensional schemas Schema Editor — nested UML diagrams

Import schemas Global schema

Mapping Editor — graphical, high-level code editor (Master–Detail layout)

Import mappings: unary operators (Fact, Dimension entities) — intensional Global mappings: n-ary operators — extensional

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 13 / 31

slide-16
SLIDE 16

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

Sample GSA Workflow

1

Import local, autonomous connections schemas

2

Design global connections schema

3

Create import mappings

4

Create one global mapping file

5

Export the mappings to metadata repository

6

Export fact and dimension metadata

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 14 / 31

slide-17
SLIDE 17

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

GSA: Step 1, Import Wizard I

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 15 / 31

slide-18
SLIDE 18

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

GSA: Step 1, Import Wizard II

Wizard suggests appropriate UML stereotypes (based on PK/FK constraints):

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 16 / 31

slide-19
SLIDE 19

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

GSA: Step 1, Import Wizard III

Initialized class diagram of red.connections:

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 17 / 31

slide-20
SLIDE 20

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

GSA: Step 2, Global Schema Editor

Global Schema wizard: Comfortably create global schema as copy of one import schema Edit the schema later

  • n

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 18 / 31

slide-21
SLIDE 21

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

GSA Schema Editors: edit dimension

User-friendly editing of UML diagram possible (context menus, UML palette):

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 19 / 31

slide-22
SLIDE 22

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

GSA: Step 3, Import Mappings I

Recall the heterogeneities among red and global:

customer p_name date month year product prod_name regular_fee

duration tn_tel tn_misc connections

category

dur_min turnover connections

month year customer products customers dates

red global

date date/hr contract_type base_fee date customer cust_name contract_type base_fee product prod_name products category

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 20 / 31

slide-23
SLIDE 23

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

GSA: Step 3, Import Mappings II

Repair red.connections import schema: Dimensionality: add references to all three dimensions Date hierarchy: roll-up to LEVEL [date] Product decorations: delete regular_fee from Red’s import schema Measures (schema–instance conflict): PIVOT MEASURES tn_tel, tn_misc

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 21 / 31

slide-24
SLIDE 24

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

Import Mappings: dimensionality

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 22 / 31

slide-25
SLIDE 25

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

Import Mappings: hierarchy

Prerequisite: delete level [date/hr] from red.date (see Schema Editor, slide 21)

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 23 / 31

slide-26
SLIDE 26

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

Import Mappings: pivot measures

“Merge” measures tn_tel, tn_misc into turnover, extracting values “tn_tel”, “tn_misc” as members of the new red.category dimension:

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 24 / 31

slide-27
SLIDE 27

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

GSA Step 4, Global Mapping I

Merge Dimensions:

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 25 / 31

slide-28
SLIDE 28

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

GSA Step 4, Global Mapping II

Merge Cubes:

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 26 / 31

slide-29
SLIDE 29

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

GSA Step 5–6, export project metadata

Step 5: Export mapping file — export wizard Starts generation of SQL-MDi code Static syntax check Interface to FedDW Query Tool: file system Step 6: populate Metadata Dictionary Facts + dimensions conceptual and physical metadata Later accessed by FedDW Query Tool

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 27 / 31

slide-30
SLIDE 30

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

GSA Step 5–6, export project metadata

Step 5: Export mapping file — export wizard Starts generation of SQL-MDi code Static syntax check Interface to FedDW Query Tool: file system Step 6: populate Metadata Dictionary Facts + dimensions conceptual and physical metadata Later accessed by FedDW Query Tool

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 27 / 31

slide-31
SLIDE 31

FedDW Approach Tool Support: FedDW Tool Suite FedDW Global Schema Architect

GSA: Summary

Intelligent features: Import heuristics: analyzes PK/FK constraints in import schemas to suggest adequate UML stereotypes Create global schema as copy of one import schema User-friendly and intuitive UML notation Visual conversion modeling avoids “cheap” SQL-MDi syntax errors Automatically populates Dimension Repository from the exported metadata Supports the CWM standard [Poole, 2003]

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 28 / 31

slide-32
SLIDE 32

FedDW Approach Tool Support: FedDW Tool Suite FedDW Query Tool

Query Tool in a Nutshell

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 29 / 31

slide-33
SLIDE 33

Literatur

FedDW Global Schema Architect Thanks for your attention!

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 30 / 31

slide-34
SLIDE 34

Literatur

References

Berger, S. and Schrefl, M. (2006). Analysing multi-dimensional data across autonomous data warehouses. In Tjoa, A. M. and Tho, N., editors, DaWaK, pages 120–133. Berger, S. and Schrefl, M. (2009). FedDW: A tool for querying federations of data

  • warehouses. In ICEIS (1).

McBrien, P . and Poulovassilis, A. (2003). Data integration by bi-directional schema transformation rules. In Dayal, U., Ramamritham, K., and Vijayaraman, T. M., editors, ICDE, pages 227–238. IEEE Computer Society. Poole, J. M. (2003). Common Warehouse Metamodel Developer’s Guide. John Wiley & Sons, Inc., New York, NY, USA.

Stefan Berger (Univ. Linz) FedDW Global Schema Architect DOLAP – Nov. 2, 2012 31 / 31