Fjord Bridging the between Gap OLAP SQL and - - PowerPoint PPT Presentation

fjord
SMART_READER_LITE
LIVE PREVIEW

Fjord Bridging the between Gap OLAP SQL and - - PowerPoint PPT Presentation

Fjord Bridging the between Gap OLAP SQL and Jens-Peter Dittrich 1, * Donald Kossmann 1,2 Alexander Kreutz 2 1 ETH Zurich, Switzerland *Affiliation 2003-2004: SAP BW OLAP Technology 2 i-TV-T AG, Germany OLAP: The CEOs


slide-1
SLIDE 1

Bridging the between

Jens-Peter Dittrich1,* Donald Kossmann1,2 Alexander Kreutz2

1ETH Zurich, Switzerland

*Affiliation 2003-2004: SAP BW OLAP Technology

2i-TV-T AG, Germany

Gap SQL and OLAP

Fjord

slide-2
SLIDE 2

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

OLAP: The CEO’s View

2

slide-3
SLIDE 3

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

OLAP: The Ph.D.’s View

  • 1. OLAP is just another application on top of existing DBMS.
  • 2. DBMS performance for OLAP is great.
  • 3. The relational model is well suited for OLAP.
  • 4. SQL is a great language for doing OLAP.

3

OLAP Engine RDBMS/SQL OLAP Client RDBMS/SQL OLAP Client

?

Not true! Not true! Not true! Not true!

slide-4
SLIDE 4

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi
  • 1. Result Formatting
  • 2. Query Processing
  • 1. Joins
  • 2. Aggregations
  • 3. Pivot and Cube Computation
  • 4. Caching
  • 3. Misc
  • 1. Currency Conversions
  • 2. Summarizability Checks
  • 3. Authorization
  • 4. ...

Good!

Bad!

OLAP Engine

What do OLAP Engines do?...

4

RDBMS/SQL OLAP Client ...OLAP engines bridge the gap between OLAP and SQL!

slide-5
SLIDE 5

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

The Gap

  • Relational Model
  • Pivot Tables

5

SQL OLAP

Bridging the Gap Closing the Gap

slide-6
SLIDE 6
  • 1. April 2005
  • Dr. Jens-Peter Dittrich/Institut für Informationssysteme

State Customer Product Profit S1 C1 P1 1.0 S1 C1 P2 1.0 S1 C2 P1 1.0 S1 C2 P2 1.0 S2 C1 P1 1.0 S2 C1 P2 1.0 S2 C2 P1 1.0 S2 C2 P2 1.0

Input

Relational Model

Result of SQL Rollup

slide-7
SLIDE 7

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Relational Model: NULL-values

7

Take care NULL has two different meanings in SQL:

  • 1. Aggregate (from Rollup operation)
  • 2. Value does not exist (e.g. from outer joins)

The semantics of a NULL-value can be obtained calling GROUPING().

  • 1. differentiate between different semantics of NULL

Result of SQL Rollup

slide-8
SLIDE 8

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Relational Model: order on rows

  • 1. differentiate between different semantics of NULL
  • 2. assume order on result table

8

The result table is ordered lexicographically. Result of SQL Rollup

slide-9
SLIDE 9

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Relational Model: multi columns

  • 1. differentiate between different semantics of NULL
  • 2. assume order on result table
  • 3. merge columns containg NULLs to form multi column cells

9

Result of SQL Rollup Interpreting NULL- values as multi columns

slide-10
SLIDE 10

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Relational Model: multi rows

  • 1. differentiate between different semantics of NULL
  • 2. assume order on result table
  • 3. merge columns containg NULLs to form multi column cells
  • 4. merge adjacent rows containing similar values to form multi row cells

10

Result of SQL Rollup Interpreting NULL- values as multi columns Interpreting adjacent similar values as multi rows

slide-11
SLIDE 11

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Relational Model: order on columns

  • 1. differentiate between different semantics of NULL
  • 2. assume order on result table
  • 3. merge columns containg NULLs to form multi column cells
  • 4. merge adjacent rows containing similar values to form multi row cells
  • 5. interpret order on columns as hierarchy

11

These cells look different if we chose a different order

  • n the dimensions, e.g.

Customer, State, Product

slide-12
SLIDE 12

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Relational Model: Pivot Tables

  • Pivot operation moves at least one of the attributes to the columns
  • Some of the sums of the pivot are not part of the rollup

e.g. (∑∑,P1), (∑∑,P2).

  • We have to use CUBE() here.
  • Should we still call this a “table”?

12

Pivot attribute values become attribute names

slide-13
SLIDE 13

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Relational Model: Summary

  • 1. differentiate between different semantics of NULL
  • 2. assume order on result table
  • 3. merge columns containg NULLs to form multi column cells
  • 4. merge adjacent rows containing similar values to form multi row cells
  • 5. interpret order on columns as hierarchy
  • 6. let attribute values become attribute names

13

State Customer Product Profit S1 C1 P1 1.0 S1 C1 P2 1.0 S1 C2 P1 1.0 S1 C2 P2 1.0 S2 C1 P1 1.0 S2 C1 P2 1.0 S2 C2 P1 1.0 S2 C2 P2 1.0

SQL to Pivot Recipe

slide-14
SLIDE 14

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

14

Closing the Gap The Gap Bridging the Gap

slide-15
SLIDE 15

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Example: BTell’s Operator Model

15

Cache Sort Group Drill-down Filter Filter Observable Result Cache Cache Fetch Split Format Link Align

XML Convert

Cell Merge

HTML Convert Excel Convert x xy y x, xy, y x xy y merged data

3S operators 1S operators RDBMS Client Client Client Btell OLAP Engine Pivot

x xy y x xy y x xy y x xy y x xy y x xy y x xy y
  • BTell has two different

types of operators:

1S: returns one output stream [Graefe: Volcano] 3S: returns three output streams:

  • 1. one for the x-axis
  • 2. one for the y-axis
  • 3. one for the xy-axis
slide-16
SLIDE 16

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

More Examples

16

  • 1. Caching (Special Caching Operator)
  • 2. Pivot Computation (based on 3S operator model)
  • 3. Check for Computability of Aggregates

see paper

slide-17
SLIDE 17

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

17

The Gap Bridging the Gap Closing the Gap

slide-18
SLIDE 18

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

How to reach OLAP Heaven?

  • 3 possible paths to follow:
  • 1. Add even more OLAP stuff to SQL

Open questions: How to handle non-relational data? (nested relations?)

18

slide-19
SLIDE 19

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

How to reach OLAP Heaven?

  • 3 possible paths to follow:
  • 1. Add even more OLAP stuff to SQL

Open questions: How to handle non-relational data? (nested relations?)

19

I m p r

  • b

a b l e

slide-20
SLIDE 20

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

How to reach OLAP Heaven?

  • 3 possible paths to follow:
  • 1. Add even more OLAP stuff to SQL

Open questions: How to handle non-relational data? (nested relations?)

  • 2. Make a new query language

Hard to agree upon Works only for part of the market (see e.g. MDX)

20

I m p r

  • b

a b l e

slide-21
SLIDE 21

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

How to reach OLAP Heaven?

  • 3 possible paths to follow:
  • 1. Add even more OLAP stuff to SQL

Open questions: How to handle non-relational data? (nested relations?)

  • 2. Make a new query language

Hard to agree upon Works only for part of the market (see e.g. MDX)

21

I m p r

  • b

a b l e I m p r

  • b

a b l e

slide-22
SLIDE 22

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

How to reach OLAP Heaven?

  • 3 possible paths to follow:
  • 1. Add even more OLAP stuff to SQL

Open questions: How to handle non-relational data? (nested relations?)

  • 2. Make a new query language

Hard to agree upon Works only for part of the market (see e.g. MDX)

  • 3. Ride the XML wave

Recent proposal for ‘group by’ in XQuery from IBM [SIGMOD 2005]

22

I m p r

  • b

a b l e I m p r

  • b

a b l e

slide-23
SLIDE 23

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

How to reach OLAP Heaven?

  • 3 possible paths to follow:
  • 1. Add even more OLAP stuff to SQL

Open questions: How to handle non-relational data? (nested relations?)

  • 2. Make a new query language

Hard to agree upon Works only for part of the market (see e.g. MDX)

  • 3. Ride the XML wave

Recent proposal for ‘group by’ in XQuery from IBM [SIGMOD 2005]

23

I m p r

  • b

a b l e I m p r

  • b

a b l e P r

  • b

a b l e !

slide-24
SLIDE 24

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Why XML/XQuery?

  • 1. Great data model
  • 2. Powerful query language
  • 3. Highly extensible

24

slide-25
SLIDE 25

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Example: Pivot Result

25

slide-26
SLIDE 26

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

XQuery+OLAP Extensions

  • Goal

Get 95% of the OLAP query processing pushed into the DBMS (today: 10-20%)

  • Consequences
  • Only minimal data transfer between DBMS

and client

  • Overhead introduced by XML is negligible
  • Coding an OLAP engine becomes much

easier

26

RDBMS/SQL OLAP Client OLAP Engine

OLAP Engine

slide-27
SLIDE 27

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

XQuery/OLAP: ON ROWS, ON COLUMNS

  • Specify where to place dimensions:

27

slide-28
SLIDE 28

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

XQuery/OLAP: ROLLUP

  • Provide Statements for Cube, Rollup, Pivot:

28

slide-29
SLIDE 29

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

XQuery/OLAP: MDVIEW (1/2)

  • Provide Statements for multidimensional results:

29

slide-30
SLIDE 30

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

XQuery/OLAP: MDVIEW (2/2)

30

slide-31
SLIDE 31

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

XQuery/OLAP: SESSIONS, DEFINE&REDEFINE

  • Enable declaration of XQuery sessions:

31

slide-32
SLIDE 32

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

XQuery/OLAP: SESSIONS, NOTIFY (1/2)

  • Subscribe to changes (Observer-pattern):

32

slide-33
SLIDE 33

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

XQuery/OLAP: SESSIONS, NOTIFY (2/2)

  • Could be used to implement push-based OLAP:

33

slide-34
SLIDE 34

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Conclusions

  • OLAP engines replicate DBMS functionality
  • Reason: SQL/relational model is

not powerful enough for OLAP

  • Observations
  • XML/XQuery works
  • But: XQuery needs some extensions
  • Create XQuery/OLAP language proposal
  • Build prototype query engine that implements our proposal

34

Future Work

SQL OLAP

slide-35
SLIDE 35

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Thanks for your attention!

Contact Author jens.dittrich @ inf.ethz.ch www.inf.ethz.ch/~jensdi

35

Questions?

More about my research on Thursday, Software Demo Group 6

slide-36
SLIDE 36

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Backup Slides

36

slide-37
SLIDE 37

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

The Gap

  • Relational Model
  • Pivot Tables

37

SQL OLAP

Bridging the Gap Closing the Gap

slide-38
SLIDE 38

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi
  • Typical OLAP session is as follows:
  • 1. User selects initial query
  • 2. User navigates through the data by doing either a
  • roll-up
  • drill-down
  • slice
  • etc.
  • 3. Goto 2.
  • However, every time the user navigates, i.e. the query gets

altered, the DBMS receives the entire query definition.

  • DBMS has no notion of navigation.

Navigation (1/2)

38

slide-39
SLIDE 39

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Navigation (2/2)

  • Navigation is best explained by transition between states
  • How come query languages do not support this?
  • Could easily be exploited to optimize caching on all tiers!

39 Dimensions:

Customer Product

Dimensions:

Customer

Roll-up Drill-down

Dimensions:

Roll-up Drill-down Drill-down Roll-up by Product by Customer by Customer, Product

slide-40
SLIDE 40

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Caching (1/2)

  • From a 10,000 feet perspective all tiers do the same
  • 1. Receive and store some input data
  • 2. Perform algebraic query processing and optimization on the data
  • 3. Store some output data, send some of it to the next tier

40

OLAP Client

{ Cubes, Rollups, Pivots } { HTML, XML, Excel }

OLAP Engine

{ Cubes, Rollups, Pivots } { Materialized Views }

RDBMS/SQL RDBMS/SQL

{ Materialized Views } { Tables }

10,000 feet perspective

slide-41
SLIDE 41

08/30/05

  • Dr. Jens-Peter Dittrich et.al.: Bridging the Gap between OLAP and SQL / ETH Zurich / www.inf.ethz.ch/~jensdi

Caching (2/2)

  • All caches outside the DBMS have to be kept in sync

manually

  • All caching outside DBMS has to be hand-coded
  • This is cumbersome and error-prone.

41

OLAP Client

{ Cubes, Rollups, Pivots } { HTML, XML, Excel }

OLAP Engine

{ Cubes, Rollups, Pivots } { Materialized Views }

RDBMS/SQL RDBMS/SQL

{ Materialized Views } { Tables }

10,000 feet perspective

Too bad: