OctopusDB Towards a one-size-fits-all Architecture for Database - - PowerPoint PPT Presentation

octopusdb
SMART_READER_LITE
LIVE PREVIEW

OctopusDB Towards a one-size-fits-all Architecture for Database - - PowerPoint PPT Presentation

Information Systems Group OctopusDB Towards a one-size-fits-all Architecture for Database Systems Alekh Jindal Supervisor: Prof. Dr. Jens Dittrich May 31, 2010 Database Landscape Information Systems Group OLAP Streaming Archival System


slide-1
SLIDE 1 Information Systems Group

OctopusDB

Towards a one-size-fits-all Architecture for Database Systems Alekh Jindal

Supervisor: Prof. Dr. Jens Dittrich May 31, 2010

slide-2
SLIDE 2

May 31, 2010 Alekh Jindal

Information Systems Group

Database Landscape

2

OLTP OLAP

Streaming System Archival System Search Engine

slide-3
SLIDE 3

May 31, 2010 Alekh Jindal

Information Systems Group

Database Landscape

2

OLTP OLAP

Streaming System Archival System Search Engine

Information System

slide-4
SLIDE 4

May 31, 2010 Alekh Jindal

Information Systems Group

Posting Lists

Example: Banking

3

OLTP OLAP

Streaming System Archival System Search Engine Real-time Transactions Stock Trading Older Transactions Customer Service Business Analytics

Relational Table Relational Table Flat File Web Pages Relational Table

Row Layout Column Layout Row Layout Raw Text

Several Applications Evolving Applications

Zoo

ETL style data pipelines Licensing Cost DBA Cost Maintenance Cost Eventual Integration Hard-coded optimizations Hard-coded data layouts Scenario Engineering Cost Required Performance Integration Cost Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion Drawbacks

slide-5
SLIDE 5

May 31, 2010 Alekh Jindal

Information Systems Group

Hard-coded Data Layouts

4

Workload

Type Fraction of Attribute Tuple Selectivity Query Query Query Update 0.2 1.0 0.75 1.0 0.001 0.1 1.0 0.1

Bad Good Bad Good Good Bad Bad Bad Good Good Bad Bad

OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion Row Column

Fractured Mirrors

Motivation

slide-6
SLIDE 6

May 31, 2010 Alekh Jindal

Information Systems Group

Hard-coded Data Layouts

4

Workload

Type Fraction of Attribute Tuple Selectivity Query Query Query Update 0.2 1.0 0.75 1.0 0.001 0.1 1.0 0.1

Bad Good Bad Good Good Bad Bad Bad Good Good Bad Bad

OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion Row Column

Fractured Mirrors

Inflexibility

Motivation

slide-7
SLIDE 7

May 31, 2010 Alekh Jindal

Information Systems Group

OctopusDB Core Idea

  • No fixed store
  • Why not have a flexible storage depending
  • n the workload
  • Pick the storage appropriate for the use-case
  • Emulate a variety of systems

5

Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion

slide-8
SLIDE 8

May 31, 2010 Alekh Jindal

Information Systems Group

OctopusDB

6

Storage Views

Holistic Storage View Optimizer Mimic Several Systems Motivation Storage Views Holistic SV Optimizer Experiments Conclusion OctopusDB

slide-9
SLIDE 9

May 31, 2010 Alekh Jindal

Information Systems Group

System Architecture

Storage View Store Primary Log Store Log SV Storage View Catalog API Purging & Checkpointing Recovery Manager Holistic SV Optimizer Transaction Manager

Result

Query Catalog

7

Motivation Storage Views Holistic SV Optimizer Experiments Conclusion OctopusDB

slide-10
SLIDE 10

May 31, 2010 Alekh Jindal

Information Systems Group

Storage Views

  • No hard-coded store in OctopusDB
  • All operations recorded in a primary log on

stable storage using WAL

  • Storage Views: arbitrary physical

representations

  • Different storage layouts under a single

umbrella

8

Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion

slide-11
SLIDE 11

May 31, 2010 Alekh Jindal

Information Systems Group

Storage View Examples

  • Log SV
  • Row SV
  • Column SV
  • Index SV

9

  • Partial Index SV
  • Bag-partitioned SV
  • Key-consolidated SV
  • Vertically/Horizontally Partitioned SV
  • ... any hybrid combination of the above

Motivation OctopusDB Holistic SV Optimizer Experiments Conclusion

Primary Secondary

Storage Views

slide-12
SLIDE 12

May 31, 2010 Alekh Jindal

Information Systems Group

Use-case Scenario*

  • Flight booking system
  • Tables: Tickets, Customers
  • Tickets: several attributes, frequently updated
  • Customers: fewer attributes
  • Queries:


SELECT C.*
 FROM Tickets T, Customers C
 WHERE T.customer_id=C.id AND T.a1=x1 AND T.a2=x2 ... AND T.an=xn

10

Motivation OctopusDB Holistic SV Optimizer Experiments Conclusion * Inspired from Unterbrunner et al. in PVLDB, 2009. Storage Views

slide-13
SLIDE 13

May 31, 2010 Alekh Jindal

Information Systems Group

Flight Booking System

Log SV

Result

tickets.customer_id

πcustomer.*

( ))

σ a1=x1 .... an=xn

(

customers.id

11 SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

Customers Tickets

customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> tickets, 303, <tokyo, beijing, B,..> ..... .....

Motivation OctopusDB Holistic SV Optimizer Experiments Conclusion Storage Views

slide-14
SLIDE 14

May 31, 2010 Alekh Jindal

Information Systems Group

Bag-partitioning

Log SV Log SV Log SV

Result

σ

bag=customers

σ

bag=tickets

tickets.customer_id

π

c u s t

  • m

e r s . *

( ) )

σ

a1=x1 ... an=xn

(

customer.id

tickets log customers log 12

Customers Tickets

customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> .....

SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> tickets, 303, <tokyo, beijing, B,..> ..... .....

Motivation OctopusDB Holistic SV Optimizer Experiments Conclusion Storage Views

slide-15
SLIDE 15

May 31, 2010 Alekh Jindal

Information Systems Group

Key-consolidation

Log SV Log SV Log SV

Result

σ

bag=customers

Γbag,key

recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

13

Customers Tickets

customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, B,..> .....

SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

tickets.customer_id

π

c u s t

  • m

e r s . *

( ) )

σ

a1=x1 ... an=xn

(

customer.id customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> tickets, 303, <tokyo, beijing, B,..> ..... .....

customers

Motivation OctopusDB Holistic SV Optimizer Experiments Conclusion Storage Views

slide-16
SLIDE 16

May 31, 2010 Alekh Jindal

Information Systems Group

Storage View Transformation

Col SV Row SV Log SV

Result

tickets customers 14

Customers Tickets

SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

tickets.customer_id

π

c u s t

  • m

e r s . *

( ) )

σ

a1=x1 ... an=xn

(

customer.id

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

σ

bag=customers

Γbag,key

recent

γ

( ( ))

customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> tickets, 303, <tokyo, beijing, B,..> ..... ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> .....

Motivation OctopusDB Holistic SV Optimizer Experiments Conclusion Storage Views

slide-17
SLIDE 17

May 31, 2010 Alekh Jindal

Information Systems Group

Hot-Cold Storage Views

Customers Tickets

Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

σ

time>=now-7days Col SV

σ

time<now-7days ticketsHot ticketsCold 15

tickets, 303, <tokyo, beijing, B,..> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....

SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

tickets.customer_id

π

c u s t

  • m

e r s . *

( ) )

σ

a1=x1 ... an=xn

(

customer.id customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> tickets, 303, <tokyo, beijing, B,..> ..... ..... customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> .....

customers

Motivation OctopusDB Holistic SV Optimizer Experiments Conclusion Storage Views

slide-18
SLIDE 18

May 31, 2010 Alekh Jindal

Information Systems Group

Index Storage Views

Customers Tickets

Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

σ

time>=now-7days Col SV

σ

time<now-7days 16

lectures, 301, <database, core, 9, winter, 01,...> students, 01, <tom, 25, bachelor, studying, ...> students, 02, <felix, male, 20, master, studying, ...> lectures, 302, <logic, core, 9, summer, 04,...> lectures, 303, <os, advanced, 6, summer, 01,...> lectures, 304, <os, advanced, 9, winter, 04,...> students, 03, <marc, 23, bachelor, graduated, ...> students, 04, <felix, male, 21, master, studying, ...> ..... students, 01, <tom, 25, bachelor, studying, ...> students, 02, <felix, 20, master, studying, ...> students, 03, <marc, 23, bachelor, graduated, ...> students, 04, <felix, 21, master, studying, ...> ..... lectures, 301, <database, core, 9, winter, 01,...> lectures, 302, <logic, core, 9, summer, 04,...> lectures, 303, <os, advanced, 6, summer, 01,...> lectures, 304, <os, advanced, 9, winter, .04,..> ..... lectures, 301, <database, core, 9, winter, 01,...> lectures, 302, <logic, core, 9, summer, 04,...> lectures, 304, <os, advanced, 9, winter, .04,..> .....

SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

tickets.customer_id

π

customers.*

( ))

σ

a1=x1 ... an=xn

(

customer.id

Index SV Index SV ticketsHotIndex customersIndex

π π

id,rid price, rid

tickets, 303, <tokyo, beijing, B,..> .....

ticketsCold

tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....

Motivation OctopusDB Holistic SV Optimizer Experiments Conclusion Storage Views

slide-19
SLIDE 19

May 31, 2010 Alekh Jindal

Information Systems Group

Storage View Selection

Customers Tickets

Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

σ

time>=now-7days Col SV

σ

time<now-7days 17 SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn Index SV Index SV ticketsHotIndex customersIndex

π π

id,rid price, rid ticketsCold

Motivation OctopusDB Holistic SV Optimizer Experiments Conclusion Result Result Result Result Result Storage Views

slide-20
SLIDE 20

May 31, 2010 Alekh Jindal

Information Systems Group

Storage View Selection

Customers Tickets

Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

σ

time>=now-7days Col SV

σ

time<now-7days 17 SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn Index SV Index SV ticketsHotIndex customersIndex

π π

id,rid price, rid ticketsCold

Motivation OctopusDB Holistic SV Optimizer Experiments Conclusion Result Result Result Result Result

Pick right Storage Views to: create, update, query and drop

Storage Views

slide-21
SLIDE 21

May 31, 2010 Alekh Jindal

Information Systems Group

Storage View Selection

Customers Tickets

Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

σ

time>=now-7days Col SV

σ

time<now-7days 17 SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn Index SV Index SV ticketsHotIndex customersIndex

π π

id,rid price, rid ticketsCold

Motivation OctopusDB Holistic SV Optimizer Experiments Conclusion Result Result Result Result Result

Single Optimization Problem: “Storage View Selection”

Storage Views

slide-22
SLIDE 22

May 31, 2010 Alekh Jindal

Information Systems Group

Holistic Storage View Optimizer

  • Storage totally dynamic:


Any subset of data in any storage structure

  • Storage View selection
  • Storage View update maintenance
  • Pick physical execution plan
  • Combine results spanning several Storage

Views

18

Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion

slide-23
SLIDE 23

May 31, 2010 Alekh Jindal

Information Systems Group

Adaptive SV Optimization

  • SV Rearrangement
  • e.g. Operator log-pushdown
  • Adaptive partial SVs
  • e.g. Partial Indexes
  • Stream transformation
  • e.g. OLTP to Streaming System

19

Motivation OctopusDB Storage Views Experiments Conclusion Holistic SV Optimizer

slide-24
SLIDE 24

May 31, 2010 Alekh Jindal

Information Systems Group

Operator Log-Pushdown

20 Col SV Row SV Log SV

Result 2

σ

b a g = c u s t

  • m

e r s

Γ

b a g , k e y r e c e n t

γ

( ( ) )

σ

b a g = t i c k e t s

Γ

b a g , k e y r e c e n t

γ

( ( ) )

Result 3

π

price

)

σ

class=E

(

Result 1

π

email

Result 4

t i c k e t s . c u s t
  • m
e r _ i d

π

id

( ))

σ

class=E, a

1

=x

1

(

c u s t
  • m
e r . i d tickets.customer_id

π

name

( ))

σ

class=E, a2=x2

(

customer.id Primary Log Store

Motivation OctopusDB Storage Views Experiments Conclusion Holistic SV Optimizer

slide-25
SLIDE 25

May 31, 2010 Alekh Jindal

Information Systems Group

Operator Log-Pushdown

21 Col SV Row SV Log SV

Result 2 Result 3

π

price

Result 1

π

email

Result 4

tickets.customer_id

π

id

( ))

σa1=x1

(

customer.id tickets.customer_id

π

name

( ))

σ

a2=x2

(

customer.id

σ

bag=tickets

Γ

bag,key recent

γ

( ( )))

πprice,

c u s t
  • m
e r _ i d(

σ

bag=customers

Γ

bag,key recent

γ

( ( )))

πname,

e m a i l , i d(

bag=customers | (bag=tickets & tickets.class=E)

σ

Primary Log Store

Motivation OctopusDB Storage Views Experiments Conclusion Holistic SV Optimizer

slide-26
SLIDE 26

May 31, 2010 Alekh Jindal

Information Systems Group

Adaptive Partial SVs

22 Col SV Row SV Log SV

Result

σ

b a g = c u s t

  • m

e r s

Γ

b a g , k e y r e c e n t

γ

( ( ) )

σ

b a g = t i c k e t s

Γ

b a g , k e y r e c e n t

γ

( ( ) )

σ

time>=now-7days Col SV

σ

time<now-7days

Cold

Index SV Index SV

π id,rid πprice,rid

c

  • u

n t ( * ) > = 5 c u s t

  • m

e r _ i d

Γ

γ

( )

t i c k e t s . c u s t
  • m
e r _ i d

π

customer.*

( ))

σ

a1=x1..an=xn

(

c u s t
  • m
e r . i d

Frequent Fliers

customer.id tickets.customer_id

(Adaptive Partial Index) SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

Motivation OctopusDB Storage Views Experiments Conclusion Holistic SV Optimizer

slide-27
SLIDE 27

May 31, 2010 Alekh Jindal

Information Systems Group

Stream Transformation

23 Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

b a g = t i c k e t s

Γ

b a g , k e y r e c e n t

γ

( ( ) )

σtime>=now-300sec

Col SV

σ

time<now-300sec Index SV customers Index

New Customers getting Cheapest Tickets in last 5 minutes

π

id,rid σregistered_time <now-600sec

( )

tickets Hot

tickets.customer_id

π

customers.*

( ))

σ

a1=x1 ... an=xn

(

customer.id CHEAPEST

γ

(

Primary Log Store

Motivation OctopusDB Storage Views Experiments Conclusion Holistic SV Optimizer

slide-28
SLIDE 28

May 31, 2010 Alekh Jindal

Information Systems Group

Stream Transformation

24 Log SV

Result New Customers getting Cheapest Tickets in last 5 minutes

Primary Log Store

σ

Γ

γ

b a g = t i c k e t s b a g , k e y r e c e n t

σ time > now-300

( ( ) ) ) ) (

γ

C H E A P E S T

(

σ

Γ

γ

b a g = c u s t
  • m
e r s b a g , k e y r e c e n t

σ

r e g i s t e r e d _ t i m e > n
  • w
  • 6

( ( ) ) ) ) (

customers.id tickets.customer_id

Col SV Index SV

Motivation OctopusDB Storage Views Experiments Conclusion Holistic SV Optimizer

slide-29
SLIDE 29

May 31, 2010 Alekh Jindal

Information Systems Group

Stream Transformation

24

Result New Customers getting Cheapest Tickets in last 5 minutes

Primary Log Store

σ

Γ

γ

b a g = t i c k e t s b a g , k e y r e c e n t

σ time > now-300

( ( ) ) ) ) (

γ

C H E A P E S T

(

σ

Γ

γ

b a g = c u s t
  • m
e r s b a g , k e y r e c e n t

σ

r e g i s t e r e d _ t i m e > n
  • w
  • 6

( ( ) ) ) ) (

customers.id tickets.customer_id

Col SV Index SV customerWindow ticketsWindow Data Stream

Motivation OctopusDB Storage Views Experiments Conclusion Holistic SV Optimizer

slide-30
SLIDE 30

May 31, 2010 Alekh Jindal

Information Systems Group

Related Work

  • Storage Views different from Materialized Views
  • do not always replicate
  • consider different layouts
  • Work on view matching, query containment
  • etc. operate on a higher level
  • Still, much of it could be adapted in

OctopusDB

25

Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion

slide-31
SLIDE 31

May 31, 2010 Alekh Jindal

Information Systems Group

Simulations & Experiment

  • Goals
  • To show the viability of our approach
  • To show adaptability in OctopusDB
  • Method
  • Cost model
  • Prototype Implementation

26

Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion

slide-32
SLIDE 32

May 31, 2010 Alekh Jindal

Information Systems Group

Simulation for Query-based Layout

5 10 15 20 0.0 0.13 0.25 0.38 0.5 0.63 0.75 0.88 1.0

Row Column Indexed Row Indexed Column # Referenced Attributes Selectivity (Fraction of Data)

Customers Tickets

27

Motivation OctopusDB Storage Views Holistic SV Optimizer Conclusion Experiments

slide-33
SLIDE 33

May 31, 2010 Alekh Jindal

Information Systems Group

Simulation for Update-based Layout

28

5 10 15 20 0.0 0.13 0.25 0.38 0.5 0.63 0.75 0.88 1.0

Row Column Indexed Row Indexed Column # Referenced Attributes Selectivity (Fraction of Data)

Customers Tickets

Motivation OctopusDB Storage Views Holistic SV Optimizer Conclusion Experiments

slide-34
SLIDE 34

May 31, 2010 Alekh Jindal

Information Systems Group

Simulation for Comparing DBMS Stores

29

0.2 0.4 0.6 0.8 1 Row Store Column Store Indexed Row Store Indexed Column Store Fractured Mirrors Indexed Fractured Mirrors OctopusDB workload time [seconds] Query Costs Update Costs

Motivation OctopusDB Storage Views Holistic SV Optimizer Conclusion Experiments

slide-35
SLIDE 35

May 31, 2010 Alekh Jindal

Information Systems Group

Experiment for Automatic Adaption

30

Primary Log SV Bag-partitioned Log SV Bag-partitioned Log SV Key-consolidated Log SV Column SV Indexed SV Key-consolidated Log SV Row SV Indexed SV

0.001 0.01 0.1 1 50000 100000 150000 200000 250000 workload time [sec] time [# log records] Tickets+Customers Tickets Customers Transform Tickets Transform Customers Cost Threshold

Main memory prototype implementation
 Tickets, Customers have 40 attributes each
 10 update queries
 30 scan queries with selectivity 0.01, projecting random attributes with skewness 4


Motivation OctopusDB Storage Views Holistic SV Optimizer Conclusion Experiments

slide-36
SLIDE 36

May 31, 2010 Alekh Jindal

Information Systems Group

Research Challenges

1.Mapping logical schema to physical layout 2.Automatically picking the right layout


  • dynamic partitioning, 2D cracking etc.

3.Storage View selection 4.Storage View update maintenance 5.OctopusDB Benchmarking and Evaluation 6.OctopusDB ideas with MapReduce

31

Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion

slide-37
SLIDE 37

May 31, 2010 Alekh Jindal

Information Systems Group

Preliminary Conclusion

  • Existing DBMS Engines e.g. OLTP

, OLAP are application specific

  • We propose a one-size-fits-all database system
  • Storage View is a abstract storage concept and

gives flexibility to data layout

  • Holistic SV Optimizer for the single optimization

problem: storage view selection

  • Initial results look promising

32

Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion

slide-38
SLIDE 38

May 31, 2010 Alekh Jindal

Information Systems Group

Summary

33

Storage View Store Primary Log Store Log SV Storage View Catalog API Purging & Checkpointing Recovery Manager Holistic SV Optimizer Transaction Manager

Result

Query Catalog 0.2 0.4 0.6 0.8 1 R
  • w
S t
  • r
e C
  • l
u m n S t
  • r
e I n d e x e d R
  • w
S t
  • r
e I n d e x e d C
  • l
u m n S t
  • r
e F r a c t u r e d M i r r
  • r
s I n d e x e d F r a c t u r e d M i r r
  • r
s O c t
  • p
u s D B workload time [seconds] Query Costs Update Costs

Motivation OctopusDB Storage Views Holistic SV Optimizer Experiments Conclusion

SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn Customers Tickets Col SV Row SV Log SV Result

!

b a g = c u s t
  • m
e r s ! b a g , k e y r e c e n t

"

( ( ) )

!

bag=tickets ! bag,key recent

"

( ( ))

!

t i m e > = n
  • w
  • 7
d a y s Col SV

!

time<now-7days t i c k e t s . c u s t
  • m
e r _ i d

"

c u s t
  • m
e r s . *

( ) )

!

a1 = x1 . . . an = xn

(

c u s t
  • m
e r . i d Index SV Index SV ticketsHotIndex customersIndex

" "

id,rid price, rid customers ticketsCold Posting Lists OLTP OLAP Streaming System Archival System Search Engine Real-time Transactions Stock Trading Older Transactions Customer Service Business Analytics Relational Table Relational Table Flat File Web Pages Relational Table Row Layout Column Layout Row Layout Raw Text Several Applications Evolving Applications

Zoo

ETL style data pipelines Licensing Cost DBA Cost Maintenance Cost Eventual Integration Hard-coded optimizations Hard-coded data layouts Scenario Engineering Cost Required Performance Integration Cost Drawbacks
slide-39
SLIDE 39

May 31, 2010 Alekh Jindal

Information Systems Group

Sources

  • http://www.cksinfo.com/signssymbols/

pointing/index.html

  • http://www.manywallpapers.com/nature-

wallpapers/spring/spring-landscape.html

34