The Mimicking Octopus Towards a one-size-fits-all Architecture for - - PowerPoint PPT Presentation

the mimicking octopus
SMART_READER_LITE
LIVE PREVIEW

The Mimicking Octopus Towards a one-size-fits-all Architecture for - - PowerPoint PPT Presentation

Information Systems Group The Mimicking Octopus Towards a one-size-fits-all Architecture for Database Systems Alekh Jindal Supervisor: Prof. Dr. Jens Dittrich VLDB PhD Workshop September 13, 2010 Database Landscape Information Systems Group


slide-1
SLIDE 1

Information Systems Group

The Mimicking Octopus

Towards a one-size-fits-all Architecture for Database Systems Alekh Jindal

Supervisor: Prof. Dr. Jens Dittrich

VLDB PhD Workshop

September 13, 2010

slide-2
SLIDE 2

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Database Landscape

2

Motivation

OLTP OLAP

Streaming System Archival System Search Engine

slide-3
SLIDE 3

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Database Landscape

2

Motivation

OLTP OLAP

Streaming System Archival System Search Engine

Company Information System

slide-4
SLIDE 4

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Database Landscape

2

Motivation

OLTP OLAP

Streaming System Archival System Search Engine

Airline Company

Several Applications Evolving Applications ETL style data pipelines Eventual Integration Licensing Cost DBA Cost Maintenance Cost Engineering Cost Integration Cost Hard-coded optimizations Hard-coded data layouts Reporting Cheap Fares Ticket Booking Booking Archives Flight Search

slide-5
SLIDE 5

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Problem Statement

  • Single database system
  • Automatic adaption
  • Improved performance
  • Lower cost
  • Better maintainability

3

Motivation

slide-6
SLIDE 6

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

OctopusDB Overview

  • One-size-fits-all architecture
  • Abstract storage concept: Storage Views(SV)
  • Single optimization problem: SV Selection
  • Holistic SV optimizer

4

OctopusDB

slide-7
SLIDE 7

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

System Architecture

5

  • No hard-coded store
  • All operations recorded as logical log entries

in a primary log on stable storage using WAL

OctopusDB

Storage View Store Primary Log Store Log SV Storage View Catalog API Purging & Checkpointing Recovery Manager Holistic SV Optimizer Transaction Manager

Result

Query Catalog

slide-8
SLIDE 8

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

System Architecture

5

  • No hard-coded store
  • All operations recorded as logical log entries

in a primary log on stable storage using WAL

OctopusDB

Storage View Store Primary Log Store Log SV Storage View Catalog API Purging & Checkpointing Recovery Manager Holistic SV Optimizer Transaction Manager

Result

Query Catalog

slide-9
SLIDE 9

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

System Architecture

5

  • No hard-coded store
  • All operations recorded as logical log entries

in a primary log on stable storage using WAL

OctopusDB

Storage View Store Primary Log Store Log SV Storage View Catalog API Purging & Checkpointing Recovery Manager Holistic SV Optimizer Transaction Manager

Result

Query Catalog

slide-10
SLIDE 10

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Storage Views

  • Arbitrary physical representations of data
  • Different layouts under a single umbrella

6

Storage Views

slide-11
SLIDE 11

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Storage Views

  • Arbitrary physical representations of data
  • Different layouts under a single umbrella

6

Storage Views

Log SV Row SV Column SV Index SV

Primary

slide-12
SLIDE 12

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Storage Views

  • Arbitrary physical representations of data
  • Different layouts under a single umbrella

6

Storage Views

Log SV Row SV Column SV Index SV Partial Index SV Bag-partitioned SV Key-consolidated SV Vertically/Horizontally Partitioned SV

Primary Secondary

slide-13
SLIDE 13

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Storage Views

  • Arbitrary physical representations of data
  • Different layouts under a single umbrella

6

Storage Views

Log SV Row SV Column SV Index SV Partial Index SV Bag-partitioned SV Key-consolidated SV Vertically/Horizontally Partitioned SV

Primary Secondary

... any hybrid combination of the above

slide-14
SLIDE 14

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Use-case Scenario*

  • Flight booking system
  • Tables: Tickets, Customers
  • Tickets: several attributes, frequently updated
  • Customers: fewer attributes
  • Queries:


SELECT C.*
 FROM Tickets T, Customers C
 WHERE T.customer_id=C.id AND T.a1=x1 AND T.a2=x2 ... AND T.an=xn

7

* Inspired from Unterbrunner et al. in PVLDB, 2009.

Storage Views

slide-15
SLIDE 15

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Flight Booking System

Log SV

Result

tickets.customer_id

πcustomer.*

( ))

σ a1=x1 .... an=xn

(

customers.id

8 SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

Customers Tickets

customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> tickets, 303, <tokyo, beijing, B,..> ..... .....

Storage Views

slide-16
SLIDE 16

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Bag-partitioning

Log SV Log SV Log SV

Result

σ

bag=customers

σ

bag=tickets

tickets.customer_id

π

c u s t

  • m

e r s . *

( ) )

σ

a1=x1 ... an=xn

(

customer.id

tickets log customers log 9

Customers Tickets

customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> .....

SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....

Storage Views

slide-17
SLIDE 17

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Key-consolidation

Log SV Log SV Log SV

Result

σ

bag=customers

Γbag,key

recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

10

Customers Tickets

customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, B,..> .....

SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

tickets.customer_id

π

c u s t

  • m

e r s . *

( ) )

σ

a1=x1 ... an=xn

(

customer.id customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....

customers

Storage Views

slide-18
SLIDE 18

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Storage View Transformation

Col SV Row SV Log SV

Result

tickets customers 11

Customers Tickets

SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

tickets.customer_id

π

c u s t

  • m

e r s . *

( ) )

σ

a1=x1 ... an=xn

(

customer.id

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

σ

bag=customers

Γbag,key

recent

γ

( ( ))

customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> .....

Storage Views

slide-19
SLIDE 19

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Hot-Cold Storage Views

Customers Tickets

Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

σ

Col SV

σ

time<now-7days ticketsHot ticketsCold 12

tickets, 303, <tokyo, beijing, B,..> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....

SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

tickets.customer_id

π

c u s t

  • m

e r s . *

( ) )

σ

a1=x1 ... an=xn

(

customer.id customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... ..... customers, 01, <tom, 25, customers, 02, <marc, 23, customers, 03, <felix, 20, .....

customers

Storage Views

time>=now-7days

slide-20
SLIDE 20

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

σ

time>=now-7days

customers, 01, <tom, 25, customers, 02, <marc, 23, customers, 03, <felix, 20, ..... customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, .....

Index Storage Views

Customers Tickets

Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

Col SV

σ

time<now-7days 13 SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

tickets.customer_id

π

customers.*

( ))

σ

a1=x1 ... an=xn

(

customer.id

Index SV Index SV ticketsHotIndex customersIndex

π π

id,rid price, rid

tickets, 303, <tokyo, beijing, B,..> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....

Storage Views

tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....

slide-21
SLIDE 21

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

σ

time>=now-7days

customers, 01, <tom, 25, customers, 02, <marc, 23, customers, 03, <felix, 20, ..... customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, .....

Index Storage Views

Customers Tickets

Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

Col SV

σ

time<now-7days 13 SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

tickets.customer_id

π

customers.*

( ))

σ

a1=x1 ... an=xn

(

customer.id

Index SV Index SV ticketsHotIndex customersIndex

π π

id,rid price, rid

tickets, 303, <tokyo, beijing, B,..> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....

Storage Views

tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....

Isn’t this same as Materialized Views?

slide-22
SLIDE 22

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

σ

time>=now-7days

customers, 01, <tom, 25, customers, 02, <marc, 23, customers, 03, <felix, 20, ..... customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, .....

Index Storage Views

Customers Tickets

Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

Col SV

σ

time<now-7days 13 SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

tickets.customer_id

π

customers.*

( ))

σ

a1=x1 ... an=xn

(

customer.id

Index SV Index SV ticketsHotIndex customersIndex

π π

id,rid price, rid

tickets, 303, <tokyo, beijing, B,..> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....

NO!

Storage Views

tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....

Isn’t this same as Materialized Views?

slide-23
SLIDE 23

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

σ

time>=now-7days

customers, 01, <tom, 25, customers, 02, <marc, 23, customers, 03, <felix, 20, ..... customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, .....

Index Storage Views

Customers Tickets

Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

Col SV

σ

time<now-7days 13 SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

tickets.customer_id

π

customers.*

( ))

σ

a1=x1 ... an=xn

(

customer.id

Index SV Index SV ticketsHotIndex customersIndex

π π

id,rid price, rid

tickets, 303, <tokyo, beijing, B,..> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....

Storage Views

tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....

Materialized View knows what to materialize

slide-24
SLIDE 24

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

σ

time>=now-7days

customers, 01, <tom, 25, customers, 02, <marc, 23, customers, 03, <felix, 20, ..... customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, .....

Index Storage Views

Customers Tickets

Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

Col SV

σ

time<now-7days 13 SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

tickets.customer_id

π

customers.*

( ))

σ

a1=x1 ... an=xn

(

customer.id

Index SV Index SV ticketsHotIndex customersIndex

π π

id,rid price, rid

tickets, 303, <tokyo, beijing, B,..> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....

Storage Views

tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....

Materialized View knows what to materialize Storage View also knows how to materialize

slide-25
SLIDE 25

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

σ

time>=now-7days

customers, 01, <tom, 25, customers, 02, <marc, 23, customers, 03, <felix, 20, ..... customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, .....

Index Storage Views

Customers Tickets

Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

Col SV

σ

time<now-7days 13 SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

tickets.customer_id

π

customers.*

( ))

σ

a1=x1 ... an=xn

(

customer.id

Index SV Index SV ticketsHotIndex customersIndex

π π

id,rid price, rid

tickets, 303, <tokyo, beijing, B,..> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....

Storage Views

tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....

Materialized View knows what to materialize Storage View also knows how to materialize A Materialized View still needs a Storage View

slide-26
SLIDE 26

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Storage View Selection

Customers Tickets

Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

σ

time>=now-7days Col SV

σ

time<now-7days 14 SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn Index SV Index SV ticketsHotIndex customersIndex

π π

id,rid price, rid ticketsCold

Result Result Result Result Result

Optimizer

slide-27
SLIDE 27

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Storage View Selection

Customers Tickets

Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

σ

time>=now-7days Col SV

σ

time<now-7days 14 SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn Index SV Index SV ticketsHotIndex customersIndex

π π

id,rid price, rid ticketsCold

Result Result Result Result Result

Pick right Storage Views to: create, update, query and drop

Optimizer

slide-28
SLIDE 28

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Storage View Selection

Customers Tickets

Col SV Row SV Log SV

Result

σ

bag=customers

Γ

bag,key recent

γ

( ( ))

σ

bag=tickets

Γ

bag,key recent

γ

( ( ))

σ

time>=now-7days Col SV

σ

time<now-7days 14 SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn Index SV Index SV ticketsHotIndex customersIndex

π π

id,rid price, rid ticketsCold

Result Result Result Result Result

Single Optimization Problem: “Storage View Selection”

Optimizer

slide-29
SLIDE 29

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Holistic Storage View Optimizer

  • Storage totally dynamic:


Any subset of data in Any storage structure

  • Storage View selection
  • Storage View update maintenance
  • Pick physical execution plan
  • Combine results spanning several Storage

Views

15

Optimizer

slide-30
SLIDE 30

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Research Challenges

  • Single umbrella for different storage layouts

  • storage layer abstraction 

  • still layout specific specialization
  • Automatic adaptive bifurcation

  • monolithic system

  • right online algorithms
  • Simplicity vs Optimization

  • only as complex as required

  • mimic several specialized systems

16

Challenges & Related

slide-31
SLIDE 31

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Related Work

  • Materialized Views [Chirkova et. al. VLDBJ 2002]

  • as pointed before different from storage views
  • Dynamic materialized views [Zhou et. al. ICDE 2007]

  • horizontal dynamism, storage view still open
  • View matching, query containment [A.
  • Y. Halevy VLDBJ 2001]

  • again operate on a higher level
  • Cracked databases [Idreos et. al. CIDR 2007]

  • logical partitioning of data, only horizontal
  • Rodent store [Cudre-Mauroux et. al. CIDR 2009]

  • still assumes a store
  • GMAP [Tsatalos et. al. VLDB 1994]

  • does not adapt the stores


17

Challenges & Related

slide-32
SLIDE 32

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Optimizer Cost Model

18

Symbol Meaning Model Clog

scan(N)

Log SV scan cost ‰ PN

i=1 colsize(logi) m

ı · Crandom + ‰ PN

i=1 colsize(logi) pageSize

ı /BW Crow

scan(N)

Row SV scan cost ‰

N·P Ai∈A colsize(Ai) m

ı · Crandom + ‰

N·P Ai∈A colsize(Ai) pageSize

ı /BW Ccol

scan(N, S)

Col SV scan cost P

Ai∈S

„‰

N·P Ai∈S colsize(Ai) m

ı · Crandom + l N·colsize(Ai)

pageSize

m /BW « Cindex

lookup(N)

Index lookup cost Crandom · dlogF (N · (colsize(key) + pointerSize)/pageSize)e Crow cl. index

scan

(N, sel) Unclustered Indexed Row SV scan cost Cindex

lookup(N) + Crow scan(dsel · Ne)

  • Ccol. cl. index

scan

(N, S, sel) Unclustered Indexed Col SV scan cost Cindex

lookup(N) + Ccol scan(dsel · Ne, S)

Crow uncl. index

scan

(N, sel) Clustered Indexed Row SV scan cost Cindex

lookup + dsel · Ne · (Crandom + pageSize/BW)

  • Ccol. uncl. index

scan

(N, S, sel) Unclustered Indexed Col SV scan cost Cindex

lookup + dsel · Ne · |S| · (Crandom + pageSize/BW)

Symbol Meaning Model Clog

update(Nu)

Log SV update cost Clog

scan(Nu)

Crow

update(N, Nu)

Row SV update cost min “ Crandom + l

N Nc

m · Crow

scan(2 · Nc),

l

N Nc

m · Crow

scan(Nc) + Nu · (Crandom + pageSize/BW)

” Ccol

update(N, Nu, S)

Col SV update cost min “ Crandom + l

N Nc

m · Ccol

scan(2 · Nc),

l

N Nc

m · Ccol

scan(Nc) + Nu · |S| · (Crandom + pageSize/BW

” Cindex

split (d)

Index split cost “Pd

i=1 (psplit)i”

· Crandom Crow cl. index

update

(N, Nu, d)

  • Cl. Index Row SV update cost

Cindex

lookup(N) + 2 · Crow scan(Nu) + Cindex split (d)

  • Ccol. cl. index

update

(N, Nu, S, d)

  • Cl. Index Col SV update cost

Cindex

lookup(N) + 2 · Ccol scan(Nu, S) + Cindex split (d)

Crow uncl. index

update

(N, Nu, d)

  • Uncl. Index Row SV update cost

Cindex

lookup + Nu · (Crandom + pageSize/BW ) + Cindex split (d)

  • Ccol. uncl. index

update

(N, Nu, S, d)

  • Uncl. Index Col SV update cost

Cindex

lookup + Nu · |S| · (Crandom + pageSize/BW ) + Cindex split (d)

SV Transformation Cost Log SV ! Row SV Clog

scan(N) + Crow scan(N)

Log SV ! Col SV Clog

scan(N) + Ccol scan(N, A)

Row SV $ Col SV Crow

scan(N) + Ccol scan(N, A)

Row SV ! Index SV Crow

scan(N) +

F d+1−1 F −1

” · Crandom Col SV ! Index SV Ccol

scan(N, {key,rowID}) +

F d+1−1 F −1

” · Crandom

Query Cost Model Update Cost Model Transform Cost Model

Further Directions

slide-33
SLIDE 33

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Comparing Different Stores

19

Further Directions

0.2 0.4 0.6 0.8 1 Row Store Column Store Indexed Row Store Indexed Column Store Fractured Mirrors Indexed Fractured Mirrors OctopusDB workload time [seconds] Query Costs Update Costs

Tickets Customers

Tuples

100,000 20,000

Selectivity

0.9 0.1

Attributes Referenced

4/20 20/20

slide-34
SLIDE 34

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Next Steps

1.Automatically picking the right layout


  • row, column, partitioned, cracked, more?

2.Storage View compression


  • adaptive compression

3.Storage View maintenance


  • maintaining heterogenous SVs

4.OctopusDB benchmarking and evaluation


  • one-size-fits-all benchmark

20

Further Directions

slide-35
SLIDE 35

September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal

Information Systems Group

Summary

Storage View Store Primary Log Store Log SV Storage View Catalog API Purging & Checkpointing Recovery Manager Holistic SV Optimizer Transaction Manager

Result

Query Catalog SELECT C.* 
 FROM Tickets T , Customers C
 WHERE T.customer_id=C.id
 AND T.a1=x1 .... AND T.an=xn

Customers Tickets

Col SV Row SV Log SV

Result

!

b a g = c u s t

  • m

e r s

!

b a g , k e y r e c e n t

"

( ( ) )

!

bag=tickets

!

bag,key recent

"

( ( ))

!

t i m e > = n

  • w
  • 7

d a y s Col SV

!

time<now-7days

t i c k e t s . c u s t
  • m
e r _ i d

"

c u s t

  • m

e r s . *

( ) )

!

a1 = x1 . . . an = xn

(

c u s t
  • m
e r . i d

Index SV Index SV ticketsHotIndex customersIndex

" "

id,rid price, rid customers ticketsCold

Thanks!

0.2 0.4 0.6 0.8 1 Row Store Column Store Indexed Row Store Indexed Column Store Fractured Mirrors Indexed Fractured Mirrors OctopusDB workload time [seconds] Query Costs Update Costs Information Systems Group

Database Landscape

OLTP OLAP

Streaming System Archival System Search Engine

Airline Company

Several Applications Evolving Applications ETL style data pipelines Eventual Integration Licensing Cost DBA Cost Maintenance Cost Engineering Cost Integration Cost Hard-coded optimizations Hard-coded data layouts Reporting Cheap Fares Ticket Booking Booking Archives Flight Search