Information Systems Group
The Mimicking Octopus
Towards a one-size-fits-all Architecture for Database Systems Alekh Jindal
Supervisor: Prof. Dr. Jens Dittrich
VLDB PhD Workshop
September 13, 2010
The Mimicking Octopus Towards a one-size-fits-all Architecture for - - PowerPoint PPT Presentation
Information Systems Group The Mimicking Octopus Towards a one-size-fits-all Architecture for Database Systems Alekh Jindal Supervisor: Prof. Dr. Jens Dittrich VLDB PhD Workshop September 13, 2010 Database Landscape Information Systems Group
Information Systems Group
Towards a one-size-fits-all Architecture for Database Systems Alekh Jindal
Supervisor: Prof. Dr. Jens Dittrich
VLDB PhD Workshop
September 13, 2010
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
2
Motivation
OLTP OLAP
Streaming System Archival System Search Engine
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
2
Motivation
OLTP OLAP
Streaming System Archival System Search Engine
Company Information System
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
2
Motivation
OLTP OLAP
Streaming System Archival System Search Engine
Several Applications Evolving Applications ETL style data pipelines Eventual Integration Licensing Cost DBA Cost Maintenance Cost Engineering Cost Integration Cost Hard-coded optimizations Hard-coded data layouts Reporting Cheap Fares Ticket Booking Booking Archives Flight Search
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
3
Motivation
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
4
OctopusDB
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
5
OctopusDB
Storage View Store Primary Log Store Log SV Storage View Catalog API Purging & Checkpointing Recovery Manager Holistic SV Optimizer Transaction Manager
Result
Query Catalog
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
5
OctopusDB
Storage View Store Primary Log Store Log SV Storage View Catalog API Purging & Checkpointing Recovery Manager Holistic SV Optimizer Transaction Manager
Result
Query Catalog
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
5
OctopusDB
Storage View Store Primary Log Store Log SV Storage View Catalog API Purging & Checkpointing Recovery Manager Holistic SV Optimizer Transaction Manager
Result
Query Catalog
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
6
Storage Views
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
6
Storage Views
Log SV Row SV Column SV Index SV
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
6
Storage Views
Log SV Row SV Column SV Index SV Partial Index SV Bag-partitioned SV Key-consolidated SV Vertically/Horizontally Partitioned SV
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
6
Storage Views
Log SV Row SV Column SV Index SV Partial Index SV Bag-partitioned SV Key-consolidated SV Vertically/Horizontally Partitioned SV
... any hybrid combination of the above
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
SELECT C.* FROM Tickets T, Customers C WHERE T.customer_id=C.id AND T.a1=x1 AND T.a2=x2 ... AND T.an=xn
7
* Inspired from Unterbrunner et al. in PVLDB, 2009.
Storage Views
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
Log SV
Result
tickets.customer_id
πcustomer.*
( ))
σ a1=x1 .... an=xn
(
customers.id
8 SELECT C.* FROM Tickets T , Customers C WHERE T.customer_id=C.id AND T.a1=x1 .... AND T.an=xn
Customers Tickets
customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> tickets, 303, <tokyo, beijing, B,..> ..... .....
Storage Views
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
Log SV Log SV Log SV
Result
σ
bag=customers
σ
bag=tickets
tickets.customer_id
π
c u s t
e r s . *
( ) )
σ
a1=x1 ... an=xn
(
customer.id
tickets log customers log 9
Customers Tickets
customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers 03, <felix, 20, felix@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> .....
SELECT C.* FROM Tickets T , Customers C WHERE T.customer_id=C.id AND T.a1=x1 .... AND T.an=xn
customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....
Storage Views
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
Log SV Log SV Log SV
Result
σ
bag=customers
Γbag,key
recent
γ
( ( ))
σ
bag=tickets
Γ
bag,key recent
γ
( ( ))
10
Customers Tickets
customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, B,..> .....
SELECT C.* FROM Tickets T , Customers C WHERE T.customer_id=C.id AND T.a1=x1 .... AND T.an=xn
tickets.customer_id
π
c u s t
e r s . *
( ) )
σ
a1=x1 ... an=xn
(
customer.id customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....
customers
Storage Views
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
Col SV Row SV Log SV
Result
tickets customers 11
Customers Tickets
SELECT C.* FROM Tickets T , Customers C WHERE T.customer_id=C.id AND T.a1=x1 .... AND T.an=xn
tickets.customer_id
π
c u s t
e r s . *
( ) )
σ
a1=x1 ... an=xn
(
customer.id
σ
bag=tickets
Γ
bag,key recent
γ
( ( ))
σ
bag=customers
Γbag,key
recent
γ
( ( ))
customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, tom@abc.com, ...> customers, 02, <marc, 23, marc@abc.com, ...> customers, 03, <felix, 20, felix@xyz.com, ...> .....
Storage Views
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
Customers Tickets
Col SV Row SV Log SV
Result
σ
bag=customers
Γ
bag,key recent
γ
( ( ))
σ
bag=tickets
Γ
bag,key recent
γ
( ( ))
σ
Col SV
σ
time<now-7days ticketsHot ticketsCold 12
tickets, 303, <tokyo, beijing, B,..> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....
SELECT C.* FROM Tickets T , Customers C WHERE T.customer_id=C.id AND T.a1=x1 .... AND T.an=xn
tickets.customer_id
π
c u s t
e r s . *
( ) )
σ
a1=x1 ... an=xn
(
customer.id customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... ..... customers, 01, <tom, 25, customers, 02, <marc, 23, customers, 03, <felix, 20, .....
customers
Storage Views
time>=now-7days
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
σ
time>=now-7days
customers, 01, <tom, 25, customers, 02, <marc, 23, customers, 03, <felix, 20, ..... customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, .....
Customers Tickets
Col SV Row SV Log SV
Result
σ
bag=customers
Γ
bag,key recent
γ
( ( ))
σ
bag=tickets
Γ
bag,key recent
γ
( ( ))
Col SV
σ
time<now-7days 13 SELECT C.* FROM Tickets T , Customers C WHERE T.customer_id=C.id AND T.a1=x1 .... AND T.an=xn
tickets.customer_id
π
customers.*
( ))
σ
a1=x1 ... an=xn
(
customer.id
Index SV Index SV ticketsHotIndex customersIndex
π π
id,rid price, rid
tickets, 303, <tokyo, beijing, B,..> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....
Storage Views
tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
σ
time>=now-7days
customers, 01, <tom, 25, customers, 02, <marc, 23, customers, 03, <felix, 20, ..... customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, .....
Customers Tickets
Col SV Row SV Log SV
Result
σ
bag=customers
Γ
bag,key recent
γ
( ( ))
σ
bag=tickets
Γ
bag,key recent
γ
( ( ))
Col SV
σ
time<now-7days 13 SELECT C.* FROM Tickets T , Customers C WHERE T.customer_id=C.id AND T.a1=x1 .... AND T.an=xn
tickets.customer_id
π
customers.*
( ))
σ
a1=x1 ... an=xn
(
customer.id
Index SV Index SV ticketsHotIndex customersIndex
π π
id,rid price, rid
tickets, 303, <tokyo, beijing, B,..> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....
Storage Views
tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....
Isn’t this same as Materialized Views?
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
σ
time>=now-7days
customers, 01, <tom, 25, customers, 02, <marc, 23, customers, 03, <felix, 20, ..... customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, .....
Customers Tickets
Col SV Row SV Log SV
Result
σ
bag=customers
Γ
bag,key recent
γ
( ( ))
σ
bag=tickets
Γ
bag,key recent
γ
( ( ))
Col SV
σ
time<now-7days 13 SELECT C.* FROM Tickets T , Customers C WHERE T.customer_id=C.id AND T.a1=x1 .... AND T.an=xn
tickets.customer_id
π
customers.*
( ))
σ
a1=x1 ... an=xn
(
customer.id
Index SV Index SV ticketsHotIndex customersIndex
π π
id,rid price, rid
tickets, 303, <tokyo, beijing, B,..> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....
Storage Views
tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....
Isn’t this same as Materialized Views?
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
σ
time>=now-7days
customers, 01, <tom, 25, customers, 02, <marc, 23, customers, 03, <felix, 20, ..... customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, .....
Customers Tickets
Col SV Row SV Log SV
Result
σ
bag=customers
Γ
bag,key recent
γ
( ( ))
σ
bag=tickets
Γ
bag,key recent
γ
( ( ))
Col SV
σ
time<now-7days 13 SELECT C.* FROM Tickets T , Customers C WHERE T.customer_id=C.id AND T.a1=x1 .... AND T.an=xn
tickets.customer_id
π
customers.*
( ))
σ
a1=x1 ... an=xn
(
customer.id
Index SV Index SV ticketsHotIndex customersIndex
π π
id,rid price, rid
tickets, 303, <tokyo, beijing, B,..> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....
Storage Views
tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....
Materialized View knows what to materialize
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
σ
time>=now-7days
customers, 01, <tom, 25, customers, 02, <marc, 23, customers, 03, <felix, 20, ..... customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, .....
Customers Tickets
Col SV Row SV Log SV
Result
σ
bag=customers
Γ
bag,key recent
γ
( ( ))
σ
bag=tickets
Γ
bag,key recent
γ
( ( ))
Col SV
σ
time<now-7days 13 SELECT C.* FROM Tickets T , Customers C WHERE T.customer_id=C.id AND T.a1=x1 .... AND T.an=xn
tickets.customer_id
π
customers.*
( ))
σ
a1=x1 ... an=xn
(
customer.id
Index SV Index SV ticketsHotIndex customersIndex
π π
id,rid price, rid
tickets, 303, <tokyo, beijing, B,..> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....
Storage Views
tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....
Materialized View knows what to materialize Storage View also knows how to materialize
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
σ
time>=now-7days
customers, 01, <tom, 25, customers, 02, <marc, 23, customers, 03, <felix, 20, ..... customers, 01, <tom, 25, customers, 02, <marc, 23, customers 03, <felix, 20, customers, 03, <felix, 20, .....
Customers Tickets
Col SV Row SV Log SV
Result
σ
bag=customers
Γ
bag,key recent
γ
( ( ))
σ
bag=tickets
Γ
bag,key recent
γ
( ( ))
Col SV
σ
time<now-7days 13 SELECT C.* FROM Tickets T , Customers C WHERE T.customer_id=C.id AND T.a1=x1 .... AND T.an=xn
tickets.customer_id
π
customers.*
( ))
σ
a1=x1 ... an=xn
(
customer.id
Index SV Index SV ticketsHotIndex customersIndex
π π
id,rid price, rid
tickets, 303, <tokyo, beijing, B,..> ..... tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...>.....
Storage Views
tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> tickets, 303, <tokyo, beijing, B,..> ..... customers, 01, <tom, 25, customers, 02, <marc, 23, tickets, 301, <paris, rome, E,...> tickets, 302, <moscow, berlin, B,...> tickets, 303, <tokyo, beijing, E,...> customers 03, <felix, 20, customers, 03, <felix, 20, tickets, 303, <tokyo, beijing, B,..> ..... .....
Materialized View knows what to materialize Storage View also knows how to materialize A Materialized View still needs a Storage View
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
Customers Tickets
Col SV Row SV Log SV
Result
σ
bag=customers
Γ
bag,key recent
γ
( ( ))
σ
bag=tickets
Γ
bag,key recent
γ
( ( ))
σ
time>=now-7days Col SV
σ
time<now-7days 14 SELECT C.* FROM Tickets T , Customers C WHERE T.customer_id=C.id AND T.a1=x1 .... AND T.an=xn Index SV Index SV ticketsHotIndex customersIndex
π π
id,rid price, rid ticketsCold
Result Result Result Result Result
Optimizer
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
Customers Tickets
Col SV Row SV Log SV
Result
σ
bag=customers
Γ
bag,key recent
γ
( ( ))
σ
bag=tickets
Γ
bag,key recent
γ
( ( ))
σ
time>=now-7days Col SV
σ
time<now-7days 14 SELECT C.* FROM Tickets T , Customers C WHERE T.customer_id=C.id AND T.a1=x1 .... AND T.an=xn Index SV Index SV ticketsHotIndex customersIndex
π π
id,rid price, rid ticketsCold
Result Result Result Result Result
Pick right Storage Views to: create, update, query and drop
Optimizer
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
Customers Tickets
Col SV Row SV Log SV
Result
σ
bag=customers
Γ
bag,key recent
γ
( ( ))
σ
bag=tickets
Γ
bag,key recent
γ
( ( ))
σ
time>=now-7days Col SV
σ
time<now-7days 14 SELECT C.* FROM Tickets T , Customers C WHERE T.customer_id=C.id AND T.a1=x1 .... AND T.an=xn Index SV Index SV ticketsHotIndex customersIndex
π π
id,rid price, rid ticketsCold
Result Result Result Result Result
Single Optimization Problem: “Storage View Selection”
Optimizer
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
15
Optimizer
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
16
Challenges & Related
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
17
Challenges & Related
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
18
Symbol Meaning Model Clog
scan(N)
Log SV scan cost ‰ PN
i=1 colsize(logi) m
ı · Crandom + ‰ PN
i=1 colsize(logi) pageSize
ı /BW Crow
scan(N)
Row SV scan cost ‰
N·P Ai∈A colsize(Ai) m
ı · Crandom + ‰
N·P Ai∈A colsize(Ai) pageSize
ı /BW Ccol
scan(N, S)
Col SV scan cost P
Ai∈S
„‰
N·P Ai∈S colsize(Ai) m
ı · Crandom + l N·colsize(Ai)
pageSize
m /BW « Cindex
lookup(N)
Index lookup cost Crandom · dlogF (N · (colsize(key) + pointerSize)/pageSize)e Crow cl. index
scan
(N, sel) Unclustered Indexed Row SV scan cost Cindex
lookup(N) + Crow scan(dsel · Ne)
scan
(N, S, sel) Unclustered Indexed Col SV scan cost Cindex
lookup(N) + Ccol scan(dsel · Ne, S)
Crow uncl. index
scan
(N, sel) Clustered Indexed Row SV scan cost Cindex
lookup + dsel · Ne · (Crandom + pageSize/BW)
scan
(N, S, sel) Unclustered Indexed Col SV scan cost Cindex
lookup + dsel · Ne · |S| · (Crandom + pageSize/BW)
Symbol Meaning Model Clog
update(Nu)
Log SV update cost Clog
scan(Nu)
Crow
update(N, Nu)
Row SV update cost min “ Crandom + l
N Nc
m · Crow
scan(2 · Nc),
l
N Nc
m · Crow
scan(Nc) + Nu · (Crandom + pageSize/BW)
” Ccol
update(N, Nu, S)
Col SV update cost min “ Crandom + l
N Nc
m · Ccol
scan(2 · Nc),
l
N Nc
m · Ccol
scan(Nc) + Nu · |S| · (Crandom + pageSize/BW
” Cindex
split (d)
Index split cost “Pd
i=1 (psplit)i”
· Crandom Crow cl. index
update
(N, Nu, d)
Cindex
lookup(N) + 2 · Crow scan(Nu) + Cindex split (d)
update
(N, Nu, S, d)
Cindex
lookup(N) + 2 · Ccol scan(Nu, S) + Cindex split (d)
Crow uncl. index
update
(N, Nu, d)
Cindex
lookup + Nu · (Crandom + pageSize/BW ) + Cindex split (d)
update
(N, Nu, S, d)
Cindex
lookup + Nu · |S| · (Crandom + pageSize/BW ) + Cindex split (d)
SV Transformation Cost Log SV ! Row SV Clog
scan(N) + Crow scan(N)
Log SV ! Col SV Clog
scan(N) + Ccol scan(N, A)
Row SV $ Col SV Crow
scan(N) + Ccol scan(N, A)
Row SV ! Index SV Crow
scan(N) +
“
F d+1−1 F −1
” · Crandom Col SV ! Index SV Ccol
scan(N, {key,rowID}) +
“
F d+1−1 F −1
” · Crandom
Query Cost Model Update Cost Model Transform Cost Model
Further Directions
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
19
Further Directions
0.2 0.4 0.6 0.8 1 Row Store Column Store Indexed Row Store Indexed Column Store Fractured Mirrors Indexed Fractured Mirrors OctopusDB workload time [seconds] Query Costs Update Costs
Tickets Customers
Tuples
100,000 20,000
Selectivity
0.9 0.1
Attributes Referenced
4/20 20/20
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
20
Further Directions
September 13, 2010 Towards a one-size-fits-all Database Architecture - Alekh Jindal
Information Systems Group
Storage View Store Primary Log Store Log SV Storage View Catalog API Purging & Checkpointing Recovery Manager Holistic SV Optimizer Transaction Manager
Result
Query Catalog SELECT C.* FROM Tickets T , Customers C WHERE T.customer_id=C.id AND T.a1=x1 .... AND T.an=xn
Customers Tickets
Col SV Row SV Log SV
Result
!
b a g = c u s t
e r s
!
b a g , k e y r e c e n t
"
( ( ) )
!
bag=tickets
!
bag,key recent
"
( ( ))
!
t i m e > = n
d a y s Col SV
!
time<now-7days
t i c k e t s . c u s t"
c u s t
e r s . *
( ) )
!
a1 = x1 . . . an = xn(
c u s tIndex SV Index SV ticketsHotIndex customersIndex
" "
id,rid price, rid customers ticketsCold
Thanks!
0.2 0.4 0.6 0.8 1 Row Store Column Store Indexed Row Store Indexed Column Store Fractured Mirrors Indexed Fractured Mirrors OctopusDB workload time [seconds] Query Costs Update Costs Information Systems Group
Database Landscape
OLTP OLAP
Streaming System Archival System Search EngineAirline Company
Several Applications Evolving Applications ETL style data pipelines Eventual Integration Licensing Cost DBA Cost Maintenance Cost Engineering Cost Integration Cost Hard-coded optimizations Hard-coded data layouts Reporting Cheap Fares Ticket Booking Booking Archives Flight Search