Information Systems Group
Relax and Let the Database do the Partitioning Online
Alekh Jindal, Jens Dittrich
VLDB International Workshop on Real-Time Business Intelligence
- presented by Stefan Schuh
September 2, 2011
Thursday, September 1, 2011
Relax and Let the Database do the Partitioning Online Alekh Jindal, - - PowerPoint PPT Presentation
Information Systems Group Relax and Let the Database do the Partitioning Online Alekh Jindal, Jens Dittrich - presented by Stefan Schuh VLDB International Workshop on Real-Time Business Intelligence September 2, 2011 Thursday, September 1,
Information Systems Group
Alekh Jindal, Jens Dittrich
VLDB International Workshop on Real-Time Business Intelligence
September 2, 2011
Thursday, September 1, 2011
2
BI Applications Database Advisory Tools DBA
Thursday, September 1, 2011
3
Thursday, September 1, 2011
4
BI Applications Database Advisory Tools DBA
Thursday, September 1, 2011
5 BI Applications Database
Sub-Problem Proposed Solution Indexing Online Indexing Database Cracking Adaptive Indexing Materialized Views Dynamic Materialized Views Partitioning
Thursday, September 1, 2011
6
Thursday, September 1, 2011
6
Thursday, September 1, 2011
7
Thursday, September 1, 2011
7
Q1 Q2 Q3 Q4 Q5
Window Size = 5
Thursday, September 1, 2011
7
Q1 Q2 Q3 Q4 Q5 Q6
Window Size = 5
Thursday, September 1, 2011
7
Q1 Q2 Q3 Q4 Q5 Q6 Q7
Window Size = 5
Thursday, September 1, 2011
7
Q1 Q2 Q3 Q4 Q5 Q6 Q7
Window Size = 5
.....
Thursday, September 1, 2011
a1 a2 a3 a4 a5 a6
8
Thursday, September 1, 2011
a1 a2 a3 a4 a5 a6
8
a1 a2 a3 a4 a5 a6 a3 a2 a1 a5 a4 a6
(
(
(
(
(
Thursday, September 1, 2011
a1 a2 a3 a4 a5 a6
8
a1 a2 a3 a4 a5 a6
1 0 1 ] [
a1 a2 a3 a4 a5 a6 a3 a2 a1 a5 a4 a6
(
(
(
(
(
Thursday, September 1, 2011
a1 a2 a3 a4 a5 a6
8
a1 a2 a3 a4 a5 a6 1 0 1 ] [
a1 a2 a3 a4 a5 a6 a3 a2 a1 a5 a4 a6
(a1 , a2)
a1 a2 a3 a4 a5 a6
(
(
(
(
(
Thursday, September 1, 2011
a1 a2 a3 a4 a5 a6
8
a1 a2 a3 a4 a5 a6 1 0 1 ] [
scheme P(S,)
a1 a2 a3 a4 a5 a6 a3 a2 a1 a5 a4 a6
(a1 , a2)
a1 a2 a3 a4 a5 a6
(a1 , a2) (a3 , a4, a5) (a6) { }
a1 a2 a3 a4 a5 a6
(
(
(
(
(
Thursday, September 1, 2011
9
a1 a2 a3 a4 a5 a6 r6 r5 r4 r3 r2 r1 a1 a2 a3 a4 a5 a6
Thursday, September 1, 2011
10
scheme P(S,)
S⇥ = argmin
S
Cest.
⇥
ad Wtk
(
(
(
(
(
Thursday, September 1, 2011
11
Step 1: Finding Partitioning Unit Ordering
Thursday, September 1, 2011
12
⇥ ⇥ as M(⇥) = x
i=1
x
j=1 A(ai, aj)[A(ai, aj−1)+A(ai, aj+1)].
) = A(a , a ) = A(a , a ) = A(a , a )=0. For the
PartKey SuppKey Quantity PartKey 8 5 6 SuppKey 5 8 4 Quantity 6 4 9 PartKey Quantity SuppKey PartKey 8 6 5 Quantity 6 9 4 SuppKey 5 4 8 PartKey SuppKey Quantity PartKey 8 5 6 SuppKey 5 8 4 Quantity 6 4 9
e M(⇥) matrix has e M(⇥) matrix has
= 404 = 440
Thursday, September 1, 2011
13
SuppKey PartKey Quantity SuppKey 6 9 6 PartKey 4 6 9 Quantity 9 6 4 PartKey Quantity SuppKey PartKey 8 6 5 Quantity 6 9 4 SuppKey 5 4 8 PartKey Quantity SuppKey PartKey 9 6 6 Quantity 6 9 4 SuppKey 6 4 9
+48
PartKey Quantity SuppKey PartKey 9 6 6 Quantity 6 9 4 SuppKey 6 4 9
(PartKey, SuppKey)
Thursday, September 1, 2011
14
Step 2: Enumerating Split Vectors
Thursday, September 1, 2011
15
Technique 1: prune non-referenced partitioning units Technique 2: consider split vectors greedily Technique 3: save previous best split vectors using
Step 2: Enumerating Split Vectors
Thursday, September 1, 2011
16
Thursday, September 1, 2011
17
Thursday, September 1, 2011
18
Best Best
Thursday, September 1, 2011
19
Query Cost Analysis Cost Query Cost Analysis Cost
Thursday, September 1, 2011
20
Thursday, September 1, 2011
21
Thursday, September 1, 2011
22
Customer Lineitem Optimal Navathe O2P Optimal Navathe O2P Quality
100% 99.29% 92.76% 100% 97.45% 95.80%
Iterations
100% 14.60% 2.28% 100% 2.42% 0.14%
2
Thursday, September 1, 2011
23
0.001 0.01 0.1 1 10 100 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Analysis Time (sec) Fraction of OLAP against OLTP queries NV/HC O2Pp O2Ppg O2pgd
Thursday, September 1, 2011
24
17.8978616
10 20 30 40 50 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1
Workload Execution Time (sec) Fraction of OLAP against OLTP queries No Partitioning Full Vertical Partitioning AutoStore (O2Ppgd) AutoStore (O2Ppgdm) AutoStore (O2Ppgda)
* Constant-Time Query Processing, V. Raman et.al., ICDE 2008
Thursday, September 1, 2011
25 25.5781685
750 1500 2250 3000
No Partitioning Full Vertical Partitioning AutoStore
22.818 1324.88 2113.447 2290.16
Execution Time Analysis Time
Cumulative Workload Execution Time (sec)
Thursday, September 1, 2011
26
process
Thursday, September 1, 2011
27
Motivation: Online Physical Database Design
5 BI Applications Database Sub-Problem Proposed Solution Indexing Online Indexing Database Cracking Adaptive Indexing Materialized Views Dynamic Materialized Views PartitioningWE!
Partitioning Problem: What to Analyze?
10Find , such that:
set Pu
ctor S
artition pm,r(S,)
scheme P(S,)
S⇥ = argmin
SCest.
⇥
S⇥
(
ad Wtk
set Pu
(
Online Partitioning Unit Ordering
set Pu set Pu
(PartKey, SuppKey)How to Analyze the Workload?
15(brute force)
Partitioning (O2P) Algorithm
Step 2: Enumerating Split Vectors
How much is O2P Faster?
Setup: TPC-H Lineitem table, 10,000 queries in total
23 0.001 0.01 0.1 1 10 100 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1Analysis Time (sec) Fraction of OLAP against OLTP queries NV/HC O2Pp O2Ppg O2pgd
Can such a System Adapt to Changing Workload ?
Setup: Universal relation de-normalized from TPC-H schema, 11 attributes, SF 1
24 10 20 30 40 50 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 Workload Execution Time (sec) Fraction of OLAP against OLTP queries No Partitioning Full Vertical Partitioning AutoStore (O2Ppgd) AutoStore (O2Ppgdm) AutoStore (O2Ppgda)Thursday, September 1, 2011