Data Systems that are Easy to Design, Tune and Use, by Stratos Idreos



slide-1
SLIDE 1

Data Systems that are Easy to Design, Tune and Use

Stratos Idreos

slide-2
SLIDE 2

Stratos Idreos

data system kernel

data data data

algorithms/operators

applications api/sql

cpu memory hierarchy

slide-3
SLIDE 3

Stratos Idreos

it all starts with how we store data every bit matters

design space

slide-4
SLIDE 4

Stratos Idreos

it all starts with how we store data every bit matters

design space no fixed decisions from static to dynamic designs

slide-5
SLIDE 5

Stratos Idreos

today

slide-6
SLIDE 6

Stratos Idreos

today tomorrow

slide-7
SLIDE 7

Stratos Idreos

soon everyone will need to be a “data scientist”

hmm, my data is too big :(

slide-8
SLIDE 8

Stratos Idreos

not always sure what we are looking for (until we find it)

data exploration

data has always been big

volume, velocity, variety, veracity

slide-9
SLIDE 9

Stratos Idreos

years daily data

[IBMbigdata]

slide-10
SLIDE 10

Stratos Idreos

years daily data

[IBMbigdata]

years daily data

[StratosGuess]

data* skills

data system design, set-up, tune, use

slide-11
SLIDE 11

Stratos Idreos

data systems that are easy to: design & build (years)

slide-12
SLIDE 12

Stratos Idreos

data systems that are easy to: design & build (years), set-up & tune (months)

slide-13
SLIDE 13

Stratos Idreos

data systems that are easy to: design & build (years), set-up & tune (months), use (hours/days)

slide-14
SLIDE 14

Stratos Idreos

timeline storage indexing query

expert users - idle time - workload knowledge

too many preparation options lead to complex installation

schema load

slide-15
SLIDE 15

Stratos Idreos

users/applications declarative interface

ask what you want

db system DBA

slide-16
SLIDE 16

Stratos Idreos

users/applications
data system 1, data system 2, …
DBA1, DBA2
need to choose the proper system & workloads/applications change rapidly

slide-17
SLIDE 17

Stratos Idreos

storage indexing query schema load

X X X

be able to query the data immediately & with good performance

slide-18
SLIDE 18

Stratos Idreos

storage indexing query schema load

X X X

be able to query the data immediately & with good performance

raw data

knowledge

slide-19
SLIDE 19

Stratos Idreos

tune = create proper indices offline
performance 10-100X

load indexing query

indexing

storage

slide-20
SLIDE 20

Stratos Idreos

tune = create proper indices offline
performance 10-100X

but it depends on workload!

which indices to build?
on which data parts?
and when to build them?

load indexing query

indexing

storage

slide-21
SLIDE 21

Stratos Idreos

query load indexing storage

slide-22
SLIDE 22

Stratos Idreos

timeline query load indexing storage

slide-23
SLIDE 23

Stratos Idreos

timeline

sample workload

query load indexing storage

slide-24
SLIDE 24

Stratos Idreos

timeline

sample workload analyze

query load indexing storage

slide-25
SLIDE 25

Stratos Idreos

timeline

sample workload analyze create indices

query load indexing storage

slide-26
SLIDE 26

Stratos Idreos

timeline

sample workload analyze create indices query

query load indexing storage

slide-27
SLIDE 27

Stratos Idreos

timeline

sample workload analyze create indices query

complex and time consuming process

query load indexing storage

slide-28
SLIDE 28

Stratos Idreos

timeline

sample workload analyze create indices query

complex and time consuming process

query load indexing storage

human administrators + auto-tuning tools

slide-29
SLIDE 29

Stratos Idreos

what can go wrong?

not enough idle time to finish proper tuning
by the time we finish tuning, the workload changes

big data V’s

volume, velocity, variety, veracity

not enough space to index all data
not enough money - energy - resources

slide-31
SLIDE 31

Stratos Idreos

database cracking

slide-32
SLIDE 32

Stratos Idreos

database cracking

idle time

workload knowledge external tools

human control

slide-33
SLIDE 33

Stratos Idreos

database cracking

auto-tuning database kernels

incremental, adaptive, partial indexing

idle time

workload knowledge external tools

human control

slide-34
SLIDE 34

Stratos Idreos

database cracking

auto-tuning database kernels

incremental, adaptive, partial indexing
indexing initialization querying

idle time

workload knowledge external tools

human control

slide-36
SLIDE 36

Stratos Idreos

database cracking

auto-tuning database kernels

incremental, adaptive, partial indexing
indexing initialization querying

idle time

workload knowledge external tools

human control

every query is treated as advice on how data should be stored

slide-37
SLIDE 37

Stratos Idreos

A B C D

... ...

relation1/table1

column-store database: a fixed-width and dense array per attribute

Database Cracking CIDR 2007

slide-39
SLIDE 39

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

Database Cracking CIDR 2007

slide-41
SLIDE 41

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

sort

1 2 3 4 6 7 8 9 11 12 13 14 16 19

Database Cracking CIDR 2007

slide-42
SLIDE 42

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

sort

1 2 3 4 6 7 8 9 11 12 13 14 16 19

binary search

Database Cracking CIDR 2007

slide-43
SLIDE 43

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

sort

1 2 3 4 6 7 8 9 11 12 13 14 16 19

binary search

result

Database Cracking CIDR 2007

slide-44
SLIDE 44

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

sort

1 2 3 4 6 7 8 9 11 12 13 14 16 19

binary search

result

time + knowledge

Database Cracking CIDR 2007
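
The sort-then-binary-search approach on this slide can be sketched as follows. This is a minimal illustration in Python, not code from the talk: the function names `build_full_index` and `range_select` are hypothetical, and a sorted copy stands in for a real index structure.

```python
from bisect import bisect_left, bisect_right

# Column A exactly as shown on the slide.
column_a = [13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6]


def build_full_index(values):
    """Sort a copy of the whole column up front (the offline indexing step)."""
    return sorted(values)


def range_select(sorted_values, low, high):
    """Return all v with low < v < high using two binary searches."""
    start = bisect_right(sorted_values, low)  # first position with v > low
    end = bisect_left(sorted_values, high)    # first position with v >= high
    return sorted_values[start:end]


index = build_full_index(column_a)
q1_result = range_select(index, 10, 14)       # Q1: R.A > 10 and R.A < 14
print(q1_result)
```

The whole column has to be sorted before the first query can benefit, which is the "time + knowledge" cost the slide points at.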

slide-45
SLIDE 45

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

Database Cracking CIDR 2007

slide-46
SLIDE 46

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

cracked column A: 4 9 2 7 1 3 8 6 | 13 12 11 | 16 19 14
piece1: A<=10

Database Cracking CIDR 2007

slide-47
SLIDE 47

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

cracked column A: 4 9 2 7 1 3 8 6 | 13 12 11 | 16 19 14
piece1: A<=10
piece2: 10<A<14

Database Cracking CIDR 2007

slide-48
SLIDE 48

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

cracked column A: 4 9 2 7 1 3 8 6 | 13 12 11 | 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14

Database Cracking CIDR 2007

slide-49
SLIDE 49

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

cracked column A: 4 9 2 7 1 3 8 6 | 13 12 11 | 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14
result: piece2

Database Cracking CIDR 2007

slide-50
SLIDE 50

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

cracked column A: 4 9 2 7 1 3 8 6 | 13 12 11 | 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14
result: piece2

gain knowledge on how data is organized

Database Cracking CIDR 2007

slide-51
SLIDE 51

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

cracked column A: 4 9 2 7 1 3 8 6 | 13 12 11 | 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14

dynamically/on-the-fly within the select-operator

result

gain knowledge on how data is organized

Database Cracking CIDR 2007
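
The cracking step for Q1 can be sketched as a single three-way partitioning pass over the column, done as part of answering the query. This is a simplified illustration, assuming a plain Python list as the cracker column. `crack_in_three` is a hypothetical name, and the order of values inside each piece may differ from the order drawn on the slide.

```python
def crack_in_three(column, low, high):
    """Reorganize `column` in place into three pieces:
        v <= low | low < v < high | v >= high
    Returns (start, end) so that column[start:end] is the middle piece."""
    lt, i, gt = 0, 0, len(column) - 1
    while i <= gt:
        v = column[i]
        if v <= low:
            # Belongs to piece1: swap it to the front.
            column[lt], column[i] = column[i], column[lt]
            lt += 1
            i += 1
        elif v >= high:
            # Belongs to piece3: swap it to the back and re-check position i.
            column[i], column[gt] = column[gt], column[i]
            gt -= 1
        else:
            # Belongs to piece2: leave it in the middle.
            i += 1
    return lt, gt + 1


column_a = [13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6]
start, end = crack_in_three(column_a, 10, 14)   # Q1: R.A > 10 and R.A < 14
q1_result = column_a[start:end]
print(column_a[:start], q1_result, column_a[end:])
```

Unlike the full sort, nothing inside a piece is ordered. The query pays for one pass, gets its answer as a contiguous slice, and leaves behind the piece boundaries as knowledge for later queries.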

slide-52
SLIDE 52

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

cracked column A (after Q1): 4 9 2 7 1 3 8 6 | 13 12 11 | 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14

Q2: select R.A from R where R.A>7 and R.A<=16

dynamically/on-the-fly within the select-operator

Database Cracking CIDR 2007

slide-54
SLIDE 54

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

cracked column A (after Q1): 4 9 2 7 1 3 8 6 | 13 12 11 | 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14

cracked column A (after Q2): 4 2 1 3 6 7 | 9 8 | 13 12 11 | 14 16 | 19
piece1: A<=7
piece2: 7<A<=10

Q2: select R.A from R where R.A>7 and R.A<=16

dynamically/on-the-fly within the select-operator

Database Cracking CIDR 2007

slide-55
SLIDE 55

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

cracked column A (after Q1): 4 9 2 7 1 3 8 6 | 13 12 11 | 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14

cracked column A (after Q2): 4 2 1 3 6 7 | 9 8 | 13 12 11 | 14 16 | 19
piece1: A<=7
piece2: 7<A<=10
piece3: 10<A<14

Q2: select R.A from R where R.A>7 and R.A<=16

dynamically/on-the-fly within the select-operator

Database Cracking CIDR 2007

slide-56
SLIDE 56

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

cracked column A (after Q1): 4 9 2 7 1 3 8 6 | 13 12 11 | 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14

cracked column A (after Q2): 4 2 1 3 6 7 | 9 8 | 13 12 11 | 14 16 | 19
piece1: A<=7
piece2: 7<A<=10
piece3: 10<A<14
piece4: 14<=A<=16
piece5: A>16

Q2: select R.A from R where R.A>7 and R.A<=16

dynamically/on-the-fly within the select-operator

Database Cracking CIDR 2007

slide-57
SLIDE 57

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

cracked column A (after Q1): 4 9 2 7 1 3 8 6 | 13 12 11 | 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14

cracked column A (after Q2): 4 2 1 3 6 7 | 9 8 | 13 12 11 | 14 16 | 19
piece1: A<=7
piece2: 7<A<=10
piece3: 10<A<14
piece4: 14<=A<=16
piece5: A>16

Q2: select R.A from R where R.A>7 and R.A<=16

dynamically/on-the-fly within the select-operator

result

Database Cracking CIDR 2007

slide-58
SLIDE 58

Stratos Idreos

13 16 4 9 2 12 7 1 19 3 14 11 8 6

column A

Q1: select R.A from R where R.A>10 and R.A<14

cracked column A (after Q1): 4 9 2 7 1 3 8 6 | 13 12 11 | 16 19 14
piece1: A<=10
piece2: 10<A<14
piece3: A>=14

cracked column A (after Q2): 4 2 1 3 6 7 | 9 8 | 13 12 11 | 14 16 | 19
piece1: A<=7
piece2: 7<A<=10
piece3: 10<A<14
piece4: 14<=A<=16
piece5: A>16

Q2: select R.A from R where R.A>7 and R.A<=16

dynamically/on-the-fly within the select-operator

result

the more we crack, the more we learn

Database Cracking CIDR 2007
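
The Q1-then-Q2 sequence can be sketched with a small structure that remembers piece boundaries between queries. This is an illustrative sketch under stated assumptions, not the MonetDB implementation: the class `CrackerColumn` and its fields are hypothetical, and a sorted Python list stands in for the cracker index. Each crack is stored as a key `(pivot, inclusive)` meaning "all values before this position satisfy `v < pivot`" (or `v <= pivot` when `inclusive` is true).

```python
from bisect import bisect_left


class CrackerColumn:
    """Hypothetical cracker column: a copy of the data plus an index of cracks."""

    def __init__(self, values):
        self.data = list(values)  # copy that queries reorganize in place
        self.keys = []            # sorted crack keys: (pivot, inclusive)
        self.positions = []       # positions[i]: first index whose value fails key i

    def _crack(self, pivot, inclusive):
        """Make sure a crack for this predicate exists; return its position."""
        key = (pivot, inclusive)
        i = bisect_left(self.keys, key)
        if i < len(self.keys) and self.keys[i] == key:
            return self.positions[i]  # boundary already known: no data movement
        # Only the single piece that contains the pivot is reorganized.
        lo = self.positions[i - 1] if i > 0 else 0
        hi = self.positions[i] if i < len(self.keys) else len(self.data)
        left, right = lo, hi - 1
        while left <= right:
            v = self.data[left]
            if v < pivot or (inclusive and v == pivot):
                left += 1
            else:
                self.data[left], self.data[right] = self.data[right], self.data[left]
                right -= 1
        self.keys.insert(i, key)
        self.positions.insert(i, left)
        return left

    def select(self, low, high, low_inclusive=False, high_inclusive=False):
        """Answer a range query and refine the column as a side effect."""
        start = self._crack(low, not low_inclusive)
        end = self._crack(high, high_inclusive)
        return self.data[start:end]


col = CrackerColumn([13, 16, 4, 9, 2, 12, 7, 1, 19, 3, 14, 11, 8, 6])
q1 = col.select(10, 14)                       # Q1: A > 10 and A < 14
q2 = col.select(7, 16, high_inclusive=True)   # Q2: A > 7 and A <= 16
print(sorted(q1), sorted(q2), col.positions)
```

After both queries the index holds four boundaries, which correspond to the five pieces listed on the slide. Q2 only reorganized the first and last piece left by Q1; the middle piece (10<A<14) was reused untouched.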

slide-59
SLIDE 59

Stratos Idreos

Database Cracking CIDR 2007

slide-60
SLIDE 60

Stratos Idreos

select [15,55]

Database Cracking CIDR 2007

slide-62
SLIDE 62

Stratos Idreos

10 20 30 40 50 60 select [15,55]

Database Cracking CIDR 2007

slide-63
SLIDE 63

Stratos Idreos

10 20 30 40 50 60 select [15,55] select [15,55]

Database Cracking CIDR 2007

slide-65
SLIDE 65

Stratos Idreos

10 20 30 40 50 60 select [15,55] select [15,55]

pieces become smaller and smaller
touch at most two pieces at a time

Database Cracking CIDR 2007
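
A small simulation can make the "smaller and smaller pieces" point concrete. The sketch below is an assumption-laden toy, not the experiment from the talk: the data sizes, value domain, and the helper names `crack` and `run` are all made up for illustration. It counts how many values each query has to reorganize, given that a range query cracks at most two pieces (one per bound).

```python
import random
from bisect import bisect_left


def crack(data, keys, positions, pivot):
    """Crack the piece containing `pivot` so that values < pivot come first.
    Returns (crack position, number of values in the piece that was touched)."""
    i = bisect_left(keys, pivot)
    if i < len(keys) and keys[i] == pivot:
        return positions[i], 0            # boundary already exists
    lo = positions[i - 1] if i > 0 else 0
    hi = positions[i] if i < len(keys) else len(data)
    left, right = lo, hi - 1
    while left <= right:
        if data[left] < pivot:
            left += 1
        else:
            data[left], data[right] = data[right], data[left]
            right -= 1
    keys.insert(i, pivot)
    positions.insert(i, left)
    return left, hi - lo


def run(num_values=50_000, num_queries=1_000, seed=0):
    """Run random range queries [low, high) and record values touched per query."""
    rng = random.Random(seed)
    data = [rng.randrange(1_000_000) for _ in range(num_values)]
    keys, positions, touched = [], [], []
    for _ in range(num_queries):
        low = rng.randrange(1_000_000)
        high = low + rng.randrange(1, 10_000)
        _, t_low = crack(data, keys, positions, low)     # piece holding the low bound
        _, t_high = crack(data, keys, positions, high)   # piece holding the high bound
        touched.append(t_low + t_high)
    return touched


touched = run()
print("first query touched:", touched[0])
print("mean of last 100 queries:", sum(touched[-100:]) / 100)
```

The first query has to pass over the whole column, while later queries only reorganize the one or two pieces that their bounds fall into, and those pieces keep shrinking as more cracks accumulate.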

slide-66
SLIDE 66

Stratos Idreos

set-up: 100K random selections, random selectivity, random value ranges, in a 10 million integer column

[Plot: response time (secs) vs. query sequence (x1000); series: Scan, Full Index, Crack]

continuous adaptation

Database Cracking CIDR 2007

slide-67
SLIDE 67

Stratos Idreos

set-up: 100K random selections, random selectivity, random value ranges, in a 10 million integer column

almost no initialization overhead

[Plot: response time (secs) vs. query sequence (x1000); series: Scan, Full Index, Crack]

continuous adaptation

Database Cracking CIDR 2007

slide-68
SLIDE 68

Stratos Idreos

set-up: 100K random selections, random selectivity, random value ranges, in a 10 million integer column

almost no initialization overhead
continuous improvement

[Plot: response time (secs) vs. query sequence (x1000); series: Scan, Full Index, Crack]

continuous adaptation

Database Cracking CIDR 2007

slide-70
SLIDE 70

Stratos Idreos

set-up: 10K random selections, selectivity 10%, random value ranges, in a 30 million integer column

[Plot: cumulative average response time (secs) vs. query sequence; series: Scan, Full Index, Crack; labeled values: 0.004, 200]

Database Cracking CIDR 2007

continuous adaptation

slide-72
SLIDE 72

Stratos Idreos

set-up: 10K random selections, selectivity 10%, random value ranges, in a 30 million integer column

10K queries later, Full Index still has not amortized the initialization costs

[Plot: cumulative average response time (secs) vs. query sequence; series: Scan, Full Index, Crack; labeled values: 0.004, 200]

Database Cracking CIDR 2007

continuous adaptation

slide-73
SLIDE 73

Stratos Idreos

A

table1

slide-74
SLIDE 74

Stratos Idreos

A B C D

... ...

table1

slide-75
SLIDE 75

Stratos Idreos

A B C D

... ...

table1

select R.A from R where R.A>10 and R.A<14

slide-76
SLIDE 76

Stratos Idreos

A B C D

... ...

table1

select R.A from R where R.A>10 and R.A<14

select max(R.A),max(R.B),max(S.A),max(S.B)
from R,S
where v1<R.C<v2 and v3<R.D<v4 and v5<R.E<v6
and k1<S.C<k2 and k3<S.D<k4 and k5<S.E<k6
and R.F = S.F

slide-77
SLIDE 77

Stratos Idreos

A B C D

... ...

table1

select R.A from R where R.A>10 and R.A<14

select max(R.A),max(R.B),max(S.A),max(S.B)
from R,S
where v1<R.C<v2 and v3<R.D<v4 and v5<R.E<v6
and k1<S.C<k2 and k3<S.D<k4 and k5<S.E<k6
and R.F = S.F

updates, joins, concurrency control, ...

slide-78
SLIDE 78

cracking databases

updates (SIGMOD07)
>1 columns, storage restrictions (SIGMOD09)
benchmarking (TPCTC10)
concurrency control (PVLDB12)
algorithms (PVLDB11)
basics (CIDR07)
multi-cores (SIGMOD15)
hadoop (Yale/Saarland)
b-trees (HP Labs)
robustness (PVLDB12)
adaptive storage (SIGMOD14)
time-series (SIGMOD14)

slide-79
SLIDE 79

Stratos Idreos

cracking tangram

A B C D

table 1 table 2 as queries arrive... base data

A B C D

slide-80
SLIDE 80

Stratos Idreos

cracking tangram

A B C D

table 1 table 2 as queries arrive...

A B C D

table 1

A B C D

table 2 base data

A B C D

slide-81
SLIDE 81

Stratos Idreos

cracking tangram

A B C D

table 1 table 2 as queries arrive...

partial materialization

A B C D

table 1

A B C D

table 2 base data

A B C D

slide-82
SLIDE 82

Stratos Idreos

cracking tangram

A B C D

table 1 table 2 as queries arrive...

partial materialization, partial indexing

A B C D

table 1

A B C D

table 2 base data

A B C D

slide-83
SLIDE 83

Stratos Idreos

cracking tangram

A B C D

table 1 table 2 as queries arrive...

partial materialization, partial indexing, continuous adaptation

A B C D

table 1

A B C D

table 2 base data

A B C D

slide-84
SLIDE 84

Stratos Idreos

cracking tangram

A B C D

table 1 table 2 as queries arrive...

partial materialization, partial indexing, continuous adaptation, storage adaptation

A B C D

table 1

A B C D

table 2 base data

x x x

A B C D

slide-85
SLIDE 85

Stratos Idreos

cracking tangram

A B C D

table 1 table 2 as queries arrive...

partial materialization, partial indexing, continuous adaptation, storage adaptation

A B C D

table 1

A B C D

table 2 base data

A B C D

slide-86
SLIDE 86

Stratos Idreos

cracking tangram

A B C D

table 1 table 2 as queries arrive...

partial materialization, partial indexing, continuous adaptation, storage adaptation, no tuple reconstruction

A B C D

table 1

A B C D

table 2 base data

A B C D

slide-87
SLIDE 87

Stratos Idreos

cracking tangram

A B C D

table 1 table 2 as queries arrive...

partial materialization, partial indexing, continuous adaptation, storage adaptation, no tuple reconstruction, adaptive alignment

A B C D

table 1

A B C D

table 2 base data

A B C D

slide-89
SLIDE 89

Stratos Idreos

cracking tangram

A B C D

table 1 table 2 as queries arrive...

partial materialization, partial indexing, continuous adaptation, storage adaptation, no tuple reconstruction, adaptive alignment

A B C D

table 1

A B C D

table 2 base data

sort in caches

A B C D

slide-90
SLIDE 90

Stratos Idreos

cracking tangram

A B C D

table 1 table 2 as queries arrive...

partial materialization, partial indexing, continuous adaptation, storage adaptation, no tuple reconstruction, adaptive alignment

A B C D

table 1

A B C D

table 2 base data

sort in caches, crack joins

A B C D

slide-91
SLIDE 91

Stratos Idreos

cracking tangram

A B C D

table 1 table 2 as queries arrive...

partial materialization, partial indexing, continuous adaptation, storage adaptation, no tuple reconstruction, adaptive alignment

A B C D

table 1

A B C D

table 2 base data

sort in caches, crack joins

A B C D

lightweight locking

q1 q2

slide-92
SLIDE 92

Stratos Idreos

cracking tangram

A B C D

table 1 table 2 as queries arrive...

partial materialization, partial indexing, continuous adaptation, storage adaptation, no tuple reconstruction, adaptive alignment

A B C D

table 1

A B C D

table 2 base data

sort in caches, crack joins

A B C D

lightweight locking, stochastic cracking

query random

slide-93
SLIDE 93

Stratos Idreos

[Plot: TPC-H Query 15; response time (milli secs) vs. query sequence; legend: Sel. Crack, Sid. Crack, MonetDB Presorted MySQL Presorted]

Sideways Cracking, SIGMOD 09

slide-94
SLIDE 94

Stratos Idreos

[Plot: TPC-H Query 15; response time (milli secs) vs. query sequence; legend: Sel. Crack, Sid. Crack, MonetDB Presorted MySQL Presorted]

normal MonetDB selection cracking

Sideways Cracking, SIGMOD 09

slide-95
SLIDE 95

Stratos Idreos

[Plot: TPC-H Query 15; response time (milli secs) vs. query sequence; legend: Sel. Crack, Sid. Crack, MonetDB Presorted MySQL Presorted]

preparation cost 3-14 minutes

presorted MonetDB normal MonetDB selection cracking

Sideways Cracking, SIGMOD 09

slide-96
SLIDE 96

Stratos Idreos

[Plot: TPC-H Query 15; response time (milli secs) vs. query sequence; legend: Sel. Crack, Sid. Crack, MonetDB Presorted MySQL Presorted]

preparation cost 3-14 minutes

presorted MonetDB MonetDB with sideways cracking normal MonetDB selection cracking

Sideways Cracking, SIGMOD 09

slide-100
SLIDE 100

Stratos Idreos

cracking on Skyserver (4TB)

(Sloan Digital Sky Survey, www.sdss.org)

cracking answers 160,000 queries while full indexing is still halfway through creating one index

Stochastic Cracking, PVLDB 12

slide-101
SLIDE 101

Stratos Idreos

storage indexes query schema load

X X X

slide-102
SLIDE 102

Stratos Idreos

adaptive loading (NoDB, CIDR11/SIGMOD12)

storage indexes query schema load

X X X

slide-103
SLIDE 103

Stratos Idreos

adaptive storage (H2O, SIGMOD14)
adaptive loading (NoDB, CIDR11/SIGMOD12)

storage indexes query schema load

X X X

slide-104
SLIDE 104

Stratos Idreos

adaptive storage (H2O, SIGMOD14)
adaptive time series indexing (ADS, SIGMOD14)
adaptive loading (NoDB, CIDR11/SIGMOD12)

storage indexes query schema load

X X X

slide-105
SLIDE 105

Stratos Idreos

data systems that are easy to design

(storage, data flow, algorithms, tuning, etc)

slide-106
SLIDE 106

Stratos Idreos

e.g., column-stores:
first ideas in 80s,
first advanced architectures in 90s,
first rather complete designs in early 2000s,
industry adoption 2010+
still no indexing, cost based optimizations, …

slide-107
SLIDE 107

Stratos Idreos

application requirements hardware budget energy profile performance

(hardware and requirements change continuously and rapidly)

conflicting goals, moving target

slide-108
SLIDE 108

Stratos Idreos

data systems design (and research) is kind of an art

slide-109
SLIDE 109

Stratos Idreos

disk memory flash

slide-110
SLIDE 110

Stratos Idreos

data+queries+hardware

data system

self-designing data systems

ACM SIGMOD Blog, June’15

slide-111
SLIDE 111

Stratos Idreos

data+queries+hardware

data system

self-designing data systems

ACM SIGMOD Blog, June’15

easy to design adapt to environment

slide-112
SLIDE 112

Stratos Idreos

adaptivity across architecture borders

row-store, column-store, key-value store, hybrid store

slide-113
SLIDE 113

Stratos Idreos

data systems that are easy to use

dbTouch

show me something interesting

Queriosity DATA

slide-114
SLIDE 114

Stratos Idreos

data systems today

allow us to answer queries fast

data systems tomorrow

should allow us to find fast which queries to ask

db

db explore

slide-115
SLIDE 115

Stratos Idreos

every query is treated as advice on how data should be stored

instead of making fixed decisions

slide-116
SLIDE 116

Stratos Idreos

http://daslab.seas.harvard.edu/

Martin Kersten, Stefan Manegold, Goetz Graefe, Harumi Kuno, Anastasia Ailamaki, Themis Palpanas, Eleni Petraki, Ioannis Alagiannis, Miguel Branco, Renata Borovica, Erietta Liarou, Felix Halim, Ronald Yap, Panos Karras, Kostas Zoumpatianos, Manos Athanassoulis, Lukas Maas, Abdul Wasay, Mike Kester, Dhruv Gupta

slide-117
SLIDE 117

Stratos Idreos

http://daslab.seas.harvard.edu/

thank you!

Martin Kersten, Stefan Manegold, Goetz Graefe, Harumi Kuno, Anastasia Ailamaki, Themis Palpanas, Eleni Petraki, Ioannis Alagiannis, Miguel Branco, Renata Borovica, Erietta Liarou, Felix Halim, Ronald Yap, Panos Karras, Kostas Zoumpatianos, Manos Athanassoulis, Lukas Maas, Abdul Wasay, Mike Kester, Dhruv Gupta