Making Cost-Based Query Optimization Asymmetry-Aware Daniel Bausch, - - PowerPoint PPT Presentation

making cost based query optimization asymmetry aware
SMART_READER_LITE
LIVE PREVIEW

Making Cost-Based Query Optimization Asymmetry-Aware Daniel Bausch, - - PowerPoint PPT Presentation

Making Cost-Based Query Optimization Asymmetry-Aware Daniel Bausch, Ilia Petrov, and Alejandro Buchmann {bausch, petrov, buchmann}@dvs.tu-darmstadt.de 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel


slide-1
SLIDE 1

Making Cost-Based Query Optimization Asymmetry-Aware

Daniel Bausch, Ilia Petrov, and Alejandro Buchmann {bausch, petrov, buchmann}@dvs.tu-darmstadt.de

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 1

slide-2
SLIDE 2

Asymmetry in new storage devices

Writing to Flash memory is slower than reading from it This also applies to emerging non-volatile memories (PCM, etc.) Small writes to random locations on Flash are even more slow Random reads from Flash are only 1

3 slower than sequential reads1

1on Intel X25-E using full command queue

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 2

slide-3
SLIDE 3

Cost-Based Query Optimization

estimation of run-time before real execution (e.g. in PostgreSQL) model comprised of functions like c(seqscan) = cs Rp + qR,0 + (˙ ccpu + ˙ qR) Rt

I/O cost CPU cost

statistics query planner plan A plan B . . . plan Z cost model cost value A cost value B . . . cost value Z “best” plan configuration

  • rder

and select

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 3

slide-4
SLIDE 4

Splitting Parameters

csr sequential page read accesses sequential page accesses cs csw sequential page write accesses crr random page read accesses random page accesses cr crw random page write accesses

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 4

slide-5
SLIDE 5

Cost functions for “pure load” algorithms

cost function kind

  • riginal

replacement sequential scan read only cs csr index scan read only cs csr cr crr bitmap scan read only cs csr cr crr TID scan read only cr crr e materialization write only cs csw re-scan read only cs csr

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 5

slide-6
SLIDE 6

Cost function of sort algorithm

cio(sort) = 2 Sp ⌈logmn⌉

3

4cs + 1 4cr

  • startup

blktrace stats write read s r s r external sort of unordered data external sort of ordered data sort-merge join cio(sort)rw = Sp

  • csw +
  • ⌈logmn⌉ − 1

csw+crw

2

+ ⌈logmn⌉ csr+crr

2

  • startup

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 6

slide-7
SLIDE 7

Cost function of hash join

cio(hashjoin) = Pip cs +

  • Pip + 2 Pop
  • cs

startup

blktrace stats write read s r s r hash join cio(hashjoin)rw = Pip crw + Pip csr + Pop (crw + csr)

startup

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 7

slide-8
SLIDE 8

System and Load

C L N O P PS R S iCn iLcd iLo iLoq iLp iLrd iLsd iLs iLsp iNr iOc iOod iPSp iPSs iSn iC iN iO iP iPS iR iS RAM e sb w 1000 2000 3000 4000 MB

  • bj

system with tight memory configuration

⇒ simulate data-intensive systems under high load

application class benchmark based on TPC-H specs

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 8

slide-9
SLIDE 9

Calibration

statistics query planner plan A plan B . . . plan Z cost model cost value A cost value B . . . cost value Z “best” plan configuration calibration calibration cache

  • rder

and select cost value, execution time based on simulated annealing[Kirk1983], accepts inferior configuration at decreasing probability modify by multiplication with logarithmic normally distributed random variable cooling cycle of 100 iterations, restarted 100 times

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 9

slide-10
SLIDE 10

Found “Optimal” Settings

SSD HDD

  • riginal model

asymmetric model

  • riginal model

asymmetric model cs = 1.00000 csr = 1.00000 cs = 1.00000 csr = 1.00000 csw = 49.91840 csw = 110.21139 cr = 6.77405 crr = 5.62724 cr = 29.04790 crr = 19.25494 crw = 19.08421 crw = 20.18467 ˙ ccpu = 0.00121 ˙ ccpu = 0.00003 ˙ ccpu = 0.00280 ˙ ccpu = 0.00082 ˙ ˆ ccpu = 0.03658 ˙ ˆ ccpu = 0.01608 ˙ ˆ ccpu = 0.03718 ˙ ˆ ccpu = 0.00045 cop = 0.00016 cop = 0.00008 cop = 0.00004 cop = 0.00119

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 10

slide-11
SLIDE 11

Comparison

500 1000 1500 2000 2500 TPC−H random seed total time [s] SSD 960637766 2888465430 2988248363 109994110 2090918519 80136562 742878636 3137579638 3972888868 47261513 644961076 1231071980 3962584638 4133947234 1671380912 747421973 4204648090 3759550337 1960789542 3615766274 1073327142 970312735 2473950213 1712708155 3744046782

  • rig. model / initial config
  • rig. model / calibrated config
  • asym. model / initial config
  • asym. model / calibrated config

5000 10000 15000 20000 25000 30000 TPC−H random seed 960637766 2888465430 2988248363 109994110 2090918519 80136562 742878636 3137579638 3972888868 47261513 HDD

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 11

slide-12
SLIDE 12

Individual Query Speed-Up

01 02 03 04 05 06 07 08 09 10 11 12 13 14 16 17 18 19 20 21 22 tot

speed−up

TPC−H query faster with asym. model on SSD faster with orig. model on SSD faster with asym. model on HDD faster with orig. model on HDD

−1% 0% 5% 50% 10% 5% 36% 2% 22% 2% 4% 3% 7% −4% 7% 5% 14% −17% −1% 0% −1% 0% 16% 2% 2% 3% 2% 2% −1% −2% −1% −1% 140% −10% −4% 3% −0% −1% 549% 2609% 4% −74% 48% 153%

−100% 0% 100% 200%

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 12

slide-13
SLIDE 13

Discussion – Cause for Peaks ⇒ Query 9

100 200 300 400 500 600 700 seed time of query 9 [s] 960637766 2888465430 2988248363 109994110 2090918519 80136562 742878636 3137579638 3972888868 47261513 644961076 1231071980 3962584638 4133947234 1671380912 747421973 4204648090 3759550337 1960789542 3615766274 1073327142 970312735 2473950213 1712708155 3744046782

  • rig_default
  • rig_pgcalib

acm_default acm_pgcalib

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 13

slide-14
SLIDE 14

Discussion – Query 9 – the slow plan

Nested Loop 509541.375 Nested Loop 102642.716 Index Scan using pk_orders on orders

  • rders.o_orderkey=lineitem.l_orderkey

0.209*1934051=404216.659 Merge Join partsupp.ps_partkey=lineitem.l_partkey Join Filter: lineitem.l_suppkey=partsupp.ps_suppkey 88687.896 Index Scan using pk_supplier on supplier supplier.s_suppkey=lineitem.l_suppkey 0.006*1934051=11604.306 Index Scan using i_ps_partkey on partsupp 2986.769 Materialize 79808.663 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 14

slide-15
SLIDE 15

Discussion – Query 9 – the fast plan

Nested Loop 68231.426 Hash Join lineitem.l_suppkey=supplier.s_suppkey AND lineitem.l_partkey=partsupp.ps_partkey 54756.688 Index Scan using pk_orders on orders

  • rders.o_orderkey=lineitem.l_orderkey

0.008*1468202=11745.616 Seq Scan

  • n lineitem

13467.469 Hash (1 Batch) 6460.738 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 15

slide-16
SLIDE 16

Discussion – Biggest Improvement ⇒ Query 21

100 200 300 400 seed time of query 21 [s] 960637766 2888465430 2988248363 109994110 2090918519 80136562 742878636 3137579638 3972888868 47261513 644961076 1231071980 3962584638 4133947234 1671380912 747421973 4204648090 3759550337 1960789542 3615766274 1073327142 970312735 2473950213 1712708155 3744046782

  • rig_default
  • rig_pgcalib

acm_default acm_pgcalib

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 16

slide-17
SLIDE 17

Discussion – Query 21 – the slow plan

Nested Loop Anti Join Join Filter: l3.l_suppkey=l1.l_suppkey 327228.606 Hash Semi Join l1.l_orderkey=l2.l_orderkey Filter: l2.l_suppkey<>l1.l_suppkey 63120.425 Index Scan using i_l_orderkey on l3 l3.l_orderkey=l1.l_orderkey Filter: l3.l_receipdate>l3.l_commitdate 0.362*726869=263126.578 Hash Join l1.l_suppkey=supplier.s_suppkey 22225.914 Hash (128 Batches, 9271kB) 32415.030 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 17

slide-18
SLIDE 18

Discussion – Query 21 – the fast plan

Nested Loop Anti Join Join Filter: l3.l_suppkey<>l1.l_suppkey 62690.508 Merge Semi Join

  • rders.o_orderkey=l2.l_orderkey

Join Filter: l2.l_suppkey<>l1.l_suppkey 59574.038 Index Scan using i_l_orderkey on l3 l3.l_orderkey=l1.l_orderkey Filter: l3.l_receiptdate>l3.l_commitdate 0.008*350514=2804.112 Merge Join

  • rders.o_orderkey=l1.l_orderkey

31968.016 Index Scan using i_l_orderkey on l2 20118.627 Index Scan using pk_orders on orders Filter: o_orderstatus='F' 5666.513 Materialize 24770.922 2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 18

slide-19
SLIDE 19

Conclusion

Derived an asymmetry-aware model from facts and observable behavior

Storage properties Algorithm access behavior

Comparison shows improved performance on application-class benchmark

Average speed-up at 48% Individual query speed-up by up to 549%

Calibration is attracted by strong effects

TPC-H features properties not respected in PostgreSQL ’s optimizer Available degrees of freedom may got abused to compensate deficiencies

Additional experiments are required to decide upon original hypothesis

Using a special load (i.e. a custom micro-benchmark) Focussing on optimization problems in which asymmetry matters explicitely

Calibration may be a useful tool find optimal configurations in general

Typical queries need to be combined in a repeatable benchmark Can cope with any number of variables

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 19

slide-20
SLIDE 20

Bibliography

[Knut1973] Donald E. Knuth. The Art of Computer Programming, Volume III: Sorting and Searching. Addison-Wesley, 1973. [Kirk1983]

  • S. Kirkpatrick, C. D. Gelatt, and M. P

. Vecchi. Optimization by Simulated Annealing. In Science, vol. 220, no. 4598, pages 671–680, 1983. [YoBC2009] Terry Yoshii, Christian Black, and Sudip Chahal. Solid-state drives in the enterprise: A proof of concept. Intel white paper, Intel Corporation, March 2009.

2012-05-21 | Computer Science Department | Databases and Distributed Systems Group | Daniel Bausch, Ilia Petrov, Alejandro Buchmann | 20