Access Paths Renata Borovica-Gajic Stratos Idreos Anastasia - - PowerPoint PPT Presentation

access paths
SMART_READER_LITE
LIVE PREVIEW

Access Paths Renata Borovica-Gajic Stratos Idreos Anastasia - - PowerPoint PPT Presentation

Smooth Scan: Statistics-Oblivious Access Paths Renata Borovica-Gajic Stratos Idreos Anastasia Ailamaki Marcin Zukowski Campbell Fraser Optimizers sensitivity to statistics Setting : TPC-H, SF10, DBMS-X, Tuning tool 5GB space 1000 400


slide-1
SLIDE 1

Smooth Scan: Statistics-Oblivious Access Paths

Renata Borovica-Gajic Stratos Idreos Anastasia Ailamaki Marcin Zukowski Campbell Fraser

slide-2
SLIDE 2

Degradation due to sub-optimal access paths

Optimizers’ sensitivity to statistics

2

0.1 1 10 100 1000

Q1 Q2 Q3 Q4 Q5 Q6 Q7 Q8 Q9 Q10 Q11 Q12 Q13 Q14 Q16 Q18 Q19 Q21 Q22

Normalized exec. time (log) TPC-H Query

Tuned Original

Setting: TPC-H, SF10, DBMS-X, Tuning tool 5GB space 400

slide-3
SLIDE 3

Re-optimization: risky

Access path selection problem

Selectivity Execution time

100% Full Scan

RISK

Index Scan Performance cliff Estimated Actual Full Scan

Statistics: unreliable advisor

3

Re-optimization

[MID’98, POP’04, RIO’05, BOU’14]

slide-4
SLIDE 4

No single path is optimal

Access paths under looking glass

4

...

HEAP PAGES INDEX

Sequential Access + (fast) sequential I/O

  • read everything

Index Access + read what you need

  • random (& repeated) I/O
slide-5
SLIDE 5

Near-optimal throughout entire selectivity range

Quest for robust access paths

5

100% Index Scan

Robust Execution

Selectivity Execution time

Full Scan Full Scan

RISK

slide-6
SLIDE 6

Smooth Scan in a nutshell

  • Statistics-oblivious access path
  • Learn result distribution at run-time
  • Adapt as you go

DESIGN GOALS

  • Avoid performance cliffs & risk
  • Continuous, gradual and smooth adaptation

6

slide-7
SLIDE 7

Adaptivity with Smooth Scan

7

Morph between Index and Sequential Scan

slide-8
SLIDE 8

Morphing mechanism

  • Modes:

1. Index Access: Traditional index access 2. Entire Page Probe: Index access probes entire page 3. Gradual Flattening Access: Probe adjacent region(s)

8

...

HEAP PAGES INDEX

Mode 1 Mode 2 Mode 3

slide-9
SLIDE 9

Morphing policies

  • Policies:

– Greedy – Selectivity Increase Driven – Elastic

Selectivity increase -> Mode Increase SEL_region > SEL_global Selectivity decrease -> Mode Decrease SEL_region < SEL_global

X X X XX X X X X X X X XX

INDEX

XX

SR:1 SR:1 SR:0.5 SR:0.75 SR:1 SR:1 SR:0.5 SG: 0

X: Page with result SR: Region selectivity SG: Global selectivity

1 0.81 0.66 0.7 0.75

Region snooping = Selectivity driven adaptation

HEAP PAGES

9

slide-10
SLIDE 10

Smooth Scan benefits

Index Scan Full Scan Sort Scan Smooth Scan Avoid repeated accesses

  

Fast sequential I/O

  

Avoid full table read

 

Tuples pipelining

 

10

slide-11
SLIDE 11

Experimental setup

11

Hardware:

2 Intel Xeon 6-core CPU @2.8 GHz, 48GB RAM HDD: I/O transfer rate 120 MB/s, Random vs. Sequential ratio = 10

Software:

PostgreSQL 9.2.1: Index Scan, Full Scan, Sort (Bitmap) Scan, Smooth Scan

Workload:

TPC-H: SF 10 Micro-benchmark: 400M tuples, 10 columns random (1 – 105), 25GB Q1: select * from relation where c2 >= 0 and c2< X% [order by c2];

Experimental Condition:

Cold file system cache

slide-12
SLIDE 12

TPC-H with Smooth Scan

Setting: TPC-H, SF10, PostgreSQL with Smooth Scan

Robust execution for all queries

200 400 600 800 1000 1200 1400

Q1 (Sort Scan) Q4 (Full Scan) Q6 (Index Scan) Q7 (Index Scan) Q14 (Index Scan)

Execution time (sec)

PostgreSQL PostgreSQL with Smooth Scan

15% 10x

12

High selectivity Low selectivity

slide-13
SLIDE 13

TPC-H breakdown

200 400 600 800 1000 1200

pSQL pSQL w. Smooth Scan pSQL pSQL w. Smooth Scan pSQL pSQL w. Smooth Scan pSQL pSQL w. Smooth Scan pSQL pSQL w. Smooth Scan Q1 (Sort Scan) Q4 (Full Scan) Q6 (Index Scan) Q7 (Index Scan) Q14 (Index Scan)

Execution time (sec) CPU Utilization I/O Wait time

Smooth Scan significantly decreases I/O wait time

Q1 Q4 Q6 Q7 Q14 pSQL Smooth S. pSQL Smooth S. pSQL Smooth S. pSQL Smooth S. pSQL Smooth S. # I/O Requests (K) 70 77 224 235 566 95 745 124 416 87

13

slide-14
SLIDE 14

Time (sec)

Index Scan

Snooping I/O access

Time (sec) Time (sec) Block address

Sequential Scan Smooth Scan Setting: TPC-H, Q1, Lineitem table, iosnoop tool

Smooth Scan reduces random I/O requests

14

slide-15
SLIDE 15

0.1 1 10 100 1000 10000 100000 0.001 0.01 0.1 1 20 50 75 100

Execution time (sec)

Selectivity(%) Full Scan Index Scan Sort Scan Smooth Scan 0.1 1 10 100 1000 10000 100000 0.001 0.01 0.1 1 20 50 75 100

Execution time (sec)

Selectivity(%) Full Scan Index Scan Sort Scan Smooth Scan

NO ORDER BY ORDER BY

Adaptivity over selectivity range

Setting: Micro-benchmark, Q1 (w. and w/o. order), Selectivity 0-100%

Near-optimal performance throughout entire range

115x

Robust Execution Robust Execution

15

slide-16
SLIDE 16

Conclusions

16

SMOOTH SCAN

  • Statistics-oblivious access path
  • Uses region snooping to morph between alternatives
  • Near-optimal performance for all selectivities

IMPACT

  • Removes access path selection decision
  • Robust execution for all query inputs
slide-17
SLIDE 17

Q & A Thank you!

renata.borovica@epfl.ch stratos@seas.harvard.edu natassa@epfl.ch marcin@snowflake.net campbellf@google.com