APOLLO AUTOMATIC DETECTION AND DIAGNOSIS OF PERFORMANCE REGRESSIONS - - PowerPoint PPT Presentation

apollo
SMART_READER_LITE
LIVE PREVIEW

APOLLO AUTOMATIC DETECTION AND DIAGNOSIS OF PERFORMANCE REGRESSIONS - - PowerPoint PPT Presentation

APOLLO AUTOMATIC DETECTION AND DIAGNOSIS OF PERFORMANCE REGRESSIONS IN DATABASE SYSTEMS Jinho Jung , Hong Hu, Joy Arulraj, Taesoo Kim, Woonhak Kang* * APOLLO Holistic toolchain for debugging DBMS 1 AUTOMATICALLY FIND SQL QUERIES


slide-1
SLIDE 1

APOLLO

AUTOMATIC DETECTION AND DIAGNOSIS OF PERFORMANCE REGRESSIONS IN DATABASE SYSTEMS

Jinho Jung, Hong Hu, Joy Arulraj, Taesoo Kim, Woonhak Kang*

*

slide-2
SLIDE 2

APOLLO

  • Holistic toolchain for debugging DBMS
2

1

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

AUTOMATICALLY FIND SQL QUERIES EXHIBITING PERFORMANCE REGRESSIONS

2 AUTOMATICALLY DIAGNOSE THE ROOT CAUSE OF

PERFORMANCE REGRESSIONS

slide-3
SLIDE 3

MOTIVATION: DBMS COMPLEXITY

3

6.1 26.4 47.7 1.4 4.4 8.7

10 20 30 40 50 60 2000 2010 Present

Release Year

PostgreSQL SQLite

7x increase

Code Size (MB)

JINHO JUNG (JINHO.JUNG@GATECH.EDU)
slide-4
SLIDE 4

MOTIVATION: PERFORMANCE REGRESSIONS

JINHO JUNG (JINHO.JUNG@GATECH.EDU) 4

CHALLENGING TO BUILD SYSTEM WITH PREDICTABLE PERFORMANCE

slide-5
SLIDE 5

MOTIVATION: PERFORMANCE REGRESSIONS

  • Scenario: User upgrades a DBMS installation

▫ Query suddenly takes 10 times longer to execute ▫ Due to unexpected interactions between different components ▫ Refer to this behavior as a performance regression

  • Performance regression can hurt user productivity

▫ Can easily convert an interactive query to an overnight one

JINHO JUNG (JINHO.JUNG@GATECH.EDU) 5

CHALLENGING TO BUILD SYSTEM WITH PREDICTABLE PERFORMANCE

slide-6
SLIDE 6

MOTIVATION: PERFORMANCE REGRESSIONS

6

SELECT R0.S_DIST_06 FROM PUBLIC.STOCK AS R0 WHERE (R0.S_W_ID < CAST(LEAST(0, 1) AS INT8))

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

> 10,000x slowdown

LATEST VERSION OF POSTGRESQL

  • Due to a recent optimizer update

▫ New policy for choosing the scan algorithm ▫ Resulted in over-estimating the number of rows in the table ▫ Earlier version: Fast bitmap scan ▫ Latest version: Slow sequential scan

slide-7
SLIDE 7

MOTIVATION: DETECTING REGRESSIONS

7 SELECT NO FROM ORDER AS R0 WHERE EXISTS ( SELECT CNT FROM SALES AS R1 WHERE EXISTS ( SELECT ID FROM HISTORY AS R2 WHERE (R0.INFO IS NOT NULL));

Query runs slower on latest version

SELECT NO FROM ORDER AS R0 WHERE EXISTS ( SELECT CNT FROM SALES AS R1 WHERE EXISTS ( SELECT ID FROM HISTORY AS R2 WHERE (R0.INFO IS NOT NULL)); JINHO JUNG (JINHO.JUNG@GATECH.EDU)

1 HOW TO DISCOVER QUERIES EXHIBITING REGRESSIONS?

slide-8
SLIDE 8

MOTIVATION: REPORTING REGRESSIONS

8 JINHO JUNG (JINHO.JUNG@GATECH.EDU) SELECT NO FROM ORDER AS R0 WHERE EXISTS ( SELECT CNT FROM SALES AS R1 WHERE EXISTS ( SELECT ID FROM HISTORY AS R2 WHERE (R0.INFO IS NOT NULL));

Query runs slower on latest version

SELECT NO FROM ORDER AS R0 WHERE EXISTS ( SELECT CNT FROM SALES AS R1 WHERE EXISTS ( SELECT ID FROM HISTORY AS R2 WHERE (R0.INFO IS NOT NULL));

2 HOW TO SIMPLIFY QUERIES FOR REPORTING REGRESSION?

slide-9
SLIDE 9

MOTIVATION: DIAGNOSING REGRESSIONS

9 JINHO JUNG (JINHO.JUNG@GATECH.EDU) SELECT NO FROM ORDER AS R0 WHERE EXISTS ( SELECT CNT FROM SALES AS R1 WHERE EXISTS ( SELECT ID FROM HISTORY AS R2 WHERE (R0.INFO IS NOT NULL));

Query runs slower on latest version

SELECT NO FROM ORDER AS R0 WHERE EXISTS ( SELECT CNT FROM SALES AS R1 WHERE EXISTS ( SELECT ID FROM HISTORY AS R2 WHERE (R0.INFO IS NOT NULL));

3 HOW TO DIAGNOSE THE ROOT CAUSE OF THE REGRESSION?

slide-10
SLIDE 10

APOLLO TOOLCHAIN

10

OLD VERSION NEW VERSION

SQLFUZZ SQLMIN SQLDEBUG

APOLLO TOOLCHAIN

BUG REPORTS

  • Query
  • Commit
  • File list
  • Function
JINHO JUNG (JINHO.JUNG@GATECH.EDU)

1 HOW TO DISCOVER QUERIES EXHIBITING REGRESSIONS?

SQLFUZZ: FEEDBACK-DRIVEN FUZZING

slide-11
SLIDE 11

APOLLO TOOLCHAIN

11 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

OLD VERSION NEW VERSION

SQLFUZZ SQLMIN SQLDEBUG

APOLLO TOOLCHAIN

BUG REPORTS

  • Query
  • Commit
  • File list
  • Function

2 HOW TO SIMPLIFY QUERIES FOR REPORTING REGRESSION?

SQLMIN: BI-DIRECTIONAL QUERY REDUCTION ALGORITHMS

slide-12
SLIDE 12

APOLLO TOOLCHAIN

12 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

OLD VERSION NEW VERSION

SQLFUZZ SQLMIN SQLDEBUG

APOLLO TOOLCHAIN

BUG REPORTS

  • Query
  • Commit
  • File list
  • Function

3 HOW TO DIAGNOSE THE ROOT CAUSE OF THE REGRESSION?

SQLDEBUG: STATISTICAL DEBUGGING + COMMIT BISECTION

slide-13
SLIDE 13

TALK OVERVIEW

13

OLD VERSION NEW VERSION

SQLFUZZ SQLMIN SQLDEBUG

APOLLO TOOLCHAIN

BUG REPORTS

  • Query
  • Commit
  • File list
  • Function
JINHO JUNG (JINHO.JUNG@GATECH.EDU)
slide-14
SLIDE 14

#1: SQLFUZZ — DETECTING REGRESSIONS

14

OLD VERSION NEW VERSION

Query Generator Query Executor Bug Validator

SQLFuzz

Random queries Candidate queries Queries exhibiting performance regression Update SQL grammar probability table

1 2 3 JINHO JUNG (JINHO.JUNG@GATECH.EDU)
slide-15
SLIDE 15

#1: SQLFUZZ — DETECTING REGRESSIONS

15

Query Generator

Retrieve schema SQL grammar probability table Valid queries Check complexity Queries for fuzzing

SELECT 0.3 LEFT JOIN 0.3 LIMIT 0.2 CAST 0.2 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

1 QUERY GENERATOR: RANDOM QUERY GENERATION

slide-16
SLIDE 16

#1: SQLFUZZ — DETECTING REGRESSIONS

16

OLD VERSION NEW VERSION Query Executor

Found Regression?

SELECT R0.S_DIST_06 FROM PUBLIC.STOCK AS R0 WHERE (R0.S_W_ID < CAST (LEAST(0, 1) AS INT8))

Update table

CASE LEFT JOIN LIMIT CAST +0.1 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

2 QUERY EXECUTOR: FEEDBACK-DRIVEN FUZZING

SQL grammar probability table

slide-17
SLIDE 17

#1: SQLFUZZ — DETECTING REGRESSIONS

17 1 Non-deterministic behavior? 2 Non-executed plan? 3 Usage of catalog statistics? 4 Enough memory? 5 Limit statement? 6 Query is too complex? 7 …

Developers

Updated filtering rules

Regression Query

Filtering rules

Report

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

3 REGRESSION VALIDATOR: REDUCING FALSE POSITIVES

slide-18
SLIDE 18

TALK OVERVIEW

18

OLD VERSION NEW VERSION

SQLFUZZ SQLMIN SQLDEBUG

APOLLO TOOLCHAIN

BUG REPORTS

  • Query
  • Commit
  • File list
  • Function
JINHO JUNG (JINHO.JUNG@GATECH.EDU)
slide-19
SLIDE 19

#2: SQLMIN — REPORTING REGRESSIONS

  • Bottom-up Query Reduction

▫ Extract valid sub-query

  • Top-down Query Reduction

▫ Iteratively removes unnecessary expressions

19 JINHO JUNG (JINHO.JUNG@GATECH.EDU)
slide-20
SLIDE 20

#2: SQLMIN — REPORTING REGRESSIONS

20

SELECT S1.C2 FROM ( SELECT CASE WHEN EXISTS ( SELECT S0.C0 FROM ORDER AS R1 WHERE ((S0.C0 = 10) AND (S0.C1 IS NULL)) ) THEN S0.C0 END AS C2, FROM ( SELECT R0.I_PRICE AS C0, R0.I_DATA AS C1, (SELECT ID FROM ITEM) AS C2 FROM ITEM AS R0 WHERE R0.PRICE IS NOT NULL OR (R0.PRICE IS NOT S1.C2) LIMIT 1000) AS S0) AS S1;

JINHO JUNG (JINHO.JUNG@GATECH.EDU)
slide-21
SLIDE 21

#2: SQLMIN — REPORTING REGRESSIONS

21

BOTTOM-UP REDUCTION

EXTRACT SUB-QUERY Remove dependencies

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

SELECT S1.C2 FROM ( SELECT CASE WHEN EXISTS ( SELECT S0.C0 FROM ORDER AS R1 WHERE ((S0.C0 = 10) AND (S0.C1 IS NULL)) ) THEN S0.C0 END AS C2, FROM ( SELECT R0.I_PRICE AS C0, R0.I_DATA AS C1, (SELECT ID FROM ITEM) AS C2 FROM ITEM AS R0 WHERE R0.PRICE IS NOT NULL OR (R0.PRICE IS NOT S1.C2) LIMIT 1000) AS S0) AS S1;

slide-22
SLIDE 22

#2: SQLMIN — REPORTING REGRESSIONS

22

Remove condition Remove columns Remove sub-queries Remove clause

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

SELECT S1.C2 FROM ( SELECT CASE WHEN EXISTS ( SELECT S0.C0 FROM ORDER AS R1 WHERE ((S0.C0 = 10) AND (S0.C1 IS NULL)) ) THEN S0.C0 END AS C2, FROM ( SELECT R0.I_PRICE AS C0, R0.I_DATA AS C1, (SELECT ID FROM ITEM) AS C2 FROM ITEM AS R0 WHERE R0.PRICE IS NOT NULL OR (R0.PRICE IS NOT S1.C2) LIMIT 1000) AS S0) AS S1;

TOP-DOWN REDUCTION

REMOVE ELEMENTS

slide-23
SLIDE 23

#2: SQLMIN — REPORTING REGRESSIONS

23 JINHO JUNG (JINHO.JUNG@GATECH.EDU)

SELECT CASE WHEN EXISTS ( SELECT S0.C0 FROM ORDER AS R1 WHERE ((S0.C0 = 10)) ) THEN S0.C0 END AS C2, FROM ( SELECT R0.I_PRICE AS C0, FROM ITEM AS R0 WHERE R0.PRICE IS NOT NULL) AS S0) AS S1;

slide-24
SLIDE 24

TALK OVERVIEW

24

OLD VERSION NEW VERSION

SQLFUZZ SQLMIN SQLDEBUG

APOLLO TOOLCHAIN

BUG REPORTS

  • Query
  • Commit
  • File list
  • Function
JINHO JUNG (JINHO.JUNG@GATECH.EDU)
slide-25
SLIDE 25

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

25 Slow Fast

DBMS

Commit bisection SQLMIN

SQLDEBUG

Regression query

First commit exhibiting regression?

Statistical Debugger

Control-flow Graphs (Traces) Partially Reduced queries

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

BUG REPORTS

  • Query
  • Commit
  • File list
  • Function
slide-26
SLIDE 26

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

26

COMMIT 1 COMMIT 2 COMMIT 3 COMMIT 5

NEW VERSION (SLOW QUERY EXECUTION) OLD VERSION (FAST QUERY EXECUTION)

PROBLEM BEGINS HERE!

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

1 COMMIT BISECTION: FIND EARLIEST PROBLEMATIC COMMIT

slide-27
SLIDE 27

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

27

Partially reduced queries Minimized query Original query

SELECT NO FROM ORDER AS R0 WHERE EXISTS (SELECT CNT FROM SALES AS R1 WHERE EXISTS ( SELECT ID FROM SELECT CNT FROM SALES WHERE CNT > ID JINHO JUNG (JINHO.JUNG@GATECH.EDU)

2 QUERY REDUCTION: PARTIALLY REDUCED QUERIES

Collect set
  • f queries

Ready to use statistical debugging?

slide-28
SLIDE 28

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

28

Functions int func(){ if (cond1) work; } int func(){ if (cond1) work; }

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

OLD VERSION NEW VERSION

3 CONTROL-FLOW GRAPH COMPARISON: ALIGN TRACES

slide-29
SLIDE 29

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

29

Functions Traces int func(){ if (cond1) work; } int func(){ if (cond1) work; }

0x400 0x420  TRUE 0x500 0x520  FALSE

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

OLD VERSION NEW VERSION

3 CONTROL-FLOW GRAPH COMPARISON: ALIGN TRACES

slide-30
SLIDE 30

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

30

Functions Traces int func(){ if (cond1) work; } int func(){ if (cond1) work; }

0x400 0x420  TRUE 0x500 0x520  FALSE

Trace Alignment

func + 0x0 func + 0x20  TRUE func + 0x0 func + 0x20  FALSE

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

OLD VERSION NEW VERSION

3 CONTROL-FLOW GRAPH COMPARISON: ALIGN TRACES

slide-31
SLIDE 31

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

31

Statistical model

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

Fast query execution traces Slow query execution traces

PRED. RESULT 1 TAKEN 2 TAKEN PRED. RESULT 1 TAKEN 2 NOT TAKEN

4 STATISTICAL DEBUGGING: FAST AND SLOW QUERY TRACES

slide-32
SLIDE 32

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

32

Fast query execution traces Slow query execution traces

Final report

RANK FILE FUNCTION LINE 1 foo.c bar() 2 … … … … PRED. RESULT 1 TAKEN 2 TAKEN PRED. RESULT 1 TAKEN 2 NOT TAKEN

Statistical model

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

4 STATISTICAL DEBUGGING: FAST AND SLOW QUERY TRACES

slide-33
SLIDE 33

RECAP

33

OLD VERSION NEW VERSION

SQLFUZZ SQLMIN SQLDEBUG

APOLLO TOOLCHAIN

BUG REPORTS

  • Query
  • Commit
  • File list
  • Function
JINHO JUNG (JINHO.JUNG@GATECH.EDU)
slide-34
SLIDE 34

EVALUATION

  • Tested database systems

▫ PostgreSQL, SQLite

  • Binary instrumentation to get control flow graphs

▫ DynamoRIO instrumentation tool

  • Evaluation

▫ Efficacy of SQLFuzz in detecting regressions? ▫ Efficacy of SQLMin in reducing queries? ▫ Accuracy of SQLDebug in diagnosing regressions?

34 JINHO JUNG (JINHO.JUNG@GATECH.EDU)
slide-35
SLIDE 35

#1: SQLFUZZ — DETECTING REGRESSIONS

35

218 201

50 100 150 200 250

PostgreSQL SQLite

200x performance drop

Mean Performance Drop (Ratio)

Lower is Better

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

Discovered 10 previously unknown, unique performance regressions.

slide-36
SLIDE 36

#1: SQLFUZZ — FALSE POSITIVES

36

99 0.0044

0.001 0.01 0.1 1 10 100

Discovered Queries SQLFuzz

False Positives Queries (Percent)

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

Lower is Better

Filtering rules remove almost all false positives

slide-37
SLIDE 37

#2: SQLMIN — REPORTING REGRESSIONS

37

1602 380

500 1000 1500 2000

Discovered Queries SQLMin

Query Size (Bytes)

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

Lower is Better

Significant reduction in query size

slide-38
SLIDE 38

#3: SQLDEBUG — DIAGNOSING REGRESSIONS

38

5 2 3

FIRST RANKED BRANCH SECOND RANKED BRANCH THIRD RANKED BRANCH

10 regressions

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

Branch related to root cause Correctly identified in all cases (within top-3 ranked branches)

slide-39
SLIDE 39

CASE STUDY #1: OPTIMIZER UPDATE

39

SELECT COUNT (∗) FROM (SELECT R0.ID FROM CUSTOMER AS R0 LEFT JOIN STOCK AS R1 ON (R0.STREET = R1.DIST) WHERE R1.DIST IS NOT NULL) AS S0 WHERE EXISTS (SELECT ID FROM CUSTOMER);

  • Due to a bug fix (for a correctness bug)

▫ Breaks query optimization ▫ Optimizer no longer transforms the LEFT JOIN operator

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

> 1000x slow down

LATEST VERSION OF SQLITE

slide-40
SLIDE 40

CASE STUDY #2: EXECUTION ENGINE UPDATE

40

SELECT R0.ID FROM ORDER AS R0 WHERE EXISTS (SELECT COUNT(∗) FROM (SELECT DISTINCT R0.ENTRY FROM CUSTOMER AS R1 WHERE (FALSE)) AS S1);

  • Hashed aggregation executor update

▫ Resulted in redundantly building hash tables

JINHO JUNG (JINHO.JUNG@GATECH.EDU)

3x slow down

LATEST VERSION OF POSTGRESQL

slide-41
SLIDE 41

CONCLUSION

41 JINHO JUNG (JINHO.JUNG@GATECH.EDU)
  • APOLLO (v1.0)

▫ Toolchain for detecting & diagnosing regressions ▫ Open-sourced: https://github.com/sslab-gatech/apollo

  • Adding support for other types of bugs (v2.0)

▫ Correctness bugs ▫ Performance bugs ▫ Database corruption

slide-42
SLIDE 42

CONCLUSION

42 JINHO JUNG (JINHO.JUNG@GATECH.EDU)
  • Interested in integrating APOLLO with more DBMSs

▫ Discovered > 5 performance regressions in CockroachDB ▫ Improve the toolchain based on developer feedback

  • Automation will help reduce labor of developing DBMSs

▫ Developers get to focus on more important problems

slide-43
SLIDE 43

ACKNOWLEDGEMENTS

JINHO JUNG (JINHO.JUNG@GATECH.EDU) 43

Supported by: Developers:

slide-44
SLIDE 44

END

JINHO.JUNG@GATECH.EDU