@andy_pavlo Part #1 Background Part # 2 Engineering Part # 3 - - PowerPoint PPT Presentation

andy pavlo


slide-1
SLIDE 1

@andy_pavlo

slide-2
SLIDE 2

Part #1: Background
Part #2: Engineering
Part #3: Oracle Rant

slide-3
SLIDE 3 AUTONOMOUS DBMSs

SELF-ADAPTIVE DATABASES

1970-1990s Self-Adaptive Databases

→ Index Selection
→ Partitioning / Sharding
→ Data Placement

slide-4
SLIDE 4 AUTONOMOUS DBMSs

SELF-ADAPTIVE DATABASES

1970-1990s Self-Adaptive Databases

SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%'

Admin
slide-5
SLIDE 5 AUTONOMOUS DBMSs

SELF-ADAPTIVE DATABASES

1970-1990s Self-Adaptive Databases

SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%'

A.ID  A.VAL  B.ID  B.NAME

Tuning Algorithm
Admin
slide-6
SLIDE 6 AUTONOMOUS DBMSs

SELF-ADAPTIVE DATABASES

1970-1990s Self-Adaptive Databases

SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%'

A.ID  A.VAL  B.ID  B.NAME

+100  +200  +50

Tuning Algorithm
Admin
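
One way to read the diagram above: a tuning algorithm scores candidate index columns taken from the query and keeps the most beneficial ones. The sketch below is a minimal greedy version of that idea, not the method of any particular system. The function name `pick_indexes`, the per-index storage costs, the budget, and the mapping of scores to columns are all hypothetical (the slide shows only +100, +200, +50 and does not say which column each belongs to).

```python
from typing import Dict, List


def pick_indexes(benefit: Dict[str, int], cost: Dict[str, int], budget: int) -> List[str]:
    """Greedily pick index columns by benefit per unit of storage cost."""
    ranked = sorted(benefit, key=lambda col: benefit[col] / cost[col], reverse=True)
    chosen: List[str] = []
    used = 0
    for col in ranked:
        if benefit[col] <= 0:
            continue  # an index with no estimated benefit is never worth building
        if used + cost[col] <= budget:
            chosen.append(col)
            used += cost[col]
    return chosen


# Hypothetical scores and costs for the four columns referenced by the query.
benefit = {"A.ID": 100, "A.VAL": 200, "B.ID": 50, "B.NAME": 0}
cost = {"A.ID": 10, "A.VAL": 40, "B.ID": 20, "B.NAME": 20}
chosen = pick_indexes(benefit, cost, budget=50)
print(chosen)
```

With these made-up numbers the budget is used up by the two highest-ranked columns, so the remaining candidates are skipped.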
slide-9
SLIDE 9 AUTONOMOUS DBMSs

SELF-ADAPTIVE DATABASES

1970-1990s Self-Adaptive Databases

SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%'

A.ID  A.VAL  B.ID  B.NAME

Tuning Algorithm
Admin

→ Index Selection
→ Partitioning / Sharding
→ Data Placement

slide-10
SLIDE 10 AUTONOMOUS DBMSs

SELF-TUNING DATABASES

1990-2000s Self-Tuning Databases

SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%'

A.ID  A.VAL  B.ID  B.NAME

Tuning Algorithm
Admin

→ Index Selection
→ Partitioning / Sharding
→ Data Placement
slide-11
SLIDE 11 AUTONOMOUS DBMSs

SELF-TUNING DATABASES

1990-2000s Self-Tuning Databases

AutoAdmin

SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%'

A.ID  A.VAL  B.ID  B.NAME

Tuning Algorithm
Optimizer Cost Model
Admin
slide-13
SLIDE 13

AUTONOMOUS DBMSs

SELF-TUNING DATABASES

1990-2000s Self-Tuning Databases

→ Knob Configuration

[Figure: Number of Knobs by year, 2000 to 2016; y-axis ticks at 200, 400, 600; data labels 541 and 291.]

slide-14
SLIDE 14 AUTONOMOUS DBMSs

CLOUD MANAGED DATABASES

2010s Cloud Databases

slide-16
SLIDE 16 AUTONOMOUS DBMSs

CLOUD MANAGED DATABASES

2010s Cloud Databases

→ Initial Placement
→ Tenant Migration

slide-17
SLIDE 17

Why is this previous work insufficient?

slide-18
SLIDE 18 AUTONOMOUS DBMSs

A BRIEF HISTORY

Problem #1: Human Judgements
Problem #2: Reactionary Measures

slide-19
SLIDE 19

What is different this time?

slide-20
SLIDE 20 AUTONOMOUS DATABASES

WHY NOW?

Better hardware.
Better machine learning tools.
Better appreciation for data.

We seek to complete the circle in autonomous databases.
slide-21
SLIDE 21 CARNEGIE MELLON UNIVERSITY

RESEARCH PROJECTS

OtterTune: Existing Systems
Peloton: New System

slide-22
SLIDE 22

OtterTune

Database Tuning-as-a-Service
→ Automatically generate DBMS knob configurations.
→ Reuse data from previous tuning sessions.

ottertune.cs.cmu.edu

Supported Systems

slide-23
SLIDE 23 OTTERTUNE

AUTOMATIC DBMS TUNING SERVICE

TARGET DATABASE

CONTROLLER

COLLECTOR
slide-24
SLIDE 24 OTTERTUNE

AUTOMATIC DBMS TUNING SERVICE

TARGET DATABASE
CONTROLLER
COLLECTOR
TUNING MANAGER
Configuration Recommender
Knob Analyzer
Metric Analyzer
Internal Repository
slide-30
SLIDE 30 OTTERTUNE

AUTOMATIC DBMS TUNING SERVICE

TARGET DATABASE
CONTROLLER
COLLECTOR
TUNING MANAGER
Configuration Recommender
Knob Analyzer
Metric Analyzer
Internal Repository

INSTALL AGENT
slide-31
SLIDE 31 OTTERTUNE

DEMO

Demonstration

Postgres v9.3
TPC-C Benchmark

slide-32
SLIDE 32 OTTERTUNE

TPC-C TUNING

[Figure: two bar charts of Throughput (txn/sec), axis ticks at 250, 500, 750, 1000, comparing Default, RDS, DBA, Scripts, and OtterTune. Bar labels as extracted: 165, 508, 562, 736, 686 (first chart) and 426, 845, 714, 843, 946 (second chart).]

AUTOMATIC DATABASE MANAGEMENT SYSTEM TUNING THROUGH LARGE-SCALE MACHINE LEARNING (SIGMOD 2017)

slide-33
SLIDE 33

Peloton

Self-Driving Database System
→ In-memory DBMS with integrated ML/RL framework.
→ Designed for autonomous operations.

pelotondb.io

slide-34
SLIDE 34 PELOTON

THE SELF-DRIVING DBMS

TARGET DATABASE
WORKLOAD HISTORY

slide-35
SLIDE 35 PELOTON

THE SELF-DRIVING DBMS

TARGET DATABASE
WORKLOAD HISTORY
FORECAST MODELS

slide-36
SLIDE 36 PELOTON

THE SELF-DRIVING DBMS

TARGET DATABASE
WORKLOAD HISTORY
FORECAST MODELS

"THE BRAIN"
Search Tree
ACTION CATALOG

slide-37
SLIDE 37 PELOTON

THE SELF-DRIVING DBMS

TARGET DATABASE
WORKLOAD HISTORY
FORECAST MODELS

"THE BRAIN"
Search Tree
ACTION CATALOG
ACTION SEQUENCE

slide-40
SLIDE 40 PELOTON

THE SELF-DRIVING DBMS

TARGET DATABASE
WORKLOAD HISTORY
FORECAST MODELS

"THE BRAIN"
Search Tree
ACTION CATALOG
ACTION SEQUENCE

? ? ?

slide-41
SLIDE 41 PELOTON

BUS TRACKING APP WITH ONE-HOUR HORIZON

[Figure: Actual vs. Predicted Queries Per Hour, 9-Jan to 17-Jan, y-axis ticks at 15000, 30000, 45000, 60000. Model: Ensemble (LR+RNN).]

QUERY-BASED WORKLOAD FORECASTING FOR SELF-DRIVING DATABASE MANAGEMENT SYSTEMS (SIGMOD 2018)

slide-42
SLIDE 42 PELOTON

ADMISSIONS APP WITH THREE-DAY HORIZON

[Figure: Actual vs. Predicted Queries Per Hour (in millions, y-axis ticks at 5, 10, 15), 26-Nov to 16-Dec. Model: Ensemble (LR+RNN).]

slide-44
SLIDE 44 PELOTON

ADMISSIONS APP WITH THREE-DAY HORIZON

[Figure: two panels of Actual vs. Predicted Queries Per Hour (in millions, y-axis ticks at 5, 10, 15), 26-Nov to 16-Dec. Panel models: Ensemble (LR+RNN) and Hybrid (LR+RNN+KR).]
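
The chart legends name two combined forecasters, Ensemble (LR+RNN) and Hybrid (LR+RNN+KR), but the slides do not say how the models are combined. The sketch below assumes that the ensemble is an equal-weight average of the two forecasts and that the hybrid switches to the KR forecast whenever KR predicts a value well above the ensemble. The function names, the threshold `gamma`, and the sample numbers are hypothetical, and the per-model forecasts are taken as given inputs rather than produced by real models.

```python
from typing import List, Sequence


def ensemble(lr: Sequence[float], rnn: Sequence[float]) -> List[float]:
    """Equal-weight average of two per-interval forecasts (assumed rule)."""
    return [(a + b) / 2 for a, b in zip(lr, rnn)]


def hybrid(lr: Sequence[float], rnn: Sequence[float], kr: Sequence[float],
           gamma: float = 1.5) -> List[float]:
    """Use the KR forecast only where it exceeds the ensemble by a factor of gamma."""
    combined = []
    for avg, spike in zip(ensemble(lr, rnn), kr):
        combined.append(spike if spike > gamma * avg else avg)
    return combined


# Made-up forecasts (queries per hour, in millions) for three intervals.
lr_pred = [4.0, 5.0, 6.0]
rnn_pred = [6.0, 5.0, 4.0]
kr_pred = [5.0, 12.0, 5.5]

print(ensemble(lr_pred, rnn_pred))
print(hybrid(lr_pred, rnn_pred, kr_pred))
```

Only the middle interval switches to the KR value here, because that is the only place where KR predicts a spike above the assumed threshold.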

slide-45
SLIDE 45 OTTERTUNE

DEMO

Let's check on the demo…

slide-46
SLIDE 46

Design Considerations for Autonomous Operation

slide-47
SLIDE 47 AUTONOMOUS DBMS

DESIGN CONSIDERATIONS

Configuration Knobs
Internal Metrics
Action Engineering

slide-48
SLIDE 48 CONFIGURATION KNOBS

UNTUNABLE KNOBS

Anything that requires a human value judgement should be marked as off-limits to autonomous components.

– File Paths
– Network Addresses
– Durability / Isolation Levels
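
A minimal sketch of what "marked as off-limits" could look like in a knob catalog. The `Knob` class, the `tunable` flag, and the knob names are hypothetical illustrations, not taken from any real DBMS.

```python
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Knob:
    name: str
    tunable: bool  # False means the value needs a human judgement


# Hypothetical catalog: one entry per category named on the slide, plus one tunable knob.
CATALOG = [
    Knob("data_directory", tunable=False),    # file path
    Knob("listen_address", tunable=False),    # network address
    Knob("isolation_level", tunable=False),   # durability / isolation level
    Knob("buffer_pool_size", tunable=True),
]


def tunable_knobs(catalog: List[Knob]) -> List[str]:
    """Return only the knob names an autonomous component may change."""
    return [k.name for k in catalog if k.tunable]


print(tunable_knobs(CATALOG))
```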
slide-49
SLIDE 49 CONFIGURATION KNOBS

HOW TO CHANGE

The autonomous components need hints about how to change a knob.

– Min/max ranges.
– Separate knobs to enable/disable a feature.
– Non-uniform deltas.
slide-50
SLIDE 50 CONFIGURATION KNOBS

HOW TO CHANGE

The autonomous components need hints about how to change a knob.

– Min/max ranges.
– Separate knobs to enable/disable a feature.
– Non-uniform deltas.

1 KB 1 MB 1 GB 1 TB

+10 KB +10 MB +10 GB
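
The scale above suggests that the step size should grow with the knob's current magnitude. A minimal sketch of such a hint, assuming binary units and the three step sizes shown on the slide; the class `SizeKnobHint` and its methods are hypothetical names.

```python
from dataclasses import dataclass

KB, MB, GB, TB = 1024, 1024**2, 1024**3, 1024**4


@dataclass(frozen=True)
class SizeKnobHint:
    min_value: int  # smallest value the tuner may set, in bytes
    max_value: int  # largest value the tuner may set, in bytes

    def delta(self, current: int) -> int:
        """Non-uniform delta: larger steps once the value is larger."""
        if current < MB:
            return 10 * KB
        if current < GB:
            return 10 * MB
        return 10 * GB

    def increase(self, current: int) -> int:
        """Take one step up and clamp the result to the allowed range."""
        stepped = current + self.delta(current)
        return min(self.max_value, max(self.min_value, stepped))


hint = SizeKnobHint(min_value=KB, max_value=TB)
print(hint.increase(KB), hint.increase(MB), hint.increase(GB))
```

A uniform +10 KB step would need roughly a hundred thousand moves to get from 1 MB to 1 GB, which is why the delta is tied to the current range.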
slide-52
SLIDE 52 CONFIGURATION KNOBS

HARDWARE RESOURCES

Indicate which knobs are constrained by hardware resources.

– The sum of all buffers cannot exceed the total amount of available memory.

The problem is that sometimes it makes sense to overprovision.
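
The memory constraint can be written as a simple check. The sketch below adds an optional overprovisioning factor to reflect the caveat on the slide; the function name, the factor, and the buffer names and sizes are hypothetical.

```python
from typing import Dict

GB = 1024**3


def fits_memory(buffer_sizes: Dict[str, int], total_memory: int,
                overprovision: float = 1.0) -> bool:
    """True if the buffers fit in memory, optionally allowing overprovisioning."""
    return sum(buffer_sizes.values()) <= total_memory * overprovision


# Made-up buffer configuration on a machine with 8 GB of memory.
buffers = {"buffer_pool": 6 * GB, "log_buffer": 1 * GB, "sort_buffer": 2 * GB}

print(fits_memory(buffers, total_memory=8 * GB))
print(fits_memory(buffers, total_memory=8 * GB, overprovision=1.25))
```

The 9 GB total is rejected under the strict rule and accepted once 25% overprovisioning is allowed.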
slide-53
SLIDE 53 INTERNAL METRICS

HARDWARE INFORMATION

Expose DBMS's hardware capabilities:

– CPU, Memory, Disk, Network

Configuration Recommender
slide-54
SLIDE 54 INTERNAL METRICS

HARDWARE INFORMATION

Expose DBMS's hardware capabilities:

– CPU, Memory, Disk, Network

Otherwise you have to come up with clever ways to approximate this…

Microbenchmark Threads

slide-55
SLIDE 55 INTERNAL METRICS

HARDWARE MICROBENCHMARKS

[Figure: Factor Analysis scatter plot, Factor 1 vs. Factor 2, of instance types from the c3, d2, h1, i2, i3, m3, and r3 families (large through 8xlarge), grouped by 2, 4, 8, 16, and 32 vCPUs.]

slide-56
SLIDE 56 INTERNAL METRICS

SUB-COMPONENTS

If the DBMS has sub-components that are tunable, then it must expose separate metrics for those components.

Bad Example:
slide-57
SLIDE 57 INTERNAL METRICS

SUB-COMPONENTS

RocksDB
Column Family Knobs
Column Family Metrics

Missing: Reads, Writes

slide-58
SLIDE 58 INTERNAL METRICS

SUB-COMPONENTS

RocksDB
Column Family Knobs
Global Metrics

Aggregated Metrics

slide-59
SLIDE 59 ACTION ENGINEERING

NO SHUTDOWN

No action should ever require the DBMS to restart in order for it to take effect. The commercial systems are much better at this than the open-source systems.
slide-60
SLIDE 60 ACTION ENGINEERING

NOTIFICATIONS

Provide a notification callback to indicate when an action starts and when it completes.

Harder for changes that can be used before the action completes.
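
A minimal sketch of the callback idea, assuming a synchronous action. The function `run_action`, the event names, and the extra "failed" event are hypothetical additions for illustration.

```python
from typing import Callable, List, Tuple

# A callback receives the action name and an event string.
Callback = Callable[[str, str], None]


def run_action(name: str, work: Callable[[], None], notify: Callback) -> None:
    """Run an action and report when it starts and when it completes."""
    notify(name, "started")
    try:
        work()
    except Exception:
        notify(name, "failed")  # assumption: failures are reported too
        raise
    notify(name, "completed")


events: List[Tuple[str, str]] = []
run_action("create_index", lambda: None, lambda n, e: events.append((n, e)))
print(events)
```

The harder case noted on the slide, where a change becomes usable before the action finishes, would need intermediate events that this sketch does not model.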
slide-61
SLIDE 61 ACTION ENGINEERING

RESOURCE USAGE

Support executing the same action with different resource usage levels.
slide-62
SLIDE 62 ACTION ENGINEERING

REPLICA EXPLORATION

Allow replica configurations to diverge from each other.

Master
Replicas
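
One possible use of diverging replicas is to try a different configuration on each one and compare the results before changing the master. The sketch below assumes that reading; the `explore` function, the replica names, the knob name, and the stand-in measurement function are all hypothetical.

```python
from typing import Callable, Dict, Tuple

Config = Dict[str, int]


def explore(replica_configs: Dict[str, Config],
            measure: Callable[[Config], float]) -> Tuple[str, Dict[str, float]]:
    """Measure each replica's configuration and return the best replica."""
    results = {name: measure(cfg) for name, cfg in replica_configs.items()}
    best = max(results, key=results.get)
    return best, results


# Stand-in for a real throughput measurement; the formula is made up.
def fake_throughput(cfg: Config) -> float:
    return cfg["buffer_pool_mb"] * 0.5


configs = {
    "replica-1": {"buffer_pool_mb": 256},
    "replica-2": {"buffer_pool_mb": 1024},
}
best, results = explore(configs, fake_throughput)
print(best, results)
```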

slide-67
SLIDE 67

What About Oracle's Self-Driving DBMS?

slide-68
SLIDE 68 ORACLE SELF-DRIVING DBMS

September 2017
January 2017

slide-69
SLIDE 69 ORACLE

SELF-DRIVING DBMS

September 2017

Automatic Indexing
Automatic Recovery
Automatic Scaling
Automatic Query Tuning

slide-70
SLIDE 70 ORACLE

SELF-DRIVING DBMS

September 2017

Automatic Indexing
Automatic Recovery
Automatic Scaling
Automatic Query Tuning

Problem #2: Reactionary Measures

slide-73
SLIDE 73 CONCLUSION

MAIN TAKEAWAYS

True autonomous DBMSs are achievable in the next decade.

You should think about how each new feature can be controlled by a machine.

slide-74
SLIDE 74 OTTERTUNE

DEMO

Demo Results

slide-75
SLIDE 75

END

@andy_pavlo

slide-76
SLIDE 76