ADVANCED DATABASE SYSTEMS Self-Driving Database Management - - PowerPoint PPT Presentation

advanced database systems
SMART_READER_LITE
LIVE PREVIEW

ADVANCED DATABASE SYSTEMS Self-Driving Database Management - - PowerPoint PPT Presentation

Lect ure # 25 ADVANCED DATABASE SYSTEMS Self-Driving Database Management Systems @ Andy_Pavlo // 15- 721 // Spring 2019 CMU 15-721 (Spring 2019) 2 Autonomous DBMS History Self-Driving DBMSs Learned Components CMU 15-721 (Spring 2019) 3


slide-1
SLIDE 1

Self-Driving Database Management Systems

@ Andy_Pavlo // 15- 721 // Spring 2019

ADVANCED DATABASE SYSTEMS

Lect ure # 25

slide-2
SLIDE 2 CMU 15-721 (Spring 2019)

Autonomous DBMS History Self-Driving DBMSs Learned Components

2

slide-3
SLIDE 3 CMU 15-721 (Spring 2019)

M OTIVATIO N

Personnel is ~50% of the TOC of a DBMS. Average DBA Salary (2017): $89,050 The scale and complexity of DBMS installations have surpassed humans.

3

Source: https://www.highbeam.com/doc/1P3- 1149052351.html Source: https://www.bls.gov/oes/current/oes151141.htm

slide-4
SLIDE 4 CMU 15-721 (Spring 2019)

SELF- ADAPTIVE DATABASES (19 70 s- 19 9 0 s)

Index Selection Partitioning / Sharding Keys Data Placement

4

slide-5
SLIDE 5 CMU 15-721 (Spring 2019)

SELF- ADAPTIVE DATABASES (19 70 s- 19 9 0 s)

4

SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%' A.ID A.VAL B.ID B.NAME

+100 +200 +50

Tuning Algorithm Admin

slide-6
SLIDE 6 CMU 15-721 (Spring 2019)

SELF- ADAPTIVE DATABASES (19 70 s- 19 9 0 s)

4

SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%' A.ID A.VAL B.ID B.NAME

+100 +200 +50

Tuning Algorithm Admin

slide-7
SLIDE 7 CMU 15-721 (Spring 2019)

SELF- ADAPTIVE DATABASES (19 70 s- 19 9 0 s)

4

SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%' A.ID A.VAL B.ID B.NAME

+100 +200 +50

Tuning Algorithm Admin

slide-8
SLIDE 8 CMU 15-721 (Spring 2019)

SELF- TUN ING DATABASES (19 9 0 s- 20 0 0 s)

5

SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%' A.ID A.VAL B.ID B.NAME

Tuning Algorithm Admin

AutoAdmin

Optimizer Cost Model

slide-9
SLIDE 9 CMU 15-721 (Spring 2019)

SELF- TUN ING DATABASES (19 9 0 s- 20 0 0 s)

5

SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%' A.ID A.VAL B.ID B.NAME

Tuning Algorithm Admin

AutoAdmin

Optimizer Cost Model

slide-10
SLIDE 10 CMU 15-721 (Spring 2019)

SELF- TUN ING DATABASES (19 9 0 s- 20 0 0 s)

6

75 541 53 291

200 400 600 2000 2004 2008 2012 2016 Number of Knobs

Source: Dana Van Aken

Number of Configuration Knobs Per Release

slide-11
SLIDE 11 CMU 15-721 (Spring 2019)

CLO UD- M AN AGED DATABASES (20 10 S)

Initial Placement Tenant Migration

7

slide-12
SLIDE 12 CMU 15-721 (Spring 2019)

CLO UD- M AN AGED DATABASES (20 10 S)

7

slide-13
SLIDE 13 CMU 15-721 (Spring 2019)

CLO UD- M AN AGED DATABASES (20 10 S)

7

slide-14
SLIDE 14 CMU 15-721 (Spring 2019)

O BSERVATIO N

People have been working on autonomous database systems for 45 years. Why is this previous work insufficient?

8

slide-15
SLIDE 15 CMU 15-721 (Spring 2019)

PREVIO US WO RK

Problem #1: Human Judgements

→ User has to make final decision on whether to apply recommendations.

Problem #2: Reactionary Measures

→ Can only solve previous problems. Cannot anticipate upcoming usage trends / issues.

Problem #3: No Transfer Learning

→ Tunes each DBMS instance in isolation. Cannot apply knowledge learned about one DBMS to another.

9

slide-16
SLIDE 16 CMU 15-721 (Spring 2019)

O BSERVATIO N

Just like there are different levels of autonomy in cars, there are different levels for databases.

→ SAE (J3016) Automation Levels

We need to reason about the autonomous systems to understand their capabilities and limitations.

→ This will help us reason about how much a human needs to be involved in its administration.

10

slide-17
SLIDE 17 CMU 15-721 (Spring 2019)

AUTO N O M O US DBM S TAXO N O MY

Level #0: Manual

11

System only does what humans tell it to do.

slide-18
SLIDE 18 CMU 15-721 (Spring 2019)

AUTO N O M O US DBM S TAXO N O MY

Level #0: Manual Level #1: Assistant

11

Recommendation tools that suggest improvements. Human makes final decisions.

slide-19
SLIDE 19 CMU 15-721 (Spring 2019)

AUTO N O M O US DBM S TAXO N O MY

Level #0: Manual Level #1: Assistant Level #2: Mixed Management

11

DBMS and humans work together to mange the system. Human guides the process.

slide-20
SLIDE 20 CMU 15-721 (Spring 2019)

AUTO N O M O US DBM S TAXO N O MY

Level #0: Manual Level #1: Assistant Level #2: Mixed Management Level #3: Local Optimizations

11

Subsystems can adapt without human guidance. No higher-level coordination.

slide-21
SLIDE 21 CMU 15-721 (Spring 2019)

AUTO N O M O US DBM S TAXO N O MY

Level #0: Manual Level #1: Assistant Level #2: Mixed Management Level #3: Local Optimizations Level #4: Direct Optimizations

11

Human only provides high- level direction + hints. System can identify when it needs to ask humans for help.

slide-22
SLIDE 22 CMU 15-721 (Spring 2019)

AUTO N O M O US DBM S TAXO N O MY

Level #0: Manual Level #1: Assistant Level #2: Mixed Management Level #3: Local Optimizations Level #4: Direct Optimizations Level #5: Self-Driving

11

slide-23
SLIDE 23 CMU 15-721 (Spring 2019)

SELF- DRIVIN G DATABASE

A DBMS that can deploy, configure, and tune itself automatically without any human intervention.

→ Select actions to improve some objective function (e.g., throughput, latency, cost). → Choose when to apply an action. → Learn from these actions and refine future decision making processes.

12

SELF- DRIVING DATABASE MANAGEMENT SYSTEMS

CIDR 2 2017

slide-24
SLIDE 24 CMU 15-721 (Spring 2019)

ARCH ITECTURE OVERVIEW

13

Workload Forecasts SQL Statements Internal Metrics Component Models

Deploy & Observe Search & Planning Modeling

Where to Deploy? How to Deploy? When to Deploy? Why?

Actions

slide-25
SLIDE 25 CMU 15-721 (Spring 2019)

SELF- DRIVIN G EN GIN EERIN G

Environment Observations

→ How the DBMS collects training data.

Action Meta-Data

→ How the DBMS implements and exposes methods for controlling and modifying the system's configuration.

Action Engineering

→ How the DBMS deploys actions either for training or

  • ptimization.

14

MAKE YOUR DATABASE DREAM OF ELECTRIC SHEEP: ENGINEERING FOR SELF- DRIVING OPERATION

UNPUBLISHED M MANUSCRIPT 2019

slide-26
SLIDE 26 CMU 15-721 (Spring 2019)

EN VIRO N M EN T O BSERVATIO N S

Logical Workload History

→ SQL queries with their execution context. → Need to compress to reduce storage size.

Runtime Metrics

→ Internal measurements about the DBMS's runtime behavior.

Database Contents

→ Succinct representation/encoding of the database tables.

15

slide-27
SLIDE 27 CMU 15-721 (Spring 2019)

SUB- CO M PO N EN T M ETRICS

If the DBMS has sub-components that are tunable, then it must expose separate metrics for those components. Bad Example:

16

slide-28
SLIDE 28 CMU 15-721 (Spring 2019)

SUB- CO M PO N EN T M ETRICS

17

RocksDB Column Family Knobs

slide-29
SLIDE 29 CMU 15-721 (Spring 2019)

SUB- CO M PO N EN T M ETRICS

17

RocksDB Column Family Knobs Column Family Metrics

Missing: Reads Writes

slide-30
SLIDE 30 CMU 15-721 (Spring 2019)

SUB- CO M PO N EN T M ETRICS

17

RocksDB Column Family Knobs Global Metrics

Aggregated Metrics

slide-31
SLIDE 31 CMU 15-721 (Spring 2019)

ACTIO N M ETA- DATA

Configuration Knobs

→ Untunable flags → Value ranges

Dependencies

→ No hidden dependencies → Dynamic actions (i.e., an action creates new actions).

18

slide-32
SLIDE 32 CMU 15-721 (Spring 2019)

UN TUN ABLE KN O BS

Anything that requires a human value judgement should be marked as off-limits to autonomous components.

→ File Paths → Network Addresses → Durability / Isolation Levels

19

slide-33
SLIDE 33 CMU 15-721 (Spring 2019)

KN O B H IN TS

The autonomous components need hints about how to change a knob.

→ Min/max ranges. → Separate knobs to enable/disable a feature. → Non-uniform deltas.

20

slide-34
SLIDE 34 CMU 15-721 (Spring 2019)

KN O B H IN TS

The autonomous components need hints about how to change a knob.

→ Min/max ranges. → Separate knobs to enable/disable a feature. → Non-uniform deltas.

20

1 KB 1 MB 1 GB 1 TB

+10 KB +10 MB +10 GB

slide-35
SLIDE 35 CMU 15-721 (Spring 2019)

ACTIO N EN GIN EERIN G

No Downtime Notifications Replicated Training

21

slide-36
SLIDE 36 CMU 15-721 (Spring 2019)

N O DOWN TIM E

The DBMS must be able to deploy any action without incurring downtime.

→ Restart vs. Unavailability

Without this, the system has to include the downtime in its cost model estimations.

→ Bad Example: MySQL Log File Size

22

slide-37
SLIDE 37 CMU 15-721 (Spring 2019)

N OTIFICATIO N S

Provide a notification to indicate when an action starts and when it completes.

→ Need to know whether degradation is due to deployment

  • r bad decision.

Harder for changes that can be used before the action completes.

23

slide-38
SLIDE 38 CMU 15-721 (Spring 2019)

REPLICATED TRAIN IN G

ML models need lots of training data. But getting this data is expensive in a DBMS.

→ We don't want to slow down a production DBMS. → Building a simulator for the DBMS is too hard.

Ongoing Research: How to use the DBMS's replicas to explore configurations and train its models.

24

slide-39
SLIDE 39 CMU 15-721 (Spring 2019)

REPLICATED TRAIN IN G

25

Master

Replica Replica

Actions Actions

Self-Driving Manager

slide-40
SLIDE 40 CMU 15-721 (Spring 2019)

REPLICATED TRAIN IN G

25

Master

Replica Replica

Actions Actions Actions Actions

Self-Driving Manager

slide-41
SLIDE 41 CMU 15-721 (Spring 2019)

REPLICATED TRAIN IN G

25

Master

Replica Replica

Actions Actions

???

Revert Actions Wait Time

Self-Driving Manager

slide-42
SLIDE 42 CMU 15-721 (Spring 2019)

REPLICATED TRAIN IN G

25

Master

Reads Writes Writes Writes

Replica Replica

App Server

Self-Driving Manager

slide-43
SLIDE 43 CMU 15-721 (Spring 2019)

REPLICATED TRAIN IN G

25

Master

Reads Writes Physical Log SQL Statements

Replica Replica

Physical Log SQL Statements

App Server

??? ???

Self-Driving Manager

slide-44
SLIDE 44 CMU 15-721 (Spring 2019)

REPLICATED TRAIN IN G

25

Master

Reads Writes Physical Log SQL Statements

Replica Replica

Physical Log SQL Statements

App Server

??? ???

Self-Driving Manager

Component Models Component Models

slide-45
SLIDE 45 CMU 15-721 (Spring 2019)

REPLICATED TRAIN IN G

25

Master

Reads Writes Physical Log SQL Statements Reads Reads

Replica Replica

Physical Log SQL Statements

App Server

Self-Driving Manager

Component Models Component Models

slide-46
SLIDE 46 CMU 15-721 (Spring 2019)

REPLICATED TRAIN IN G

25

Master

Replica Replica

App Server

Actions

Self-Driving Manager

Component Models Component Models

slide-47
SLIDE 47 CMU 15-721 (Spring 2019)

REPLICATED TRAIN IN G

25

Master

Replica Replica

App Server

??? ???

Actions

Self-Driving Manager

Component Models Component Models

slide-48
SLIDE 48 CMU 15-721 (Spring 2019)

O RACLE SELF- DRIVIN G DBM S

26

September 2017 January 2017

slide-49
SLIDE 49 CMU 15-721 (Spring 2019)

O RACLE SELF- DRIVIN G DBM S

Automatic Patching Automatic Indexing Automatic Recovery Automatic Scaling Automatic Query Tuning

26

September 2017

slide-50
SLIDE 50 CMU 15-721 (Spring 2019)

O RACLE SELF- DRIVIN G DBM S

Automatic Patching Automatic Indexing Automatic Recovery Automatic Scaling Automatic Query Tuning

26

September 2017

Reactionary Measures No Transfer Learning

slide-51
SLIDE 51 CMU 15-721 (Spring 2019)

O RACLE SELF- DRIVIN G DBM S

Automatic Patching Automatic Indexing Automatic Recovery Automatic Scaling Automatic Query Tuning

26

September 2017

Reactionary Measures No Transfer Learning

slide-52
SLIDE 52 CMU 15-721 (Spring 2019)

O BSERVATIO N

There are many places in the DBMS that use human-engineered components to make decisions about the behavior of the system.

→ Optimizer Cost Models → Compression Algorithms → Data Structures → Scheduling Policies

What if the DBMS could "learn" these policies based on the data.

27

slide-53
SLIDE 53 CMU 15-721 (Spring 2019)

LEARN ED CO M PO N EN TS

Replace DBMS components with ML models trained at runtime.

28

slide-54
SLIDE 54 CMU 15-721 (Spring 2019)

PARTIN G TH O UGH TS

True autonomous DBMSs are achievable in the next decade. You should think about how each new feature can be controlled by a machine.

29

slide-55
SLIDE 55 CMU 15-721 (Spring 2019)

N EXT CLASS

SAP HANA Guest Lecture

30