Self-Driving Database Management Systems
@ Andy_Pavlo // 15- 721 // Spring 2019
ADVANCED DATABASE SYSTEMS Self-Driving Database Management - - PowerPoint PPT Presentation
Lect ure # 25 ADVANCED DATABASE SYSTEMS Self-Driving Database Management Systems @ Andy_Pavlo // 15- 721 // Spring 2019 CMU 15-721 (Spring 2019) 2 Autonomous DBMS History Self-Driving DBMSs Learned Components CMU 15-721 (Spring 2019) 3
@ Andy_Pavlo // 15- 721 // Spring 2019
Autonomous DBMS History Self-Driving DBMSs Learned Components
2
M OTIVATIO N
Personnel is ~50% of the TOC of a DBMS. Average DBA Salary (2017): $89,050 The scale and complexity of DBMS installations have surpassed humans.
3
Source: https://www.highbeam.com/doc/1P3- 1149052351.html Source: https://www.bls.gov/oes/current/oes151141.htm
SELF- ADAPTIVE DATABASES (19 70 s- 19 9 0 s)
Index Selection Partitioning / Sharding Keys Data Placement
4
SELF- ADAPTIVE DATABASES (19 70 s- 19 9 0 s)
4
SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%' A.ID A.VAL B.ID B.NAME
+100 +200 +50
Tuning Algorithm Admin
SELF- ADAPTIVE DATABASES (19 70 s- 19 9 0 s)
4
SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%' A.ID A.VAL B.ID B.NAME
+100 +200 +50
Tuning Algorithm Admin
SELF- ADAPTIVE DATABASES (19 70 s- 19 9 0 s)
4
SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%' A.ID A.VAL B.ID B.NAME
+100 +200 +50
Tuning Algorithm Admin
SELF- TUN ING DATABASES (19 9 0 s- 20 0 0 s)
5
SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%' A.ID A.VAL B.ID B.NAME
Tuning Algorithm Admin
AutoAdmin
Optimizer Cost Model
SELF- TUN ING DATABASES (19 9 0 s- 20 0 0 s)
5
SELECT * FROM A JOIN B ON A.ID = B.ID WHERE A.VAL > 123 AND B.NAME LIKE 'XY%' A.ID A.VAL B.ID B.NAME
Tuning Algorithm Admin
AutoAdmin
Optimizer Cost Model
SELF- TUN ING DATABASES (19 9 0 s- 20 0 0 s)
6
75 541 53 291
200 400 600 2000 2004 2008 2012 2016 Number of Knobs
Source: Dana Van Aken
Number of Configuration Knobs Per Release
CLO UD- M AN AGED DATABASES (20 10 S)
Initial Placement Tenant Migration
7
CLO UD- M AN AGED DATABASES (20 10 S)
7
CLO UD- M AN AGED DATABASES (20 10 S)
7
O BSERVATIO N
People have been working on autonomous database systems for 45 years. Why is this previous work insufficient?
8
PREVIO US WO RK
Problem #1: Human Judgements
→ User has to make final decision on whether to apply recommendations.
Problem #2: Reactionary Measures
→ Can only solve previous problems. Cannot anticipate upcoming usage trends / issues.
Problem #3: No Transfer Learning
→ Tunes each DBMS instance in isolation. Cannot apply knowledge learned about one DBMS to another.
9
O BSERVATIO N
Just like there are different levels of autonomy in cars, there are different levels for databases.
→ SAE (J3016) Automation Levels
We need to reason about the autonomous systems to understand their capabilities and limitations.
→ This will help us reason about how much a human needs to be involved in its administration.
10
AUTO N O M O US DBM S TAXO N O MY
Level #0: Manual
11
System only does what humans tell it to do.
AUTO N O M O US DBM S TAXO N O MY
Level #0: Manual Level #1: Assistant
11
Recommendation tools that suggest improvements. Human makes final decisions.
AUTO N O M O US DBM S TAXO N O MY
Level #0: Manual Level #1: Assistant Level #2: Mixed Management
11
DBMS and humans work together to mange the system. Human guides the process.
AUTO N O M O US DBM S TAXO N O MY
Level #0: Manual Level #1: Assistant Level #2: Mixed Management Level #3: Local Optimizations
11
Subsystems can adapt without human guidance. No higher-level coordination.
AUTO N O M O US DBM S TAXO N O MY
Level #0: Manual Level #1: Assistant Level #2: Mixed Management Level #3: Local Optimizations Level #4: Direct Optimizations
11
Human only provides high- level direction + hints. System can identify when it needs to ask humans for help.
AUTO N O M O US DBM S TAXO N O MY
Level #0: Manual Level #1: Assistant Level #2: Mixed Management Level #3: Local Optimizations Level #4: Direct Optimizations Level #5: Self-Driving
11
SELF- DRIVIN G DATABASE
A DBMS that can deploy, configure, and tune itself automatically without any human intervention.
→ Select actions to improve some objective function (e.g., throughput, latency, cost). → Choose when to apply an action. → Learn from these actions and refine future decision making processes.
12
SELF- DRIVING DATABASE MANAGEMENT SYSTEMS
CIDR 2 2017
ARCH ITECTURE OVERVIEW
13
Workload Forecasts SQL Statements Internal Metrics Component Models
Deploy & Observe Search & Planning Modeling
Where to Deploy? How to Deploy? When to Deploy? Why?
Actions
SELF- DRIVIN G EN GIN EERIN G
Environment Observations
→ How the DBMS collects training data.
Action Meta-Data
→ How the DBMS implements and exposes methods for controlling and modifying the system's configuration.
Action Engineering
→ How the DBMS deploys actions either for training or
14
MAKE YOUR DATABASE DREAM OF ELECTRIC SHEEP: ENGINEERING FOR SELF- DRIVING OPERATION
UNPUBLISHED M MANUSCRIPT 2019
EN VIRO N M EN T O BSERVATIO N S
Logical Workload History
→ SQL queries with their execution context. → Need to compress to reduce storage size.
Runtime Metrics
→ Internal measurements about the DBMS's runtime behavior.
Database Contents
→ Succinct representation/encoding of the database tables.
15
SUB- CO M PO N EN T M ETRICS
If the DBMS has sub-components that are tunable, then it must expose separate metrics for those components. Bad Example:
16
SUB- CO M PO N EN T M ETRICS
17
RocksDB Column Family Knobs
SUB- CO M PO N EN T M ETRICS
17
RocksDB Column Family Knobs Column Family Metrics
SUB- CO M PO N EN T M ETRICS
17
RocksDB Column Family Knobs Global Metrics
ACTIO N M ETA- DATA
Configuration Knobs
→ Untunable flags → Value ranges
Dependencies
→ No hidden dependencies → Dynamic actions (i.e., an action creates new actions).
18
UN TUN ABLE KN O BS
Anything that requires a human value judgement should be marked as off-limits to autonomous components.
→ File Paths → Network Addresses → Durability / Isolation Levels
19
KN O B H IN TS
The autonomous components need hints about how to change a knob.
→ Min/max ranges. → Separate knobs to enable/disable a feature. → Non-uniform deltas.
20
KN O B H IN TS
The autonomous components need hints about how to change a knob.
→ Min/max ranges. → Separate knobs to enable/disable a feature. → Non-uniform deltas.
20
1 KB 1 MB 1 GB 1 TB
+10 KB +10 MB +10 GB
ACTIO N EN GIN EERIN G
No Downtime Notifications Replicated Training
21
N O DOWN TIM E
The DBMS must be able to deploy any action without incurring downtime.
→ Restart vs. Unavailability
Without this, the system has to include the downtime in its cost model estimations.
→ Bad Example: MySQL Log File Size
22
N OTIFICATIO N S
Provide a notification to indicate when an action starts and when it completes.
→ Need to know whether degradation is due to deployment
Harder for changes that can be used before the action completes.
23
REPLICATED TRAIN IN G
ML models need lots of training data. But getting this data is expensive in a DBMS.
→ We don't want to slow down a production DBMS. → Building a simulator for the DBMS is too hard.
Ongoing Research: How to use the DBMS's replicas to explore configurations and train its models.
24
REPLICATED TRAIN IN G
25
Master
Replica Replica
Actions Actions
Self-Driving Manager
REPLICATED TRAIN IN G
25
Master
Replica Replica
Actions Actions Actions Actions
Self-Driving Manager
REPLICATED TRAIN IN G
25
Master
Replica Replica
Actions Actions
Revert Actions Wait Time
Self-Driving Manager
REPLICATED TRAIN IN G
25
Master
Reads Writes Writes Writes
Replica Replica
App Server
Self-Driving Manager
REPLICATED TRAIN IN G
25
Master
Reads Writes Physical Log SQL Statements
Replica Replica
Physical Log SQL Statements
App Server
Self-Driving Manager
REPLICATED TRAIN IN G
25
Master
Reads Writes Physical Log SQL Statements
Replica Replica
Physical Log SQL Statements
App Server
Self-Driving Manager
Component Models Component Models
REPLICATED TRAIN IN G
25
Master
Reads Writes Physical Log SQL Statements Reads Reads
Replica Replica
Physical Log SQL Statements
App Server
Self-Driving Manager
Component Models Component Models
REPLICATED TRAIN IN G
25
Master
Replica Replica
App Server
Actions
Self-Driving Manager
Component Models Component Models
REPLICATED TRAIN IN G
25
Master
Replica Replica
App Server
Actions
Self-Driving Manager
Component Models Component Models
O RACLE SELF- DRIVIN G DBM S
26
September 2017 January 2017
O RACLE SELF- DRIVIN G DBM S
Automatic Patching Automatic Indexing Automatic Recovery Automatic Scaling Automatic Query Tuning
26
September 2017
O RACLE SELF- DRIVIN G DBM S
Automatic Patching Automatic Indexing Automatic Recovery Automatic Scaling Automatic Query Tuning
26
September 2017
Reactionary Measures No Transfer Learning
O RACLE SELF- DRIVIN G DBM S
Automatic Patching Automatic Indexing Automatic Recovery Automatic Scaling Automatic Query Tuning
26
September 2017
Reactionary Measures No Transfer Learning
O BSERVATIO N
There are many places in the DBMS that use human-engineered components to make decisions about the behavior of the system.
→ Optimizer Cost Models → Compression Algorithms → Data Structures → Scheduling Policies
What if the DBMS could "learn" these policies based on the data.
27
LEARN ED CO M PO N EN TS
Replace DBMS components with ML models trained at runtime.
28
PARTIN G TH O UGH TS
True autonomous DBMSs are achievable in the next decade. You should think about how each new feature can be controlled by a machine.
29
N EXT CLASS
SAP HANA Guest Lecture
30