Reducing the Storage Overhead of Main-Memory OLTP Databases with
Hybrid Indexes
Huanchen Zhang
David G. Andersen, Andrew Pavlo, Michael Kaminsky, Lin Ma, Rui Shen
PARALLEL DATA LABORATORY
Carnegie Mellon University
[SIGMOD’16]
You are running out of memory
Buy more
[Figure: memory usage broken down into disk tuples, in-memory tuples, and indexes; memory limit = 5GB]
Benchmark % space for index
TPC-C on
[Diagram: a dynamic stage and a static stage. Writes go to the dynamic stage, reads may touch both stages, and a merge moves entries from the dynamic stage into the static stage. Labels: Memory-efficient, Skew-aware.]
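The stage labels on these slides (dynamic stage, static stage, write, read, merge) can be illustrated with a minimal sketch. Everything named here is an assumption made for illustration: the class `DualStageIndex`, the dict used as the dynamic stage, the sorted arrays used as the static stage, and the `merge_threshold` trigger are not taken from the slides, which do not say when a merge starts or which structures back each stage.

```python
from bisect import bisect_left


class DualStageIndex:
    """Toy two-stage index: a small writable stage plus a compact sorted stage."""

    def __init__(self, merge_threshold=4):
        self.dynamic = {}        # dynamic stage: absorbs all writes
        self.static_keys = []    # static stage: sorted keys, rebuilt only on merge
        self.static_vals = []    # static stage: values aligned with static_keys
        self.merge_threshold = merge_threshold  # hypothetical merge trigger

    def write(self, key, value):
        # Writes never touch the static stage directly.
        self.dynamic[key] = value
        if len(self.dynamic) >= self.merge_threshold:
            self.merge()

    def read(self, key):
        # Recently written (often hot) keys are found in the dynamic stage first.
        if key in self.dynamic:
            return self.dynamic[key]
        i = bisect_left(self.static_keys, key)
        if i < len(self.static_keys) and self.static_keys[i] == key:
            return self.static_vals[i]
        return None

    def merge(self):
        # Fold the dynamic stage into the static stage; newer entries win.
        merged = dict(zip(self.static_keys, self.static_vals))
        merged.update(self.dynamic)
        self.static_keys = sorted(merged)
        self.static_vals = [merged[k] for k in self.static_keys]
        self.dynamic = {}
```

In this sketch the static stage is only ever rebuilt in bulk, which is what lets it use a dense, pointer-free layout, while the dynamic stage stays small because it is emptied on every merge.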
[Figure: B+tree node layouts over keys 1 to 12 (key 5 repeated) with values a to n; the final layout packs the entries into nodes of three keys each, with separator keys 3, 6, 9]
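The packed layout suggested by the node figure (full nodes of three keys, separators 3, 6, 9, and the values of the repeated key 5 kept together) can be approximated with the sketch below. It is an interpretation of the figure residue only, not the authors' structure: the function names `pack` and `lookup`, the `fanout` parameter, and the use of a dict to group duplicate keys are all assumptions.

```python
from bisect import bisect_left


def pack(entries, fanout=3):
    """Pack (key, value) pairs, already sorted by key, into full leaves."""
    grouped = {}
    for key, value in entries:
        # Values of a repeated key are stored together under that key.
        grouped.setdefault(key, []).append(value)
    keys = list(grouped)  # insertion order equals sorted order here
    leaves = [keys[i:i + fanout] for i in range(0, len(keys), fanout)]
    # One separator per leaf boundary: the largest key of the left leaf.
    separators = [leaf[-1] for leaf in leaves[:-1]]
    return leaves, separators, grouped


def lookup(packed, key):
    """Return the list of values stored under key, or [] if absent."""
    leaves, separators, grouped = packed
    if not leaves:
        return []
    leaf = leaves[bisect_left(separators, key)]
    return grouped[key] if key in leaf else []


# Entries as they appear in the figure: keys 1..12 with key 5 stored three times.
figure_keys = [1, 2, 3, 4, 5, 5, 5, 6, 7, 8, 9, 10, 11, 12]
figure_entries = list(zip(figure_keys, "abcdefghijklmn"))
packed = pack(figure_entries)
```

With the figure's entries, `packed[1]` holds the separators and `lookup(packed, 5)` returns the three values stored under the repeated key.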
[Figure: TPC-C, Size %, B+tree vs. Hybrid]
[Figure: Throughput (txn/s) and Memory (GB) versus Transactions Executed (2M to 10M), B+tree vs. Hybrid; memory is broken down into disk tuples, in-memory tuples, and indexes]
[Diagram sequence: writes go to the dynamic stage, which is merged into the static stage. Later slides add a freeze step that turns the dynamic stage into an intermediate stage; the intermediate stage is then merged into the static stage while writes continue in the dynamic stage.]
Our Solution: Incremental Copy-on-write with Rapid GC
[Diagram: copy-on-write produces a new parent node]
When can we safely reclaim the garbage?
When no thread still holds a reference to it!
Thread-local counters: C1, C2, C3, ..., Cn (Cmax and Cmin are the largest and smallest of them)
Counter update (++Ci): Ci = MAX(Ci, Cmax) + 1
GC Condition: Cmin > garbage tag
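The counter rule and the GC condition on this slide can be traced with a small single-threaded simulation. This is a sketch of the arithmetic only, not a concurrent implementation. The class `CounterGC` and its methods `tick`, `retire`, and `collect` are hypothetical names, and the choice to tag garbage with the current Cmax is an assumption, since the slide does not say how the garbage tag is assigned.

```python
class CounterGC:
    """Toy simulation of thread-local counters C1..Cn and the GC condition."""

    def __init__(self, num_threads):
        self.counters = [0] * num_threads  # C1..Cn, one per thread
        self.garbage = []                  # list of (tag, node) pairs

    def tick(self, i):
        # Slide rule: Ci = MAX(Ci, Cmax) + 1
        cmax = max(self.counters)
        self.counters[i] = max(self.counters[i], cmax) + 1
        return self.counters[i]

    def retire(self, node):
        # ASSUMPTION: garbage is tagged with the current Cmax.
        tag = max(self.counters)
        self.garbage.append((tag, node))
        return tag

    def collect(self):
        # Slide rule: reclaim when Cmin > garbage tag.
        cmin = min(self.counters)
        freed = [node for tag, node in self.garbage if cmin > tag]
        self.garbage = [(tag, node) for tag, node in self.garbage if cmin <= tag]
        return freed
```

In this model a retired node is freed only after every thread's counter has moved past the node's tag, which stands in for "no thread still holds a reference to it".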
[Figure: trie with edge labels a, b, i, l, n, r and $ markers]
[Figure: ART vs. Our Encoding, 50M email keys with average length = 20 bytes]
Workload: insert, read/update (50/50)
Key: email
Value: 64-bit unsigned integer (pointer)
Single thread
50M entries, 10M queries (Zipf distributed)
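A scaled-down version of the query phase described above can be generated with the sketch below. The function name `make_workload`, the `skew` exponent, the fixed `seed`, and the `example.com` keys are assumptions for illustration; the slides state only that queries are Zipf distributed with a 50/50 read/update mix, not the Zipf parameter or how the keys are ranked.

```python
import random


def make_workload(keys, num_queries, skew=1.0, seed=0):
    """Return a list of (operation, key) pairs with Zipf-like key popularity."""
    rng = random.Random(seed)
    # Weight of the key at rank r (0-based) is 1 / (r + 1) ** skew.
    weights = [1.0 / (rank + 1) ** skew for rank in range(len(keys))]
    chosen = rng.choices(keys, weights=weights, k=num_queries)
    # Each query is a read or an update with equal probability.
    return [("read" if rng.random() < 0.5 else "update", key) for key in chosen]


# Small stand-in for the 50M-entry, 10M-query setup.
sample_keys = ["user%d@example.com" % i for i in range(1000)]
sample_ops = make_workload(sample_keys, 10000)
```

The earlier keys in `sample_keys` receive most of the queries, which is the access skew that a small dynamic stage is meant to absorb.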
[Figure: Original vs. Hybrid for B+tree, Masstree, Skip List, and ART; panels: Insert-only and Read/Update (50/50)]