MyFlashSQL:
Flash is more than faster-harddisk
Sang-Won Lee SKKU, Korea
Contributors:
Gihwan Oh, Dasom Whang, Mijin Ahn, Donghyun Kang, and Samsung Electronic Memory Division
MyFlashSQL StarLab
Sang-Won Lee: Who am I? Professor at SKKU (Sungkyunkwan Univ.), Korea
Sang-Won Lee (swlee@skku.edu)
Postgres, MySQL, Couchbase, SQLite)
DBMS components: Storage Mgmt, Buffer Mgmt, Index & QP, Transaction Mgmt, Cache Mgmt, File Consistency / DB Space Mgmt.
Flash characteristics and related work:
– Asymmetric Read/Write: IPL [SIGMOD07], CFLRU, tIPL [ICDE2011]
– No overwrite / Addr. Mapping Layer: X-FTL [SIGMOD13], SHARE [SIGMOD2016]
– No mechanics (Seq RD ~ Rand RD): SIDX / IDX-based QP
– Sequential Write >> Random Write: SFS [FAST12], FaCE [VLDB12]
– SSD Architecture (Parallelism et al.): Psync [VLDB12], DuraSSD [SIGMOD2014]
– SSD Architecture (Beyond block device): Trim, X-FTL, Share; In-Storage Computing; Unit of IO in DB; Multi-streamed IO, NVMe Multi-Queue
– e.g. 8 cores and NVMe
– Do not believe IOSTAT metrics.
abundant parallelism in SSDs.
5/14/2017
[Figure: a secondary index tree (non-clustered index) and the primary index tree (clustered index), linked by primary key; leaf pages at Level 0]
Example) SELECT * FROM tab WHERE a between 10 and 13;
https://blog.jcole.us/2013/01/10/btree-index-structures-in-innodb/
[Figure: secondary index tree and primary index tree, leaf pages at Level 0; values shown: 2, 4, 7, 18, 53, 83]
Submit asynchronous I/Os (sorted, for prefetching)
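The sorted prefetching step named on this slide can be sketched roughly as follows. This is an illustration under stated assumptions, not MySQL's implementation: the `prefetch_pages` helper, the `max_inflight` parameter, and the use of a thread pool with `os.pread` to stand in for asynchronous I/O are all hypothetical, and the 16KB page size is simply InnoDB's default.

```python
import os
from concurrent.futures import ThreadPoolExecutor

PAGE_SIZE = 16 * 1024  # assumed page size (InnoDB's default is 16KB)


def prefetch_pages(fd, page_numbers, max_inflight=8):
    """Read the given pages concurrently and return {page_no: bytes}.

    Hypothetical helper: a thread pool stands in for asynchronous I/O so
    that several reads are outstanding at once, which is what lets an SSD
    use its internal parallelism.
    """
    # Sort and de-duplicate the page numbers found via the secondary index.
    ordered = sorted(set(page_numbers))

    def read_page(page_no):
        # Positional read: no shared file offset, so threads do not interfere.
        return os.pread(fd, PAGE_SIZE, page_no * PAGE_SIZE)

    with ThreadPoolExecutor(max_workers=max_inflight) as pool:
        # pool.map returns results in the same order as its input.
        return dict(zip(ordered, pool.map(read_page, ordered)))
```

The point of the sketch is only that the lookups produced by a secondary index range scan are batched and issued together, instead of one blocking read per row.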
SELECT * FROM table FORCE INDEX (idx) WHERE column_a BETWEEN min AND max;
mode
[Figure: Database Buffer with Main LRU List (head to tail) and Free list. Scan LRU List from tail to collect the Dirty Page Set (D), then Flush Dirty Pages through the Double Write Buffer to the Database]
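The flush path in this figure (scan the LRU list from the tail, collect dirty pages, write them to the double write buffer, then to the database) can be sketched as a toy model. Everything here is an assumption made for illustration: the `BufferPool` class, its method names, and the dictionaries that stand in for the double write buffer and the data file are hypothetical and do not mirror InnoDB's code.

```python
from collections import OrderedDict


class BufferPool:
    """Toy buffer pool: an LRU list of pages, some of them dirty."""

    def __init__(self):
        # Ordered from tail (least recently used) to head (most recent).
        self.lru = OrderedDict()   # page_no -> (data, dirty flag)
        self.free_list = []        # freed frames, named by the page they held
        self.doublewrite = {}      # stands in for the double write buffer
        self.database = {}         # stands in for the data file

    def put(self, page_no, data, dirty):
        self.lru[page_no] = (data, dirty)
        self.lru.move_to_end(page_no)  # newest page goes to the head

    def flush_from_tail(self, scan_depth):
        """Scan up to scan_depth pages from the tail and flush dirty ones."""
        victims = list(self.lru.items())[:scan_depth]
        dirty = {no: data for no, (data, is_dirty) in victims if is_dirty}
        # Step 1: write the whole batch to the double write buffer first.
        self.doublewrite = dict(dirty)
        # Step 2: only then write each page to its home location.
        self.database.update(dirty)
        # Scanned pages are now clean, so their frames can be freed.
        for no, _ in victims:
            del self.lru[no]
            self.free_list.append(no)
        return sorted(dirty)
```

Note that every dirty page is written twice in this model, once to the double write buffer and once to its home location, which is the write overhead the later SHARE slides measure.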
[Figure: Applications address logical pages (LPN) A, B, C, D, E; the L2P mapping table translates each LPN to a Physical Address in Flash Memory; the application issues SHARE (A_LPN, D_LPN)]
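A minimal sketch of how a SHARE-style remapping could work on top of an L2P table is given below. The `ToyFTL` class and its methods are hypothetical, and the argument order is an assumption: here the first LPN is remapped to the physical page of the second, which may not match the actual command's semantics.

```python
class ToyFTL:
    """Toy flash translation layer with an L2P table and a share command."""

    def __init__(self):
        self.l2p = {}    # logical page number -> physical page number
        self.flash = []  # append-only physical pages (no overwrite)

    def write(self, lpn, data):
        # Flash cannot overwrite in place: write to a new physical page
        # and point the logical page at it.
        self.flash.append(data)
        self.l2p[lpn] = len(self.flash) - 1

    def read(self, lpn):
        return self.flash[self.l2p[lpn]]

    def share(self, dst_lpn, src_lpn):
        # Assumed semantics: dst_lpn is remapped to the physical page
        # that src_lpn already points to; no data is written.
        self.l2p[dst_lpn] = self.l2p[src_lpn]


ftl = ToyFTL()
ftl.write("A", b"old")  # home location holds the old version
ftl.write("D", b"new")  # new version is written once, elsewhere
ftl.share("A", "D")     # A now resolves to the same physical page as D
```

Under these assumptions, the new version reaches its home address through a mapping update rather than a second page write.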
[Figure: (a) Throughput: TPS vs. page size (4KB, 8KB, 16KB), Original vs. SHARE; data labels 241, 118, 60, 578, 271, 131. (b) Total amount of written data: Written Bytes (MB) vs. page size (4KB, 8KB, 16KB), DWB on vs. SHARE]
[Figure: Operations Per Second (OPS) at 4KB, DWB-on vs. DWB-SHARE; annotated 2.4x]
consistency
Compaction)
[Figure: Database Buffer with Main LRU List (head to tail) and Free list. Scan LRU List from tail to collect the Dirty Page Set (D), written through the Double Write Buffer to the Database]
fully utilize its performance because of reads blocked by write operation
Single page flush
CPU/IO utilization, throughput ↓
[Figure: Database Buffer with Main LRU List (head to tail) and Free list. Scan LRU List from tail to collect the Dirty Page Set (D), written through the Double Write Buffer to the Database]
– 850 PRO SSD / 960 PRO NVMe / PM961 NVMe (Samsung)
– 845 DC battery-backed SSD
[Figure: HDD, TpmC (Transactions per minute Count): Original 121, RAW 123]
[Figure: TpmC (Transactions per minute Count), Original vs. RAW]
– Samsung 850 PRO SSD: Original 8468, RAW 20269 (2.4x)
– DC SSD (battery-backed): Original 26544, RAW 33305 (1.3x)
– NVMe SSD: Original 14023, RAW 32070 (2.3x)
[Figure panels: 850 Pro SSD; PM961 NVMe SSD]
P99 Latency
[Figure: Transactions per Second (TPS): Original 186, RAW 662 (3.6x), Optimized RAW 1038 (5.5x)]
MySQL/FaCE
– E-mail: swlee@skku.edu MyFlashSQL StarLab