PrincetonUniversity
From application requests to Virtual IOPs: Provisioned key-value storage with Libra
David Shue* and Michael J. Freedman
(*now at Google)
Figure: Shared cloud. Tenants A, B, and C each run VMs and share Key-Value Storage, Block Storage, and SQL Database services.
Disk IO-bound Tenants, SSD-backed storage
Figure: Tenants A, B, and C (VMs) send Application Requests (GET/s, PUT/s, 1KB normalized) to Shared Key-Value Storage, which issues Low-level IO (IOPs, BW) to SSDs.
w/ high utilization.
Virtual IOPs.
System     Storage Type   App-requests   Work Conserving   Media
Maestro    Block          N              N                 HDD
mClock     Block          N              Y                 HDD
FlashFQ    Block          N              Y                 SSD
DynamoDB   Key-Value      Y              N                 SSD
Figure: Tenants A and B (VMs) access Shared Key-Value Storage on Storage Node 1 through Storage Node N. Labels: data partitions, demand, and per-node reservations ResA1, ResB1, ResAn, ResBn.
Global Reservation Problem Local Reservation Problem [Pisces OSDI ’12]
Storage node stack: Key-Value Protocol, Persistence Engine, IO Scheduler (Libra IO Scheduler), Physical Disk
Request path: GET K (GET 1001100), Retrieve K, Read l337, IO operation
Reservation Distribution Policy
Libra Provisioning Policy: How much IO to provision?
Libra IO Scheduler (DRR): How much IO to consume? (GET, PUT)
Persistence Engine, Physical Disk
IO Amplification: 1 KB PUT ≥ 1 KB Write. Track tenant app-request resource profiles.
IO Interference: Variable IO throughput. Underestimate provisionable IO.
Non-linear IO Performance: Non-linear cost per KB. Model IO with Virtual IOPs.
Figure: LevelDB (LSM-Tree). PUT K,V3 triggers PUT, FLUSH, and COMPACT activity across indexed files holding versions V0 to V3 (key ranges A-M, H-V, F-O, A-O); GET K searches the file indexes (key ranges A-M, H-V, F-O, G-Z).
Figure: IO Throughput (kop/s) vs. GET/PUT Request Size (1 KB to 128 KB). Series: GET read IO, PUT write IO, FLUSH write IO, COMPACT write IO, COMPACT read IO.
Track IO consumption: per-tenant GET, PUT, FLUSH, and COMPACT IO
Compute app-request IO profiles: Per-GET and Per-PUT IO cost
Provision IO allocations: Libra Provisioning Policy (example: Tenant A, 500 IO/s)
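The profile computation on this slide can be sketched roughly as follows. This is a minimal illustration, not Libra's code: the `TenantIOStats` counters, the function names, and the choice to attribute all FLUSH and COMPACT IO to PUTs are assumptions made for the example, and the numbers are illustrative.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class TenantIOStats:
    """Hypothetical per-tenant counters for one measurement interval."""
    gets: int          # GET requests served
    puts: int          # PUT requests served
    get_io: float      # IO units consumed directly by GETs
    put_io: float      # IO units consumed directly by PUTs
    flush_io: float    # IO units of FLUSH work caused by this tenant
    compact_io: float  # IO units of COMPACT work caused by this tenant


def io_profile(stats: TenantIOStats) -> Tuple[float, float]:
    """Return (IO units per GET, IO units per PUT).

    Assumption: background FLUSH and COMPACT IO is charged to PUTs,
    since writes are what trigger that work in an LSM-tree.
    """
    per_get = stats.get_io / stats.gets if stats.gets else 0.0
    put_total = stats.put_io + stats.flush_io + stats.compact_io
    per_put = put_total / stats.puts if stats.puts else 0.0
    return per_get, per_put


def io_to_provision(per_get: float, per_put: float,
                    get_rate: float, put_rate: float) -> float:
    """IO units per second needed to sustain the reserved request rates."""
    return per_get * get_rate + per_put * put_rate


stats = TenantIOStats(gets=100, puts=50, get_io=100.0,
                      put_io=50.0, flush_io=100.0, compact_io=150.0)
per_get, per_put = io_profile(stats)
print(per_get, per_put)                          # 1.0 6.0
print(io_to_provision(per_get, per_put, 80, 70)) # 500.0
```

The point of the profile is that a reservation stated in GET/s and PUT/s can be translated into low-level IO per second without the tenant knowing how much amplification its workload causes.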
4 read/4 write tenants
Die-level parallelism, low-latency IOPs
Shared-controller and bus contention
Erase-before-write
FTL and read-modify-write garbage collection
Figure: Pct of Ideal Throughput (50 to 100) by Read IOP Size (KB) and Write IOP Size (KB), 1 to 256 KB each. Panels: 1:1 Pure Read/Pure Write; 75:25, 50:50, and 25:75 Read/Write Ratio.
Provisionable IO throughput = floor(workloads) (18 kop/s)
Figure: Pct of Read/Write Experiments (0.2 to 1) vs. Normalized IO Throughput (15 to 45), with the Provisionable IO limit marked. Series: 75:25, 50:50, and 25:75 Read/Write; 1:1 Pure Read/Pure Write. Highlighted points: 75:25 = 4K, 32K, 256K; 50:50 = 256K; 25:75 = 256K.
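One way to read "floor(workloads)" is as the lowest throughput measured across the tested workload mixes. The sketch below assumes that reading; the function names and sample measurements are hypothetical, chosen only so that the floor comes out at the 18 kop/s quoted on the slide.

```python
def provisionable_limit(throughputs_kops):
    """Lowest throughput seen across workloads: the amount of IO that
    can be promised no matter which workload mix is running."""
    if not throughputs_kops:
        raise ValueError("need at least one measurement")
    return min(throughputs_kops)


def unprovisionable_pct(limit_kops, achieved_kops):
    """Share of a workload's achieved throughput that lies above the
    limit and therefore cannot be reserved in advance."""
    return 100.0 * (achieved_kops - limit_kops) / achieved_kops


# Hypothetical measurements (kop/s) for four workload mixes.
measured = [18.0, 24.0, 30.0, 36.0]
limit = provisionable_limit(measured)
print(limit)                             # 18.0
print(unprovisionable_pct(limit, 36.0))  # 50.0
```

Throughput above the floor is not wasted under this scheme; it is simply not promised, and a work-conserving scheduler can still hand it out at run time.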
Figure: IO Bandwidth (MB/s, 50 to 250) and IOP Throughput (kop/s, 5 to 40) vs. IOP Size (1 to 256 KB). Series: Read Rand, Read Seq, Write Rand, Write Seq. Annotations: Max BW, Max IOP/s.
\[ \mathrm{VOP_{CPB}}(\text{IOP-size}) = \frac{\text{Max-IOP}}{\text{Achieved-IOP}(\text{IOP-size}) \times \text{IOP-size}} \]
Unifies IO cost into a single metric
Captures non-linear IO performance
Provides IO insulation
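A small numeric sketch of the cost model, reading the VOPCPB formula as Max-IOP divided by the product of Achieved-IOP(IOP-size) and IOP-size. Sizes are in KB here to match the op/KB axis of the cost plots; the function names and the throughput figures are assumptions for illustration, not measurements from the talk.

```python
def vop_cost_per_kb(max_iops, achieved_iops, size_kb):
    """Virtual IOP cost per KB for IOPs of the given size."""
    return max_iops / (achieved_iops * size_kb)


def vop_cost(max_iops, achieved_iops, size_kb):
    """Virtual IOPs charged for one IOP of the given size.

    Multiplying the per-KB cost by the size cancels the size term,
    leaving Max-IOP / Achieved-IOP.
    """
    return vop_cost_per_kb(max_iops, achieved_iops, size_kb) * size_kb


# Hypothetical device: 40 kop/s peak, 20 kop/s achieved at 4 KB.
print(vop_cost_per_kb(40000, 20000, 4))  # 0.5
print(vop_cost(40000, 20000, 4))         # 2.0
```

Under this reading, an IOP size that reaches only half the peak rate costs two virtual IOPs each, so a tenant running it at full speed consumes the same virtual capacity as one running peak-rate IOPs.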
2 equal-allocation tenants: IO Insulation = 1/2 Max Read/Write
Figure: Libra IO Cost Model. Virtual IOP Cost (op/KB, 0.5 to 3.5) vs. IOP Size (1 to 256 KB). Series: Read IO cost, Write IO cost.
Figure: IOP Throughput at 1/2 Max VOPs. IOPs (kop/s, 5 to 40) vs. IOP Size (1 to 256 KB). Series: Libra Read IO Model, Libra Write IO Model, Max Read, Max Write.
Libra Provisioning Policy Libra IO Scheduler Persistence Engine Physical Disk
Charge tenant IOPs based on VOP cost
Track app-request VOP consumption
Provision VOPs within provisionable limit
Update tenant VOP allocations
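The charging step can be illustrated with a deficit round robin (DRR) loop in which each queued IOP carries a VOP cost and each tenant's quantum stands in for its VOP allocation. This is a generic DRR sketch under those assumptions; the class and method names are hypothetical and do not describe Libra's actual scheduler interface.

```python
from collections import deque


class DRRScheduler:
    """Deficit round robin over per-tenant queues, charged in VOPs."""

    def __init__(self):
        self.queues = {}   # tenant -> deque of (op_name, vop_cost)
        self.quantum = {}  # tenant -> VOPs granted per round
        self.deficit = {}  # tenant -> unused VOPs carried to next round

    def add_tenant(self, tenant, vops_per_round):
        self.queues[tenant] = deque()
        self.quantum[tenant] = vops_per_round
        self.deficit[tenant] = 0.0

    def submit(self, tenant, op_name, vop_cost):
        self.queues[tenant].append((op_name, vop_cost))

    def run_round(self):
        """Visit each tenant once and return the IOPs dispatched."""
        dispatched = []
        for tenant, queue in self.queues.items():
            if not queue:
                # Idle tenants accumulate no credit (work conserving).
                self.deficit[tenant] = 0.0
                continue
            self.deficit[tenant] += self.quantum[tenant]
            # Dispatch while the head-of-line IOP fits in the deficit.
            while queue and queue[0][1] <= self.deficit[tenant]:
                op_name, cost = queue.popleft()
                self.deficit[tenant] -= cost
                dispatched.append((tenant, op_name))
            if not queue:
                self.deficit[tenant] = 0.0
        return dispatched


sched = DRRScheduler()
sched.add_tenant("A", 4)   # larger VOP allocation
sched.add_tenant("B", 2)
for i in range(1, 7):
    sched.submit("A", f"a{i}", 1)  # cheap IOPs, 1 VOP each
sched.submit("B", "b1", 3)         # expensive IOP, 3 VOPs
sched.submit("B", "b2", 1)
print(sched.run_round())  # A dispatches a1..a4; B must wait for credit
print(sched.run_round())  # A dispatches a5, a6; B dispatches b1, b2
```

Because the deficit is measured in VOPs, a tenant issuing expensive IOPs dispatches fewer of them per round than a tenant with the same allocation issuing cheap ones.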
resource allocations?
while achieving high utilization?
Throughput Ratio = Actual / Expected (IO Insulation)
Figure: Read-Write IOP Throughput Ratio (0.2 to 1.2) for Read Tenants and Write Tenants, compared with the Interference-free Ideal. Panels: Read 1 KB with W 1KB to W 256KB; Write 1-256 KB with R 1KB to R 256KB.
Figure: Read IO Cost Models (VOP Cost 0.2 to 1.2 op/KB) and Write IO Cost Models (VOP Cost 0.5 to 3.5 op/KB) vs. IOP Size (1 to 256 KB). Series: libra, constant, fixed, linear.
Min-Max Ratio = Min Throughput Ratio / Max Throughput Ratio
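As a worked example of the metric, the sketch below computes per-tenant throughput ratios (actual over expected) and then their Min-Max Ratio. The function names and tenant numbers are made up for illustration.

```python
def throughput_ratio(actual, expected):
    """Throughput Ratio = Actual / Expected for one tenant."""
    return actual / expected


def min_max_ratio(ratios):
    """MMR = lowest throughput ratio over the highest; 1.0 means every
    tenant received the same fraction of its expected throughput."""
    return min(ratios) / max(ratios)


# Hypothetical (actual, expected) throughput pairs for three tenants.
tenants = [(8.0, 10.0), (10.0, 10.0), (9.0, 10.0)]
ratios = [throughput_ratio(a, e) for a, e in tenants]
print(ratios)                 # [0.8, 1.0, 0.9]
print(min_max_ratio(ratios))  # 0.8
```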
Figure: IOP Insulation Accuracy and Virtual IOP Allocation Accuracy, both as MMR (0.1 to 1), for Read-Write (rw), Read-Read (rr), and Write-Write (ww) workloads. Series: libra, linear, constant, fixed.
Figure: VOP (kop/s, 16 to 30) by GET Request Size (KB) and PUT Request Size (KB), 1 to 256 KB each. Panels: 75:25, 50:50, and 25:75 GET-PUT, Variance 4K.
< 10th percentile covered by SLA and higher-level policies
Unprovisionable Throughput As a Percentage of Total Throughput

           Percentile
Workload   10th    50th    80th    All
99:1       1.6%    30.5%   40.5%   45.8%
25:75      1.4%    14.9%   25.0%   34.7%
1:99       0.7%    12.2%   19.5%   28.1%
Figure: Throughput (kreq/s, 1 to 7) over Time (100 to 300 s) for Read Heavy, Mixed, and Write Heavy tenants. Panels: Libra Normalized GET (1KB), Libra Normalized PUT (1KB), No Profile Normalized GET (1KB), No Profile Normalized PUT (1KB). Annotations: 1.5x, 0.5x; Work-conserving consumption of unprovisioned resources; Fully provisioned allocations.
Virtual IOPs.