From application requests to Virtual IOPs: Provisioned key-value storage with Libra

SLIDE 1

Princeton University

From application requests to Virtual IOPs: Provisioned key-value storage with Libra

David Shue* and Michael J. Freedman

(*now at Google)

SLIDE 2

Shared Cloud

[Diagram: Tenants A, B, and C each run three VMs in a shared cloud.]

SLIDE 3

Shared Cloud Storage

[Diagram: Tenants A, B, and C (three VMs each) share three storage services: Key-Value Storage, Block Storage, and SQL Database.]

SLIDE 4

Unpredictable Shared Cloud Storage

[Diagram: Tenants A, B, and C (three VMs each) share Key-Value Storage, Block Storage, and SQL Database. Callouts: Disk IO-bound Tenants; SSD-backed storage; Key-Value Storage.]

SLIDE 5

Provisioned Shared Key-Value Storage

[Diagram: Tenants A, B, and C (three VMs each) hold ReservationA, ReservationB, and ReservationC against Shared Key-Value Storage backed by SSDs.]

Application Requests: GET/s, PUT/s (1KB normalized)
Low-level IO: IOPs, BW

SLIDE 6

Libra Contributions

  • Libra IO Scheduler
      • Provisions low-level IO allocations for app-request reservations w/ high utilization.
      • Supports arbitrary object distributions and workloads.
  • 2 key mechanisms
      • Track per-tenant app-request resource profiles.
      • Model IO resources with Virtual IOPs.

SLIDE 7

Related Work

            Storage Type   App-requests   Work Conserving   Media
Maestro     Block          N              N                 HDD
mClock      Block          N              Y                 HDD
FlashFQ     Block          N              Y                 SSD
DynamoDB    Key-Value      Y              N                 SSD

SLIDE 8

Provisioned Distributed Key-Value Storage

[Diagram: tenant VMs (Tenant A, Tenant B) with ReservationA and ReservationB access Shared Key-Value Storage spread across Storage Node 1 ... Storage Node N.]

Global Reservation Problem Local Reservation Problem [Pisces OSDI ’12]

SLIDE 9

Provisioned Distributed Key-Value Storage

[Diagram: data partitions and demand for Tenants A and B (three VMs each), with ReservationA and ReservationB, spread across Storage Node 1 ... Storage Node N.]

SLIDE 10

Provisioned Distributed Key-Value Storage

[Diagram: ReservationA and ReservationB shown at each of Storage Node 1 ... Storage Node N.]

SLIDE 11

Provisioned Distributed Key-Value Storage

[Diagram: ReservationA and ReservationB appear as per-node reservations, ResA1 and ResB1 on Storage Node 1 ... ResAn and ResBn on Storage Node N.]

SLIDE 12

Provisioned Local Key-Value Storage

[Diagram: a request descends the storage node's stack: Key-Value Protocol, Persistence Engine, IO Scheduler, Physical Disk. Labels: GET K, GET 1001100, Retrieve K, Read l337, IO operation. The Libra IO Scheduler sits at the IO Scheduler layer.]

SLIDE 13

Libra Design

[Diagram: GET and PUT requests pass through the Persistence Engine and the Libra IO Scheduler (DRR) to the Physical Disk, with the Libra Provisioning Policy and the Reservation Distribution Policy alongside.]

How much IO to consume?
How much IO to provision?

SLIDE 14

Provisioning App-request Reservations is Hard

  • IO Amplification (1 KB PUT ≥ 1 KB Write): Track tenant app-request resource profiles
  • IO Interference (Variable IO throughput): Underestimate provisionable IO
  • Non-linear IO Performance (Non-linear cost per KB): Model IO with Virtual IOPs

SLIDE 15

Workload-dependent IO Amplification

[Diagram: LevelDB (LSM-Tree). PUT K,V3 followed by FLUSH and COMPACT over indexed files holding versions V0 to V3, with key ranges A-M, H-V, F-O, and A-O.]
[Plot: IO Throughput (kop/s) vs. GET/PUT Request Size, 1 KB to 128 KB. Series: PUT write IO.]

SLIDE 16

Workload-dependent IO Amplification

[Diagram: LevelDB (LSM-Tree). PUT K,V3 followed by FLUSH and COMPACT over indexed files holding versions V0 to V3, with key ranges A-M, H-V, F-O, and A-O.]
[Plot: IO Throughput (kop/s) vs. GET/PUT Request Size, 1 KB to 128 KB. Series: PUT write IO, FLUSH write IO.]

SLIDE 17

Workload-dependent IO Amplification

[Diagram: LevelDB (LSM-Tree). PUT K,V3 followed by FLUSH and COMPACT over indexed files holding versions V0 to V3, with key ranges A-M, H-V, F-O, and A-O.]
[Plot: IO Throughput (kop/s) vs. GET/PUT Request Size, 1 KB to 128 KB. Series: PUT write IO, FLUSH write IO, COMPACT write IO, COMPACT read IO.]

SLIDE 18

Workload-dependent IO Amplification

[Diagram: GET K looks up key K across indexed files with key ranges A-M, H-V, F-O, and G-Z.]
[Plot: IO Throughput (kop/s) vs. GET/PUT Request Size, 1 KB to 128 KB. Series: GET read IO, PUT write IO, FLUSH write IO, COMPACT write IO, COMPACT read IO.]

SLIDE 19

Libra Tracks App-request IO Consumption to Determine IO Allocations

  • Track IO consumption: GET, PUT, FLUSH, and COMPACT IO
  • Compute app-request IO profiles: Per-GET and Per-PUT
  • Provision IO allocations: Libra Provisioning Policy

[Diagram: worked example for Tenant A. Legible labels: 5 IO units, PUT, 500 IO/s.]
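
The tracking and provisioning steps on this slide can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the `TenantProfile` class, its method names, and the rule that all FLUSH and COMPACT IO is charged to PUTs are hypothetical, and every number in the example is invented (the inputs were chosen so the total happens to equal the 500 IO/s label on the slide).

```python
from dataclasses import dataclass, field


@dataclass
class TenantProfile:
    """Running totals for one tenant (hypothetical structure, not Libra's)."""
    requests: dict = field(default_factory=lambda: {"GET": 0, "PUT": 0})
    io_units: dict = field(default_factory=lambda: {"GET": 0.0, "PUT": 0.0})

    def record_request(self, kind: str, count: int = 1) -> None:
        self.requests[kind] += count

    def record_io(self, kind: str, units: float) -> None:
        # Direct IO and background IO are both charged to the request type
        # that caused them. Assumption: FLUSH/COMPACT IO is charged to PUT.
        self.io_units[kind] += units

    def cost_per_request(self, kind: str) -> float:
        n = self.requests[kind]
        return self.io_units[kind] / n if n else 0.0

    def io_allocation(self, reservation: dict) -> float:
        """IO units/s needed to sustain a reservation given in requests/s."""
        return sum(rate * self.cost_per_request(kind)
                   for kind, rate in reservation.items())


# Invented measurements for one tenant.
profile = TenantProfile()
profile.record_request("GET", 100)
profile.record_io("GET", 100.0)   # GET read IO
profile.record_request("PUT", 50)
profile.record_io("PUT", 50.0)    # PUT write IO
profile.record_io("PUT", 150.0)   # FLUSH write IO
profile.record_io("PUT", 50.0)    # COMPACT read + write IO

allocation = profile.io_allocation({"GET": 100, "PUT": 80})
print(profile.cost_per_request("GET"), profile.cost_per_request("PUT"), allocation)
```

With these inputs a PUT costs five times as much IO as a GET, so a reservation that looks balanced in requests per second is dominated by write IO once amplification is counted.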

SLIDE 20

Unpredictable IO Interference

[Heat map: 1:1 Pure Read/Pure Write, 4 read/4 write tenants. Pct of Ideal Throughput (50 to 100) by Read IOP Size (KB) and Write IOP Size (KB), each 1 to 256 KB.]

  • Die-level parallelism, low latency IOPs
  • Shared-controller and bus contention
  • Erase-before-write overhead
  • FTL and read-modify-write garbage collection

SLIDE 21

Unpredictable IO Interference

[Heat maps: Pct of Ideal Throughput (50 to 100) by Read IOP Size (KB) and Write IOP Size (KB), each 1 to 256 KB. Panels: 1:1 Pure Read/Pure Write; 75:25 Read/Write Ratio; 50:50 Read/Write Ratio; 25:75 Read/Write Ratio.]

SLIDE 22

Libra Underestimates IO Capacity to Ensure Provisionable Throughput

Provisionable IO throughput = floor(workloads) (18 Kop/s)

[Plot: Pct of Read/Write Experiments (0.2 to 1) vs. Normalized IO Throughput (15 to 45), with the Provisionable IO limit marked. Series: 75:25 Read/Write, 50:50 Read/Write, 25:75 Read/Write, 1:1 Pure Read/Pure Write.]
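
A minimal sketch of the floor rule, assuming the limit is simply the lowest throughput observed across the measured workloads. The function names and the three sample measurements are hypothetical; only the 18 kop/s floor comes from the slide.

```python
def provisionable_limit(measured_kops: dict) -> float:
    """Return the lowest throughput (kop/s) seen across workload experiments."""
    if not measured_kops:
        raise ValueError("need at least one measurement")
    return min(measured_kops.values())


def unprovisioned_fraction(achieved_kops: float, limit_kops: float) -> float:
    """Share of a workload's achieved throughput that lies above the limit."""
    return max(0.0, (achieved_kops - limit_kops) / achieved_kops)


# Invented measurements keyed by read/write mix and IOP size.
measured = {
    "75:25 @ 4K": 31.0,
    "50:50 @ 32K": 24.0,
    "25:75 @ 256K": 18.0,
}

limit = provisionable_limit(measured)
print(limit, unprovisioned_fraction(measured["50:50 @ 32K"], limit))
```

Provisioning against the floor means every workload above it leaves some throughput unreserved; a work-conserving scheduler can still hand that remainder out, it just cannot promise it.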

SLIDE 23

Libra Underestimates IO Capacity to Ensure Provisionable Throughput

Provisionable IO throughput = floor(workloads) (18 Kop/s)

[Plot: Pct of Read/Write Experiments (0.2 to 1) vs. Normalized IO Throughput (15 to 45), with the Provisionable IO limit marked. Series: 75:25 Read/Write, 50:50 Read/Write, 25:75 Read/Write, 1:1 Pure Read/Pure Write. Highlighted points: 75:25 = 4K, 75:25 = 32K, 75:25 = 256K.]

SLIDE 24

Libra Underestimates IO Capacity to Ensure Provisionable Throughput

Provisionable IO throughput = floor(workloads) (18 Kop/s)

[Plot: Pct of Read/Write Experiments (0.2 to 1) vs. Normalized IO Throughput (15 to 45), with the Provisionable IO limit marked. Highlighted points: 75:25 = 256K, 50:50 = 256K, 25:75 = 256K, 1:1 Pure Read/Pure Write.]

SLIDE 25

Non-linear IO Performance

[Plot: IO Bandwidth. Bandwidth (MB/s), 50 to 250, vs. IOP Size (KB), 1 to 256. Annotation: Max BW.]
[Plot: IOP Throughput. IOP (kop/s), 5 to 40, vs. IOP Size (KB), 1 to 256. Annotation: Max IOP/s.]

SLIDE 26

Non-linear IO Performance

[Plot: IO Bandwidth. Bandwidth (MB/s), 50 to 250, vs. IOP Size (KB), 1 to 256. Series: Read Rand, Read Seq, Write Rand, Write Seq. Annotation: Max BW.]
[Plot: IOP Throughput. IOP (kop/s), 5 to 40, vs. IOP Size (KB), 1 to 256. Annotation: Max IOP/s.]

SLIDE 27

Libra Uses Virtual IOPs to Model IO Resources

$$\mathrm{VOPCPB}(\text{IOP-size}) = \frac{\text{Max-IOP}}{\text{Achieved-IOP}(\text{IOP-size}) \times \text{IOP-size}}$$

  • Unifies IO cost into a single metric
  • Captures non-linear IO performance
  • Provides IO insulation

[Plot: Libra IO Cost Model. Virtual IOP Cost (op/KB), 0.5 to 3.5, vs. IOP Size (KB), 1 to 256. Series: Read IO cost, Write IO cost.]
[Plot: IOP Throughput at 1/2 Max VOPs.]

2 equal-allocation tenants: IO Insulation = 1/2 Max Read/Write
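
A small numeric sketch of the cost formula on this slide, reading it as Max-IOP divided by the product of Achieved-IOP(IOP-size) and IOP-size, with sizes in KB to match the op/KB axis. The function names and the device figures (40 kop/s peak, 10 kop/s at 16 KB) are invented for illustration.

```python
def vop_cost_per_kb(max_iops: float, achieved_iops: float, iop_size_kb: float) -> float:
    """VOP cost per KB for IOPs of a given size (the slide's VOPCPB)."""
    return max_iops / (achieved_iops * iop_size_kb)


def vop_charge(max_iops: float, achieved_iops: float, iop_size_kb: float) -> float:
    """VOPs charged for one IOP of this size: cost per KB times size in KB."""
    return vop_cost_per_kb(max_iops, achieved_iops, iop_size_kb) * iop_size_kb


MAX_IOPS = 40_000  # hypothetical peak, reached with 1 KB IOPs

# At 1 KB the device reaches its peak, so one IOP costs exactly one VOP.
print(vop_charge(MAX_IOPS, 40_000, 1))

# Hypothetical: at 16 KB the device sustains only 10 kop/s.
print(vop_cost_per_kb(MAX_IOPS, 10_000, 16), vop_charge(MAX_IOPS, 10_000, 16))
```

Under this reading, a device saturated with IOPs of any single size always consumes Max-IOP virtual IOPs per second (10,000 IOPs at 4 VOPs each is again 40,000), which is what lets one number stand in for both the IOP-bound and the bandwidth-bound regimes.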

SLIDE 28

Libra Uses Virtual IOPs to Model IO Resources

$$\mathrm{VOPCPB}(\text{IOP-size}) = \frac{\text{Max-IOP}}{\text{Achieved-IOP}(\text{IOP-size}) \times \text{IOP-size}}$$

2 equal-allocation tenants: IO Insulation = 1/2 Max Read/Write

[Plot: IOP Throughput at 1/2 Max VOPs. IOPs (kop/s), 5 to 40, vs. IOP Size (KB), 1 to 256. Series: Libra Read IO Model, Libra Write IO Model, Max Read, Max Write.]
[Plot: Libra IO Cost Model. Virtual IOP Cost (op/KB), 0.5 to 3.5, vs. IOP Size (KB), 1 to 256. Series: Read IO cost, Write IO cost.]

SLIDE 29

Libra Design

[Diagram: Libra Provisioning Policy, Libra IO Scheduler, Persistence Engine, Physical Disk.]

  • Charge tenant IOPs based on VOP cost
  • Track app-request VOP consumption
  • Provision VOPs within provisionable limit
  • Update tenant VOP allocations
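
The charging step can be illustrated with a toy deficit round robin loop, taking the "DRR" label on the earlier design slide to mean deficit round robin. This is a sketch under that assumption, not Libra's scheduler: the class, its methods, and the quanta are hypothetical, and each IO's VOP cost is passed in as a plain number.

```python
from collections import deque


class VopDrrScheduler:
    """Deficit round robin over per-tenant queues, charging each IO its VOP cost."""

    def __init__(self, quantum_per_round: dict):
        # quantum_per_round: tenant -> VOPs credited per round
        # (assumed proportional to the tenant's VOP allocation).
        self.quantum = dict(quantum_per_round)
        self.deficit = {t: 0.0 for t in self.quantum}
        self.queues = {t: deque() for t in self.quantum}

    def submit(self, tenant: str, io_id: str, vop_cost: float) -> None:
        self.queues[tenant].append((io_id, vop_cost))

    def update_allocation(self, tenant: str, quantum: float) -> None:
        # Called when the provisioning policy changes a tenant's allocation.
        self.quantum[tenant] = quantum

    def run_round(self) -> list:
        """Serve one round; return the (tenant, io_id) pairs dispatched."""
        dispatched = []
        for tenant, queue in self.queues.items():
            if not queue:
                self.deficit[tenant] = 0.0  # idle tenants do not bank credit
                continue
            self.deficit[tenant] += self.quantum[tenant]
            # Dispatch while the head-of-line IO fits in the remaining credit.
            while queue and queue[0][1] <= self.deficit[tenant]:
                io_id, cost = queue.popleft()
                self.deficit[tenant] -= cost
                dispatched.append((tenant, io_id))
            if not queue:
                self.deficit[tenant] = 0.0
        return dispatched


sched = VopDrrScheduler({"A": 4.0, "B": 2.0})
for io_id in ("a1", "a2", "a3"):
    sched.submit("A", io_id, 2.0)
for io_id in ("b1", "b2"):
    sched.submit("B", io_id, 2.0)

round1 = sched.run_round()
round2 = sched.run_round()
print(round1, round2)
```

Tenant A holds twice B's quantum, so it dispatches two IOs of equal cost for each one of B's while both are backlogged; once A's queue drains, its leftover credit is dropped rather than carried forward.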

SLIDE 30

Evaluation

  • Does Libra's IO resource model achieve accurate resource allocations?
  • Does Libra's IO threshold make an acceptable tradeoff of performance for predictability in a real storage stack?
  • Can Libra ensure per-tenant app-request reservations while achieving high utilization?

SLIDE 31

Libra Achieves Accurate IO Allocations

Throughput Ratio = Actual / Expected (IO Insulation)

[Plot: Read-Write IOP Throughput Ratio, Read 1 KB. Throughput Ratio (0.2 to 1.2) for write sizes W 1KB to W 256KB. Series: Read Tenants, Write Tenants. Reference line: Interference-free Ideal.]

SLIDE 32

Libra Achieves Accurate IO Allocations

Throughput Ratio = Actual / Expected (IO Insulation)

[Plot: Read-Write IOP Throughput Ratio, Write 1-256 KB. Throughput Ratio (0.2 to 1.2) for read sizes R 1KB to R 256KB. Series: Read Tenants, Write Tenants. Reference line: Interference-free Ideal.]
[Plot: Read-Write IOP Throughput Ratio. Throughput Ratio (0.2 to 1.2) for write sizes W 1KB to W 256KB. Series: Read Tenants, Write Tenants.]

SLIDE 33

Libra Achieves Accurate IO Allocations

[Plot: Read IO Cost Models. VOP Cost (op/KB), 0.2 to 1.2, vs. IOP Size (KB), 1 to 256. Series: libra, constant, fixed.]
[Plot: Write IO Cost Models. VOP Cost (op/KB), 0.5 to 3.5, vs. IOP Size (KB), 1 to 256. Series: libra, constant, fixed.]

SLIDE 34

Libra Achieves Accurate IO Allocations

[Plot: Read IO Cost Models. VOP Cost (op/KB), 0.2 to 1.2, vs. IOP Size (KB), 1 to 256. Series: libra, constant, fixed, linear.]
[Plot: Write IO Cost Models. VOP Cost (op/KB), 0.5 to 3.5, vs. IOP Size (KB), 1 to 256. Series: libra, constant, fixed, linear.]

SLIDE 35

Libra Achieves Accurate IO Allocations

Min-Max Ratio = Min Throughput Ratio / Max Throughput Ratio

[Plot: IOP Insulation Accuracy. Accuracy (MMR), 0.1 to 1, for Read-Write (rw), Read-Read (rr), and Write-Write (ww) tenant pairings. Series: libra, linear, constant, fixed.]
[Plot: Virtual IOP Allocation Accuracy. Accuracy (MMR), 0.1 to 1, for Read-Write (rw), Read-Read (rr), and Write-Write (ww) tenant pairings. Series: libra, linear, constant, fixed.]
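
The two accuracy metrics used in the evaluation slides (Throughput Ratio = Actual / Expected, and Min-Max Ratio = Min Throughput Ratio / Max Throughput Ratio) reduce to a few lines. The function names and the two-tenant numbers below are invented for illustration.

```python
def throughput_ratio(actual: float, expected: float) -> float:
    """Actual throughput divided by the throughput the allocation implies."""
    return actual / expected


def min_max_ratio(ratios) -> float:
    """Lowest throughput ratio divided by the highest; 1.0 means all tenants
    deviate from their expected throughput by the same factor."""
    ratios = list(ratios)
    if not ratios:
        raise ValueError("need at least one throughput ratio")
    return min(ratios) / max(ratios)


# Invented example: two tenants, each expected to reach 10 kop/s.
ratios = [throughput_ratio(9.0, 10.0), throughput_ratio(10.0, 10.0)]
mmr = min_max_ratio(ratios)
print(ratios, mmr)
```

Because MMR compares tenants to each other rather than to the ideal, it stays at 1.0 when every tenant is slowed by the same factor and drops only when one tenant is penalized more than another.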

SLIDE 36

Libra Trades Off Nominal IO Throughput for Predictability

[Heat maps: VOP (kop/s), 16 to 30, by GET Request Size (KB) and PUT Request Size (KB), each 1 to 256 KB. Panels: 75:25 GET-PUT, Variance 4K; 50:50 GET-PUT, Variance 4K; 25:75 GET-PUT, Variance 4K.]

SLIDE 37

Libra Trades Off Nominal IO Throughput for Predictability

< 10th percentile covered by SLA and higher-level policies

Unprovisionable Throughput As a Percentage of Total Throughput

Workload   Percentile
           10th    50th    80th    All
99:1       1.6%    30.5%   40.5%   45.8%
25:75      1.4%    14.9%   25.0%   34.7%
1:99       0.7%    12.2%   19.5%   28.1%

SLIDE 38

Libra Achieves App-request Reservations

[Plots: Throughput (kreq/s), 1 to 7, vs. Time (s), 100 to 300, across Read Heavy, Mixed, and Write Heavy phases. Panels: Libra Normalized GET (1KB); Libra Normalized PUT (1KB); No Profile Normalized GET (1KB); No Profile Normalized PUT (1KB). Annotations: 1.5x, 0.5x.]

  • Work-conserving consumption of unprovisioned resources
  • Fully provisioned allocations

SLIDE 39

Libra Achieves App-request Reservations

[Plots: Throughput (kreq/s), 1 to 7, vs. Time (s), 100 to 300, across Read Heavy, Mixed, and Write Heavy phases. Panels: Libra Normalized GET (1KB); Libra Normalized PUT (1KB); No Profile Normalized GET (1KB); No Profile Normalized PUT (1KB).]

SLIDE 40

Conclusion

  • Libra IO Scheduler
      • Provisions IO allocations for app-request reservations w/ high utilization.
      • Supports arbitrary object distributions and workloads.
  • 2 key mechanisms
      • Track per-tenant app-request resource profiles.
      • Model IO resources with Virtual IOPs.
  • Evaluation
      • Achieves accurate low-level IO allocations.
      • Provisions the majority of IO resources over a wide range of workloads.
      • Satisfies app-request reservations w/ high utilization.