From application requests to Virtual IOPs: Provisioned key-value - PowerPoint PPT Presentation

Princeton University
From application requests to Virtual IOPs: Provisioned key-value storage with Libra
David Shue* and Michael J. Freedman (*now at Google)

Unpredictable Shared Cloud Storage
[Diagram, built over three slides: Tenants A, B, and C each run VMs on a shared cloud and use shared Key-Value Storage, Block Storage, and SQL Database storage services. Callouts on the final build: disk IO-bound tenants; SSD-backed storage.]

Provisioned Shared Key-Value Storage
[Diagram: Tenants A, B, and C each hold a reservation (Reservation A, B, C) on the shared key-value storage service. Reservations are expressed as application requests: GET/s and PUT/s (1KB normalized). The storage service consumes low-level IO (IOPs and bandwidth) on SSDs.]

Libra Contributions
• Libra IO Scheduler
  - Provisions low-level IO allocations for app-request reservations with high utilization.
  - Supports arbitrary object distributions and workloads.
• 2 key mechanisms
  - Track per-tenant app-request resource profiles.
  - Model IO resources with Virtual IOPs.

Related Work
Work       Storage Type   App-requests   Work Conserving   Media Type
Maestro    Block          N              N                 HDD
mClock     Block          N              Y                 HDD
FlashFQ    Block          N              Y                 SSD
DynamoDB   Key-Value      Y              N                 SSD

Provisioned Distributed Key-Value Storage
[Diagram: Tenants A and B hold Reservation A and Reservation B on shared key-value storage spread over storage nodes 1 ... N. Two labeled problems: the Global Reservation Problem [Pisces OSDI '12], spanning the nodes, and the Local Reservation Problem, on each storage node.]

Provisioned Distributed Key-Value Storage
[Diagram, built over three slides: tenant demand for Reservation A and Reservation B falls on data partitions held by storage nodes 1 ... N; each reservation is then divided into per-node reservations (Res A1 ... Res An, Res B1 ... Res Bn).]
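The split of a tenant-wide reservation into per-node reservations (Res A1 ... Res An) can be sketched as follows. The slides do not state the splitting policy, so the demand-proportional rule, the function name `split_reservation`, and the sample numbers are all assumptions made for illustration.

```python
def split_reservation(total_reservation, demand_by_node):
    """Divide one tenant's global reservation across storage nodes.

    Assumption (not from the slides): each node receives a share
    proportional to the tenant's observed demand on that node.
    """
    total_demand = sum(demand_by_node.values())
    if total_demand == 0:
        # No demand observed yet: fall back to an even split.
        even = total_reservation / len(demand_by_node)
        return {node: even for node in demand_by_node}
    return {
        node: total_reservation * demand / total_demand
        for node, demand in demand_by_node.items()
    }


# Hypothetical example: 1000 requests/s reserved, demand skewed 3:1.
local = split_reservation(1000, {"node1": 3, "node2": 1})
print(local)
```

Each per-node value then becomes the local reservation that the node's IO scheduler has to enforce.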

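Provisioned Local Key-Value Storage
[Diagram: request path on one storage node. A raw request (GET 1001100) reaches the Key-Value Protocol layer as GET K; the Persistence Engine handles it as Retrieve K and issues Read l337; the IO Scheduler, replaced here by the Libra IO Scheduler, issues the IO operation to the Physical Disk.]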

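Libra Design
[Diagram: PUT and GET requests enter the Persistence Engine, which sits above Libra and the Physical Disk. Components shown: Reservation Distribution Policy, IO Provisioning Policy, and the Libra IO Scheduler (DRR). Two questions are posed: "How much IO to provision?" (Provisioning Policy) and "How much IO to consume?" (Libra IO Scheduler).]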

Provisioning App-request Reservations is Hard
• IO Amplification (1 KB PUT ≥ 1 KB Write): track tenant app-request resource profiles.
• IO Interference (variable IO throughput): underestimate provisionable IO.
• Non-linear IO Performance (non-linear cost per KB): model IO with Virtual IOPs.

Workload-dependent IO Amplification
[Figure, built over four slides. Left: LevelDB (LSM-Tree) diagram in which PUT K,V produces PUT write IO, FLUSH write IO, and COMPACT read IO and COMPACT write IO, and GET K produces GET read IO across the on-disk index files. Right: plot of IO Throughput (kop/s), 0 to 30, against GET/PUT Request Size (1 KB to 128 KB), broken down by IO type: PUT write IO, FLUSH write IO, COMPACT read IO, COMPACT write IO, GET read IO.]

Libra Tracks App-request IO Consumption to Determine IO Allocations
[Diagram: for each tenant, Libra (1) tracks IO consumption by source (GET, PUT, FLUSH, COMPACT), (2) computes app-request IO profiles (per-GET and per-PUT IO cost), and (3) has the Provisioning Policy provision IO allocations as GET rate x per-GET IO + PUT rate x per-PUT IO. Example shown: Tenant A, 500 IO/s.]
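The track, profile, provision pipeline on this slide can be sketched in a few lines. This is a minimal illustration, not Libra's implementation: the class `TenantCounters`, the two functions, and every number are hypothetical, and charging FLUSH and COMPACT IO to PUTs is an assumption suggested by the grouping in the diagram.

```python
from dataclasses import dataclass


@dataclass
class TenantCounters:
    """IO units consumed by one tenant over a tracking interval (hypothetical)."""
    gets: int          # number of GET requests served
    puts: int          # number of PUT requests served
    get_io: float      # IO consumed directly by GETs
    put_io: float      # IO consumed directly by PUTs
    flush_io: float    # background FLUSH IO
    compact_io: float  # background COMPACT IO


def request_profile(c):
    """Return (per-GET, per-PUT) IO cost for the interval."""
    per_get = c.get_io / c.gets if c.gets else 0.0
    # Assumption: FLUSH and COMPACT are triggered by writes, so bill them to PUTs.
    per_put = (c.put_io + c.flush_io + c.compact_io) / c.puts if c.puts else 0.0
    return per_get, per_put


def io_allocation(get_rate, put_rate, profile):
    """IO/s needed to sustain a reservation of get_rate GET/s and put_rate PUT/s."""
    per_get, per_put = profile
    return get_rate * per_get + put_rate * per_put


counters = TenantCounters(gets=100, puts=50, get_io=200.0,
                          put_io=50.0, flush_io=100.0, compact_io=150.0)
profile = request_profile(counters)
print(profile, io_allocation(100, 50, profile))
```

Because the profile is recomputed from observed counters, a tenant whose PUTs start triggering more compaction would see its per-PUT cost, and therefore its required IO allocation, rise on the next interval.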

Unpredictable IO Interference
[Figure: heat map of Pct of Ideal Throughput (50 to 100) for 4 read/4 write tenants, 1:1 Pure Read/Pure Write. Axes: Read IOP Size (KB) against Write IOP Size (KB), each 1 to 256.]
• Die-level parallelism, low latency IOPs
• Shared-controller and bus contention
• Erase-before-write overhead
• FTL and read-modify-write garbage collection

Unpredictable IO Interference
[Figure: four heat maps of Pct of Ideal Throughput (50 to 100), one per workload mix: 1:1 Pure Read/Pure Write, 75:25 Read/Write Ratio, 50:50 Read/Write Ratio, and 25:75 Read/Write Ratio. Axes: Read IOP Size (KB) against Write IOP Size (KB), each 1 to 256.]

Libra Underestimates IO Capacity to Ensure Provisionable Throughput
Provisionable IO throughput = floor(workloads) (18 Kop/s)
[Figure, built over three slides: Pct of Read/Write Experiments (0 to 1) against Normalized IO Throughput (15 to 45), with the provisionable IO limit marked. Series: 75:25 Read/Write, 50:50 Read/Write, 25:75 Read/Write, and 1:1 Pure Read/Pure Write. Later builds break out 75:25 curves labeled 4K, 32K, and 256K, and 50:50 and 25:75 curves labeled 256K.]
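Taking the floor over workloads amounts to provisioning against the worst throughput observed across the tested read/write mixes. The sketch below illustrates that rule and a simple admission check against the resulting limit; the function names and the sample measurements are hypothetical, with only the 18 Kop/s floor taken from the slide.

```python
def provisionable_limit(measured_kops):
    """Provisionable throughput is the minimum across all tested workloads."""
    if not measured_kops:
        raise ValueError("need at least one measured workload")
    return min(measured_kops)


def can_admit(existing_reservations, new_reservation, limit):
    """Admit a reservation only if the total stays within the provisionable limit."""
    return sum(existing_reservations) + new_reservation <= limit


# Hypothetical normalized throughputs (Kop/s) for different workload mixes.
measurements = [18.0, 22.5, 31.0, 40.2]
limit = provisionable_limit(measurements)
print(limit, can_admit([6.0, 6.0], 6.0, limit))
```

The cost of this conservatism is that capacity above the floor cannot be reserved, which is the performance-for-predictability tradeoff the evaluation asks about.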

Non-linear IO Performance
[Figure, built over two slides, with two panels against IOP Size (KB), 1 to 256. IO Bandwidth: Bandwidth (MB/s), 0 to 250, with Max BW marked. IOP Throughput: IOP (kop/s), 0 to 40, with Max IOP/s marked. Series: Read Rand, Read Seq, Write Rand, Write Seq.]

Libra Uses Virtual IOPs to Model IO Resources
\[ \mathrm{VOP}_{\mathrm{CPB}}(\text{IOP-size}) = \frac{\text{Max-IOP}}{\text{Achieved-IOP}(\text{IOP-size}) \times \text{IOP-size}} \]
• Unifies IO cost into a single metric
• Captures non-linear IO performance
• Provides IO insulation
[Figure: Libra IO Cost Model. Virtual IOP Cost (op/KB), 0 to 3.5, against IOP Size (KB), 1 to 256. Series: Read IO cost, Write IO cost.]
[Figure: IOP Throughput at 1/2 Max VOPs, for 2 equal-allocation tenants. IOPs (kop/s), 0 to 40, against IOP Size (KB), 1 to 256. Series: Libra Read IO Model, Libra Write IO Model, Max Read, Max Write. IO Insulation = 1/2 Max Read/Write.]
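The cost formula can be evaluated directly once an achieved-throughput curve has been measured. The sketch below is illustrative only: the function names and the throughput values are made up, and real curves would come from benchmarking the device at each IOP size.

```python
def vop_cost_per_kb(max_iops, achieved_iops, size_kb):
    """VOP cost per KB: Max-IOP / (Achieved-IOP(size) * size)."""
    return max_iops / (achieved_iops * size_kb)


def vop_cost(max_iops, achieved_iops, size_kb):
    """Total VOPs charged for one IO of the given size."""
    return vop_cost_per_kb(max_iops, achieved_iops, size_kb) * size_kb


# Hypothetical curve: IOP size (KB) -> achieved IOP/s when run in isolation.
MAX_IOPS = 40_000
achieved = {1: 40_000, 4: 20_000, 16: 8_000}

for size_kb, iops in achieved.items():
    print(size_kb, vop_cost_per_kb(MAX_IOPS, iops, size_kb),
          vop_cost(MAX_IOPS, iops, size_kb))
```

Under this model an operation that runs at full IOP rate costs one VOP, and any operation saturates the device at exactly Max-IOP VOPs per second, which is what lets a single number describe reads and writes of every size.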

Libra Design
[Diagram: the Persistence Engine sits above Libra, which sits above the Physical Disk. Labeled steps:]
• Libra IO Scheduler: charge tenant IOPs based on VOP cost.
• Libra IO Scheduler: track app-request VOP consumption.
• Provisioning Policy: provision VOPs within the provisionable limit.
• Provisioning Policy: update tenant VOP allocations.
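The earlier design slide names DRR (deficit round robin) as the scheduling discipline, and this slide says tenants are charged by VOP cost. A minimal textbook DRR loop that charges VOPs might look like the following; the class, its method names, and the quantum values are hypothetical and are not taken from Libra's code.

```python
from collections import deque


class DrrScheduler:
    """Deficit round robin over per-tenant queues, charging each op its VOP cost."""

    def __init__(self):
        self.quantum = {}   # VOPs credited to a tenant per round (its allocation)
        self.deficit = {}   # unused credit carried between rounds
        self.queues = {}    # pending (op_id, vop_cost) per tenant
        self.charged = {}   # total VOPs consumed per tenant

    def add_tenant(self, tenant, quantum_vops):
        self.quantum[tenant] = quantum_vops
        self.deficit[tenant] = 0.0
        self.queues[tenant] = deque()
        self.charged[tenant] = 0.0

    def submit(self, tenant, op_id, vop_cost):
        self.queues[tenant].append((op_id, vop_cost))

    def run_round(self):
        """Visit every tenant once and return the ops dispatched this round."""
        dispatched = []
        for tenant, queue in self.queues.items():
            if not queue:
                self.deficit[tenant] = 0.0  # idle tenants do not bank credit
                continue
            self.deficit[tenant] += self.quantum[tenant]
            while queue and queue[0][1] <= self.deficit[tenant]:
                op_id, cost = queue.popleft()
                self.deficit[tenant] -= cost
                self.charged[tenant] += cost
                dispatched.append((tenant, op_id))
            if not queue:
                self.deficit[tenant] = 0.0
        return dispatched


sched = DrrScheduler()
sched.add_tenant("A", 4.0)  # equal VOP allocations
sched.add_tenant("B", 4.0)
for i in range(6):
    sched.submit("A", f"r{i}", 1.0)  # small reads, 1 VOP each
for i in range(3):
    sched.submit("B", f"w{i}", 2.0)  # larger writes, 2 VOPs each
print(sched.run_round(), sched.charged)
```

In the first round tenant A issues four cheap operations and tenant B issues two expensive ones, yet both are charged the same number of VOPs, which is the sense in which equal allocations yield equal shares of the device.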

Evaluation
• Does Libra's IO resource model achieve accurate resource allocations?
• Does Libra's IO threshold make an acceptable tradeoff of performance for predictability in a real storage stack?
• Can Libra ensure per-tenant app-request reservations while achieving high utilization?

Libra Achieves Accurate IO Allocations
[Figure: Read-Write IOP Throughput Ratio. Throughput Ratio (0 to 1.2) for Read Tenants (Read 1 KB) and Write Tenants at write sizes W 1KB, W 4KB, W 8KB, W 16KB, W 32KB, W 64KB, W 128KB, and W 256KB, with the interference-free ideal marked.]
Throughput Ratio = Actual / Expected (IO Insulation)
