
Rocksteady: Fast Migration for Low-Latency In-memory Storage



  1. Rocksteady: Fast Migration for Low-Latency In-memory Storage
     Chinmay Kulkarni, Aniraj Kesavan, Tian Zhang, Robert Ricci, Ryan Stutsman

  2. Introduction
     • Distributed low-latency in-memory key-value stores are emerging
       • Predictable response times: ~10 µs median, ~60 µs at the 99.9th percentile
     • Problem: Must migrate data between servers
       • Minimize performance impact of migration → go slow?
       • Quickly respond to hot spots, skew shifts, load spikes → go fast?
     • Solution: Fast data migration with low impact
       • Early ownership transfer of data, leverage workload skew
       • Low-priority, parallel, and adaptive migration
     • Result: Migration protocol for the RAMCloud in-memory key-value store
       • Migrates 256 GB in 6 minutes with 99.9th-percentile latency below 250 µs
       • Median latency recovers from 40 µs to 20 µs in 14 s

  3. Why Migrate Data?
     [Figure: Client 1 and Client 2 each issue multiGet() for objects A, B, C, D, which are spread across Server 1 and Server 2. Chart label: 6 Million, No Locality, Fanout=7.]
     • Poor spatial locality → High multiGet() fan-out → More RPCs

  4. Migrate To Improve Spatial Locality
     [Figure: Client 1 and Client 2 each issue multiGet() for objects A, B, C, D; objects B and C are shown moving between Server 1 and Server 2. Chart label: 6 Million, No Locality, Fanout=7.]

  5. Spatial Locality Improves Throughput
     [Figure: Client 1 and Client 2 each issue multiGet() for objects A, B, C, D on Server 1 and Server 2. Chart labels: 25 Million at Full Locality (Fanout=1) versus 6 Million at No Locality (Fanout=7).]
     • Better spatial locality → Fewer RPCs → Higher throughput
     • Benefits multiGet(), range scans
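
The link between locality and fan-out can be illustrated with a small sketch. The two placement functions, the cluster size, and the range size below are hypothetical choices made for this illustration; they are not RAMCloud's actual placement logic.

```python
NUM_SERVERS = 7      # hypothetical cluster size, chosen to echo "Fanout=7"
RANGE_SIZE = 1000    # hypothetical number of consecutive keys per server


def striped_owner(key: int) -> int:
    """No locality: consecutive keys land on different servers."""
    return key % NUM_SERVERS


def range_owner(key: int) -> int:
    """Full locality: consecutive keys share a server."""
    return (key // RANGE_SIZE) % NUM_SERVERS


def fanout(keys, owner_of) -> int:
    """Number of distinct servers (and so RPCs) one multiGet must contact."""
    return len({owner_of(k) for k in keys})


keys = list(range(100, 164))  # one multiGet over 64 related keys
print("no locality fan-out:  ", fanout(keys, striped_owner))  # 7
print("full locality fan-out:", fanout(keys, range_owner))    # 1
```

With striped placement the single multiGet touches all seven servers; with range placement it needs one RPC, which is the effect the slide attributes to better spatial locality.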

  6. The RAMCloud Key-Value Store
     [Figure: four Clients connected through the Data Center Fabric to a Coordinator, four Masters, and four Backups.]
     • All data in RAM
     • Kernel bypass / DPDK
     • 10 µs reads

  7. The RAMCloud Key-Value Store
     [Figure: cluster diagram (Clients, Data Center Fabric, Coordinator, Masters, Backups); a Client issues a Write RPC.]

  8. The RAMCloud Key-Value Store
     [Figure: cluster diagram (Clients, Data Center Fabric, Coordinator, Masters, Backups); a Client issues a Write RPC.]
     • 1x in DRAM
     • 3x on disk

  9. Fault-tolerance & Recovery In RAMCloud
     [Figure: cluster diagram (Clients, Data Center Fabric, Coordinator, Masters, Backups).]

  10. Fault-tolerance & Recovery In RAMCloud
     [Figure: cluster diagram (Clients, Data Center Fabric, Coordinator, Masters, Backups).]
     • 2 seconds to recover

  11. Performance Goals For Migration
     • Maintain low access latency
       • 10 µs median latency → System extremely sensitive
       • Tail latency matters at scale → Even more sensitive
     • Migrate data fast
       • Workloads dynamic → Respond quickly
       • Growing DRAM storage: 512 GB per server
       • Slow data migration → Entire day to scale cluster

  12. Rocksteady Overview: Early Ownership Transfer
     • Problem: Loaded source can bottleneck migration
     • Solution: Instantly shift ownership and all load to target
     [Figure: Clients 1 to 4 send reads and writes to the Source Server; the Target Server sits beside it.]

  13. Rocksteady Overview: Early Ownership Transfer
     • Problem: Loaded source can bottleneck migration
     • Solution: Instantly shift ownership and all load to target
     • Clients instantly redirected
     • All future operations serviced at Target
     • Creates “headroom” to speed migration
     [Figure: Clients 1 to 4 now send their operations to the Target Server instead of the Source Server.]

  14. Rocksteady Overview: Leverage Skew
     • Problem: Data has not arrived at target yet
     • Solution: On-demand migration of unavailable data
     [Figure: a client Read arrives at the Target Server, which issues an On-demand Pull to the Source Server.]

  15. Rocksteady Overview: Leverage Skew
     • Problem: Data has not arrived at target yet
     • Solution: On-demand migration of unavailable data
     • Hot keys move early
     • Median latency recovers to 20 µs in 14 s
     [Figure: Clients 1 to 4 read from the Target Server alongside the Source Server.]
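
The on-demand pull idea can be sketched as a toy model. The `Source` and `Target` classes and their methods are hypothetical names invented for this illustration; the sketch assumes ownership has already moved to the target and ignores writes, background pulls, and failures.

```python
class Source:
    """Still holds every record, but no longer serves client requests."""

    def __init__(self, records):
        self.records = dict(records)

    def pull(self, key):
        # Return a copy of one record to the target (hypothetical RPC).
        return self.records[key]


class Target:
    """Serves all client operations from the moment ownership is transferred."""

    def __init__(self, source):
        self.source = source
        self.records = {}
        self.on_demand_pulls = 0

    def read(self, key):
        if key not in self.records:
            # The record has not arrived yet: fetch just this one now.
            self.records[key] = self.source.pull(key)
            self.on_demand_pulls += 1
        return self.records[key]


source = Source({f"k{i}": f"v{i}" for i in range(1000)})
target = Target(source)

# Skewed workload: 1000 reads that touch only three hot keys.
workload = ["k1", "k2", "k1", "k3", "k1"] * 200
values = [target.read(k) for k in workload]
print(len(values), "reads served with", target.on_demand_pulls, "on-demand pulls")
```

In this toy workload, three pulls are enough for the target to answer every later read locally, which is the intuition behind "hot keys move early": the more skewed the access pattern, the less data has to move before most requests are served from local memory.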

  16. Rocksteady Overview: Adaptive and Parallel
     • Problem: Old single-threaded protocol limited to 130 MB/s
     • Solution: Pipelined and parallel at source and target
     [Figure: the Target Server issues Parallel Pulls and an On-demand Pull to the Source Server while Clients 1 to 4 send requests.]

  17. Rocksteady Overview: Adaptive and Parallel
     • Problem: Old single-threaded protocol limited to 130 MB/s
     • Solution: Pipelined and parallel at source and target
     • Target driven
     • Yields to on-demand pulls
     • Moves 758 MB/s
     [Figure: the Target Server issues Parallel Pulls and an On-demand Pull to the Source Server while Clients 1 to 4 send requests.]
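
To put the two transfer rates in perspective, the following back-of-the-envelope calculation estimates how long it would take to move one server's worth of DRAM (512 GB, from the performance-goals slide) at each rate. Decimal units (1 GB = 10^9 bytes, 1 MB = 10^6 bytes) are an assumption, since the slides do not say which convention they use.

```python
GB = 10**9  # assumption: decimal gigabytes
MB = 10**6  # assumption: decimal megabytes


def transfer_seconds(num_bytes: float, bytes_per_second: float) -> float:
    """Time to move num_bytes at a sustained rate of bytes_per_second."""
    return num_bytes / bytes_per_second


dram = 512 * GB
old = transfer_seconds(dram, 130 * MB)  # old single-threaded protocol
new = transfer_seconds(dram, 758 * MB)  # rate quoted for Rocksteady

print(f"old: {old / 60:.1f} min, new: {new / 60:.1f} min, speedup: {old / new:.1f}x")
# old: 65.6 min, new: 11.3 min, speedup: 5.8x
```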

  18. Rocksteady Overview: Eliminate Sync Replication
     • Problem: Synchronous replication bottleneck at target
     • Solution: Safely defer replication until after migration
     [Figure: the Target Server performs Replication while also issuing Parallel Pulls and an On-demand Pull to the Source Server.]

  19. Rocksteady Overview: Eliminate Sync Replication
     • Problem: Synchronous replication bottleneck at target
     • Solution: Safely defer replication until after migration
     [Figure: Source Server and Target Server with Clients 1 to 4; only the Replication label remains.]

  20. Rocksteady: Putting it all together
     • Instantaneous ownership transfer
       • Immediate load reduction at overloaded source
       • Creates “headroom” for migration work
     • Leverage skew to rapidly migrate hot data
       • Target comes up to speed with little data movement
     • Adaptive parallel, pipelined at source and target
       • All cores avoid stalls, but yield to client-facing operations
     • Safely defer replication at target
       • Eliminates replication bottleneck and contention

  21. Rocksteady
     • Instantaneous ownership transfer
     • Leverage skew to rapidly migrate hot data
     • Adaptive parallel, pipelined at source and target
     • Safely defer synchronous replication at target

  22. Evaluation Setup
     • Four clients, each running YCSB-B (95/5) with Skew=0.99
     • Source Server holds 300 million records (45 GB)
     [Figure: four Clients above a Source Server and a Target Server.]

  23. Evaluation Setup
     • Four clients, each running YCSB-B (95/5) with Skew=0.99
     • 150 million records (22.5 GB) move from the Source Server to the Target Server
     [Figure: four Clients above the two servers; the label "150 Million Records, 22.5 GB" appears at the source, in transit, and at the Target Server.]

  24. Instantaneous Ownership Transfer
     [Figure: bar chart of Source CPU Utilization: 80% before ownership transfer, 25% immediately after transfer; the difference is labeled "Created 55% Source CPU Headroom".]
     • Before migration: Source over-loaded, Target under-loaded
     • Ownership transfer creates Source headroom for migration

  25. Rocksteady
     • Instantaneous ownership transfer
     • Leverage skew to rapidly migrate hot data
     • Adaptive parallel, pipelined at source and target
     • Safely defer synchronous replication at target

  26. Leverage Skew To Move Hot Data
     [Figure: bar chart of 99.9th-percentile latency and median latency for three workloads: Uniform (low skew), Skew=0.99, and Skew=1.5 (high skew). Bar labels: 245 µs, 240 µs, 155 µs, 75 µs, 28 µs, 17 µs. Reference line before migration: median = 10 µs, 99.9th percentile = 60 µs.]
     • After ownership transfer, hot keys pulled on-demand
     • More skew → Median restored faster (migrate fewer hot keys)

  27. Rocksteady
     • Instantaneous ownership transfer
     • Leverage skew to rapidly migrate hot data
     • Adaptive parallel, pipelined at source and target
     • Safely defer synchronous replication at target

  28. Parallel, Pipelined, & Adaptive Pulls
     [Figure: target side. A Target Hash Table with partition marks 0, 8, 16, 24; Per-Core Buffers; Worker Cores running replay, replay, and read(B); a Dispatch Core polling the NIC; a Migration Manager; Pull Buffers being filled by pulling.]
     • Target driven, migration manager
     • Co-partitioned hash tables, pull from partitions in parallel
     • Replay pulled data into per-core buffers

  29. Parallel, Pipelined, & Adaptive Pulls
     [Figure: source side. A Source Hash Table with partition marks 0, 8, 16, 24; worker cores handling read(A), pull(11), and pull(17) copy addresses into Gather Lists; a Dispatch Core polls the NIC.]
     • Stateless passive Source
     • Granular 20 KB pulls
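
One way to picture a stateless source that serves small pulls is sketched below. The `pull` function, its cursor-based signature, the bucket layout, and the record sizes are all hypothetical; the only figures taken from the slides are the 20 KB pull size and the 0/8/16/24 partition marks.

```python
PULL_BUDGET_BYTES = 20 * 1024  # "Granular 20 KB pulls"


def pull(buckets, cursor, end, budget=PULL_BUDGET_BYTES):
    """Source-side handler: scan hash buckets [cursor, end) and gather records
    until adding another bucket would exceed the byte budget.

    Returns (records, next_cursor). The source keeps no state between calls;
    the target remembers the cursor and sends it with the next pull.
    """
    gathered, used = [], 0
    while cursor < end:
        bucket_bytes = sum(len(value) for _, value in buckets[cursor])
        if gathered and used + bucket_bytes > budget:
            break  # budget reached; the target will resume from this bucket
        gathered.extend(buckets[cursor])
        used += bucket_bytes
        cursor += 1
    return gathered, cursor


# Hypothetical source table: 32 buckets, each holding three 1 KB records.
NUM_BUCKETS = 32
buckets = [[(f"key-{b}-{i}", bytes(1024)) for i in range(3)]
           for b in range(NUM_BUCKETS)]
partitions = [(0, 8), (8, 16), (16, 24), (24, 32)]

# Target side: drive each partition to completion. The loop is sequential
# here for simplicity; the slides describe these pulls running in parallel.
received, num_pulls = [], 0
for start, end in partitions:
    cursor = start
    while cursor < end:
        batch, cursor = pull(buckets, cursor, end)
        received.extend(batch)
        num_pulls += 1

print(len(received), "records in", num_pulls, "pulls")  # 96 records in 8 pulls
```

Because the cursor travels with each request, any source worker can answer any pull, and the target can stop, reorder, or interleave pulls without the source tracking migration progress.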

  30. Parallel, Pipelined, & Adaptive Pulls
     • Redirect any idle CPU for migration
     • Migration yields to regular requests, on-demand pulls
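
The yielding behavior can be sketched with a priority queue. The `Dispatcher` class and the three priority levels are hypothetical; in particular, ranking client requests above on-demand pulls is an assumption of this sketch, since the slide only states that background migration yields to both.

```python
import heapq
import itertools

# Lower value runs first. The relative order of the first two levels is an
# assumption; the slide only says background migration yields to both.
CLIENT, ON_DEMAND_PULL, BACKGROUND_PULL = 0, 1, 2


class Dispatcher:
    """Hands the most urgent pending task to whichever core becomes idle."""

    def __init__(self):
        self._heap = []
        self._seq = itertools.count()  # keeps FIFO order within one priority

    def submit(self, priority, name):
        heapq.heappush(self._heap, (priority, next(self._seq), name))

    def run_next(self):
        _, _, name = heapq.heappop(self._heap)
        return name


d = Dispatcher()
d.submit(BACKGROUND_PULL, "pull(11)")
d.submit(BACKGROUND_PULL, "pull(17)")
d.submit(CLIENT, "read(A)")
d.submit(ON_DEMAND_PULL, "on-demand pull for read(B)")

order = [d.run_next() for _ in range(4)]
print(order)
# ['read(A)', 'on-demand pull for read(B)', 'pull(11)', 'pull(17)']
```

Background pulls run only when nothing more urgent is queued, so idle cores are put to work on migration without delaying client-facing operations.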

  31. Rocksteady
     • Instantaneous ownership transfer
     • Leverage skew to rapidly migrate hot data
     • Adaptive parallel, pipelined at source and target
     • Safely defer synchronous replication at target

  32. Naïve Fault Tolerance During Migration
     • Each server has a recovery log distributed across the cluster
     [Figure: Source and Target servers above five Backups; the Source Recovery Log holds replicas of records A, B, and C spread across the Backups; the Target Recovery Log is shown separately.]

  33. Naïve Fault Tolerance During Migration
     • Migrated data needs to be triplicated to target’s recovery log
     [Figure: record A is migrated from Source to Target; the Source Recovery Log replicas of A, B, and C remain on the Backups.]

  34. Naïve Fault Tolerance During Migration
     • Migrated data needs to be triplicated to target’s recovery log
     [Figure: after record A is migrated, the Target Recovery Log holds three replicas of A on the Backups.]

  35. Synchronous Replication Bottlenecks Migration
     • Synchronous replication reduces migration speed by 34%
     [Figure: records A and B are migrated to the Target; the Target Recovery Log holds three replicas each of A and B on the Backups.]
