
Fighting with Unknowns: Estimating the Performance of Scalable Distributed Storage Systems with Minimal Measurement Data

Moo-Ryong Ra and Hee Won Lee (presenter), AT&T Labs Research. MSST 2019, May 23, 2019.


  1. Fighting with Unknowns: Estimating the Performance of Scalable Distributed Storage Systems with Minimal Measurement Data
     Moo-Ryong Ra and Hee Won Lee (presenter at MSST 2019)
     AT&T Labs Research
     May 23, 2019

  2. Motivation
     Goal
     ◮ To estimate the performance of scalable distributed storage systems (e.g., Ceph and Swift) that use consistent hashing to distribute the workload as evenly as possible across all available compute resources
     Problem
     ◮ Mathematical modeling and black-box approaches require significant effort and extensive data collection
     Our Approach
     ◮ We propose a simple yet accurate performance estimation technique for scalable distributed storage systems
     ◮ Our technique aims to identify the maximum IOPS for an arbitrary read/write ratio with a minimal evaluation process

  3. Our Model
     Claim: If the HW/SW/workload settings remain unchanged, the total processing capability ($C$) of a distributed storage system is invariant for a given IO size.
     $$C = T_{read} + T_{write} \cdot f_{rw}$$
     We can acquire the $f_{rw}$ value with just two data points:
     $$f_{rw} = \frac{T_{100\%read}}{T_{100\%write}}$$
     [Figure: Our Model (Linear). $T_{read}$ (read IOPS) against $T_{write}$ (write IOPS), with the axis intercepts labeled $T_{100\%read}$ and $T_{100\%write}$.]
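
The two-point calibration on this slide can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names are made up for this sketch, and the IOPS figures are hypothetical placeholders rather than measurements from the talk.

```python
def write_cost_factor(t_read_100: float, t_write_100: float) -> float:
    """f_rw = T_100%read / T_100%write (the two calibration points)."""
    if t_write_100 <= 0:
        raise ValueError("100% write IOPS must be positive")
    return t_read_100 / t_write_100


def capacity(t_read: float, t_write: float, f_rw: float) -> float:
    """C = T_read + T_write * f_rw."""
    return t_read + t_write * f_rw


# Hypothetical calibration measurements (illustrative only, not from the slides).
T_READ_100 = 1000.0   # IOPS at 100% read
T_WRITE_100 = 250.0   # IOPS at 100% write

f_rw = write_cost_factor(T_READ_100, T_WRITE_100)
c_at_pure_read = capacity(T_READ_100, 0.0, f_rw)
c_at_pure_write = capacity(0.0, T_WRITE_100, f_rw)
```

By construction, both calibration points give the same $C$, which is equal to the 100% read IOPS.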

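  4. Our Model: arbitrary read/write ratio
     Given that read/write ratio = $R_{read} : R_{write}$ (with $R_{read} + R_{write} = 100$),
     ◮ read IOPS: $T_{read} = k \cdot R_{read}$
     ◮ write IOPS: $T_{write} = k \cdot R_{write}$
     Substituting into the model gives
     $$k \cdot R_{read} + k \cdot R_{write} \cdot f_{rw} = C$$
     Since $C = T_{100\%read}$ (the 100% read case, where $T_{write} = 0$),
     $$k = \frac{T_{100\%read}}{R_{read} + \{100 - R_{read}\} \cdot f_{rw}}$$
     Once we get the value of $k$, it is trivial to obtain $T_{read}$ and $T_{write}$.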
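
As a rough sketch of the formula for $k$, the function below estimates read and write IOPS at an arbitrary read percentage from the two calibration points. The function name `estimate_iops` and the numbers in the usage lines are assumptions made for illustration; they do not come from the talk.

```python
from typing import Tuple


def estimate_iops(t_read_100: float, t_write_100: float,
                  read_pct: float) -> Tuple[float, float]:
    """Return (T_read, T_write) for a read:write ratio of read_pct:(100 - read_pct)."""
    if not 0 <= read_pct <= 100:
        raise ValueError("read_pct must be between 0 and 100")
    if t_write_100 <= 0:
        raise ValueError("100% write IOPS must be positive")
    f_rw = t_read_100 / t_write_100
    write_pct = 100 - read_pct
    # k = T_100%read / (R_read + (100 - R_read) * f_rw)
    k = t_read_100 / (read_pct + write_pct * f_rw)
    return k * read_pct, k * write_pct


# Hypothetical calibration points: 1000 read IOPS, 250 write IOPS.
t_read, t_write = estimate_iops(1000.0, 250.0, read_pct=50)
```

With these placeholder inputs, a 50:50 mix yields 200 read IOPS and 200 write IOPS, and $T_{read} + T_{write} \cdot f_{rw}$ is again 1000.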

  5. Our Model: mixed IO sizes
     Suppose that we have heterogeneous IO sizes $S_1, S_2, \cdots, S_N$ and know the proportion of each IO size to the total IOs, $P_1, P_2, \cdots, P_N$, where $\sum_{i=1}^{N} P_i = 1$.
     $$\bar{k}^{S_1} = \frac{P_1 \cdot T^{S_1}_{100\%read}}{R_{read} + \{100 - R_{read}\} \cdot f^{S_1}_{rw}} = P_1 \cdot k^{S_1}$$
     $$\vdots$$
     $$\bar{k}^{S_N} = \frac{P_N \cdot T^{S_N}_{100\%read}}{R_{read} + \{100 - R_{read}\} \cdot f^{S_N}_{rw}} = P_N \cdot k^{S_N}$$
     Total IOPS can be obtained by:
     $$T_{total} = \sum_{i=1}^{N} \{R^{S_i}_{read} + R^{S_i}_{write}\} \cdot \bar{k}^{S_i} = 100 \cdot \sum_{i=1}^{N} P_i \cdot k^{S_i}$$
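
The mixed-size formula can be sketched as a weighted sum over per-size calibration points. This sketch assumes one read percentage shared by all IO sizes, which is how $R_{read}$ appears in the $\bar{k}$ expressions; the `SizeProfile` type, the function name, and all numbers are hypothetical and used only for illustration.

```python
from typing import NamedTuple, Sequence


class SizeProfile(NamedTuple):
    proportion: float    # P_i, share of this IO size among all IOs
    t_read_100: float    # IOPS at 100% read for this IO size
    t_write_100: float   # IOPS at 100% write for this IO size


def estimate_total_iops(profiles: Sequence[SizeProfile], read_pct: float) -> float:
    """T_total = 100 * sum_i(P_i * k^{S_i}), assuming a shared read percentage."""
    if abs(sum(p.proportion for p in profiles) - 1.0) > 1e-9:
        raise ValueError("proportions must sum to 1")
    total = 0.0
    for p in profiles:
        f_rw = p.t_read_100 / p.t_write_100
        k = p.t_read_100 / (read_pct + (100 - read_pct) * f_rw)
        total += p.proportion * k
    return 100.0 * total


# Hypothetical calibration data for two IO sizes, mixed 50:50.
profiles = [
    SizeProfile(proportion=0.5, t_read_100=1000.0, t_write_100=250.0),
    SizeProfile(proportion=0.5, t_read_100=200.0, t_write_100=100.0),
]
total_all_reads = estimate_total_iops(profiles, read_pct=100)
total_half_reads = estimate_total_iops(profiles, read_pct=50)
```

For a read-only workload the estimate reduces to the proportion-weighted average of the per-size 100% read IOPS.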

  6. Evaluation
     We set up two different distributed storage systems:
     Ceph
     ◮ Block Storage, Strong Consistency, 3x Replication
     ◮ FIO: 104 OpenStack VMs, each running 8 FIO jobs
     [Figure: Ceph testbed. Converged servers (Host 1 to Host 9) running the VMs (vm-01 to vm-104), the Ceph service daemons (Ceph-mon and OSDs on 1.6TB NVMe drives), and logging, monitoring, and alerting, connected by a 25 GbE network link.]
     Swift
     ◮ Object Storage, Eventual Consistency, 3x Replication
     ◮ COSBench: 32 workers
     [Figure: Swift testbed. Host 1 to Host 6 running the Swift service daemons (object servers and a proxy server, on 480GB SSDs) and the Swift client daemons, connected by 10 GbE network links.]

  7. Meaning of $f_{rw}$
     [Our Model] Total processing capability ($C$) is invariant per IO size: $C = T_{read} + T_{write} \cdot f_{rw}$, where $f_{rw} = T_{100\%read} / T_{100\%write}$.
     [Figure: $f_{rw}$ against IO size for Ceph (block size) and Swift (object size), with points labeled 4KB, 512KB, 1MB, 2MB, and 4MB.]
     Note:
     ◮ $f_{rw}$ reflects the load difference between read and write operations
     ◮ The amount of work required for read and write operations can differ greatly depending on the storage system implementation and its configuration

  8. Total Processing Capacity ($C$) per IO Size
     [Our Model] Total processing capability ($C$) is invariant per IO size: $C = T_{read} + T_{write} \cdot f_{rw}$, where $f_{rw} = T_{100\%read} / T_{100\%write}$.
     [Figure: $C$ value against Read Ratio (%) for IO sizes 4KB, 8KB, 16KB, 32KB, 64KB, 128KB, 256KB, 512KB, 1MB, and 2MB. Panels: (a) Ceph, (b) Swift.]
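
One way to check the invariance claim on your own measurements is to compute $C$ at several read ratios and look at how much it varies. The sketch below does that; the helper name `capacity_spread`, the choice of (max − min) / mean as the spread measure, and the sample values are all assumptions for illustration, not data or methodology from the talk.

```python
from typing import List, Sequence, Tuple


def capacity_spread(samples: Sequence[Tuple[float, float]],
                    f_rw: float) -> Tuple[List[float], float]:
    """Compute C = T_read + T_write * f_rw per sample and its relative spread.

    samples holds (T_read, T_write) pairs measured at different read ratios.
    The spread is (max C - min C) / mean C; a small value supports invariance.
    """
    if not samples:
        raise ValueError("need at least one sample")
    cs = [t_read + t_write * f_rw for t_read, t_write in samples]
    mean_c = sum(cs) / len(cs)
    return cs, (max(cs) - min(cs)) / mean_c


# Hypothetical (T_read, T_write) measurements for one IO size.
samples = [(1000.0, 0.0), (480.0, 125.0), (210.0, 200.0), (0.0, 250.0)]
f_rw = 1000.0 / 250.0  # from the 100% read and 100% write samples

cs, spread = capacity_spread(samples, f_rw)
```

With these placeholder samples the spread is about 3%, which would count as roughly invariant under this measure.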

  9. Performance Estimation
     For object size $S_i$, when read/write ratio = $R_{read} : R_{write}$:
     $$k^{S_i} = \frac{T^{S_i}_{100\%read}}{R_{read} + \{100 - R_{read}\} \cdot f^{S_i}_{rw}}$$
     For mixed object sizes ($P_i$ = proportion of object size $S_i$ to total objects):
     $$T_{total} = 100 \cdot \sum_{i=1}^{N} P_i \cdot k^{S_i}$$
     [Figure: IO workloads with mixed object sizes on Swift cluster. Measured and estimated $T_{total}$ (IOPS) against the ratio of 16KB objects (%). Panels: (a) 16KB Read + 1MB Read, (b) 16KB RW + 512KB RW.]

  10. Performance Estimation Error
     The errors between estimated and measured total IOPS are less than 9%.
     [Figure: Estimation error on Swift cluster. Error (%) against the ratio of the first objects (16KB/64KB/16KB) (%) for three workloads: 16KB Rs + 1MB Rs, 64KB Ws + 1MB Ws, and 16KB R/Ws + 512KB R/Ws.]
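
The slide does not spell out how the error is computed. A common choice, assumed here, is the absolute difference relative to the measured value, expressed as a percentage. The function name and the sample pairs below are hypothetical.

```python
from typing import Sequence, Tuple


def estimation_error_pct(estimated: float, measured: float) -> float:
    """Relative error in percent: 100 * |estimated - measured| / measured (assumed metric)."""
    if measured <= 0:
        raise ValueError("measured IOPS must be positive")
    return 100.0 * abs(estimated - measured) / measured


def max_error_pct(pairs: Sequence[Tuple[float, float]]) -> float:
    """Largest relative error over (estimated, measured) pairs."""
    return max(estimation_error_pct(est, meas) for est, meas in pairs)


# Hypothetical (estimated, measured) total IOPS pairs, for illustration only.
pairs = [(950.0, 1000.0), (412.0, 400.0), (780.0, 800.0)]
worst = max_error_pct(pairs)
```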

  11. Conclusion
     1. We proposed a novel technique to accurately estimate the performance of an arbitrarily mixed workload, in terms of read/write ratio and IO size
     2. Our simple technique requires only a few data points, i.e., 100% read IOPS and 100% write IOPS for each IO size
     3. Our technique is applicable to any distributed storage system that distributes the load evenly across the available hardware resources

  12. Any Questions?
     We are hiring a couple of systems researchers:
     ◮ Senior Inventive Scientist (for fresh PhDs)
     ◮ Principal Inventive Scientist (for mid-career professionals)
     Contact: Hee Won Lee, PhD
     Email: knowpd@research.att.com
     Location: Bedminster, New Jersey
