
Cloud Benchmarking: Estimating Cloud Application Performance Based on Micro Benchmark Profiling



  1. Cloud Benchmarking: Estimating Cloud Application Performance Based on Micro Benchmark Profiling. Joel Scheuner, Master Thesis Defense, 2017-06-15. Software Evolution & Architecture Lab (s.e.a.l.), Department of Informatics.

  2. Problem: The number of cloud instance types has grown continuously (chart: number of instance types from Aug-06 to Aug-16), and the offerings span a wide range, from t2.nano (0.05–1 vCPU, 0.5 GB RAM, $0.006 hourly) to x1.32xlarge (128 vCPUs, 1952 GB RAM, $16.006 hourly). → It is impractical to test all instance types.

  3. Motivation: Micro benchmarks (memory, CPU, I/O, network) are generic, artificial, and resource-specific. Cloud applications are specific, real-world, and resource-heterogeneous, and what matters is their overall performance (e.g., response time). How relevant are micro benchmark results for estimating cloud application performance?

  4. Research Questions
     RQ1 – Performance Variability within Instance Types: Does the performance of equally configured cloud instances vary relevantly?
     RQ2 – Application Performance Estimation across Instance Types: Can a set of micro benchmarks estimate application performance for cloud instances of different configurations?
     RQ2.1 – Estimation Accuracy: How accurately can a set of micro benchmarks estimate application performance?
     RQ2.2 – Micro Benchmark Selection: Which subset of micro benchmarks estimates application performance most accurately?

  5. Methodology: Benchmark Design → Benchmark Execution → Data Pre-Processing → Data Analyses.
     (Figure: example analysis output — boxplots of the Relative Standard Deviation (RSD) [%] per configuration [instance type (region)]; see RQ1 results.)

  6. Performance Data Set

     Instance Type   vCPU   ECU*   RAM [GiB]   Virtualization   Network Performance
     m1.small        1      1      1.7         PV               Low
     m1.medium       1      2      3.75        PV               Moderate
     m3.medium       1      3      3.75        PV / HVM         Moderate
     m1.large        2      4      7.5         PV               Moderate
     m3.large        2      6.5    7.5         HVM              Moderate
     m4.large        2      6.5    8.0         HVM              Moderate
     c3.large        2      7      3.75        HVM              Moderate
     c4.large        2      8      3.75        HVM              Moderate
     c3.xlarge       4      14     7.5         HVM              Moderate
     c4.xlarge       4      16     7.5         HVM              High
     c1.xlarge       8      20     7           PV               High

     * ECU := Elastic Compute Unit (i.e., Amazon's metric for CPU performance)
     The smaller types (m1.small, m1.medium, m3.medium) were acquired in both eu and us regions, the remaining types in eu only; RQ1 uses a subset of these configurations, RQ2 uses all of them.
     >240 Virtual Machines (VMs), 3 iterations each, ~750 VM hours, >60,000 measurements.

  7. RQ1 – Approach
     RQ1 – Performance Variability within Instance Types: Does the performance of equally configured cloud instances vary relevantly?
     For each configuration, 33 VMs of the same instance type (VM 1, VM 2, ..., VM 33) are benchmarked over 3 iterations. For each of the 38 selected metrics, the per-VM averages Avg(VM_1), Avg(VM_2), ..., Avg(VM_33) are summarized by the Relative Standard Deviation (RSD):
     $\mathrm{RSD}_m = 100 \cdot \frac{\sigma_m}{\mu_m}$
     where $\sigma_m$ is the absolute standard deviation and $\mu_m$ the mean of metric m.
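A minimal sketch of this RSD computation in Python, using invented measurement values for illustration (the metric, VM names, and numbers are hypothetical; this is not the thesis tooling):

```python
import statistics

def rsd(per_vm_averages):
    """Relative Standard Deviation: 100 * sample standard deviation / mean."""
    return 100 * statistics.stdev(per_vm_averages) / statistics.mean(per_vm_averages)

# Hypothetical input: 3 iterations for each of several VMs of the same instance type
# (e.g., a CPU benchmark duration metric, in seconds).
iterations_per_vm = {
    "vm-01": [812.0, 805.3, 818.9],
    "vm-02": [790.1, 795.6, 801.2],
    "vm-03": [835.4, 828.8, 840.0],
    "vm-04": [802.7, 799.9, 808.4],
}
# Aggregate the iterations to one average per VM, then summarize variability across VMs.
per_vm_avg = [statistics.mean(values) for values in iterations_per_vm.values()]
print(f"RSD = {rsd(per_vm_avg):.2f}%")
```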

  8. RQ1 – Results
     (Figure: boxplots of the Relative Standard Deviation (RSD) [%] across the selected metrics, per configuration [instance type (region)]: m1.small (eu), m1.small (us), m3.medium (eu), m3.medium (us), m3.large (eu). The per-configuration mean RSDs are 6.83, 4.41, 4.3, 3.32, and 3.16%. Annotated outliers: thread latency and random file I/O show high variability; network and sequential file I/O show low variability.)

  9. RQ1 – Implications
     Approaches that exploit hardware heterogeneity [OZL+13, OZN+12, FJV+12] are no longer worthwhile.
     A smaller sample size suffices to confidently assess instance type performance.
     Fair offer [OZL+13].

     [OZL+13] Z. Ou, H. Zhuang, A. Lukyanenko, J. K. Nurminen, P. Hui, V. Mazalov, and A. Ylä-Jääski. Is the same instance type created equal? Exploiting heterogeneity of public clouds. IEEE Transactions on Cloud Computing, 1(2):201–214, 2013.
     [OZN+12] Zhonghong Ou, Hao Zhuang, Jukka K. Nurminen, Antti Ylä-Jääski, and Pan Hui. Exploiting hardware heterogeneity within the same instance type of Amazon EC2. In Proceedings of the 4th USENIX Conference on Hot Topics in Cloud Computing (HotCloud '12), 2012.
     [FJV+12] Benjamin Farley, Ari Juels, Venkatanathan Varadarajan, Thomas Ristenpart, Kevin D. Bowers, and Michael M. Swift. More for your money: Exploiting performance heterogeneity in public clouds. In Proceedings of the 3rd ACM Symposium on Cloud Computing (SoCC '12), pages 20:1–20:14, 2012.

 10. RQ2 – Approach
     RQ2 – Application Performance Estimation across Instance Types: Can a set of micro benchmarks estimate application performance for cloud instances of different configurations?
     Micro benchmark results (micro_1, micro_2, ..., micro_N) and application benchmark results (app_1, app_2) are collected for instance types 1 (m1.small) through 12 (c1.xlarge). A linear regression model then estimates an application metric (e.g., app_1) from a micro benchmark metric (e.g., micro_1) across instance types.
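A minimal sketch of this estimation idea, assuming hypothetical per-instance-type results and scikit-learn's LinearRegression; the thesis pipeline, metrics, and numbers differ:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

# Hypothetical per-instance-type results (both axes invented for illustration):
# x = micro benchmark metric, e.g., a CPU benchmark duration [s]
# y = application metric, e.g., a read response time [ms]
instance_types = ["m1.small", "m3.medium", "m3.large", "c3.large", "c4.xlarge"]
x = np.array([95.0, 60.0, 33.0, 28.0, 14.0]).reshape(-1, 1)
y = np.array([2100.0, 1350.0, 760.0, 640.0, 330.0])

# Hold out one instance type and train on the rest (leave-one-out style evaluation).
test_idx = 2  # treat m3.large as the "new" instance type to be estimated
train = np.ones(len(x), dtype=bool)
train[test_idx] = False

model = LinearRegression().fit(x[train], y[train])
estimate = model.predict(x[[test_idx]])[0]
actual = y[test_idx]
print(f"{instance_types[test_idx]}: estimated {estimate:.0f} ms, "
      f"measured {actual:.0f} ms, relative error {100 * abs(estimate - actual) / actual:.1f}%")
```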

 11. RQ2.1 – Results
     RQ2.1 – Estimation Accuracy: How accurately can a set of micro benchmarks estimate application performance?
     (Figure: scatter plot of WPBench Read – Response Time [ms] against Sysbench – CPU Multi Thread Duration [s] for the instance types m1.small through c1.xlarge, split into train and test groups, with the fitted regression line. The estimation achieves a Relative Error (RE) of 12.5% and R² = 99.2%.)
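For reference, one common way to compute the two accuracy measures reported here; the exact definitions used in the thesis may differ, and the values below are invented:

```python
import numpy as np

def relative_error(actual, estimated):
    """Mean absolute relative error in percent."""
    actual, estimated = np.asarray(actual, float), np.asarray(estimated, float)
    return 100 * np.mean(np.abs(estimated - actual) / actual)

def r_squared(actual, estimated):
    """Coefficient of determination R², in percent."""
    actual, estimated = np.asarray(actual, float), np.asarray(estimated, float)
    ss_res = np.sum((actual - estimated) ** 2)
    ss_tot = np.sum((actual - actual.mean()) ** 2)
    return 100 * (1 - ss_res / ss_tot)

actual = [2100.0, 1350.0, 760.0, 640.0, 330.0]      # measured response times [ms] (hypothetical)
estimated = [2010.0, 1420.0, 705.0, 660.0, 350.0]   # regression estimates [ms] (hypothetical)
print(f"RE = {relative_error(actual, estimated):.1f}%, R² = {r_squared(actual, estimated):.1f}%")
```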

 12. RQ2.2 – Results
     RQ2.2 – Micro Benchmark Selection: Which subset of micro benchmarks estimates application performance most accurately?

     Estimation Results for WPBench Read – Response Time
     Benchmark                        Relative Error [%]   R² [%]
     Sysbench – CPU Multi Thread      12.5                 99.2
     Sysbench – CPU Single Thread     454.0                85.1
     Baseline: vCPUs                  616.0                68.0
     Baseline: ECU                    359.0                64.6
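A minimal sketch of how such a per-predictor comparison could be made: fit one linear regression per candidate metric (micro benchmarks and baseline metrics such as vCPUs) and rank them by estimation error. Predictor names and values are hypothetical, and the in-sample error here is only a simplification of the slide's evaluation:

```python
import numpy as np
from sklearn.linear_model import LinearRegression

app_metric = np.array([2100.0, 1350.0, 760.0, 640.0, 330.0])  # e.g., response time [ms]
predictors = {
    "cpu-multi-thread":  np.array([95.0, 60.0, 33.0, 28.0, 14.0]),
    "cpu-single-thread": np.array([80.0, 78.0, 40.0, 41.0, 39.0]),
    "vcpus (baseline)":  np.array([1, 1, 2, 2, 4], dtype=float),
}

def mean_relative_error(x, y):
    """Fit y ~ x and return the mean absolute relative error in percent (in-sample)."""
    model = LinearRegression().fit(x.reshape(-1, 1), y)
    estimates = model.predict(x.reshape(-1, 1))
    return 100 * np.mean(np.abs(estimates - y) / y)

ranking = sorted((mean_relative_error(x, app_metric), name) for name, x in predictors.items())
for error, name in ranking:
    print(f"{name:20s} relative error = {error:6.1f}%")
```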

 13. RQ2 – Implications
     The selected micro benchmarks are suitable for estimating application performance.
     Benchmarks cannot be used interchangeably → the benchmark configuration is important.
     The baseline metrics vCPU and ECU are insufficient.
     Repeat benchmark execution already during benchmark design → check for variations between iterations.

 14. Related Work
     Application Performance Profiling:
     • System-level resource monitoring [ECA+16, CBMG16]
     • Compiler-level program similarity [HPE+06]
     Application Performance Prediction:
     • Trace and replay with Cloud-Prophet [LZZ+11, LZK+11]
     • Bayesian cloud configuration refinement for big data analytics [ALC+17]

     [ECA+16] Athanasia Evangelinou, Michele Ciavotta, Danilo Ardagna, Aliki Kopaneli, George Kousiouris, and Theodora Varvarigou. Enterprise applications cloud rightsizing through a joint benchmarking and optimization approach. Future Generation Computer Systems, 2016.
     [CBMG16] Mauro Canuto, Raimon Bosch, Mario Macias, and Jordi Guitart. A methodology for full-system power modeling in heterogeneous data centers. In Proceedings of the 9th International Conference on Utility and Cloud Computing (UCC '16), pages 20–29, 2016.
     [HPE+06] Kenneth Hoste, Aashish Phansalkar, Lieven Eeckhout, Andy Georges, Lizy K. John, and Koen De Bosschere. Performance prediction based on inherent program similarity. In Proceedings of the 15th International Conference on Parallel Architectures and Compilation Techniques (PACT '06), pages 114–122, 2006.
     [LZZ+11] Ang Li, Xuanran Zong, Ming Zhang, Srikanth Kandula, and Xiaowei Yang. Cloud-prophet: Predicting web application performance in the cloud. ACM SIGCOMM Poster, 2011.
     [LZK+11] Ang Li, Xuanran Zong, Srikanth Kandula, Xiaowei Yang, and Ming Zhang. Cloud-prophet: Towards application performance prediction in cloud. In Proceedings of the ACM SIGCOMM 2011 Conference (SIGCOMM '11), pages 426–427, 2011.
     [ALC+17] Omid Alipourfard, Hongqiang Harry Liu, Jianshu Chen, Shivaram Venkataraman, Minlan Yu, and Ming Zhang. Cherrypick: Adaptively unearthing the best cloud configurations for big data analytics. In 14th USENIX Symposium on Networked Systems Design and Implementation (NSDI 17), 2017.

