

  1. IC2E 2017 – Wes J. Lloyd 4/6/2017 (slides 1-6)

     Mitigating Resource Contention and Heterogeneity in Public Clouds for Scientific Modeling Services
     Wes Lloyd, Shrideep Pallickara, Olaf David, Mazdak Arabi, Ken Rojas
     April 6, 2017
     Institute of Technology, University of Washington, Tacoma, Washington USA
     IC2E 2017: IEEE International Conference on Cloud Engineering

     Outline
     • Background
     • Research Questions
     • Experimental Workloads
     • Experiments/Evaluation
     • Conclusions

     Rosetta Protein Folding
     • Computational methods for accurate design of new hyperstable constrained peptides
     • In 53 hours, using 5,904 EC2 compute cores:
       - Generated 5.2 million peptide structures
       - $3,400 spot instances
     • Upfront cost of physical cluster to achieve same result in ~53 hours: $857,752
     • Cloud enables ad hoc large-scale experimentation

     VM-type heterogeneity - Amazon EC2
     • From: Is The Same Instance Type Created Equal, 2013 IEEE Transactions on Cloud Computing

     Research Challenges
     • How can we improve performance and costs for hosting scientific application workloads on the cloud?
       - Resource heterogeneity
       - Resource contention
     • Relative to:
       - HPC
       - Compute clusters
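The cost figures on the Rosetta slide can be put on a per-core-hour basis with a few lines of arithmetic. The sketch below uses only the numbers quoted on the slide; the variable names are illustrative, and the comparison is between one spot-instance bill and the upfront cluster cost only, not a full cost-of-ownership analysis.

```python
# Figures quoted on the Rosetta Protein Folding slide.
hours = 53
cores = 5_904
spot_cost_usd = 3_400
cluster_upfront_usd = 857_752

# Total compute consumed by the run, in core-hours.
core_hours = hours * cores

# Effective price paid per core-hour on spot instances.
cost_per_core_hour = spot_cost_usd / core_hours

# How many times larger the upfront cluster cost is than the spot bill.
upfront_ratio = cluster_upfront_usd / spot_cost_usd

print(f"{core_hours:,} core-hours")
print(f"${cost_per_core_hour:.4f} per core-hour")
print(f"upfront cluster cost is about {upfront_ratio:.0f}x the spot bill")
```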

  2. Slides 7-12

     Trial-and-better
     • Z. Ou et al., 2013 IEEE Trans. on Cloud Computing
     • Using Amazon EC2:
       1. Provision instances
       2. Perform trial(s): VM testing
       3. Keep desired instances
       4. Replace undesirable instances
     • Test: Underlying CPU Type

     VM-Scaler Resource Provisioning

     Trial and Better – VM-Scaler
     • Harness this approach for VM-Pools
     • Ensure every VM has same backing CPU
     • Provide more consistent test results

     VM-Scaler
     • Web services application
     • REST-based/JSON
     • Harnesses EC2 API
     • Manages virtual cloud infrastructure
     • Supports scientific modeling-as-a-service
     • Supports Amazon, Eucalyptus clouds

     Resource Utilization Data Collection
     • Profile resource utilization for scientific workloads running across many VMs
     • Sensor on every VM
     • Transmits data to VM-Scaler
     • Disk
       - dsr: disk sector reads
       - dsreads: disk sector reads completed
       - drm: merged adjacent disk reads
       - readtime: time spent reading from disk
       - dsw: disk sector writes
       - dswrites: disk sector writes completed
       - dwm: merged adjacent disk writes
       - writetime: time spent writing to disk
     • CPU
       - CPU time
       - cpu usr: CPU time in user mode
       - cpu krn: CPU time in kernel mode
       - cpu_idle: CPU idle time
       - contextsw: # of context switches
       - cpu_io_wait: CPU time waiting for I/O
       - cpu_sint_time: CPU time serving soft interrupts
       - loadavg: (# proc / 60 secs)
       - cpuSteal: VM CPU ready, physical CPU unavailable
     • Network
       - nbr: network bytes sent
       - nbs: network bytes received

     CpuSteal
     • CpuSteal: VM's CPU core is ready to execute but the physical CPU core is busy
     • Symptom of overprovisioning physical servers
     • Factors which cause CpuSteal:
       1. Processors shared by too many busy VMs
       2. Hypervisor kernel (Xen dom0) is occupying the CPU
       3. VM's CPU time share <100% for 1 or more cores, and 100% is needed for a CPU intensive workload.
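On Linux guests, steal time is exposed as the eighth counter on the aggregate `cpu` line of `/proc/stat`. The sketch below shows one way a per-VM sensor could turn two samples of that line into a steal percentage. It is an illustration only: the talk does not describe how the VM-Scaler sensor is implemented, the function names are hypothetical, and the sample counter values are made up. The cpuSteal metric in the talk may also be reported as raw ticks rather than a percentage.

```python
# Counter order on the aggregate "cpu" line of /proc/stat (Linux).
FIELDS = ("user", "nice", "system", "idle", "iowait",
          "irq", "softirq", "steal")


def parse_cpu_line(line):
    """Parse the aggregate 'cpu' line into a dict of tick counters."""
    parts = line.split()
    if not parts or parts[0] != "cpu":
        raise ValueError("expected the aggregate 'cpu' line")
    values = [int(v) for v in parts[1:1 + len(FIELDS)]]
    return dict(zip(FIELDS, values))


def read_proc_stat(path="/proc/stat"):
    """Read the current counters from a live Linux system."""
    with open(path) as f:
        return parse_cpu_line(f.readline())


def steal_percent(before, after):
    """Share of elapsed CPU ticks that were stolen between two samples."""
    deltas = {k: after[k] - before[k] for k in FIELDS}
    total = sum(deltas.values())
    return 100.0 * deltas["steal"] / total if total else 0.0


# Made-up sample lines standing in for two reads of /proc/stat.
t0 = parse_cpu_line("cpu 1000 0 500 8000 100 0 50 350 0 0")
t1 = parse_cpu_line("cpu 1600 0 700 8300 120 0 60 420 0 0")
print(f"steal: {steal_percent(t0, t1):.2f}%")
```

On a real VM, `read_proc_stat()` would be called twice with a sleep in between, and the two results passed to `steal_percent`.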

  3. Slides 13-18

     Research Questions
     • RQ1: How common is public cloud VM-type implementation heterogeneity?
     • RQ2: What performance implications result from VM-type heterogeneity for hosting scientific application workloads?

     Research Questions - 2
     • RQ3: How effective is cpuSteal at identifying VMs with high resource contention due to multi-tenancy (e.g. noisy neighbor VMs) in a public cloud?
     • RQ4: What are the performance implications of hosting scientific modeling workloads on worker VMs with consistently high cpuSteal measurements in a public cloud? Is there a pattern to cpuSteal behavior across worker VMs over time?

     CSIP Model Services
     • Cloud Services Innovation Platform
     • Java-based framework to support development of scientific model services (modeling-as-a-service)
     • Increase availability and throughput of models
     • Harness scalable cloud infrastructure
     • Cloud virtualization supports variety of legacy software required for scientific applications (e.g. FORTRAN, Visual C++ 6.0, etc.)

     Scientific Application Workloads
     • Rusle2
       - Soil erosion from water
       - Median runtime ~1.89s
     • WEPS
       - Soil erosion from wind
       - Median runtime ~55s
       - Years of weather data * years of crop rotation

  4. Slides 19-24

     Scientific Modeling Workloads - 2
     • WEPS / RUSLE CPU utilization

     Testing for VM Type Heterogeneity
     • Identified CPU by checking /proc/cpuinfo
     • Launched 50 VMs of a given type
     • If there was heterogeneity, launched 50 more
     • Tested 12 VM types, across 3 generations
       - 1st: m1.medium, m1.large, m1.xlarge, c1.medium, c1.xlarge
       - 2nd: m2.xlarge, m2.2xlarge, and m2.4xlarge
       - 3rd: c3.large, c3.xlarge, c3.2xlarge, m3.large

     Amazon EC2 VM Type Heterogeneity
     (VM type, region: backing CPUs, each with cores, watts, %)
     • m1.medium, us-east-1c: Intel E5-2650 v0 (8c, 95w, 96%); Intel Xeon E5645 (6c, 80w, 4%)
     • m2.xlarge, us-east-1c: Intel Xeon X5550 (4c, 95w, 48%); Intel Xeon E5-2665 v0 (8c, 115w, 42%)
     • m1.large, us-east-1d: Intel Xeon E5-2650 v0 (8c, 95w, 74%); Intel Xeon E5-2651 v2 (12c, 105w, 19%)
     • m1.large, us-east-1d: Intel Xeon E5645 (6c, 80w, 7%); --
     • m2.xlarge, us-east-1d: Intel Xeon E5-2665 v0 (8c, 115w, 78%); Intel Xeon X5550 (4c, 95w, 22%)

     VM Type Heterogeneity Performance Implications
     • Tested small 5-VM pools
     • Compared the two most abundant hardware implementations
       - m1.large - Intel Xeon: E5-2650 v0, 8 cores, 95 w vs. E5-2651 v2, 12 cores, 105 w
       - m2.xlarge - Intel Xeon: E5-2665 v0, 8 cores, 115 w vs. X5550, 4 cores, 95 w
     • Workloads
       - WEPS: 10 x 100 runs
       - RUSLE2: 10 x 660 runs

     VM Type Heterogeneity Performance Variation
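The heterogeneity test (identify each VM's CPU from /proc/cpuinfo, then tally the CPU types across a pool of launched VMs) and the keep/replace step of trial-and-better can be sketched as follows. This is a minimal illustration, not the VM-Scaler implementation: the function names, VM identifiers, and sample data are hypothetical, and launching or terminating instances through the EC2 API is left out.

```python
from collections import Counter


def cpu_model(cpuinfo_text):
    """Return the first 'model name' value from /proc/cpuinfo text."""
    for line in cpuinfo_text.splitlines():
        key, sep, value = line.partition(":")
        if sep and key.strip() == "model name":
            return value.strip()
    return None


def model_shares(models):
    """Percentage of VMs backed by each CPU model."""
    counts = Counter(models)
    total = sum(counts.values())
    return {m: 100.0 * n / total for m, n in counts.items()}


def split_pool(vm_models, desired_model):
    """Trial-and-better step: VMs to keep vs. VMs to replace."""
    keep = [vm for vm, m in vm_models.items() if m == desired_model]
    replace = [vm for vm, m in vm_models.items() if m != desired_model]
    return keep, replace


# Illustrative /proc/cpuinfo excerpt (not captured from the experiments).
SAMPLE_CPUINFO = (
    "processor\t: 0\n"
    "vendor_id\t: GenuineIntel\n"
    "model name\t: Intel(R) Xeon(R) CPU E5-2650 0 @ 2.00GHz\n"
)
print(cpu_model(SAMPLE_CPUINFO))

# Hypothetical pool: VM id -> CPU label reported by that VM.
pool = {"vm-a": "E5-2650", "vm-b": "E5645", "vm-c": "E5-2650", "vm-d": "E5-2650"}
shares = model_shares(pool.values())
keep, replace = split_pool(pool, "E5-2650")
print(shares, keep, replace)
```

In a full trial-and-better loop, the VMs in `replace` would be terminated and relaunched, and the check repeated until every VM in the pool reports the same backing CPU.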
