Leveraging Approximation to Improve Resource Efficiency in the Cloud

  1. Leveraging Approximation to Improve Resource Efficiency in the Cloud
     Neeraj Kulkarni, Feng Qi, Glyfina Fernando, and Christina Delimitrou
     SAIL (Systems, Architecture and Infrastructure Lab), Cornell University
     WAX, April 9th, 2017

  2. Datacenter Underutilization
     [Chart: CPU utilization (%) for a Twitter (Mesos) cluster [1] and a Google (Borg) cluster [2], annotated with roughly 4-5x and 3-5x overprovisioning respectively]
     [1] C. Delimitrou and C. Kozyrakis. Quasar: Resource-Efficient and QoS-Aware Cluster Management. ASPLOS 2014.
     [2] L. A. Barroso and U. Hölzle. The Datacenter as a Computer. 2013.

  3. A Common Approach
     [Diagram: App1 and App2 co-located on the same physical platform]
     - Co-schedule multiple cloud services on the same physical platform
     - Often leads to resource interference, especially when sharing cores

  4. A Common Cure
     [Diagram: App1 (high priority) and App2 (best effort) sharing a platform]
     - Co-schedule one high-priority app with one or more best-effort apps
     - Performance is non-critical for the best-effort jobs
     - Disadvantage: assumes best-effort apps are always low priority

  5. Approximate Computing Apps to the Rescue
     - Approximate computing apps can absorb a loss of resources as a loss of output quality instead of a loss in performance
     - Advantage: the performance of all co-scheduled applications remains high priority

  6. Pliant
     [Diagram: Pliant runtime managing App1 and App2 on shared hardware]
     - Enables latency-critical and approximate apps to share resources (including cores) without penalizing their performance
     - Tunes the degree and type of approximation based on measured interference
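
The slide describes Pliant as a feedback loop: measure the interactive service's performance and the interference it experiences, then adjust how aggressively the co-scheduled app approximates. The sketch below illustrates that loop in C++; the QoS target, the level range, and the two monitor/actuator functions are hypothetical placeholders, not the actual Pliant interfaces.

```cpp
// Sketch of a Pliant-style control loop (hypothetical API, not the Pliant
// implementation): periodically compare the interactive service's tail
// latency against its QoS target and raise or lower the degree of
// approximation applied to the co-scheduled best-effort application.
#include <chrono>
#include <cstdio>
#include <thread>

// Placeholder: in Pliant this would come from the client-side performance monitor.
double measure_p99_latency_ms() { return 0.9; }

// Placeholder: in Pliant this would steer the approximate application.
void set_approximation_level(int level) { std::printf("approximation level -> %d\n", level); }

int main() {
    const double qos_target_ms = 1.0;   // example QoS target for the interactive service
    const int    max_level     = 4;     // 0 = fully precise, 4 = most aggressive
    int level = 0;

    for (int tick = 0; tick < 10; ++tick) {          // bounded loop for the sketch
        double p99 = measure_p99_latency_ms();
        if (p99 > qos_target_ms && level < max_level) {
            set_approximation_level(++level);        // QoS violated: approximate more
        } else if (p99 < 0.8 * qos_target_ms && level > 0) {
            set_approximation_level(--level);        // ample slack: restore precision
        }
        std::this_thread::sleep_for(std::chrono::milliseconds(100));
    }
}
```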

  7. Challenges
     1. Identify opportunities for approximation: ACCEPT (precision, loop perforation, sync elision), algorithmic exploration
     2. Lightweight profiling to determine when to employ approximation: end-to-end latency/throughput and performance counters
     3. Determine which resource(s) to constrain: based on measured interference
     4. Determine what type of approximation to apply, and to what extent: based on interference and performance impact
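
Loop perforation, one of the approximation techniques named above, skips a fraction of loop iterations to trade output quality for lower CPU and memory demand. The following toy example (not taken from the actual benchmarks) shows a precise kernel next to a perforated variant.

```cpp
// Toy illustration of loop perforation: a perforated loop samples every
// `stride`-th element, doing roughly 1/stride of the work at some accuracy cost.
#include <cstdio>
#include <vector>

// Precise version: averages every element.
double mean_precise(const std::vector<double>& v) {
    double sum = 0.0;
    for (size_t i = 0; i < v.size(); ++i) sum += v[i];
    return sum / v.size();
}

// Perforated version: skips iterations according to the stride.
double mean_perforated(const std::vector<double>& v, size_t stride) {
    double sum = 0.0;
    size_t n = 0;
    for (size_t i = 0; i < v.size(); i += stride) { sum += v[i]; ++n; }
    return n ? sum / n : 0.0;
}

int main() {
    std::vector<double> data(1000000);
    for (size_t i = 0; i < data.size(); ++i) data[i] = static_cast<double>(i % 100);
    std::printf("precise:    %f\n", mean_precise(data));
    std::printf("perforated: %f\n", mean_perforated(data, 4));
}
```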

  8. Pliant
     [Diagram: client side runs a workload generator and performance monitor; server side runs the Pliant runtime with an interference monitor managing App1 and App2]
     - DynamoRIO for switching between precise and approximate versions
     - Initial implementation: overheads are high but not prohibitive
     - Looking into PetaBricks and LLVM
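
The slide mentions using DynamoRIO to switch between precise and approximate code versions at runtime. The binary-instrumentation details are beyond a short sketch, but the effect can be illustrated at the source level with an atomic function pointer; this is a simplification of the idea, not Pliant's mechanism, which operates on unmodified binaries through DynamoRIO.

```cpp
// Source-level sketch of switching between precise and approximate versions
// of a kernel at runtime. Pliant achieves the same effect with DynamoRIO
// binary instrumentation, without modifying the application source.
#include <atomic>
#include <cstdio>

using Kernel = double (*)(double);

double kernel_precise(double x)     { return x * x * x; }   // full-quality path
double kernel_approximate(double x) { return x * x; }       // cheaper, lower-quality path

// The runtime flips this pointer when the interference monitor requests
// more or less approximation.
std::atomic<Kernel> active_kernel{kernel_precise};

int main() {
    std::printf("precise run:     %f\n", active_kernel.load()(3.0));
    active_kernel.store(kernel_approximate);   // e.g. a QoS violation was detected
    std::printf("approximate run: %f\n", active_kernel.load()(3.0));
}
```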

  9. Adaptive Approximation
     - Incremental approximation:
       - Employ the minimum amount of approximation (quality loss) needed to restore the performance of the interactive service
       - Several versions for each type of approximation; choose online
     - Interference-aware approximation:
       - Choose the type of approximation that minimizes pressure on the bottlenecked resource
       - Example: high memory interference -> prioritize algorithmic tuning; high CPU interference -> prioritize sync elision and loop perforation
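
A compact sketch of the policy described on this slide: escalate the approximation level one step at a time (incremental), and pick the technique that relieves the bottlenecked resource (interference-aware). The enum values, technique names, and thresholds are illustrative assumptions, not the actual Pliant policy code.

```cpp
// Sketch of an incremental, interference-aware approximation policy
// (hypothetical types and mappings).
#include <cstdio>
#include <string>

enum class Bottleneck { Memory, Cpu, None };

struct Decision {
    std::string technique;
    int level;   // 0 = precise; higher = more aggressive approximation
};

Decision choose(Bottleneck b, int current_level, bool qos_violated) {
    // Incremental: add only one extra step of quality loss when QoS is violated,
    // and back off one step when there is slack.
    int level = qos_violated ? current_level + 1
                             : (current_level > 0 ? current_level - 1 : 0);
    switch (b) {
        case Bottleneck::Memory: return {"algorithmic tuning", level};
        case Bottleneck::Cpu:    return {"sync elision / loop perforation", level};
        default:                 return {"none", 0};
    }
}

int main() {
    Decision d = choose(Bottleneck::Memory, /*current_level=*/1, /*qos_violated=*/true);
    std::printf("apply %s at level %d\n", d.technique.c_str(), d.level);
}
```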

  10. Methodology
     - Latency-critical interactive services: memcached and nginx
       - Open-loop workload generator and performance monitor
       - Facebook traffic pattern
     - Approximate computing apps: PARSEC, SPLASH, Spark MLlib
     - System: two 2-socket, 40-core servers with 128 GB RAM each
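
An open-loop generator issues requests on a fixed stochastic schedule regardless of outstanding responses, which is what lets tail latency reflect queueing under interference. Below is a minimal sketch of the idea; the request stub and the 100 RPS rate are illustrative and not the generator used in this work.

```cpp
// Minimal sketch of an open-loop load generator: requests follow a Poisson
// arrival schedule independent of response times. Sending an actual
// memcached/nginx request is left as a stub.
#include <chrono>
#include <cstdio>
#include <random>
#include <thread>

void send_request(int id) {                     // placeholder for an async GET / HTTP request
    std::printf("request %d issued\n", id);
}

int main() {
    const double rate_rps = 100.0;              // example offered load
    std::mt19937 rng(42);
    std::exponential_distribution<double> inter_arrival(rate_rps);   // seconds between requests

    auto next = std::chrono::steady_clock::now();
    for (int i = 0; i < 20; ++i) {
        next += std::chrono::duration_cast<std::chrono::steady_clock::duration>(
            std::chrono::duration<double>(inter_arrival(rng)));
        std::this_thread::sleep_until(next);    // open loop: schedule ignores responses
        send_request(i);
    }
}
```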

  11. Evaluation
     - memcached sharing physical cores with PARSEC
     [Plots: memcached latency and degree of approximation]

  12. Conclusions
     - Approximate computing: an opportunity to improve cloud efficiency without a loss in performance
     - Pliant: a cloud runtime to co-schedule interactive services with approximate computing apps
       - Incremental and interference-aware approximation
       - Preserves QoS for the interactive service with minimal loss in quality for the approximate computing application
     - Current work:
       - DynamoRIO -> PetaBricks/LLVM
       - Add cloud approximate computing applications
       - Improve interference awareness
       - Leverage hardware isolation techniques

  13. Questions?
