Leveraging Approximation to Improve Resource Efficiency in the Cloud


SLIDE 1

Leveraging Approximation to Improve Resource Efficiency in the Cloud

Neeraj Kulkarni, Feng Qi, Glyfina Fernando and Christina Delimitrou

SAIL (Systems, Architecture and Infrastructure Lab), Cornell University

WAX – April 9th 2017

SLIDE 2

Datacenter Underutilization

[Figure: distribution of CPU Utilization (%), 0-100. Twitter (Mesos) [1]: 4-5x; Google (Borg) [2]: 3-5x.]

[1] C. Delimitrou and C. Kozyrakis. Quasar: Resource-Efficient and QoS-Aware Cluster Management, ASPLOS 2014.

[2] L. A. Barroso, U. Holzle. The Datacenter as a Computer, 2013.

SLIDE 3

A Common Approach

- Co-schedule multiple cloud services on the same physical platform
- Often leads to resource interference, especially when sharing cores

[Figure: App1 and App2 co-located on one server]

SLIDE 4

A Common Cure

- Co-schedule one high-priority app and one or more best-effort apps
- Performance is non-critical for best-effort jobs
- Disadvantage: assumes best-effort apps are always low priority

[Figure: high-priority App1 co-located with best-effort App2]

SLIDE 5

Approximate Computing Apps to the Rescue

- Approximate computing apps can absorb a loss of resources as a loss of output quality instead of a loss in performance
- Advantage: the performance of all co-scheduled applications is high priority

[Figure: App1 and App2 both running at high priority]

SLIDE 6

Pliant

- Enables latency-critical & approximate apps to share resources (including cores) without penalizing their performance
- Tunes the degree and type of approximation based on measured interference

[Figure: App1 and App2 managed by the Pliant runtime]

SLIDE 7

Challenges

1. Identify opportunities for approximation
   - ACCEPT (precision, loop perforation, sync elision), algorithmic exploration
2. Lightweight profiling to determine when to employ approximation
   - End-to-end latency/throughput & performance counters
3. Determine what resource(s) to constrain
   - Based on measured interference
4. Determine what type of approximation, and to what extent
   - Based on interference and performance impact

[Figure: App1 and App2 running over the Pliant runtime]
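Loop perforation, one of the ACCEPT transformations named in the first challenge, can be sketched in a few lines. This is only a minimal illustration; `perforated_mean` and its `stride` knob are hypothetical names, not part of ACCEPT or Pliant:

```python
def perforated_mean(values, stride=1):
    """Approximate the mean by sampling every `stride`-th element.

    stride=1 is the precise version; larger strides skip loop
    iterations (loop perforation), trading output quality for
    proportionally less CPU work.
    """
    sampled = values[::stride]
    return sum(sampled) / len(sampled)

data = list(range(1, 101))           # exact mean is 50.5
precise = perforated_mean(data)      # 50.5
approx = perforated_mean(data, 4)    # samples 25 of 100 elements: 49.0
```

Here a 4x perforation does roughly a quarter of the work and shifts the result by about 3%, which is the quality-for-resources trade the slide describes.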

SLIDE 8

Pliant

- DynamoRIO for switching between precise and approximate versions
- Initial implementation; overheads are high but not prohibitive
- Looking into PetaBricks and LLVM

[Figure: Client (workload generator, performance monitor) driving a Server running App1, App2, the Pliant runtime, and an interference monitor]
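DynamoRIO performs the switch by rewriting code at runtime; at a high level the effect is that of a dispatcher that routes calls to either a precise or an approximate kernel. A minimal sketch of that idea (illustrative only, with made-up kernel names; this is not how the DynamoRIO-based implementation works internally):

```python
def dot_precise(xs, ys):
    """Precise dot product."""
    return sum(x * y for x, y in zip(xs, ys))

def dot_perforated(xs, ys):
    """Approximate version: process every other element and rescale."""
    return 2 * sum(x * y for x, y in zip(xs[::2], ys[::2]))

class VersionSwitcher:
    """Routes calls to the precise or approximate version of a kernel,
    mimicking the runtime code switch Pliant drives via DynamoRIO."""

    def __init__(self):
        self.impl = dot_precise

    def set_approximate(self, enabled):
        # The runtime flips this when interference demands approximation.
        self.impl = dot_perforated if enabled else dot_precise

    def __call__(self, xs, ys):
        return self.impl(xs, ys)

dot = VersionSwitcher()
precise = dot([1, 2, 3, 4], [1, 1, 1, 1])   # 10
dot.set_approximate(True)
approx = dot([1, 2, 3, 4], [1, 1, 1, 1])    # 8
```

The point of doing this with binary rewriting rather than an explicit dispatch table is that unmodified application binaries can be switched without recompilation, at the cost of the runtime overheads the slide mentions.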

SLIDE 9

Adaptive Approximation

- Incremental approximation:
  - Employ the minimum amount of approximation (quality loss) needed to restore the performance of the interactive service
  - Keep several versions for each type of approximation; choose among them online
- Interference-aware approximation:
  - Choose the type of approximation that minimizes pressure on the bottlenecked resource
  - Example: high memory interference → prioritize algorithmic tuning; high CPU interference → prioritize sync elision and loop perforation
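The incremental policy above amounts to a simple control loop: raise the approximation level one step at a time until the interactive service is back within its latency target, so no more quality is sacrificed than necessary. A minimal sketch, assuming discrete approximation levels and a callable that reports latency at a given level (all names here are illustrative, not Pliant's actual interface):

```python
def tune_approximation(measure_latency, target_latency, max_level=10):
    """Return the smallest approximation level at which the
    interactive service meets its latency target (incremental
    approximation). Caps at max_level if the target is unreachable."""
    level = 0
    while level < max_level and measure_latency(level) > target_latency:
        level += 1
    return level

# Toy model: each approximation level recovers 100us from a 900us latency.
latencies = {lvl: 900 - 100 * lvl for lvl in range(11)}
chosen = tune_approximation(latencies.get, target_latency=500)
# chosen == 4: the first level whose latency (500us) meets the target
```

The interference-aware half of the policy would sit around this loop, picking which approximation type's levels to walk based on whether the measured bottleneck is memory or CPU.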

SLIDE 10

Methodology

- Latency-critical interactive services: memcached & nginx
- Open-loop workload generator & performance monitor; Facebook traffic pattern
- Approximate computing apps: PARSEC, SPLASH, Spark MLlib
- System: two 2-socket, 40-core servers with 128GB RAM each
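An open-loop generator issues requests on a fixed schedule regardless of when responses return, which is what exposes queueing delay under interference (a closed-loop generator would throttle itself and hide it). A minimal sketch of such a schedule, assuming Poisson arrivals; the exponential inter-arrival model is an assumption here, not a statement about the actual harness:

```python
import random

def open_loop_schedule(rate_rps, duration_s, seed=0):
    """Generate request send times with exponential inter-arrival gaps
    (Poisson arrivals). Open-loop: the schedule is fixed up front and
    never waits for responses."""
    rng = random.Random(seed)
    t, times = 0.0, []
    while True:
        t += rng.expovariate(rate_rps)
        if t >= duration_s:
            return times
        times.append(t)

sends = open_loop_schedule(rate_rps=1000, duration_s=1.0)
# roughly 1000 send timestamps spread over one second
```

A load-testing client would then fire each request at its scheduled timestamp and record per-request latency for the performance monitor.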

SLIDE 11

Evaluation

- memcached sharing physical cores with PARSEC

[Figures: latency; degree of approximation]

SLIDE 12

Conclusions

- Approximate computing: an opportunity to improve cloud efficiency without loss in performance
- Pliant: a cloud runtime to co-schedule interactive services with approximate computing apps
  - Incremental and interference-aware approximation
  - Preserves QoS for the interactive service with minimal loss in quality for the approximate computing application
- Current work:
  - DynamoRIO → PetaBricks/LLVM
  - Add cloud approximate computing applications
  - Improve interference awareness
  - Leverage hardware isolation techniques

SLIDE 13


Questions?