SLIDE 1

PARAGON: QOS-AWARE SCHEDULING FOR HETEROGENEOUS DATACENTERS

Christina Delimitrou and Christos Kozyrakis

Stanford University

ASPLOS – March 18th 2013

SLIDE 2

Executive Summary

- Problem: scheduling in cloud environments (e.g., EC2, Azure)
  - Heterogeneity → performance loss when running on the wrong server
  - Interference → performance loss when interference is high
  - High rates of unknown workloads → no a priori assumptions

- How to get information about a workload?
  - Detailed profiling → intolerable overheads
  - Instead: leverage information about previously scheduled apps → fast and accurate application classification

- Paragon is a scheduling framework that is:
  - Heterogeneity- and interference-aware, app-agnostic
  - Scalable & lightweight: scales to 10,000s of apps and servers
  - Results: 5,000 apps on 1,000 servers → 48% utilization increase, 90% of apps with < 10% degradation

SLIDE 3

Outline

- Motivation
- Application Classification
- Paragon
- Evaluation

SLIDE 4

Cloud DC Scheduling

- Workloads are unknown
  - Random apps are submitted for short periods; even known workloads evolve
- Significant churn (arrivals/departures)
- High variability in workload characteristics
- Decisions must be made fast

[Diagram: applications submitted to the scheduler, which tracks system state and metrics]

SLIDE 5

Common Practice Today

- Least-loaded scheduling
  - Uses CPU & memory availability
  - Ignores heterogeneity
  - Ignores interference
- Poor efficiency
  - Over 48% degradation compared to running alone
  - Some apps won't even finish


SLIDE 8

Insight

- Reason for scheduling inefficiency:
  - Lack of knowledge of application behavior
  - Heterogeneity & interference characteristics
- Existing approach to app characterization: exhaustive profiling
  - High overheads; does not work with unknown apps
- Our work: leverage knowledge about previously scheduled apps
  - Accurate, small data vs. noisy, big data

[Diagram: the scheduler combines system state and metrics with learning-based app classification for heterogeneity and interference]


SLIDE 10

Outline

- Motivation
- Application Classification
- Paragon
- Evaluation

SLIDE 11

Understanding App Behavior

- Goal: quickly extract accurate information about each application to guide scheduling
- Input:
  - A small signal about the new workload
  - A large amount of information about previously scheduled applications
- Output:
  - An understanding of app behavior/requirements → recommendations for scheduling
- Looks like a classification problem
  - Similar to systems used in e-commerce, Netflix, etc.

[Diagram: small app signal + big cluster data → understand app → scheduling insight]

SLIDE 12

Something familiar…

- Collaborative filtering, similar to the Netflix Challenge system
  - Singular Value Decomposition (SVD) + PQ reconstruction (SGD)
  - Leverages the rich information the system already has
- Extracts similarities between applications in:
  - The heterogeneous platforms that benefit them
  - The interference they cause and tolerate in shared resources
- Output: recommendations on platforms and co-scheduled applications (see the sketch below)

[Diagram: sparse utility matrix (users × movies) → initial decomposition via SVD → PQ reconstruction via SGD → reconstructed utility matrix → final decomposition via SVD]
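To make the technique concrete, here is a minimal sketch of the PQ-reconstruction step: factorizing a sparse utility matrix with stochastic gradient descent (here initialized randomly rather than from an SVD, for brevity). The matrix shape, latent dimension, learning rate, and regularization are illustrative assumptions, not Paragon's actual parameters.

```python
import numpy as np

def pq_reconstruct(R, mask, k=4, lr=0.01, reg=0.05, epochs=200):
    """Factor a sparse utility matrix R ~ P @ Q.T via SGD.

    R    : (apps x items) score matrix; only entries where mask is True are known
    mask : boolean matrix marking the observed (profiled) entries
    k    : latent dimension -- an illustrative choice, not Paragon's
    """
    n, m = R.shape
    rng = np.random.default_rng(0)
    P = 0.1 * rng.standard_normal((n, k))  # app factors
    Q = 0.1 * rng.standard_normal((m, k))  # platform/microbenchmark factors
    rows, cols = np.nonzero(mask)
    for _ in range(epochs):
        for i, j in zip(rows, cols):
            err = R[i, j] - P[i] @ Q[j]             # error on a known entry only
            p_old = P[i].copy()
            P[i] += lr * (err * Q[j] - reg * p_old)  # SGD step with
            Q[j] += lr * (err * p_old - reg * Q[j])  # L2 regularization
    return P @ Q.T  # dense reconstruction: predicted scores for every entry

# Toy usage: 5 apps x 4 platforms, two measured scores per app
R = np.zeros((5, 4))
mask = np.zeros_like(R, dtype=bool)
R[0, :2] = [0.9, 0.4]; mask[0, :2] = True
R[1, 1:3] = [0.5, 0.8]; mask[1, 1:3] = True
R[2, [0, 3]] = [0.7, 0.6]; mask[2, [0, 3]] = True
print(pq_reconstruct(R, mask).round(2))
```

The key property the scheduler relies on is that the reconstruction fills in the unobserved entries, so each app only has to be measured on a couple of columns.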

SLIDE 13

Classification for Heterogeneity

- Offline mode:
  - Profile a few apps (20-30) across the different configurations
  - Assign a performance score per run (IPS, QPS, or another system metric)
- Online mode (sketched below):
  - Run each new app briefly (1 min) on two platforms
  - Assign performance scores
  - Derive the missing entries & identify similarities between apps

The Netflix Challenge → Platform Classification:
- Recommend movies to users → recommend platforms to apps
- Utility matrix rows: users → apps
- Utility matrix columns: movies → platforms
- Utility matrix elements: movie ratings → app performance scores
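A sketch of how the online mode could use the factorization above: append the new app's two measured scores as a sparse row, reconstruct, and rank platforms. The function and variable names continue the earlier toy example and are not from Paragon's codebase.

```python
import numpy as np

def recommend_platform(R, mask, new_scores, new_mask):
    """Rank platforms for a new app from two brief profiling runs.

    new_scores : length-m vector with the two measured scores filled in
    new_mask   : boolean vector marking which two platforms were measured
    """
    # Append the new app as a sparse row and re-run PQ reconstruction
    R_aug = np.vstack([R, new_scores])
    mask_aug = np.vstack([mask, new_mask])
    dense = pq_reconstruct(R_aug, mask_aug)   # from the earlier sketch
    predicted = dense[-1]                     # the new app's completed row
    return np.argsort(predicted)[::-1]        # platform indices, best first

# Usage: new app measured on platforms 0 and 2 for ~1 minute each
scores = np.zeros(4); scores[[0, 2]] = [0.8, 0.3]
measured = np.zeros(4, dtype=bool); measured[[0, 2]] = True
# ranking = recommend_platform(R, mask, scores, measured)
```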

SLIDE 14

Classification for Interference

- Two types of interference:
  - Interference the application tolerates
  - Interference the application causes
- Identified sources of interference (SoIs):
  - Cache hierarchy, memory bandwidth/capacity, CPU, network/storage bandwidth

The Netflix Challenge → Interference Classification:
- Recommend movies to users → recommend minimally interfering co-runners to apps
- Utility matrix rows: users → apps
- Utility matrix columns: movies → microbenchmarks (SoIs)
- Utility matrix elements: movie ratings → sensitivity scores to interference

SLIDE 15

Measuring Interference Sensitivity

- Rank the sensitivity of an application to each microbenchmark (0-100%)
- Increase microbenchmark intensity until the application violates its QoS → sensitivity to tolerated interference (sketched below)
- Similarly for sensitivity to caused interference

[Plot: app performance vs. microbenchmark intensity; the intensity at which QoS is violated, e.g., 28%, is the sensitivity score]
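A minimal sketch of the tolerated-interference measurement loop described above. `run_with_microbenchmark` and `qos_violated` are hypothetical hooks standing in for whatever profiling harness co-runs the app with a contentious microbenchmark and checks its QoS.

```python
def tolerated_sensitivity(app, microbenchmark, step=5):
    """Raise microbenchmark intensity (0-100%) until the app violates QoS.

    Returns the highest intensity the app tolerates: its sensitivity score
    for this source of interference (higher = more tolerant).
    """
    tolerated = 0
    for intensity in range(step, 101, step):
        # Hypothetical harness call: co-run app with the microbenchmark
        perf = run_with_microbenchmark(app, microbenchmark, intensity)
        if qos_violated(app, perf):
            break            # QoS violated: the previous intensity is the limit
        tolerated = intensity
    return tolerated

# Caused interference is measured symmetrically: the microbenchmark becomes
# the "victim", and the app's pressure on the shared resource is scored by
# how much it degrades the microbenchmark's performance.
```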

SLIDE 16

Classification Validation

- Large set of ST, MT, MP, and I/O workloads
- 10 server configurations (SC)
- 10 sources of interference (SoI)

Heterogeneity (fraction of applications):

  Metric                         ST     MT     MP     I/O
  Select best SC                 86%    86%    83%    89%
  Select SC within 5% of best    91%    90%    89%    92%

Interference:
- Avg. error across µbenchmarks: 5.3%
- Apps with < 10% error: ST 81%, MT 63%
- SoI with highest error: ST: L1 i-cache (15.8%); MT: LLC capacity (7.8%)

SLIDE 17

Classification Overhead

- Time overhead:
  - Training: 2×1 min runs for heterogeneity (alone) + 2×1 min runs with two microbenchmarks for interference, all in parallel
  - Decision: SVD + PQ reconstruction is O(min(n²m, m²n)) + O(mn); in practice, milliseconds for 1,000s of apps and servers
- Space overhead: 64 B per app and 64 B per server

SLIDE 18

Outline

- Motivation
- Application Classification
- Paragon
- Evaluation

SLIDE 19

Greedy Server Selection

- Two-step process:
  - Select the servers with minimal interference
  - Among them, select the server with the best hardware configuration
- Overview (see the sketch below):
  - Start with the most critical resource
  - Prune servers that would violate QoS
  - Repeat for all resources
  - Select the server with the best HW configuration
  - If no candidate is left, backtrack and relax the QoS requirement
    - Rare, but ensures convergence
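A sketch of the pruning-and-backtracking loop, under assumed data structures: each server exposes per-resource headroom, and the new app's classification gives its interference score per resource, ordered by criticality. The names and the relaxation factor are illustrative, not Paragon's.

```python
def select_server(servers, app, relax=1.0):
    """Greedy selection: prune by interference, then pick the best platform.

    servers : objects with .headroom[resource] (assumed non-negative) and
              .platform_score(app) from the heterogeneity classification
    app     : object with .resources (most critical first) and
              .caused[resource] interference scores
    """
    candidates = list(servers)
    for resource in app.resources:            # most critical resource first
        kept = [s for s in candidates
                if s.headroom[resource] >= app.caused[resource] * relax]
        if not kept:
            # No server satisfies QoS: backtrack and relax the requirement.
            # Rare in practice, but it guarantees the search converges.
            return select_server(servers, app, relax * 0.9)
        candidates = kept
    # Among the non-interfering candidates, pick the best HW configuration
    return max(candidates, key=lambda s: s.platform_score(app))
```

The ordering matters: pruning on the most contended resource first shrinks the candidate set quickly, so the per-app decision stays cheap even at datacenter scale.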

SLIDE 20

Monitor & Adapt

- Sources of inaccuracy:
  - The app goes through phases
  - The app is misclassified
  - The app is mis-scheduled
- Monitor & adapt (see the sketch below):
  1. Reactive phase detection: upon performance degradation, reclassify the workload and search for a more suitable server
  2. Preemptive phase detection: periodically sample a subset of workloads, reclassify them, and if the heterogeneity/interference profile has changed, reschedule before QoS degrades
- Preview: an application scenario with changing workloads appears in the evaluation
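A sketch of the two adaptation paths as a single monitoring pass; `reclassify`, `migrate`, and the sampling scheme are hypothetical stand-ins for Paragon's internals, and `select_server` is the earlier greedy sketch.

```python
import random

def monitor_and_adapt(running_apps, servers, sample_frac=0.05):
    """One pass of reactive + preemptive phase detection."""
    # 1. Reactive: performance already degraded -> reclassify and move
    for app in running_apps:
        if app.performance() < app.qos_target:
            app.profile = reclassify(app)                # hypothetical hook
            migrate(app, select_server(servers, app))    # hypothetical hook

    # 2. Preemptive: periodically re-examine a random subset of workloads
    sample = random.sample(running_apps,
                           max(1, int(sample_frac * len(running_apps))))
    for app in sample:
        new_profile = reclassify(app)
        if new_profile != app.profile:       # phase change detected
            app.profile = new_profile
            migrate(app, select_server(servers, app))  # before QoS degrades
```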

SLIDE 21

Outline

- Motivation
- Application Classification
- Paragon
- Evaluation

SLIDE 22

Methodology

- Workloads:
  - Single-threaded: SPEC CPU2006
  - Multi-threaded: PARSEC, SPLASH-2, BioParallel, MineBench, SPECjbb
  - Multiprogrammed mixes: 350 4-app mixes of SPEC CPU2006
  - I/O: data mining, Matlab, single-node Hadoop
- Systems:
  - Small-scale → 40-machine local cluster (10 configurations)
  - Large-scale → 1,000 EC2 servers (14 configurations)
- Workload scenarios:
  - Low load, high load, with phases, and oversubscribed

SLIDE 23

Evaluation: Small Scale (high load)

- Paragon preserves QoS for 64% of workloads
- Bounds degradation to less than 10% for 90% of workloads

[Chart: per-workload performance under each scheduler; annotations mark Paragon's gain over the alternatives and its distance from optimal]

SLIDE 30

Decision Quality

- LL: poor decision quality for both heterogeneity and interference
- NH: poor platform decisions, good interference decisions
- NI: good platform decisions, poor interference decisions
- Paragon: better than NI on heterogeneity, better than NH on interference

[Chart: fraction of correct scheduling decisions; the annotations read 80% (heterogeneity) and 82% (interference)]

SLIDE 31

Increasing Utilization

- Paragon increases server utilization by 47%:
  - Same performance for the user (QoS guarantees)
  - Better utilization for the DC operator → resource efficiency
- With the baseline (LL):
  - Imbalance in server utilization (too high vs. too low)
  - Per-app QoS violations + increased scenario execution time

[Charts: server utilization over time under Paragon vs. least-loaded (LL)]

SLIDE 32

Workloads with Phases

- QoS is preserved for 75% of applications
  - The other schedulers preserve QoS for < 10% of apps
- Paragon adapts to workload phases over time → performance recovers shortly after a phase change

SLIDE 33

Large Scale (EC2) – High Load

- LL: violates QoS for 99% of workloads
- NH: violates QoS for 96% of workloads
- NI: violates QoS for 97% of workloads

SLIDE 34

Large Scale (EC2) – High Load

- Paragon preserves QoS for 61% of workloads
- Bounds degradation to less than 10% for 90% of workloads

[Chart: per-workload performance on EC2; annotation marks Paragon's gain over the other schedulers]


SLIDE 36

Conclusions

- A heterogeneity- and interference-aware DC scheduler
- Leverages robust analytical methods to quickly classify apps
- Minimizes interference and maximizes utilization
- Scalable and lightweight

SLIDE 37

Questions?

Thank you!

cdel@stanford.edu
http://paragonDC.stanford.edu