SLIDE 1

Tarcil: Reconciling Scheduling Speed and Quality in Large Shared Clusters

Christina Delimitrou¹, Daniel Sanchez² and Christos Kozyrakis¹

¹Stanford University, ²MIT

SOCC – August 27th, 2015

SLIDE 2

Executive Summary

- Goals of cluster scheduling:
  - High decision quality → high performance, high cluster utilization
  - High scheduling speed
- Problem: disparity in scheduling designs
  - Centralized schedulers → high quality, low speed
  - Sampling-based schedulers → high speed, low quality
- Tarcil: key scheduling techniques to bridge the gap
  - Account for resource preferences → high decision quality
  - Analytical framework for sampling → predictable performance
  - Admission control → high quality and speed
  - Distributed design → high scheduling speed

SLIDE 3-4

Motivation

- Optimize for scheduling speed (sampling-based, distributed): good for short jobs, bad for long jobs
- Optimize for scheduling quality (centralized, greedy): good for long jobs, bad for short jobs

(Short: ~100 msec, Medium: 1-10 sec, Long: 10 sec-10 min)


SLIDE 5

Key Scheduling Techniques at Scale

SLIDE 6

1. Determine Resource Preferences

- Scheduling quality depends on: interference, heterogeneity, scale up/out, …
  - Exhaustive exploration → infeasible
  - Practical data-mining framework¹
  - Measure the impact of a couple of allocations → estimate for the large space

¹ C. Delimitrou and C. Kozyrakis. Quasar: Resource-Efficient and QoS-Aware Cluster Management. In ASPLOS 2014.
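
As a rough illustration of the data-mining idea (a sketch, not Tarcil's actual implementation): a Quasar-style framework profiles a new job on a couple of resource configurations and uses a low-rank (SVD-based) model of previously run jobs to estimate its quality on all the others. All names and the toy data below are hypothetical.

```python
import numpy as np

def estimate_quality(history, probe, rank=3):
    """Estimate a new job's quality on all configs from a few probed ones.

    history: (jobs x configs) matrix of measured qualities for past jobs.
    probe:   length-configs vector for the new job, np.nan where unprofiled.
    """
    # Low-rank model of past behavior via truncated SVD.
    _, _, Vt = np.linalg.svd(history, full_matrices=False)
    V = Vt[:rank].T                      # (configs x rank) latent config factors

    known = ~np.isnan(probe)
    # Least-squares fit of the new job's latent coordinates to its probes.
    coeffs, *_ = np.linalg.lstsq(V[known], probe[known], rcond=None)
    estimate = V @ coeffs                # project back onto all configs

    estimate[known] = probe[known]       # keep the measured values verbatim
    return np.clip(estimate, 0.0, 1.0)

# Hypothetical example: 4 past jobs x 5 configs, 2 probes for the new job.
history = np.array([[0.9, 0.7, 0.4, 0.8, 0.5],
                    [0.8, 0.6, 0.3, 0.7, 0.4],
                    [0.2, 0.3, 0.9, 0.3, 0.8],
                    [0.3, 0.4, 0.8, 0.2, 0.9]])
probe = np.array([0.85, np.nan, np.nan, 0.75, np.nan])
print(estimate_quality(history, probe))
```
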
SLIDE 7

Example: Quantifying Interference

- Interference: a set of microbenchmarks of tunable intensity (iBench)

- Measure the interference each job tolerates and generates

[Figure: two profiled allocations along the resource-quality axis (QoS at 68% vs. 7% of peak); data mining recovers the missing resource measurements]

¹ C. Delimitrou and C. Kozyrakis. Quasar: Resource-Efficient and QoS-Aware Cluster Management. In ASPLOS 2014.
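
A minimal sketch of the measurement loop (assumptions: an iBench-style microbenchmark whose intensity can be dialed up, plus hypothetical hooks to launch it and to read the co-run application's QoS): ramp the intensity until QoS drops below the target; the last tolerated intensity quantifies the job's sensitivity to that shared resource.

```python
QOS_TARGET = 0.95  # fraction of isolated performance that must be preserved

def tolerated_interference(run_microbenchmark, measure_qos, step=5):
    """Return the highest microbenchmark intensity (0-100) at which the
    co-scheduled application still meets its QoS target.

    run_microbenchmark(intensity) and measure_qos() are hypothetical hooks:
    the first launches an iBench-style benchmark at the given intensity on
    the shared resource, the second returns the app's QoS in [0, 1].
    """
    tolerated = 0
    for intensity in range(0, 101, step):
        run_microbenchmark(intensity)
        if measure_qos() < QOS_TARGET:
            break
        tolerated = intensity
    return tolerated
```

Repeating this per resource (cache, memory bandwidth, network, …) yields the job's tolerated-interference profile; generated interference can be measured symmetrically by running the job against a probe of known sensitivity.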

SLIDE 8

2. Analytical Sampling Framework

- Sample with respect to the required resource quality

SLIDE 9

2. Analytical Sampling Framework

- Fine-grain allocations: partition servers into Resource Units (RUs) → the minimum allocation unit
  - An RU is sized to host single-threaded apps
  - Unused resources are reclaimed

SLIDE 10

2. Analytical Sampling Framework

- Match a new job with required quality Q to appropriate RUs

[Figure: the cluster's RUs, each annotated with its current resource quality QR]
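
A minimal sketch of this matching step (illustrative; the helper names are hypothetical): sample a handful of RUs and keep those whose estimated quality for this job meets the target Q.

```python
import random

def sample_candidates(ru_quality, required_q, sample_size=8):
    """Sample RUs and return those meeting the job's required quality Q.

    ru_quality: dict mapping RU id -> estimated quality in [0, 1] for this
    job (hypothetical; produced by the preference-estimation step above).
    """
    sampled = random.sample(list(ru_quality), min(sample_size, len(ru_quality)))
    good = [ru for ru in sampled if ru_quality[ru] >= required_q]
    # If no sampled RU meets Q, fall back to the best one found; Tarcil
    # instead applies admission control here (see slide 17).
    return good or [max(sampled, key=ru_quality.get)]
```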

SLIDE 11

2. Analytical Sampling Framework

- Rank resources by quality

SLIDE 12-13

2. Analytical Sampling Framework

- Break ties with a fair coin → a uniform distribution of quality

[Figure: CDF of resource quality Q, from worse resources (left) to better resources (right)]

SLIDE 14

2. Analytical Sampling Framework

- Sampling from a uniform distribution → guarantees on resource allocation quality
- With sample size R, the best of the R sampled resources satisfies Pr[Q ≤ x] = x^R
  - Example: with R = 31 samples, Pr[Q < 0.8] = 0.8^31 ≈ 10^-3
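
A quick simulation of this guarantee (a sketch; it assumes sampled qualities are i.i.d. uniform on [0, 1], which is what the tie-breaking step above ensures):

```python
import random

def prob_best_below(x, R, trials=200_000):
    """Empirical Pr[best of R uniform samples < x]."""
    hits = sum(max(random.random() for _ in range(R)) < x
               for _ in range(trials))
    return hits / trials

R = 31  # smallest R with 0.8**R <= 1e-3
print(f"analytical: {0.8**R:.2e}")           # ~9.9e-04
print(f"empirical:  {prob_best_below(0.8, R):.2e}")
```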

SLIDE 15

Validation

- 100-server EC2 cluster
- Short Spark tasks
- Deviation between the analytical and empirical distributions is minimal

SLIDE 16

Sampling at High Load

- At high load, a small sample is unlikely to include high-quality resources, so performance degrades
- Alternatively, the sample size must increase
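
A back-of-the-envelope illustration (my own sketch, not a result from the paper): if only a fraction f of RUs currently offers quality at or above the target, the chance that R random samples all miss is (1 - f)^R, so holding the miss probability at 10^-3 forces R up as f shrinks with load.

```python
import math

def required_samples(f, miss_prob=1e-3):
    """Smallest R with (1 - f)^R <= miss_prob, where f is the fraction of
    RUs that currently meet the quality target."""
    return math.ceil(math.log(miss_prob) / math.log(1 - f))

for load in (0.5, 0.8, 0.9, 0.95):   # cluster utilization
    f = 1 - load                      # crude assumption: qualifying fraction
    print(f"load {load:.0%}: need R >= {required_samples(f)}")
```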

SLIDE 17

3. Admission Control

- Queue jobs based on their required resource quality
- Trade resource quality against waiting time → set a maximum waiting-time limit
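
A minimal sketch of this policy (illustrative; sample() is a hypothetical hook returning the best RU and its quality from a fresh sample): if sampling cannot meet the job's required quality, queue the job briefly so better resources can free up, but never beyond the waiting-time limit.

```python
import time

def admit(sample, required_q, max_wait_s=0.5, poll_s=0.05):
    """Admit a job once sampling finds an RU of sufficient quality, or when
    the maximum waiting time expires (then take the best RU found)."""
    deadline = time.monotonic() + max_wait_s
    best_ru, best_q = sample()
    while best_q < required_q and time.monotonic() < deadline:
        time.sleep(poll_s)            # job waits in the queue; load may drop
        best_ru, best_q = sample()
    return best_ru
```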

SLIDE 18

Tarcil Implementation

- 4,000 lines of code in C/C++ and Python
- Supports apps in various frameworks (Hadoop, Spark, key-value stores)
- Distributed design: concurrent scheduling agents (similar to Omega²)
  - Each agent has a local copy of cluster state, plus one resilient master copy
  - Lock-free optimistic concurrency for conflict resolution (conflicts are rare) → abort and retry
  - 30:1 worker-to-scheduling-agent ratio

² M. Schwarzkopf, A. Konwinski, et al. Omega: flexible, scalable schedulers for large compute clusters. In EuroSys 2013.
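
A minimal sketch of lock-free optimistic concurrency with abort-and-retry (illustrative Python, not Tarcil's C/C++ implementation): each RU in the master state carries a version; an agent commits its claim only if the version it read is unchanged, and otherwise aborts and retries.

```python
import threading

class SharedState:
    """Master copy of cluster state with per-RU versions (sketch)."""
    def __init__(self, num_rus):
        self.owner = [None] * num_rus     # which job holds each RU
        self.version = [0] * num_rus
        self._lock = threading.Lock()     # stands in for a hardware CAS

    def read(self, ru):
        return self.owner[ru], self.version[ru]

    def try_claim(self, ru, job, seen_version):
        """Commit iff nobody changed this RU since we read it."""
        with self._lock:
            if self.version[ru] != seen_version or self.owner[ru] is not None:
                return False              # conflict -> caller aborts and retries
            self.owner[ru] = job
            self.version[ru] += 1
            return True

def schedule(state, job, pick_ru, max_retries=10):
    """Agent loop: pick an RU from the local state copy, then try to commit."""
    for _ in range(max_retries):
        ru = pick_ru()                    # hypothetical: the sampling step above
        owner, version = state.read(ru)
        if owner is None and state.try_claim(ru, job, version):
            return ru
    raise RuntimeError("too many conflicts; rescheduling")
```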

SLIDE 19

Evaluation Methodology

1. TPC-H workload
   - ~40k queries of different types
   - Compare with a centralized scheduler (Quasar) and a distributed scheduler based on random sampling (Sparrow)
   - 110-server EC2 cluster (100 workers, 10 scheduling agents)
     - Homogeneous cluster, no interference
     - Homogeneous cluster, with interference
     - Heterogeneous cluster, with interference

- Metrics:
  - Task performance
  - Performance predictability
  - Scheduling latency

SLIDE 20-22

Evaluation

- Centralized: high overheads; Sparrow and Tarcil: similar
- Centralized and Sparrow: comparable performance; Tarcil: 24% lower completion time
- Centralized outperforms Sparrow; Tarcil: 41% lower completion time and less jitter

SLIDE 23

Scheduling Overheads

(Heterogeneous cluster, with interference)

- Centralized: two orders of magnitude slower than the distributed, sampling-based schedulers
- Sparrow and Tarcil: comparable scheduling overheads

SLIDE 24

Resident Load

- Tarcil and Centralized account for cross-job interference → preserve memcached's QoS
- Sparrow causes QoS violations for memcached

[Figure: memcached QoS under each scheduler]

SLIDE 25

Motivation Revisited

[Figure: scheduling quality vs. task duration (Short: ~100 msec, Medium: 1-10 sec, Long: 10 sec-10 min) for the distributed sampling-based scheduler, the centralized greedy scheduler, and Tarcil]

SLIDE 26

More details in the paper…

- Sensitivity to parameters such as:
  - Cluster load
  - Number of scheduling agents
  - Sample size
  - Task duration, etc.
- Job priorities
- Large allocations
- A generic application scenario (batch and latency-critical) on 200 EC2 servers

SLIDE 27

Conclusions

- Tarcil reconciles high-quality and high-speed scheduling
  - Accounts for resource preferences
  - Analytical sampling framework to improve predictability
  - Admission control to maintain high scheduling quality at high load
  - Distributed design to improve scheduling speed
- Results:
  - 41% better performance than random sampling-based schedulers
  - 100x better scheduling latency than centralized schedulers
  - Predictable allocation quality and performance

SLIDE 28-29

Questions?

Thank you!