QoS-Aware Admission Control in Heterogeneous Datacenters
Christina Delimitrou and Christos Kozyrakis, Stanford University
ICAC, June 28th, 2013

Cloud DC Scheduling
[Diagram: workloads are submitted to a DC scheduler, which places them on servers (S) using system state and metrics]
• Workloads are unknown → random apps submitted for short periods
• Significant churn (app arrivals/departures) → not large long-running apps
• High variability in workloads (runtime, number of threads, etc.)
• Fast admission & scheduling decisions

Users are Interested in
• Fast Execution Time: the amount of time the job needs to run
• Low Waiting Time: the amount of time the job is waiting before it gets scheduled

Executive Summary
• Problem: Admission control in large-scale cloud DCs (e.g., EC2, Azure)
  • Heterogeneity → performance/efficiency
  • Interference → performance loss from high interference
  • High arrival rates → system can become oversubscribed
• Background: Paragon is a heterogeneity- and interference-aware scheduler for cloud DCs
• Limitations: In high-load scenarios, demanding workloads can block easy-to-satisfy applications → head-of-line blocking → long waiting time
• ARQ is an admission control protocol for cloud DCs that is:
  • Application-aware: Accounts for the resource quality of each app
  • QoS-aware: Queues applications s.t. their QoS guarantees are preserved
  • Scalable: Scales to 10,000s of applications and servers
  • Lightweight: Low, upper-bounded queueing overheads

Users are Interested in
• Fast Execution Time (Paragon): the amount of time the job needs to run
• Low Waiting Time (ARQ): the amount of time the job is waiting before it gets scheduled

Background: Paragon
• Classification: ~Netflix Challenge
  • Small information signal about new application
  • Leverage system knowledge about previously scheduled applications
  • Collaborative filtering techniques (SVD + PQ reconstruction with SGD)
  • Scheduling recommendations: Heterogeneity (server platform) + Interference (caused (c), tolerated (t))
• Greedy Scheduler:
  • Co-schedule workloads with no/small interference on suitable hardware platforms → preserve QoS & improve utilization
[Diagram: Apps → App Classification (learning heterogeneity and interference) → Scheduler, informed by system state and metrics]
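The collaborative-filtering bullet can be illustrated with a minimal sketch, assuming a sparse table of hypothetical normalized scores (application × server platform). It fits two low-rank factor matrices P and Q with SGD on the observed entries only. Unlike the slide's description, it starts from a random initialization rather than an SVD, and every name and number below is an assumption made for illustration, not Paragon's implementation.

```python
import random

def predict(P, Q, i, j):
    """Predicted score of application i on platform j (dot product of factors)."""
    return sum(p * q for p, q in zip(P[i], Q[j]))

def rmse(scores, P, Q):
    """Root-mean-square error over the observed entries only."""
    total = sum((r - predict(P, Q, i, j)) ** 2 for (i, j), r in scores.items())
    return (total / len(scores)) ** 0.5

def factorize(scores, n_apps, n_platforms, rank=2, lr=0.05, reg=0.01,
              epochs=500, seed=0):
    """Fit scores[(i, j)] ~ dot(P[i], Q[j]) with SGD and L2 regularization."""
    rng = random.Random(seed)
    P = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_apps)]
    Q = [[rng.uniform(-0.1, 0.1) for _ in range(rank)] for _ in range(n_platforms)]
    entries = list(scores.items())
    for _ in range(epochs):
        rng.shuffle(entries)
        for (i, j), r in entries:
            err = r - predict(P, Q, i, j)
            for k in range(rank):
                p, q = P[i][k], Q[j][k]
                P[i][k] += lr * (err * q - reg * p)
                Q[j][k] += lr * (err * p - reg * q)
    return P, Q

# Hypothetical data: (application, platform) -> normalized score in [0, 1].
scores = {
    (0, 0): 0.9, (0, 1): 0.6, (0, 2): 0.2,
    (1, 0): 0.8, (1, 1): 0.5, (1, 2): 0.3,
    (2, 0): 0.3, (2, 1): 0.7, (2, 2): 0.9,
    (3, 0): 0.85,  # new application: profiled on a single platform only
}
P0, Q0 = factorize(scores, 4, 3, epochs=0)  # untrained baseline
P, Q = factorize(scores, 4, 3)
print("RMSE before/after:", rmse(scores, P0, Q0), rmse(scores, P, Q))
print("Estimate for new app on platform 2:", predict(P, Q, 3, 2))
```

The last line shows the point of the technique: the new application was profiled on one platform, yet the factors learned from previously scheduled applications yield an estimate for the others.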

Limitations
• Scheduling in FIFO order:
  • Applications with small resource requirements get blocked behind demanding workloads → head-of-line blocking → long queueing delays
  • Short jobs get blocked behind long jobs
  • High-priority jobs get blocked behind low-priority jobs
• Resource-agnostic queueing of applications:
  • The application at the head of the queue gets dispatched to the first available server → not necessarily a suitable server for that workload
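Head-of-line blocking can be made concrete with a small sketch. It assumes each job carries a required resource quality and each free server offers a quality score (both hypothetical numbers), and it compares strict FIFO dispatch with a simple skip-ahead policy. The skip-ahead policy is only a contrast case for illustration, not ARQ itself.

```python
from collections import deque

def fifo_dispatch(jobs, free_servers):
    """Strict FIFO: only the job at the head of the queue may be dispatched."""
    queue = deque(jobs)
    free = sorted(free_servers)
    placed = []
    while queue:
        name, need = queue[0]
        fit = next((q for q in free if q >= need), None)
        if fit is None:
            break  # the head cannot be placed, so every job behind it waits
        free.remove(fit)
        placed.append((name, fit))
        queue.popleft()
    return placed, [name for name, _ in queue]

def skip_ahead_dispatch(jobs, free_servers):
    """Contrast case: any job that fits a free server is dispatched."""
    free = sorted(free_servers)
    placed, waiting = [], []
    for name, need in jobs:
        fit = next((q for q in free if q >= need), None)
        if fit is None:
            waiting.append(name)
        else:
            free.remove(fit)
            placed.append((name, fit))
    return placed, waiting

# Hypothetical queue: one demanding job ahead of two easy-to-satisfy jobs.
jobs = [("big", 95), ("small-1", 20), ("small-2", 35)]
servers = [40, 60]  # quality scores of the servers that are currently free

fifo_placed, fifo_waiting = fifo_dispatch(jobs, servers)
skip_placed, skip_waiting = skip_ahead_dispatch(jobs, servers)
print(fifo_placed, fifo_waiting)
print(skip_placed, skip_waiting)
```

Under strict FIFO nothing is placed even though two servers are idle, which is the long queueing delay the slide describes.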

ARQ: Application-aware Admission Control
• Resource Quality: Degree of tolerated and caused interference in various shared resources (higher quality means more demanding application)
  • For server j: [equation not preserved]
  • For application i: [equation not preserved]
• Resource quality-aware queueing: Applications are queued based on the resource quality they need
• Multi-class admission control: Each class corresponds to apps with a specific range of Qi → dispatched to servers with the required Qj
• Preserving QoS: Applications can be diverted to different queues to preserve their QoS (when waiting time is high)

ARQ Design
[Figure: ten admission queues ordered by resource-quality range, with Q1 mapped to the highest-quality resources: Q1: [90,100], Q2: [80,90], Q3: [70,80], …, Q10: [0,10]. An arriving application with quality Qi joins the queue for its range.]
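The mapping from a resource-quality score to one of the ten queues can be sketched as follows. This is a minimal illustration assuming scores in [0, 100] and equal-width ranges as shown on the slide. Because the slide's ranges share their endpoints, the sketch assumes a boundary score such as 90 goes to the higher queue; the function and variable names are hypothetical.

```python
from collections import deque

def queue_index(quality, num_queues=10, max_quality=100):
    """Return the 1-based queue number (Q1 = highest range) for a quality score."""
    if not 0 <= quality <= max_quality:
        raise ValueError("quality must lie in [0, max_quality]")
    width = max_quality / num_queues
    # Assumption: a score on a shared boundary (e.g. 90) joins the higher queue.
    index = num_queues - int(quality // width)
    return max(1, index)  # quality == max_quality would otherwise give 0

# One FIFO queue per quality class.
queues = {n: deque() for n in range(1, 11)}

def enqueue(app_name, quality):
    """Place an application in the queue that covers its quality score."""
    n = queue_index(quality)
    queues[n].append(app_name)
    return n

print(enqueue("app-a", 95))  # joins Q1
print(enqueue("app-b", 72))  # joins Q3
print(enqueue("app-c", 4))   # joins Q10
```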

ARQ: Queue Switching -- Utilization
[Figure: queues Q1: [90,100], Q2: [80,90], Q3: [70,80], …, Q10: [0,10]]
• If there are no applications in the higher queue, divert up → suboptimal utilization, but QoS is maintained

ARQ: Queue Switching -- QoS
[Figure: queues Q1: [90,100], Q2: [80,90], Q3: [70,80], …, Q10: [0,10]]
• If a server is available, divert to a lower queue → some QoS degradation

Switching between Queues
• Statistically analyze per-pool freed-server-time → distribution fitting (represent using known distributions)
• Updated every time a new server is freed
• From CDFs of per-pool freed-server-time, compute the optimal switching point between queues
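The per-pool fitting step can be sketched as below. The slide does not say which known distributions are used, so this minimal example assumes an exponential distribution for the time between freed servers in one pool and refits it (maximum-likelihood rate = 1 / mean gap) on every new observation. The class and method names are hypothetical.

```python
import math

class FreedServerTimeModel:
    """Running exponential fit of the gaps between freed servers in one pool."""

    def __init__(self):
        self.count = 0
        self.total = 0.0

    def observe(self, gap_seconds):
        """Record the time since the previous server in this pool was freed."""
        self.count += 1
        self.total += gap_seconds

    def cdf(self, t):
        """Estimated probability that a server is freed within t seconds."""
        if self.count == 0 or self.total == 0:
            return 0.0  # no usable observations yet
        rate = self.count / self.total  # maximum-likelihood rate: 1 / mean gap
        return 1.0 - math.exp(-rate * max(t, 0.0))

pool = FreedServerTimeModel()
for gap in (2.0, 4.0, 6.0):  # hypothetical observed gaps, in seconds
    pool.observe(gap)
print(pool.cdf(4.0))  # probability that a server frees up within the mean gap
```

Keeping only a count and a running sum makes each update constant-time, which fits the slide's requirement that the model is updated every time a server is freed.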

Switching between Queues
• Optimization function:
  • Find switching time t s.t.: maximize Prob[server is freed], subject to total waiting time preserving QoS
• Solving the optimization problem is fast (~msec) and scalable (O(n)) even for large numbers of applications and servers
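One possible reading of this optimization is sketched below, under stated assumptions: the application waits up to time t in its own queue, then needs a further `wait_after_switch` in the target queue, and the sum must stay within a `qos_wait_budget`. The scan picks the feasible t with the highest probability that a server has been freed. The parameter names, the exponential CDF, and the numbers are all hypothetical; the slide does not give the exact formulation.

```python
import math

def switching_time(cdf, candidates, wait_after_switch, qos_wait_budget):
    """Linear scan over candidate times; returns (best_t, probability)."""
    best_t, best_p = None, -1.0
    for t in candidates:
        # Constraint: total waiting time must still preserve QoS.
        if t + wait_after_switch > qos_wait_budget:
            continue
        p = cdf(t)  # probability that a server in the own pool is freed by t
        if p > best_p:
            best_t, best_p = t, p
    if best_t is None:
        return None, 0.0  # no candidate satisfies the waiting-time constraint
    return best_t, best_p

def example_cdf(t):
    """Hypothetical fitted CDF: exponential with rate 0.25 per second."""
    return 1.0 - math.exp(-0.25 * t)

t_switch, prob = switching_time(example_cdf, range(0, 21),
                                wait_after_switch=3, qos_wait_budget=10)
print(t_switch, prob)
```

A single pass over the candidates is linear in their number, in the spirit of the O(n) cost quoted on the slide.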

Methodology
• Workloads:
  • Single-threaded: SPEC CPU2006
  • Multi-threaded: PARSEC, SPLASH-2, BioParallel, MineBench, SPECjbb
  • Multiprogrammed: 4-app mixes of SPEC CPU2006 workloads
  • I/O-bound: Hadoop + data mining (Matlab)
• Small scale:
  • 40 servers, 10 server configurations (Xeons, Atoms, etc.)
  • 178 applications used in four workload scenarios:
    • Low load, high load and oversubscribed
• Large scale: 1,000 EC2 servers, oversubscribed scenario (8,500 apps)

Evaluation: Small Scale
• Paragon + ARQ preserves QoS for 95% of workloads → 94% without ARQ
• Average performance is 99.6% of optimal

Evaluation: Small Scale
• Paragon + ARQ preserves QoS for 82% of workloads → 64% without ARQ
• Average performance is 98% of optimal

Evaluation: Large Scale (EC2)
• Paragon + ARQ preserves QoS for 75% of workloads → 61% without ARQ
• Bounds degradation to less than 10% for 99% of workloads

Other experiments
• Workload scenario with application phases (app requirements change)
• Shortest Job First (SJF) and priorities
• Queueing overheads
• Sensitivity to parameters (e.g., number of queues)
• Distributions of server freed times

Conclusions
• ARQ leverages Paragon to classify applications into multiple queues such that QoS guarantees are preserved and utilization is maximized
• It improves performance for both low-load and, especially, oversubscribed workload scenarios
• It is scalable and lightweight

Questions?
Thank you

