Mage: Online and Interference-Aware Scheduling for Multi-Scale Heterogeneous Systems (PowerPoint PPT Presentation)



SLIDE 1

Mage: Online and Interference-Aware Scheduling for Multi-Scale Heterogeneous Systems

Francisco Romero1 and Christina Delimitrou2

1Stanford University, 2Cornell University

PACT – Session 4a – November 2, 2018

SLIDE 2

Motivation

  • Heterogeneity is becoming more prevalent
  • Different server generations
  • Advanced management features, e.g., power management
  • Allows systems to better match applications to the underlying hardware
  • Challenge: how do we maximize application performance while maintaining high resource utilization?

[Diagram: a server with one big core, two small cores, and memory; App 1 and App 2 are mapped onto the cores]

SLIDE 3

Prior Work

System          | Heterogeneous Clusters | Heterogeneous CMPs
Paragon         | ✓                      | ❌
Whare-map       | ✓                      | ❌
Bubble-flux     | ✓                      | ❌
Composite cores | ❌                     | ✓
HASS            | ❌                     | ✓
PIE             | ❌                     | ✓

SLIDE 4

The Problem with “Sum of Schedulers”

[Diagram: a "sum of schedulers" (separate heterogeneous cluster scheduler and heterogeneous CMP scheduler) contrasted with a joint heterogeneous cluster + CMP scheduler built on exhaustive search]

  • Sum of schedulers: suboptimal performance; revisits several scheduling decisions
  • Exhaustive search: high overhead; not scalable

Need a data-driven approach to avoid exhaustive search
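The scaling problem with exhaustive search can be made concrete with a back-of-the-envelope count (an illustration, not a figure from the talk): even restricting attention to one-to-one assignments of n applications onto n distinct cores, the search space already grows as n!.

```python
from math import factorial

# Back-of-the-envelope count of the placement space an exhaustive
# scheduler would search: assigning n applications one-to-one to n
# distinct cores admits n! orderings, before even considering
# co-scheduling effects or per-server choices.
def one_to_one_placements(n_apps: int) -> int:
    return factorial(n_apps)

print(one_to_one_placements(3))    # 6 placements for a 3-app example
print(one_to_one_placements(16))   # ~2.1e13 for a 16-core CMP
```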

SLIDE 5

Mage

  • Tiered runtime scheduler that considers inter- and intra-server heterogeneity jointly
  • Leverages fast, online data mining to quickly explore the space of application placements
  • Lightweight application monitoring and rescheduling
  • Heterogeneous CMPs: 38% average improvement over a greedy scheduler
  • Heterogeneous cluster: 30% average improvement over a greedy scheduler, and 11% average improvement over a heterogeneity- and interference-aware scheduler

SLIDE 6

Mage Master and Mage Agents

[Diagram: a server with a big core, two small cores, and memory; a Mage agent runs per server and reports to the Mage master]

Mage Master
  • Runs inference
  • Makes optimal application-to-resource scheduling decisions
  • Decides when applications should be migrated/rescheduled

Mage Agent
  • Monitors the performance of all scheduled applications
  • Notifies the master when QoS violations occur
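A minimal sketch of this master/agent split follows; the class and method names, QoS targets, and MIPS samples are illustrative assumptions, not the paper's actual interfaces.

```python
# Minimal sketch of the Mage master/agent split; names and thresholds
# are illustrative assumptions, not the paper's actual interfaces.
class MageMaster:
    """Cluster-wide scheduler: receives QoS-violation notifications and
    decides whether to migrate/reschedule the offending application."""
    def __init__(self):
        self.pending = []   # violations awaiting a rescheduling decision

    def notify_qos_violation(self, app, measured, target):
        self.pending.append((app, measured, target))

class MageAgent:
    """Per-server monitor: compares each app's measured performance
    against its QoS target and escalates violations to the master."""
    def __init__(self, master, qos_targets):
        self.master = master
        self.qos_targets = qos_targets   # app -> minimum acceptable MIPS

    def monitor(self, samples):
        # samples: app -> measured MIPS over the last interval
        for app, mips in samples.items():
            target = self.qos_targets[app]
            if mips < target:
                self.master.notify_qos_violation(app, mips, target)

master = MageMaster()
agent = MageAgent(master, {"app1": 100.0, "app2": 80.0})
agent.monitor({"app1": 120.0, "app2": 60.0})  # app2 misses its target
```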

SLIDE 7

Application Arrival and Initial Scheduling

[Diagram: a server with a big core, two small cores, and memory; per-server Mage agents and the Mage master handle the newly arrived application]

SLIDE 8

What we want

Applications × application-to-resource placements:

          P1        P2       …      P6
App1   MIPS1,1   MIPS1,2    …    MIPS1,6
App2   MIPS2,1   MIPS2,2    …    MIPS2,6
App3   MIPS3,1   MIPS3,2    …    MIPS3,6

(Each column Pj is a complete placement, e.g., P1 = App1:Core1, App2:Core2, App3:Core3; P6 = App1:Core3, App2:Core2, App3:Core1.)

✓ Heterogeneous resources that benefit an application
✓ Performance impact of co-scheduling applications

How can Mage quickly and accurately generate this matrix?

SLIDE 9

Collaborative Filtering

  • Use Singular Value Decomposition (SVD) with PQ-reconstruction via Stochastic Gradient Descent (SGD) to uncover:
  • Heterogeneous resources that benefit individual applications
  • Interference that can be tolerated between applications

[Diagram: a sparse app-to-resource utility matrix is decomposed with SVD into U, Σ, V; SGD then reconstructs the missing entries, yielding a dense utility matrix]
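The SVD/SGD pipeline above can be sketched as a small PQ-reconstruction: a sparse utility matrix is factored into low-rank matrices trained by SGD on the observed cells only. The rank, learning rate, and toy MIPS values below are assumptions for illustration, not the paper's parameters.

```python
import numpy as np

def pq_reconstruct(R, mask, k=2, lr=0.01, reg=0.05, epochs=4000, seed=0):
    """Complete a sparse utility matrix R (apps x placements) via
    low-rank factorization R ~= P @ Q.T, trained with SGD on the
    observed cells only (Paragon-style collaborative filtering)."""
    rng = np.random.default_rng(seed)
    n_rows, n_cols = R.shape
    P = 0.1 * rng.standard_normal((n_rows, k))
    Q = 0.1 * rng.standard_normal((n_cols, k))
    observed = list(zip(*np.nonzero(mask)))
    for _ in range(epochs):
        for i, j in observed:
            err = R[i, j] - P[i] @ Q[j]
            P[i] += lr * (err * Q[j] - reg * P[i])
            Q[j] += lr * (err * P[i] - reg * Q[j])
    return P @ Q.T  # dense estimate of every app-to-placement utility

# Toy 3 apps x 4 placements matrix; zeros mean "not measured".
R = np.array([[4., 2., 0., 0.],
              [0., 0., 3., 9.],
              [4., 0., 0., 8.]])
est = pq_reconstruct(R, mask=(R > 0))
```

The observed cells are reproduced closely, and the previously empty cells receive estimates consistent with the learned low-rank structure.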

SLIDE 10
Contentious Kernel Profiling

Contentious kernels (Cont. Kernel 1 [Network], Cont. Kernel 2 [CPU], …, Cont. Kernel n [Cache]) each stress a different shared resource:

         Kernel 1   Kernel 2   …   Kernel n
App1     MIPS1,1    MIPS1,2    …      ?
App2        ?          ?       …   MIPS2,n
App3     MIPS3,1       ?       …      ?

The sparse matrix is completed to:

         Kernel 1   Kernel 2   …   Kernel n
App1     MIPS1,1    MIPS1,2    …   MIPS1,n
App2     MIPS2,1    MIPS2,2    …   MIPS2,n
App3     MIPS3,1    MIPS3,2    …   MIPS3,n

[Diagram: Core1, Core2, and Core3 sharing memory]

Common reference point for the sensitivity of new applications to interference in shared resources

SLIDE 11

Co-Scheduling Sensitivity

[Diagram: a server with a big core, two small cores, and memory]

SLIDE 12

Co-Scheduling Sensitivity

Placements P1–P6 (the six assignments of App1–App3 to Core1–Core3):
P1 = App1:Core1, App2:Core2, App3:Core3
P2 = App1:Core1, App2:Core3, App3:Core2
P3 = App1:Core2, App2:Core1, App3:Core3
P4 = App1:Core2, App2:Core3, App3:Core1
P5 = App1:Core3, App2:Core1, App3:Core2
P6 = App1:Core3, App2:Core2, App3:Core1

          P1        P2        P3       P4   P5      P6
App1   MIPS1,1   MIPS1,2      ?        ?    ?       ?
App2   MIPS2,1      ?         ?        ?    ?    MIPS2,6
App3   MIPS3,1      ?      MIPS3,3     ?    ?       ?

("?" marks placements with no measurement.)

SLIDE 13

          P1        P2        P3        P4        P5        P6
App1   MIPS1,1   MIPS1,2   MIPS1,3   MIPS1,4   MIPS1,5   MIPS1,6
App2   MIPS2,1   MIPS2,2   MIPS2,3   MIPS2,4   MIPS2,5   MIPS2,6
App3   MIPS3,1   MIPS3,2   MIPS3,3   MIPS3,4   MIPS3,5   MIPS3,6

(P1–P6: the six complete assignments of App1–App3 to Core1–Core3, from App1:Core1/App2:Core2/App3:Core3 through App1:Core3/App2:Core2/App3:Core1.)

Co-Scheduling Sensitivity

Profile of the impact of co-scheduling applications on all combinations of resources
SLIDE 14

Initial Application Placement

[Diagram: the Mage master places the applications onto the big and small cores; a Mage agent monitors each server]

SLIDE 15

Runtime Monitoring and Rescheduling

[Diagram: a server with a big core, two small cores, and memory; Mage agents report to the Mage master]

Rescheduling actions, ordered from least to most invasive:
  • Increase resources locally
  • Migrate from a smaller core to a bigger core
  • Migrate across servers
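The escalation order can be sketched as a simple decision ladder. The ordering (least to most invasive) is from the slide; the predicate names are hypothetical.

```python
# Decision ladder for Mage's rescheduling tiers; the ordering (least to
# most invasive) is from the slide, the predicate names are hypothetical.
def choose_action(can_grow_locally, bigger_core_free, other_server_free):
    """Return the least invasive rescheduling action currently possible."""
    if can_grow_locally:
        return "increase resources locally"
    if bigger_core_free:
        return "migrate to bigger core"
    if other_server_free:
        return "migrate across servers"
    return "wait for resources to free up"
```

Trying the cheapest fix first keeps rescheduling overhead low: a local resource increase disturbs nothing else, while a cross-server migration touches the most state.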

SLIDE 16

Evaluation

  • Workloads
    ○ Single- and multi-threaded benchmark suites
    ○ Latency-critical, interactive services
  • Execution scenarios
    ○ Simulated heterogeneous 16-core CMP
    ○ Real 40-server heterogeneous cluster
    ○ Real cluster with core-level heterogeneity using power management (DVFS)
  • Comparison schedulers
    ○ Greedy, Smallest-First, Mage-Static, PIE [ISCA’12], Paragon [ASPLOS’13]

SLIDE 17

Low Error and Scheduling Overhead

Mage has low initial scheduling overhead and low estimation error

  • Reduces the need to adjust scheduling decisions frequently during application lifetime

[Plots: estimation error (%) vs. application mix for the heterogeneous CMP and for the heterogeneous cluster (with and without DVFS); initial scheduling overhead (seconds) vs. application mix for the CMP and the cluster + DVFS]

SLIDE 18

Versus Greedy

Mage outperforms the Greedy scheduler by only allocating the necessary resources to meet an application’s QoS

[Plots: geometric-mean speedup over Greedy vs. application mix for the heterogeneous CMP, the heterogeneous cluster, and the heterogeneous cluster + DVFS]

SLIDE 19

Versus Smallest-First

Mage outperforms the Smallest-First scheduler by not exacerbating contention in shared resources

[Plots: geometric-mean speedup over Smallest-First vs. application mix for the heterogeneous CMP, the heterogeneous cluster, and the heterogeneous cluster + DVFS]

SLIDE 20

Versus Mage-Static

Mage outperforms Mage-Static by rescheduling applications that were mispredicted or that exhibit diurnal patterns

[Plots: geometric-mean speedup over Mage-Static vs. application mix for the heterogeneous CMP, the heterogeneous cluster, and the heterogeneous cluster + DVFS]

SLIDE 21

Versus Paragon+PIE and Paragon+Paragon

Mage outperforms Paragon+PIE and Paragon+Paragon by having a global view of resource availability and per-application resource requirements

[Plots: geometric-mean speedup vs. application mix on the heterogeneous cluster + DVFS, versus Paragon+PIE and versus Paragon+Paragon]

SLIDE 22

Sensitivity to Heterogeneity Increase

  • As the degree of heterogeneity increases, the benefits of using Mage also increase
  • Results are consistent for heterogeneous CMPs as well
  • Scheduling overhead remains minimal as the degree of heterogeneity increases
SLIDE 23

Conclusion

  • Heterogeneity is becoming more prevalent; we need a scheduler that can match applications to their resource needs
  • Mage is a tiered scheduler that bridges the gap between CMP- and cluster-level heterogeneous scheduling
  • Mage leverages a novel staged, parallel SGD algorithm to quickly and accurately classify applications
  • Mage is lightweight and scalable
  • Mage outperforms heterogeneity-agnostic schedulers and the sum of CMP- and cluster-level schedulers

SLIDE 24

Thank you! Questions?

faromero@stanford.edu

SLIDE 25

Backup

SLIDE 26

Versus Paragon

[Plot: geometric-mean speedup over Paragon vs. application mix on the heterogeneous cluster]

SLIDE 27

Versus PIE

[Plot: geometric-mean speedup over PIE vs. application mix on the heterogeneous CMP]

SLIDE 28

Partial Interference Sensitivity – SGD Step 2

          P1        P2        P3        P6
App1   MIPS1,1   MIPS1,2      ?         ?
App2   MIPS2,1      ?         ?      MIPS2,6
App3   MIPS3,1      ?      MIPS3,3      ?

(Columns: four of the six placements; P1 = App1:Core1/App2:Core2/App3:Core3, P2 = App1:Core1/App2:Core3/App3:Core2, P3 = App1:Core2/App2:Core1/App3:Core3, P6 = App1:Core3/App2:Core2/App3:Core1; "?" marks unmeasured entries.)

Solution: Run SGD without those columns, and add them in afterwards

SLIDE 29

Partial Interference Sensitivity – SGD Step 2

          P1        P2        P3        P6
App1   MIPS1,1   MIPS1,2      ?         ?
App2   MIPS2,1      ?         ?      MIPS2,6
App3   MIPS3,1      ?      MIPS3,3      ?

(Columns: four of the six placements; P1 = App1:Core1/App2:Core2/App3:Core3, P2 = App1:Core1/App2:Core3/App3:Core2, P3 = App1:Core2/App2:Core1/App3:Core3, P6 = App1:Core3/App2:Core2/App3:Core1; "?" marks unmeasured entries.)

Solution: Run SGD without those columns, and add them in afterwards

(Result: matrix ASGD1)

SLIDE 30

Partial Interference Sensitivity – SGD Step 2

          P1        P2        P3        P6
App1   MIPS1,1   MIPS1,2   MIPS1,3   MIPS1,6
App2   MIPS2,1   MIPS2,2   MIPS2,3   MIPS2,6
App3   MIPS3,1   MIPS3,2   MIPS3,3   MIPS3,6

(Columns: P1 = App1:Core1/App2:Core2/App3:Core3, P2 = App1:Core1/App2:Core3/App3:Core2, P3 = App1:Core2/App2:Core1/App3:Core3, P6 = App1:Core3/App2:Core2/App3:Core1.)

Solution: Run SGD without those columns, and add them in afterwards

(ASGD1 → ASGD2)

SLIDE 31

Complete Placements – SGD Step 3

          P4 (App1:Core2, App2:Core3, App3:Core1)   P5 (App1:Core3, App2:Core1, App3:Core2)
App1      [minASGD2, maxASGD2]                      [minASGD2, maxASGD2]
App2      [minASGD2, maxASGD2]                      [minASGD2, maxASGD2]
App3      [minASGD2, maxASGD2]                      [minASGD2, maxASGD2]

Populate remaining columns with results from Partial Placements

ASGD2

SLIDE 32

Complete Placements – SGD Step 3

          P4 (App1:Core2, App2:Core3, App3:Core1)   P5 (App1:Core3, App2:Core1, App3:Core2)
App1      MIPS1,4                                   MIPS1,5
App2      MIPS2,4                                   MIPS2,5
App3      MIPS3,4                                   MIPS3,5

Select column from ASGD3 with highest geometric mean for scheduling

(ASGD2 → ASGD3)
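The final selection step ("highest geometric mean") can be sketched as follows, assuming a dense apps × placements matrix of positive MIPS estimates; the 3×3 toy values are illustrative, not from the paper.

```python
import numpy as np

# Final selection step: pick the placement (column) whose per-application
# MIPS estimates have the highest geometric mean.  Toy values below.
def best_placement(utility):
    """utility: apps x placements matrix of positive MIPS estimates."""
    geomean = np.exp(np.log(utility).mean(axis=0))  # per-column gmean
    return int(np.argmax(geomean))

A = np.array([[4.0, 2.0, 8.0],
              [3.0, 9.0, 2.0],
              [5.0, 4.0, 3.0]])
col = best_placement(A)  # column 1 wins: gmean(2, 9, 4) ~ 4.16
```

Computing the geometric mean in log space avoids overflow and weights each application's speedup multiplicatively, so one badly served application drags down the whole placement.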