Deep Reinforcement Learning based Elasticity-compatible Heterogeneous Resource Management for Time-critical Computing
Zixia Liu, University of Central Florida; Liqiang Wang, University of Central Florida; Gang Quan, Florida International University
Background
- Expanding needs for data analytics call for larger-scale computing infrastructure; multi-cluster computing environments show their benefits and necessity here.
- Examples: institution-owned geo-distributed clusters, hybrid clouds, etc.
An example of a multi-cluster environment: an institution with a cluster at location 1, a cluster at location 2, and a cluster in a public cloud.
- Efficient resource management is needed.
- Many features must be considered for resource management, including cluster heterogeneity and elasticity.
- To consider these features in an integrated way, we present a DRL-based resource management approach for such environments.
Contribution
- We propose a DRL-based approach utilizing:
  - an LSTM model, and
  - multi-target regression with a partial model-sharing mechanism,
  and compare its effectiveness with baselines and another RL approach.
- The approach is designed for distributed multi-cluster computing environments, considering:
  - their heterogeneity, and
  - elasticity compatibility.
- It provides scheduling support for time-critical computing in such a multi-cluster environment.
Problem Description
- Goals for resource management:
(1) Reducing occurrences of missed temporal deadline events. (2) Maintaining a low average execution time ratio for a hybrid workload containing multiple time-critical and general jobs.
- Each cluster in the environment expresses its computing resources as the number of executors it can provide.
- Executors of different clusters may have different computing capabilities.
- Some clusters may be elastic (see the sketch below).
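As a concrete illustration of these properties (not the paper's actual schema; all field names here are assumptions), clusters and jobs could be described as follows:

```python
from dataclasses import dataclass

@dataclass
class ClusterInfo:
    """Hypothetical cluster descriptor, for illustration only."""
    executors_total: int         # resources expressed as an executor count
    executors_free: int          # currently idle executors
    executor_capability: float   # relative computing capability of one executor
    elastic: bool                # whether the cluster can grow/shrink its pool

@dataclass
class JobInfo:
    """Hypothetical job descriptor, for illustration only."""
    executors_needed: int        # executors requested by the job
    deadline: float              # temporal deadline (in time slices)
    time_critical: bool          # time-critical vs. general job
```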
DRL based Approach
- A brief introduction to reinforcement learning
- We are using:
  - reinforcement learning on deep neural networks,
  - with the neural networks serving as value estimators.
DRL based Approach
- Environment
- Action set
- Episode
- State
- Computing system features and status
- Scheduling job information
- Challenges:
  - How do we represent system status and job information as the state for such an environment? (see the sketch below)
  - How should we define the value?
  - What makes an effective value estimator?
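One way to make the state representation concrete is to flatten per-cluster status and the candidate job's features into a single vector. This is a minimal sketch reusing the hypothetical ClusterInfo/JobInfo fields from the earlier sketch; the paper's actual feature set is richer:

```python
import numpy as np

def build_state(clusters, job):
    """Concatenate per-cluster status with the candidate job's features.
    A minimal illustrative sketch, not the paper's exact representation."""
    feats = []
    for c in clusters:
        feats += [
            c.executors_free / max(c.executors_total, 1),  # occupation status
            c.executor_capability,                         # heterogeneity
            1.0 if c.elastic else 0.0,                     # elasticity flag
        ]
    feats += [
        float(job.executors_needed),
        float(job.deadline),
        1.0 if job.time_critical else 0.0,
    ]
    return np.asarray(feats, dtype=np.float32)
```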
DRL based Approach
- Value formula:
H_c: the heterogeneity factor of the cluster.
H_j: the expected heterogeneity factor of the job.
D_j: the number of missed deadlines of job j without resource waiting.
d_j(t): the occurrence of a missed-deadline event of job j at moment t, if not counted in D_j.
M(t): the number of missed deadlines of all jobs in the cluster at moment t with resource waiting.
t_d and t_e: the deployment and termination moments of job j.
γ: the decay factor.
E_t: the number of new jobs deployed to the cluster after t_d, up to moment t.
R_j: the overall average execution delay ratio of job j.
Four penalty terms w.r.t. Improper Heterogeneity and Initial Competition.
- Value definition ideas:
  - Attend to the causes of missed deadlines.
  - Attend to a job's influence on resource competition.
  - Attend to mutual influences among jobs in a cluster.
  - Attend to the influences of heterogeneity and elasticity.
  - Attend to both missed deadlines and the execution delay ratio (see the sketch below).
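To make the composition concrete, here is a hedged sketch of how a value with these ingredients could be assembled from the symbols above; it illustrates the shape of such a value only, not the paper's exact formula:

```python
def action_value(D_j, M, E, gamma, R_j, penalties):
    """Illustrative value composition: the job's own missed deadlines (D_j),
    cluster-wide misses M(t) decayed by gamma**E_t (jobs deployed after t_d
    dilute attribution), the average execution delay ratio R_j, and the
    improper-heterogeneity / initial-competition penalty terms."""
    v = -float(D_j)
    for t, m_t in M.items():        # moments between deployment t_d and termination t_e
        v -= (gamma ** E[t]) * m_t  # misses occurring later are attributed less
    v -= R_j                        # keep the average execution time ratio low
    v -= sum(penalties)             # penalty terms listed above
    return v
```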
DRL based Approach
- DRL model structure and value definition decomposition
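The slide's figure is not reproduced here; a minimal PyTorch sketch of the described structure (an LSTM trunk shared across targets, i.e. partial model sharing, with one regression head per decomposed value component, i.e. multi-target regression) might look as follows. Layer sizes and the number of targets are assumptions:

```python
import torch
import torch.nn as nn

class ValueEstimator(nn.Module):
    """LSTM trunk shared by all targets (partial model sharing) followed by
    one small regression head per decomposed value target."""
    def __init__(self, state_dim, hidden_dim=64, n_targets=3):
        super().__init__()
        self.lstm = nn.LSTM(state_dim, hidden_dim, batch_first=True)  # shared part
        self.heads = nn.ModuleList([                                  # unshared heads
            nn.Sequential(nn.Linear(hidden_dim, 32), nn.ReLU(), nn.Linear(32, 1))
            for _ in range(n_targets)
        ])

    def forward(self, state_seq):
        # state_seq: (batch, time, state_dim) sequence of environment states
        out, _ = self.lstm(state_seq)
        last = out[:, -1, :]                   # features at the last time step
        return torch.cat([h(last) for h in self.heads], dim=-1)
```

Each head can then be trained against its own decomposed value target, and the head outputs combined to score an action.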
DRL based Approach
- Training enhancement techniques
  - Cluster occupation status traverse: towards better cooperation with the LSTM.
  - Training with a decayed learning rate: towards finer model adjustment in later training episodes.
  - Training with randomized workloads: towards more general knowledge from various workloads.
  - Modified ε-greedy exploration: towards utilizing the knowledge of a rule-based model to partially guide exploration (see the sketch after this slide).
  - Solving the multi-job selection dilemma: towards coping with multiple jobs in the job buffer.
Cluster occupation status traverse (figure).
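The modified ε-greedy idea can be sketched in a few lines; the split parameter p_guided and the interface are assumptions, with MAF standing in as the rule-based guide:

```python
import random

def choose_action(q_values, actions, epsilon, rule_based_action, p_guided=0.5):
    """Modified epsilon-greedy sketch: a fraction of explorations follows a
    rule-based model (e.g. MAF) instead of being uniformly random."""
    if random.random() < epsilon:                       # explore
        if random.random() < p_guided:
            return rule_based_action                    # rule-guided exploration
        return random.choice(actions)                   # plain random exploration
    return max(actions, key=lambda a: q_values[a])      # exploit learned values
```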
DRL based Approach
- Training architecture
The training architecture (shown as a figure in the slides): a Job Generation Module, with a job-arriving-pattern-guided workload generator and categorical single-job generation, feeds a global job buffer. A Query Engine retrieves jobs from the buffer and queries the deep-neural-network-based resource manager for value feedback on each job's candidate actions; the job and action with the maximum value in the global job buffer are selected and executed by the Simulation Module (a multi-cluster environment simulation engine with action value calculation and performance metrics collection). The resulting new knowledge is stored in a knowledge replay buffer, and random knowledge retrieval from that buffer drives the reinforcement learning training and model updates (see the loop sketch below).
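A hedged sketch of that loop, with all callables as assumed interfaces rather than the paper's actual API:

```python
from collections import deque
import random

def train(score_fn, simulate_fn, update_fn, job_buffer_fn, n_steps,
          buffer_size=10_000, batch_size=32):
    """score_fn(job, action)    -> estimated value of scheduling job via action
       simulate_fn(job, action) -> observed value from the simulation engine
       update_fn(batch)         -> one model update from replayed knowledge
       job_buffer_fn()          -> iterable of (job, candidate_actions) pairs"""
    replay = deque(maxlen=buffer_size)                  # knowledge replay buffer
    for _ in range(n_steps):
        # Query engine: value every (job, action) pair in the global job buffer,
        # then select the pair with maximum value (multi-job selection).
        job, action = max(
            ((j, a) for j, acts in job_buffer_fn() for a in acts),
            key=lambda ja: score_fn(*ja))
        observed = simulate_fn(job, action)             # simulation feedback
        replay.append((job, action, observed))          # store new knowledge
        if len(replay) >= batch_size:
            update_fn(random.sample(list(replay), batch_size))  # random retrieval
```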
Experiments
- Introduction
  - Experiments via simulation, with a testing environment of 5 clusters. The clusters in this environment are heterogeneous, and 2 of them are elastic as well.
  - Elasticity controller
  - Local intra-cluster scheduler
Experiments
- Comparison:
- Rule-based baselines:
- Random (RAN)
- Round-Robin (RR)
- Most Available First (MAF)
- Another RL approach:
- RL-FC
- Job arriving patterns:
- Uniform, Bernoulli and Beta
- Performance metrics (see the sketch below):
  - TMDL: the total number of missed-deadline occurrences for all jobs in all clusters during execution of the workload.
  - AJER: the average job execution time ratio among all clusters.
  - S_log
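The two main metrics can be computed straightforwardly; the field names below are hypothetical, and the exact normalization used for AJER in the paper may differ:

```python
def tmdl(jobs):
    """Total number of missed-deadline occurrences over all jobs in all
    clusters during the workload (field names are hypothetical)."""
    return sum(j.missed_deadlines for j in jobs)

def ajer(jobs):
    """Average job execution time ratio: actual execution time over the
    expected (unimpeded) execution time, averaged over all jobs."""
    return sum(j.exec_time / j.expected_exec_time for j in jobs) / len(jobs)
```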
Experiments
Performance comparison (TMDL) of our deep RL approach RL-LSFC and baseline approaches over different training episodes.
Experiments
Comparison of RL-LSFC and MAF over 50 testing episodes. (L) lower is better; (H) higher is better. Fully-dominant (F), Semi-dominant (S), or Non-dominant (N) receives score 1 in an episode if our approach is better than MAF in both, only one, or neither of the two metrics (TMDL and AJER), respectively.
Experiments
Comparison of RL-LSFC and MAF on variant workloads. (a)-(c) correspond to the b=36 scenario and (d)-(f) to b=40, where b is a parameter of the Uniform job arriving pattern.
Experiments
Comparison of RL-LSFC and MAF under other job arriving patterns. (a)-(c): Bernoulli pattern. (d)-(f): Beta pattern.
Experiments
Comparison of three RL models w.r.t. MAF. In (b), we score F:2, S:1, and N:0 to show the dominant area (larger is better) of RL-LSFC and RL-FC; RL-LSFCb is very similar to RL-LSFC here, so it is omitted for readability.
Experiments
Panels: (a) RL-LSFC overall, (b) MAF overall, (c) RL-LSFC Cate-1, (d) MAF Cate-1, (e) RL-LSFC Cate-2, (f) MAF Cate-2, (g) RL-LSFC Cate-3, (h) MAF Cate-3.
Job-cluster scheduling patterns for RL-LSFC and MAF in one testing episode. One point per job and one color per job category. The vertical axis (1-5) is the cluster sequence number; the horizontal axis is the time slice.
Experiments
Panels: RL-LSFC Cate-1, RL-LSFC Cate-2, RL-LSFC Cate-3.
Comparison of job-cluster scheduling patterns across job categories under RL-LSFC control. The value axis shows job counts on a logarithmic scale; the angular axis is the time slice. One color per cluster.
Conclusion
- We obtained an elasticity-compatible resource management approach via DRL for a heterogeneous multi-cluster environment.
- Compared to the best baseline, it:
  - reduces the occurrences of missed execution deadline events for workloads of 1000 jobs by around 5x to 18x,
  - and reduces the average execution time ratio by around 2% to 5%.
- It also performs better than a previous reinforcement-learning-based approach using fully-connected layers.