Learning Scheduling Algorithms for Data Processing Clusters
Hongzi Mao, Malte Schwarzkopf, Shaileshh Bojja Venkatakrishnan, Zili Meng, Mohammad Alizadeh
Motivation
Scheduling is a fundamental task in computer systems
[Diagram: Scheduler; Executor 1, Executor 2, ..., Executor m]
[Diagram: Scheduler; Server 1, Server 2, ..., Server m]
[Diagram: the scheduling agent and its environment]
State: Job DAG 1, ..., Job DAG n; Executor 1, ..., Executor m
Scheduling agent: graph neural network, policy network
Environment: schedulable nodes, objective, reward
Observation of jobs and cluster status
Number of servers working on this job
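The diagram labels a policy network and a set of schedulable nodes. One common way to restrict a policy to the nodes that can currently be scheduled is a masked softmax. The sketch below is an illustration under that assumption, not the deck's implementation; the scores and the mask are made-up inputs, and `masked_softmax` is a hypothetical helper name.

```python
import math

def masked_softmax(scores, schedulable):
    """Turn per-node scores into probabilities over schedulable nodes only.

    scores      -- one float per node (hypothetical policy-network outputs)
    schedulable -- one bool per node; False entries get probability 0
    """
    if not any(schedulable):
        raise ValueError("no schedulable node")
    # Subtract the largest allowed score for numerical stability.
    top = max(s for s, ok in zip(scores, schedulable) if ok)
    exps = [math.exp(s - top) if ok else 0.0
            for s, ok in zip(scores, schedulable)]
    total = sum(exps)
    return [e / total for e in exps]

# Node 1 is not schedulable, so it receives zero probability.
probs = masked_softmax([2.0, 0.5, 1.0], [True, False, True])
```

Masking before normalization keeps the probabilities summing to one over the allowed nodes, so the agent never samples a node whose dependencies are unfinished.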
[Diagram: Job DAG 1, ..., Job DAG n; Server 1, Server 2, Server 3, Server 4, ..., Server m]
Set of identical free executors
[Diagram: Job DAG 1, ..., Job DAG n; Server 1, Server 2, Server 3, Server 4, ..., Server m]
Use 3 servers; use 1 server; use 1 server
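The slide pairs a pool of identical free servers with a per-job server count ("use 3 servers", "use 1 server"). A minimal sketch of handing out servers against such limits might look like the following. The limits are given as inputs here, whereas the slides suggest the agent chooses them; the function name, job names, and server names are all hypothetical.

```python
def assign_executors(free_executors, limits, running=None):
    """Greedily hand identical free executors to jobs up to each job's limit.

    free_executors -- list of executor ids (interchangeable)
    limits         -- dict: job name -> maximum executors for that job
    running        -- dict: job name -> executors the job already holds
    Returns (assignment, leftover_executors).
    """
    running = dict(running or {})
    pool = list(free_executors)
    assignment = {}
    for job, limit in limits.items():
        # Only top the job up to its limit; never exceed it.
        need = max(0, limit - running.get(job, 0))
        assignment[job], pool = pool[:need], pool[need:]
    return assignment, pool

free = ["s1", "s2", "s3", "s4", "s5"]
limits = {"job_a": 3, "job_b": 1, "job_c": 1}
assignment, leftover = assign_executors(free, limits)
```

Because the executors are identical, the choice of which server goes to which job does not matter; only the count per job does.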
[Plots: number of backlogged jobs vs. time]
Initial random policy
Initial random policy: wastes training time
Initial random policy: early reset for initial training
As training proceeds, the stronger policy keeps the queue stable: increase the reset time
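The reset schedule described above can be sketched as a function of the training iteration. The slides state only that episodes are reset early at first and that the reset time increases as training proceeds; the linear growth, the cap, the default constants, and the optional exponential sampling below are all assumptions made for illustration.

```python
import random

def reset_time(iteration, start=50.0, growth=5.0, cap=5000.0, rng=None):
    """Episode length (simulated time before reset) at a training iteration.

    The mean grows linearly from `start` and is clipped at `cap`
    (hypothetical constants). If `rng` is given, the length is sampled
    from an exponential distribution with that mean; otherwise the mean
    itself is returned.
    """
    mean = min(cap, start + growth * iteration)
    if rng is None:
        return mean
    # expovariate takes the rate, i.e. 1 / mean.
    return rng.expovariate(1.0 / mean)

# Short episodes early in training, longer ones later.
early = reset_time(0)
later = reset_time(10)
sampled = reset_time(10, rng=random.Random(0))
```

Short early episodes stop the simulation before a weak policy has built up a long backlog, so training time is spent on states the improving policy will actually encounter.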
Mohammad Alizadeh. International Conference on Learning Representations (ICLR), 2019.
[Plots: Tuned weighted fair vs. Decima; an arrow marks the "better" direction]