Elasecutor: Elastic Executor Scheduling in Data Analytics Systems
Libin Liu, Hong Xu, City University of Hong Kong
ACM Symposium on Cloud Computing 2018
Data Analytics Systems
Various workloads running in data analytics systems
Directed Acyclic Graph (DAG)
Stage 1
parallelize filter map
Stage 2
reduceByKey map
Stage 3
parallelize filter map
Stage 4
join
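The four stages above can be ordered for execution once their dependencies are known. The slide text lists only the operations in each stage, so the edges used here (Stage 1 feeds Stage 2, and Stages 2 and 3 both feed the join in Stage 4) are an assumption based on the usual shape of such a job, not something stated in the deck. A minimal sketch using Python's standard `graphlib`:

```python
from graphlib import TopologicalSorter

# Operations per stage, as listed on the slide.
STAGE_OPS = {
    "Stage 1": ["parallelize", "filter", "map"],
    "Stage 2": ["reduceByKey", "map"],
    "Stage 3": ["parallelize", "filter", "map"],
    "Stage 4": ["join"],
}

# ASSUMPTION: dependency edges are not given in the slides.
# Each stage maps to the set of stages that must finish before it starts.
STAGE_DEPS = {
    "Stage 1": set(),
    "Stage 2": {"Stage 1"},
    "Stage 3": set(),
    "Stage 4": {"Stage 2", "Stage 3"},
}


def stage_order(deps):
    """Return one valid execution order in which every stage follows its parents."""
    return list(TopologicalSorter(deps).static_order())


order = stage_order(STAGE_DEPS)
for stage in order:
    print(stage, "->", " ".join(STAGE_OPS[stage]))
```

Stages with no path between them (Stage 1 and Stage 3 under the assumed edges) may run concurrently; the topological order only guarantees that parents come first.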
Workload             Ratio        CPU   Memory  Network  Disk
Terasort             Peak/Avg.    1.8   1.7     6.2      1.5
                     Peak/Trough  60    3.3     237      6.1
K-means              Peak/Avg.    1.7   1.2     11.5     5.6
                     Peak/Trough  75    6       53       100
Pagerank             Peak/Avg.    3.9   1.3     20.2     9.1
                     Peak/Trough  50    11.5    119      50
Logistic Regression  Peak/Avg.    2.1   1.4     5.5      6.1
                     Peak/Trough  50    12      409.6    42.5
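The two ratios in the table can be computed from a time series of usage samples for one resource. The sketch below assumes that Peak/Avg. is the maximum sample divided by the mean and Peak/Trough is the maximum divided by the minimum; the deck does not spell out the definitions, and the sample values are hypothetical, not measurements from the talk.

```python
def variability(samples):
    """Return (peak/avg, peak/trough) for a list of usage samples.

    ASSUMPTION: ratios are max/mean and max/min of the samples.
    """
    if not samples:
        raise ValueError("samples must not be empty")
    peak = max(samples)
    trough = min(samples)
    if trough <= 0:
        raise ValueError("trough must be positive to form a ratio")
    avg = sum(samples) / len(samples)
    return peak / avg, peak / trough


# Hypothetical CPU usage samples (percent) over four intervals.
cpu_samples = [10, 20, 60, 30]
peak_avg, peak_trough = variability(cpu_samples)
print(peak_avg, peak_trough)  # 2.0 6.0
```

A large Peak/Trough with a modest Peak/Avg., as in several rows of the table, indicates short spikes over a low baseline, which is the pattern that makes static peak-based allocation wasteful.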
Improvement of DRR over TRC as an alternative metric for executor placement
Heartbeat received
Search executors in the queue
Calculate DRR for any executor placed on the machine
Choose the one producing minimum DRR to schedule
Update placement results
Repeat the process until termination
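The scheduling loop above can be sketched as follows. The slides do not define DRR, so this sketch assumes it stands for the dominant remaining resource, taken here as the largest remaining fraction across all resources and time slots after a placement. The profile layout (per-resource lists of normalized values, equal length for machine and executor), the function names, and the example numbers are all hypothetical.

```python
from typing import Dict, List

# resource name -> per-time-slot value, normalized to machine capacity
Profile = Dict[str, List[float]]


def fits(available: Profile, demand: Profile) -> bool:
    """True if the demand never exceeds what is available in any slot."""
    return all(d <= a for r in demand for a, d in zip(available[r], demand[r]))


def drr(available: Profile, demand: Profile) -> float:
    """ASSUMED metric: largest remaining fraction over resources and slots."""
    return max(a - d for r in demand for a, d in zip(available[r], demand[r]))


def on_heartbeat(available: Profile, queue: Dict[str, Profile]) -> List[str]:
    """Place queued executors on one machine, minimum DRR first."""
    placed = []
    while True:
        # Search the queue and score every executor that fits.
        candidates = [(drr(available, dem), name)
                      for name, dem in queue.items() if fits(available, dem)]
        if not candidates:
            break  # termination: nothing else fits
        _, best = min(candidates)
        demand = queue.pop(best)
        # Update placement results: subtract the demand from the machine.
        for r in demand:
            available[r] = [a - d for a, d in zip(available[r], demand[r])]
        placed.append(best)
    return placed


available = {"cpu": [1.0, 1.0], "mem": [1.0, 1.0]}
queue = {
    "e1": {"cpu": [0.5, 0.25], "mem": [0.25, 0.25]},
    "e2": {"cpu": [0.75, 0.75], "mem": [0.5, 0.5]},
}
placed = on_heartbeat(available, queue)
print(placed)  # ['e2']
```

In this example e2 leaves a smaller worst-case remainder (0.5) than e1 (0.75), so it is placed first; e1 then no longer fits and stays in the queue for another machine's heartbeat.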
(a) Available resources of machine (b) Resource demands of executor 1 (c) Resource demands of executor 2
Architecture diagram
Nodes: Master, Workers
Components: Surrogate, Executor, Tasks, Allocation Module, Prediction Module, Resource Usage Depository, Scheduling Module, Reprovisioning Module, Resource Manager
Resources: CPU, Mem, Net, Disk
Arrow labels: Report Profiles, Predicted Demands, Available Resources, Scheduling Decision, Trigger, Reprovision, Launch, Adjust
Makespan measures the total time used to complete all applications
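As a small worked example of this definition, the sketch below computes makespan from per-application start and finish times. It assumes the clock runs from the earliest submission to the last completion; the timestamps are hypothetical, not results from the talk.

```python
def makespan(apps):
    """Total time to complete all applications.

    apps: list of (start_time, finish_time) pairs in the same time unit.
    ASSUMPTION: measured from the earliest start to the latest finish.
    """
    if not apps:
        raise ValueError("apps must not be empty")
    earliest_start = min(start for start, _ in apps)
    latest_finish = max(finish for _, finish in apps)
    return latest_finish - earliest_start


# Hypothetical (start, finish) times in seconds for three applications.
runs = [(0, 40), (5, 90), (10, 70)]
total = makespan(runs)
print(total)  # 90
```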