Static Scheduling in Clouds
Thomas A. Henzinger, Anmol V. Singh, Vasu Singh, Thomas Wies, Damien Zufferey
IST Austria
June 14, 2011
Motivation (1)
Cloud computing gives the illusion of ∞ (virtual) resources, but there is only a finite amount of (physical) resources. We would like to share those resources efficiently:
1. distinguish high-priority requests (serving a customer now) from low-priority (batch) requests;
2. schedule accordingly.
Therefore, we should be able to plan computations ahead.
Damien Zufferey Static Scheduling in Clouds HotCloud’11 2 / 13
Motivation (2)
Dynamic scheduling uses work queues and priorities, but it is limited. Without knowledge of the jobs, this is the best you can do.
We need to ask the user for: the kind of resources the job requires; a deadline/priority for the job.
In exchange, we can give the user an expected completion time. We can also offer a choice (time is money).
Flextic Overview
[Figure: Flextic architecture. Components: User Interface, Job Parser, Job Scheduler, Cloud Representation, Job Execution Platform. Labeled flows: Program, Execution Plan, Schedules, User Choice, User chosen schedule, Task finish updates.]
Giving incentive to plan in advance
The scheduler returns not one but many possible schedules with different finish times. A pricing model associates a cost with each schedule. The cost includes the “scheduling difficulty”, and schedules with later finish times get a discount.
[Figure: price as a function of finish time. The price goes to ∞ as the finish time approaches the minimum makespan (the critical path) and converges to 0 as the finish time goes to ∞.]
Problem: static scheduling is hard. It is only possible if the scheduler can handle the workload.
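The slide gives only the two limits of the price curve, not its formula. As a minimal sketch, one curve with both limits is `scale / (finish_time - min_makespan)`; this formula, the function name `price`, and the `scale` parameter are assumptions made for illustration, not the pricing model used by Flextic.

```python
import math

def price(finish_time: float, min_makespan: float, scale: float = 1.0) -> float:
    """Hypothetical price curve with the two limits stated on the slide."""
    if finish_time <= min_makespan:
        # No schedule can finish before the critical path allows it.
        return math.inf
    # Shrinks towards 0 as the requested finish time moves further out.
    return scale / (finish_time - min_makespan)

# Illustrative values only: a job whose critical path takes 10 time units.
for t in (10.0, 10.5, 12.0, 20.0, 110.0):
    print(f"finish at {t:6.1f} -> price {price(t, 10.0):.3f}")
```

Any decreasing curve with the same two limits would serve equally well; the reciprocal is simply the shortest one to write down.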
Jobs Model
[Figure: example job DAG with tasks t1 to t9, annotated with durations and data-transfer sizes.]
A job is a directed acyclic graph (DAG) of tasks. Nodes are marked with their worst-case duration. Edges are marked with the amount of data transferred. Durations and data sizes can be parametric in the input.
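To make the job model concrete, the sketch below stores a job as two dictionaries and computes the length of its critical path, which is the minimum makespan referred to in the pricing slide. The task names, durations, and dependencies are invented for illustration, and data-transfer times on the edges are ignored for simplicity.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def min_makespan(durations, deps):
    """Length of the critical path: the longest duration-weighted path.

    durations: task -> worst-case duration
    deps:      task -> set of tasks that must finish first
    """
    finish = {}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(deps).static_order():
        ready = max((finish[p] for p in deps.get(task, ())), default=0.0)
        finish[task] = ready + durations[task]
    return max(finish.values())

# Hypothetical job: a and b run in parallel, then c, then d.
durations = {"a": 2.0, "b": 3.0, "c": 1.5, "d": 4.0}
deps = {"c": {"a", "b"}, "d": {"c"}}
print(min_makespan(durations, deps))  # 8.5
```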
Parametric Jobs
[Figure: Job Parser diagram with the labels User Job, Input Data Size, Database Schema, Connections, Mappers, Reducers, Job Parser, Execution Plan, Task Details, Object Sizes.]
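A parametric job can be pictured as a function from the input data size to a concrete execution plan. The sketch below is purely illustrative: the function `instantiate_job`, the linear cost model, and the per-megabyte rates are all assumptions, not the Job Parser's actual behavior.

```python
def instantiate_job(input_mb, mappers, reducers,
                    map_sec_per_mb=0.1, reduce_sec_per_mb=0.05):
    """Return (durations, transfers) for a two-stage map/reduce-style job.

    Assumed cost model: durations grow linearly with the data each task handles.
    """
    chunk_mb = input_mb / mappers    # data handled by each mapper
    share_mb = input_mb / reducers   # data handled by each reducer
    durations = {f"map{i}": map_sec_per_mb * chunk_mb for i in range(mappers)}
    durations.update(
        {f"red{j}": reduce_sec_per_mb * share_mb for j in range(reducers)}
    )
    # Assumed shuffle: every mapper sends an equal slice to every reducer.
    transfers = {
        (f"map{i}", f"red{j}"): chunk_mb / reducers
        for i in range(mappers)
        for j in range(reducers)
    }
    return durations, transfers

durations, transfers = instantiate_job(input_mb=100.0, mappers=4, reducers=2)
print(len(durations), "tasks,", len(transfers), "edges")
```

Changing `input_mb` rescales every duration and edge label without changing the shape of the DAG, which is the sense in which the plan is parametric.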
Infrastructure Model
[Figure: datacenter drawn as a tree of routers with compute nodes at the leaves.]
Datacenter as a tree-like graph: internal nodes are routers; leaves are compute nodes (marked with their computation speed); edges specify the bandwidth.
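A minimal sketch of this model, assuming a strict tree: each node stores its parent and the bandwidth of its uplink, and the bandwidth available between two compute nodes is the smallest edge bandwidth on the path that joins them. The class `Node`, its field names, and the numbers are hypothetical.

```python
class Node:
    def __init__(self, name, parent=None, uplink_bw=None, speed=None):
        self.name = name
        self.parent = parent        # None for the root router
        self.uplink_bw = uplink_bw  # bandwidth of the edge to the parent
        self.speed = speed          # computation speed; None for routers

def path_to_root(node):
    path = []
    while node is not None:
        path.append(node)
        node = node.parent
    return path

def bottleneck_bandwidth(a, b):
    """Smallest edge bandwidth on the unique tree path between a and b."""
    up_a, up_b = path_to_root(a), path_to_root(b)
    # Shared ancestors: the lowest common ancestor and everything above it.
    common = {id(n) for n in up_a} & {id(n) for n in up_b}
    # The path uses the uplink of every node strictly below that ancestor.
    edges = [n.uplink_bw for n in up_a + up_b if id(n) not in common]
    return min(edges) if edges else float("inf")

root = Node("root")
r1 = Node("r1", root, uplink_bw=2)
r2 = Node("r2", root, uplink_bw=10)
n1 = Node("n1", r1, uplink_bw=4, speed=1.0)
n2 = Node("n2", r1, uplink_bw=4, speed=1.0)
n3 = Node("n3", r2, uplink_bw=4, speed=2.0)
print(bottleneck_bandwidth(n1, n2), bottleneck_bandwidth(n1, n3))  # 4 2
```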
Scheduling Large Jobs using Abstraction [EuroSys 2011]
Assumption: job and infrastructure regularity.
Idea: regularity makes large-scale scheduling feasible.
How: using abstraction techniques.
[Figure: scheduling pipeline. Abstraction turns the Execution Plan into an Abstract EP and the Cloud Representation into an Abstract Cloud; the Scheduler produces an abstract schedule; a Concretization step then relates the abstract schedule back to the Cloud Representation.]
Abstraction for jobs:
Group independent tasks as per a topological sort. Merge them into an abstract task.
[Figure: concrete job DAG with tasks t1 to t9. Task durations: t1 = 1.5, t2 = 1.8, t3 = 3, t4 = 2.4, t5 = 7, t6 = 8, t7 = 12, t8 = 10. Edges carry data-transfer sizes.]
[Figure: the same job after abstraction. Two abstract tasks, each labeled #4, have durations 3 and 12; t9 remains; the abstract edges are labeled 20, 4 and 1.]
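The grouping step can be sketched as follows: compute each task's depth in the DAG, then merge all tasks at the same depth into one abstract task. Taking the maximum duration of the group is an assumption, chosen because it matches the numbers in the figure (3 is the largest of 1.5, 1.8, 3, 2.4, and 12 is the largest of 7, 8, 12, 10). The dependency structure and the duration of t9 below are also assumed, since the figure's edges are not recoverable from the slide text.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+

def abstract_job(durations, deps):
    """Merge tasks at the same topological depth into one abstract task.

    Returns a list of (member_tasks, worst_case_duration), one per depth.
    Tasks at equal depth cannot depend on each other, so they are independent.
    """
    depth = {}
    for task in TopologicalSorter(deps).static_order():
        depth[task] = 1 + max((depth[p] for p in deps.get(task, ())), default=-1)
    levels = {}
    for task, d in depth.items():
        levels.setdefault(d, []).append(task)
    return [
        (sorted(levels[d]), max(durations[t] for t in levels[d]))
        for d in sorted(levels)
    ]

# Durations of t1..t8 as read from the figure; t9 and the edges are assumed.
durations = {"t1": 1.5, "t2": 1.8, "t3": 3.0, "t4": 2.4,
             "t5": 7.0, "t6": 8.0, "t7": 12.0, "t8": 10.0, "t9": 1.0}
deps = {"t5": {"t1"}, "t6": {"t2"}, "t7": {"t3"}, "t8": {"t4"},
        "t9": {"t5", "t6", "t7", "t8"}}
for members, duration in abstract_job(durations, deps):
    print(f"#{len(members)} {members} duration {duration}")
```

With these inputs the two merged groups have sizes 4 and 4 and durations 3.0 and 12.0, in line with the abstract tasks shown on the slide.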
Abstraction for infrastructure:
Merge nodes according to the network topology.
[Figure: the same datacenter tree at three levels of detail: concrete system (some compute nodes marked busy), medium abstraction (merged nodes labeled 1/3 and 2/3), and coarsest abstraction (a merged node labeled 3/6).]
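One way to read the figure is that each merged node is summarized by a pair of counts. The sketch below assumes the labels mean busy nodes over total nodes; the slide does not say so explicitly. The nested-list encoding of the tree and both function names are hypothetical.

```python
def summarize(tree):
    """Return (busy, total) compute-node counts for a subtree.

    A leaf is a bool (True = busy); a router is a list of subtrees.
    """
    if isinstance(tree, bool):
        return (1 if tree else 0, 1)
    busy = total = 0
    for child in tree:
        b, t = summarize(child)
        busy += b
        total += t
    return busy, total

def abstract_at_depth(tree, depth):
    """Collapse every subtree below `depth` into a single (busy, total) pair."""
    if depth == 0 or isinstance(tree, bool):
        return summarize(tree)
    return [abstract_at_depth(child, depth - 1) for child in tree]

# Two routers with three compute nodes each; three nodes are busy in total.
concrete = [[True, False, False], [True, True, False]]
print(abstract_at_depth(concrete, 1))  # [(1, 3), (2, 3)]
print(abstract_at_depth(concrete, 0))  # (3, 6)
```

The depth argument plays the role of the abstraction level: a smaller depth gives a coarser view with fewer nodes for the scheduler to consider.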
Experiments: compared to Hadoop
Caution: static scheduling alone will not work, because task durations are conservative estimates and the performance of the compute nodes varies. We use static scheduling with backfilling.
[Plot 1: job duration (in seconds) against the number of m1.xlarge instances, for Hadoop, FISCH and BLIND.]
[Plot 2: normalized job duration against the number of virtual cores, for Hadoop and FISCH.]
The jobs are MapReduce jobs doing image transformation. Hadoop streaming version 0.19.0 was used.
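The slides do not detail the backfilling mechanism. As a simplified reading, the sketch below keeps the task order and node assignment of the static schedule but lets a task start as soon as its node and its predecessors are actually free, instead of waiting for the planned start time. The function `run`, the schedule format, and all numbers are assumptions for illustration.

```python
def run(schedule, deps, actual, backfill):
    """Simulate a static schedule and return the finish time of every task.

    schedule: list of (task, node, planned_start), sorted by planned_start
    deps:     task -> set of tasks that must finish first
    actual:   task -> measured duration (at most the worst-case estimate)
    """
    finish, node_free = {}, {}
    for task, node, planned_start in schedule:
        # Earliest moment the node is idle and all predecessors are done.
        ready = max([node_free.get(node, 0.0)]
                    + [finish[p] for p in deps.get(task, ())])
        # Without backfilling the task also waits for its planned slot.
        start = ready if backfill else max(ready, planned_start)
        finish[task] = start + actual[task]
        node_free[node] = finish[task]
    return finish

# Plan built from worst-case durations a = 4, b = 4, c = 3.
schedule = [("a", "n1", 0.0), ("b", "n2", 0.0), ("c", "n1", 4.0)]
deps = {"c": {"a", "b"}}
actual = {"a": 2.0, "b": 3.0, "c": 3.0}  # a and b finish early
print(run(schedule, deps, actual, backfill=False)["c"])  # 7.0
print(run(schedule, deps, actual, backfill=True)["c"])   # 6.0
```

In this example the idle time left by the conservative estimates for a and b is reclaimed, so the job finishes one time unit earlier than the static plan alone would allow.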
Questions?