Static Scheduling in Clouds



SLIDE 1

Static Scheduling in Clouds

Thomas A. Henzinger Anmol V. Singh Vasu Singh Thomas Wies Damien Zufferey

IST Austria

June 14, 2011

SLIDE 2

Motivation (1)

Cloud computing gives the illusion of ∞ (virtual) resources. In reality, there is only a finite amount of (physical) resources. We would like to share those resources efficiently:

1. distinguish high-priority requests (serving a customer now) from low-priority (batch) requests;

2. schedule accordingly.

Therefore, we should be able to plan computations ahead of time.

Damien Zufferey Static Scheduling in Clouds HotCloud’11 2 / 13

SLIDE 3

Motivation (2)

Dynamic scheduling uses work queues and priorities, but it is limited. Without knowledge of the jobs, this is the best you can do.

We need to ask the user for:

  • what kind of resources the job requires;

  • a deadline/priority for the job.

In exchange, we can give the user an expected completion time. We can also offer a choice (time is money).

SLIDE 4

Flextic Overview

[Figure: Flextic architecture. Components: User Interface, Job Parser, Job Scheduler, Cloud Representation, Execution Plan, Job Execution Platform. Labelled flows: Program, Schedules, User Choice, User chosen schedule, Task finish updates.]

SLIDE 5

Giving incentive to plan in advance

The scheduler returns not one but many possible schedules with different finish times. A pricing model associates a cost with each schedule. The cost includes the “scheduling difficulty”: schedules with a later finish time get a discount.

[Figure: price as a function of finish time, with the minimum makespan (critical path) marked on the time axis.]

Price goes to ∞ as the finish time approaches the minimum makespan. Price converges to 0 as the finish time goes to ∞.

Problem: static scheduling is hard. It is only possible if the scheduler can handle the workload.
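The two limits stated for the price curve fix its shape but not its formula. As a minimal sketch, assuming a hypothetical inverse-distance law (the names `price`, `min_makespan` and `scale` are illustrative and do not come from the slides), one curve with the stated behaviour is:

```python
import math


def price(finish_time: float, min_makespan: float, scale: float = 1.0) -> float:
    """Hypothetical price of a schedule that finishes at `finish_time`.

    Assumption: price = scale / (finish_time - min_makespan).
    This is only one of many curves that diverge at the minimum
    makespan and decay to 0 for very late finish times.
    """
    if finish_time <= min_makespan:
        # No schedule can finish at or before the critical-path bound.
        return math.inf
    return scale / (finish_time - min_makespan)


# A later finish time is cheaper than a tight one.
tight = price(12.0, min_makespan=10.0, scale=4.0)   # 4 / 2 = 2.0
relaxed = price(20.0, min_makespan=10.0, scale=4.0)  # 4 / 10 = 0.4
```

Any monotonically decreasing function with the same two limits would serve equally well; the choice here is purely for illustration.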

SLIDE 6

Jobs Model

[Figure: example job DAG with tasks t1 to t9, annotated with durations and data transfer sizes.]

A job is a directed acyclic graph (DAG) of tasks. Nodes are marked with their worst-case duration. Edges are marked with the amount of data transferred. Durations and data sizes can be parametric in the input.
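To make the job model concrete, the sketch below computes the critical-path length of a small DAG, which is the minimum makespan when enough compute nodes are available. The task names, durations and edges are hypothetical (they are not the ones in the figure), and data transfer times on the edges are ignored for brevity.

```python
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def min_makespan(durations, preds):
    """Length of the critical path of a task DAG.

    durations: task -> worst-case duration
    preds:     task -> list of tasks that must finish first
    Assumption: unlimited compute nodes and free data transfer.
    """
    graph = {task: preds.get(task, ()) for task in durations}
    finish = {}
    # static_order() yields every task after all of its predecessors.
    for task in TopologicalSorter(graph).static_order():
        start = max((finish[p] for p in preds.get(task, ())), default=0.0)
        finish[task] = start + durations[task]
    return max(finish.values())


# Hypothetical diamond-shaped job: t1 -> {t2, t3} -> t4.
durations = {"t1": 1.5, "t2": 2.0, "t3": 3.0, "t4": 1.0}
preds = {"t2": ["t1"], "t3": ["t1"], "t4": ["t2", "t3"]}
critical = min_makespan(durations, preds)  # t1 -> t3 -> t4 = 5.5
```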

SLIDE 7

Parametric Jobs

[Figure: Job Parser diagram. Labels: User Job; Input Data Size; Database Schema, Connections, Mappers, Reducers; Job Parser; Execution Plan, Task Details, Object Sizes.]

SLIDE 8

Infrastructure Model

[Figure: datacenter tree with routers as internal nodes.]

The datacenter is modeled as a tree-like graph: internal nodes are routers; leaves are compute nodes (with a computation speed); edges specify the bandwidth.
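One way to use such a tree is to estimate how long a data transfer between two compute nodes takes. The sketch below assumes that the slowest edge on the path between the two leaves limits the transfer; that rule, the node names and the bandwidth values are all illustrative assumptions rather than details from the slides.

```python
import math

# Hypothetical tree: child -> parent. n* are compute nodes, r* are routers.
parent = {"n1": "r1", "n2": "r1", "n3": "r2", "r1": "root", "r2": "root"}
# Bandwidth of each (child, parent) edge, in arbitrary units.
bandwidth = {
    ("n1", "r1"): 10.0, ("n2", "r1"): 10.0, ("n3", "r2"): 10.0,
    ("r1", "root"): 2.0, ("r2", "root"): 4.0,
}


def uplinks(node, parent):
    """Edges (child, parent) on the way from `node` up to the root."""
    edges = []
    while node in parent:
        edges.append((node, parent[node]))
        node = parent[node]
    return edges


def bottleneck(a, b, parent, bandwidth):
    """Smallest bandwidth on the tree path between nodes a and b."""
    if a == b:
        return math.inf  # no network hop needed
    # Edges above the lowest common ancestor are in both uplink lists,
    # so the symmetric difference is exactly the path between a and b.
    path = set(uplinks(a, parent)) ^ set(uplinks(b, parent))
    return min(bandwidth[edge] for edge in path)


def transfer_time(data, a, b):
    return data / bottleneck(a, b, parent, bandwidth)


same_rack = bottleneck("n1", "n2", parent, bandwidth)   # 10.0
cross_rack = bottleneck("n1", "n3", parent, bandwidth)  # 2.0
```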

SLIDE 9

Scheduling Large Jobs using Abstraction [EuroSys 2011]

Assumption: job and infrastructure regularity.

Idea: regularity makes large-scale scheduling feasible.

How: using abstraction techniques.

[Figure: scheduling pipeline. Labels: Execution Plan, Cloud Representation; Abstraction (applied to each); Abstract EP, Abstract Cloud; Scheduler; Abs. Schedule; Concretization; Cloud Representation.]

SLIDE 10

Abstraction for jobs:

Group independent tasks as per a topological sort. Merge them into an abstract task.

[Figure: concrete job DAG with tasks t1 to t9, annotated with task durations and data transfer sizes.]

SLIDE 12

Abstraction for jobs (after merging):

[Figure: the job DAG after abstraction; groups of independent tasks are merged into abstract tasks labelled #4.]
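The job abstraction can be sketched as follows. Tasks are grouped by their level in a topological order (tasks on the same level cannot depend on each other), and each group becomes one abstract task. The example job and the choice to label an abstract task with the maximum duration of its members are assumptions made for illustration.

```python
from collections import defaultdict
from graphlib import TopologicalSorter  # standard library, Python 3.9+


def abstract_job(durations, preds):
    """Merge tasks of the same topological level into abstract tasks.

    durations: task -> worst-case duration
    preds:     task -> list of predecessor tasks
    Returns one dict per level: member tasks, their count, and a
    conservative duration (assumption: the maximum over the group).
    """
    graph = {task: preds.get(task, ()) for task in durations}
    level = {}
    for task in TopologicalSorter(graph).static_order():
        # Level = length of the longest chain of predecessors.
        level[task] = 1 + max((level[p] for p in preds.get(task, ())), default=-1)
    groups = defaultdict(list)
    for task, lvl in level.items():
        groups[lvl].append(task)
    return [
        {
            "tasks": sorted(groups[lvl]),
            "count": len(groups[lvl]),
            "duration": max(durations[t] for t in groups[lvl]),
        }
        for lvl in sorted(groups)
    ]


# Hypothetical job: one source, four independent workers, one sink.
durations = {"src": 1.0, "w1": 1.5, "w2": 1.8, "w3": 3.0, "w4": 2.4, "sink": 4.0}
preds = {w: ["src"] for w in ("w1", "w2", "w3", "w4")}
preds["sink"] = ["w1", "w2", "w3", "w4"]
abstract = abstract_job(durations, preds)
# The middle abstract task stands for 4 concrete tasks with duration 3.0.
```

The abstract DAG here has three nodes instead of six, which is the kind of size reduction that makes scheduling regular jobs cheaper.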

SLIDE 13

Abstraction for infrastructure:

Merge nodes according to the network topology:

[Figure: three panels, Concrete System, Medium abstraction and Coarsest abstraction. Compute nodes, some marked busy, are merged step by step into abstract nodes labelled with fractions (1/3, 2/3, 3/6).]
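The infrastructure abstraction can be sketched as a bottom-up merge over the datacenter tree. The sketch assumes that a fraction such as 1/3 means "1 free compute node out of 3"; the slides do not spell out the meaning of the labels, so this reading, the tree and the set of busy nodes are hypothetical.

```python
def summarize(tree, busy, node):
    """Return (free, total) compute nodes in the subtree rooted at `node`.

    tree: router -> list of children (leaves do not appear as keys)
    busy: set of compute nodes that are currently occupied
    """
    children = tree.get(node, [])
    if not children:  # a leaf is a single compute node
        return (0 if node in busy else 1, 1)
    free = total = 0
    for child in children:
        child_free, child_total = summarize(tree, busy, child)
        free += child_free
        total += child_total
    return free, total


# Hypothetical datacenter: two routers with three compute nodes each.
tree = {
    "root": ["r1", "r2"],
    "r1": ["a1", "a2", "a3"],
    "r2": ["b1", "b2", "b3"],
}
busy = {"a1", "a2", "b1"}

medium = {r: summarize(tree, busy, r) for r in tree["root"]}
# medium == {"r1": (1, 3), "r2": (2, 3)}   one abstract node per router
coarsest = summarize(tree, busy, "root")
# coarsest == (3, 6)                       the whole cloud as one node
```

Stopping the merge at the routers gives a medium abstraction, while merging up to the root gives the coarsest one.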

SLIDE 14

Experiments: compared to Hadoop

Caution: static scheduling alone will not work:

  • task durations are conservative estimates;

  • the performance of the compute nodes varies.

We use static scheduling with backfilling.

[Figure: two plots. First plot: job duration (in seconds) against the number of m1.xlarge instances, for Hadoop, FISCH and BLIND. Second plot: normalized job duration against the number of virtual cores, for Hadoop and FISCH.]

The jobs are MapReduce jobs doing image transformation. Hadoop streaming, version 0.19.0.
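The slides do not describe the backfilling mechanism itself. One simple reading, sketched below under that assumption, keeps the node assignment and task order of the static plan but starts each task as soon as its node is free and its predecessors are done, instead of waiting for the planned start time. All names and numbers are hypothetical, and data transfer times are ignored.

```python
def run_plan(plan, preds, durations):
    """Replay a static plan, starting every task as early as possible.

    plan:      node -> tasks in the order fixed by the static schedule
    preds:     task -> list of predecessor tasks
    durations: task -> duration used for this replay
    Returns task -> (start, finish).
    """
    times = {}
    free_at = {node: 0.0 for node in plan}
    position = {node: 0 for node in plan}
    remaining = sum(len(tasks) for tasks in plan.values())
    while remaining:
        progressed = False
        for node, tasks in plan.items():
            while position[node] < len(tasks):
                task = tasks[position[node]]
                task_preds = preds.get(task, [])
                if any(p not in times for p in task_preds):
                    break  # wait for a predecessor on another node
                ready = max((times[p][1] for p in task_preds), default=0.0)
                start = max(free_at[node], ready)
                times[task] = (start, start + durations[task])
                free_at[node] = times[task][1]
                position[node] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            raise ValueError("plan is inconsistent with the dependencies")
    return times


plan = {"node1": ["a", "c"], "node2": ["b"]}
preds = {"c": ["a", "b"]}
worst_case = {"a": 4.0, "b": 6.0, "c": 3.0}  # conservative estimates
actual = {"a": 2.0, "b": 3.0, "c": 3.0}      # what really happened

planned = run_plan(plan, preds, worst_case)  # c planned for 6.0 to 9.0
observed = run_plan(plan, preds, actual)     # c runs from 3.0 to 6.0
```

Replaying the same plan with the worst-case and the actual durations shows how much slack the conservative estimates leave for early starts.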

SLIDE 15

Questions?
