A Genetic Algorithm with Communication Costs to Schedule Workflows on a SOA-Grid

SLIDE 1

A Genetic Algorithm with Communication Costs to Schedule Workflows on a SOA-Grid

Laurent PHILIPPE

Co-authors: Lamiel Toch and Jean-Marc Nicod

Laboratoire d’Informatique de Franche-Comté, Université de Franche-Comté, Besançon

HETEROPAR - August 2011

Laurent PHILIPPE Genetic Algorithm to Schedule Workflows 1 / 28

SLIDE 2

Context GA Scheduling Simulation General Dags Identical Intrees

Workflow applications

Combine several applications

  • or application modules

Precedence constraints (files)

Application domains: Astronomy, Bioinformatics, Chemistry, Climate Modeling, Computer Science, Image Processing, etc.

Batch processing

Collection of workflows

SLIDE 3

SOA Grids

  • Provide access to applications
  • Execution on clusters
  • Simple access for scientists
  • Tools: DIET or NINF-G

SLIDE 4

Contents

1. Context
2. GA Scheduling
3. Simulation
4. General Dags
5. Identical Intrees

SLIDE 5

Framework model

Applicative framework

  • Collection $B = \{J^j, 1 \le j \le N\}$ of $N$ workflows to schedule
  • Workflow $J^j$ is represented by a DAG $J^j = (T^j, D^j)$
  • $T^j = \{T^j_1, \dots, T^j_{n_j}\}$: the tasks
  • $D^j$: the precedence constraints
  • $F^j_{k,i}$ is the file sent between $T^j_k$ and $T^j_i$ when $(T^j_k, T^j_i) \in D^j$
  • $T = \bigcup_{j=1}^{N} T^j = \{T^j_{i_j}, 1 \le i_j \le n_j \text{ and } 1 \le j \le N\}$: the set of tasks to schedule
  • Typed tasks: $t(i, j)$ is the type of task $T^j_i$

SLIDE 6

Framework model - 2

Target platform

  • Platform $PF$: $n$ machines modeled by an undirected graph $PF = (P, L)$
  • The vertices in $P = \{p_1, \dots, p_n\}$ represent the machines
  • The edges of $L$ are the communication links
  • Each link $(p_i, p_j)$ has a bandwidth $bw(p_i, p_j)$
  • $\tau$: set of task types available
  • Each machine $p_i$ is able to perform a subset of $\tau$
  • If $t \in \tau$ is available on the machine $p_i$, $w(t, p_i)$ is the time to perform a task of type $t$ on $p_i$
  • $a(i, j)$ is the machine to which $T^j_i$ is assigned

SLIDE 7

Framework model - 3

Communication model

  • One-port model
  • One data item transmitted per communication link
  • One reception and one transmission per node
  • $R(p_k, p_i) = \{(p_j, p_{j'}) \in L\}$ is a route from $p_k$ to $p_i$

Problem definition

  • Static scheduling
  • Makespan optimization for the collection of workflows

SLIDE 8

Related works

Workflow scheduling

  • Makespan optimization: NP-hard problem
  • List-based heuristics: HEFT, Critical Path, etc.
  • Difficult in heterogeneous contexts

Advanced algorithms

  • GA for scheduling: GAs give good results on complex systems, but a GA is still a heuristic. Distance to optimal?
  • Steady state: flow optimization, identical intrees, optimal results

SLIDE 9

Steady-state Scheduling

[Figure: steady-state schedule of tasks A, B, C and D, repeated over Period 1, Period 2, ..., Period N]

SLIDE 11

GA without communication costs

Classical GA for workflow:

  • gene = task
  • chromosome: one row per processor
  • phenotype = schedule
  • fitness = 1/makespan
  • population, generation, crossover, mutation ...

[Figure: example chromosomes, with tasks T0 to T4 placed on rows for processors P0, P1 and P2]

Does not take communication into account
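To make the encoding concrete, here is a minimal Python sketch of a chromosome stored as one ordered row of tasks per processor and scored with fitness = 1/makespan, ignoring communications. The task durations, precedence edges and the helper name `makespan` are illustrative assumptions; none of them come from the talk.

```python
from typing import Dict, List, Tuple

# Hypothetical example data (not from the talk): five tasks, three processors.
DURATION: Dict[str, float] = {"T0": 2.0, "T1": 3.0, "T2": 1.0, "T3": 2.0, "T4": 4.0}
PRECEDENCE: List[Tuple[str, str]] = [("T0", "T1"), ("T0", "T2"), ("T1", "T3"), ("T2", "T4")]

# Chromosome: one row per processor; each gene is a task, in execution order.
Chromosome = Dict[str, List[str]]


def makespan(ch: Chromosome) -> float:
    """Decode a chromosome into a schedule (the phenotype), ignoring communications."""
    preds = {t: [a for a, b in PRECEDENCE if b == t] for t in DURATION}
    finish: Dict[str, float] = {}
    next_gene = {p: 0 for p in ch}   # index of the next gene to run on each row
    idle_at = {p: 0.0 for p in ch}   # time at which each processor becomes idle
    remaining = sum(len(row) for row in ch.values())
    while remaining:
        progressed = False
        for p, row in ch.items():
            if next_gene[p] == len(row):
                continue  # this row is fully scheduled
            task = row[next_gene[p]]
            if all(q in finish for q in preds[task]):
                start = max([idle_at[p]] + [finish[q] for q in preds[task]])
                finish[task] = start + DURATION[task]
                idle_at[p] = finish[task]
                next_gene[p] += 1
                remaining -= 1
                progressed = True
        if not progressed:
            raise ValueError("row orders contradict the precedence constraints")
    return max(finish.values())


def fitness(ch: Chromosome) -> float:
    return 1.0 / makespan(ch)


example: Chromosome = {"P0": ["T0", "T3", "T4"], "P1": ["T2"], "P2": ["T1"]}
print(makespan(example), fitness(example))
```

Because the decoder never looks at which processor produced a predecessor's output, two chromosomes that differ only in how far data must travel get the same score, which is the limitation this slide points out.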

SLIDE 12

With Communication Costs

Communications in the chromosome

  • Communication task
  • One row per communication link
  • Dependencies to the source and target node -> inconsistent communications
  • Poor efficiency

Evaluation function

  • Communications depend upon task placement
  • Fitness evaluation with communication costs
  • Used solution

SLIDE 13

Algorithm : fitness of a chromosome

Data: $TToSched$: remaining tasks, $C(T^j_i)$: completion time of $T^j_i$, $\sigma(T^j_i)$: start time of $T^j_i$ on $p_{a(i,j)}$, $\delta(p_u)$: next time $p_u$ is idle, $w(t, p_i)$: the time to perform a task of type $t$ on $p_i$, $CT(F^j_{k,i})$: the communication time to send $F^j_{k,i}$ along route $R(p_{a(k,j)}, p_{a(i,j)})$

$TToSched \leftarrow T$
while $TToSched \neq \emptyset$ do
    choose a free task $T^j_i \in TToSched$ (EFT heuristic)
    $T_{pred} \leftarrow \{T^j_k \mid (T^j_k, T^j_i) \in D^j\}$ and $\sigma(T^j_i) \leftarrow 0$
    foreach task $T^j_k \in T_{pred}$ do
        $\sigma(T^j_i) \leftarrow \max(\sigma(T^j_i), C(T^j_k) + CT(F^j_{k,i}))$
    $\sigma(T^j_i) \leftarrow \max(\delta(p_{a(i,j)}), \sigma(T^j_i))$
    $C(T^j_i) \leftarrow \sigma(T^j_i) + w(t(i, j), p_{a(i,j)})$
    $\delta(p_{a(i,j)}) \leftarrow C(T^j_i)$ and $TToSched \leftarrow TToSched \setminus \{T^j_i\}$
return $fitness(ch) = 1/C_{max} = 1 / \max_{T^j_i \in T} C(T^j_i)$
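The evaluation loop above can be sketched in Python as follows. This is only an illustration under stated assumptions: the function name `chromosome_fitness`, the callback signatures, and the three-task example are hypothetical, and the communication time $CT$ is passed in as a function because the slide does not say how it is computed from the route.

```python
from typing import Callable, Dict, List


def chromosome_fitness(
    tasks: List[str],
    preds: Dict[str, List[str]],             # T_pred of every task
    assign: Dict[str, str],                  # a(i, j): machine chosen by the chromosome
    exec_time: Callable[[str, str], float],  # w(t(i, j), p)
    comm_time: Callable[[str, str], float],  # CT(F_{k,i}) for the edge (k, i)
) -> float:
    completion: Dict[str, float] = {}        # C(T)
    idle_at: Dict[str, float] = {}           # delta(p), 0 when never used
    to_sched = list(tasks)                   # TToSched

    def start_time(t: str) -> float:
        sigma = 0.0
        for k in preds.get(t, []):
            sigma = max(sigma, completion[k] + comm_time(k, t))
        return max(idle_at.get(assign[t], 0.0), sigma)

    while to_sched:
        # A task is free once all of its predecessors have completed.
        free = [t for t in to_sched if all(k in completion for k in preds.get(t, []))]
        if not free:
            raise ValueError("precedence constraints contain a cycle")
        # EFT heuristic: pick the free task with the earliest finish time.
        chosen = min(free, key=lambda t: start_time(t) + exec_time(t, assign[t]))
        completion[chosen] = start_time(chosen) + exec_time(chosen, assign[chosen])
        idle_at[assign[chosen]] = completion[chosen]
        to_sched.remove(chosen)
    return 1.0 / max(completion.values())


# Hypothetical example: C needs the outputs of A and B; B runs on another machine.
W = {"A": 2.0, "B": 3.0, "C": 1.0}
ASSIGN = {"A": "p1", "B": "p2", "C": "p1"}
PREDS = {"C": ["A", "B"]}
score = chromosome_fitness(
    ["A", "B", "C"],
    PREDS,
    ASSIGN,
    exec_time=lambda t, p: W[t],
    comm_time=lambda k, t: 0.0 if ASSIGN[k] == ASSIGN[t] else 1.0,
)
print(score)  # makespan is 5.0 here, so the fitness is 0.2
```

In the example, C cannot start before time 4 because the file from B needs one time unit to cross from p2 to p1, which is the kind of delay the communication-free evaluation would miss.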

SLIDE 15

Experimental settings

Simulations

  • SimGrid-MSG
  • GA = 200 individuals

Platforms

  • Random platform generation: uniform distribution
  • Platform size: 4 to 10 nodes
  • Homogeneous
  • Heterogeneous
  • CCR: communication to computation ratio

SLIDE 16

Experimental settings - 2

Applications

  • Batch sizes from 1 to 10,000
  • Applications: 4 to 12 tasks
  • 1900 simulations of platform/application
  • Heterogeneity: execution from 1 to 10, communications from 1 to 4

SLIDE 18

Communication Model

  • No cost
  • Static route
  • 1-route Bellman-Ford
  • 3-route Bellman-Ford
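As a rough illustration of the Bellman-Ford variants, the sketch below computes a single cheapest route between two machines of the platform graph. The link cost of 1/bandwidth, the function name `bellman_ford_route`, and the example platform are assumptions made for the sketch; the slides do not state the cost used, nor how the three routes of the 3-route variant are selected, so only the single-route case is shown.

```python
from typing import Dict, List, Optional, Tuple

Link = Tuple[str, str]


def bellman_ford_route(
    machines: List[str], bandwidth: Dict[Link, float], src: str, dst: str
) -> Optional[List[str]]:
    """Return a cheapest route from src to dst, or None if dst is unreachable."""
    # Undirected platform graph: every link can be used in both directions.
    # Assumed link cost: 1 / bandwidth (time to push one unit of data).
    arcs = []
    for (u, v), bw in bandwidth.items():
        arcs.append((u, v, 1.0 / bw))
        arcs.append((v, u, 1.0 / bw))

    dist = {m: float("inf") for m in machines}
    prev: Dict[str, Optional[str]] = {m: None for m in machines}
    dist[src] = 0.0
    for _ in range(len(machines) - 1):  # at most |P| - 1 relaxation rounds
        changed = False
        for u, v, cost in arcs:
            if dist[u] + cost < dist[v]:
                dist[v] = dist[u] + cost
                prev[v] = u
                changed = True
        if not changed:
            break  # distances are stable, no need for further rounds

    if dist[dst] == float("inf"):
        return None
    route = [dst]
    while route[-1] != src:
        route.append(prev[route[-1]])
    return route[::-1]


# Hypothetical platform: the direct p1-p4 link is slow, the detour via p3 is fast.
MACHINES = ["p1", "p2", "p3", "p4", "p5"]
BANDWIDTH = {
    ("p1", "p2"): 1.0,
    ("p2", "p4"): 1.0,
    ("p1", "p3"): 4.0,
    ("p3", "p4"): 4.0,
    ("p1", "p4"): 0.5,
}
print(bellman_ford_route(MACHINES, BANDWIDTH, "p1", "p4"))
```

With these made-up bandwidths the route through p3 costs 0.5 while the direct link costs 2.0, so a static choice of the direct link would overestimate the communication time.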

SLIDE 19

Communication Model - Results

[Figure: two plots showing the percentage of experiments with a RMO above 0.9 and above 0.8, against the number of jobs (10 to 10000), for GA no comm, GA static route, GA 1-route Bellman-Ford, GA 3-routes Bellman-Ford and LIST]

FIGURE: Comparing different algorithms to choose the route

SLIDE 20

GA Improvement (3-Bellman-Ford)

[Figure: percentage of GA experiments with a makespan improvement of X % relative to LIST (X = 0%, 10%, 20%, 30%), against the number of jobs (10 to 10000)]

  • a. Improvement for different DAGs
  • b. Improvement for identical DAGs

SLIDE 22

Relative Measure to Optimal

Distance to optimal?

  • Algorithm improves the quality of the results
  • Case of collection of intrees: the steady-state algorithm gives the optimal flow
  • Lower bound

Relative Measure to Optimal (RMO)

  • Optimal throughput $\rho$
  • Lower bound $L_0 = \frac{N}{\rho}$, with $N$ the number of intrees
  • $RMO = \frac{L_0}{makespan_r}$, with $makespan_r$ the result of the algorithm
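A small worked example of the two formulas may help. The Python sketch below is only illustrative: the function names and the numbers (100 intrees, a throughput of 0.5 intrees per time unit, a measured makespan of 250) are made up, not results from the talk.

```python
def lower_bound(n_intrees: int, throughput: float) -> float:
    """L0 = N / rho: the batch cannot finish faster than the optimal flow allows."""
    if throughput <= 0:
        raise ValueError("throughput must be positive")
    return n_intrees / throughput


def rmo(n_intrees: int, throughput: float, makespan: float) -> float:
    """RMO = L0 / makespan; values close to 1 mean the schedule is close to the bound."""
    return lower_bound(n_intrees, throughput) / makespan


# Hypothetical numbers: L0 = 100 / 0.5 = 200 time units, measured makespan = 250.
print(lower_bound(100, 0.5), rmo(100, 0.5, 250.0))
```

Here the RMO is 200 / 250 = 0.8, so this run would count among the experiments "with a RMO above 0.8" only if the comparison is inclusive, a detail the slides leave open.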

SLIDE 23

Fully homogeneous platforms, CCR ≈ 0.01

[Figure: three plots showing the percent of experiments with a RMO above 'threshold', as a function of the number of jobs (1 to 10000) and of the threshold (0.1 to 1)]

  • a. GA algorithm
  • b. Steady-State algorithm
  • c. LIST algorithm

SLIDE 24

Fully homogeneous platforms, CCR ≈ 1

[Figure: three plots showing the percent of experiments with a RMO above 'threshold', as a function of the number of jobs (1 to 10000) and of the threshold (0.1 to 1)]

  • a. GA algorithm
  • b. Steady-State algorithm
  • c. LIST algorithm

SLIDE 25

Fully heterogeneous platforms, CCR ≈ 0.01

[Figure: three plots showing the percent of experiments with a RMO above 'threshold', as a function of the number of jobs (1 to 10000) and of the threshold (0.1 to 1)]

  • a. GA algorithm
  • b. Steady-State algorithm
  • c. LIST algorithm

SLIDE 26

Fully heterogeneous platforms, CCR ≈ 1

[Figure: three plots showing the percentage of experiments with a RMO above 'threshold', as a function of the number of jobs (1 to 10000) and of the threshold (0.1 to 1)]

  • a. GA algorithm
  • b. Steady-State algorithm
  • c. LIST algorithm

SLIDE 27

Conclusion and future works

Algorithm’s performance:

  • GA scheduling for batches of workflows on SOA Grids with communication costs
  • Collection of different workflows
  • Identical intrees, comparison to optimal
  • Complex implementation

Future works

  • Other communication models
  • Other genetic representation, network driven

SLIDE 28

Thank you!
