Starting Workflow Tasks Before They're Ready (Wladislaw Gusew, Björn Scheuermann)



slide-1
SLIDE 1

Starting Workflow Tasks Before They’re Ready

Wladislaw Gusew, Björn Scheuermann

Computer Engineering Group, Humboldt University of Berlin

slide-2
SLIDE 2

Agenda

◮ Introduction
◮ Execution semantics
◮ Methods and tools
◮ Simulation results
◮ Experimental results
◮ Conclusion

1 / 21

slide-3
SLIDE 3

Big data in research

2 / 21

slide-4
SLIDE 4

Scientific workflow example

◮ Directed Acyclic Graph (DAG)
◮ Executed on distributed systems
◮ Aggregation and broadcast types of tasks
◮ Demanding of network resources
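
As a minimal sketch of the structure listed on this slide, a workflow DAG can be kept as a mapping from each task to its parents. The task names and the `classify` helper below are hypothetical illustrations, not taken from the slides.

```python
from graphlib import TopologicalSorter

# Hypothetical workflow: one broadcast task fans out to three tasks,
# whose outputs are collected by one aggregation task.
parents = {
    "split": [],
    "proj_1": ["split"],
    "proj_2": ["split"],
    "proj_3": ["split"],
    "merge": ["proj_1", "proj_2", "proj_3"],
}

def children_of(parents):
    """Invert the parent map to get each task's children."""
    children = {task: [] for task in parents}
    for task, deps in parents.items():
        for dep in deps:
            children[dep].append(task)
    return children

def classify(parents):
    """Label tasks with several parents as aggregation, with several children as broadcast."""
    children = children_of(parents)
    kinds = {}
    for task in parents:
        if len(parents[task]) > 1:
            kinds[task] = "aggregation"
        elif len(children[task]) > 1:
            kinds[task] = "broadcast"
        else:
            kinds[task] = "plain"
    return kinds

# static_order() raises CycleError if the graph is not acyclic.
order = list(TopologicalSorter(parents).static_order())
kinds = classify(parents)
```

Here `order` starts with `split` and ends with `merge`, which reflects that the aggregation task depends on every intermediate task.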

3 / 21

slide-7
SLIDE 7

Execution semantics

◮ But in reality resources are limited
◮ Execute only a subset of parent tasks concurrently (insufficient number of workers)
◮ Congestion of network (all parent tasks have the same priority)
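
The effect of too few workers can be illustrated with a small sketch. It assumes classic semantics (a task starts only after all parents have finished), identical parent runtimes, and no data transfer time; the function names and numbers are hypothetical.

```python
import math

def is_ready(task, parents, finished):
    """Classic semantics: a task may start only after all of its parents have finished."""
    return all(parent in finished for parent in parents[task])

def aggregation_start_time(num_parents, num_workers, parent_runtime):
    """Earliest start of the aggregation task when parents run in waves.

    Assumes identical parent runtimes and ignores data transfer time.
    """
    waves = math.ceil(num_parents / num_workers)
    return waves * parent_runtime

parents = {"agg": ["p1", "p2", "p3"]}
print(is_ready("agg", parents, finished={"p1", "p2"}))  # False
print(aggregation_start_time(10, 4, 5.0))  # 15.0
```

With 10 parents and only 4 workers, the parents need three waves, so the aggregation task waits three parent runtimes instead of one.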

4 / 21

slide-11
SLIDE 11

Example execution

◮ Network congestion can slow down processing even further (effects of data losses at the transport protocol layer)
◮ High delay to the start of the aggregation task
◮ Low performance and high execution costs (e.g., in computation clouds)

5 / 21

slide-18
SLIDE 18

What can we do to improve this?

List of actions:

  1. Obtain information on the task's input characteristics
  2. Refine the workflow and inform the execution engine
  3. Let the aggregation task "feel comfortable" in the changed setting

6 / 21

slide-20
SLIDE 20

Obtaining input characteristics

  1. Annotations to workflows
  2. Manual code review
  3. Automated profiling

7 / 21

slide-22
SLIDE 22

Automated profiling

◮ Operating system instrumentation tool
◮ Enables interception of system calls (file open, read/write, file close)
◮ Record and evaluate logfiles with traces of conducted file accesses

[Figure: two plots of read accesses [MB] over execution progress [10^8 CPU cycles]: "Reads by mAdd in a small workflow" and "Reads by mAdd in a medium sized workflow"]

8 / 21
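
Evaluating such a logfile could look like the following sketch. The trace format (`<cpu_cycles> <op> <path> <bytes>`), the file names, and the function name are assumptions made for illustration; the slides do not specify the format produced by the instrumentation tool.

```python
from collections import defaultdict

# Hypothetical trace format, one intercepted call per line:
#   <cpu_cycles> <op> <path> <bytes>
SAMPLE_TRACE = """\
100 open in_1.fits 0
150 read in_1.fits 4096
300 read in_1.fits 4096
320 close in_1.fits 0
900 open in_2.fits 0
950 read in_2.fits 8192
"""

def first_read_and_volume(trace_text):
    """Return, per file, the cycle count of its first read and its total bytes read."""
    first_read = {}
    total_bytes = defaultdict(int)
    for line in trace_text.splitlines():
        if not line.strip():
            continue
        cycles, op, path, nbytes = line.split()
        if op != "read":
            continue
        # Remember only the earliest read of each file.
        first_read.setdefault(path, int(cycles))
        total_bytes[path] += int(nbytes)
    return first_read, dict(total_bytes)

first_read, volume = first_read_and_volume(SAMPLE_TRACE)
```

The point in the execution at which each input file is first read indicates which inputs the task needs early and which it only needs later.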

slide-23
SLIDE 23

Refining workflow by transforming DAG

9 / 21

slide-27
SLIDE 27

Realizing virtual task split

◮ Real task is transparently wrapped
◮ FUSE enables the setup of a virtual File system in USEr space
◮ Access to input files is performed through our wrapper
◮ Wrapper is responsible for maintaining the correct execution logic
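
The core idea of such a wrapper, delaying access to an input file until it is actually available, can be sketched without FUSE. The `InputGate` class below is a hypothetical stand-in that uses plain Python threading primitives; it is not the authors' implementation and does not mount a file system.

```python
import os
import tempfile
import threading

class InputGate:
    """Blocks readers of an input file until its producer marks it complete."""

    def __init__(self, paths):
        self._ready = {path: threading.Event() for path in paths}

    def mark_complete(self, path):
        """Called once the producing task has finished writing the file."""
        self._ready[path].set()

    def open_input(self, path, timeout=None):
        """Open the file for reading, waiting until it has been marked complete."""
        if not self._ready[path].wait(timeout):
            raise TimeoutError(f"{path} is not available yet")
        return open(path, "rb")

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "in_1.dat")
    gate = InputGate([path])
    try:
        gate.open_input(path, timeout=0.01)
        blocked = False
    except TimeoutError:
        blocked = True  # the input has not been produced yet
    with open(path, "wb") as f:
        f.write(b"payload")
    gate.mark_complete(path)
    with gate.open_input(path) as f:
        data = f.read()
```

In this sketch the consumer can be started early and simply waits at the first access to a missing input, which is the behaviour the wrapper has to guarantee for the real task.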

10 / 21

slide-28
SLIDE 28

Evaluation with the Montage workflow

11 / 21

slide-29
SLIDE 29

Simulating workflow execution

◮ Java-based simulation framework for scientific workflows
◮ Simulates an execution on a Pegasus/HTCondor stack
◮ Use provided Montage workflows with 25, 50, 100, 1000 tasks
◮ Python script conducted DAG transformation of DAX files
◮ Network configured as bottleneck (by bandwidth limitation)

  • W. Chen and E. Deelman, "WorkflowSim: A toolkit for simulating scientific workflows in distributed environments," in eScience'12.
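
The slides do not spell out the transformation rule, so the following is only one possible interpretation of a task split: an aggregation task is replaced by a chain of virtual sub-tasks, each depending on a group of the original parents and on the previous sub-task. The function name, the `group_size` parameter, and the `_partN` naming are hypothetical, and the sketch works on a plain parent map rather than on DAX files.

```python
def split_aggregation(parents, task, group_size):
    """Return a new parent map in which `task` becomes a chain of virtual sub-tasks."""
    deps = parents[task]
    new_parents = {t: list(d) for t, d in parents.items()}
    if len(deps) <= group_size:
        return new_parents  # nothing to split
    groups = [deps[i:i + group_size] for i in range(0, len(deps), group_size)]
    previous = None
    for index, group in enumerate(groups, start=1):
        # The last sub-task keeps the original name so that children stay connected.
        name = task if index == len(groups) else f"{task}_part{index}"
        new_parents[name] = list(group) + ([previous] if previous else [])
        previous = name
    return new_parents

workflow = {
    "p1": [], "p2": [], "p3": [], "p4": [],
    "agg": ["p1", "p2", "p3", "p4"],
    "out": ["agg"],
}
transformed = split_aggregation(workflow, "agg", group_size=2)
```

In `transformed`, `agg_part1` depends only on `p1` and `p2`, so an execution engine with classic semantics could start it before `p3` and `p4` have finished.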

12 / 21

slide-30
SLIDE 30

Simulation results

13 / 21

slide-32
SLIDE 32

Variation of number of tasks

[Figure: "Simulation results for 50 workers and max-min". Total workflow runtime in s (log scale) over number of tasks (25, 50, 100, 1000), series Normal and Split; percentage labels in order: 15%, 19%, 25%, 31%]

14 / 21

slide-34
SLIDE 34

Variation of workers

[Figure: "Simulation results for Montage100 and min-min". Total workflow runtime in s over number of workers (5, 10, 50, 100), series Normal and Split; percentage labels in order: 10%, 14%, 26%, 25%]

16 / 21

slide-36
SLIDE 36

Variation of scheduling algorithms

[Figure: "Simulation results for Montage100 on 100 workers". Total workflow runtime in s per scheduling algorithm (Min-min, Max-min, Round-robin, HEFT, DHEFT, Random), series Normal and Split; percentage labels in order: 25%, 27%, 28%, 25%, 17%, 34%]

18 / 21

slide-37
SLIDE 37

Evaluation in a computing cluster

◮ Small cluster of up to 10 compute nodes
◮ Intel i7 CPU @ 2.5 GHz, 8 GB RAM, connected to a common network switch with 1 Gbit/s
◮ Execute Montage 133 workflow in Pegasus/HTCondor
◮ Network bandwidth was limited at the application layer to 10 Mbit/s
◮ 10 repetitions, mean values with 95% confidence intervals
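
How a mean with a 95% confidence interval can be derived from 10 repetitions is sketched below. The slides do not state which interval estimator was used; this sketch assumes a Student's t interval, and the runtime values are invented for illustration.

```python
import math
import statistics

# Two-sided 95% critical value of Student's t distribution for 9 degrees of freedom.
T_95_DF9 = 2.262

def mean_with_ci95(samples):
    """Return (mean, half_width) of a 95% t confidence interval for exactly 10 samples."""
    if len(samples) != 10:
        raise ValueError("the hard-coded critical value is only valid for 10 samples")
    mean = statistics.mean(samples)
    half_width = T_95_DF9 * statistics.stdev(samples) / math.sqrt(len(samples))
    return mean, half_width

# Hypothetical workflow runtimes in seconds from 10 repetitions.
runtimes = [98, 102, 98, 102, 98, 102, 98, 102, 98, 102]
mean, half_width = mean_with_ci95(runtimes)
```

The interval is then reported as `mean ± half_width`; for other repetition counts the critical value would have to be replaced accordingly.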

19 / 21

slide-38
SLIDE 38

Measurement results

[Figure: "Computing cluster results for 1...10 workers". Total workflow runtime in s over number of computing nodes (1 to 10), series Original Montage133 and Transformed Montage133]

20 / 21

slide-40
SLIDE 40

Conclusion

◮ Many "legacy" workflows exist which are executed with classic semantics
◮ Our approach is applicable to aggregation tasks, which are often the most time-intensive tasks in a workflow
◮ By using DAG transformation, no changes to task implementations and execution engines are required
◮ Simulation and real experiments show that performance can be improved by up to 15%
◮ Potential of outperforming the original workflow grows with increasing #workers and #tasks

21 / 21