Integrated Scientific Workflow Management for the Emulab Network Testbed




SLIDE 1

Integrated Scientific Workflow Management for the Emulab Network Testbed

Eric Eide, Leigh Stoller, Tim Stack, Juliana Freire, and Jay Lepreau

University of Utah, School of Computing
USENIX 2006 / June 3, 2006

SLIDE 2

This Talk in One Slide

  • Current network testbeds
      • …manage the “laboratory”
      • …not the experimentation process.
  → A big problem for large-scale activities!
  • Evolve Emulab for experiments based on scientific workflows
  • Big mutual benefits: testbed ↔ workflow
  • Work in progress

SLIDE 3

Example: UAV Simulation

  • A distributed, real-time application
  • Evaluate improvements to real-time middleware
      • vs. CPU load
      • vs. network load
  • 4 research groups
  • x 19 experiments
  • x 56 metrics

[Diagram: UAV, Receiver, and ATR nodes connected by flows labeled “images →”, “← images”, and “alerts →”]

SLIDE 4

Use Emulab

[Diagram: stages Concept, Experiment, Emulate; steps labeled write “ns” file and “swap in”]

SLIDE 5

Problems Solved

  • I get machines!
      • 328 PCs, and more
      • Time- & space-shared
      • Loads OS and software
  • I get network!
      • Config. topology & quality
  • I get to collaborate!
      • Available to researchers and educators worldwide
      • File storage, email, …

SLIDE 6

Problems Not Solved

“Now what?”

  • Getting off the ground
      • Run all my software
      • Add instrumentation
      • Collect all my data
      • Analyze it
  • Scaling up
      • 19 configurations
      • Automation

SLIDE 7

More Problems Not Solved

“How did I get here?”

  • Over the short term…
      • “Where are the results I got last week?”
      • “How did I get those results anyway?”
      • “What if…?”
  • …and the long term
      • Reproducing results
      • Reusing artifacts

SLIDE 8

Idea: Scientific Workflow

  • Managing activities, inputs, and outputs is the job of a scientific workflow system
  • Our approach: evolve Emulab with integrated support for scientific workflows
      • Build on existing abstractions & mechanisms
      • Resource focus → user & task focus
      • Users work “within” and “across” experiments

SLIDE 9

Contributions

  • Address demand + opportunity
      • Users need to manage large-scale complexity
      • A symbiotic combination: leverage and impact
  • Advance the applicability of testbeds
      • Not just Emulab (e.g., PlanetLab and DETER)
  • Advance scientific workflow systems
      • Exploit testbed capabilities (e.g., “total control”)
      • Address testbed requirements (e.g., flexible use)

SLIDE 10

Issue: Encapsulation

  • Current “experiment” model is not fully encapsulating
      • Topology + static events
      • Need everything else!
  • Challenge: specification
      • Complete and precise…
      • …w/o huge user burden
  • Approach: be automatic
      • E.g., track files used
      • Snapshot, archive, restore
      • User can refine “extent”

[Diagram labels: ns file, OSes, packages, my software, inputs, outputs; NFS monitors, packet monitors, AJAX GUI, Subversion repo., datapository (DB), research filesystems]
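
The “snapshot, archive, restore” approach above can be illustrated with a minimal sketch. This is not Emulab's implementation: the function names `snapshot` and `restore`, the content-addressed archive layout, and the use of SHA-256 are all assumptions made for illustration, and the list of tracked files is simply passed in rather than discovered automatically.

```python
import hashlib
import shutil
from pathlib import Path


def snapshot(files, archive_dir):
    """Copy each tracked file into a content-addressed archive.

    Returns a manifest mapping each file's name to its SHA-256 digest.
    (Hypothetical helper; files with the same name in different
    directories would collide in this simplified manifest.)
    """
    archive = Path(archive_dir)
    archive.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for f in map(Path, files):
        digest = hashlib.sha256(f.read_bytes()).hexdigest()
        stored = archive / digest
        if not stored.exists():  # identical content is stored only once
            shutil.copyfile(f, stored)
        manifest[f.name] = digest
    return manifest


def restore(manifest, archive_dir, dest_dir):
    """Recreate every file named in the manifest under dest_dir."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    for name, digest in manifest.items():
        shutil.copyfile(Path(archive_dir) / digest, dest / name)
```

Storing the manifest alongside each run is one way a user could later refine the “extent” of what is captured, by adding or removing entries before the next snapshot.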

SLIDE 11

Issue: Definition vs. Execution

  • Current “experiment” has multiple roles
      • Definition
      • The thing that you run
  • Challenge: representing relationships
      • Multiple runs of one setup
      • Similar configurations
  • Approach: a new model of experimentation
      • Separate the roles
      • Evolve the new abstractions

SLIDE 12

New Model

  • Template
  • Swapin
  • Experiment
  • Activity
  • Record

[Diagram labels: n = 2, n = 4]
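
A rough sketch of how separating definition from execution might look in code. The slide names the abstractions but does not spell out their relationships, so everything here is an assumption: `Template` stands for the definition, `Swapin` for one instantiation with bound parameter values (such as the n = 2 and n = 4 in the diagram), and a plain dictionary stands in for a Record. Experiment and Activity are left out to keep the sketch short.

```python
from dataclasses import dataclass, field


@dataclass
class Template:
    """The definition role: a parameterized description of a setup."""
    name: str
    defaults: dict


@dataclass
class Swapin:
    """The execution role: one template bound to concrete parameter values."""
    template: Template
    params: dict
    records: list = field(default_factory=list)

    def run(self, measure):
        """Perform one run and keep its result as a record."""
        record = {
            "template": self.template.name,
            "params": dict(self.params),
            "result": measure(**self.params),
        }
        self.records.append(record)
        return record


def swap_in(template, **overrides):
    """Instantiate a template; overrides replace the default values."""
    unknown = set(overrides) - set(template.defaults)
    if unknown:
        raise KeyError(f"unknown parameters: {sorted(unknown)}")
    return Swapin(template, {**template.defaults, **overrides})
```

For example, `swap_in(t)` and `swap_in(t, n=4)` give two instances of one template `t`, and calling `run` repeatedly on either one gives multiple runs of one setup, each with its own record.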

SLIDE 13

Issue: History

  • Research and educational plans are dynamic
      • By design & by discovery
  • Challenge: safe exploration
      • Fork
      • Back up
  • Approach: keep history & support temporal navigation
      • Keep template revisions
      • Track provenance
      • Locate, repeat, and reuse

[Diagram labels: rev 1.1; bigger nets; add params; oops: need new measurements; what about loss?]
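
One way to picture “keep template revisions” with forking and backing up is a tree of revisions, each pointing at its parent. The sketch below is illustrative only: the `Revision` class and its methods are hypothetical, and the shape of the tree in the usage note is a guess, not a transcription of the diagram.

```python
class Revision:
    """One template revision; parent is None for the root."""

    def __init__(self, note, parent=None):
        self.note = note
        self.parent = parent
        self.children = []
        if parent is not None:
            parent.children.append(self)

    def revise(self, note):
        """Create a new revision derived from this one.

        Calling revise() twice on the same revision forks the history.
        """
        return Revision(note, parent=self)

    def lineage(self):
        """Return the notes from the root down to this revision."""
        notes, node = [], self
        while node is not None:
            notes.append(node.note)
            node = node.parent
        return notes[::-1]
```

Backing up is then just holding on to an earlier `Revision` object and calling `revise` on it again; `lineage()` gives a simple form of provenance, the chain of changes that led to a given revision.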

SLIDE 14

Implementation in Progress

[Figure labels: Definition; Execution & History; Data Analysis]

SLIDE 15

Conclusion

  • Large and powerful testbeds
      • …enable complex and large-scale activities
      • …lead to complex and large-scale workflow management problems
  • Integrated workflow management can leverage the strengths of testbeds
      • Systems approach, and systems challenges
  → Better testbeds and workflow systems

SLIDE 16

http://www.emulab.net/

Thanks!

SLIDE 17

Extra Slides After This Point