Integrated Scientific Workflow Management for the Emulab Network Testbed




  1. Integrated Scientific Workflow Management for the Emulab Network Testbed
      - Eric Eide, Leigh Stoller, Tim Stack, Juliana Freire, and Jay Lepreau
      - University of Utah, School of Computing
      - USENIX 2006 / June 3, 2006

  2. This Talk in One Slide
      - Current network testbeds…
        - …manage the “laboratory”
        - …not the experimentation process.
        - → A big problem for large-scale activities!
      - Evolve Emulab for experiments based on scientific workflows
      - Big mutual benefits: testbed ↔ workflow
      - Work in progress

  3. Example: UAV Simulation
      - A distributed, real-time application
      - Evaluate improvements to real-time middleware
        - vs. CPU load
        - vs. network load
      - 4 research groups
      - x 19 experiments
      - x 56 metrics
      - [Diagram: UAV, Receiver, and ATR components exchanging “images” and “alerts”]

  4. Use Emulab
      - [Diagram: Concept → (write “ns” file) → Experiment → (“swap in”) → Emulate]

  5. Problems Solved
      - I get machines!
        - 328 PCs, and more
        - Time- & space-shared
        - Loads OS and software
      - I get network!
        - Config. topology & quality
      - I get to collaborate!
        - Available to researchers and educators worldwide
        - File storage, email, …

  6. Problems Not Solved
      - “Now what?”
      - Getting off the ground
        - Run all my software
        - Add instrumentation
        - Collect all my data
        - Analyze it
      - Scaling up
        - 19 configurations
        - Automation

  7. More Problems Not Solved
      - “How did I get here?”
      - Over the short term…
        - “Where are the results I got last week?”
        - “How did I get those results anyway?”
        - “What if…?”
      - …and the long term
        - Reproducing results
        - Reusing artifacts

  8. Idea: Scientific Workflow
      - Managing activities, inputs, and outputs is the job of a scientific workflow system
      - Our approach: evolve Emulab with integrated support for scientific workflows
        - Build on existing abstractions & mechanisms
        - Resource focus → user & task focus
        - Users work “within” and “across” experiments

  9. Contributions
      - Address demand + opportunity
        - Users need to manage large-scale complexity
        - A symbiotic combination: leverage and impact
      - Advance the applicability of testbeds
        - Not just Emulab: e.g., PlanetLab and DETER
      - Advance scientific workflow systems
        - Exploit testbed capabilities: e.g., “total control”
        - Address testbed requirements: e.g., flexible use

  10. Issue: Encapsulation
      - Current “experiment” model is not fully encapsulating
        - Topology + static events
        - Need everything else!
      - Challenge: specification
        - Complete and precise…
        - …w/o huge user burden
      - Approach: be automatic
        - E.g., track files used
        - Snapshot, archive, restore
        - User can refine “extent”
      - [Diagram labels: ns file, OSes, packages, my software, inputs, outputs, NFS monitors, Subversion repo., packet monitors, datapository (DB), AJAX GUI, research filesystems]
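
The “snapshot, archive, restore” approach can be illustrated with a small sketch. This is not Emulab's mechanism (the slide mentions NFS monitors and a Subversion repository); it is a plain content-addressed copy, assuming the set of tracked files is already known. The function names `snapshot` and `restore`, the manifest format, and the file names are all hypothetical.

```python
import hashlib
import shutil
import tempfile
from pathlib import Path

def snapshot(tracked_files, archive_dir):
    """Copy each tracked file into archive_dir under its SHA-256 digest.

    Returns a manifest mapping the original path (as a string) to the digest.
    """
    archive_dir = Path(archive_dir)
    archive_dir.mkdir(parents=True, exist_ok=True)
    manifest = {}
    for path in map(Path, tracked_files):
        digest = hashlib.sha256(path.read_bytes()).hexdigest()
        stored = archive_dir / digest
        if not stored.exists():  # identical content is stored only once
            shutil.copyfile(path, stored)
        manifest[str(path)] = digest
    return manifest

def restore(manifest, archive_dir):
    """Write every file named in the manifest back to its original path."""
    archive_dir = Path(archive_dir)
    for original, digest in manifest.items():
        target = Path(original)
        target.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(archive_dir / digest, target)

# Usage: snapshot an input file, overwrite it, then restore the archived copy.
with tempfile.TemporaryDirectory() as tmp:
    work = Path(tmp) / "work"
    work.mkdir()
    config = work / "input.txt"  # hypothetical experiment input
    config.write_text("loss=0.01\n")
    manifest = snapshot([config], Path(tmp) / "archive")
    config.write_text("loss=0.05\n")  # a later edit changes the input
    restore(manifest, Path(tmp) / "archive")
    restored_text = config.read_text()
print(restored_text)
```

Keying the archive by content digest means an unchanged file costs nothing extra across snapshots, which matters when the tracked set is collected automatically and may be large.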

  11. Issue: Definition vs. Execution
      - Current “experiment” has multiple roles
        - Definition
        - The thing that you run
      - Challenge: representing relationships
        - Multiple runs of one setup
        - Similar configurations
      - Approach: a new model of experimentation
        - Separate the roles
        - Evolve the new abstractions

  12. New Model
      - Template
      - Swapin
      - Experiment
      - Activity
      - Record
      - [Diagram labels: n = 4, n = 2]
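
One possible reading of this model, as a minimal sketch: the names Template, Swapin, Activity, and Record come from the slide, but how they relate is an assumption here, and “Experiment” is folded into the hypothetical `Swapin.run` method. The parameter `n` mirrors the “n = 4” and “n = 2” labels; the template name and the `count_nodes` activity are invented for illustration.

```python
from dataclasses import dataclass, field

@dataclass
class Template:
    """A definition: what to run, with default parameter values."""
    name: str
    defaults: dict

    def swap_in(self, **overrides):
        """Bind parameters and return a new Swapin; the template is unchanged."""
        unknown = set(overrides) - set(self.defaults)
        if unknown:
            raise KeyError(f"unknown parameters: {sorted(unknown)}")
        return Swapin(template=self, params={**self.defaults, **overrides})

@dataclass
class Record:
    """What one run left behind: the parameters used and the outputs produced."""
    params: dict
    outputs: dict

@dataclass
class Swapin:
    """One instantiation of a template; it may host several runs."""
    template: Template
    params: dict
    records: list = field(default_factory=list)

    def run(self, activity):
        """Execute one activity (a callable taking params) and record its outputs."""
        record = Record(params=dict(self.params), outputs=activity(self.params))
        self.records.append(record)
        return record

def count_nodes(params):
    """Hypothetical activity: report how many nodes the run used."""
    return {"nodes": params["n"]}

# Usage: one definition, two instantiations, multiple runs of one setup.
template = Template("uav-sim", {"n": 4})
big = template.swap_in()       # n = 4 (default)
small = template.swap_in(n=2)  # n = 2 (override)
small.run(count_nodes)
small.run(count_nodes)
big.run(count_nodes)
print(len(small.records), big.records[0].outputs)
```

Keeping the definition (`Template`) separate from what is executed (`Swapin` and its records) is what lets several runs and several similar configurations point back to one shared setup.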

  13. Issue: History
      - Research and educational plans are dynamic
        - By design & by discovery
      - Challenge: safe exploration
        - Fork
        - Back up
      - Approach: keep history & support temporal navigation
        - Keep template revisions
        - Track provenance
        - Locate, repeat, and reuse
      - [Diagram labels: rev 1.1; “oops: need new measurements”; “bigger nets”; “what about loss?”; “add params”]
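
The fork, back-up, and provenance ideas can be sketched as a tree of revisions. This is only an illustration under assumptions: the `Revision` class, its method names, and the dotted numbering scheme are hypothetical and are not taken from Emulab. The notes reuse labels from the slide's diagram.

```python
class Revision:
    """One node in a template's history; existing nodes are never modified."""

    def __init__(self, rev_id, note, parent=None):
        self.rev_id = rev_id
        self.note = note
        self.parent = parent
        self.children = []

    def fork(self, note):
        """Create a child revision numbered after this one."""
        child = Revision(f"{self.rev_id}.{len(self.children) + 1}", note, parent=self)
        self.children.append(child)
        return child

    def back_up(self):
        """Return the parent revision, or this revision if it is the root."""
        return self.parent if self.parent is not None else self

    def provenance(self):
        """List revision notes from the root down to this revision."""
        chain = []
        node = self
        while node is not None:
            chain.append(node.note)
            node = node.parent
        return chain[::-1]

# Usage: explore one direction, back up, then try another from the same point.
root = Revision("1", "initial template")
params = root.fork("add params")        # 1.1
bigger = params.fork("bigger nets")     # 1.1.1
loss = bigger.back_up().fork("what about loss?")  # 1.1.2
print(loss.rev_id, loss.provenance())
```

Because forking only ever adds nodes, backing up is safe: the abandoned branch stays in the tree and can still be located and repeated later.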

  14. Implementation in Progress
      - [Figure labels: Definition; Data Analysis; Execution; & History]

  15. Conclusion
      - Large and powerful testbeds
        - …enable complex and large-scale activities
        - …lead to complex and large-scale workflow management problems
      - Integrated workflow management can leverage the strengths of testbeds
        - Systems approach, and systems challenges
        - → Better testbeds and workflow systems

  16. Thanks! http://www.emulab.net/

  17. Extra Slides After This Point
