

SLIDE 1

Parallel scripting with Swift for applications at the petascale and beyond

VecPar PEEPS Workshop, Berkeley, CA – June 22, 2010
Michael Wilde – wilde@mcs.anl.gov
Computation Institute, University of Chicago and Argonne National Laboratory

www.ci.uchicago.edu/swift

SLIDE 2

Problems addressed by Swift

  • Many applications need loosely coupled scripting
  • Swift harnesses parallel & distributed resources through a simple scripting language
  • Productivity gains come from enabling the use of more powerful systems with less concern for the mechanics

SLIDE 3

Modeling uncertainty for CIM‐EARTH

Parallel AMPL workflow by Joshua Elliott, Meredith Franklin, Todd Munson, Allan Espinosa.

SLIDE 4

Fast Ocean Atmosphere Model (MPI)

[Figure: FOAM at NCAR, from manual configuration, execution, and bookkeeping to automated runs with VDS on the TeraGrid; visualization courtesy of Pat Behling and Yun Liu, UW Madison]

Work of Veronica Nefedova and Rob Jacob, Argonne

SLIDE 5

Problem: Drug screening at APS
(Mike Kubal, Benoit Roux, and others)

Starting from 2M+ ligands, O(millions) of drug candidates are screened down to O(tens) of fruitful candidates for wetlab & APS.

SLIDE 6

[Workflow figure: the virtual screening pipeline. Inputs: one protein receptor (~1 MB; PDB protein descriptions manually prepped into DOCK6 and FRED receptor files, each defining the pocket to bind to) and 2M ZINC 3-D ligand structures (~6 GB). DOCK6/FRED docking: ~4M tasks × 60 s × 1 CPU ≈ 60K CPU-hours; select best ~5K. Amber scoring (1. AmberizeLigand, 2. AmberizeReceptor, 3. AmberizeComplex, 4. perl: gen nabscript, 5. RunNABScript, using a NAB script template and parameters defining flexible residues and #MDsteps): ~10K tasks × 20 min × 1 CPU ≈ 3K CPU-hours; select best ~500. GCMC: ~500 tasks × 10 hr × 100 CPUs ≈ 500K CPU-hours.]

Work of Andrew Binkowski and Michael Kubal
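
A hedged Swift sketch of the first screening stage (the app declaration, flags, type names, and mappers below are illustrative assumptions, not the production script):

    type PDBFile;     // receptor structure
    type MOL2File;    // ligand 3-D structure from ZINC
    type ScoreFile;   // docking result

    // Hypothetical wrapper for the DOCK6 binary; the real flags differ.
    app (ScoreFile score) dock (PDBFile receptor, MOL2File ligand) {
      dock6 "-receptor" @receptor "-ligand" @ligand "-out" @score;
    }

    PDBFile receptor <"receptor.pdb">;
    MOL2File ligands[ ] <simple_mapper; prefix="zinc">;
    ScoreFile scores[ ] <simple_mapper; prefix="score">;

    // Each docking task is independent, so all ~2M of them can run
    // in parallel as resources permit.
    foreach lig, i in ligands {
      scores[i] = dock(receptor, lig);
    }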

SLIDE 7

Problem: preprocessing and analysis of neuroscience experiments

[Workflow figure: many data files (the 3a, 4a, 5a, 6a image/header sets, ref and atlas files, and atlas_x/y/z .jpg and .ppm outputs) flow through many application programs (align_warp, reslice, softmean, slicer, convert).]

SLIDE 8

Automated image registration for spatial normalization

[Figure: the AIRSN workflow (reorientRun, alignlinearRun, resliceRun, reslice_warpRun, random_select, softmean, alignlinear, combinewarp, strictmean, gsmoothRun, binarize) and its expanded form, one node per task: reorient/01–/57, alignlinear/03–/17, reslice/04–/12, reslice_warp/22–/38, softmean/13, combinewarp/21, strictmean/39, binarize/40, gsmooth/41–/50.]

SLIDE 9

Swift programs

  • A Swift script is a set of functions
    – Atomic functions wrap & invoke application programs (on parallel compute nodes)
    – Composite functions invoke other functions (and run in the Swift engine); see the sketch after this list
  • Data is typed as composable arrays and structures of files and simple scalar types (int, float, string)
  • Collections of persistent file structures are mapped into this data model as arrays and structures
  • Variables are single-assignment
  • Expressions and statements are executed concurrently, in data-flow dependency order
  • Members of datasets can be processed in parallel
  • Provenance is gathered as scripts execute
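A minimal sketch of the atomic/composite distinction (sortFile, sortAll, and the textfile type are hypothetical names; the pattern mirrors the doRound example on Slide 18):

    type textfile;

    // Atomic function: wraps and invokes an application program;
    // it executes on a compute node.
    app (textfile out) sortFile (textfile in) {
      sort "-o" @out @in;
    }

    // Composite function: invokes other functions; it runs in the
    // Swift engine, and single assignment makes each element a future.
    (textfile sorted[ ]) sortAll (textfile inputs[ ]) {
      foreach f, i in inputs {
        sorted[i] = sortFile(f);
      }
    }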

SLIDE 10

A simple Swift script

To run the ImageMagick app "convert":

    convert -rotate 180 $in $out

the Swift script is:

    type imagefile;   // Declare a "file" type.

    app (imagefile output) rotate (imagefile input) {
      convert "-rotate" 180 @input @output;
    }

    imagefile image <"m101.2010.0601.jpg">;
    imagefile newimage <"output.jpg">;

    newimage = rotate(image);

SLIDE 11

Execution is driven by data flow

    (int result) myproc (int input)
    {
      j = f(input);
      k = g(input);
      result = j + k;
    }

j=f() and k=g() are computed in parallel. This parallelism is automatic, based on futures, and works recursively down the script's call graph.

SLIDE 12

Parallelism via foreach { }

    type imagefile;   // Declare a "file" type.

    app (imagefile output) rotate (imagefile input) {
      convert "-rotate" "180" @input @output;
    }

    // Map inputs from the local directory.
    imagefile observations[ ] <simple_mapper; prefix="m101-raw">;
    // Name outputs based on index.
    imagefile flipped[ ] <simple_mapper; prefix="m101-flipped">;

    // Process all dataset members in parallel.
    foreach obs, i in observations {
      flipped[i] = rotate(obs);
    }

SLIDE 13

Many domains process structured datasets

[Workflow figure, as on Slide 7: many data files flowing through many application programs (align_warp, reslice, softmean, slicer, convert).]

SLIDE 14

Swift Data Mapping

    type Study   { Group g[ ]; }
    type Group   { Subject s[ ]; }
    type Subject { Volume anat; Run run[ ]; }
    type Run     { Volume v[ ]; }
    type Volume  { Image img; Header hdr; }

A mapping function or script connects the on-disk data layout to Swift's in-memory data model, as sketched below.
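A hedged sketch of declaring and traversing such a mapped dataset (study_mapper.sh and process() are hypothetical names; Image and Header are assumed declared as simple file types):

    // Map the on-disk study layout into the nested in-memory
    // structure via an external mapper script (hypothetical name).
    Study study <ext; exec="study_mapper.sh">;

    // Every run of every subject in every group can be
    // processed in parallel.
    foreach g in study.g {
      foreach s in g.s {
        foreach r in s.run {
          process(s.anat, r);   // process() is an illustrative app function
        }
      }
    }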
SLIDE 15

Application: Protein structure prediction

[Figure: the Swift app function "predict()" takes a Protein (FASTA sequence file), temperature t, and dt, runs the PSim application, and produces pg (PDB geometry) and a log file.]

To run:

    psim -s 1ubq.fas -pdb p -temp 100.0 -inc 25.0 >log

In Swift code:

    app (PDB pg, File log) predict (Protein seq, Float temp, Float dt) {
      psim "-s" @seq.fasta "-pdb" @pg "-temp" temp "-inc" dt;
    }

    Protein p <ext; exec="Pmap", id="1ubq">;
    PDB structure;
    File log;

    (structure, log) = predict(p, 100., 25.);

Encapsulation is the key to transparent distribution, parallelization, and provenance.

SLIDE 16

Parallelism via foreach { }

    foreach sim in [1:1000] {
      (structure[sim], log[sim]) = predict(p, 100., 25.);
    }
    result = analyze(structure);

The 1000 predict() calls run in parallel; analyze() starts only once all elements of structure[] are set.

SLIDE 17

Application: 3D Protein structure prediction

    type Fasta;      // Primary protein sequence file in FASTA format
    type SecSeq;     // Secondary structure file
    type RamaMap;    // "Ramachandran" mapping info files
    type RamaIndex;
    type ProtGeo;    // PDB-format file: protein geometry (3D atom coords)
    type SimLog;

    type Protein {        // Input file struct to protein simulator
      Fasta fasta;        // sequence to predict structure of
      SecSeq secseq;      // initial secondary structure to use
      ProtGeo native;     // 3D structure from experimental data, when known
      RamaMap map;
      RamaIndex index;
    }

    type PSimCf {         // Science configuration parameters to simulator
      float st;
      float tui;
      float coeff;
    }

    type ProtSim {        // Output file struct from protein simulator
      ProtGeo pgeo;
      SimLog log;
    }

SLIDE 18

Protein structure prediction

    app (ProtGeo pgeo) predict (Protein pseq)
    {
      PSim @pseq.fasta @pgeo;
    }

    (ProtGeo pg[ ]) doRound (Protein p, int n) {
      foreach sim in [0:n-1] {
        pg[sim] = predict(p);
      }
    }

    Protein p <ext; exec="Pmap", id="1af7">;
    ProtGeo structure[ ];
    int nsim = 10000;
    structure = doRound(p, nsim);

SLIDE 19

Protein structure prediction

    (ProtSim psim[ ]) doRoundCf (Protein p, int n, PSimCf cf) {
      foreach sim in [0:n-1] {
        psim[sim] = predictCf(p, cf.st, cf.tui, cf.coeff);
      }
    }

    (boolean converged) analyze (ProtSim prediction[ ], int r, int numRounds)
    {
      if ( r == (numRounds-1) ) {
        converged = true;
      } else {
        converged = test_convergence(prediction);
      }
    }

SLIDE 20

Protein structure prediction

    ItFix (Protein p, int nsim, int maxr, float temp, float dt)
    {
      ProtSim prediction[ ][ ];
      boolean converged[ ];
      PSimCf config;

      config.st = temp;
      config.tui = dt;
      config.coeff = 0.1;

      iterate r {
        prediction[r] = doRoundCf(p, nsim, config);
        converged[r] = analyze(prediction[r], r, maxr);
      } until ( converged[r] );
    }

SLIDE 21

Protein structure prediction

    Sweep()
    {
      int nSim = 1000;
      int maxRounds = 3;
      Protein pSet[ ] <ext; exec="Protein.map">;
      float startTemp[ ] = [ 100.0, 200.0 ];
      float delT[ ] = [ 1.0, 1.5, 2.0, 5.0, 10.0 ];

      foreach p, pn in pSet {
        foreach t in startTemp {
          foreach d in delT {
            ItFix(p, nSim, maxRounds, t, d);
          }
        }
      }
    }

    Sweep();

10 proteins × 1000 simulations × 3 rounds × 2 temps × 5 deltas = 300K tasks

SLIDE 22

Using Swift

[Figure: a Swift script runs from a submit host (laptop, login host, …) with a site list and app list; file transport moves data files (f1, f2, f3) between a data server and compute nodes or clouds, where apps a1 and a2 execute; workflow status and logs, including a provenance log, are kept at the submit host.]

Swift is a self-contained application with cluster and grid client code: download, untar, and run.

SLIDE 23

Architecture for petascale scripting

[Figure: a Swift script feeds a Falkon client (load balancing), which dispatches tasks to Falkon services on the BG/P I/O processors and on to BG/P processor sets; small, fast, local memory-based filesystems complement a shared global filesystem.]

SLIDE 24

Collective data management is critical for petascale

  • Applies "scatter/gather" concepts at the file-management level
  • Seeks to avoid contention, maximize parallelism, and exploit petascale interconnects
    – Broadcast common files to compute nodes
    – Place per-task data on local (RAM) filesystems
    – Gather output into larger sets (in time/space)
    – Aggregate small local filesystems into one large striped filesystem
  • Still in the research phase: both paradigm and architectures

SLIDE 25

Collective data management

[Figure: a distributor stages data (1–5) from the global filesystem through intermediate filesystems (IFS nodes) to compute-node local filesystems (CN LFS); a collector gathers outputs back along the same path.]

SLIDE 26

Performance: Molecular dynamics on BG/P

935,803 DOCK jobs with Falkon on BG/P in 2 hours

SLIDE 27

Performance: SEM for fMRI on Constellation

418K SEM tasks with Swift/Coasters on Ranger in 41 hours

SLIDE 28

Performance: Proteomics on BG/P

4,127 PTMap jobs with Swift/Falkon on BG/P in 3 minutes

SLIDE 29

Scaling the many-task model

[Figure: a client (master) application drives a master graph executor, which coordinates many compute units, each running its own graph executor, over ultra-fast message queues within an extreme-scale computing complex; a virtual data store is backed by global persistent storage.]

SLIDE 30

Scaling many‐task computing

  • ADLB: tasks can be lightweight functions
    – Retains the RPC model of input-process-output
    – Fast, distributed, asynchronous load balancing
  • Multi-level task manager
    – Must scale to massive computing complexes
  • Transparent distributed management of local storage
    – Leverage local (RAM) filesystems, aggregate them, and make access more transparent through DHT methods

SLIDE 31

Conclusion: Motivation for Swift

  • Enhance scientific productivity
    – Location- and paradigm-independence: the same scripts run on workstations, clusters, clouds, grids, and petascale supercomputers
    – Automation of dataflow, resource selection, and error recovery
  • Enable and motivate collaboration
    – Community libraries of techniques, protocols, and methods
    – Designed to record the provenance of all data produced, to facilitate scientific processes

SLIDE 32

  • Swift is a parallel scripting system for Grids and clusters
    – for loosely coupled applications: application and utility programs linked by exchanging files
  • Swift is easy to write: a simple, high-level, C-like functional language
    – Small Swift scripts can do large-scale work
  • Swift is easy to run: it contains all services for running Grid workflow in one Java application
    – Untar and run; acts as a self-contained Grid client
  • Swift is fast: Karajan provides Swift a powerful, efficient, scalable, and flexible execution engine
    – Scaling close to 1M tasks; 0.5M in live science work, and growing
  • Swift usage is growing:
    – applications in neuroscience, proteomics, molecular dynamics, biochemistry, economics, statistics, and more

SLIDE 33

To learn more and try Swift…

  • www.ci.uchicago.edu/swift
    – Quick Start Guide: http://www.ci.uchicago.edu/swift/guides/quickstartguide.php
    – User Guide: http://www.ci.uchicago.edu/swift/guides/userguide.php
    – Introductory Swift Tutorials: http://www.ci.uchicago.edu/swift/docs/index.php

SLIDE 34

[Image: Swift featured in IEEE Computer, Nov. 2009]

SLIDE 35

Acknowledgments

  • The Swift effort is supported in part by NSF grants OCI-721939, OCI-0944332, and PHY-636265, NIH DC08638, and the UChicago/Argonne Computation Institute
  • The Swift team (present and former):
    – Ben Clifford, Allan Espinosa, Ian Foster, Mihael Hategan, Ioan Raicu, Sarah Kenny, Mike Wilde, Justin Wozniak, Zhao Zhang, Yong Zhao
  • Java CoG Kit, used by Swift, developed by:
    – Mihael Hategan, Gregor von Laszewski, and many collaborators
  • Falkon software
    – developed by Ioan Raicu and Zhao Zhang
  • ZeptoOS
    – Kamil Iskra, Kazutomo Yoshii, and Pete Beckman
  • Scientific application collaborators and users
    – U. Chicago Open Protein Simulator Group (Karl Freed, Tobin Sosnick, Glen Hocky, Joe DeBartolo, Aashish Adhikari)
    – U. Chicago Radiology and Human Neuroscience Lab (Dr. S. Small)
    – SEE/CIM-EARTH/Econ: Joshua Elliott, Meredith Franklin, Todd Munson, Tib Stef-Praun
    – PTMap: Yingming Zhao, Yue Chen