Parallel Recipes, Yves Vandriessche, Sept. 08, 2015 (PowerPoint PPT Presentation)



slide-1
SLIDE 1

CnC as workflow coordination language for scientific computing

Yves Vandriessche

Parallel Recipes

Sept. 08, 2015
slide-2
SLIDE 2

scripts deal with complexity of gluing together applications

GATK, BWA, Picard, TopHat, samtools, …

Broad Institute best practices seq. pipeline ~ 200 SLoC

slide-3
SLIDE 3

Distribution and parallelisation explode the accidental complexity of scripts

GATK, BWA, Picard, TopHat, samtools, …

distributed seq. pipeline ~ 2000 SLoC

eHive exome pipeline: 28,066 SLoC¹ (Perl) => $898,255 est.

¹ generated using David A. Wheeler's 'SLOCCount'

slide-4
SLIDE 4

parallel recipe:

What is the essential difference between a sequential and a parallel script?

  • ordering dependencies!

In a sequential world:

  • one single ordering of operations

In a parallel world:

more #orderings => more parallelism => more performance

sources of ordering:

  • data dependencies: produce/consume, consistency
  • control dependencies: iteration, branching, recursion, …
  • concurrency (shared resources)

slide-5
SLIDE 5

Parallel Recipes (precipes):

complex glue x complex coordination

  • reuse scripting
  • ordering dependencies
  • Intel Concurrent Collections inside, as Coordination Language

CnC offers:

  • cluster-level and node-level parallelism
  • determinate execution
  • flexible parallel execution model
  • stable & practical implementation (CnC++)
slide-6
SLIDE 6

parallel hello world recipe:

graph nodes: A A_done finish B B_finished Bbis B_or_C_done C

stage B
command: $ echo 'B'
out: B_done

stage Bbis
command: $ echo 'another thing for B'
in: B_done
out: B_or_C_done

stage finish
command: $ echo 'finished'
in: { A_done, B_or_C_done }

command: what needs to happen when I start?
in: what dependencies need to be satisfied before I can start?
out: what dependencies are satisfied after I finished successfully?
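
The command/in/out idea above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not how precipes or CnC schedule work: the `STAGES` dictionary and the `execution_order` helper are hypothetical names, the command for stage A is a placeholder (the slide does not show it), and stage C is left out for the same reason.

```python
# Hypothetical in-memory form of the hello world recipe. Each stage lists the
# dependencies it waits for ("in") and the ones it satisfies on success ("out").
STAGES = {
    "A":      {"command": "echo A",                   "in": [],                        "out": ["A_done"]},
    "B":      {"command": "echo B",                   "in": [],                        "out": ["B_done"]},
    "Bbis":   {"command": "echo another thing for B", "in": ["B_done"],                "out": ["B_or_C_done"]},
    "finish": {"command": "echo finished",            "in": ["A_done", "B_or_C_done"], "out": []},
}


def execution_order(stages):
    """Return one valid ordering of stage names, driven only by in/out dependencies."""
    satisfied = set()        # dependencies produced so far
    pending = dict(stages)   # stages that have not run yet
    order = []
    while pending:
        # A stage is ready once every dependency in its "in" list is satisfied.
        ready = [name for name, stage in pending.items() if set(stage["in"]) <= satisfied]
        if not ready:
            raise ValueError("unsatisfiable dependencies: " + ", ".join(sorted(pending)))
        for name in ready:
            order.append(name)
            satisfied.update(pending.pop(name)["out"])
    return order


order = execution_order(STAGES)
print(order)
```

Stages that become ready in the same round (A and B here) have no ordering between them, which is exactly the freedom a parallel runtime can exploit.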

slide-7
SLIDE 7

parallel hello world recipe bis:

practical consideration: parallel scripts rarely run only once

fetch dosier -> dosier -> extract gross income -> income -> report income

fetch dosier
command: $ wget ftp://citizenfiles.gov/dosiers/yves.txt .

extract gross income
command: $ grep 'gross' yves.txt > yves_gross.txt

report income
command: $ echo -n citizen yves is making; cat yves_gross.txt ; echo a year.

slide-8
SLIDE 8

parallel hello world recipe bis:

practical consideration: parallel scripts rarely run only once
practical consideration: parallel scripts typically run data-parallel

fetch dosier -> dosier -> extract gross income -> income -> report income

fetch dosier
command: $ wget ftp://citizenfiles.gov/dosiers/{}.txt .

extract gross income
command: $ grep 'gross' {}.txt > {}_gross.txt

report income
command: $ echo -n citizen {} is making ; cat {}_gross.txt ; echo a year.

tags: yves tom roel
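
As a rough illustration of the `{}` placeholder, the sketch below instantiates the three command templates once per tag. The `RECIPE` list and the `instantiate` helper are hypothetical names, and plain string replacement is an assumption about how the substitution behaves, not a description of the precipes implementation.

```python
# Command templates copied from the slide; "{}" stands for the tag of one run.
RECIPE = [
    "wget ftp://citizenfiles.gov/dosiers/{}.txt .",
    "grep 'gross' {}.txt > {}_gross.txt",
    "echo -n citizen {} is making ; cat {}_gross.txt ; echo a year.",
]


def instantiate(template, tag):
    """Replace every '{}' placeholder in a command template with the tag."""
    return template.replace("{}", tag)


# One independent list of commands per tag: the data-parallel runs.
commands = {tag: [instantiate(t, tag) for t in RECIPE] for tag in ["yves", "tom", "roel"]}

for tag, cmds in commands.items():
    print(tag, cmds)
```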

slide-9
SLIDE 9

parallel hello world recipe bis:

out of the box: data-parallel runs

yves: fetch dosier -> dosier -> extract gross income -> income -> report income
tom:  fetch dosier -> dosier -> extract gross income -> income -> report income
roel: fetch dosier -> dosier -> extract gross income -> income -> report income
. . .

slide-10
SLIDE 10

{ "stages" : {
    "A" : {
      "command" : "echo A for {}.",
      "out" : "A_done" },
    "B" : {
      "command" : "echo B for {}.",
      "out" : "B_finished" },
    "Bbis" : {
      "command" : "echo One more thing for B and {}.",
      "in" : "B_finished",
      "out" : "B_or_C_done" },
    "C" : {
      "command" : "echo C for {}.",
      "out" : "B_or_C_done" },
    "finish" : {
      "command" : "echo Done with A and B for {}.",
      "in" : ["A_done", "B_or_C_done"] }
  }
}

A A_done finish B B_finished Bbis B_or_C_done C
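
The graph drawn next to the JSON follows directly from the "in" and "out" fields. A minimal sketch of that derivation, emitting Graphviz DOT text: the function names `as_list` and `recipe_to_dot` are made up for this example, and the output format is an assumption, not what the precipes tool itself prints.

```python
import json

# The recipe from the slide, verbatim.
RECIPE_JSON = """
{ "stages" : {
    "A"      : { "command" : "echo A for {}.", "out" : "A_done" },
    "B"      : { "command" : "echo B for {}.", "out" : "B_finished" },
    "Bbis"   : { "command" : "echo One more thing for B and {}.",
                 "in" : "B_finished", "out" : "B_or_C_done" },
    "C"      : { "command" : "echo C for {}.", "out" : "B_or_C_done" },
    "finish" : { "command" : "echo Done with A and B for {}.",
                 "in" : ["A_done", "B_or_C_done"] }
} }
"""


def as_list(value):
    """'in' and 'out' may be missing, a single name, or a list of names."""
    if value is None:
        return []
    return [value] if isinstance(value, str) else list(value)


def recipe_to_dot(recipe):
    """Emit one edge per dependency: dependency -> stage for 'in', stage -> dependency for 'out'."""
    lines = ["digraph recipe {"]
    for name, stage in recipe["stages"].items():
        for dep in as_list(stage.get("in")):
            lines.append(f'  "{dep}" -> "{name}";')
        for dep in as_list(stage.get("out")):
            lines.append(f'  "{name}" -> "{dep}";')
    lines.append("}")
    return "\n".join(lines)


dot = recipe_to_dot(json.loads(RECIPE_JSON))
print(dot)
```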

slide-11
SLIDE 11

JSON parallel recipe:

$ ./precipes -p bpp.dot exome_best_practices_pipeline.json

{ "stages" : {
    "check_paired" : {
      "command" : "$CHECK_EXISTS $READS/{}_1.filt.fastq.gz",
      "out" : "has_paired_end_reads" },
    "fetch_unpaired" : {
      "command" : "$FETCH $READS/{}.filt.fastq.gz $LOCAL_DIR/{}.unpaired.fastq.gz",
      "out" : "unpaired.fastq.gz" },
    "fetch_paired_1" : {
      "command" : "$FETCH $READS/{}_1.filt.fastq.gz $LOCAL_DIR/{}.paired_1.fastq.gz",
      "in" : "has_paired_end_reads",
      "out" : "paired_1.fastq.gz" },
    "fetch_paired_2" : {
      "command" : "$FETCH $READS/{}_2.filt.fastq.gz $LOCAL_DIR/{}.paired_2.fastq.gz",
      "in" : "has_paired_end_reads",
      "out" : "paired_2.fastq.gz" },
    "alignment_paired" : {
      "command" : "\
        $BWA mem -R '@RG\\tID:Group1\\tLB:lib1\\tPL:illumina\\tSM:sample1' \
        -t $NUM_THREADS $REF/ucsc.hg19.fasta \
        $LOCAL_DIR/{}.paired_1.fastq.gz $LOCAL_DIR/{}.paired_2.fastq.gz \
        > $LOCAL_DIR/{}.paired.sam && rm $LOCAL_DIR/{}.paired_1.fastq.gz $LOCAL_DIR/{}.paired_2.fastq.gz",
      "in" : ["paired_1.fastq.gz", "paired_2.fastq.gz"],
      "out" : "paired.sam" },
    …

check_paired has_paired_end_reads fetch_paired_1 fetch_paired_2 check_no_paired no_paired_end_reads merge_bams_unpaired check_no_unpaired no_unpaired_end_reads merge_bams_paired fetch_unpaired unpaired.fastq.gz alignment_unpaired paired_1.fastq.gz alignment_paired paired_2.fastq.gz unpaired.sam sort_for_coordinate_order_unpaired paired.sam sort_for_coordinate_order_paired sorted_paired.bam merge_bams_paired_unpaired sorted_unpaired.bam sorted.bam remove_duplicates dedup.bam build_bam_index_1 realign_around_indels_1 realign_around_indels_2 dedup.bai intervals 7.bam build_bam_index_2 base_recalibrate_1 base_recalibrate_2 7.bai recal 8.bam 8.bai call_variants vcf vcfinocx

[1] G. A. Van der Auwera, M. O. Carneiro, C. Hartl, et al., "From FastQ data to high-confidence variant calls: the Genome Analysis Toolkit best practices pipeline," Curr. Protoc. Bioinform. 11.10.1-11.10.33, October 2013.

slide-12
SLIDE 12

Execution

bash$ ./precipes exome_best_practices_pipeline.json sample_{00..07}

.json -> ./precipes core, running on:

  • workstation
  • cluster
  • Amazon EC2
slide-13
SLIDE 13

Execution

bash$ ./precipes exome_best_practices_pipeline.json sample_{00..07}

.json -> ./precipes core:

add_stage( "fetch_paired_1", "$FETCH $READS/…", { "has_paired_end_reads" }, { "paired_1.fastq.gz" } );
add_stage( "check_paired", "test -f …", { }, { "has_paired_end_reads" } );
add_stage( … );

slide-14
SLIDE 14

Execution

bash$ ./precipes exome_best_practices_pipeline.json sample_{00..07}

// start running samples in parallel
for( int i = 2; i < argc; ++i )
    pipeline.run( argv[i], i-2 );

pipeline.tags.put( "sample_00" )
pipeline.tags.put( "sample_01" )
…

pipeline.wait()
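
The run-then-wait pattern on this slide can be imitated with Python's standard thread pool. This is a loose analogy only: `run_sample` is a hypothetical stand-in that runs a single `echo` instead of a real pipeline, and a thread pool is not the CnC runtime.

```python
import subprocess
from concurrent.futures import ThreadPoolExecutor


def run_sample(tag):
    """Run a stand-in one-stage pipeline for one sample tag and return (tag, output)."""
    result = subprocess.run(
        f"echo processing {tag}", shell=True, capture_output=True, text=True, check=True
    )
    return tag, result.stdout.strip()


# Same tags that the shell expands sample_{00..07} to.
samples = [f"sample_{i:02d}" for i in range(8)]

with ThreadPoolExecutor(max_workers=2) as pool:
    # Submitting one task per tag plays the role of pipeline.tags.put( tag ).
    futures = [pool.submit(run_sample, tag) for tag in samples]
    # Collecting every result plays the role of pipeline.wait().
    results = dict(future.result() for future in futures)

print(results)
```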

slide-15
SLIDE 15

Exome Best Practices Scaling Experiment

parallel scaling experiment: 32 samples from g1k NA12878

Runtime (plot y-axis: 1d to 7d)

# compute nodes    1 worker thread    2 worker threads
1                  158h 7m            80h 31m
2                  79h 21m            41h 21m
4                  40h 21m            21h 38m
8                  21h 7m             12h 20m

slide-16
SLIDE 16

Scaling Efficiency : single fat node

(exome best practices, 32 samples)

# workers    runtime      efficiency
1            336h 19m     100%
2            171h 12m     98.224%
4            87h 15m      96.369%
8            44h 7m       95.285%
16           25h 13m      83.361%
24           19h 22m      72.38%
32           15h 12m      69.132%
64           14h 55m      46.928%

slide-17
SLIDE 17

Scaling Efficiency : cluster

Scaling Efficiency : 1 worker

# compute nodes    runtime      efficiency
1                  158h 7m      100.00%
2                  79h 21m      99.63%
4                  40h 21m      97.97%
8                  21h 7m       93.60%

Scaling Efficiency : 2 workers

# compute nodes    runtime      efficiency
1                  80h 31m      100.00%
2                  41h 21m      97.36%
4                  21h 38m      93.05%
8                  12h 20m      81.60%
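
The slides do not state how efficiency is computed. The cluster figures appear consistent with the usual strong-scaling definition, efficiency = T(1) / (n × T(n)), and the sketch below applies that assumed formula to the 1-worker runtimes. The helper names `minutes` and `efficiency` are made up for this example.

```python
def minutes(text):
    """Convert a runtime such as '79h 21m' to minutes."""
    hours, mins = text.replace("h", "").replace("m", "").split()
    return int(hours) * 60 + int(mins)


def efficiency(base, runtime, nodes):
    """Assumed definition: single-node runtime divided by (nodes x runtime on that many nodes)."""
    return minutes(base) / (nodes * minutes(runtime))


# Runtimes for 1 worker per node, keyed by number of compute nodes.
ONE_WORKER = {1: "158h 7m", 2: "79h 21m", 4: "40h 21m", 8: "21h 7m"}

table = {
    nodes: round(100 * efficiency(ONE_WORKER[1], runtime, nodes), 2)
    for nodes, runtime in ONE_WORKER.items()
}
print(table)
```

The same function can be pointed at the 2-worker runtimes by swapping in that column.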

slide-18
SLIDE 18

execution trace: 32 samples, 4 nodes, 2 workers


slide-19
SLIDE 19

Next!

Common Workflow Language¹ (CWL) integration

{ …
  "run": {
    "inputs": [
      { "inputBinding": { "position": 1, "prefix": "--reverse" },
        "type": "boolean",
        "id": "#reverse" },
      { "inputBinding": { "position": 2 },
        "type": "File",
        "id": "#input" }
    ],
    …
    "class": "Workflow" }

core, running on:

  • workstation
  • cluster
  • amazon ec2

Shoutout to BOSC CodeFest2015!

¹ https://github.com/common-workflow-language/common-workflow-language

slide-20
SLIDE 20


check_paired has_paired_end_reads fetch_paired_1 fetch_paired_2 split paired_1.fastq.gz paired_2.fastq.gz chunked.paired check_no_paired no_paired_end_reads split chunked.unpaired fetch_unpaired unpaired.fastq.gz alignment_unpaired chunked.unpaired.sai unpaired_sai_to_bam chunked.unpaired.bam join unpaired.bam remove_unaligned_unpaired_reads aligned.unpaired.bam alignment_paired_1 alignment_paired_2 chunked.paired_1.sai chunked.paired_2.sai paired_1_sai_to_bam chunked.paired_1.bam paired_2_sai_to_bam chunked.paired_2.bam sort_paired_1 chunked.sorted.paired_1.bam sort_paired_2 chunked.sorted.paired_2.bam join paired.bam remove_unaligned_paired_reads aligned.paired.bam merge_bams_paired_unpaired 2.bam count count

Next!

advanced workflow coordination: SplitJoin construct in precipes

  • easy thanks to CnC coordination
  • front-end language bottleneck

slide-21
SLIDE 21

Execution: SplitJoin support

SplitJoin construct in precipes

  • easy thanks to CnC coordination
  • front-end language bottleneck

… "splitjoin" : {
    "split" : {
      "command" : "zcat -v $LOCAL_DIR/{}.unpaired.fastq.gz \
        | split -d -l 40000000 \
        --filter='echo \"writing $FILE.unpaired.fastq.gz\" ; \
        gzip --best -c - > $FILE.unpaired.fastq.gz' \
        - $LOCAL_DIR/{}.chunk_",
      "in" : ["first", "second"],
      "fanout" : "chunks_in" },
    "count" : "ls $LOCAL_DIR/{}.*.unpaired.fastq.gz | wc -l",
    "stages" : {
      "process_chunk" : {
        "command" : "echo processing chunk {}_##",
        "in" : "chunks_in",
        "out" : "chunks_out" },
      "join" : {
        "command" : "$SAMTOOLS merge -f $LOCAL_DIR/{}.unpaired.bam \
          @($LOCAL_DIR/{}.chunk_##.sorted.unpaired.bam)",
        "fanin" : "chunks_out",
        "out" : "splitjoin_finished" } },
  …
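
To make the fan-out concrete, the sketch below expands a per-chunk command for every chunk index. It rests on guesses that the slide does not confirm: that `##` stands for the chunk index, that the index is zero-padded to two digits (matching the default numeric suffixes of `split -d`), and that `@( … )` in the join command expands to the list of all chunk files. `expand_chunks` and `join_inputs` are hypothetical helpers.

```python
def expand_chunks(template, tag, count):
    """Instantiate a per-chunk template: '{}' becomes the sample tag, '##' the chunk index."""
    return [
        template.replace("{}", tag).replace("##", f"{i:02d}")  # two-digit index is an assumption
        for i in range(count)
    ]


def join_inputs(pattern, tag, count):
    """Guess at what '@(pattern)' expands to: every chunk file, separated by spaces."""
    return " ".join(expand_chunks(pattern, tag, count))


# 'count' would come from the recipe's count command; 3 is a made-up value.
cmds = expand_chunks("echo processing chunk {}_##", "sample_00", 3)
inputs = join_inputs("$LOCAL_DIR/{}.chunk_##.sorted.unpaired.bam", "sample_00", 3)

print(cmds)
print(inputs)
```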

slide-22
SLIDE 22

Execution: advanced coordination expression problem

normal coordination:

  • for dep in in_deps
      get( dep, N/A )
  • if success?( system(command_string) )
      for dep in out_deps
        put( dep, N/A )

fanout/split coordination:

  • normal_coordination()
  • put( count, popen(count_command) )
  • for dep in fanout_deps
      for i in count
        put( dep, i )

slide-23
SLIDE 23

Execution: advanced coordination expression problem

normal coordination (steps as above)
  coordination: success/fail

slide-24
SLIDE 24

Execution: advanced coordination expression problem

normal coordination (steps as above)
  coordination: success/fail

fanout/split coordination (steps as above)
  coordination: success/fail, #outputs
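
The two coordination patterns can be written out as runnable Python. This is a sketch under stated assumptions: `put` and `get` here are toy dictionary operations, not the CnC++ API, and a real CnC `get` would wait for a missing item rather than raise. The commands passed in at the bottom are placeholders.

```python
import subprocess
from collections import defaultdict

# Toy item collections: collection name -> {tag: value}.
items = defaultdict(dict)


def put(collection, tag, value=None):
    items[collection][tag] = value


def get(collection, tag):
    return items[collection][tag]  # raises KeyError if the dependency is not satisfied


def normal_coordination(command, in_deps, out_deps, tag):
    """Needs only success/fail from the command."""
    for dep in in_deps:
        get(dep, tag)
    ok = subprocess.run(command, shell=True).returncode == 0
    if ok:
        for dep in out_deps:
            put(dep, tag)
    return ok


def fanout_coordination(command, count_command, in_deps, fanout_deps, tag):
    """Needs success/fail plus the number of outputs, read from a second command."""
    if not normal_coordination(command, in_deps, [], tag):
        return 0
    out = subprocess.run(count_command, shell=True, capture_output=True, text=True).stdout
    count = int(out.strip())
    put("count", tag, count)
    for dep in fanout_deps:
        for i in range(count):
            put(dep, (tag, i))  # one item per chunk
    return count


n = fanout_coordination("echo splitting", "echo 3", [], ["chunks_in"], "sample_00")
print(n, sorted(items["chunks_in"]))
```

The extra `count_command` is the piece a plain command/in/out stage cannot express, which is the expression problem these slides point at.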

slide-25
SLIDE 25

Coordination languages and their Significance

We can build a complete programming model out of two separate pieces: the computation model and the coordination model. The computation model allows programmers to build a single computational activity: a single-threaded, step-at-a-time computation. The coordination model is the glue that binds separate activities into an ensemble. An ordinary computation language (e.g., Fortran) embodies some computation model. A coordination language embodies a coordination model; it provides operations to create computational activities and to support communication among them. Our approach to coordination has been developed in the framework of a system called Linda. Linda is not a programming language. Kahn and Miller write that "Linda is best not thought of as a language, but rather as an extension that can be added to nearly any language to enable process creation, communication, and synchronization [27]." We would rather say that Linda is a coordination language. It is one of two components that together make up a complete programming language. (The suggestion that traditional programming languages are incomplete is intentional.) A computation model and a coordination model might also be separated into two distinct languages, in which case programmers choose one of each: one computation language plus …

Execution: advanced coordination expression problem

if only we had a coordination language

custom stage/keyword for each construct?

slide-26
SLIDE 26

Execution — advanced coordination expression problem

we do: Concurrent Collections!

bash$ put(chunk_count, `ls *.foo | wc -l`)

slide-27
SLIDE 27

Next?

Execution: advanced workflow coordination

  • easy thanks to CnC coordination
  • front-end language bottleneck
      common: special construct for each
      ideal: bash$ put(chunk_count, `ls *.foo | wc -l`)
  • not just split/join: recursion, groupby, reduce, streaming, …
  • Use CnC to coordinate with client/peer applications
  • client/peer coordination bottleneck

slide-28
SLIDE 28

Next?

Mesos¹ integration

  • resource management
  • deployment
  • towards resource-aware scheduling

core, running on:

  • workstation
  • cluster
  • amazon ec2

¹ http://mesos.apache.org/

slide-29
SLIDE 29

precipes : Parallel Recipes

  • simple parallel & distributed scripting
  • CnC is a powerful coordination & execution engine!
  • CWL integration for wide-range acceptance

https://github.com/yvdriess/precipes

slide-30
SLIDE 30

Thank You!
