Parallel and Memory-efficient Preprocessing for Metagenome Assembly
Vasudevan Rengasamy Paul Medvedev Kamesh Madduri
School of EECS The Pennsylvania State University {vxr162, pashadag, madduri}@cse.psu.edu HiCOMB 2017
◮ Multipass approach: enumerate only a subset of k-mers in each pass.
◮ e.g., 10 passes ⇒ 10× memory reduction.
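The multipass idea can be sketched as follows. This is a minimal illustration, not the authors' implementation: the function names and the CRC-based rule that assigns a k-mer to a pass are assumptions made for the example. Each pass rescans the reads and keeps only the k-mers assigned to it, so the table held in memory at any time covers roughly 1/num_passes of the distinct k-mers if the assignment is even.

```python
import zlib
from collections import Counter

def kmers(read, k):
    """Yield every k-mer (substring of length k) of a read."""
    for i in range(len(read) - k + 1):
        yield read[i:i + k]

def pass_of(kmer, num_passes):
    """Hypothetical assignment rule: a deterministic hash picks the pass."""
    return zlib.crc32(kmer.encode()) % num_passes

def multipass_count(reads, k, num_passes):
    """Count k-mers in num_passes scans, holding one subset at a time."""
    totals = Counter()   # kept only so the demo can check the result
    peak = 0             # largest per-pass table seen
    for p in range(num_passes):
        table = Counter()
        for read in reads:
            for km in kmers(read, k):
                if pass_of(km, num_passes) == p:
                    table[km] += 1
        peak = max(peak, len(table))
        totals.update(table)  # a real pipeline would consume and discard table
    return totals, peak
```

The trade-off is that the input is scanned once per pass, so memory is exchanged for extra I/O and k-mer generation time.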
[Figure: pipeline diagram. Labels: Input: FASTQ files; IndexCreate; KmerHist; FASTQPart; KmerGen; KmerGen-Comm; LocalSort; LocalCC; MergeCC; Multiple Passes; Output: FASTQ files.]
◮ k-mers are partially sorted.
[Figure: send buffer layout. Labels: To MPI Task 1 ... To MPI Task P; Thread 1 offset ... Thread T offset.]
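One way to read the buffer figure is that each thread gets its own write offset inside the region destined for each MPI task. The sketch below illustrates that reading under stated assumptions: the layout (task-major, then thread), the function names, and the `owner` rule are all hypothetical, and the slides do not confirm them. Counting first and taking a running sum gives every thread a private slot range, so threads can fill the buffer without locks and the result is grouped by destination.

```python
def thread_offsets(counts):
    """counts[t][p] = tuples thread t sends to task p.
    Returns (offsets, total), where offsets[t][p] is the first buffer index
    thread t may write for task p."""
    num_threads, num_tasks = len(counts), len(counts[0])
    offsets = [[0] * num_tasks for _ in range(num_threads)]
    pos = 0
    for p in range(num_tasks):          # one contiguous region per task
        for t in range(num_threads):    # sub-region per thread inside it
            offsets[t][p] = pos
            pos += counts[t][p]
    return offsets, pos

def fill_send_buffer(per_thread_tuples, owner, num_tasks):
    """Place every tuple in the region of the task that owns it."""
    counts = [[0] * num_tasks for _ in per_thread_tuples]
    for t, tuples in enumerate(per_thread_tuples):
        for tup in tuples:
            counts[t][owner(tup)] += 1
    offsets, total = thread_offsets(counts)
    buf = [None] * total
    cursor = [row[:] for row in offsets]
    # Each iteration of this loop touches disjoint slots, so it could be
    # executed by its own thread.
    for t, tuples in enumerate(per_thread_tuples):
        for tup in tuples:
            p = owner(tup)
            buf[cursor[t][p]] = tup
            cursor[t][p] += 1
    return buf, offsets
```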
◮ Reuse send buffer ⇒ No additional memory.
◮ Partition tuples into T disjoint ranges.
◮ Sort ranges in parallel using T threads.
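The partition-then-sort step can be sketched as below. This is an illustration under assumptions, not the authors' code: keys are assumed to be integers, the equal-width range split and the function name are hypothetical, and unlike the slide's approach this sketch copies tuples into per-range lists instead of reusing the send buffer. Because the ranges are disjoint and ordered, concatenating the sorted ranges yields a fully sorted sequence.

```python
from concurrent.futures import ThreadPoolExecutor

def parallel_range_sort(tuples, num_threads, key=lambda t: t[0]):
    """Split tuples into num_threads disjoint key ranges, sort each range
    in its own worker, and concatenate the results."""
    if not tuples:
        return []
    keys = [key(t) for t in tuples]
    lo, hi = min(keys), max(keys)
    width = (hi - lo) // num_threads + 1      # equal-width integer ranges
    ranges = [[] for _ in range(num_threads)]
    for t in tuples:
        ranges[(key(t) - lo) // width].append(t)
    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        sorted_ranges = list(pool.map(lambda r: sorted(r, key=key), ranges))
    return [t for r in sorted_ranges for t in r]
```

In CPython the threads here show the structure rather than a real speedup, since the global interpreter lock serializes the sorts. Equal-width ranges also balance poorly on skewed keys; a histogram of keys would be one way to choose better splitters.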
[Figure: disjoint-set forest over the elements 10, 8, 2, 6, 20, 5, shown before and after Union(6, 5) with union-by-index and before and after Find(6) with path splitting.]
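The two operations named in the figure can be sketched as follows. Path splitting is the standard technique in which every node visited by Find is re-pointed to its grandparent. For union-by-index, this sketch assumes that the root with the larger index is attached under the root with the smaller index; the slides do not state the direction, so treat that choice as an assumption.

```python
def find(parent, x):
    """Return the root of x, re-pointing each visited node to its
    grandparent along the way (path splitting)."""
    while parent[x] != x:
        nxt = parent[x]
        parent[x] = parent[nxt]   # skip one level
        x = nxt
    return x

def union(parent, u, v):
    """Merge the sets of u and v; return True if a merge happened.
    Assumed rule: the larger-index root is linked under the smaller one."""
    ru, rv = find(parent, u), find(parent, v)
    if ru == rv:
        return False
    if ru < rv:
        ru, rv = rv, ru
    parent[ru] = rv
    return True
```

A short usage example: starting from `parent = list(range(6))`, the calls `union(parent, 0, 1)`, `union(parent, 2, 3)`, and `union(parent, 1, 3)` leave elements 0 to 3 in one set rooted at 0, and a following `find(parent, 3)` shortens the path so that element 3 points directly at 0.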
◮ Store edges that merge components (similar to
◮ Process edges again in case of lost updates.
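The two bullets above can be illustrated with a small sketch. It assumes an unsynchronized first pass in which several threads link roots without locks, so one thread's link can be overwritten by another (a lost update). Every edge that caused a link is stored, and the stored edges are then processed again. The sequential second pass, the function names, and the union-by-index direction are assumptions made for this example, not details taken from the slides.

```python
from concurrent.futures import ThreadPoolExecutor

def find(parent, x):
    """Root lookup with path splitting."""
    while parent[x] != x:
        nxt = parent[x]
        parent[x] = parent[nxt]
        x = nxt
    return x

def try_union(parent, u, v):
    """Link two roots without locking; return True if a link was written."""
    ru, rv = find(parent, u), find(parent, v)
    if ru == rv:
        return False
    if ru < rv:
        ru, rv = rv, ru
    parent[ru] = rv   # unsynchronized: a concurrent writer may overwrite this
    return True

def connected_components(num_vertices, edges, num_threads=4):
    parent = list(range(num_vertices))
    chunks = [edges[i::num_threads] for i in range(num_threads)]

    def worker(chunk):
        # Keep only the edges that actually merged two components.
        return [e for e in chunk if try_union(parent, *e)]

    with ThreadPoolExecutor(max_workers=num_threads) as pool:
        merging = [e for part in pool.map(worker, chunks) for e in part]

    # Second pass: replay the stored edges so that any lost link is redone.
    for u, v in merging:
        try_union(parent, u, v)
    return [find(parent, x) for x in range(num_vertices)], merging
```

Only the merging edges need to be replayed: an edge that was skipped in the first pass was already connected through links, and each of those links was created by a stored edge.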
[Figure: merging of per-task component data. Labels: tasks P0, P1, P2, P3; reads R1, R2, R3, R4; steps 0, 1, 2.]
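If the figure is read as a pairwise merge of per-task results over successive steps, the idea can be sketched as a binary-tree reduction over disjoint-set forests. Everything here is an assumption for illustration: the tree-shaped schedule, the function names, and the simulation of tasks as entries of a Python list instead of MPI ranks. Merging one forest into another only requires uniting each read with its parent in the incoming forest.

```python
def find(parent, x):
    """Root lookup with path splitting."""
    while parent[x] != x:
        nxt = parent[x]
        parent[x] = parent[nxt]
        x = nxt
    return x

def union(parent, u, v):
    """Link the larger-index root under the smaller-index root."""
    ru, rv = find(parent, u), find(parent, v)
    if ru != rv:
        if ru < rv:
            ru, rv = rv, ru
        parent[ru] = rv

def local_components(num_reads, edges):
    """Forest built by one task from the edges it saw locally."""
    parent = list(range(num_reads))
    for u, v in edges:
        union(parent, u, v)
    return parent

def merge_into(mine, other):
    """Apply every parent link of the incoming forest to my forest."""
    for read, p in enumerate(other):
        if p != read:
            union(mine, read, p)

def tree_merge(forests):
    """Pairwise merge: at each step, task i absorbs task i + step."""
    step = 1
    while step < len(forests):
        for i in range(0, len(forests) - step, 2 * step):
            merge_into(forests[i], forests[i + step])
        step *= 2
    return forests[0]
```

With P tasks this schedule finishes in about log2(P) steps, and the final forest ends up on the first task.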
◮ Each node has 2× 12-core Ivy Bridge processors and 64 GB
[Figure: two plots of execution time (seconds) versus number of nodes (1, 2, 4, 8, 16), broken down by step: KmerGen-I/O, KmerGen, KmerGen-Comm, LocalSort, LocalCC-Opt, Merge-Comm, MergeCC, CC-I/O; speedup is also shown.]
[Figure: execution time (seconds) on 16 and 64 nodes, broken down by step: KmerGen-I/O, KmerGen, KmerGen-Comm, LocalSort, LocalCC-Opt, Merge-Comm, MergeCC, CC-I/O; bars are annotated 1X and 3.25X.]
[Figure: execution time (seconds) on the HG, LL, and MM datasets. First panel: MetaPrep and KMC-2, annotated 1.56X (HG), 1.76X (LL), 1.57X (MM). Second panel: KMC-2 and MetaPrep16, annotated 2.72X (HG), 3.18X (LL), 6.76X (MM).]
[Figure: largest component size (%) on the HG dataset for k=27 and k=63, under three filters: None, KF < 30, and 10 ≤ KF < 30.]
◮ Splitting components using filters impacts assembly quality.
◮ Does scaffolding help in improving assembly quality?
MEGAHIT assembly output statistics

Dataset  Type        Contigs  Total (Mbp)   Max (bp)  N50 (bp)
HG       No Preproc   63 519       116.19    217 183      5071
         No Filter    63 483       116.18    217 183      5098
           LC         58 770       113.83    217 183      5510
           Other        4713         2.35       2860       513
         KF < 30      64 571       119.01    217 183      5123
           LC         56 732       110.13    217 183      5687
           Other        7839         8.87     43 863      2271
LL       No Preproc  179 828       165.63    225 770      1273
         No Filter   181 751       166.67    225 805      1263
           LC        141 136       148.75    225 805      1593
           Other      40 615         17.9       4028       432
         KF < 30     182 717       168.42    225 770      1275
           LC        140 081       147.51    225 770      1587
           Other      42 636        20.90     43 718       465
MM       No Preproc   24 931       203.65  1 067 762    50 607
         No Filter    25 002       203.65  1 067 762    50 550
           LC         23 959       202.99  1 067 762    50 781
           Other        1043         0.66       5788       695
         KF < 30      40 632       208.24    611 608    23 126
           LC         26 233       156.04    611 608    28 135
           Other      14 399        52.19    591 560    12 285
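For readers unfamiliar with the N50 column: N50 is the contig length L such that contigs of length at least L together cover at least half of the total assembly length. A minimal sketch of that standard definition follows; the function name is chosen for this example and the code is not part of MEGAHIT or MetaPrep.

```python
def n50(contig_lengths):
    """Return the N50 of a list of contig lengths (0 for an empty list)."""
    total = sum(contig_lengths)
    running = 0
    # Walk contigs from longest to shortest until half the total is covered.
    for length in sorted(contig_lengths, reverse=True):
        running += length
        if 2 * running >= total:
            return length
    return 0
```

For example, lengths 2 through 10 sum to 54, and the three longest contigs (10, 9, 8) already cover 27, so the N50 is 8.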