Accelerating Parallel Analysis of Scientific Simulation Data via Zazen
Tiankai Tu, Charles A. Rendleman, Patrick J. Miller, Federico Sacerdoti, Ron O. Dror, and David E. Shaw
D. E. Shaw Research
Motivation
Goal: To model biological
Achievement)
[Figure: plot with two series, Ion A and Ion B; axis labels not recoverable]
[Figure labels: Above channel; Inside channel; Below channel; Into channel from above; Into channel from below]
[Figure: the Ion A / Ion B plot, shown again on three consecutive slides]
[Figure: MapReduce data flow. Input files on the Google File System feed three map(...) tasks, which emit K1: {v1i}, K2: {v2i}; K1: {v1j}, K2: {v2j}; and K1: {v1k}, K2: {v2k}. The values are grouped by key into K1: {v1j, v1i, v1k} and K2: {v2k, v2j, v2i}. reduce(K1, ...) and reduce(K2, ...) each write an output file.]
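The data flow in the MapReduce figure can be sketched in a few lines of Python. This is only an illustrative, single-process sketch: `run_mapreduce`, `map_record`, and the sample records are hypothetical names chosen here, not part of HiMach, Zazen, or Google's implementation. The key and value labels follow the figure.

```python
from collections import defaultdict

def map_record(record):
    """Hypothetical map step: emit one (key, value) pair per entry of a record."""
    for key, value in record.items():
        yield key, value

def run_mapreduce(inputs, map_fn, reduce_fn):
    """Apply map_fn to every input, group values by key, then reduce each group."""
    groups = defaultdict(list)
    for record in inputs:
        for key, value in map_fn(record):
            groups[key].append(value)  # the "shuffle": collect values per key
    return {key: reduce_fn(key, values) for key, values in groups.items()}

# Three inputs, one per map(...) task in the figure (values are placeholders).
inputs = [
    {"K1": "v1i", "K2": "v2i"},
    {"K1": "v1j", "K2": "v2j"},
    {"K1": "v1k", "K2": "v2k"},
]

# A trivial reduce that just returns the sorted group of values for each key.
result = run_mapreduce(inputs, map_record, lambda key, values: sorted(values))
print(result)
```

Each key ends up with the values emitted for it by all three map tasks, which is the grouping the figure shows before the reduce calls.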
[Figure: two analysis nodes, each with local disks]
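[Figure: bitmap exchange between analysis node 0 and analysis node 1, with an NFS server. Each node has a local directory tree under /bodhi holding files for sim0 and sim1 (labels: seq, f0, f1, f2, f3). Steps are numbered 1 to 3. One node shows local bitmap 0 1 0 1, remote bitmap 1 0 1 0, merged bitmap 1 1 1 1; the other shows local bitmap 1 0 1 0, remote bitmap 0 1 0 1, merged bitmap 1 1 1 1.]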
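The bitmap values in the figure are consistent with a merged bitmap that is the bitwise OR of the local and remote bitmaps. The sketch below assumes that reading, and also assumes that a 1 in position i means frame file i is cached on some node. `merge_bitmaps` and `uncached_frames` are hypothetical helper names, not functions of the Bodhi library.

```python
def merge_bitmaps(*bitmaps):
    """Bitwise OR of equal-length bitmaps, given as lists of 0/1 (assumed merge rule)."""
    if not bitmaps:
        raise ValueError("at least one bitmap is required")
    length = len(bitmaps[0])
    if any(len(b) != length for b in bitmaps):
        raise ValueError("all bitmaps must have the same length")
    return [int(any(bits)) for bits in zip(*bitmaps)]

def uncached_frames(merged):
    """Indices whose merged bit is 0, i.e. frames no node reports as cached."""
    return [i for i, bit in enumerate(merged) if bit == 0]

# Values taken from the figure.
local = [0, 1, 0, 1]
remote = [1, 0, 1, 0]
merged = merge_bitmaps(local, remote)
print(merged)                   # [1, 1, 1, 1]
print(uncached_frames(merged))  # []

# A made-up case where two frames are cached on neither node.
partial = merge_bitmaps([1, 0, 0, 0], [0, 0, 1, 0])
print(uncached_frames(partial))  # [1, 3]
```

Under this reading, any index left at 0 after the merge marks a frame that no analysis node holds locally and that would have to come from somewhere else, such as the file servers.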
[Figure: system architecture. Components shown: parallel supercomputer with an I/O node; file servers; Zazen cluster of analysis nodes; Bodhi servers; Bodhi library; Zazen protocol; parallel analysis programs (HiMach jobs).]
[Figure: Time (s) vs. number of nodes (1 to 128)]
[Figure: Time (s) vs. number of frames, logarithmic axes]
[Figure: two panels, each showing GB/s vs. application read processes per node (1, 2, 4, 8); series by file size: 1-GB, 256-MB, 64-MB, 2-MB]
SATA disks organized in RAID 6
equal to file sizes, three replications per file
(with a number of best-effort optimizations)
[Figure: GB/s by file size (2 MB, 64 MB, 256 MB, 1 GB) for NFS, PVFS2, Hadoop/HDFS, and Zazen]
[Figure: logarithmic-scale chart by file size (2 MB, 64 MB, 256 MB, 1 GB) for NFS, PVFS2, Hadoop/HDFS, and Zazen; vertical axis label not recoverable]
[Figure: chart by file size (2 MB, 64 MB, 256 MB, 1 GB) with series: no writes, 1 GB files, 256 MB files, 64 MB files, 2 MB files; vertical axis from 10 to 100, label not recoverable]
[Figure: Time (s), logarithmic scale, vs. application processes per node (1, 2, 4, 8) for NFS, Zazen, and Memory]
[Figure: Time (s) vs. node failure rate (0% to 50%); series: theoretical worst case, actual running time]