Parallel accelerator simulations
past, present and future
James Amundson
Fermilab
November 21, 2011
James Amundson (Fermilab) Parallel accelerator simulations November 21, 2011 1 / 29
[Figure: time [sec] (200 to 1400) vs. number of cores (500 to 2500); curves: actual, ideal]
[Figure: normalized time (log scale) vs. procs (1 to 256); curves: ideal, "real"]
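The scaling plots compare measured times against an "ideal" curve. A minimal sketch of how such a curve and the corresponding parallel efficiency are commonly computed, assuming ideal means perfect strong scaling from a reference run (the slides do not state the definition used); the timings below are made up for illustration and are not taken from the talk:

```python
def ideal_time(t_ref, p_ref, p):
    """Time at p processes if the run scaled perfectly from the reference point."""
    return t_ref * p_ref / p


def parallel_efficiency(t_ref, p_ref, t, p):
    """Ratio of ideal to measured time; 1.0 means perfect strong scaling."""
    return ideal_time(t_ref, p_ref, p) / t


# Hypothetical timings, for illustration only (not data from the slides).
procs = [1, 2, 4, 8]
measured = [100.0, 52.0, 28.0, 16.0]

# Normalize to the smallest run, as in a "normalized time" plot.
normalized = [t / measured[0] for t in measured]
ideal = [ideal_time(1.0, procs[0], p) for p in procs]
efficiency = [
    parallel_efficiency(measured[0], procs[0], t, p)
    for t, p in zip(measured, procs)
]

for p, n, i, e in zip(procs, normalized, ideal, efficiency):
    print(f"procs={p:3d}  normalized={n:.3f}  ideal={i:.3f}  efficiency={e:.2f}")
```

On a log-log plot the ideal curve is a straight line of slope -1, so the gap between the two curves directly shows the lost efficiency.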
[Figure: time [s] (10^1 to 10^2) vs. cores (2^3 to 2^7), log-log; curves: total, sc-get-global-rho, independent-operation-aperture, sc-get-phi2, sc-get-global-en, sc-apply-kick, sc-get-local-rho]
[Figure: time [s] (log scale) vs. cores (2^3 to 2^7); curves: kick time before optimization, kick time after optimization]
[Figure: time [s] (log scale) vs. nodes (powers of 2, up to 2^4); curves: reduce_scatter 8 cores/node, allreduce 8 cores/node, reduce_scatter 12 cores/node, allreduce 12 cores/node]
[Figure: time [s] (log scale) vs. nodes (powers of 2, up to 2^4); curves: gatherv bcast 8 cores/node, allgatherv 8 cores/node, allreduce 8 cores/node, gatherv bcast 12 cores/node, allgatherv 12 cores/node, allreduce 12 cores/node]
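The legends of these two communication plots name MPI collectives. As a reminder of what each one leaves on every rank, here is a pure-Python model, assuming a sum reduction and using no MPI at all: each "rank" is just an entry in a list, and the function names are illustrative, not the MPI API. It does not reproduce the timings and says nothing about how the benchmarked code calls these operations.

```python
def allreduce_sum(local_arrays):
    """Every rank ends up with the full elementwise sum."""
    total = [sum(vals) for vals in zip(*local_arrays)]
    return [list(total) for _ in local_arrays]


def reduce_scatter_sum(local_arrays, counts):
    """Rank r ends up with only its counts[r]-long slice of the elementwise sum."""
    total = [sum(vals) for vals in zip(*local_arrays)]
    out, start = [], 0
    for c in counts:
        out.append(total[start:start + c])
        start += c
    return out


def allgatherv(local_slices):
    """Every rank ends up with the concatenation of all (variable-length) slices."""
    full = [x for s in local_slices for x in s]
    return [list(full) for _ in local_slices]


# Three model ranks, each holding a length-4 contribution (made-up values).
local = [[1, 2, 3, 4], [10, 20, 30, 40], [100, 200, 300, 400]]
counts = [2, 1, 1]  # how many summed elements each rank keeps

everywhere = allreduce_sum(local)
slices = reduce_scatter_sum(local, counts)
regathered = allgatherv(slices)
```

In this model, a reduce_scatter followed by an allgatherv leaves every rank with the same data as a single allreduce, and a gatherv to one rank followed by a bcast gives the same end state as allgatherv. The alternatives differ in communication pattern and volume, which is what a timing comparison of this kind measures.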
[Figure: time [s] (log scale) vs. nodes (powers of 2, up to 2^4); best pre-opt: 74.9, best post-opt: 45.0; curves: (not optimized) 8 cores/node openmpi, (not optimized) 12 cores/node openmpi, (optimized) 8 cores/node openmpi]
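The two best times quoted on this plot can be turned into a speedup figure. A short worked calculation, assuming both values are in seconds, as the axis label suggests:

```python
best_pre_opt = 74.9   # best time before optimization, from the slide
best_post_opt = 45.0  # best time after optimization, from the slide

speedup = best_pre_opt / best_post_opt
time_reduction = 1.0 - best_post_opt / best_pre_opt

print(f"speedup: {speedup:.2f}x, time reduction: {time_reduction:.1%}")
# prints "speedup: 1.66x, time reduction: 39.9%"
```

So the optimization cuts the best achievable run time by roughly 40%.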
[Figure: time (10^1 to 10^3, log scale) vs. cores (1 to 32); curves: independent-operation-apply, sc-get-green-fn, sc-apply-kick, sc-get-phi2, sc-get-local-rho, independent-operation-aperture]
[Figure: time (10^1 to 10^4, log scale) vs. procs (1 to 1024); curves: cxx_benchmark/mpi-amd, cxx_benchmark/mixed32-amd]
[Figure: time (10^1 to 10^4, log scale) vs. procs (1 to 256); curves: cxx_benchmark/mpi-amd, predicted]