Performance Analysis of MPI+OpenMP Programs with HPCToolkit
John Mellor-Crummey Department of Computer Science Rice University
http://hpctoolkit.org/slides/hpctoolkit-og15.pdf
Rice Oil & Gas HPC Workshop March 2015
– dynamic binaries on clusters
– static binaries on supercomputers
– batch jobs
– correlate measurements with code for actionable results
– support analysis at the desired level
  – intuitive enough for application scientists and engineers
  – detailed enough for library developers and compiler writers
[hpcrun]
[hpcstruct]
[hpcprof/hpcprof-mpi]
[hpcviewer/hpctraceviewer]
[Figure: call path sample, consisting of the instruction pointer followed by a chain of return addresses]
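A call path sample of this kind (an instruction pointer plus the chain of return addresses above it) is typically aggregated into a calling context tree, so that samples sharing a caller chain share a prefix. The following is a minimal sketch of that idea, not HPCToolkit's implementation: the `CCTNode` class, the `record_sample` helper, and the sample data are all hypothetical, and frames are shown as function names rather than raw addresses.

```python
from dataclasses import dataclass, field


@dataclass
class CCTNode:
    """One frame in a calling context tree (hypothetical structure)."""
    name: str
    samples: int = 0  # exclusive: samples whose innermost frame is this node
    children: dict = field(default_factory=dict)

    def child(self, name):
        # Reuse an existing child so identical call paths share a prefix.
        if name not in self.children:
            self.children[name] = CCTNode(name)
        return self.children[name]

    def inclusive(self):
        # Inclusive cost = own samples plus everything called from here.
        return self.samples + sum(c.inclusive() for c in self.children.values())


def record_sample(root, call_path):
    """Insert one unwound call path, ordered outermost caller first."""
    node = root
    for frame in call_path:
        node = node.child(frame)
    node.samples += 1


# Made-up samples. Each unwind is listed innermost frame first
# (instruction pointer, then return addresses), so reverse before inserting.
unwinds = [
    ["compute", "solve", "main"],
    ["compute", "solve", "main"],
    ["MPI_Wait", "exchange", "main"],
]

root = CCTNode("<root>")
for unwind in unwinds:
    record_sample(root, list(reversed(unwind)))

main_node = root.children["main"]
compute_node = main_node.children["solve"].children["compute"]
wait_node = main_node.children["exchange"].children["MPI_Wait"]
```

Keeping exclusive counts at the leaves and deriving inclusive counts by summation means each sample is stored once, and cost can still be attributed to every caller on the path.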
– Cellular detonation
– Helium burning on neutron stars
– Laser-driven shock instabilities
– Nova outbursts on white dwarfs
– Rayleigh-Taylor instability
– Orszag-Tang MHD vortex
– Magnetic Rayleigh-Taylor
Figures courtesy of FLASH Team, University of Chicago
[Figure axes: time, processes, call stack]
1. Tallent & Mellor-Crummey: PPoPP 2009
2. Tallent, Mellor-Crummey, Porterfield: PPoPP 2010
3. Liu, Mellor-Crummey, Fagan: ICS 2013
svn co http://hpctoolkit.googlecode.com/svn/branches/hpctoolkit-ompt
– essential overview that almost fits on one page
– a guide for using hpctoolkit on BG/Q and Cray platforms
– analyzing scalability, waste, multicore performance ...
– why don’t I have any source code in the viewer?
– hpcviewer isn’t working well over the network ... what can I do?