PerfMon redux: analyzing a CUDA application with the Windows Performance Monitor
S6287
Richard Wilton
Department of Physics and Astronomy
Johns Hopkins University
S6287: Analyzing a CUDA application with PerfMon
Counters in the GPU group:
Monitoring everything at once is probably not a good idea.
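One way to keep the counter set small is to name the counters explicitly and log only those. The sketch below is an illustration, not part of the original talk: it builds a command line for typeperf, the command-line companion to PerfMon, for two standard host counters at a 1-second interval. The helper name build_typeperf_command and the output file name are hypothetical, and GPU counter paths are left out because their exact names depend on the installed driver; copy them from PerfMon's Add Counters dialog if needed.

```python
# Host counter paths that exist on a standard Windows installation.
# GPU counter paths are intentionally omitted (driver-dependent; assumption:
# the reader adds them as they appear in PerfMon's Add Counters dialog).
HOST_COUNTERS = [
    r"\Processor(_Total)\% Processor Time",
    r"\Memory\Committed Bytes",
]


def build_typeperf_command(counters, interval_s=1, samples=60,
                           out_csv="perfmon_subset.csv"):
    """Build a typeperf command line that logs only the given counters to CSV.

    build_typeperf_command is a hypothetical helper for this example.
    """
    if not counters:
        raise ValueError("select at least one counter")
    return (
        ["typeperf"]
        + list(counters)
        + ["-si", str(interval_s),   # sample interval in seconds
           "-sc", str(samples),      # number of samples to collect
           "-f", "CSV",              # output file format
           "-o", out_csv,            # output file path
           "-y"]                     # answer yes to overwrite prompts
    )


command = build_typeperf_command(HOST_COUNTERS)
print(command)
# On Windows, the command could be run with:
#   import subprocess; subprocess.run(command, check=True)
```

The resulting CSV can be opened in PerfMon or a spreadsheet; adding a counter means adding one path to the list, which keeps the monitored set deliberate.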
CPU compute activity
GPU (CUDA) compute activity
Device-related counters – device 0, 1, 2
  GPU compute activity %
  Global memory read/write activity %
Host-related counters
  CPU activity %
  Host memory allocation
Device-related counters – device 0
  GPU compute activity %
  Global memory read/write activity %
Host-related counters
  CPU activity %
  Host memory allocation
Device-related counters – device 0
  GPU compute activity %
  Global memory read/write activity %
Sampled at 1-second intervals
Samples are “snapshots” (not averaged)
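The difference between snapshot and averaged samples matters when reading these traces. The sketch below is a simplified illustration with made-up data, not measurements from the talk: it assumes a hypothetical GPU activity trace recorded at 100 ms resolution and compares what a sampler sees if it takes one instantaneous reading per second versus averaging over each second. The function names and the trace are assumptions for this example.

```python
def snapshot_samples(activity, period):
    """Return the instantaneous value at each sampling tick (no averaging)."""
    return [activity[i] for i in range(0, len(activity), period)]


def averaged_samples(activity, period):
    """Return the mean over each sampling window, for comparison."""
    out = []
    for start in range(0, len(activity), period):
        window = activity[start:start + period]
        out.append(sum(window) / len(window))
    return out


# Hypothetical activity trace at 100 ms resolution (10 readings per second):
# each second holds a 300 ms burst at 100 % and is idle otherwise.
trace = ([0] * 4 + [100] * 3 + [0] * 3) * 5

snapshots = snapshot_samples(trace, period=10)  # one reading per second
averages = averaged_samples(trace, period=10)   # mean of each second

print(snapshots)  # [0, 0, 0, 0, 0]
print(averages)   # [30.0, 30.0, 30.0, 30.0, 30.0]
```

In this contrived case the snapshots miss the bursts entirely, while the averages report 30 % activity. With real workloads the effect is less extreme, but short kernels can still appear as spikes or gaps depending on where the 1-second ticks fall.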
Device-related counters – device 0, 1, 2
  GPU compute activity %
  Global memory read/write activity %
Host-related counters
  CPU activity %
  Host memory allocation
Device-related counters – device 0
  GPU compute activity %
  Global memory read/write activity %
Host-related counters
  CPU activity %
  Host memory allocation
Device-related counters – device 1
  GPU compute activity %
  Global memory read/write activity %
Host-related counters
  CPU activity %
  Host memory allocation
Device-related counters – device 2
  GPU compute activity %
  Global memory read/write activity %
Host-related counters
  CPU activity %
  Host memory allocation
Device-related counters – device 0, 1, 2
  GPU compute activity %
  Global memory read/write activity %
Host-related counters
  CPU activity %
  Host memory allocation
Device-related counters – device 0
  GPU compute activity %
  Global memory read/write activity %
Host-related counters
  CPU activity %
  Host memory allocation
Device-related counters – device 0, 1, 2
  GPU compute activity %
  Global memory read/write activity %
Host-related counters
  CPU activity %
  Host memory allocation
Device-related counters – device 0
  GPU compute activity %
  Global memory read/write activity %
Host-related counters
  CPU activity %
  Host memory allocation
Device-related counters – device 2
  GPU compute activity %
  Global memory allocated (bytes)
Host-related counters
  CPU activity %
Device-related counters – device 0, 1
  GPU compute activity %
  Global memory read/write activity %
  GPU temperature (°C)
  GPU total power draw (watts)
Host-related counters
  CPU activity %
  Host memory allocation