Hardware Counters for non-Intel Systems (and tools for Frontier)


SLIDE 1

Hardware Counters for non-Intel Systems

(and tools for Frontier)

SLIDE 2

AMD CPU Counters

  • @Gruber (LIKWID)

○ the PFM hardware unit hasn’t changed in years
○ IBS still works
○ some counters lie (important ones like vector ops and memory bandwidth)
  ■ even fixed counters are giving bad counts
○ Linux kernel settings can improve performance

https://www.amd.com/system/files/TechDocs/54945_3.03_ppr_ZP_B2_pub.zip

  • Wishlist:

○ good contacts within AMD for CPU counters
○ public documentation on DataFabric events
○ top-down methodology with associated counter groups
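
Since some counters (vector ops, memory bandwidth) are reported to lie, one way to build trust in a counter is to run a kernel whose operation count is known analytically and compare the two numbers. The following is a minimal sketch of that comparison, not a LIKWID feature: the function names, the triad kernel, and the 5% tolerance are all assumptions chosen for illustration, and the measured value has to come from whatever tool is being checked.

```python
def expected_flops_triad(n: int) -> int:
    """Analytic flop count for a triad kernel a[i] = b[i] + s * c[i].

    Each element costs one multiply and one add, so 2 flops per element.
    (Hypothetical helper; the kernel choice is an assumption.)
    """
    return 2 * n


def relative_error(measured: float, expected: float) -> float:
    """Relative deviation of a measured counter value from the expected one."""
    if expected == 0:
        raise ValueError("expected count must be non-zero")
    return abs(measured - expected) / abs(expected)


def counter_is_plausible(measured: float, expected: float,
                         tolerance: float = 0.05) -> bool:
    """True if the counter is within `tolerance` of the analytic value.

    The 5% default is an arbitrary assumption, not a vendor guideline.
    """
    return relative_error(measured, expected) <= tolerance
```

A counter that reports roughly half of `expected_flops_triad(n)` for a large `n` would fail this check, which might indicate, for example, that fused multiply-adds are counted as a single operation.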

SLIDE 3

AMD GPU Profiling

  • HIP looks like CUDA

○ ROCm profiling interface looks like CUPTI

  • HPCToolkit is looking to unify GPU profiling code
  • Frontier Tools WG provides opportunity for requesting changes to tools APIs

○ send requests to Mike Brim (brimmj@ornl.gov)

SLIDE 4

POWER CPU Counters

  • Grouping creates difficulties for profiling
  • Cycle-based accounting (top-down) is oriented toward existing groups

○ See “CPI stack” in POWER9 Performance Monitor Unit User’s Guide

https://wiki.raptorcs.com/w/images/6/6b/POWER9_PMU_UG_v12_28NOV2018_pub.pdf
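
To illustrate why grouping complicates profiling: if events can only be counted together when they belong to the same predefined group, a tool has to pick a set of groups (and therefore runs or multiplexing slots) that covers every event the user asked for. The sketch below uses a greedy set-cover heuristic. The group and event names are made up for illustration and are not taken from the POWER9 guide, and greedy selection is not guaranteed to find the smallest possible set of groups.

```python
def choose_groups(wanted, groups):
    """Pick counter groups that together cover all wanted events.

    wanted: iterable of event names.
    groups: dict mapping group name -> set of event names in that group.
    Returns a list of group names, one measurement pass per group.
    Greedy heuristic: repeatedly take the group covering the most
    still-uncovered events (ties broken by group name).
    """
    remaining = set(wanted)
    available = set().union(*groups.values())
    missing = remaining - available
    if missing:
        raise ValueError(f"no group provides: {sorted(missing)}")

    chosen = []
    while remaining:
        best = max(sorted(groups), key=lambda g: len(groups[g] & remaining))
        chosen.append(best)
        remaining -= groups[best]
    return chosen


# Hypothetical group table; real group definitions are hardware-specific.
example_groups = {
    "grp_a": {"CYCLES", "INSTRUCTIONS"},
    "grp_b": {"CYCLES", "STALL_FRONTEND", "STALL_BACKEND"},
    "grp_c": {"INSTRUCTIONS", "L1_MISSES"},
}
example_wanted = ["CYCLES", "STALL_FRONTEND", "STALL_BACKEND", "L1_MISSES"]
passes = choose_groups(example_wanted, example_groups)
```

Here four events need two passes because no single hypothetical group contains all of them, which is the kind of overhead a cycle-accounting analysis has to plan around.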

SLIDE 5

ARM CPU Counters

  • No equivalent of the top-down methodology
  • Counters

○ Good: instruction counts, branching, load/store
○ Bad: flops, memory accesses

  • Recommendation: software prefetching helps performance significantly

○ could tools help add this?

  • Caveat: Counter names may measure different things on different vendor implementations

SLIDE 6

NVIDIA GPU Profiling

  • CUPTI is annoying, but still useful

○ NVIDIA wants to provide only metrics (not events)