
Finding and Optimizing Phases in Parallel Programs



  1. Finding and Optimizing Phases in Parallel Programs
     Jeffrey K. Hollingsworth <hollings@cs.umd.edu>
     Ray Chen <rchen@cs.umd.edu>

  2. Phases of UMD CS
     Computer & Space Sciences: 1962-1987
     AV Williams: 1987-2018
     Iribe Center: 2018-
     • CS@UMD: Future is Exciting
     • Largest Major on campus (over 2,800 undergrads, plus 400+ Computer Engineering)
     • New Building in 2018
     • Hiring O(10) New Faculty in a couple of years
     • New Big Data Masters & Certificate Programs

  3. Motivation
     • HPC programs often contain “phases”
       – Dynamic execution context
       – Each has distinct performance traits
     • Particularly problematic if inside a time-step loop
       – Short phases confound tools
       – Difficult to analyze a rapidly changing landscape
       – Worse if phases are nested

  4. LULESH2 MPI Call Trace

        while (locDom->time() < locDom->stoptime()) {
            TimeIncrement(*locDom);
            LagrangeLeapFrog(*locDom);
        }

  5. Automatic Phase Identification
     • My Failed Prior Attempts
       – IPS-2 (c. 1990)
       – Paradyn’s Performance Consultant (c. 1995)
       – Solution
         • Automatic identification is hard, rely on experts for annotations
         • Create virtual phases by stitching short ones together

  6. Guided Phase Identification

     Original loop:

        while (locDom->time() < locDom->stoptime()) {
            TimeIncrement(*locDom);
            LagrangeLeapFrog(*locDom);
        }

     Annotated loop:

        while (locDom->time() < locDom->stoptime()) {
            // Mark the communication-dominated region.
            cali::Annotation region1("tuner.communication");
            region1.begin();
            TimeIncrement(*locDom);
            region1.end();

            // Mark the computation-dominated region.
            cali::Annotation region2("tuner.computation");
            region2.begin();
            LagrangeLeapFrog(*locDom);
            region2.end();
        }
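The two annotated regions recur on every time step, which is where the "virtual phase" idea from the previous slide applies: many short occurrences of the same region are stitched into one logical phase. The following is a minimal sketch of that bookkeeping, not Caliper itself. `PhaseLog`, `ScopedPhase`, and `run_time_steps` are hypothetical names invented for this illustration, and the loop body is dummy work standing in for `TimeIncrement` and `LagrangeLeapFrog`.

```cpp
#include <cassert>
#include <chrono>
#include <map>
#include <string>
#include <utility>

// Hypothetical stand-in for an annotation runtime (not the Caliper API).
// It sums the time spent in each named region over all loop iterations,
// so repeated short occurrences are stitched into one virtual phase.
class PhaseLog {
public:
    using Clock = std::chrono::steady_clock;

    struct Totals {
        Clock::duration elapsed{};  // summed time over every occurrence
        long count = 0;             // number of occurrences stitched together
    };

    void record(const std::string& name, Clock::duration d) {
        assert(d >= Clock::duration::zero());  // steady_clock never runs backwards
        Totals& t = totals_[name];
        t.elapsed += d;
        ++t.count;
    }

    const std::map<std::string, Totals>& totals() const { return totals_; }

private:
    std::map<std::string, Totals> totals_;
};

// RAII region marker: the region begins at construction and ends at scope exit.
class ScopedPhase {
public:
    ScopedPhase(PhaseLog& log, std::string name)
        : log_(log), name_(std::move(name)), start_(PhaseLog::Clock::now()) {}
    ~ScopedPhase() { log_.record(name_, PhaseLog::Clock::now() - start_); }
    ScopedPhase(const ScopedPhase&) = delete;
    ScopedPhase& operator=(const ScopedPhase&) = delete;

private:
    PhaseLog& log_;
    std::string name_;
    PhaseLog::Clock::time_point start_;
};

// Stand-in for a time-step loop with two annotated phases per iteration.
inline PhaseLog run_time_steps(int steps) {
    PhaseLog log;
    volatile double sink = 0.0;  // keeps the dummy work from being optimized away
    for (int step = 0; step < steps; ++step) {
        {
            ScopedPhase phase(log, "tuner.communication");
            sink = sink + 1.0;
        }
        {
            ScopedPhase phase(log, "tuner.computation");
            for (int i = 0; i < 1000; ++i) sink = sink + i * 0.5;
        }
    }
    return log;
}
```

Calling `run_time_steps(5)` yields two entries in `totals()`, each with a `count` of 5. A tool then sees two long-lived virtual phases instead of ten short intervals.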

  7. Performance Landscape
     [Figure: actual timeline compared with contextual per-iteration timeline; size labels on the slide: 2.5KB and 3,700KB]

  8. Cross-Domain Analysis
     • Utilize experts during development
       – Library writers specify tuning variables
       – Application writers specify code regions
       – Phase dictates different performance context
         • Even though the same function is being called
     [Figure callouts: “I know what variables affect FFTW performance”, “I know what variables affect MPI performance”, “I know what variables affect BLAS performance”, “My application has three phases”]

  9. Integration Work
     • Special annotation types identify:
       – Tunable variables
       – Code regions that should enable tuning
     • New Caliper tuning service
       – Listens for and reacts to special annotations
       – Calls Active Harmony to perform search
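The integration described above can be pictured as a service that keeps a separate search per annotated region, so the same tunable variable may settle on a different value in each phase. The sketch below is only an illustration under stated assumptions. `PhaseSearch`, `TuningService`, `demo_cost`, and `run_demo` are hypothetical names, the exhaustive search is a simple stand-in rather than Active Harmony's search strategy, and the cost model is synthetic.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical per-phase search state for one tunable integer variable.
// It tries every candidate once, then keeps returning the cheapest one.
class PhaseSearch {
public:
    explicit PhaseSearch(std::vector<int> candidates)
        : candidates_(std::move(candidates)) {
        assert(!candidates_.empty());
    }

    // Value to use for the next occurrence of the phase.
    int next() const {
        return converged() ? candidates_[best_] : candidates_[trial_];
    }

    // Report the cost measured for the value returned by the last next().
    void report(double cost) {
        if (converged()) return;
        if (cost < best_cost_) {
            best_cost_ = cost;
            best_ = trial_;
        }
        ++trial_;
    }

    bool converged() const { return trial_ >= candidates_.size(); }

private:
    std::vector<int> candidates_;
    std::size_t trial_ = 0;
    std::size_t best_ = 0;
    double best_cost_ = std::numeric_limits<double>::infinity();
};

// Hypothetical tuning service: one independent search per annotated region.
class TuningService {
public:
    void declare(const std::string& phase, std::vector<int> candidates) {
        searches_.emplace(phase, PhaseSearch(std::move(candidates)));
    }
    // Called when a region annotation begins: pick the value to run with.
    int on_begin(const std::string& phase) const { return searches_.at(phase).next(); }
    // Called when a region annotation ends: feed back the measured cost.
    void on_end(const std::string& phase, double cost) { searches_.at(phase).report(cost); }
    bool converged(const std::string& phase) const { return searches_.at(phase).converged(); }

private:
    std::map<std::string, PhaseSearch> searches_;
};

// Synthetic cost model for this demo only: each phase prefers a different value.
inline double demo_cost(const std::string& phase, int value) {
    const int ideal = (phase == "tuner.communication") ? 4 : 16;
    const double diff = static_cast<double>(value - ideal);
    return diff * diff;
}

// Stand-in for a time-step loop whose two regions are tuned online.
inline TuningService run_demo(int steps) {
    const std::vector<std::string> phases = {"tuner.communication", "tuner.computation"};
    TuningService service;
    for (const std::string& phase : phases) service.declare(phase, {1, 4, 16});
    for (int step = 0; step < steps; ++step) {
        for (const std::string& phase : phases) {
            const int value = service.on_begin(phase);
            service.on_end(phase, demo_cost(phase, value));
        }
    }
    return service;
}
```

With three candidates per phase, `run_demo(5)` converges after three time steps. Under the synthetic cost model the communication phase settles on 4 and the computation phase on 16, even though both tune the same variable. Keeping the searches separate is the design choice that lets each phase act as its own performance context.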

  10. 3D Fast Fourier Transform
     • FFT in 3 dimensions
       – Composed of three 1 dimensional FFT’s
       – Data is redistributed among processes between FFT’s
     [Figure: stages FFTz, A2A1 (blocking), FFTy, A2A2 (blocking), FFTx]

  11. Computation/Communication Overlap
     [Figure: blocking pipeline (FFTz, A2A1, FFTy, A2A2, FFTx) compared with an overlapped pipeline that uses non-blocking A2A1 and A2A2 and splits FFTy into FFTy1 and FFTy2]

  12. Auto-tuning Opportunities
     [Figure: overlapped pipeline (FFTz, A2A1 non-blocking, FFTy1, FFTy2, A2A2 non-blocking, FFTx) with labels T1, T2, Px1, Ux1, Py1, Uz1, Nz / p2, Ny / p2, “FFTz & Pack”, “Unpack & FFTy1”]

  13. Online Auto-Tuning

  14. Phase Aware Tuning
     • Improvements over offline (non-phase) tuning
       – Reduce search dimensions from 24 to 16
       – 40% fewer search steps needed to converge
       – Equivalent performance after convergence
     • Eliminates need for training runs
       – Don’t allocate thousands of nodes to train

  15. Offline Auto-Tuning Cost

  16. Conclusion
     • Phases are key for HPC analysis tools
       – Rely on human guidance through annotations
       – Virtualizing repeated phases helps many types of tools
     • Annotations unite cross-domain expertise
       – Libraries annotate variables to analyze
       – Applications annotate regions to analyze
     • Currently analyzing other HPC codes
