Finding and Optimizing Phases in Parallel Programs

Ray Chen <rchen@cs.umd.edu>
Jeffrey K. Hollingsworth <hollings@cs.umd.edu>
Scalable Tools Workshop 2016

Motivation
• HPC programs often contain “phases”
  – Dynamic execution context (like a stack trace for performance)
  – Each has distinct performance traits
• Particularly disruptive if inside a timestep loop
  – Short phases confound tools
  – Difficult to analyze a rapidly changing landscape
  – Worse if phases are nested
8/2/16 Finding and Optimizing Phases in Parallel Programs: Scalable Tools Workshop

LULESH2 MPI Call Trace
    while (locDom->time() < locDom->stoptime()) {
        TimeIncrement(*locDom);
        LagrangeLeapFrog(*locDom);
    }

Automatic Phase Identification
• Prior art (chosen completely at random)
  – IPS-2
  – Paradyn’s Performance Consultant
• Key: Automatic identification is hard
  – Rely on experts for annotations

Guided Phase Identification
Original loop:
    while (locDom->time() < locDom->stoptime()) {
        TimeIncrement(*locDom);
        LagrangeLeapFrog(*locDom);
    }
Annotated loop:
    while (locDom->time() < locDom->stoptime()) {
        // Mark the communication phase.
        cali::Annotation region1("tuner.communication");
        region1.begin();
        TimeIncrement(*locDom);
        region1.end();
        // Mark the computation phase.
        cali::Annotation region2("tuner.computation");
        region2.begin();
        LagrangeLeapFrog(*locDom);
        region2.end();
    }

Performance Landscape
[Figure: panels "Actual Timeline" and "Contextual Per Iteration Timeline"; size labels 2.5KB and 3,700KB]

Cross-Domain Analysis
• Utilize experts during development
  – Library writers specify tuning variables
  – Application writers specify code regions
  – Phase dictates different performance context
• Even though the same function is being called
[Figure callouts: "I know what variables affect FFTW performance"; "I know what variables affect MPI performance"; "I know what variables affect BLAS performance"; "My application has three phases"]

Integration Work
• Special annotation types identify:
  – Tunable variables
  – Code regions that should enable tuning
• New Caliper tuning service
  – Listens for and reacts to special annotations
  – Calls Active Harmony to perform search
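The flow on this slide (register tunable variables, then react to region begin and end events by driving a search) can be sketched as follows. This is a minimal illustration under stated assumptions, not the Caliper service or the Active Harmony API: `TuningService`, `ExhaustiveSearch`, and `register_variable` are hypothetical names, the search is a plain exhaustive sweep standing in for Active Harmony, and the region cost is passed in by the caller instead of being measured.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <map>
#include <string>
#include <utility>
#include <vector>

// Hypothetical stand-in for a search engine such as Active Harmony:
// it tries each candidate once and remembers the cheapest one.
class ExhaustiveSearch {
public:
    explicit ExhaustiveSearch(std::vector<int> candidates)
        : candidates_(std::move(candidates)) {}

    // Value to use for the next execution of the region.
    int current() const {
        return next_ < candidates_.size() ? candidates_[next_] : best_value_;
    }

    // Report the cost observed while running with current().
    void report(double cost) {
        if (next_ >= candidates_.size()) return;  // search already finished
        if (cost < best_cost_) {
            best_cost_ = cost;
            best_value_ = candidates_[next_];
        }
        ++next_;
    }

private:
    std::vector<int> candidates_;
    std::size_t next_ = 0;
    int best_value_ = 0;
    double best_cost_ = std::numeric_limits<double>::infinity();
};

// Hypothetical tuning service: one independent search per annotated region.
class TuningService {
public:
    // Counterpart of a "tunable variable" annotation.
    void register_variable(const std::string& region, int* variable,
                           std::vector<int> candidates) {
        phases_.emplace(region,
                        Phase{variable, ExhaustiveSearch(std::move(candidates))});
    }

    // Counterpart of a region-begin event: install the value to try.
    void begin(const std::string& region) {
        auto it = phases_.find(region);
        if (it != phases_.end()) *it->second.variable = it->second.search.current();
    }

    // Counterpart of a region-end event. A real service would time the
    // region itself; here the caller supplies the cost (an assumption).
    void end(const std::string& region, double cost) {
        auto it = phases_.find(region);
        if (it != phases_.end()) it->second.search.report(cost);
    }

private:
    struct Phase {
        int* variable;
        ExhaustiveSearch search;
    };
    std::map<std::string, Phase> phases_;
};
```

Keeping one search object per region name is what lets two phases that call the same function settle on different values for the same variable.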

3D Fast Fourier Transform
• FFT in 3 dimensions
  – Composed of three 1-dimensional FFTs
  – Data is redistributed among processes between FFTs
[Figure: data distribution across processes 0–3 for the stages FFTz, FFTx, FFTy, with blocking all-to-all transfers A2A1 and A2A2]
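The decomposition into three 1-dimensional transforms can be sketched on a single process as below. This is an illustration only: it uses a naive O(n²) DFT in place of a real FFT library, and the function names (`dft1d`, `transform_axis`, `dft3d`) and the cube layout are assumptions made for the example. In the distributed version, the all-to-all transfers would redistribute data between the passes; here all data is local, so no redistribution is needed.

```cpp
#include <cassert>
#include <cmath>
#include <complex>
#include <cstddef>
#include <vector>

using cd = std::complex<double>;

// Naive 1D DFT, standing in for a call into an FFT library.
std::vector<cd> dft1d(const std::vector<cd>& in) {
    const std::size_t n = in.size();
    const double pi = std::acos(-1.0);
    std::vector<cd> out(n);
    for (std::size_t k = 0; k < n; ++k)
        for (std::size_t j = 0; j < n; ++j)
            out[k] += in[j] * std::polar(1.0, -2.0 * pi * double(j * k) / double(n));
    return out;
}

// Transform every line of an n*n*n cube along one axis, in place.
// 'stride' is the distance between neighbours along that axis.
void transform_axis(std::vector<cd>& cube, std::size_t n, std::size_t stride) {
    std::vector<cd> line(n);
    for (std::size_t base = 0; base < cube.size(); ++base) {
        // Only indices whose coordinate along this axis is 0 start a line.
        if ((base / stride) % n != 0) continue;
        for (std::size_t i = 0; i < n; ++i) line[i] = cube[base + i * stride];
        line = dft1d(line);
        for (std::size_t i = 0; i < n; ++i) cube[base + i * stride] = line[i];
    }
}

// 3D DFT of a cube stored as cube[(z * n + y) * n + x].
// The three passes are independent per axis, so their order does not
// change the result.
void dft3d(std::vector<cd>& cube, std::size_t n) {
    transform_axis(cube, n, n * n);  // along z
    transform_axis(cube, n, 1);      // along x
    transform_axis(cube, n, n);      // along y
}
```

A quick sanity check is that a unit impulse at the origin transforms to a cube of all ones, and a constant cube transforms to a single nonzero value at index 0.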

Computation/Communication Overlap
[Figure: the blocking pipeline (stages FFTz, FFTx, FFTy; transfers A2A1, A2A2) compared with a non-blocking pipeline in which FFTy is split into FFTy1 and FFTy2 and the A2A1 and A2A2 transfers are non-blocking]

Auto-tuning Opportunities
[Figure: the non-blocking pipeline (FFTz, FFTx, A2A1, FFTy1, FFTy2, A2A2) annotated with T1 and T2; detail panels "FFTz & Pack" and "Unpack & FFTy1" with labels Px1, Py1, Ux1, Uz1, Nz / p2, Ny / p2]

Nested Phases
• Block size during A2A transfer is tunable
  – Relatively independent from other variables
  – May be tuned as a nested sub-phase
• Outer and inner phases run in tandem
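One way to keep an inner phase distinct from its enclosing phase is to key measurements by the full stack of open regions, much like a stack trace. The sketch below is hypothetical (`PhaseContext` and the region names `fft` and `a2a` are invented for illustration and are not part of Caliper or Active Harmony); a tuner could attach a separate search to each path it produces.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical tracker that identifies a phase by its nesting path.
class PhaseContext {
public:
    // Open a phase nested inside whatever is currently open.
    void begin(const std::string& name) { stack_.push_back(name); }

    // Close the innermost phase and charge 'cost' to its full path.
    void end(double cost) {
        if (stack_.empty()) return;  // unmatched end: ignore
        total_cost_[path()] += cost;
        stack_.pop_back();
    }

    // Path of open phases, outermost first, e.g. "fft/a2a".
    std::string path() const {
        std::string p;
        for (const std::string& s : stack_) {
            if (!p.empty()) p += '/';
            p += s;
        }
        return p;
    }

    // Accumulated cost for one path, or 0 if it was never seen.
    double total(const std::string& path) const {
        auto it = total_cost_.find(path);
        return it == total_cost_.end() ? 0.0 : it->second;
    }

private:
    std::vector<std::string> stack_;
    std::map<std::string, double> total_cost_;
};
```

With this keying, an `a2a` region entered inside `fft` is recorded as `fft/a2a` and stays separate from an `a2a` region entered at top level.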

Online Auto-Tuning

Offline Auto-Tuning Cost

Online vs. Offline Tuning
• Improvements over offline tuning
  – Nested phases simplify search complexity
  – Reduce search dimensions from 24 to 16
  – 40% fewer search steps needed to converge
  – Equivalent performance after convergence
• Eliminates need for training runs
  – Don’t allocate thousands of nodes to train

Conclusion
• Phases are key for HPC analysis tools
  – Rely on human guidance through annotations
• Annotations unite cross-domain expertise
  – Libraries annotate variables to analyze
  – Applications annotate regions to analyze
• Currently analyzing other HPC codes
  – HPGMG has natural phases to exploit
  – AMR codes are next in line
