The Numerical Reproducibility Fair Trade: Facing the Concurrency Challenges at the Extreme Scale


SLIDE 1

The Numerical Reproducibility Fair Trade: Facing the Concurrency Challenges at the Extreme Scale

Michela Taufer, with Dylan Chapp and Travis Johnston

Based on our IEEE Cluster 2015 paper

University of Delaware

SLIDE 2

Reproducible Accuracy

  • From Van Nostrand's Scientific Encyclopedia:
    ▪ Reproducibility: "closeness of agreement among repeated simulation results under the same initial conditions over time"
    ▪ Accuracy: "conformity of a resulted value to an accepted standard (or scientific laws)"

  • Context: ensemble simulations of scientific phenomena at extreme scale, with multithreading hardware consisting of multi-core processors coupled with many-core accelerators

SLIDE 3

  • Repeatability (same team, same experimental setup)
    ▪ The measurement can be obtained with stated precision by the same team using the same measurement procedure, the same measuring system, under the same operating conditions, in the same location on multiple trials. For computational experiments, this means that a researcher can reliably repeat her own computation.

  • Replicability (different team, same experimental setup)
    ▪ The measurement can be obtained with stated precision by a different team using the same measurement procedure, the same measuring system, under the same operating conditions, in the same or a different location on multiple trials. For computational experiments, this means that an independent group can obtain the same result using the author's own artifacts.

  • Reproducibility (different team, different experimental setup)
    ▪ The measurement can be obtained with stated precision by a different team, a different measuring system, in a different location on multiple trials. For computational experiments, this means that an independent group can obtain the same result using artifacts which they develop completely independently.

From: https://www.acm.org/publications/policies/artifact-review-badging

SLIDE 4

Molecular Dynamics on Accelerators

Force → Acceleration → Velocity → Position

MD simulation step:

  • Each GPU thread computes forces on single atoms
    ▪ E.g., bond, angle, dihedral, and nonbond forces
  • Forces are added to compute acceleration
  • Acceleration is used to update velocities
  • Velocities are used to update the positions
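The force → acceleration → velocity → position pipeline above can be sketched in a few lines. This is a minimal, illustrative Euler step in plain Python, not the symplectic integrator or GPU kernel a production MD code would use; the `spring` force model and all names are hypothetical.

```python
def md_step(positions, velocities, masses, compute_forces, dt):
    # Force -> acceleration -> velocity -> position, as on the slide.
    forces = compute_forces(positions)                 # on a GPU: one thread per atom
    accels = [f / m for f, m in zip(forces, masses)]   # a = F / m
    velocities = [v + a * dt for v, a in zip(velocities, accels)]
    positions = [x + v * dt for x, v in zip(positions, velocities)]
    return positions, velocities

# Toy 1-D usage: two atoms coupled by a unit-stiffness spring (hypothetical force model).
def spring(positions):
    d = positions[1] - positions[0]
    return [d, -d]    # equal and opposite forces pulling the atoms together

pos, vel = md_step([0.0, 2.0], [0.0, 0.0], [1.0, 1.0], spring, dt=0.01)
```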

SLIDE 5

The Strange Case of Constant Energy MDs

  • Enhancing performance of MD simulations allows simulations of larger time scales and length scales
  • GPU computing enables large-scale MD simulation
    ▪ Simulations exhibit unprecedented speed-up factors
  • MD simulation of NaI solution system containing 988 waters, 18 Na+, and 18 I−: GPU is ×15 faster than CPU

[Plot: constant energy MD simulation, single precision]

SLIDE 6

The Strange Case of Constant Energy MDs

  • Enhancing performance of MD simulations allows simulations of larger time scales and length scales
  • GPU computing enables large-scale MD simulation
    ▪ Simulations exhibit speed-up factors of ×10 to ×30
  • MD simulation of NaI solution system containing 988 waters, 18 Na+, and 18 I−: GPU is ×15 faster than CPU

[Plot: constant energy MD simulation, single precision]

SLIDE 7

The Strange Case of Constant Energy MDs

  • Enhancing performance of MD simulations allows simulations of larger time scales and length scales
  • GPU computing enables large-scale MD simulation
    ▪ Simulations exhibit unprecedented speed-up factors
  • MD simulation of NaI solution system containing 988 waters, 18 Na+, and 18 I−: GPU is ×15 faster than CPU

[Plot legend: GPU single precision, GPU single precision, GPU double precision]

SLIDE 8

The Strange Case of Constant Energy MDs

  • Enhancing performance of MD simulations allows simulations of larger time scales and length scales
  • GPU computing enables large-scale MD simulation
    ▪ Simulations exhibit unprecedented speed-up factors
  • MD simulation of NaI solution system containing 988 waters, 18 Na+, and 18 I−: GPU is ×15 faster than CPU

[Plot: constant energy MD simulation, GPU double precision]

SLIDE 9

Just a Case of Code Accuracy?

  • A plot of the energy fluctuations versus time step size should follow an approximately logarithmic trend [1]
  • Energy fluctuations are proportional to time step size for large time step sizes (larger than 0.5 fs)
  • A different behavior for step sizes less than 0.5 fs is consistent with results previously presented and discussed in other work [2]

[1] Allen and Tildesley, Oxford: Clarendon Press, 1987
[2] Bauer et al., J. Comput. Chem. 32(3): 375-385, 2011

SLIDE 10

The Exascale Environment

From a recent talk by Lucy Nowell, DoE Program Director (Distinguished Speaker Lecture, University of Delaware, Oct 10, 2014)


SLIDE 12

Discussion Outline

  • Focus on reproducible accuracy of global summation
  • Scientists demand increased reproducible accuracy
    ▪ Must be reproducible enough
  • Many approaches have been proposed
    ▪ Must be cost effective
  • Empirical results illustrate the need for runtime selection of reduction operators that ensure a given degree of reproducible accuracy

SLIDE 13

Discussion Outline

  • Causes of loss of reproducibility
    ▪ Well-known floating-point issues
    ▪ Non-determinism at exascale
  • Techniques for recovering reproducibility
    ▪ Enhanced summation algorithms
  • Empirical evaluation of summation algorithms' cost
  • Quantifying reproducible accuracy
    ▪ Identify key factors in variability of error accumulation
    ▪ Study response of summation algorithms to those factors
  • Lessons learned

SLIDE 14

Well-Known Problem

  • The modeling of finite-precision arithmetic maps an infinite set of real numbers onto a finite set of machine numbers

From: http://cs.smith.edu/dftwiki/index.php/CSC231 An Introduction to Fixed- and Floating-Point Numbers

SLIDE 15

Simple Example

a = 10^9, b = −10^9, c = 10^−9

Summation order 1: (a + b) + c = (10^9 − 10^9) + 10^−9 = 10^−9
Summation order 2: a + (b + c) = 10^9 + (−10^9 + 10^−9) = 0
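The example above behaves exactly this way in IEEE double precision, and is easy to check directly:

```python
a, b, c = 1e9, -1e9, 1e-9

order1 = (a + b) + c   # cancellation happens first, so the tiny term survives
order2 = a + (b + c)   # c is absorbed into b: |c| is far below half an ulp of 1e9

print(order1)  # 1e-09
print(order2)  # 0.0
```

The two orders give different machine results even though real-number addition is associative; this is the root cause of the reproducibility problem the talk addresses.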


SLIDE 17

Non-Determinism at Extreme Scale

Reduction tree shape. Causes include: dynamic task scheduling and fault recovery.

[Figure: two reduction trees over x1 ... x8 with different shapes, producing sums s1 and s2; both fall within the error bounds around the exact sum]

SLIDE 18

Non-Determinism at Extreme Scale

Arrangement of operands. Causes include: dynamic task scheduling and fault recovery.

[Figure: two reduction trees over x1 ... x8 with different operand orders, producing sums s1 and s2; both fall within the error bounds around the exact sum]

SLIDE 19

Non-Associativity + Non-Determinism

  • No control on the way N floating-point numbers are assigned to N threads
  • Different thread orders cause round-off errors to accumulate in different ways, leading to different summation results

[Figure: error magnitude vs. number of operands, for operands x0 ... x15]


SLIDE 23

Non-Associativity + Non-Determinism

[Figure: error magnitude vs. number of operands]

Increasing concurrency == widening interval of possible sums

SLIDE 24

Inadequacy of Conventional Wisdom

Worst-case error bound:

  • In practice, error bounds are overly pessimistic (i.e., usually N · ε << 1) and thus unreliable predictors
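The gap between the worst-case bound and observed error is easy to demonstrate. A minimal sketch, assuming recursive (left-to-right) summation of positive random summands, unit roundoff u = 2^−53 for doubles, and the classical first-order bound (n − 1) · u on the relative error when there is no cancellation:

```python
import math
import random

random.seed(0)

n = 100_000
u = 2.0 ** -53                       # unit roundoff for IEEE double precision
xs = [random.random() for _ in range(n)]

exact = math.fsum(xs)                # correctly rounded reference sum
naive = 0.0
for x in xs:                         # plain recursive summation
    naive += x

rel_err = abs(naive - exact) / abs(exact)
bound = (n - 1) * u                  # classical worst-case bound (condition number 1 here)

print(f"observed {rel_err:.2e} vs bound {bound:.2e}")
```

For random positive data the observed error typically sits orders of magnitude below the bound, which is why the bound alone is a poor predictor of actual reproducibility.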

SLIDE 25

Techniques for Recovering Reproducibility

  • Fixed reduction order
    ▪ Ensure that all floating-point operations are evaluated in the same order from run to run
  • Increased-precision numerical types
    ▪ Mixed precision, e.g., use higher-precision types for sensitive computations and standard types for less sensitive computations
  • Interval arithmetic
    ▪ Replace floating-point types with custom types representing finite-length intervals of real numbers
  • Enhanced summation algorithms
    ▪ Compensated summation, e.g., Kahan and composite precision
    ▪ Pre-rounded reproducible summation


SLIDE 28

Standard Summation: Definition

SLIDE 29

Kahan Summation: Definition

[Figure: algorithm; one variable holds the running error; the captured error is added to the operand on the next iteration]

Kahan, "Further Remarks on Reducing Truncation Errors" (1964)
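The scheme on this slide is the classic Kahan algorithm: capture the rounding error of each addition and feed it back into the next operand. A minimal textbook-form Python version (not necessarily the exact variant benchmarked later in the talk), with `math.fsum` as a correctly rounded reference:

```python
import math

def kahan_sum(values):
    """Compensated (Kahan) summation."""
    s = 0.0
    c = 0.0                 # running compensation: holds the lost low-order bits
    for x in values:
        y = x - c           # capture error & add to operand on the next iteration
        t = s + y           # low-order digits of y may be lost here...
        c = (t - s) - y     # ...algebraically zero; in floats, exactly the error just made
        s = t
    return s

vals = [0.1] * 1000
ref = math.fsum(vals)       # correctly rounded reference sum
print(abs(kahan_sum(vals) - ref), abs(sum(vals) - ref))
```

For benign data like this, the compensated result is at least as close to the correctly rounded sum as plain left-to-right summation.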

SLIDE 30

Composite Precision: Definition

[Figure: each quantity carries a value (or result) and an error approximation; the error is carried through each operation]

Taufer et al., "Improving Numerical Reproducibility and Stability in Large-Scale Numerical Simulations on GPUs" (2010)
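The idea of carrying an error term alongside each value can be illustrated with Knuth's TwoSum, a standard error-free transformation that underlies compensated and composite-precision schemes; the exact formulation in Taufer et al. (2010) may differ, so treat this as a related sketch rather than the paper's algorithm:

```python
def two_sum(a, b):
    """Knuth's TwoSum: returns (s, e) with s = fl(a + b) and e the exact
    rounding error, so that a + b == s + e in exact arithmetic."""
    s = a + b
    bb = s - a                        # the part of b that made it into s
    e = (a - (s - bb)) + (b - bb)     # what was lost from a, plus what was lost from b
    return s, e

def composite_sum(values):
    """Carry the error through each operation: accumulate the exact per-step
    rounding errors and fold them back in at the end."""
    s = 0.0
    err = 0.0
    for x in values:
        s, e = two_sum(s, x)
        err += e                      # error approximation carried along
    return s + err

s, e = two_sum(1e16, 1.0)
print(s, e)  # 1e+16 1.0
```

Here the 1.0 lost by the rounded sum `fl(1e16 + 1.0) == 1e16` is recovered exactly in the error term.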

SLIDE 31

Pre-rounded Summation: Definition

Select an extractor M. To compute v1 + v2, first sum the extracted high parts q1 + q2, where

q1 = (v1 + M) − M
q2 = (v2 + M) − M
r1 = v1 − q1
r2 = v2 − q2

and recurse on the remainders until the error is below a threshold.

Demmel and Nguyen, "Parallel Reproducible Summation" (2014)
Arteaga, Hoefler et al., "Designing Bit-Reproducible Portable High-Performance Applications" (2014)
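The extraction step can be demonstrated directly. A sketch assuming a single extraction with a power-of-two M chosen much larger than every input (the real algorithms iterate on the remainders and derive M from the data and the number of summands):

```python
import itertools

def extract(v, M):
    """One pre-rounding step from the slide: q = (v + M) - M snaps v to a
    multiple of ulp(M); r = v - q is the exact remainder."""
    q = (v + M) - M
    r = v - q
    return q, r

M = 2.0 ** 40          # assumed power-of-two extractor, far larger than any |v|
vals = [3.14159, -2.71828, 1e-7, 12345.678, -0.001]

# The high parts q are all multiples of ulp(M) = 2**-12, so their partial sums
# incur no rounding at all -- and an exact sum is the same in every order.
qs = [extract(v, M)[0] for v in vals]

sums = set()
for perm in itertools.permutations(qs):
    total = 0.0
    for q in perm:
        total += q
    sums.add(total)
print(len(sums))  # 1
```

This order-independence of the extracted parts is what makes the pre-rounded result bit-reproducible regardless of the reduction tree.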

SLIDE 32

Techniques for Reproducible Summation

  • Fixed reduction order
    ▪ Ensure that all floating-point operations are evaluated in the same order from run to run
  • Increased-precision numerical types
    ▪ Mixed precision, e.g., use of doubles for sensitive computations and floats everywhere else
  • Interval arithmetic
    ▪ Replace floating-point types with custom types representing finite-length intervals of real numbers
  • Enhanced summation algorithms
    ▪ Compensated summation, e.g., Kahan and composite precision
    ▪ Pre-rounded reproducible summation

HOW COSTLY?

SLIDE 33

Empirical Study: Cost

  • Emulate simulation execution
    ▪ Run parallel sum of 1M doubles using MPI
    ▪ Perform partial sums independently
    ▪ Reduce by global sum with MPI_REDUCE
  • Summation algorithms tested
    ▪ Standard (ST)
    ▪ Kahan (K)
    ▪ Composite Precision (CP)
    ▪ Pre-rounded (PR)

SLIDE 34

Empirical Study: Cost

[Figure: cost comparison of the summation algorithms]

SLIDE 35

Error-free Transformations: Times

  • 2-fold pre-rounding versions and varying vector sizes

[Figure: timings showing slowdown factors of roughly ×7 and ×4 relative to the Intel MKL library]

Demmel and Nguyen, "Parallel Reproducible Summation" (2013); Intel MKL library

SLIDE 36

Techniques for Reproducible Summation

  • Fixed reduction order
    ▪ Ensure that all floating-point operations are evaluated in the same order from run to run
  • Increased-precision numerical types
    ▪ Mixed precision, e.g., use of doubles for sensitive computations and floats everywhere else
  • Interval arithmetic
    ▪ Replace floating-point types with custom types representing finite-length intervals of real numbers
  • Enhanced summation algorithms
    ▪ Compensated summation, e.g., Kahan and composite precision
    ▪ Pre-rounded reproducible summation

HOW REPRODUCIBLE? HOW ACCURATE?


SLIDE 39

Empirical Study: Reproducible Accuracy

  • Emulate sums expected in exascale simulations
    ▪ Shuffling summation order emulates a nondeterministic reduction tree
  • Measure sensitivity of summation algorithms to:
    ▪ Changes in summation order
    ▪ Mathematical properties of summands
  • Interpret width of result interval as sensitivity
  • Test summation algorithms: Standard (ST), Kahan (K), Composite Precision (CP), Pre-rounded (PR)
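The shuffling methodology can be reproduced in miniature. A sketch using a hypothetical, deliberately ill-conditioned set of summands (large cancellation), with the width of the result interval over all orders as the sensitivity measure:

```python
import itertools
import math

# Ill-conditioned summands: massive cancellation, so order matters.
vals = [1e16, 1e-5, 1e-5, -1e16]

sums = set()
for perm in itertools.permutations(vals):   # every possible summation order
    s = 0.0
    for v in perm:
        s += v
    sums.add(s)

interval = max(sums) - min(sums)            # width of result interval == sensitivity
print(sorted(sums), interval)

# A reproducible summation (math.fsum is correctly rounded, hence
# order-independent) collapses the interval to a single value:
ref = {math.fsum(perm) for perm in itertools.permutations(vals)}
print(len(ref))  # 1
```

Depending on whether the small terms are added before or after the large terms cancel, the standard sum lands anywhere in the interval; a reproducible algorithm returns one value for every order.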


SLIDE 41

Emulating Exascale Scenarios

[Figure: real numbers x0 ... x5 are mapped to floating-point numbers f(x0) ... f(x5); roundoff errors accumulate; non-determinism at exascale == shuffled summation order; the resulting sums {sj}, where sj is the sum with respect to the jth summation order, form an interval]

Width of interval ∝ irreproducibility


SLIDE 43

Characterizing Sets of Summands

Critical parameters:

  • Size: n
  • Condition number: k
  • Dynamic range: dr

|S_exact − S_j| / |S_exact| ≤ (n − 1) · u · (Σ_{i=1}^{n} |x_i|) / |Σ_{i=1}^{n} x_i|

SLIDE 44

Taxonomy of Values


SLIDE 46

Characterizing Sets of Summands

Critical parameters, and what each is a proxy for:

  • Size n: proxy for concurrency
  • Condition number k: proxy for subtractive cancellation
  • Dynamic range dr: proxy for alignment error

|S_exact − S_j| / |S_exact| ≤ (n − 1) · u · (Σ_{i=1}^{n} |x_i|) / |Σ_{i=1}^{n} x_i|
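The two data-dependent parameters are cheap to compute for a given set of summands. A sketch in which the condition number follows the bound above (k = Σ|x_i| / |Σ x_i|) and the dynamic range is taken, as one plausible definition, to be the log2 ratio of the largest to smallest nonzero magnitude; the paper's exact formulation of dr may differ:

```python
import math

def condition_number(xs):
    """k = sum|x_i| / |sum x_i| -- large when subtractive cancellation is severe."""
    return math.fsum(abs(x) for x in xs) / abs(math.fsum(xs))

def dynamic_range(xs):
    """dr = log2(max|x| / min|x|) over nonzero summands (assumed definition)."""
    mags = [abs(x) for x in xs if x != 0.0]
    return math.log2(max(mags) / min(mags))

xs = [1.0, 2.0, -2.5]
print(condition_number(xs))   # 5.5 / 0.5 = 11.0
print(dynamic_range(xs))
```

A runtime selector could evaluate these on (a sample of) the operands and pick the cheapest summation algorithm that meets the variability threshold.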

SLIDE 47

Empirical Study: Results

  • Varying the shape of the reduction tree
    ▪ Ill-conditioned, high-dynamic-range values
    ▪ Balanced vs. unbalanced reduction trees
  • Error variability within the parameter space
    ▪ n vs. k
    ▪ n vs. dr
    ▪ k vs. dr
  • Summation algorithm selection
    ▪ Given a variability threshold, which algorithm is needed?


SLIDE 49

Visualizing Degree of Reproducible Accuracy

[Figure: values {x1, x2, ..., xn} → sums of shuffled values Sσ = Σ xσ(i), where σ is a permutation of [n] → multiple sums {Sσ_1, ..., Sσ_100} from multiple permutations → errors {εσ_1, ..., εσ_100} with respect to a GNU MPFR reference result → error variability, plotted over condition number k and dynamic range dr; darker == more variability]

SLIDE 50

Condition Number vs. Dynamic Range

Parameter ranges: N = 10^6, k ∈ [1, 10^6], dr ∈ [0, 32]

[Figure: heat maps of cell variability (standard deviation, ×1e-13, scale 0 to 9) over condition number (k) and dynamic range (dr) for ST, K, and CP]

Compensated summation incrementally improves reproducible accuracy.

SLIDE 51

Empirical Studies of Reproducible Accuracy

  • Varying the shape of the reduction tree
    ▪ Ill-conditioned, high-dynamic-range values
    ▪ Balanced vs. unbalanced reduction trees
  • Error variability within the parameter space
    ▪ n vs. k
    ▪ n vs. dr
    ▪ k vs. dr
  • Summation algorithm selection
    ▪ Given a variability threshold, which algorithm is needed?

SLIDE 52

[Figure: as on slide 49, values {x1, x2, ..., xn} are shuffled and summed (Sσ = Σ xσ(i), σ a permutation of [n]), and errors are measured with respect to a GNU MPFR reference result, giving an error-variability map over k and dr; cell shade == which algorithm (ST, K, or CP) keeps variability below the threshold]

SLIDE 53

Selecting a Sufficient Algorithm

[Heat map over k and dr: high, medium, and low variability regions covered by ST, K, or CP; variability threshold = 5e-13]

SLIDE 54

Selecting a Sufficient Algorithm

[Heat map over k and dr: high, medium, and low variability regions covered by ST, K, or CP; variability threshold = 4.5e-13]

SLIDE 55

Selecting a Sufficient Algorithm

[Heat map over k and dr: high, medium, and low variability regions covered by ST, K, or CP; variability threshold = 4e-13]

SLIDE 56

Selecting a Sufficient Algorithm

[Heat map over k and dr: high, medium, and low variability regions covered by ST, K, or CP; variability threshold = 3.5e-13]

SLIDE 57

Selecting a Sufficient Algorithm

[Heat map over k and dr: high, medium, and low variability regions covered by ST, K, or CP; variability threshold = 3e-13]

SLIDE 58

Selecting a Sufficient Algorithm

[Heat map over k and dr: high, medium, and low variability regions covered by ST, K, or CP; variability threshold = 2.5e-13]

SLIDE 59

Selecting a Sufficient Algorithm

[Heat map over k and dr: high, medium, and low variability regions covered by ST, K, or CP; variability threshold = 1.5e-13]

SLIDE 60

Selecting a Sufficient Algorithm

[Heat map over k and dr: high, medium, and low variability regions covered by ST, K, or CP; variability threshold = 2.5e-14]

SLIDE 61

Selecting a Sufficient Algorithm

[Heat map over k and dr: high, medium, and low variability regions covered by ST, K, or CP; variability threshold = 5e-14]

SLIDE 62

Lessons Learned

  • We study an emulated scenario of global summation on exascale platforms
  • Increasingly costly summation algorithms are needed for reproducible accuracy in certain regions of the parameter space
    ▪ High concurrency, ill-conditioned, high dynamic range
  • Exascale applications need to maintain awareness of the mathematical properties of their summands
    ▪ Adjust the summation algorithm used to keep variability below a threshold

SLIDE 63

Future Directions

  • Can we achieve reproducible numerical accuracy by intelligent runtime selection of reduction algorithms?

SLIDE 64

Acknowledgments

Sponsors:

Contact: taufer@udel.edu, gcl.cis.udel.edu

Global Computing Lab @ UDel