Managing Defects in HPC Software Development

Presented to the OLCF Webinar Series
Tom Evans, ORNL
PI, ExaSMR ECP Applications Project
November 1, 2017

Before we start
• Since I cannot see anyone in this presentation format, feel free to totally veg out, use profane gestures, etc.
• I am not proselytizing; these are some techniques that have worked well for us over the last 20+ years; if you violently disagree, see item (1).
• I will try to keep this short and sweet; in the end there is only one concept I would like you to take away from this, assuming item (1) does not apply.
• I promise that there will be no distracting manager clip-art, sliding images, dissolution, etc.
• If you require sparkly things in the presentation to keep you awake, please refer back to item (1).

Outline
1 Research and Software Development
2 The Complete Development Lifecycle
3 Unit Testing
4 Design-by-Contract™
5 Summary

Research and HPC Code
Challenge: Manage SQE with discovery.
Posit: Consider a new algorithm implemented in a multidimensional, parallel code.
• Theory predicts second-order convergence.
• Computational results are first-order instead of second-order.
• Is this a code bug or an error in analysis?
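One common way to start answering that question is to measure the observed order of convergence on a problem with a known solution. The sketch below is only an illustration, not code from the talk: a central-difference derivative stands in for the "new algorithm", and `observed_order`, `central_difference`, and the deliberately broken `buggy_difference` are hypothetical names chosen here.

```python
import math


def central_difference(f, x, h):
    """Second-order accurate approximation of f'(x)."""
    return (f(x + h) - f(x - h)) / (2.0 * h)


def buggy_difference(f, x, h):
    """Meant to be the central difference, but coded one-sided (first order)."""
    return (f(x + h) - f(x)) / h


def observed_order(scheme, f, dfdx, x, h, refinements=4):
    """Estimate the convergence order p from errors at h, h/2, h/4, ...

    If error ~ C * h**p, then p ~ log2(error(h) / error(h/2)).
    The estimate from the two finest step sizes is returned.
    """
    errors = []
    for i in range(refinements):
        step = h / 2**i
        errors.append(abs(scheme(f, x, step) - dfdx(x)))
    return math.log2(errors[-2] / errors[-1])


# Known solution: d/dx sin(x) = cos(x)
p_good = observed_order(central_difference, math.sin, math.cos, 1.0, 0.1)
p_bad = observed_order(buggy_difference, math.sin, math.cos, 1.0, 0.1)
print(f"central difference: p = {p_good:.2f}")
print(f"buggy variant:      p = {p_bad:.2f}")
```

A measured order that matches theory on a manufactured problem points toward the analysis; a measured order that falls short points toward the implementation.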

Research and HPC Code
• In other words, SQE and methods research are not only compatible, they are essential.
• This is especially true for parallel scientific software, which is much more difficult to design, test, and analyze than serial software.
• In this case we are interested in performing software verification.
• Software verification is a method for removing defects at code construction time.

What is SQE?
• SQE (software quality engineering) is the practice of managing the cost and quality of a software product.
• Guiding principle: the cost of defect resolution increases with time from defect introduction.
• Things fall apart:
  ◮ Defects in model development
  ◮ Defects in algorithmic selection
  ◮ Defects in requirements
  ◮ Defects in implementation

How to mitigate defects
• There are many methods for defect management.
• Three techniques we use for software verification in an HPC environment:
  ◮ The complete development lifecycle
  ◮ Unit testing
  ◮ Design-by-Contract™
• This list is by no means exhaustive (or a complete SQE process).
  ◮ Notably missing: reviews
  ◮ We do them, they work, but I'm not here to talk about them
• However, taken together these can help catch defects before they become an unbearable expense.

Requirements Management in Scientific Software
• Requirements can be very difficult to pin down in scientific software development:
  ◮ the vector keeps changing as new things are learned
  ◮ as a community we often know what we want, but aren't necessarily good at saying it
• Software verification helps disambiguate language-based requirements into functional specifications.
• As requirements change, software verification helps ensure that the software is keeping pace.
• Agility is key in scientific software development:
  ◮ rapid prototyping
  ◮ testing new methods, algorithms, and features

Complete Development Lifecycle
• The developer is responsible for the complete implementation of a feature, including:
  ◮ Requirements
  ◮ Derivation
  ◮ Construction
  ◮ Deployment
• Documentation and verification are implicit in each phase.
• Reviews and team collaboration are essential.
Developers are responsible for all phases of code development.

Unit Testing
Unit testing is a form of software verification.
• It ensures that each part of the software performs its contracted task.
• The effectiveness of unit testing is greatly enhanced by the following two code design practices:
  ◮ Acyclic code design
  ◮ Design-by-Contract™ (see later)
We practice a method of unit testing in which the unit test is written either before, or concurrently with, the executable code.
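A minimal sketch of the test-first habit, using Python's standard `unittest` module. The one-dimensional `distance_to_boundary` function, its arguments, and its error behavior are assumptions made up for this illustration; they are not the interface of the code discussed in the talk. The test class appears first because it was (notionally) written first and pins down the contracted behavior.

```python
import unittest


class TestDistanceToBoundary(unittest.TestCase):
    """Written before the implementation: states what the unit must do."""

    def test_moving_right(self):
        self.assertAlmostEqual(distance_to_boundary(0.25, +1.0, 0.0, 1.0), 0.75)

    def test_moving_left(self):
        self.assertAlmostEqual(distance_to_boundary(0.25, -1.0, 0.0, 1.0), 0.25)

    def test_rejects_zero_direction(self):
        with self.assertRaises(ValueError):
            distance_to_boundary(0.25, 0.0, 0.0, 1.0)

    def test_rejects_position_outside_cell(self):
        with self.assertRaises(ValueError):
            distance_to_boundary(1.5, +1.0, 0.0, 1.0)


def distance_to_boundary(x, direction, low, high):
    """Distance from x to the cell face hit when moving along direction.

    Hypothetical 1D cell spanning [low, high]; direction is +1 or -1.
    """
    if not low <= x <= high:
        raise ValueError("position lies outside the cell")
    if direction == 0.0:
        raise ValueError("direction must be nonzero")
    return (high - x) if direction > 0.0 else (x - low)


def run_tests():
    """Run the test case and return the unittest result object."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(TestDistanceToBoundary)
    return unittest.TextTestRunner(verbosity=0).run(suite)


result = run_tests()
```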

Acyclic Code Design
There are no physical or logical cyclic dependencies.
[Figure: class dependency diagram. Classes shown: RTK_Cell, RTK_Array<T>, RTK_Geometry<T>, RTK_Core_Geometry (<<bind>> T:RTK_Array<RTK_Array<RTK_Cell>>), Boundary_Mesh, Tallier, Domain_Transporter, Source_Transporter, DR_Source_Transporter, DD_Source_Transporter, Solver, Eigenvalue_Solver, Fixed_Source_Solver. Several classes carry Geometry and/or Physics template parameters.]
Allows hierarchical testing.
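To make "allows hierarchical testing" concrete: if the dependency graph is acyclic, it can be sorted into levels, and each level can be tested using only components that were already tested. The sketch below uses Python's standard `graphlib` module (Python 3.9+). The class names come from the diagram, but the "depends on" edges are guesses made for illustration, since the arrows of the original figure are not recoverable here; `hierarchical_levels` is a hypothetical helper.

```python
from graphlib import CycleError, TopologicalSorter

# Hypothetical "depends on" edges (assumed, not taken from the real code base).
DEPENDS_ON = {
    "RTK_Cell": set(),
    "Tallier": set(),
    "RTK_Array": {"RTK_Cell"},
    "RTK_Geometry": {"RTK_Array"},
    "Domain_Transporter": {"RTK_Geometry", "Tallier"},
    "Source_Transporter": {"Domain_Transporter"},
    "Fixed_Source_Solver": {"Source_Transporter"},
}


def hierarchical_levels(depends_on):
    """Group components into levels that depend only on earlier levels.

    Raises graphlib.CycleError if the dependency graph contains a cycle,
    in which case no bottom-up testing order exists.
    """
    sorter = TopologicalSorter(depends_on)
    sorter.prepare()  # cycle detection happens here
    levels = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())
        levels.append(ready)
        sorter.done(*ready)
    return levels


levels = hierarchical_levels(DEPENDS_ON)
for number, names in enumerate(levels, start=1):
    print(f"level {number}: {', '.join(names)}")
```

With a cyclic graph there is no first level to start from, which is one way to see why cyclic dependencies make unit testing so much harder.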

An Example: Reactor Geometry
Figure: Small modular reactor core model.

An Example: Reactor Geometry
1 Sample starting neutron
2 Sample distance to collision: $d_{\mathrm{col}} = -\frac{\log(\xi)}{\sigma(\mathbf{r}, E)}$
3 Calculate distance to boundary
4 Move particle
5 Tally state data: $\phi = \frac{1}{V} \sum_k l_k$
6 Repeat 2–5
[Figure: particle track through the reactor geometry; labels include x, the track length $l_k$, and "Process collision".]
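The six steps can be sketched as a toy random walk. Everything specific below is an assumption made to keep the example short: a one-dimensional slab split into cells, particles that travel only left or right, every collision treated as an absorption, and a tally normalized per source particle as well as per cell volume. The function name `simulate` and its parameters are hypothetical.

```python
import bisect
import math
import random


def simulate(edges, sigmas, n_particles, seed=0):
    """Track-length flux estimate per cell for a 1D purely absorbing slab.

    edges  : ascending cell boundaries, len(edges) == len(sigmas) + 1
    sigmas : total cross section of each cell
    """
    rng = random.Random(seed)
    n_cells = len(sigmas)
    track = [0.0] * n_cells
    for _ in range(n_particles):
        # 1. Sample starting neutron: uniform position, direction -1 or +1.
        x = rng.uniform(edges[0], edges[-1])
        mu = rng.choice((-1.0, 1.0))
        cell = min(bisect.bisect_right(edges, x) - 1, n_cells - 1)
        while True:
            # 2. Sample distance to collision; xi lies in (0, 1].
            xi = 1.0 - rng.random()
            d_col = -math.log(xi) / sigmas[cell]
            # 3. Calculate distance to the boundary of the current cell.
            d_bnd = edges[cell + 1] - x if mu > 0.0 else x - edges[cell]
            # 4./5. Move the particle and tally the track length l_k.
            if d_col < d_bnd:
                track[cell] += d_col
                break  # collision: absorbed in this toy model
            track[cell] += d_bnd
            x = edges[cell + 1] if mu > 0.0 else edges[cell]
            cell += 1 if mu > 0.0 else -1
            if cell < 0 or cell >= n_cells:
                break  # leaked out of the slab
            # 6. Repeat steps 2-5 in the next cell.
    # phi = (1/V) * sum_k l_k, here also divided by the number of histories.
    return [
        track[i] / (n_particles * (edges[i + 1] - edges[i]))
        for i in range(n_cells)
    ]


phi = simulate(edges=[0.0, 1.0, 2.0], sigmas=[1.0, 1.0], n_particles=20000, seed=1)
```

With equal cross sections the two cells behave like one homogeneous slab of width L, for which the mean track length per particle is (1/σ)(1 − (1 − e^{−σL})/(σL)); comparing the tally against that value is exactly the kind of check a unit test can automate.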

First Level: RTK_Cell
• Here is the class diagram for the RTK_Geometry part of the code
[Figure: class diagram.
  RTK_Geometry<T>: initialize(), distance_to_boundary(), move_across_surface(), move_within_cell(), position(), direction(), change_direction(), reflect(), boundary_state()
  RTK_Core_Geometry: <<bind>> RTK_Geometry<T:Core>
  RTK_Array<T>: initialize(), distance_to_boundary(), update_state(), cross_surface(), find_object(), matid()
  Lattice: <<bind>> RTK_Array<T:RTK_Cell>
  Core: <<bind>> RTK_Array<T:Lattice>
  RTK_Cell: initialize(), distance_to_boundary(), update_state(), cross_surface(), matid()]
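The nesting in the diagram (a Core is an array of Lattices, a Lattice is an array of cells) works because every level answers the same questions. The following is a loose one-dimensional sketch of that idea and not the real RTK classes: `Cell`, `Array`, `width()`, and the behavior given to `find_object()` and `matid()` are all assumptions made for illustration.

```python
from dataclasses import dataclass
from typing import Generic, List, Protocol, Tuple, TypeVar


class Locatable(Protocol):
    """What an object must offer to be stored in an Array (assumed interface)."""

    def width(self) -> float: ...

    def matid(self, x: float) -> int: ...


T = TypeVar("T", bound=Locatable)


@dataclass
class Cell:
    """Lowest level: a homogeneous 1D cell with a single material id."""

    size: float
    material: int

    def width(self) -> float:
        return self.size

    def matid(self, x: float) -> int:
        return self.material


class Array(Generic[T]):
    """Objects of type T laid side by side along one axis."""

    def __init__(self, objects: List[T]) -> None:
        if not objects:
            raise ValueError("an array needs at least one object")
        self.objects = list(objects)

    def width(self) -> float:
        return sum(obj.width() for obj in self.objects)

    def find_object(self, x: float) -> Tuple[int, float]:
        """Return (index, local coordinate) of the object containing x."""
        if x < 0.0:
            raise ValueError("position lies outside the array")
        offset = 0.0
        for index, obj in enumerate(self.objects):
            if x <= offset + obj.width():
                return index, x - offset
            offset += obj.width()
        raise ValueError("position lies outside the array")

    def matid(self, x: float) -> int:
        # Delegate to whatever sits at x: a cell or another array.
        index, local = self.find_object(x)
        return self.objects[index].matid(local)


Lattice = Array[Cell]     # mirrors Lattice = RTK_Array<T:RTK_Cell>
Core = Array[Lattice]     # mirrors Core = RTK_Array<T:Lattice>

lattice = Lattice([Cell(1.0, material=1), Cell(0.5, material=2)])
core = Core([lattice, lattice])
```

Because `Array` only relies on the shared interface, the cell can be tested on its own first, then the lattice, then the core.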
