Reflecting on the Goal and Baseline of Exascale Computing
Thomas C. Schulthess


SLIDE 1 — Title
Reflecting on the Goal and Baseline of Exascale Computing
Thomas C. Schulthess

SLIDES 2–11 — Tracking supercomputer performance over time?

The Linpack benchmark solves a dense linear system: Ax = b.

1,000-fold performance improvement per decade, from the first application sustaining > 1 TFLOP/s to the first application sustaining > 1 PFLOP/s.

Milestone codes: KKR-CPA (MST), LSMS (MST), WL-LSMS (MST).

The 1,000x performance improvement per decade seems to hold for multiple-scattering-theory (MST) based electronic structure codes in materials science.
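
A minimal sketch (my own illustration, not from the slides) of what the Linpack benchmark measures: the rate at which a dense system Ax = b is solved, with the problem size below a hypothetical stand-in:

```python
import time
import numpy as np

# Sketch of the Linpack measurement: solve a dense Ax = b and report FLOP/s
# using the leading-order ~(2/3)n^3 operation count of LU factorization.
n = 4000                                  # hypothetical problem size
A = np.random.rand(n, n)
b = np.random.rand(n)

t0 = time.perf_counter()
x = np.linalg.solve(A, b)                 # LU factorization + triangular solves
elapsed = time.perf_counter() - t0

flops = (2.0 / 3.0) * n**3
print(f"~{flops / elapsed / 1e9:.1f} GFLOP/s")

# A 1,000-fold improvement per decade is roughly a doubling every year:
print(f"implied annual growth: {1000 ** 0.1:.2f}x")
```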

SLIDE 12 — “Only” 100-fold performance improvement in climate codes

Source: Peter Bauer, ECMWF

SLIDE 13 — Has the efficiency of weather & climate codes dropped 10-fold every decade?

SLIDES 14–21 — Floating-point efficiency dropped from 50% on the Cray Y-MP to 5% on today's Cray XC (10x in two decades)

Source: Peter Bauer, ECMWF

Systems and their power footprints, with the MST codes (KKR-CPA, LSMS, WL-LSMS) overlaid for comparison: Cray Y-MP @ 300 kW, IBM P5 @ 400 kW, IBM P6 @ 1.3 MW, Cray XT5 @ 1.8 MW, Cray XT5 @ 7 MW.

System size (in energy footprint) grew much faster on “Top500” systems.
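
A quick check of that claim (my arithmetic, not on the slide): the efficiency loss is one order of magnitude, while the power footprints listed above span more than a 20-fold increase over the same period:

```python
# Power footprints from the slide (kW); floating-point efficiency fell 10x
# over the same two decades.
systems = [("Cray Y-MP", 300), ("IBM P5", 400), ("IBM P6", 1_300),
           ("Cray XT5", 1_800), ("Cray XT5", 7_000)]
base = systems[0][1]
for name, kw in systems:
    print(f"{name:10s} {kw:6,d} kW  ({kw / base:4.1f}x vs. Y-MP)")
print(f"efficiency drop: {0.50 / 0.05:.0f}x")
```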

SLIDES 22–24 — Can the delivery of a 1 km-scale capability be pulled in by a decade?

Source: Christoph Schär, ETH Zurich, & Nils Wedi, ECMWF

SLIDES 25–26 — Leadership in weather and climate

The European model may be the best, but it is still far from sufficient accuracy and reliability!

Peter Bauer, ECMWF

SLIDE 27 — The impact of resolution: simulated tropical cyclones

[Figure: simulated tropical cyclones at 130 km, 60 km, and 25 km resolution, compared with observations; HADGEM3, PRACE UPSCALE, P.L. Vidale (NCAS) and M. Roberts (MO/HC).]

SLIDE 28 — Resolving convective clouds (convergence?)

Bulk convergence: area-averaged bulk effects upon the ambient flow, e.g., heating and moistening of the cloud layer.
Structural convergence: statistics of the cloud ensemble, e.g., spacing and size of convective clouds.

Source: Christoph Schär, ETH Zurich

SLIDE 29 — Structural and bulk convergence

[Figure: relative frequency distributions of cloud area (km²) and of convective mass flux (kg m⁻² s⁻¹) for up- and downdrafts, at grid spacings of 8 km, 4 km, 2 km, 1 km, and 500 m; the fraction of grid-scale clouds is 71, 64, 54, 47, and 43%, respectively.]

Statistics of cloud area and of up- and downdrafts show no structural convergence; bulk statistics of updrafts converge (factor 4; Panosetti et al. 2018).

Source: Christoph Schär, ETH Zurich

SLIDE 30 — What resolution is needed?

  • There are threshold scales in the atmosphere and ocean: going from 100 km to 10 km is incremental; 10 km to 1 km is a leap. At 1 km:
  • it is no longer necessary to parametrise precipitating convection, ocean eddies, or orographic wave drag and its effect on extratropical storms;
  • ocean bathymetry, overflows and mixing, as well as regional orographic circulation in the atmosphere, become resolved;
  • the connections between the remaining parametrisations are now on a physical footing.
  • We have spent the last five decades in a paradigm of incremental advances, in which we incrementally improved the resolution of models from 200 to 20 km.
  • Exascale allows us to make the leap to 1 km. This fundamentally changes the structure of our models: we move from crude parametric representations to an explicit, physics-based description of essential processes.
  • The last such step change was fifty years ago, when, in the late 1960s, climate scientists first introduced global climate models, distinguished by their ability to explicitly represent extratropical storms, ocean gyres and boundary currents.

Bjorn Stevens, MPI-M

SLIDES 31–32 — Simulation throughput: simulated years per day (SYPD)

                          NWP     Climate in production   Climate spin-up
Simulation                10 d    100 y                   5,000 y
Desired wall-clock time   0.1 d   0.1 y                   0.5 y
Ratio                     100     1,000                   10,000
SYPD                      0.27    2.7                     27

Minimal throughput 1 SYPD, preferred 5 SYPD.
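
The SYPD row follows directly from the time-compression ratio (a small sketch of my own): a ratio R means R simulated days per wall-clock day, i.e. R/365 simulated years per day.

```python
# SYPD from the time-compression ratio: R simulated days per wall-clock day
# equals R/365 simulated years per wall-clock day.
for name, ratio in [("NWP", 100), ("Climate in production", 1_000),
                    ("Climate spin-up", 10_000)]:
    print(f"{name:22s} ratio {ratio:6,d} -> {ratio / 365:5.2f} SYPD")
```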

SLIDE 33 — Summary of intermediate goal (reach by 2021?)

Horizontal resolution   1 km (globally quasi-uniform)
Vertical resolution     180 levels (surface to ~100 km)
Time resolution         less than 1 minute
Coupled                 land surface / ocean / ocean waves / sea ice
Atmosphere              non-hydrostatic
Precision               single (32-bit) or mixed precision
Compute rate            1 SYPD (simulated years per wall-clock day)

SLIDE 34 — Running COSMO 5.0 at global scale on Piz Daint

Scaling to full system size: ~5,300 GPU-accelerated nodes available. Running a near-global (±80°, covering 97% of Earth's surface) COSMO 5.0 simulation & IFS:
> either on the host processors: Intel Xeon E5-2690 v3 (Haswell, 12 cores);
> or on the GPU accelerator: the PCIe version of the NVIDIA GP100 (Pascal) GPU.

SLIDE 35 — “Piz Kesch”

“Today's Outlook: GPU-accelerated Weather Forecasting”, John Russell, September 15, 2015

SLIDES 36–46 — Where the factor-40 improvement came from

The requirements from MeteoSwiss, at a constant budget for investments and operations — grid refinement from 2.2 km to 1.1 km (24x), ensembles with multiple forecasts and data assimilation (10x), and a further 6x — meant we needed a 40x improvement between 2012 and 2015 at constant cost.

Where the factor 40 came from:
  • 1.7x from software refactoring (old vs. new implementation on x86)
  • 2.8x from mathematical improvements (resource utilisation, precision)
  • 2.8x from Moore's law & architectural improvements on x86
  • 2.3x from the change in architecture (CPU → GPU)
  • 1.3x from additional processors

Investment in software enabled the mathematical improvements and the change in architecture. Bonus: a reduction in power! There is no silver bullet.
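
The achieved factors multiply out to roughly the required 40x (my arithmetic, not on the slide):

```python
# Product of the individual improvement factors:
print(1.7 * 2.8 * 2.8 * 2.3 * 1.3)  # ~39.85, i.e. the required 40x
```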

SLIDE 47 — Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0

Fuhrer et al., Geosci. Model Dev. Discuss., https://doi.org/10.5194/gmd-2017-230, published 2018

SLIDES 48–49 — Near-global climate simulation at 1 km resolution: establishing a performance baseline on 4888 GPUs with COSMO 5.0

Metric: simulated years per wall-clock day (SYPD).

[Figure: SYPD vs. number of nodes (10 to 4,888) for Δx = 19 km (P100 and Haswell), Δx = 3.7 km (P100 and Haswell), Δx = 1.9 km (P100), and Δx = 930 m (P100).]

Time compression (SYPD) and energy cost (MWh/SY) for three moist simulations; the 930 m result is from a full 10-day simulation, 1.9 km from 1,000 steps, and 47 km from 100 steps:

⟨Δx⟩     #nodes   Δt [s]   SYPD    MWh/SY   gridpoints
930 m    4,888    6        0.043   596      3.46 × 10¹⁰
1.9 km   4,888    12       0.23    97.8     8.64 × 10⁹
47 km    18       300      9.6     0.099    1.39 × 10⁷

2.5x faster than Yang et al.'s 2016 Gordon Bell-winning run on TaihuLight!

Fuhrer et al., Geosci. Model Dev. Discuss., https://doi.org/10.5194/gmd-2017-230, published 2018
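
A back-of-envelope check of the table (my arithmetic, not from the paper): at 0.043 SYPD, one simulated year takes about 23 wall-clock days, so 596 MWh/SY implies an average draw of roughly 1 MW across the 4,888 nodes.

```python
# Implied average power draw of the 930 m run: energy per simulated year
# divided by wall-clock hours per simulated year.
sypd = 0.043                      # simulated years per wall-clock day
mwh_per_sy = 596.0                # energy cost per simulated year
hours_per_sy = 24 / sypd          # ~558 wall-clock hours per simulated year
print(f"~{mwh_per_sy / hours_per_sy * 1000:.0f} kW")  # ~1068 kW
```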

SLIDE 50 — The baseline for COSMO-global and IFS

                               Near-global COSMO [Fuh2018]            Global IFS [Wed2009]
                               Value                     Shortfall    Value                     Shortfall
Horizontal resolution          0.93 km (non-uniform)     0.9x         1.25 km                   1.25x
Vertical resolution            60 levels (to 25 km)      3x           62 levels (to 40 km)      3x
Time resolution                6 s (split-explicit       —            120 s (semi-implicit)     4x
                               with sub-stepping)
Coupled                        no                        1.2x         no                        1.2x
Atmosphere                     non-hydrostatic           —            non-hydrostatic           —
Precision                      double                    0.6x         single                    —
Compute rate                   0.043 SYPD                23x          0.088 SYPD                11x
Other (I/O, full physics, …)   limited I/O,              1.5x         full physics, no I/O      —
                               only microphysics
Total shortfall                                          65x                                    198x
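
The totals are consistent with the product of the listed per-row shortfalls (my check; rows without a listed factor are taken as 1x):

```python
from math import prod

# Per-row shortfall factors from the baseline table.
cosmo = [0.9, 3, 1.2, 0.6, 23, 1.5]   # near-global COSMO
ifs   = [1.25, 3, 4, 1.2, 11]         # global IFS
print(f"COSMO: {prod(cosmo):.0f}x")   # ~67x, quoted as 65x on the slide
print(f"IFS:   {prod(ifs):.0f}x")     # 198x
```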

SLIDES 51–60 — Memory use efficiency

MUE = I/O efficiency · BW efficiency = (Q / D) · (B / B̂)

where Q is the necessary data transfer volume, D the actual data transfer volume, B the achieved memory bandwidth, and B̂ the maximum achievable memory bandwidth.

For the COSMO baseline: I/O efficiency ≈ 0.88 and BW efficiency ≈ 0.76, so MUE ≈ 0.88 · 0.76 = 0.67.

[Figure: measured memory bandwidth (GB/s) vs. data size (MB) for several kernels — COPY (double) a[i] = b[i]; GPU STREAM (double) a[i] = b[i]; 1D AVG i-stride (float) a[i] = b[i-1] + b[i+1]; 5-POINT (float) a[i] = b[i] + b[i+1] + b[i-1] + b[i+jstride] + b[i-jstride]; COPY (float) a[i] = b[i]. The stencil-like kernels achieve about 2x lower bandwidth than peak.]

Fuhrer et al., Geosci. Model Dev. Discuss., https://doi.org/10.5194/gmd-2017-230, published 2018
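
A rough sketch of my own (the paper's measurements are on the P100 GPU; this runs on a CPU with numpy) of how such kernels' effective bandwidth is estimated, by timing the kernel and counting the bytes it must move:

```python
import time
import numpy as np

def bandwidth_gbs(kernel, bytes_moved, repeats=10):
    """Time a kernel and return its effective memory bandwidth in GB/s."""
    kernel()                                    # warm-up
    t0 = time.perf_counter()
    for _ in range(repeats):
        kernel()
    dt = (time.perf_counter() - t0) / repeats
    return bytes_moved / dt / 1e9

n = 50_000_000
a = np.empty(n, dtype=np.float32)
b = np.random.rand(n).astype(np.float32)

copy = lambda: np.copyto(a, b)                    # COPY: a[i] = b[i]
avg = lambda: np.add(b[:-2], b[2:], out=a[1:-1])  # AVG: a[i] = b[i-1] + b[i+1]

# Both kernels move ~8 bytes per element (neighbouring reads assumed cached).
print(f"COPY: {bandwidth_gbs(copy, 8 * n):.1f} GB/s")
print(f"AVG : {bandwidth_gbs(avg, 8 * n):.1f} GB/s")
```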

SLIDES 61–68 — How realistic is it to overcome the 65-fold shortfall of a grid-based implementation like COSMO-global?

  • 1. Icosahedral grid (ICON) vs. lat-long/Cartesian grid (COSMO): 2x fewer grid columns, and a time step of 10 ms instead of 5 ms → 4x
  • 2. Improving BW efficiency: improve bandwidth efficiency and peak bandwidth (results on Volta show this is realistic) → 2x
  • 3. Weak scaling: 4x is possible in COSMO, but we reduced the available parallelism by a factor of 2 → 2x
  • 4. Remaining reduction in shortfall: numerical algorithms (larger time steps) and further improved processors/memory → 4x

But we don't want to increase the footprint of the 2021 system beyond “Piz Daint”.
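
Combined (my arithmetic, not on the slide), the four factors cover the shortfall:

```python
# Product of the four projected improvement factors:
print(4 * 2 * 2 * 4)  # 64, approximately the 65x shortfall
```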

SLIDES 69–70 — The importance of ensembles

Peter Bauer, ECMWF

SLIDE 71 — Remaining goals beyond 2021 (by 2024?)

  • 1. Improve the throughput to 5 SYPD
  • 2. Reduce the footprint of a single simulation by up to a factor of 10

SLIDE 72 — Data challenge: what if all models run at 1 km scale?

With the current workflow used by the climate community, an IPCC-class campaign at 1 km scale would on average produce 50 exabytes of data! The only way out would be to analyse the model while the simulation is running.

New workflow:
  • 1. A first set of runs generates model trajectories and checkpoints
  • 2. Reconstruct model trajectories from the checkpoints and analyse (as often as necessary)

SLIDES 73–77 — Summary and Conclusions

  • While flop/s may be good for comparing performance across the history of computing, it is not a good metric for designing the systems of the future
  • Given today's challenges, use memory use efficiency (MUE) instead: MUE = “I/O efficiency” × “bandwidth efficiency”
  • Convection-resolving weather and climate simulations at 1 km horizontal resolution:
  • represent a big leap in the quality of weather and climate simulations;
  • the aim is to pull in this milestone by a decade, to the early 2020s rather than the 2030s;
  • desired throughput is 1 SYPD by 2021 and 5 SYPD by the mid-2020s;
  • a 50-200x performance improvement is needed to reach the 2021 goal of 1 SYPD
  • System improvements needed: improve “bandwidth efficiency” for regular but non-stride-1 memory access
  • Application software improvements needed: further improve scalability; improve algorithms (time integration has potential)

SLIDE 78 — Collaborators

Tim Palmer (U. of Oxford), Christoph Schär (ETH Zurich), Oliver Fuhrer (MeteoSwiss), Peter Bauer (ECMWF), Bjorn Stevens (MPI-M), Torsten Hoefler (ETH Zurich), Nils Wedi (ECMWF)