GPU Benefits for Earth System Science


SLIDE 1

GPU Benefits for Earth System Science

Stan Posey, Program Manager, ESS Domain, NVIDIA (HQ), Santa Clara, CA
Raghu Kumar, PhD, Software Engineer, Developer Technology, NVIDIA, Boulder, CO

SLIDE 2

TOPICS

  • ACHIEVEMENTS IN HPC AND AI
  • NUMERICAL MODEL DEVELOPMENTS
  • GPU UPDATE ON MPAS-A
SLIDE 3

  • ORNL Summit, #1 Top500: 27,648 GPUs | 144 PF
  • LLNL Sierra, #2 Top500: 17,280 GPUs | 95 PF
  • Piz Daint, Europe’s fastest: 5,704 GPUs | 21 PF
  • ABCI, Japan’s fastest: 4,352 GPUs | 20 PF
  • ENI HPC4, fastest industrial: 3,200 GPUs | 12 PF

World-Leading HPC Systems Deploy NVIDIA GPUs

NERSC-9 HPC system based on “Volta-Next” GPUs, arriving during 2020.

SLIDE 4

Segmentation of Tropical Storms and Atmospheric Rivers on Summit using convolutional neural networks.

SC18 Gordon Bell Award: NERSC and NVIDIA Team

SLIDE 5

  • Nearly perfect weak scaling up to 25k GPUs
  • 1 exaflop of performance
  • 100 years of climate model data analyzed in hours
  • Demonstrates the power of this approach for large-scale data analysis

SC18 Gordon Bell Award: NERSC and NVIDIA Team
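As a back-of-envelope check (our arithmetic, not a figure from the slide), sustaining an exaflop across roughly 25k GPUs implies a per-GPU rate of about

$$ \frac{10^{18}\ \mathrm{FLOP/s}}{2.5\times 10^{4}\ \mathrm{GPUs}} = 4\times 10^{13}\ \mathrm{FLOP/s} \approx 40\ \mathrm{TF\ per\ GPU}, $$

which is plausible for mixed-precision tensor-core arithmetic on V100 (125 TF peak per GPU).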

SLIDE 6

SC18 NVIDIA Announcements on NWP Models

*Speedups comparing 2 x Skylake CPU vs. 4 x V100 GPU


SLIDE 7

New NVIDIA AI Tech Centre at Reading University

https://blogs.nvidia.com/blog/2019/06/19/ai-technology-center-uk/

The Advanced Computing for Environmental Science (ACES) research group conducts cutting-edge research in computer science to accelerate environmental science. Environmental science depends on the analysis of large volumes of observational data and on sophisticated simulation schemes that couple different physics across multiple time and spatial scales, demanding both supercomputing and specialised data analysis systems. ACES research themes address the future of the relevant computing and data systems. ACES is based in the Computer Science Department at the University of Reading.

SLIDE 8

TOPICS

  • ACHIEVEMENTS IN HPC AND AI
  • NUMERICAL MODEL DEVELOPMENTS
  • GPU UPDATE ON MPAS-A
SLIDE 9

NVIDIA Collaborations With Atmospheric Models (global and regional)

Model         | Organizations              | Funding Source
E3SM-EAM, SAM | US DOE: ORNL, SNL          | E3SM, ECP
MPAS-A        | NCAR, UWyo, KISTI, IBM     | WACA II
FV3/UFS       | NOAA                       | SENA
NUMA/NEPTUNE  | US Naval Res Lab, NPS      | ONR
IFS           | ECMWF                      | ESCAPE
GungHo/LFRic  | MetOffice, STFC            | PSyclone
ICON          | DWD, MPI-M, CSCS, MCH      | PASC ENIAC
KIM           | KIAPS                      | KMA
CLIMA         | CLiMA (NASA JPL, MIT, NPS) | Private, US NSF
FV3           | Vulcan, UW/Bretherton      | Private
COSMO         | MCH, CSCS, DWD             | PASC GridTools
AceCAST-WRF   | TempoQuest                 | Venture backed

SLIDE 10

WRFg, based on ARW release 3.8.1, with several science-ready features:

  • Full WRF on GPU; 21 physics options
  • Complete nesting functionality

Request download: https://wrfg.net

WRFg Collaboration with TempoQuest

WRFg Physics Options (21)

Category                 | Options (namelist option number)
Microphysics             | Kessler (1), WSM6 (6), Thompson (8), Morrison (10), Aerosol-aware Thompson (28)
Radiation                | Dudhia sw (1), RRTMG lw + sw (4)
Planetary boundary layer | YSU (1), MYNN (5)
Surface layer            | Revised MM5 (1), MYNN (5)
Land surface             | 5-layer TDS (1), Unified Noah (2), RUC (3)
Cumulus                  | Kain-Fritsch (1; 11; 99), BMJ (2), Grell-Devenyi (93), GRIMS shallow cumulus (shcu = 3)

SLIDE 11

GPU Performance Study for the WRF Model on Summit

  • Jeff Adie, NVIDIA, Gökhan Sever, Rajeev Jain, DOE Argonne NL, and Stan Posey, NVIDIA

NV-WRFg Summit Scaling on 512 Nodes / 3,072 GPUs

Joint WRF and MPAS Users' Workshop 2019 NCAR, Boulder, USA

Chart notes (lower is better):
  • ORNL Summit node: 2 x P9 + 6 x V100
  • OpenACC, PGI 19.1; based on NCAR WRF 3.7.1
  • WRF model configuration: 3.7B total cells; Thompson MP; RRTM / Dudhia; YSU PBL; Revised MM5+TDS4
  • Timings compare full model with and without radiation

Multiscale Coupled Urban Systems – PI, C. Catlett

http://www2.mmm.ucar.edu/wrf/users/workshops/WS2019/oral_presentations/3.3.pdf

SLIDE 12

MeteoSwiss Operational COSMO NWP on GPUs

COSMO NWP configurations during 2020, with V100 GPUs:
  • COSMO-1E (1 km): 11-member ensemble, 8 runs per day, 33 hr forecast
  • COSMO-2E (2 km): 21-member ensemble, 4 runs per day, 5 day forecast
  • IFS from ECMWF: 4 per day, 18 km / 9 km (?)

MeteoSwiss roadmap:
  • New V100 system in 2019: 18 nodes x 8 x V100 = 144 total GPUs
  • New EPS configurations operational in 2020
  • New ICON-LAM in ~2022 (pre-operational in 2020)

SLIDE 13

Source: https://www.geosci-model-dev-discuss.net/gmd-2017-230/

Large Scale COSMO HPC Demonstration Using ~5000 GPUs

COSMO 1km Near-Global Atmosphere on GPUs

SLIDE 14

COSMO 1km Near-Global Atmosphere on GPUs

  • Oliver Fuhrer, et al, MeteoSwiss
  • Strong scaling to 4,888 x P100 GPUs on Piz Daint (#6 Top500, 25.3 PetaFLOPS, 5,320 x P100 GPUs)

(Chart, higher is better: throughput of 19 km, 3.7 km, 1.9 km, and 0.93 km configurations on GPU vs. CPU, with GPU gains of > 10x and ~5x.)

Source: https://www.geosci-model-dev-discuss.net/gmd-2017-230/

SLIDE 15

NOAA FV3 GPU Strategy Includes OpenACC and GridTools

  • From Presentation by Dr. Xi Chen, NOAA GFDL, PASC 19, June 2019, Zurich, CH

NOAA FV3 and GPU Developments (Chen – 2019)

  • NOAA FV3 GPU strategy spans OpenACC [X. Chen, others] and GridTools [C. Bretherton, O. Fuhrer]
  • 2012: Early GPU development by NASA GSFC GMAO [W. Putman, others]
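To make the OpenACC half of that strategy concrete, below is a minimal sketch of the directive pattern such ports rely on: a generic horizontal-diffusion-style stencil with explicit data clauses. The function, field names, and 5-point operator are hypothetical illustrations, not FV3 code.

```c
/* Hypothetical stencil: illustrates the OpenACC offload pattern used
 * when porting dycore loops to GPU; not an actual FV3 kernel. */
#include <stddef.h>

void horiz_diffuse(int ni, int nj, int nk, double coef,
                   const double *restrict in, double *restrict out)
{
    /* collapse(3) exposes the full i/j/k iteration space to the GPU;
     * data clauses make the host<->device transfers explicit. */
    #pragma acc parallel loop collapse(3) \
        copyin(in[0:ni*nj*nk]) copy(out[0:ni*nj*nk])
    for (int k = 0; k < nk; k++)
        for (int j = 1; j < nj - 1; j++)
            for (int i = 1; i < ni - 1; i++) {
                const size_t c = ((size_t)k * nj + j) * ni + i;
                out[c] = in[c] + coef *
                    (in[c-1] + in[c+1] + in[c-ni] + in[c+ni] - 4.0*in[c]);
            }
}
```

In production ports the data clauses are typically hoisted into an enclosing `acc data` region so fields stay resident on the GPU across many timesteps.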

SLIDE 16

Location - Date   | Organizations    | Model(s)        | Hackathon Project
KISTI (KR) - Feb  | KISTI            | MPAS            | Physics (WSM6)
CAS (CN) - May    | CMA              | GRAPES          | PRM advection
ETH Zurich - Jun  | MCH, MPI-M, CSCS | ICON            | Physics, radiation
MIT - Jun         | MIT, CliMA       | CliMA Ocean     | Subgrid scale LES
Princeton - Jun   | NOAA GFDL        | FV3/UFS         | SWE mini-app kernels, UFS radiation package
NERSC - Jul       | DOE LBNL         | E3SM            | MMF (ECP)
Met Office - Sep  | Met Office, STFC | NEMOVAR, WW III | Miniapp (?)
ORNL - Oct        | DOE ORNL, SNL    | E3SM            | SCREAM (Kokkos)
2019 ORNL Hackathons and GPU Model Progress

https://www.olcf.ornl.gov/for-users/training/gpu-hackathons/


SLIDE 18

CliMA: New Climate Model Development

https://blogs.nvidia.com/blog/2019/07/17/clima-climate-model/

(Diagram: global model informed by observations and super-parameterization, spanning atmosphere and ocean.)

https://clima.caltech.edu/

SLIDE 19

Pushing the Envelope in Ocean Modeling

  • From Keynote Presentation by Dr. John Marshall, MIT, at Oxford AI Workshop, Sep 2019, Oxford, UK

Ocean LES Super Parameterization

CliMA: New Climate Model Development

https://clima.caltech.edu/

SLIDE 20

“Optimization of an OpenACC Weather Simulation Kernel”

  • A. Gray, NVIDIA

  • 30x improvement from NVIDIA optimizations over original Met Office code

OpenACC GPU Development for LFRic Model

OpenACC collaboration with MetOffice and STFC: LFRic model

  • GungHo-MV (matrix-vector operations) OpenACC kernel developed by MetOffice
  • NVIDIA optimizations applied to the OpenACC kernel achieved a 30x improvement
  • Improved OpenACC code provided to STFC as ‘target’ for PSyclone auto-generation

https://www.openacc.org/blog/optimizing-openacc-weather-simulation-kernel
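As a rough illustration of the shape of that workload (not the actual GungHo-MV code; names, sizes, and layout here are invented), think of many small, independent local matrix-vector products, one per mesh column:

```c
/* Hypothetical GungHo-MV-like pattern: ncols independent m-by-n
 * local matrix-vector products, mapped to OpenACC gangs/vectors. */
#include <stddef.h>

void matvec_columns(int ncols, int m, int n,
                    const double *restrict A,  /* ncols blocks, m x n */
                    const double *restrict x,  /* ncols vectors of n  */
                    double *restrict y)        /* ncols vectors of m  */
{
    /* One gang per column, vector parallelism across output rows. */
    #pragma acc parallel loop gang \
        copyin(A[0:ncols*m*n], x[0:ncols*n]) copyout(y[0:ncols*m])
    for (int c = 0; c < ncols; c++) {
        #pragma acc loop vector
        for (int i = 0; i < m; i++) {
            double sum = 0.0;
            for (int j = 0; j < n; j++)
                sum += A[((size_t)c * m + i) * n + j] * x[(size_t)c * n + j];
            y[(size_t)c * m + i] = sum;
        }
    }
}
```

Speedups of the magnitude reported typically come less from the directives themselves than from restructuring loops for coalesced memory access and keeping fields resident on the device between kernels.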

SLIDE 22

ESCAPE Development of Weather & Climate Dwarfs

  • Batched Legendre transform (GEMM), variable length
  • Batched 1D FFT, variable length

NVIDIA-Developed Dwarf: Spectral Transform - Spherical Harmonics

ESCAPE = Energy efficient SCalable Algorithms for weather Prediction at Exascale
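A simplified sketch of how those two batched building blocks map onto standard CUDA libraries follows. The real dwarf batches variable lengths (transform sizes vary with latitude/wavenumber), while strided batched GEMM assumes uniform sizes, so this shows only the uniform-size case; all pointers are assumed device-resident, and every name here is ours, not ESCAPE's.

```c
/* Sketch: Legendre transform as a batched GEMM (cuBLAS) followed by
 * batched 1D FFTs (cuFFT). Uniform batch sizes for brevity; the real
 * dwarf groups work by variable transform length. */
#include <cublas_v2.h>
#include <cufft.h>

void spectral_transform_sketch(cublasHandle_t blas,
                               int m, int n, int k, int batch,
                               const double *A, const double *B, double *C,
                               cufftDoubleComplex *grid, int fft_len, int nfft)
{
    const double alpha = 1.0, beta = 0.0;

    /* Batched Legendre transform: C_i = A_i * B_i for each batch member. */
    cublasDgemmStridedBatched(blas, CUBLAS_OP_N, CUBLAS_OP_N,
                              m, n, k, &alpha,
                              A, m, (long long)m * k,
                              B, k, (long long)k * n,
                              &beta,
                              C, m, (long long)m * n,
                              batch);

    /* Batched 1D complex-to-complex FFTs along latitude circles. */
    cufftHandle plan;
    cufftPlan1d(&plan, fft_len, CUFFT_Z2Z, nfft);
    cufftExecZ2Z(plan, grid, grid, CUFFT_INVERSE);  /* spectral -> grid */
    cufftDestroy(plan);
}
```

In practice the FFT plan would be created once and reused every timestep rather than rebuilt per call.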

SLIDE 23

ECMWF IFS Dwarf Optimizations – Single-GPU

From “ECMWF Scalability Programme”, Dr. Peter Bauer, UM User Workshop, MetOffice, Exeter, UK, 15 June 2018

  • Single V100 GPU improved the SH dwarf by 14x vs. original
  • Single V100 GPU improved the MPDATA advection dwarf by 57x vs. original

(Chart: optimized vs. unoptimized dwarf speedups; SH dwarf = 14x, advection dwarf = 57x.)

SLIDE 24

ECMWF IFS SH Dwarf Optimization – Multi-GPU

From “ECMWF Scalability Programme”, Dr. Peter Bauer, UM User Workshop, MetOffice, Exeter, UK, 15 June 2018

  • Results of Spherical Harmonics dwarf on NVIDIA DGX system
  • Additional 2.4x gain from DGX-2 NVSwitch for 16-GPU systems (16 GPUs per single node)

SLIDE 25

TOPICS

  • ACHIEVEMENTS IN HPC AND AI
  • NUMERICAL MODEL DEVELOPMENTS
  • GPU UPDATE ON MPAS-A