
GPU Acceleration of Weather Forecasting and Meteorological Satellite Data Assimilation, Processing and Applications

Allen Huang, Ph.D. (allen@tempoquest.com), CTO, Tempo Quest Inc.
http://www.tempoquest.com
GTC 2016, San Jose, CA, 5 April 2016



SLIDE 2
  • Why weather forecasts are not accurate enough
    – Model is not perfect yet: evolving scientific understanding & algorithm development
    – Data is not always accurate: actual and accurate initial data are expensive to collect & process
    – High performance computing is expensive: can only afford limited resources to deploy & operate HPC
  • Acceleration of Weather Forecasting S/W
    – Same forecasts faster, much faster
    – Better forecasts take much more computation
      • Location, timing, intensity, next hour, tomorrow, next week, …
      • Most of the legacy S/W can’t take advantage of the new H/W
  • Acceleration of Satellite Data Processing
    – Hyperspectral Data Retrieval
    – Hyperspectral Data Compression
  • Summary


SLIDE 4

Why are the Weather Forecast Models not accurate enough?

Three critical factors:

  • 1. Imperfect MODEL
  • 2. Lack of/Erroneous INITIAL DATA/CONDITIONS
    – No data or sparse coverage, infrequent
    – Unknown attributes; not coupled
  • 3. Lack of COMPUTING POWER

SLIDE 5

Why are the Weather Forecast Models not accurate enough?

Three critical factors:

  • 1. Imperfect MODEL
  • 2. Lack of/Erroneous INITIAL DATA/CONDITIONS
  • 3. Lack of COMPUTING POWER
    – Increasing needs of ensemble runs
    – Increasing demands for higher resolution
    – Increasing frequency of assimilations
    – Increasing model complexity

Resulting in high demand for computing resources

100,000 to 200,000 CPU cores required for:

  • Global cloud resolving: NIM @2KM resolution, 2x/day
  • Regional Models: North American (NA) Domain HRRR @<1KM, hourly
  • Ensembles: HRRR @3KM NA, 100 members, hourly

Reference: 250,000 CPUs cost ~$100M, draw 7,000 kW, and run up a ~$8M/year energy bill
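The reference figures above imply an electricity price, which can be checked with a short calculation. This is a minimal sketch in Python, assuming the 7,000 kW draw is continuous over a 365-day year; the function name is illustrative and does not come from the talk.

```python
HOURS_PER_YEAR = 365 * 24  # assumption: continuous operation, non-leap year

def implied_price_per_kwh(power_kw, annual_bill_usd):
    """Electricity price (USD/kWh) implied by a constant power draw and a yearly bill."""
    energy_kwh = power_kw * HOURS_PER_YEAR
    return annual_bill_usd / energy_kwh

# Figures quoted on the slide: 7,000 kW and ~$8M per year.
price = implied_price_per_kwh(power_kw=7_000, annual_bill_usd=8_000_000)
print(f"Implied electricity price: ${price:.3f}/kWh")
```

Under these assumptions the quoted figures correspond to roughly $0.13 per kWh.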

SLIDE 6

Why are the Weather Forecast Models not accurate enough?

[Figure: Operational (T574, ~27 km) vs. Experiment (T1500, ~13 km)]

Note: Last 24h of the high resolution experiment track based on 6h model output

2X resolution ≈ 10X computing cost
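The "2X resolution ≈ 10X computing cost" rule of thumb can be turned into a rough scaling estimate. The sketch below is an illustration in Python, not a method from the talk: it assumes cost follows a power law in the resolution ratio and fits the exponent to the slide's single data point. For comparison it also uses an idealized exponent of 3 (two horizontal dimensions plus a proportionally shorter time step), which is a textbook assumption rather than something the slide states.

```python
import math

# Slide's rule of thumb: refining the grid spacing by 2x costs about 10x.
REFINEMENT = 2.0
COST_FACTOR = 10.0

# Exponent k in cost ~ refinement**k, fitted to that single data point.
empirical_exponent = math.log(COST_FACTOR) / math.log(REFINEMENT)

# Idealized exponent (assumption): 2 horizontal dimensions + 1 for the time step.
IDEAL_EXPONENT = 3.0

def cost_multiplier(coarse_km, fine_km, exponent=empirical_exponent):
    """Estimated relative cost of moving from coarse_km to fine_km grid spacing."""
    return (coarse_km / fine_km) ** exponent

print(f"empirical exponent: {empirical_exponent:.2f}")
print(f"27 km -> 13 km (empirical): {cost_multiplier(27, 13):.1f}x")
print(f"27 km -> 13 km (idealized): {cost_multiplier(27, 13, IDEAL_EXPONENT):.1f}x")
```

The gap between the fitted exponent and the idealized one is a reminder that real costs grow faster than simple grid-count arguments suggest.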

SLIDE 7

1 Zflops = 10^21 flops
1 million trillion (1 billion billion) flop per sec = 1 exaflops (10^18 flops)

SLIDE 8
  • Why weather forecasts are not accurate enough
    – Model is not perfect yet: evolving scientific understanding & algorithm development
    – Data is not always accurate: actual and accurate initial data are expensive to collect & process
    – High performance computing is expensive: can only afford limited resources to deploy & operate HPC
  • Acceleration of Satellite Data Processing
    – Hyperspectral Data Retrieval
    – Hyperspectral Data Compression
  • Acceleration of Weather Forecasting S/W
    – Same forecasts faster, much faster
    – Better forecasts take much more computation
      • Location, timing, intensity, next hour, tomorrow, next week, …
      • Most of the legacy S/W can’t take advantage of the new H/W
  • Summary


SLIDE 10

Processing times – CPU vs. GPU, Early Result (2009)

Our experiments used an Intel i7 970 CPU running at 3.20 GHz and one of the two GPUs on an NVIDIA GTX 590.

Implementation                      Time [ms]
The original Fortran code on CPU    16928
CUDA C with I/O on GPU              83.6
CUDA C without I/O on GPU           48.3
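The speedups implied by these timings follow directly from the ratio of CPU time to GPU time. A minimal sketch in Python, using only the three measurements above; the dictionary keys and function name are illustrative.

```python
# Timings quoted above, in milliseconds.
timings_ms = {
    "fortran_cpu": 16928.0,
    "cuda_with_io": 83.6,
    "cuda_without_io": 48.3,
}

def speedup(cpu_ms, gpu_ms):
    """Speedup of the GPU run relative to the CPU run."""
    return cpu_ms / gpu_ms

with_io = speedup(timings_ms["fortran_cpu"], timings_ms["cuda_with_io"])
without_io = speedup(timings_ms["fortran_cpu"], timings_ms["cuda_without_io"])

print(f"speedup with I/O:    {with_io:.0f}x")
print(f"speedup without I/O: {without_io:.0f}x")
```

These come out to about 202x with I/O and 350x without I/O.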

SLIDE 11

The Fast Radiative Transfer Model

Without losing the generality of our GPU implementation, we consider the following radiative transfer model:

$$R_\nu = \varepsilon_\nu B_\nu(T_s)\,\tau_\nu(p_s) + \int_{p_s}^{0} B_\nu\big(T(p)\big)\,\frac{d\tau_\nu(p)}{dp}\,dp$$

with the regression-based transmittances.
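To make the structure of this radiative transfer model concrete, the following is a minimal Python sketch of a discretized clear-sky version: a surface term plus a sum over layers of the Planck radiance times the change in level-to-space transmittance. It is not the GPU implementation from the talk. The transmittance profile is passed in as a hypothetical input rather than computed by regression, and the layer discretization, function names, and sample values are assumptions for illustration.

```python
import math

# Physical constants (SI units)
H = 6.62607015e-34   # Planck constant [J s]
C = 2.99792458e8     # speed of light [m/s]
K = 1.380649e-23     # Boltzmann constant [J/K]

def planck(wavenumber_cm, temp_k):
    """Planck radiance per unit wavenumber at a wavenumber given in cm^-1."""
    nu = wavenumber_cm * 100.0  # convert cm^-1 to m^-1
    return 2.0 * H * C**2 * nu**3 / math.expm1(H * C * nu / (K * temp_k))

def radiance(wavenumber_cm, temps, taus, t_surf, emissivity=1.0):
    """Discretized top-of-atmosphere radiance.

    temps[i] : mean temperature of layer i, ordered from the surface upward
    taus[j]  : level-to-space transmittance at level j; taus[0] is at the
               surface and taus[-1] (top of atmosphere) should be 1.0
    """
    if len(taus) != len(temps) + 1:
        raise ValueError("need one more transmittance level than layers")
    # Surface term: emitted surface radiance attenuated by the whole column.
    total = emissivity * planck(wavenumber_cm, t_surf) * taus[0]
    # Atmospheric term: each layer emits B(T) weighted by its transmittance change.
    for i, t in enumerate(temps):
        total += planck(wavenumber_cm, t) * (taus[i + 1] - taus[i])
    return total

# Hypothetical profile for illustration only.
temps = [280.0, 260.0, 240.0, 220.0]
taus = [0.30, 0.50, 0.75, 0.92, 1.00]
r = radiance(900.0, temps, taus, t_surf=288.0, emissivity=0.98)
print(f"radiance at 900 cm^-1: {r:.3e}")
```

A useful sanity check on such a discretization is the isothermal case: with a black surface and every layer at the surface temperature, the result should reduce to the Planck radiance at that temperature. Each channel is evaluated independently of the others, which is what makes a calculation of this shape well suited to massively parallel hardware.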


SLIDE 13

GPU-based Multi-input RTM

  • A forward model to concurrently compute 40 radiance spectra was further developed to take advantage of the GPU’s massive parallelism.

To compute one day's worth of IASI spectra (1,296,000), the original RTM (compiled with -O2 optimization) takes ~10 days on a 3.0 GHz CPU core; the single-input GPU-RTM takes ~10 minutes (a 1455x speedup), whereas the multi-input GPU-RTM takes ~5 minutes (a 3024x speedup).
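The quoted run times can be cross-checked against the quoted speedups. The Python sketch below does that arithmetic, assuming the ~10 day CPU figure is exact and that the multi-input model processes spectra in groups of 40; the variable and function names are illustrative.

```python
import math

SPECTRA_PER_DAY = 1_296_000
CPU_DAYS = 10.0      # approximate CPU time quoted above, treated as exact here
BATCH_SIZE = 40      # spectra computed concurrently by the multi-input model

cpu_seconds = CPU_DAYS * 24 * 3600

def gpu_minutes(speedup):
    """GPU wall time in minutes implied by a speedup over the CPU baseline."""
    return cpu_seconds / speedup / 60.0

cpu_seconds_per_spectrum = cpu_seconds / SPECTRA_PER_DAY
batches = math.ceil(SPECTRA_PER_DAY / BATCH_SIZE)  # groups of 40 spectra

print(f"CPU time per spectrum: {cpu_seconds_per_spectrum:.2f} s")
print(f"single-input GPU-RTM:  {gpu_minutes(1455):.1f} min")
print(f"multi-input GPU-RTM:   {gpu_minutes(3024):.1f} min")
print(f"batches of {BATCH_SIZE} spectra: {batches:,}")
```

The results, about 9.9 and 4.8 minutes, agree with the rounded "~10 minutes" and "~5 minutes" figures.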

SLIDE 14

GPU Acceleration of Satellite Hyper Spectral Maximum Likelihood Retrieval


SLIDE 15

GPU Acceleration of Predictive Partitioned Vector Quantization for Ultraspectral Sounder Data Compression


SLIDE 16
  • Why weather forecasts are not accurate enough
    – Model is not perfect yet: evolving scientific understanding & algorithm development
    – Data is not always accurate: actual and accurate initial data are expensive to collect & process
    – High performance computing is expensive: can only afford limited resources to deploy & operate HPC
  • Acceleration of Satellite Data Processing
    – Hyperspectral Data Retrieval
    – Hyperspectral Data Compression
  • Acceleration of Weather Forecasting S/W
    – Same forecasts faster, much faster
    – Acceleration of Weather Research and Forecasting (WRF) Model
      • Radiation, PBL, Surface
      • Cumulus Parameterization, Cloud Microphysics and Dynamic Core
  • Summary

SLIDE 17

CONtinental United States (CONUS) benchmark data set for 12 km resolution domain for October 24, 2001

  • The size of the CONUS 12 km domain is 433 x 308 horizontal grid points with 35 vertical levels.
  • The test problem is a 12 km resolution 48-hour forecast over the Continental U.S. capturing the development of a strong baroclinic cyclone and a frontal boundary that extends from north to south across the entire U.S.
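The domain size gives a sense of how much parallel work is available. As a rough illustration in Python, the counts below assume a one-thread-per-vertical-column mapping, which is a common pattern for column physics but is an assumption here, as is the block size of 64.

```python
import math

NX, NY, NZ = 433, 308, 35    # horizontal grid points and vertical levels quoted above

columns = NX * NY            # independent vertical columns
grid_points = columns * NZ   # total 3-D grid points

THREADS_PER_BLOCK = 64       # hypothetical CUDA block size, not from the talk

def blocks_needed(n_items, threads_per_block=THREADS_PER_BLOCK):
    """Number of thread blocks needed to give each item its own thread."""
    return math.ceil(n_items / threads_per_block)

print(f"columns:     {columns:,}")
print(f"grid points: {grid_points:,}")
print(f"blocks for one thread per column: {blocks_needed(columns):,}")
```

With more than 130,000 columns, even a one-thread-per-column mapping exposes far more parallelism than a multicore CPU can use at once.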


SLIDE 19

Category   Module                                 GPU speedup    MIC factor   Reference
Radiation  RRTMG LW                               123x / 127x    -            (GPU) JSTARS, 7, 3660-3667, 2014
Radiation  RRTMG SW                               202x / 207x    -            (GPU) JSTARS, PP, 1-11, 2015
Radiation  Goddard SW                             92x / 134x     -            (GPU) JSTARS, 5, 555-562, 2012
Radiation  Dudhia SW                              19x / 409x     -            -
Surface    MYNN SL                                6x / 113x      -            -
Surface    TEMF SL                                5x / 214x      -            -
Surface    Thermal Diffusion LS                   10x / 311x     [2.1x]       (GPU) JSTARS, 8, 2249-2259, 2015
PBL        YSU PBL                                34x / 193x     [2.4x]       (GPU) GMD, 8, 2977-2990, 2015
PBL        TEMF PBL                               -              [14.8x]      (MIC) SPIE: doi:10.1117/12.2055040
CU P       Betts-Miller-Janjic (BMJ) convection   55x / 105x     -            -

GPU speedup: speedup with I/O / speedup without I/O.
MIC improvement factor in [ ]: with respect to the first version of the multi-threading code, before any improvement.

SLIDE 20

Cloud Microphysics

Module            GPU speedup    MIC factor   Reference
Kessler MP        70x / 816x     -            J. Comp. & GeoSci., 52, 292-299, 2012
Purdue-Lin MP     156x / 692x    [4.2x]       (GPU) SPIE: doi:10.1117/12.901825
WSM 3-class MP    150x / 331x    -            -
WSM 5-class MP    202x / 350x    -            (GPU) JSTARS, 5, 1256-1265, 2012
Eta MP            37x / 272x     -            SPIE: doi:10.1117/12.976908
WSM 6-class MP    165x / 216x    -            (GPU) J. Comp. & GeoSci., 83, 17-26, 2015
Goddard GCE MP    348x / 361x    [4.7x]       (GPU) JSTARS, 8, 2260-2272, 2015
Thompson MP       76x / 153x     [2.3x]       (MIC) SPIE: doi:10.1117/12.2055038
SBU 5-class MP    213x / 896x    -            JSTARS, 5, 625-633, 2012
WDM 5-class MP    147x / 206x    -            -
WDM 6-class MP    150x / 206x    -            J. Atmo. Ocean. Tech., 30, 2896, 2013

GPU speedup: speedup with I/O / speedup without I/O.
MIC improvement factor in [ ]: with respect to the first version of the multi-threading code, before any improvement.
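Because each entry reports the speedup with and without I/O against the same CPU baseline, the pair also indicates how much of the GPU run goes to I/O. If the CPU time is T, the GPU times are T / s_io and T / s_no_io, so the I/O share of the GPU wall time is 1 - s_io / s_no_io. The Python sketch below applies that derivation to a few of the entries above; it assumes the "with I/O" time is simply compute time plus I/O time, and the function name is illustrative.

```python
def io_share(speedup_with_io, speedup_without_io):
    """Fraction of GPU wall time spent on I/O, derived from the two speedups."""
    return 1.0 - speedup_with_io / speedup_without_io

# (speedup with I/O, speedup without I/O) for a few modules listed above.
modules = {
    "Kessler MP": (70, 816),
    "WSM 5-class MP": (202, 350),
    "Goddard GCE MP": (348, 361),
    "Dudhia SW": (19, 409),
}

for name, (s_io, s_no_io) in modules.items():
    print(f"{name:16s} I/O share of GPU run: {io_share(s_io, s_no_io):.1%}")
```

Modules with a large gap between the two numbers are dominated by I/O rather than by computation, so they stand to gain the most from keeping data resident on the GPU between modules.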

SLIDE 21

Tempo Quest Inc. (TQI) S/W Product Pipeline

Weather/Environment Domain

  • AceCAST Lite: 6 months out
    – Pre AceCAST (CPU/GPU “Hybrid” WRF)
  • AceCAST: 12 months out (subject to funding)
    – CUDA GPU WRF
  • Beyond AceCAST: 2-3 years out (subject to funding)
    – DataCAST (CUDA WRF Data Assimilation)
    – ChemCAST (CUDA WRF Chem)
    – HurCAST (CUDA Hurricane WRF)
    – HydroCAST (CUDA WRF Hydro)
    – FireCAST (CUDA WRF Fire)

SLIDE 22

Thank you for your attention. Questions are welcome.
allen@tempoquest.com