http://www.tempoquest.com Allen Huang, Ph.D. allen@tempoquest.com - PowerPoint PPT Presentation

GPU Acceleration of Weather Forecasting and Meteorological Satellite Data Assimilation, Processing and Applications
Allen Huang, Ph.D. (allen@tempoquest.com), CTO, Tempo Quest Inc., http://www.tempoquest.com
GTC 2016, San Jose, CA, 5 April 2016

GPU Acceleration of Weather Forecasting and Meteorological Satellite Data Assimilation, Processing and Applications
• Why weather forecasts are not accurate enough
  – The model is not perfect yet: scientific understanding and algorithm development are still evolving
  – Data is not always accurate: actual and accurate initial data are expensive to collect and process
  – High-performance computers are expensive: only limited resources can be afforded to deploy and operate HPC
• Acceleration of Weather Forecasting S/W
  – Same forecasts faster, much faster
  – Better forecasts take much more computation
    • Location, timing, intensity, next hour, tomorrow, next week, ...
    • Most of the legacy S/W can't take advantage of the new H/W
• Acceleration of Satellite Data Processing
  – Hyperspectral Data Retrieval
  – Hyperspectral Data Compression
• Summary

Why are the Weather Forecast Models not accurate enough?
Three critical factors:
1. Imperfect MODEL
2. Lack of/Erroneous INITIAL DATA/CONDITIONS
   • No data or sparse coverage, infrequent
   • Unknown attributes; not coupled
3. Lack of COMPUTING POWER

Why are the Weather Forecast Models not accurate enough?
Three critical factors:
1. Imperfect MODEL
2. Lack of/Erroneous INITIAL DATA/CONDITIONS
3. Lack of COMPUTING POWER
   • Increasing need for ensemble runs
   • Increasing demand for higher resolution
   • Increasingly frequent assimilations
   • Increasing model complexity
   Resulting in high demand for computing resources

100,000 to 200,000 CPU cores required for:
   • Global cloud-resolving NIM @ 2 km resolution, 2x/day
   • Regional models, North American (NA) domain: HRRR @ <1 km, hourly
   • Ensembles: HRRR @ 3 km NA, 100 members, hourly
Reference: 250,000 CPUs cost ~$100M and use 7,000 kW, a ~$8M/year energy bill
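As a rough plausibility check on the reference figures above (7,000 kW and ~$8M/year), the sketch below works out the yearly energy use and the electricity price those two numbers imply. It assumes the system draws the full 7,000 kW continuously for a 365-day year, which the slide does not state; the function names are illustrative only.

```python
# Sanity check of the slide's energy figures.
# Assumption (not in the source): constant full-power draw, 365-day year.

POWER_KW = 7_000              # from the slide
ANNUAL_BILL_USD = 8_000_000   # from the slide (approximate)
HOURS_PER_YEAR = 24 * 365


def annual_energy_kwh(power_kw: float, hours: int = HOURS_PER_YEAR) -> float:
    """Energy consumed in one year at constant power, in kWh."""
    return power_kw * hours


def implied_price_per_kwh(bill_usd: float, power_kw: float) -> float:
    """Electricity price (USD/kWh) implied by a yearly bill and a power draw."""
    return bill_usd / annual_energy_kwh(power_kw)


energy_kwh = annual_energy_kwh(POWER_KW)
price = implied_price_per_kwh(ANNUAL_BILL_USD, POWER_KW)
print(f"Energy per year: {energy_kwh / 1e6:.2f} GWh")
print(f"Implied price:   ${price:.3f} per kWh")
```

Under that assumption the two figures are mutually consistent with an electricity price on the order of 13 cents per kWh.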

Why are the Weather Forecast Models not accurate enough?
[Figure: Operational (T574, ~27 km) vs. Experiment (T1500, ~13 km)]
Note: the last 24 h of the high-resolution experiment track is based on 6 h model output.
2X resolution ≈ 10X computing cost
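The rule of thumb "2X resolution ≈ 10X computing cost" can be turned into a simple power law for other refinement factors. The sketch below is one way to do that and is an assumption, not the presenter's method: it fits the exponent so that doubling resolution gives exactly 10x. For comparison, counting only two horizontal dimensions plus a proportionally shorter time step would give 2^3 = 8x.

```python
import math

# Exponent fitted to the slide's rule of thumb: 2x resolution -> 10x cost.
# Treating the cost as a power law in the refinement factor is an assumption.
RULE_OF_THUMB_EXPONENT = math.log2(10.0)   # about 3.32


def cost_factor(refinement: float, exponent: float = RULE_OF_THUMB_EXPONENT) -> float:
    """Relative computing cost when grid spacing shrinks by `refinement`."""
    return refinement ** exponent


# Grid spacings quoted on the slide: operational ~27 km, experiment ~13 km.
refinement_27_to_13 = 27.0 / 13.0
print(f"2x refinement:    {cost_factor(2.0):.1f}x cost")
print(f"27 km -> 13 km:   {cost_factor(refinement_27_to_13):.1f}x cost")
print(f"Naive 2^3 count:  {cost_factor(2.0, exponent=3.0):.1f}x cost")
```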

1 Zflops = 10^21 flops (1,000 exaflops)
1 exaflops = 10^18 flops: 1 million trillion (1 billion billion) flop per sec

Processing times: CPU vs. GPU, early result (2009)

Implementation | Time [ms]
Original Fortran code on CPU | 16928
CUDA C with I/O on GPU | 83.6
CUDA C without I/O on GPU | 48.3

The experiments were run on an Intel i7 970 CPU at 3.20 GHz and on a single GPU of the two on an NVIDIA GTX 590.
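The speedups implied by these timings follow from dividing the CPU time by each GPU time. The short sketch below does that arithmetic on the three numbers from the slide; the variable and function names are illustrative.

```python
# Timings from the slide, in milliseconds.
CPU_FORTRAN_MS = 16928.0
GPU_WITH_IO_MS = 83.6
GPU_WITHOUT_IO_MS = 48.3


def speedup(baseline_ms: float, accelerated_ms: float) -> float:
    """How many times faster the accelerated run is than the baseline."""
    return baseline_ms / accelerated_ms


speedup_with_io = speedup(CPU_FORTRAN_MS, GPU_WITH_IO_MS)
speedup_without_io = speedup(CPU_FORTRAN_MS, GPU_WITHOUT_IO_MS)

print(f"Speedup with I/O:    {speedup_with_io:.1f}x")
print(f"Speedup without I/O: {speedup_without_io:.1f}x")
```

The results, about 202x with I/O and about 350x without, show how much of the GPU wall time goes to moving data rather than computing.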

The Fast Radiative Transfer Model
Without losing the generality of our GPU implementation, we consider the following radiative transfer model:

$$R_\nu = \varepsilon_\nu B_\nu(T_s)\,\tau_\nu(p_s) + \int_{p_s}^{0} B_\nu\big(T(p)\big)\,\frac{d\tau_\nu(p)}{dp}\,dp$$

with the regression-based transmittances.
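To make the structure of such a forward model concrete, here is a minimal single-channel sketch. It assumes the common clear-sky form: a surface term (emissivity times Planck radiance times surface-to-space transmittance) plus an atmospheric term that sums Planck radiance weighted by the change in transmittance across each layer. The regression that produces the transmittances is not reproduced; the transmittance and temperature values below are made up for illustration, and the function names are hypothetical.

```python
import math

H = 6.62607015e-34    # Planck constant [J s]
C = 2.99792458e8      # speed of light [m/s]
K_B = 1.380649e-23    # Boltzmann constant [J/K]


def planck(wavenumber_cm: float, temperature_k: float) -> float:
    """Planck radiance per unit wavenumber [W m^-2 sr^-1 (m^-1)^-1]."""
    nu = wavenumber_cm * 100.0   # cm^-1 -> m^-1
    return 2.0 * H * C**2 * nu**3 / math.expm1(H * C * nu / (K_B * temperature_k))


def radiance(wavenumber_cm, emissivity, surface_temp_k, layer_temps_k, level_trans):
    """Discretized clear-sky radiance for one channel.

    level_trans[i] is the level-to-space transmittance, ordered from the
    surface (index 0) to the top of the atmosphere (last entry, 1.0).
    layer_temps_k[i] is the temperature of the layer between levels i and i+1.
    """
    surface = emissivity * planck(wavenumber_cm, surface_temp_k) * level_trans[0]
    atmosphere = sum(
        planck(wavenumber_cm, t) * (level_trans[i + 1] - level_trans[i])
        for i, t in enumerate(layer_temps_k)
    )
    return surface + atmosphere


# Illustrative inputs only (not from the presentation).
trans = [0.60, 0.80, 0.95, 1.00]
warm_surface = radiance(900.0, 0.98, 290.0, [280.0, 260.0, 230.0], trans)
isothermal = radiance(900.0, 1.0, 280.0, [280.0, 280.0, 280.0], trans)
print(f"Example radiance:    {warm_surface:.6e}")
print(f"Isothermal radiance: {isothermal:.6e} (equals Planck at 280 K)")
```

Each channel and each profile is independent in this formulation, which is what makes the calculation a natural fit for one-thread-per-channel or one-thread-per-profile GPU parallelism.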

GPU-based Multi-input RTM
• A forward model that concurrently computes 40 radiance spectra was further developed to take advantage of the GPU's massive parallelism. To compute one day's worth of IASI spectra (1,296,000), the original RTM (with -O2 optimization) takes ~10 days on a 3.0 GHz CPU core; the single-input GPU-RTM takes ~10 minutes (1455x speedup), whereas the multi-input GPU-RTM takes ~5 minutes (3024x speedup).
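The quoted run times can be cross-checked against the quoted speedups with a few lines of arithmetic. The sketch below takes the ~10-day CPU figure as exactly 10 days, which is an approximation of the rounded value on the slide.

```python
SPECTRA_PER_DAY = 1_296_000       # IASI spectra in one day (from the slide)
SECONDS_PER_DAY = 24 * 60 * 60
CPU_SECONDS = 10 * SECONDS_PER_DAY  # "~10 days" taken as exactly 10 days


def gpu_minutes(cpu_seconds: float, speedup: float) -> float:
    """GPU wall time in minutes implied by a CPU time and a speedup factor."""
    return cpu_seconds / speedup / 60.0


single_input_min = gpu_minutes(CPU_SECONDS, 1455)
multi_input_min = gpu_minutes(CPU_SECONDS, 3024)
realtime_rate = SPECTRA_PER_DAY / SECONDS_PER_DAY   # spectra/s to keep pace

print(f"Single-input GPU-RTM: {single_input_min:.1f} min")
print(f"Multi-input GPU-RTM:  {multi_input_min:.1f} min")
print(f"Real-time requirement: {realtime_rate:.0f} spectra per second")
```

Both results agree with the slide's rounded ~10 and ~5 minutes, and the daily volume corresponds to 15 spectra arriving per second.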

GPU Acceleration of Satellite Hyperspectral Maximum Likelihood Retrieval

GPU Acceleration of Predictive Partitioned Vector Quantization for Ultraspectral Sounder Data Compression

GPU Acceleration of Weather Forecasting and Meteorological Satellite Data Assimilation, Processing and Applications
• Why weather forecasts are not accurate enough
  – The model is not perfect yet: scientific understanding and algorithm development are still evolving
  – Data is not always accurate: actual and accurate initial data are expensive to collect and process
  – High-performance computers are expensive: only limited resources can be afforded to deploy and operate HPC
• Acceleration of Satellite Data Processing
  – Hyperspectral Data Retrieval
  – Hyperspectral Data Compression
• Acceleration of Weather Forecasting S/W
  – Same forecasts faster, much faster
  – Acceleration of the Weather Research and Forecasting (WRF) Model
    • Radiation, PBL, Surface
    • Cumulus Parameterization, Cloud Microphysics and Dynamic Core
• Summary

CONtinental United States (CONUS) benchmark data set for 12 km resolution domain for October 24, 2001
• The size of the CONUS 12 km domain is 433 x 308 horizontal grid points with 35 vertical levels.
• The test problem is a 12 km resolution 48-hour forecast over the Continental U.S. capturing the development of a strong baroclinic cyclone and a frontal boundary that extends from north to south across the entire U.S.
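The domain dimensions give a sense of how much parallel work the benchmark exposes. The sketch below counts grid columns and grid points, then sizes a one-dimensional GPU launch under the assumption of one thread per vertical column. That mapping and the block size of 64 threads are illustrative choices, not details taken from the presentation.

```python
# CONUS 12 km domain dimensions from the slide.
NX, NY, NZ = 433, 308, 35


def blocks_needed(num_threads: int, threads_per_block: int) -> int:
    """Number of thread blocks needed to cover num_threads (ceiling division)."""
    return (num_threads + threads_per_block - 1) // threads_per_block


columns = NX * NY            # independent vertical columns
grid_points = columns * NZ   # total grid points

# Hypothetical launch configuration: one thread per column, 64 threads/block.
THREADS_PER_BLOCK = 64
blocks = blocks_needed(columns, THREADS_PER_BLOCK)

print(f"Vertical columns: {columns:,}")
print(f"Grid points:      {grid_points:,}")
print(f"Thread blocks:    {blocks:,} of {THREADS_PER_BLOCK} threads")
```

Column physics schemes such as radiation and microphysics have little or no horizontal coupling, so more than a hundred thousand independent columns is enough to keep a GPU occupied.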

Category | Scheme | GPU speedup | MIC improvement factor | Reference
Radiation | RRTMG LW | 123x / 127x | - | (GPU) JSTARS, 7, 3660-3667, 2014
Radiation | RRTMG SW | 202x / 207x | - | (GPU) JSTARS, PP, 1-11, 2015
Radiation | Goddard SW | 92x / 134x | - | (GPU) JSTARS, 5, 555-562, 2012
Radiation | Dudhia SW | 19x / 409x | - | -
Surface | MYNN SL | 6x / 113x | - | -
Surface | TEMF SL | 5x / 214x | - | -
Surface | Thermal Diffusion LS | 10x / 311x | [2.1x] | (GPU) JSTARS, 8, 2249-2259, 2015
PBL | YSU PBL | 34x / 193x | [2.4x] | (GPU) GMD, 8, 2977-2990, 2015
PBL | TEMF PBL | - | [14.8x] | (MIC) SPIE: doi:10.1117/12.2055040
CU P | Betts-Miller-Janjic (BMJ) convection | 55x / 105x | - | -

GPU speedup: speedup with I/O / speedup without I/O
MIC improvement factor in [ ]: relative to the first-version multi-threaded code before any improvement

Category | Scheme | GPU speedup | MIC improvement factor | Reference
Cloud Microphysics | Kessler MP | 70x / 816x | - | J. Comp. & GeoSci., 52, 292-299, 2012
Cloud Microphysics | Purdue-Lin MP | 156x / 692x | [4.2x] | (GPU) SPIE: doi:10.1117/12.901825
Cloud Microphysics | WSM 3-class MP | 150x / 331x | - | -
Cloud Microphysics | WSM 5-class MP | 202x / 350x | - | (GPU) JSTARS, 5, 1256-1265, 2012
Cloud Microphysics | Eta MP | 37x / 272x | - | SPIE: doi:10.1117/12.976908
Cloud Microphysics | WSM 6-class MP | 165x / 216x | - | (GPU) J. Comp. & GeoSci., 83, 17-26, 2015
Cloud Microphysics | Goddard GCE MP | 348x / 361x | [4.7x] | (GPU) JSTARS, 8, 2260-2272, 2015
Cloud Microphysics | Thompson MP | 76x / 153x | [2.3x] | (MIC) SPIE: doi:10.1117/12.2055038
Cloud Microphysics | SBU 5-class MP | 213x / 896x | - | JSTARS, 5, 625-633, 2012
Cloud Microphysics | WDM 5-class MP | 147x / 206x | - | -
Cloud Microphysics | WDM 6-class MP | 150x / 206x | - | J. Atmo. Ocean. Tech., 30, 2896, 2013

GPU speedup: speedup with I/O / speedup without I/O
MIC improvement factor in [ ]: relative to the first-version multi-threaded code before any improvement
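The gap between the two speedup figures for a scheme indicates how much of its GPU wall time is spent on I/O. If both speedups are measured against the same CPU baseline (an assumption; the slide does not say so explicitly), the I/O share of the GPU run is 1 minus the ratio of the with-I/O speedup to the without-I/O speedup. The sketch below applies this to three of the schemes listed; the function name is illustrative.

```python
def io_share(speedup_with_io: float, speedup_without_io: float) -> float:
    """Fraction of GPU wall time spent on I/O.

    With a common CPU baseline T: time_with = T / S_with and
    time_without = T / S_without, so the I/O share is
    (time_with - time_without) / time_with = 1 - S_with / S_without.
    """
    return 1.0 - speedup_with_io / speedup_without_io


# (with I/O, without I/O) speedup pairs from the slide.
schemes = {
    "Kessler MP": (70, 816),
    "WSM 5-class MP": (202, 350),
    "Goddard GCE MP": (348, 361),
}

shares = {name: io_share(*pair) for name, pair in schemes.items()}
for name, share in shares.items():
    print(f"{name:16s} I/O share of GPU time: {share:.1%}")
```

Light schemes such as Kessler are dominated by data transfer, while heavy ones such as Goddard GCE spend nearly all their time computing, which is why keeping data resident on the GPU across modules matters for a full-model port.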

Tempo Quest Inc. (TQI) S/W Product Pipeline: Weather/Environment Domain
• AceCAST Lite: 6 months out
  – Pre AceCAST (CPU/GPU "Hybrid" WRF)
• AceCAST: 12 months out (subject to funding)
  – CUDA GPU WRF
• Beyond AceCAST: 2-3 years out (subject to funding)
  – DataCAST (CUDA WRF Data Assimilation)
  – ChemCAST (CUDA WRF Chem)
  – HurCAST (CUDA Hurricane WRF)
  – HydroCAST (CUDA WRF Hydro)
  – FireCAST (CUDA WRF Fire)

GPU Acceleration of Weather Forecasting and Meteorological Satellite Data Assimilation, Processing and Applications
Thank you for your attention. Questions are welcome.
allen@tempoquest.com
