MPAS on GPUs Using OpenACC: Portability, Scalability & Performance
- Dr. Raghu Raj Kumar
Project Scientist I & Group Head, Special Technical Projects (STP) Group
National Center for Atmospheric Research
March 2018

Outline
Breakdown by component (pie chart labels):
- Integration Setup: 1%
- Moist coefficients: 0%
- imp_coef: 1%
- dyn_tend: 32%
- small_step: 4%
- acoustic_step: 13%
- large_step: 16%
- diagnostics: 20%
- substep: 5%
- MPI: 8%
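As a rough way to read the percentages above, the following sketch applies Amdahl's law to them. It rests on assumptions not stated on the slide: that the percentages are shares of total run time, and that every component except MPI is accelerated on the GPU. The function name `amdahl_speedup` and the trial kernel speedups are hypothetical and chosen only for illustration.

```python
# Component shares in percent, copied from the slide.
shares = {
    "Integration Setup": 1,
    "Moist coefficients": 0,
    "imp_coef": 1,
    "dyn_tend": 32,
    "small_step": 4,
    "acoustic_step": 13,
    "large_step": 16,
    "diagnostics": 20,
    "substep": 5,
    "MPI": 8,
}


def amdahl_speedup(accelerated_fraction, kernel_speedup):
    """Overall speedup when a fraction of run time is sped up by kernel_speedup."""
    return 1.0 / ((1.0 - accelerated_fraction) + accelerated_fraction / kernel_speedup)


total = sum(shares.values())

# Assumption: everything except MPI is offloaded; MPI time is unchanged.
accelerated = (total - shares["MPI"]) / total

# Limit of Amdahl's law as the kernel speedup grows without bound.
upper_bound = 1.0 / (1.0 - accelerated)

if __name__ == "__main__":
    print(f"shares sum to {total}%")
    for kernel_speedup in (2, 5, 10):  # hypothetical per-kernel speedups
        overall = amdahl_speedup(accelerated, kernel_speedup)
        print(f"kernel speedup {kernel_speedup}x -> overall {overall:.2f}x")
    print(f"upper bound with MPI unchanged: {upper_bound:.1f}x")
```

Under these assumptions the 8% MPI share caps the overall speedup at 12.5x, however fast the offloaded kernels become.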
- Software & Architecture
- Configuration & Accuracy
- Verification
- Benchmark
- Testing
- Code Refactoring
[Figure: Speedup, Broadwell vs P100 and Broadwell vs V100, in single precision (SP) and double precision (DP)]
Datasets: 120 km (40K), 60 km (163K)
Configurations:
- P100 with Power8 (1 GPU, PGI compiled, OpenACC code)
- P100 with Haswell (1 GPU, PGI compiled, OpenACC code)
- V100 with Haswell (1 GPU, PGI compiled, OpenACC code)
- Broadwell (fully subscribed, OpenMP enabled, Intel compiled, base code)
[Figure: Time (secs) by dataset (40k SP, 40k DP, 163k SP, 163k DP), Base Code vs OpenACC Code]