A Simulation of Global Atmosphere Model NICAM on TSUBAME 2.5 Using OpenACC
Hisashi YASHIRO RIKEN Advanced Institute of Computational Science Kobe, Japan
GTC2015, San Jose, Mar. 17-20, 2015
My topic: the study of "Cloud" computing
[Figure: breakdown of the elapsed time and the efficiency relative to peak on the K computer, by component — Dynamics (HEVI solver, tracer advection) and Physics (cloud microphysics, radiation, PBL), with individual component shares of roughly 5%–17% of the elapsed time.]
TSUBAME2.5 GPU (Westmere host + K20X; 2 MPI/node, 1 GPU/MPI): 2620 GFLOPS, 500 GB/s, B/F = 0.2, Fat-tree IB
TSUBAME2.5 CPU (Westmere; 8 MPI/node): 102 GFLOPS, 64 GB/s, B/F = 0.6, Fat-tree IB
K computer (SPARC64 VIIIfx; 1 MPI/node, 8 threads/MPI): 128 GFLOPS, 64 GB/s, B/F = 0.5, Tofu
[Figure: elapsed time on 5 nodes — TSUBAME(ACC): 1.8 (2 PE + 2 GPUs per node, 500 GB/s), TSUBAME(HOST): 12.2 (8 PE per node, 64 GB/s), K: 15.1 (1 PE × 8 threads per node, 64 GB/s). The GPU run is ×6.8 and ×8.3 faster, reflecting the memory-throughput advantage.]
[Figure: sustained performance and power efficiency — TSUBAME2.5 GPU: 1.7% of peak, 42 MFLOPS/W; TSUBAME2.5 CPU: 4.4% of peak, 13 MFLOPS/W; K computer: 5.3% of peak, 109 MFLOPS/W.]
[Figure: performance (GFLOPS, log scale) versus number of nodes for TSUBAME2.5 GPU (MPI = GPU = Node × 2), TSUBAME2.5 CPU (MPI = CPU = Node × 8), and the K computer (MPI = Node, CPU = Node × 8).]
➡ We really need a GPU-optimized, popular compression library: a "cuHDF"?
[Figure: performance (GFLOPS, log scale) versus number of nodes for TSUBAME2.5 GPU (MPI = GPU = Node × 2), TSUBAME2.5 CPU (MPI = CPU = Node × 8), and the K computer (MPI = Node, CPU = Node × 8), for horizontal grid sizes of 16900, 4356, 1156, 324, and 100 points.]
➡ "Precision-aware" coding, from both the scientific and the computational viewpoint.