Porting Scalable Parallel CFD Application, Krishnababu et al. (PowerPoint presentation)



SLIDE 1

HiFUN on GPU, Krishnababu et al.

Porting Scalable Parallel CFD Application: HiFUN on NVIDIA GPU

  • D. V. Krishnababu, N. Munikrishna, Nikhil Vijay Shende 1
  • N. Balakrishnan 2
  • Thejaswi Rao 3

  • 1. S & I Engineering Solutions Pvt. Ltd., Bangalore, India
  • 2. Aerospace Engineering, Indian Institute of Science, Bangalore, India
  • 3. NVIDIA Graphics Pvt. Ltd., Bangalore, India

GPU Technology Conference, Silicon Valley, March 26–29, 2018

1 / 18

SLIDE 2

Introduction

http://www.sandi.co.in

The HiFUN Software
  • High Resolution Flow Solver on Unstructured Meshes.
  • A Computational Fluid Dynamics (CFD) flow solver.
  • The primary product of the company SandI.
  • A robust, fast, accurate and efficient tool.

About SandI
  • A technology company incubated from the Indian Institute of Science, Bangalore.
  • Promotes high-end CFD technologies with uncompromising quality standards.

2 / 18


SLIDE 4

Features of HiFUN

http://www.sandi.co.in/home/products

General

3 / 18

SLIDE 5

Features of HiFUN

http://www.sandi.co.in/home/products

Well Validated
  • AIAA DPW
  • SPICES
  • AIAA HiLiftPW

4 / 18

SLIDE 6

Features of HiFUN

http://www.sandi.co.in/home/products

Super Scalable (Workload: 165 Million Volumes)

  Simulation   CPU Cores   Time
  RANS         256         30 hours (1.25 days)
  RANS         10000       1 hour
  URANS        256         108 hours (4.5 days)
  URANS        10000       3 hours
  DES          256         525 hours (22 days)
  DES          10000       15 hours

5 / 18

SLIDE 7

SandI–NVIDIA Collaboration

2014 ✲ Joint Development Initiative Kicks Off

2015 ✲ NVIDIA Innovation Award

2016 ✲ GTCx Mumbai; HiFUN in GPU Apps Catalogue; GTC 2016 Poster Presentation

2018 ✲ GTC 2018

Way Ahead ✲ HiFUN on NVIDIA Pascal, Volta GPUs; NVLink with IBM Power CPUs

6 / 18

SLIDE 8

HiFUN on NVIDIA GPU

Hybrid Supercomputers
  • Consist of CPUs and NVIDIA GPUs.
  • Less power to achieve the same FLOPS.
  • Less cooling and space.

GPU
  • Thousands of computing cores sharing the same RAM.
  • Higher memory bandwidth.
  • High data transfer overheads with the CPU.

7 / 18


SLIDE 10

HiFUN on NVIDIA GPU

Parallelization Model on GPU
  • Shared memory.
  • Many FLOPS per byte of data moved from CPU to GPU.
  • Requires a re-look at the parallelization of CFD algorithms.

Parallelization Challenges
  • General purpose algorithms.
  • Implicit schemes: global data dependence.
  • Complex multi-layered unstructured data structures.

8 / 18


SLIDE 12

HiFUN on NVIDIA GPU

Constraints
  • No compromise on distributed memory scalability.
  • Source code maintainability should not suffer.
  • Software portability should not suffer.

Parallel Strategy
  • Accelerate single node performance via an offload model.
  • Hybrid: MPI and OpenACC directives.

Offload Model
  • The computationally intensive part is offloaded to the GPU.
  • Optimal data communication between CPU and GPU.

9 / 18


SLIDE 15

HiFUN on NVIDIA GPU

(Figures: Onera M6, NASA CRM, NASA Trap Wing)

Configurations & Workloads (Million)
  • Onera M6 Wing: 1.1, 9.3, 12.12, 15.4
  • NASA CRM: 6.2, 26.5, 30
  • NASA Trap Wing: 20, 66

Simulation Type
  • Steady RANS simulations.

10 / 18


SLIDE 17

HiFUN on NVIDIA GPU

Computing Platform: NVIDIA PSG

Node configuration
  • Two hexa-deca core (16-core) Intel Xeon Haswell processors.
  • Eight NVIDIA Tesla K80 GPUs.
  • GPU memory = 12 GB.
  • Total CPU memory per node = 256 GB.
  • InfiniBand interconnect.

Software
  • PGI Compiler 16.7
  • OpenMPI 1.10.2
  • OpenACC 2.0

11 / 18


SLIDE 19

HiFUN on NVIDIA GPU

Parallel Performance Parameters

  • Ideal Speed-up: the ratio of the number of nodes used for a given run to the reference number of nodes.
  • Actual Speed-up: the ratio of the time per iteration using the reference number of nodes to the time per iteration using the number of nodes for the given run.
  • Accelerator Speed-up: the ratio of the time per iteration obtained using a given number of CPUs to the time per iteration obtained using the same number of CPUs working in tandem with GPUs.

12 / 18

SLIDE 20

HiFUN on NVIDIA GPU

Single Node Performance

Accelerator Speed-up on 2 GPUs

Observations
  • Increasing the grid size increases GPU utilization and accelerator speed-up.
  • It is important to load the GPU completely.

13 / 18

SLIDE 21

HiFUN on NVIDIA GPU

Single Node Performance

Varying the Number of GPUs: % Increase

Observations
  • Increasing the number of GPUs increases accelerator speed-up.
  • Use of 4 GPUs per node is optimal.

14 / 18

SLIDE 22

HiFUN on NVIDIA GPU

Single Node Performance

Time to RANS Solution (Hours)

Observations
  • Time to solution on a 1 million grid: about 15 minutes.
  • Time to solution on a 30 million grid: about half a day.
  • A single node serves as a desktop supercomputer.

15 / 18

SLIDE 23

HiFUN on NVIDIA GPU

Multi-node Performance

Parallel Speed-up: 66 Million Workload

Observations
  • Near linear speed-up using 2 GPUs per node.
  • Speed-up drops for larger numbers of nodes and/or more GPUs per node, due to lower GPU utilization.

16 / 18

SLIDE 24

HiFUN on NVIDIA GPU

Multi-node Performance

Normalized Time per Iteration: 66 Million Workload

Observations
  • Time per iteration drops as the number of nodes and/or GPUs increases.
  • Time to solution with 8 nodes: about 4 hours.

17 / 18

SLIDE 25

HiFUN on NVIDIA GPU

Concluding Remarks
  • An offload model was used to port HiFUN to the GPU.
  • A GPU based computing node is powerful enough to serve as a desktop supercomputer.
  • HiFUN is ideally suited to solve grand challenge problems on GPU based hybrid supercomputers.
  • An OpenACC directives based offload model is an attractive option for porting legacy CFD codes to GPUs.

18 / 18
