Implementing High-Resolution Fluid Dynamics Solver in a Performance - PowerPoint PPT Presentation

Introduction Hydro - 2nd order finite volume schemes MOOD - High-order finite volume schemes Kokkos - RamsesGPU / MOOD performances Implementing High-Resolution Fluid Dynamics Solver in a Performance Portable Way Applications to astrophysical compressible fluid dynamics Pierre Kestener CEA Saclay, DRF, Maison de la Simulation, FRANCE GPU Technology Conference (GTC) 2017, San Jose, May. 8, 2017 1 / 20

Introduction Hydro - 2nd order finite volume schemes MOOD - High-order finite volume schemes Kokkos - RamsesGPU / MOOD performances Content Motivations : computational sciences and software engineering Kokkos: library for performance portability RamsesGPU : CFD applications for astrophysics Refactoring Hydrodynamics and MHD kernels Same performance between old CUDA kernels and new Kokkos Kernels ? Implementing high-order numerical schemes with Kokkos Performance measurements on IBM Power8 + Nvidia Pascal P100 OpenMP scaling on Power8 (device Kokkos::OpenMP) GPU performance on Pascal P100 (device Kokkos::Cuda) Perpectives / Future applications and developments 2 / 20

Introduction Hydro - 2nd order finite volume schemes Motivations MOOD - High-order finite volume schemes Performance portability / Kokkos Kokkos - RamsesGPU / MOOD performances Motivations of this work - 1 RAMSES-GPU is developped in CUDA/C++ for astrophysics applications on regular grid ∼ 70k lines of code (out of which ∼ 16k in CUDA) Development started in 2009 ! A lot of optimization techniques accumulated over the years are not so critically important anymore on today’s GPU; both GPU hardware/sofware have tremendously evolved (in orders of magnitude in memory bandwidth, number of registers per SM, c++11, ...) Collaborations with domain scientists are hard when required software skills include CUDA. 2016-2017 is the right time to refactor code, sparkle new ways to develop scientific software at a higher abstraction level Science cases applications : MRI in accretion disk ( Pm = 4 ) : (256 GPU) at 800 × 1600 × 800 MHD Driven turbulence: (Mach ∼ 10 ) : resolution 2016 3 (486 GPUs) 3 / 20

Introduction Hydro - 2nd order finite volume schemes Motivations MOOD - High-order finite volume schemes Performance portability / Kokkos Kokkos - RamsesGPU / MOOD performances Motivations of this work - 2 Computationnal science ground - Computational Fluid Dynamics High-order numerical schemes for compressible hydrodynamics CFD - Euler system of partial differential equations How fast the numerical solution converges to the reference solution when increase space resolution ? For high-order numerical methods, one expects the error to decrease as | f − f r | ≤ h − N MOOD numerical schemes , introduced in 2011 by Diot, Chain, Loubère; very compute intensive (ref: Diot PhD thesis) Reference number to keep in mind ∼ 1 µ s /it/cell : time to update a cell in a mesh (serial, CPU, low-order scheme). 4 / 20

Introduction Hydro - 2nd order finite volume schemes Motivations MOOD - High-order finite volume schemes Performance portability / Kokkos Kokkos - RamsesGPU / MOOD performances Motivations of this work - 3 Software engineering Refactoring existing C++/CUDA code As much as possible performance portable code: write the code once, and let the user run it on the available target platform with performance as good as possible. Prefer a high-level approach among: Directive-based: OpenACC, OpenMP ease of use, incremental approach, for large legacy code bases, ... External smart library implementing parallel programming patterns (for, reduce, scan, ....): Kokkos, RAJA, agency, arrayFire libraries are such possibilities parallel programing patterns as 1 st class concepts, architecture adapted data containers, c++ integration / engineering, ... Other high-level approaches (more experimental): SYCL (Khronos Group standard ), hpx (heavy use of new c++ standards (11,14,17): std::future, std::launch::async , distributed parallelism, ...) 5 / 20

Introduction Hydro - 2nd order finite volume schemes Motivations MOOD - High-order finite volume schemes Performance portability / Kokkos Kokkos - RamsesGPU / MOOD performances C++ Kokkos library summary See GTC2017 session S7344 - Kokkos ? The C++ Performance Portability Programming Model (C. Trott and H.C. Edwards). Framework for efficient node-level parallelism Provides some high-level (abstract) concepts as template C++ classes: A kokkos device: Kokkos::Cuda, Kokkos::OpenMP, Kokkos::Pthreads, Kokkos::Serial ,... concepts controlled by C++ template meta-programing: execution space, memory space, memory layout, ... Computationnal parallel patterns (for, reduce, scan, ...) controlled with a execution policy (i.e. how many iterations, teams, ...) Kokkos::View : A multi-dimensionnal data container with hardware adapted memory layout - Kokkos::View<double **> data("data",NX,NY); // 2D array with sizes known at runtime - How do I access data ? data(i,j) ! Mostly a header library (C++ metaprograming) 6 / 20

Introduction Hydro - 2nd order finite volume schemes Motivations MOOD - High-order finite volume schemes Performance portability / Kokkos Kokkos - RamsesGPU / MOOD performances C++ Kokkos library summary Most commonly in a C/C++, multi-dimensionnal array access is done through index linearization (row or column-major in 2D): index = i + nx ∗ j In Kokkos, one should/must avoid this index linearization, let Kokkos::View do its job (decided at compile-time, hardware adapted) : 1D Kokkos::View with index linearization + 1D Iteration range 2D Kokkos::View + 1D Iteration range (used in this work) 2D Kokkos::View + 2D ( Kokkos::MDRange Kernel policy) : still an experimental feature Kokkos::MDRange is functional, but was generating kernels with some performance loss, will surely be solved shortly by Kokkos core developpers. See also new developpement on hierarchical task-data parallelism, session S7253 (Monday 8th, room 211B). 7 / 20

Introduction Hydro - 2nd order finite volume schemes MOOD - High-order finite volume schemes Kokkos - RamsesGPU / MOOD performances Compressible hydrodynamics : Euler system of equations Euler equations as conservative law system ∂ t U +∇ ∇ ∇ . F ( U ) = 0 ∂ρ ∂ t +∇ .( ρ v ) = 0 U n � � ∂ρ v i ∂ t +∇ ∇ ∇ . ρ v ⊗ v + P Id = 0 � � ∂ρ E +∇ . v ( ρ E + P ) = 0 ∂ t ( + dissipative terms (viscous, resistive) + MHD with shearing box setup) Formal 1st order discretization: � i + ∆ t U n + 1 = U n | e i j | F ( ˜ F ( ˜ F ( ˜ U i , ˜ U i , ˜ U i , ˜ U j ) U j ) U j ) i | V i | j In high-order scheme, use Runge-Kutta time integration + quadrature rules for computing the numerical fluxes F F F 8 / 20

Introduction Hydro - 2nd order finite volume schemes MOOD - High-order finite volume schemes Kokkos - RamsesGPU / MOOD performances A Finite volume solver - MUSCL-Hancock 2 nd order MUSCL-Hancock Read paramfile A priori limiting (to avoid spurious oscillations) Write t < t end restart file Slope computations: linear reconstruction inside each cell Compute dt δ U i = MINMOD ( U i − U i − 1 , U i + 1 − U i ) CFL condition Reconstruct states U le f t and U r ight on Compute limited slopes both sides of a given edge using limited slopes Reconstruct states at edges This numerical scheme is already available in C++/CUDA in RAMSES-GPU Compute fluxes Refactored with Kokkos U n +1 = U n i + ∆ t � j F i,j i 9 / 20

Introduction Hydro - 2nd order finite volume schemes MOOD - High-order finite volume schemes Kokkos - RamsesGPU / MOOD performances A Finite volume solver - MOOD High-order MOOD (Multi-Dim Optimal Order Detection) Read paramfile A posteriori limiting Introduced in 2011 by Clain, Diot and Loubère Write t < t end restart file Reconstructing multivariate polynomials of degree d Define a stencil large enough to perform a least square Compute dt estimation of the n − dimensionnal multivariate polynomial CFL condition interpolating cell-average values of U j in stencil Runge-Kutta Compute polynomial coeff if N is the number of cells in stencil, the linear system to solve decrease d (one per cell), using QR decomposition       Reconstruct States L i 1 u x w i 1 ( u 1 − u i ) Compute fluxes        L i 2   u y   w i 2 ( u 2 − u i )              L i 3 u xx w i 3 ( u 3 − u i )       Fluxes =       u yy . . valid ?       . .       . . .       . L iN . w iN ( u N − u i ) U n +1 = U n i + ∆ t � j F i,j i 10 / 20

Implementing High-Resolution Fluid Dynamics Solver in a Performance - PowerPoint PPT Presentation

Introduction Hydro - 2nd order finite volume schemes MOOD - High-order finite volume schemes Kokkos - RamsesGPU / MOOD performances Implementing High-Resolution Fluid Dynamics Solver in a Performance Portable Way Applications to astrophysical

Fluid Dynamics CSE169: Computer Animation Steve Rotenberg UCSD, Spring 2016 Fluid Dynamics

Fluid Dynamics CSE169: Computer Animation Steve Rotenberg UCSD, Winter 2017 Fluid Dynamics

Contents Contents Fluid

Supercritical Fluid Chromatography 1. What is supercritical fluid 2. Supercritical Fluid

Fluid Filled Cables Alan Ainsley Asset Integrity Cable System Engineer NGT Overview

SPARSE FLUID SIMULATION IN DIRECTX Alex Dunn Graphics Dev. Tech. AGENDA We want more fluid in

1. Computational Fluid a. Computational Fluid Dynamics is in the domain of Computational Science

SIGBI Limited General Meeting 2019 Resolutions 1-6 Resolution 1 Resolution 2 Resolution 3

Patagonia Gold Plc 2009 Patagonia Gold VOTING ORDINARY SPECIAL Resolution 1 Resolution 2

SAT Solver as coNP Solver Beyond Resolution Norbert Manthey International Center for

Fluid Dynamics and Thermal Engineering R&D Group Universidad Politcnica de Cartagena

Systerel Smart Solver Forum Mthodes Formelles October 2014 S3 S3 for C Systerel Smart Solver

A CDCL(LA) Solver SPASS-SATT A CDCL(LA) Solver Translation: fun (=SPASS) sated (=SATT)

The Single Resolution Mechanism Elke Knig Chair of the Single Resolution Board FDIC Systemic

FLUID MECHANICS IN INTRODUCTION Fluid :- A fluid may be defined As a substance which is

1 This information is confidential and was prepared by Brevini Fluid Power. It is not be relied on

FLEXIBLE USE OF AIRSPACE AND IAF 1 FLEXIBLE USE OF AIRSPACE INTRODUCTION INDIAN AIRSPACE

Upper H Harbor T Terminal Community Advisory Committee (CAC) Meeting #3 Monday, September 23,

Multi-State MBQIP Educational Collaborative Illinois Department of Public Health grantee

Educational Flexibility Program Other than statutory and regulatory requirements included in the

Subdivision of Fluid Flow Why Subdivision of Flows? Fluid flow governed by non-linear partial

Paul Drosinis Paul Drosinis UBC Phys 420 Introduction Short history on fluid dynamics

Computation Fluid Dynamics ANSYS Software Key Features and Best Practices Courtesy of Borg

Mesh and Fluid Dynamics eonore Gauci + El eric Alauzet + and Alain Dervieux

Implementing High-Resolution Fluid Dynamics Solver in a Performance - PowerPoint PPT Presentation

Introduction Hydro - 2nd order finite volume schemes MOOD - High-order finite volume schemes Kokkos - RamsesGPU / MOOD performances Implementing High-Resolution Fluid Dynamics Solver in a Performance Portable Way Applications to astrophysical

Fluid Dynamics CSE169: Computer Animation Steve Rotenberg UCSD, Spring 2016 Fluid Dynamics

Fluid Dynamics CSE169: Computer Animation Steve Rotenberg UCSD, Winter 2017 Fluid Dynamics

Contents Contents Fluid

Supercritical Fluid Chromatography 1. What is supercritical fluid 2. Supercritical Fluid

Fluid Filled Cables Alan Ainsley Asset Integrity Cable System Engineer NGT Overview

SPARSE FLUID SIMULATION IN DIRECTX Alex Dunn Graphics Dev. Tech. AGENDA We want more fluid in

1. Computational Fluid a. Computational Fluid Dynamics is in the domain of Computational Science

SIGBI Limited General Meeting 2019 Resolutions 1-6 Resolution 1 Resolution 2 Resolution 3

Patagonia Gold Plc 2009 Patagonia Gold VOTING ORDINARY SPECIAL Resolution 1 Resolution 2

SAT Solver as coNP Solver Beyond Resolution Norbert Manthey International Center for

Fluid Dynamics and Thermal Engineering R&amp;D Group Universidad Politcnica de Cartagena

Systerel Smart Solver Forum Mthodes Formelles October 2014 S3 S3 for C Systerel Smart Solver

A CDCL(LA) Solver SPASS-SATT A CDCL(LA) Solver Translation: fun (=SPASS) sated (=SATT)

The Single Resolution Mechanism Elke Knig Chair of the Single Resolution Board FDIC Systemic

FLUID MECHANICS IN INTRODUCTION Fluid :- A fluid may be defined As a substance which is

1 This information is confidential and was prepared by Brevini Fluid Power. It is not be relied on

FLEXIBLE USE OF AIRSPACE AND IAF 1 FLEXIBLE USE OF AIRSPACE INTRODUCTION INDIAN AIRSPACE

Upper H Harbor T Terminal Community Advisory Committee (CAC) Meeting #3 Monday, September 23,

Multi-State MBQIP Educational Collaborative Illinois Department of Public Health grantee

Educational Flexibility Program Other than statutory and regulatory requirements included in the

Subdivision of Fluid Flow Why Subdivision of Flows? Fluid flow governed by non-linear partial

Paul Drosinis Paul Drosinis UBC Phys 420 Introduction Short history on fluid dynamics

Computation Fluid Dynamics ANSYS Software Key Features and Best Practices Courtesy of Borg

Mesh and Fluid Dynamics eonore Gauci + El eric Alauzet + and Alain Dervieux

Fluid Dynamics and Thermal Engineering R&D Group Universidad Politcnica de Cartagena