ACCELERATING CFD AND RESERVOIR SIMULATIONS WITH ALGEBRAIC MULTI GRID
Chris Gottbrath, Nov 2016
AGENDA
- Challenges
- What is Algebraic Multi-Grid (AMG)?
- Why use AMG?
- When to use AMG?
- NVIDIA AmgX
- Results
CHALLENGES
Computational Fluid Dynamics and Reservoir Simulations
- Large volumes
- Complex geometry
- High velocities
- Turbulence
- Multiple fluids
ACCURACY & PERFORMANCE MATTER
Quality matters: these simulations guide the design of critical systems and the management of oil fields, where the financial stakes are high.
Time to solution matters: long solve times limit model size and accuracy, and limit how much of the problem space can be covered.
WHAT IS AMG – ALGEBRAIC MULTI GRID
System: Ax = b, where A is the system matrix and x, b are vectors describing the state.
Residual function: r = b - Ax
- Start with a proposed solution vector x
- Iterate to reduce |r|
- The trick: approximate solutions on smaller systems
  - Make a small system "based on" the big one
  - Solve that to get a good approximation of the error
  - Project the correction back to the full system
A fast, powerful, general technique for efficiently solving very large systems.
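The "small system" step above is the classical coarse-grid correction. A standard way to write it out (the operators R for restriction and P for prolongation are my notation, not from the deck):

```latex
\begin{aligned}
r &= b - A\tilde{x} && \text{residual of the current guess } \tilde{x} \\
A e &= r && \text{error equation, with } e = x - \tilde{x} \\
A_c &= R A P && \text{small (coarse) system built from } A \\
A_c\, e_c &= R\, r && \text{solve the small system} \\
\tilde{x} &\leftarrow \tilde{x} + P e_c && \text{project the correction back}
\end{aligned}
```

Solving the error equation exactly is as hard as the original problem; solving its coarse restriction is cheap, and the projected correction removes exactly the smooth error components that simple iterations reduce slowly.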
ALGEBRAIC MULTI GRID – THE V CYCLE
Repeat a few times:
1. Full-size system: smooth, then coarsen to a medium system
2. Medium system: smooth, then coarsen to a tiny system
3. Solve the tiny system
4. Prolong to the medium system, then smooth
5. Prolong to the full-size system, then smooth
Coarsening
(Two slides of figures illustrating grid coarsening; image content not recoverable.)
Why use AMG?
A powerful solver:
- Handles complex geometries, complex physics, huge systems, and high resolution
- Fast algorithm: each iteration reduces |r| by 2-10x, converging in 6-20 iterations
- Runs really well on NVIDIA GPUs
NVIDIA AmgX
https://developer.nvidia.com/amgx
The AmgX library provides a configurable and scalable GPU-accelerated algebraic multi-grid solver for large sparse linear systems.
- CFD and reservoir simulation
- Scales up to hundreds of GPUs
- Rich collection of algorithms
- Accelerates existing simulations
Unstructured implicit linear systems solved with >15x speedup vs. HYPRE.
(Bar chart: speedup vs. HYPRE, ranging from 14x to 17x across K40, M40, and P100-SXM2.)
Benchmark details:
- Florida Matrix Collection; total time to solution
- HYPRE AMG package (http://acts.nersc.gov/hypre) on Intel Xeon E5-2697 v4 @ 2.3GHz, 3.6GHz Turbo, Hyperthreading off
- AmgX on K40m, M40, P100; base clocks
- Host system: Intel Xeon Haswell single-socket 16-core E5-2698 v3 @ 2.3GHz, 3.6GHz Turbo
- CentOS 7.2 x86-64 with 128GB system memory
ANSYS Mechanical 16.0 on Tesla K80
Simulation productivity in ANSYS Mechanical jobs/day (with HPC Pack); higher is better. Distributed ANSYS Mechanical 16.0 with an Ivy Bridge (Xeon E5-2697 v2, 2.7 GHz) 8-core CPU and a Tesla K80 GPU.

V15sp-4 Model: turbine geometry, 3,200,000 DOF, SOLID187 FEs, static, nonlinear, Distributed ANSYS 16.0, direct sparse solver
- 8 CPU cores: 135 jobs/day
- 6 CPU cores + K80 GPU: 247 jobs/day (1.8x)

V15sp-5 Model: Ball Grid Array geometry, 6,000,000 DOF, static, nonlinear, Distributed ANSYS 16.0, direct sparse solver
- 8 CPU cores: 159 jobs/day
- 6 CPU cores + K80 GPU: 371 jobs/day (2.3x)

The industry-standard CFD code is also accelerated with AmgX.
AmgX in Reservoir Simulation
Application time in seconds; lower is better.
- CPU: 1150 s
- Custom GPU code: 197 s
- AmgX: 98 s
3-phase black oil reservoir simulation; 400K grid blocks solved fully implicitly. CPU: Intel Xeon E5-2670. GPU: NVIDIA Tesla K10.
Minimal Example With Config

//One header
#include "amgx_c.h"
//Initialize the library and read the config file
AMGX_initialize();
AMGX_config_create_from_file(&cfg, cfgfile);
//Create resources based on config
AMGX_resources_create_simple(&res, cfg);
//Create solver object, A, x, b; set precision via mode
AMGX_solver_create(&solver, res, mode, cfg);
AMGX_matrix_create(&A, res, mode);
AMGX_vector_create(&x, res, mode);
AMGX_vector_create(&b, res, mode);
//Read coefficients from a file
AMGX_read_system(A, b, x, matrixfile);
//Setup and solve
AMGX_solver_setup(solver, A);
AMGX_solver_solve(solver, b, x);
//Download result to a host buffer
AMGX_vector_download(x, result);

Config file:
solver(main)=FGMRES
main:max_iters=100
main:convergence=RELATIVE_MAX
main:tolerance=0.1
main:preconditioner(amg)=AMG
amg:algorithm=AGGREGATION
amg:selector=SIZE_8
amg:cycle=V
amg:max_iters=1
amg:max_levels=10
amg:smoother(smoother)=BLOCK_JACOBI
amg:relaxation_factor=0.75
amg:presweeps=1
amg:postsweeps=2
amg:coarsest_sweeps=4
determinism_flag=1
AmgX
GPU acceleration for your simulation:
- Flexible and powerful technique
- Simple to adopt
- Get results 15x faster with Pascal
- Scales out to handle large systems
http://developer.nvidia.com/amgx

Thanks! Questions?
http://developer.nvidia.com/amgx
cgottbrath@nvidia.com
Numerical Models
There are a lot of different ways to model. AmgX applies only to PDE-type models, and it works best with unstructured implicit codes. Other code can be accelerated using CUDA libraries such as cuRAND, cuBLAS, cuFFT, cuSPARSE, and cuSOLVER.

PDE-based models
- Grid
  - Structured
  - Unstructured
- Method
  - Implicit
  - Explicit
Event-based models
- Lattice Boltzmann
- Monte Carlo
- Finite state machine
- N-body / SPH
Second Order PDEs
Many physical problem domains can be modeled using 2nd-order PDEs, which can be classified by how they behave mathematically:
- Parabolic: solutions smooth out over time (e.g., heat transfer). Use AmgX.
- Elliptic: smooth within the volume, with potentially discontinuous boundary values (e.g., subsonic fluid flow). Use AmgX.
- Hyperbolic: discontinuities (shocks) persist (e.g., the wave equation, supersonic fluid flow). For purely hyperbolic problems, use other solvers.
Helmholtz problems are currently not well suited to AmgX.
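The canonical examples behind this classification can be written out (standard textbook forms, not from the deck). For a general operator $A u_{xx} + 2B u_{xy} + C u_{yy} + \ldots$, the sign of $B^2 - AC$ decides the class:

```latex
\begin{aligned}
&u_t = \alpha\, u_{xx} && \text{parabolic (heat equation), } B^2 - AC = 0 \\
&u_{xx} + u_{yy} = 0 && \text{elliptic (Laplace equation), } B^2 - AC < 0 \\
&u_{tt} = c^2\, u_{xx} && \text{hyperbolic (wave equation), } B^2 - AC > 0 \\
&\nabla^2 u + k^2 u = 0 && \text{Helmholtz equation (elliptic but indefinite)}
\end{aligned}
```

The Helmholtz case is elliptic in form, but for large $k$ the discrete operator is indefinite, which is one reason it is a poor fit for standard AMG.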
KEY FEATURES
- Classical and aggregation AMG
- Robust aggressive coarsening algorithms
- Krylov methods: CG, GMRES, BiCGStab, IDR
- Smoothers / solvers: Jacobi-L1, Block-Jacobi, ILU[0,1,2], DILU, dense LU, KPZ polynomial, Chebyshev