CONVERGENCE ACCELERATION TECHNIQUES FOR DUAL TIME STEPPING
Niki A. Loppi, AI & HPC Solution Architect, NVIDIA
Brian C. Vermeire, Aerospace Engineering, Concordia University
Peter E. Vincent, Department of Aeronautics, Imperial College London
OVERVIEW
- Incompressible flows require a divergence-free velocity field
- Artificial Compressibility Method (ACM) is a suitable approach
- A range of novel convergence acceleration techniques
- Locally Adaptive Pseudo-Timestepping (LAPTS)
- Polynomial Multigrid (P-MG)
- Optimal explicit Runge-Kutta Methods
ARTIFICIAL COMPRESSIBILITY
- An alternative to pressure projection in steady state
- ACM uses a pseudo time problem to enforce incompressibility
- Dual time-stepping can extend the ACM to unsteady flows
- This introduces a global hyperbolic problem in pseudo-time
- Leverage the explicit solver technology already in PyFR
ARTIFICIAL COMPRESSIBILITY
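Conservation law:
$$\frac{\partial u}{\partial \tau} + I_c \frac{\partial u}{\partial t} + \frac{\partial F}{\partial x} + \frac{\partial G}{\partial y} + \frac{\partial H}{\partial z} = 0$$
Algorithm (1)
Physical time:
$$\frac{\partial u}{\partial \tau} = R^{n+1,m} - \frac{I_c}{2\Delta t}\left(3u^{n+1,m} - 4u^{n} + u^{n-1}\right)$$
Pseudo time:
$$u^{(k)} = u^{(0)} - \alpha_m \Delta\tau \left(R^{(k-1)} - \frac{I_c}{2\Delta t}\left(3u^{(k-1)} - 4u^{n} + u^{n-1}\right)\right)$$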
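The dual time-stepping idea can be sketched on a scalar model problem. This is a minimal illustration, not PyFR code: the function name `dual_time_step`, the model residual R(u) = -lam * u, the choice I_c = 1, and the forward-Euler pseudo-step are all assumptions made for the example. The pseudo-time right-hand side is taken as R(u) minus the BDF2 term, and the update sign is chosen to be consistent with that definition.

```python
def dual_time_step(u_n, u_nm1, dt, residual, dtau, n_pseudo=200, tol=1e-12):
    """Advance one BDF2 physical step by marching to steady state in pseudo-time.

    Scalar sketch with I_c = 1 (assumption); residual(u) plays the role of R.
    """
    u = u_n  # initial guess for u^{n+1}
    for _ in range(n_pseudo):
        # Pseudo-time right-hand side: R(u) - (3u - 4u^n + u^{n-1}) / (2 dt)
        rhs = residual(u) - (3.0 * u - 4.0 * u_n + u_nm1) / (2.0 * dt)
        u = u + dtau * rhs  # forward-Euler pseudo-step (hypothetical choice)
        if abs(rhs) < tol:  # pseudo-time residual has converged
            break
    return u


# Model problem du/dt = -lam * u, i.e. R(u) = -lam * u
lam, dt = 1.0, 0.1
u_new = dual_time_step(u_n=0.9, u_nm1=1.0, dt=dt,
                       residual=lambda u: -lam * u, dtau=0.05)
```

When the pseudo-time iteration converges, the right-hand side vanishes and `u_new` satisfies the implicit BDF2 equation for the model problem, here u = (4u^n - u^{n-1}) / (3 + 2 * dt * lam).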
OVERVIEW
- ACM performance relies on rapid convergence in pseudo-time
- A range of novel convergence acceleration techniques in PyFR
- Polynomial Multigrid (P-MG)
- Locally Adaptive Pseudo-Timestepping (LAPTS)
- Optimal explicit Runge-Kutta Methods
POLYNOMIAL MULTIGRID
- Leverage lower polynomial degrees to accelerate convergence
- Less strict CFL limits on the coarser levels
- Less expensive per iteration on the coarser levels
- Low-frequency error is converged faster on coarse levels
- Correction from coarse levels is then prolongated to fine levels
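The level-transfer steps can be sketched as follows, assuming a hierarchical modal basis in one element so that restriction is truncation of modes and prolongation is zero-padding. The helper names `restrict`, `prolongate`, and `v_cycle_schedule` are hypothetical, and the sketch omits the defect/forcing terms that a real multigrid correction scheme carries between levels.

```python
def restrict(modes, p_coarse):
    """Keep only the modes up to degree p_coarse (modal truncation)."""
    return modes[: p_coarse + 1]


def prolongate(modes, p_fine):
    """Zero-pad coarse-level modes up to degree p_fine."""
    return modes + [0.0] * (p_fine + 1 - len(modes))


def v_cycle_schedule(levels):
    """Return (action, degree) pairs for one V-cycle over descending degrees."""
    steps = []
    # Downward leg: iterate on each level, then restrict to the next coarser one
    for fine, coarse in zip(levels, levels[1:]):
        steps += [("iterate", fine), ("restrict", coarse)]
    steps.append(("iterate", levels[-1]))  # coarsest level
    # Upward leg: prolongate to the next finer level, then iterate there
    for coarse, fine in zip(reversed(levels), reversed(levels[:-1])):
        steps += [("prolongate", fine), ("iterate", fine)]
    return steps


schedule = v_cycle_schedule([4, 2, 1])
```

With these operators, restricting a prolongated coarse vector returns the original coarse vector, which is the consistency property one would normally expect of such a transfer pair.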
POLYNOMIAL MULTIGRID
[Figure: P-MG cycle schematic showing Iterate, Restrict, and Prolongate steps across polynomial levels]
POLYNOMIAL MULTIGRID
- Unsteady Circular Cylinder
~ 6.2x Speedup
POLYNOMIAL MULTIGRID
- Incompressible Taylor Green Vortex
~ 3.5x Speedup
LAPTS
- Convergence is accelerated by using local pseudo-time steps
- Maximum permissible step size is limited by local CFL criteria
- Element size
- Polynomial degree
- Local wave speeds and viscous effects
- Runge-Kutta scheme properties
- This limit is estimated via embedded pair Runge-Kutta schemes
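How an embedded pair yields an error estimate can be illustrated with the textbook Heun-Euler 2(1) pair. This pair is an assumption chosen for brevity and is not necessarily one of the schemes used in PyFR; the function name `heun_euler_step` is likewise hypothetical.

```python
def heun_euler_step(f, u, dtau):
    """One pseudo-step with the Heun-Euler embedded pair.

    Returns the second-order solution and the error estimate, taken as the
    difference between the second-order and first-order solutions.
    """
    k1 = f(u)
    k2 = f(u + dtau * k1)
    u_low = u + dtau * k1                # first-order (forward Euler)
    u_high = u + 0.5 * dtau * (k1 + k2)  # second-order (Heun)
    return u_high, abs(u_high - u_low)


u_next, err = heun_euler_step(lambda u: -u, 1.0, 0.1)
```

Both solutions reuse the same stage evaluations, so the estimate comes at essentially no extra cost. A large estimate signals that the local pseudo-time step is too aggressive, and a small one that it can grow.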
LAPTS
- Embedded pair gives an estimate of the truncation error
- Pseudo-time step size is then adapted using a PI-controller
- For each element
- For each field variable
- Scaled up on coarser grid levels when combined with P-MG
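A PI-controller update of this kind can be sketched as below. The gains `alpha` and `beta`, the safety factor, and the clamping limits are illustrative assumptions rather than values taken from the slides or from PyFR, and the function names are hypothetical. Errors are normalised by a tolerance, and each entry of the input lists stands for one element (or one field variable in one element).

```python
def pi_update(dtau, err, err_prev, order=2, alpha=0.7, beta=0.4,
              safety=0.8, fac_min=0.3, fac_max=2.5):
    """Return a new pseudo-time step from normalised error estimates."""
    tiny = 1e-300  # guard against zero error estimates
    fac = (max(err, tiny) ** (-alpha / order)
           * max(err_prev, tiny) ** (beta / order))
    # Clamp the growth/shrink factor so the step cannot change too abruptly
    return dtau * max(fac_min, min(fac_max, safety * fac))


def adapt_local_steps(dtaus, errs, prev_errs, tol):
    """Apply the PI update independently to every local pseudo-time step."""
    return [pi_update(d, e / tol, ep / tol)
            for d, e, ep in zip(dtaus, errs, prev_errs)]


new_steps = adapt_local_steps([1.0, 1.0], [1e-6, 1e-2], [1e-4, 1e-4], tol=1e-4)
```

In this example the first entry has an error well below the tolerance, so its step grows up to the clamp, while the second has an error well above it, so its step shrinks. Each entry is adapted independently of the others.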
LAPTS
- Unsteady Circular Cylinder
~ 4.1x Speedup
LAPTS
- SD7003 Airfoil
~ 2.4x Speedup
OPTIMAL RUNGE-KUTTA SCHEMES
- Properties of Runge-Kutta scheme limit pseudo-time step size
- Each Runge-Kutta scheme has a stability polynomial
- Each stability polynomial has a region of absolute stability
- Pseudo-time step is limited by the size of this region
- For the ACM, first-order accuracy in pseudo-time is sufficient
OPTIMAL RUNGE-KUTTA SCHEMES
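Stability polynomial:
$$P_{s,1}(z) = 1 + z + \sum_{j=2}^{s} \gamma_j z^j, \qquad z = \Delta\tau\,\omega_\delta$$
Optimise $\{\gamma_2, \gamma_3, \ldots, \gamma_s\}$ to yield maximum $\Delta\tau$ subject to
$$|P_{s,1}(\Delta\tau\,\omega_\delta)| - 1 \le 0, \quad \forall\, \omega_\delta$$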
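The stability constraint can be checked numerically for a given set of coefficients. The sketch below evaluates the stability polynomial and finds the largest stable pseudo-time step by bisection. It does not perform the optimisation over the coefficients, and it assumes the set of stable step sizes is a single interval starting at zero. The function names and the sample eigenvalue are hypothetical.

```python
def stability_poly(gammas, z):
    """Evaluate P(z) = 1 + z + sum_{j>=2} gamma_j z^j.

    gammas holds [gamma_2, ..., gamma_s]; an empty list gives forward Euler.
    """
    p = 1.0 + z
    for j, g in enumerate(gammas, start=2):
        p += g * z ** j
    return p


def max_stable_step(gammas, omegas, hi=10.0, iters=60):
    """Bisect for the largest dtau with |P(dtau * omega)| <= 1 for all omega."""
    def stable(dtau):
        return all(abs(stability_poly(gammas, dtau * w)) <= 1.0 + 1e-12
                   for w in omegas)

    lo = 0.0  # dtau = 0 is trivially stable; hi is assumed unstable
    for _ in range(iters):
        mid = 0.5 * (lo + hi)
        if stable(mid):
            lo = mid
        else:
            hi = mid
    return lo


omegas = [complex(-1.0, 0.0)]  # sample eigenvalue on the negative real axis
dtau_euler = max_stable_step([], omegas)
dtau_rk4 = max_stable_step([1 / 2, 1 / 6, 1 / 24], omegas)
```

For this single real eigenvalue the classical RK4 coefficients permit a larger step than forward Euler. An optimiser would instead adjust the coefficients to maximise this limit over the eigenvalues of the actual spatial discretisation.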
OPTIMAL RUNGE-KUTTA SCHEMES
- Optimal stability polynomials can be used for embedded pairs
- Divergence of a “test” scheme controls pseudo-time step
- Allows automatic pseudo-time step size selection
- Unsteady Circular Cylinder
~ 2.1x Speedup
OPTIMAL RUNGE-KUTTA SCHEMES
- Turbulent Jet
~ 2x Speedup
PERFORMANCE
[Figure: bar chart of speedup for the cylinder benchmark; bars: RK4, RK-Opt, LTS, PMG, RK-Opt+LTS+PMG]
- Advancements in numerical methods (2015 - 2020)
~ 21x Speedup
PERFORMANCE
- Advancements in hardware (2015 - 2020)
[Figure: bar chart of peak DP TFLOP/s for K20, P100, V100, and A100]
~ 16x Speedup
PERFORMANCE
- Combined ~350x speedup (2015 - 2020): ~21x from numerical methods times ~16x from hardware
[Figure: bar chart of peak DP TFLOP/s for K20, P100, V100, and A100]
[Figure: bar chart of speedup for the cylinder benchmark; bars: RK4, LTS, RK-Opt+LTS+PMG]
RESULTS
- DARPA SUBOFF at Re = 1.2 × 10^6
CONFIGURATION
- P-MG
- Optimal Runge-Kutta
- LAPTS
REFERENCES
- NA Loppi, FD Witherden, A Jameson, PE Vincent, A high-order cross-platform incompressible Navier–Stokes solver via artificial compressibility with application to a turbulent jet, Computer Physics Communications 233, 193-205, 2018.
- BC Vermeire, NA Loppi, PE Vincent, Optimal Runge–Kutta schemes for pseudo time-stepping with high-order unstructured methods, Journal of Computational Physics 383, 55-71, 2019.
- NA Loppi, FD Witherden, A Jameson, PE Vincent, Locally adaptive pseudo-time stepping for high-order Flux Reconstruction, Journal of Computational Physics 399, 2019.
- BC Vermeire, NA Loppi, PE Vincent, Optimal embedded pair Runge–Kutta schemes for pseudo-time stepping, Journal of Computational Physics 415, 2020.