Experiences using 2decomp&fft to solve Partial Differential - - PowerPoint PPT Presentation

experiences using 2decomp fft to solve partial
SMART_READER_LITE
LIVE PREVIEW

Experiences using 2decomp&fft to solve Partial Differential - - PowerPoint PPT Presentation

Experiences using 2decomp&fft to solve Partial Differential Equations using Fourier Spectral Methods Sudarshan Balakrishnan, Abdullah H. Bargash, Gong Chen, Brandon Cloutier, Ning Li, Dave Malicke, Benson Muite, Michael Quell, Paul Rigge,


slide-1
SLIDE 1

Experiences using 2decomp&fft to solve Partial Differential Equations using Fourier Spectral Methods

Sudarshan Balakrishnan, Abdullah H. Bargash, Gong Chen, Brandon Cloutier, Ning Li, Dave Malicke, Benson Muite, Michael Quell, Paul Rigge, Damian San Rom´ an Alerigi Mamdouh Solimani, Andre Souza, Abdulaziz S.Thiban, Mark Van Moer, Jeremy West

Tartu ¨ Ulikool benson.muite@ut.ee http://math.ut.ee/˜benson http: //en.wikibooks.org/wiki/Parallel_Spectral_Numerical_Methods

slide-2
SLIDE 2

Outline

Project Aim Fourier Series The Heat Equation The Allen Cahn Equation The Gray-Scott Equations Nonlinear Schr¨

  • dinger Equation

Navier-Stokes Equation Maxwell’s Equations Possible Further Work

slide-3
SLIDE 3

Project Aim

Teaching tool for use in mathematics and computer science courses from 1st year undergraduate to postgraduate level Research tool: investigate partial differential equations, investigate computer performance Help paper reproducibility and verifiability Reduce coding time Rely on 2decomp&fft for MPI parallelization (http://2decomp.org) Material has been tested in a course on Multivariable calculus and on an introduction to partial differential equations

slide-4
SLIDE 4

Fourier series: Separation of Variables 1

dy dt = y (1) dy y = dt (2) dy y =

  • dt

(3) ln y + a = t + b (4) eln y+a = et+b (5) eln yea = eteb (6) y = eb ea et (7) y(t) = cet. (8)

slide-5
SLIDE 5

Fourier series: Separation of Variables 2

ut = uxx (9) Suppose u = X(x)T(t)

dT dt (t)

T(t) =

d2X dx2 (x)

X(x) = −C, (10) Solving each of these separately and then using linearity we get a general solution

  • n

αn exp(−Cnt) sin(

  • Cnx) + βn exp(−Cnt) cos(
  • Cnx)

(11)

slide-6
SLIDE 6

Fourier series: Separation of Variables 3

How do we find a particular solution? Suppose u(x, t = 0) = f(x) Suppose u(0, t) = u(2π, t) and ux(0, t) = ux(2π, t) then recall 2π sin(nx) sin(mx) = π m = n m = n , (12) 2π cos(nx) cos(mx) = π m = n m = n , (13) 2π cos(nx) sin(mx) = 0. (14)

slide-7
SLIDE 7

Fourier series: Separation of Variables 4

So if f(x) =

  • n

αn sin(nx) + βn cos(nx). (15) then αn = 2π f(x) sin(nx)dx 2π sin2(nx)dx (16) βn = 2π f(x) cos(nx)dx 2π cos2(nx)dx . (17) and u(x, t) =

  • n

exp(−n2t) [αn sin(nx) + βn cos(nx)] (18) The Fast Fourier Transform allows one to find good approximations to αn and βn when the solution is found at a finite number of evenly spaced grid points

slide-8
SLIDE 8

The 1D Heat Equation: Finding derivatives and timestepping

Let u(x) =

  • k

ˆ uk exp(ikx) (19) then dνu dxν =

  • (ik)νˆ

uk. (20) Consider ut = uxx, which is approximated by ∂ˆ uk ∂t = α(ik)2ˆ uk (21) ˆ un+1

k

− ˆ un

k

h = α(ik)2ˆ un+1

k

(22) ˆ un+1

k

(1 − αh(ik)2) = ˆ un

k

(23) ˆ un+1

k

= ˆ un

k

(1 − αh(ik)2). (24)

slide-9
SLIDE 9

The 2D Allen Cahn Equation

Consider ut = ǫ(uxx + uyy) + u − u3, which is approximated by ∂ˆ u ∂t = ǫ

  • (ikx)2 + (iky)2

ˆ u + ˆ u − ˆ u3 (25) ˆ un+1 − ˆ un h = ǫ

  • (ikx)2 + (iky)2

ˆ un+1 + ˆ un − ˆ (un)3(26)

slide-10
SLIDE 10

The 3D Gray-Scott Equations

∂u ∂t = Du∆u + α (1 − u) − uv2, (27) ∂v ∂t = Dv∆v − βv + uv2. (28) Solved using a splitting method (More information on splitting for this at http://arxiv.org/abs/1310.3901) http://web.student.tuwien.ac.at/˜e1226394/

slide-11
SLIDE 11

The 2D nonlinear Schr¨

  • dinger Equation

iψt + ψxx + ψyy = |ψ|2ψ Solved using Fast Fourier Transform and splitting

slide-12
SLIDE 12

The 2D nonlinear Schr¨

  • dinger Equation

Table: Computation times in seconds for 20 time steps of 10−5 for a Fourier split step method for the cubic nonlinear Schr¨

  • dinger

equation on [−5π, 5π]2.

Grid GPU GPU GPU Xeon Phi CPU Size (Cuf) (Cuda) (OpenACC) (61 cores) (1 core) 2562 0.00802 0.0116 0.0130 0.0122 0.442 5122 0.0234 0.0315 0.0369 0.0291 1.94 10242 0.0851 0.105 0.132 0.118 12.7 20482 0.334 0.415 0.527 0.422 57.2 40962 1.49 2.02 2.30 1.626 329

slide-13
SLIDE 13

The Real Cubic Klein-Gordon Equation

utt − ∆u + u = |u|2u (29) E(u, ut) = 1 2|ut|2 + 1 2|u|2 + 1 2 |∇u|2 − 1 4 |u|4 dx (30) un+1 − 2un + un−1 (δt)2 − ∆un+1 + 2un + un−1 4 + un+1 + 2un + un−1 4 (31) = ±

  • un

2 un (32) Parallelization done using 2decomp library for FFT and processing independent loops Other time stepping algorithms possible, including splitting

slide-14
SLIDE 14

Simulations and Videos by Brian Leu, Albert Liu, and Parth Sheth

104 105 106 Number of Cores 100 101 102 Computation Time (s)

Measured Ideal

Figure: Strong scaling on Mira for a 40963 discretization.

http://www-personal.umich.edu/˜alberliu/ http://www-personal.umich.edu/˜brianleu/ http://www-personal.umich.edu/˜pssheth/

slide-15
SLIDE 15

The 2D and 3D Navier Stokes Equation

Consider incompressible case only ρ ∂u ∂t + u · ∇u

  • =

−∇p + µ∆u (33) ∇ · u = 0. (34) p pressure, µ viscosity, ρ, density 2D u(x, y) = (u(x, y), v(x, y)) 3D u(x, y, z) = (u(x, y, z), v(x, y, z), w(x, y, z))

slide-16
SLIDE 16

2D Vorticity-Streamfunction Formulation

ω = ∇ × u = ∂v ∂x − ∂u ∂y = −∆ψ ρ ∂ω ∂t + u ∂ω ∂x + v ∂ω ∂y

  • = µ∆ω

(35) and ∆ψ = −ω. (36)

slide-17
SLIDE 17

Time Discretization

ρ ωn+1,k+1 − ωn δt (37) +1 2

  • un+1,k ∂ωn+1,k

∂x + vn+1,k ∂ωn+1,k ∂y + un ∂ωn ∂x + vn ∂ωn ∂y

  • = µ

2∆

  • ωn+1,k+1 + ωn

, and ∆ψn+1,k+1 = −ωn+1,k+1, (38) un+1,k+1 = ∂ψn+1,k+1 ∂y , vn+1,k+1 = −∂ψn+1,k+1 ∂x . (39) Fixed point iteration used to obtain nonlinear terms

slide-18
SLIDE 18

Example Videos

http://www-personal.umich.edu/˜cloutbra/ research.html Simulations on a single NVIDIA Fermi GPU about 20 times faster than a 16 core CPU

slide-19
SLIDE 19

3D Equivalent Formulation

Simplification of equation with periodic boundary conditions ρ ∂u

∂t + u · ∇u

  • = −∇p + µ∆u

(40) ∇ · u = 0 (41) so ∇ ·

  • ρ

∂u

∂t + u · ∇u

  • = ∇ · [−∇p + µ∆u]

(42) ρ∇ · (u · ∇u) = −∆p (43) p = ∆−1 [∇ · (u · ∇u)] (44) so ρ ∂u

∂t + u · ∇u

  • = −ρ∇
  • ∆−1 [∇ · (u · ∇u)]
  • + µ∆u(45)
slide-20
SLIDE 20

3D Equivalent Formulation - Implicit Midpoint Time Discretization

ρ un+1,j+1 − un δt + un+1,j + un 2 · ∇ un+1,j + un 2

  • = ρ∇
  • ∆−1

∇ ·

  • (un+1,j + un) · ∇(un+1,j + un)
  • 4

+ µ∆un+1,j+1 + un 2 , Video of Taylor Green Vortex http://vimeo.com/87981782

slide-21
SLIDE 21

3D Equivalent Formulation - Carpenter-Kennedy Discretization

1: procedure RUNGE–KUTTA(u) 2:

h = 0

3:

u = un

4:

for k = 1 → 5 do

5:

h ← g(u) + βkh

6:

µ = 0.5δt(αk+1 − αk)

7:

v − µl(v) = u + γkδth + µl(u)

8:

u ← v

9:

end for

10:

un+1 = u

11: end procedure

slide-22
SLIDE 22

Performance

δt = 0.005 for 5123 and δt = 0.01 for 2563 grid points. For IMR scheme, fixed point iteration procedure was stopped once the difference between two successive iterates was less than 10−10 in l∞ norm of velocity fields. Method Grid Size Cores Time Steps Time (s)

Core Hours Timestep

IMR 2563 512 1000 4060 0.578 IMR 5123 1024 500 9899 5.68 CK 5123 4096 2000 7040 4.0

Table: Performance of Fourier pseudospectral code on Shaheen. IMR is an abbreviation for implicit midpoint rule and CK is an abbreviation for Carpenter–Kennedy.

slide-23
SLIDE 23

Kinetic Energy Evolution

2 4 6 8 10 0.08 0.09 0.1 0.11 0.12 0.13 Time Energy Kinetic Energy IMR 256 IMR 512 CK 512 Reference

Figure: KE of solutions are so close they are almost indistinguishable

slide-24
SLIDE 24

Kinetic Energy Dissipation Rate

2 4 6 8 10 0.002 0.004 0.006 0.008 0.01 0.012 0.014 Time Energy Dissipation Rate Kinetic Energy Dissipation Rate IMR 256 IMR 512 CK 512 Reference

Figure: Plot during the initial stage, where flow is essentially inviscid and laminar. Fully developed turbulent flow is observed around tmax ≈ 8.

slide-25
SLIDE 25

Kinetic Energy Dissipation Rate

2 4 6 8 10 −5 5 10 15x 10

−4

Time Energy Dissipation Rate Kinetic Energy Dissipation Rate IMR 256 − Reference IMR 512 − Reference CK 512 − Reference

Figure: Difference in kinetic energy dissipation rates between the current discretizations and the reference solution.

slide-26
SLIDE 26

Vorticity

Figure: Square of the vorticity in the plane centered at (π, 0, 0) with normal vector (1, 0, 0).

slide-27
SLIDE 27

Discrete energy equality for midpoint rule

u(t = T)2

l2 − u(t = 0)2 l2 = −µ

T ∇u2

l2dt

uN2

l2 − u02 l2 = −µ

4

N−1

  • n=0
  • Un + Un+1
  • 2

l2 δt.

slide-28
SLIDE 28

Conclusion on Navier Stokes Equations

At almost the same computational cost, both 2nd-order accurate IMR and 4th-order Carpenter-Kennedy time stepping method, capture same amount of detail of the flow for 5123. Simulations with 2563 grid points resulted in poor spatial convergences.

slide-29
SLIDE 29

The 3D Maxwell’s Equations

  • Dt − ∇ ×

H = 0

  • Bt + ∇ ×

E = 0 with the relation between the electromagnetic components given by the constitutive relations:

  • D = εo(x, y, z)

E

  • B = µo(x, y, z)

H Maxwell Simulation http://vimeo.com/71822380

slide-30
SLIDE 30

Further Work

Integration with other codes or simple examples for other spatial discretizations (one other example https://code.google.com/p/incompact3d/) Uniform interface for users with no programming background “Use MPI” vs. “Include mpif.h” C/C++ examples Better archiving and documentation procedure – currently wikibooks + github Integration with accelerators Integration with visualization tools Magnetohydrodynamics

slide-31
SLIDE 31

Conclusion

Easy to program numerical method which can be used to study semilinear partial differential equations Method parallelizes well on hardware with good communications so a good tool to introduce parallel programming ideas Research tool to investigate and provide conjectures for behavior of solutions to partial differential equations Research tool to investigate computer hardware performance and correctness Better user interface and integration with visualization would help make it easier for those without strong programming backgrounds

slide-32
SLIDE 32

Acknowledgements

This research used resources of the Oak Ridge Leadership Computing Facility at the Oak Ridge National Laboratory, which is supported by the Office of Science of the U.S. Department of Energy under Contract No. DE-AC05-00OR22725. This research used the resources of the Supercomputing Laboratory at King Abdullah University of Science and Technology. This research used resources of the Keeneland Computing Facility at the Georgia Institute of Technology, which is supported by the National Science Foundation under Contract OCI-0910735. This material is based upon work supported by the National Science Foundation under Grant Number 1137097 and by the University of Tennessee through the Beacon Project.

slide-33
SLIDE 33

Acknowledgements and References

  • W. Auzinger, O. Batrashev, K. Hillewaert, D. Ketcheson, O. Koch, B.

Leu, A. Liu, B. Palen, M. Parsani, W. Schlag, P . Sheth, M. Warnez The Blue Waters Undergraduate Petascale Education Program administered by the Shodor foundation The Division of Literature, Sciences and Arts at the University of Michigan Carpenter M.H. and Kennedy C. A. “Fourth-Order 2N-Storage Runge–Kutta schemes” NASA Langley Research Center Technical Memorandum 109112, (1994). http://en.wikibooks.org/wiki/Parallel_Spectral_ Numerical_Methods http://2decomp.org Beamer https://en.wikipedia.org/wiki/Beamer_(LaTeX)