The n-body problem (2/3) Prof. Richard Vuduc Georgia Institute of - - PowerPoint PPT Presentation

the n body problem 2 3
SMART_READER_LITE
LIVE PREVIEW

The n-body problem (2/3) Prof. Richard Vuduc Georgia Institute of - - PowerPoint PPT Presentation

The n-body problem (2/3) Prof. Richard Vuduc Georgia Institute of Technology CSE/CS 8803 PNA: Parallel Numerical Algorithms [L.21] Thursday, March 27, 2008 1 Todays sources CS 267 at UCB (Demmel & Yelick) Computational Science and


slide-1
SLIDE 1

The n-body problem (2/3)

  • Prof. Richard Vuduc

Georgia Institute of Technology CSE/CS 8803 PNA: Parallel Numerical Algorithms [L.21] Thursday, March 27, 2008

1

slide-2
SLIDE 2

Today’s sources

CS 267 at UCB (Demmel & Yelick) Computational Science and Engineering, by Gilbert Strang Lectures 18–21 of M. Ricotti’s Astro 415 course at U. Maryland Jim Stone (Princeton) Andrey Kravtsov (U. Chicago) Mike Heath (UIUC) Changa (ChaNGa, U. Washington; based on CHARM++)

2

slide-3
SLIDE 3

Review: Parallel ODE solvers

3

slide-4
SLIDE 4

Example: Stellar dynamics in a globular cluster

4

slide-5
SLIDE 5

Rewrite as 1st-order ODE

n particles ⇒ 6n-element vector

Fi mi = ¨ ri = −G

  • j=i

mj ri − rj |ri − rj|3 ⇓ d dt

  • ri

vi

  • =
  • vi

Fi/mi

  • 5
slide-6
SLIDE 6

Fi mi = ¨ ri = −G

  • j=i

mj ri − rj (|ri − rj|2 + ǫ2)

3 2

ri(t0), vi(t0) = . . . t ≥ t0 d dt

  • ri(t)

vi(t)

  • =
  • vi(t)

Fi(r)/mi

  • Given:

Solve: n particles ⇒ 6n-element vector Computational tasks:

  • 1. Solve ODE
  • 2. Inner loop: compute forces

6

slide-7
SLIDE 7

IVP ODE solvers

Many algorithms

Taylor series (e.g., Euler) Runge-Kutta Extrapolation Multistep Multivalue

Design space

Single vs. multistep Explicit vs. implicit

  • No. of req’d function/

derivative evaluations

7

slide-8
SLIDE 8

Sources of parallelism

Multistage, e.g., Runge-Kutta: Stage evaluation Evaluating right-hand side, e.g., forces in gravitational n-body Solving linear/non-linear systems, e.g., implicit methods Partitioning equations in multiple tasks – waveform relaxation

d dt   y(k+1)

1

y(k+1)

2

  ←    f1

  • t, y(k+1)

1

, y(k)

2

  • f2
  • t, y(k)

1 , y(k+1) 2

 

8

slide-9
SLIDE 9

Leap-frog integrator

vk+ 1

2 ≡ ˙

r

  • t + h

2

  • rk+1

← rk + h · vk+ 1

2

vk+ 3

2

← vk+ 1

2 + h · g(rk+1)

h ˙ r 3

2

˙ r 1

2

˙ r 5

2

˙ r 7

2

˙ r 9

2

r4 r3 r2 r1 r0

9

slide-10
SLIDE 10

Time reversibility of leap-frog Source: http://www.physics.drexel.edu/courses/Comp_Phys/Integrators/leapfrog/ Energy

10

slide-11
SLIDE 11

Efficiently computing forces: Particle-mesh (PM) approach

11

slide-12
SLIDE 12

Method 1: Direct summation (“Particle-Particle”)

Most straightforward and accurate Expensive ⇒ O(N2) Parallelization?

12

slide-13
SLIDE 13

Method 2: Particle-Mesh

Idea: Gravitational field has a scalar potential Potential is given by Poisson’s equation We know how to solve this! Just need a sensible ρ(r)

∇2φ(r) = ρ(r) ∇ × F(r) ≡ 0 = ⇒ F(r) = −∇φ(r)

13

slide-14
SLIDE 14

ρ: Create a mesh and assign particles to cells: Nearest grid point

Coarse ρ

Charge-in-cell or particle-in-cell (PIC)

Assign “shape” or “cloud” to each particle Smooths ρ

Boundary conditions?

Periodic or multipole

Last step: ρ → Φ → F

Finite differencing + interpolation

14

slide-15
SLIDE 15

Particle-in-cell: Replace points by distribution function

15

slide-16
SLIDE 16

PM: Pros & cons

Pro: Poisson is warm and fuzzy!

Many accurate and efficient solvers (e.g., multigrid, FFT)

Con: Limitations of meshes

How to choose no. of grid points? Cannot resolve interactions on scales smaller than grid size

Hybrid PP-PM (“P3M”) methods possible

16

slide-17
SLIDE 17

Efficiently computing forces: Tree codes

17

slide-18
SLIDE 18

Approximating long-distance interactions

D D R

= center of mass

D R < θ

18

slide-19
SLIDE 19

Repeat recursively

D R

D

D1 R1

19

slide-20
SLIDE 20

Idea: Organize particles in space in a tree

20

slide-21
SLIDE 21

Adaptive quad-tree

21

slide-22
SLIDE 22

Source: M. Warren & J. Salmon, In Supercomputing 1993.

22

slide-23
SLIDE 23

Barnes-Hut algorithm (1986)

Algorithm:

Build tree For each node, compute center-of-mass and total mass For each particle, traverse tree to compute force on it

23

slide-24
SLIDE 24

D R < θ

D R

24

slide-25
SLIDE 25

Other trees are possible, e.g., kd-tree

25

slide-26
SLIDE 26

Fast multipole method of Greengard & Rokhlin (1987)

Differences from Barnes-Hut

Computes potential, not force Uses more than center-of- and total-mass ⇒ more accurate & expensive Accesses fixed set of boxes at every level, independent of “D / R”

Increasing accuracy

BH: Fixed info / box, more boxes FMM: Fixed no. of boxes; more info / box

26

slide-27
SLIDE 27

FMM computes compact expression for potential

r |F(r)| = 1 r2 ⇓ F(r) = −∇φ(r)

27

slide-28
SLIDE 28

Potential in 3-D

3-D:

φ(r) = − 1 |r| = − 1

  • x2 + y2 + z2

F(r) = − ∂φ ∂x, ∂φ ∂y , ∂φ ∂z

  • = −

x r3 , y r3 , z r3

  • 28
slide-29
SLIDE 29

Potential in 2-D

φ(x, y) =

n

  • k=1

mk ln

  • (x − xk)2 + (y − yk)2

2-D:

φ(r) = ln |r| = ln

  • x2 + y2

F(r) = − ∂φ ∂x, ∂φ ∂y

  • = −

x r2 , y r2

  • For n points in the plane:

29

slide-30
SLIDE 30

“Complex” representation

2-D: Complex plane:

φ(x, y) =

n

  • k=1

mk ln

  • (x − xk)2 + (y − yk)2

z ≡ x + iy φ(z) =

n

  • k=1

mk ln |z − zk| = Re n

  • k=1

mk ln(z − zk)

  • 30
slide-31
SLIDE 31

2-D multipole expansion

n

  • k=1

mk ln(z − zk) = M ln z +

  • k

mk ln

  • 1 − zk

z

  • =

M ln z +

  • k

mk

  • d=1

1 d zk z d = M ln z +

  • d=1

n

  • k=1

mkzd

k

  • ≡αd

1 zd = M ln z +

  • d=1

αd zd

Can approx. by truncation

31

slide-32
SLIDE 32

2-D multipole expansion

Error(p) ∼ max |zk| |z| p+1

αd ≡

n

  • k=1

mkzd

k n

  • k=1

mk ln(z − zk) = M ln z +

  • d=1

αd zd ≈ M ln z +

p

  • d=1

αd zd + Error(p)

32

slide-33
SLIDE 33

Error(p) ∼ max |zk| |z| p+1

Error outside larger box is O(cp+1)

c ∼

D √ 2 3 2D ≈ .47

33

slide-34
SLIDE 34

Outer expansion

Outer(N) = {M, α1, α2, …, αp, zN}

Use to evaluate ϕ(z) outside node N due to those inside N Centered at zN Evaluation costs is O(p) Cost linear with no. of bits of precision

αd ≡

n

  • k=1

mkzd

k n

  • k=1

mk ln(z − zk) ≈ M ln z +

p

  • d=1

αd zd

34

slide-35
SLIDE 35

n

  • k=1

mk ln(z − zk) ≈ M ln z +

p

  • d=1

αd zd

35

slide-36
SLIDE 36

φ1(z) = M1 ln(z − z1) +

p

  • d=1

α(1)

d

(z − z1)d φ2(z) = M2 ln(z − z2) +

p

  • d=1

α(2)

d

(z − z2)d

For z outside dashed black box: Outer(N2) = Outer_shift(Outer(N1), z2)

φ1(z) ∼ φ2(z) = ⇒     α(2)

1

. . . α(2)

d

    ≈ A(z1) ·     α(1)

1

. . . α(1)

d

   

36

slide-37
SLIDE 37

Inner expansion

Outer(N) = {M, α1, α2, …, αp, zN} Similarly, Inner(N) evaluates potential inside N from particles outside.

n

  • k=1

mk ln(z − zk) ≈ M ln z +

p

  • d=1

αd zd

p

  • d=1

βd(z − zN)d

37

slide-38
SLIDE 38

Inner expansion

Inner(N1) = Inner_shift(Inner(N2), z1) Need Inner(N4) = Convert(Outer(N3))

38

slide-39
SLIDE 39

FMM algorithm

Build tree Bottom-up traversal to compute Outer(N) Top-down traversal to compute Inner(N) For each leaf N, add contributions of nearest particles directly into Inner(N)

39

slide-40
SLIDE 40

FMM algorithm

Build tree Bottom-up traversal to compute Outer(N) Top-down traversal to compute Inner(N) For each leaf N, add contributions of nearest particles directly into Inner(N)

40

slide-41
SLIDE 41

FMM algorithm

Build tree Bottom-up traversal to compute Outer(N) Top-down traversal to compute Inner(N) For each leaf N, add contributions of nearest particles directly into Inner(N)

41

slide-42
SLIDE 42

Building Outer(N)

42

slide-43
SLIDE 43

FMM algorithm

Build tree Bottom-up traversal to compute Outer(N) Top-down traversal to compute Inner(N) For each leaf N, add contributions of nearest particles directly into Inner(N)

43

slide-44
SLIDE 44

Building Inner(N)

44

slide-45
SLIDE 45

FMM algorithm

Build tree Bottom-up traversal to compute Outer(N) Top-down traversal to compute Inner(N) For each leaf N, add contributions of nearest particles directly into Inner(N)

45

slide-46
SLIDE 46

Administrivia

46

slide-47
SLIDE 47

Upcoming schedule changes

Some adjustment of topics (TBD) Tu 4/1 — Esteemed colleague

  • R. Riegel on “Dual tree algorithms in statistics”

Th 4/3 — Attend talk by Dr. Douglass Post from DoD HPC Modernization Program, 9:30–10:30am, room MiRC 102A&B No HW 2 (optional assignment possible) Project checkpoint (3 page max): Tu 4/8

47

slide-48
SLIDE 48

“In conclusion…”

48

slide-49
SLIDE 49

Ideas apply broadly

Physical sciences, e.g.,

Plasmas Molecular dynamics Electron-beam lithography device simulation Fluid dynamics

“Generalized” n-body problems: Talk to your classmate, Ryan Riegel

49

slide-50
SLIDE 50

Backup slides

50

slide-51
SLIDE 51

2-D multipole expansion

φ(z) = Re n

  • k=1

mk ln(z − zk)

n

  • k=1

mk ln(z − zk) = M ln z +

  • k

mk ln

  • 1 − zk

z

  • =

M ln z +

  • d=1

αd zd

  • Approx. by truncating sum

51