The n-body problem (2/3) Prof. Richard Vuduc Georgia Institute of Technology CSE/CS 8803 PNA: Parallel Numerical Algorithms [L.21] Thursday, March 27, 2008 1
Today’s sources CS 267 at UCB (Demmel & Yelick) Computational Science and Engineering , by Gilbert Strang Lectures 18–21 of M. Ricotti’s Astro 415 course at U. Maryland Jim Stone (Princeton) Andrey Kravtsov (U. Chicago) Mike Heath (UIUC) Changa (ChaNGa, U. Washington; based on CHARM++) 2
Review: Parallel ODE solvers 3
Example: Stellar dynamics in a globular cluster 4
Rewrite as 1st-order ODE F i r i − r j � − G m j = ¨ r i = | r i − r j | 3 m i j � = i ⇓ � � � � d r i v i = F i /m i v i dt n particles ⇒ 6 n -element vector 5
Given: F i r i − r j � = ¨ = − G m j r i 3 m i ( | r i − r j | 2 + ǫ 2 ) 2 j � = i r i ( t 0 ) , v i ( t 0 ) = . . . Solve: t t 0 ≥ � � � � d r i ( t ) v i ( t ) = v i ( t ) F i ( r ) /m i dt n particles ⇒ 6 n -element vector Computational tasks: 1. Solve ODE 2. Inner loop: compute forces 6
IVP ODE solvers Many algorithms Design space Taylor series ( e.g. , Euler) Single vs. multistep Runge-Kutta Explicit vs. implicit Extrapolation No. of req’d function/ derivative evaluations Multistep Multivalue 7
Sources of parallelism Multistage, e.g. , Runge-Kutta: Stage evaluation Evaluating right-hand side, e.g. , forces in gravitational n-body Solving linear/non-linear systems, e.g. , implicit methods Partitioning equations in multiple tasks – waveform relaxation � � t, y ( k +1) , y ( k ) y ( k +1) f 1 1 2 d 1 ← dt y ( k +1) � � t, y ( k ) 1 , y ( k +1) f 2 2 2 8
Leap-frog integrator r 0 r 1 r 2 r 3 r 4 ˙ ˙ ˙ ˙ ˙ r 1 r 3 r 5 r 7 r 9 2 2 2 2 2 h t + h � � 2 ≡ ˙ r k + h · v k + 1 v k + 1 r r k +1 ← 2 2 2 + h · g ( r k +1 ) v k + 3 v k + 1 ← 2 9
Time reversibility of leap-frog Energy Source: http://www.physics.drexel.edu/courses/Comp_Phys/Integrators/leapfrog/ 10
Efficiently computing forces: Particle-mesh (PM) approach 11
Method 1: Direct summation (“Particle-Particle”) Most straightforward and accurate Expensive ⇒ O(N 2 ) Parallelization? 12
Method 2: Particle-Mesh Idea: Gravitational field has a scalar potential ∇ × F ( r ) ≡ 0 = F ( r ) = −∇ φ ( r ) ⇒ Potential is given by Poisson’s equation ∇ 2 φ ( r ) = ρ ( r ) We know how to solve this! Just need a sensible ρ (r) 13
ρ : Create a mesh and assign particles to cells: Nearest grid point Coarse ρ Charge-in-cell or particle-in-cell ( PIC ) Assign “shape” or “cloud” to each particle Smooths ρ Boundary conditions? Periodic or multipole Last step: ρ → Φ → F Finite differencing + interpolation 14
Particle-in-cell: Replace points by distribution function 15
PM: Pros & cons Pro: Poisson is warm and fuzzy! Many accurate and efficient solvers ( e.g. , multigrid, FFT) Con: Limitations of meshes How to choose no. of grid points? Cannot resolve interactions on scales smaller than grid size Hybrid PP-PM (“P 3 M”) methods possible 16
Efficiently computing forces: Tree codes 17
Approximating long-distance interactions D R D D R < θ = center of mass 18
Repeat recursively R D D R 1 D 1 19
Idea: Organize particles in space in a tree 20
Adaptive quad-tree 21
Source: M. Warren & J. Salmon, In Supercomputing 1993. 22
Barnes-Hut algorithm (1986) Algorithm: Build tree For each node , compute center-of-mass and total mass For each particle , traverse tree to compute force on it 23
D R D R < θ 24
Other trees are possible, e.g. , kd-tree 25
Fast multipole method of Greengard & Rokhlin (1987) Differences from Barnes-Hut Computes potential, not force Uses more than center-of- and total-mass ⇒ more accurate & expensive Accesses fixed set of boxes at every level, independent of “D / R” Increasing accuracy BH: Fixed info / box, more boxes FMM: Fixed no. of boxes; more info / box 26
FMM computes compact expression for potential 1 | F ( r ) | = r 2 r ⇓ F ( r ) = −∇ φ ( r ) 27
Potential in 3-D 3-D: − 1 1 φ ( r ) = | r | = − x 2 + y 2 + z 2 � � x � ∂φ � ∂ x, ∂φ ∂ y , ∂φ � r 3 , y r 3 , z F ( r ) = = − − r 3 ∂ z 28
Potential in 2-D 2-D: x 2 + y 2 � φ ( r ) = ln | r | = ln � x � ∂φ � ∂ x, ∂φ � r 2 , y F ( r ) = = − − r 2 ∂ y For n points in the plane: n ( x − x k ) 2 + ( y − y k ) 2 � � φ ( x, y ) = m k ln k =1 29
“Complex” representation n 2-D: ( x − x k ) 2 + ( y − y k ) 2 � � φ ( x, y ) = m k ln k =1 Complex plane: x + iy z ≡ n � φ ( z ) = m k ln | z − z k | k =1 � n � � = Re m k ln( z − z k ) k =1 30
2-D multipole expansion n � � 1 − z k � � m k ln( z − z k ) = M ln z + m k ln z k =1 k ∞ � z k � d 1 � � = M ln z + m k d z k d =1 � n � ∞ 1 � � m k z d = M ln z + k z d d =1 k =1 � �� � ≡ α d ∞ � Can approx. α d = M ln z + by truncation z d d =1 31
2-D multipole expansion n � m k z d ≡ α d k k =1 n ∞ α d � � m k ln( z − z k ) = M ln z + z d k =1 d =1 p α d � M ln z + z d + Error( p ) ≈ d =1 � p +1 � max | z k | Error( p ) ∼ | z | 32
� p +1 � max | z k | Error( p ) ∼ | z | Error outside larger box is O( c p +1 ) D √ 2 c ∼ 2 D ≈ . 47 3 33
Outer expansion n � m k z d ≡ α d k k =1 p n α d � � m k ln( z − z k ) M ln z + ≈ z d k =1 d =1 Outer(N) = {M, α 1 , α 2 , …, α p , z N } Use to evaluate ϕ (z) outside node N due to those inside N Centered at z N Evaluation costs is O(p) Cost linear with no. of bits of precision 34
p n α d � � m k ln( z − z k ) M ln z + ≈ z d k =1 d =1 35
p α (1) � d φ 1 ( z ) = M 1 ln( z − z 1 ) + ( z − z 1 ) d d =1 p α (2) � d φ 2 ( z ) = M 2 ln( z − z 2 ) + ( z − z 2 ) d d =1 For z outside dashed black box: φ 1 ( z ) φ 2 ( z ) ∼ α (2) α (1) 1 1 . . . . = A ( z 1 ) · ⇒ ≈ . . α (2) α (1) d d Outer(N2) = Outer_shift(Outer(N1), z 2 ) 36
Inner expansion Outer(N) = {M, α 1 , α 2 , …, α p , z N } p n α d � � m k ln( z − z k ) M ln z + ≈ z d k =1 d =1 Similarly, Inner(N) evaluates potential inside N from particles outside. p � β d ( z − z N ) d d =1 37
Inner expansion Need Inner(N4) = Inner(N1) = Inner_shift(Inner(N2), z 1 ) Convert(Outer(N3)) 38
FMM algorithm Build tree Bottom-up traversal to compute Outer(N) Top-down traversal to compute Inner(N) For each leaf N, add contributions of nearest particles directly into Inner(N) 39
FMM algorithm Build tree Bottom-up traversal to compute Outer(N) Top-down traversal to compute Inner(N) For each leaf N, add contributions of nearest particles directly into Inner(N) 40
FMM algorithm Build tree Bottom-up traversal to compute Outer(N) Top-down traversal to compute Inner(N) For each leaf N, add contributions of nearest particles directly into Inner(N) 41
Building Outer(N) 42
FMM algorithm Build tree Bottom-up traversal to compute Outer(N) Top-down traversal to compute Inner(N) For each leaf N, add contributions of nearest particles directly into Inner(N) 43
Building Inner(N) 44
FMM algorithm Build tree Bottom-up traversal to compute Outer(N) Top-down traversal to compute Inner(N) For each leaf N, add contributions of nearest particles directly into Inner(N) 45
Administrivia 46
Upcoming schedule changes Some adjustment of topics (TBD) Tu 4/1 — Esteemed colleague R. Riegel on “Dual tree algorithms in statistics” Th 4/3 — Attend talk by Dr. Douglass Post from DoD HPC Modernization Program, 9:30–10:30am, room MiRC 102A&B No HW 2 (optional assignment possible) Project checkpoint (3 page max): Tu 4/8 47
“In conclusion…” 48
Ideas apply broadly Physical sciences, e.g. , Plasmas Molecular dynamics Electron-beam lithography device simulation Fluid dynamics “Generalized” n-body problems: Talk to your classmate, Ryan Riegel 49
Backup slides 50
2-D multipole expansion � n � � φ ( z ) = Re m k ln( z − z k ) k =1 ⇓ n 1 − z k � � � � m k ln( z − z k ) = M ln z + m k ln z k =1 k ∞ α d � = M ln z + z d d =1 Approx. by truncating sum 51
Recommend
More recommend