the n body problem 2 3

The n-body problem (2/3) Prof. Richard Vuduc Georgia Institute of - PowerPoint PPT Presentation

The n-body problem (2/3) Prof. Richard Vuduc Georgia Institute of Technology CSE/CS 8803 PNA: Parallel Numerical Algorithms [L.21] Thursday, March 27, 2008 1 Todays sources CS 267 at UCB (Demmel & Yelick) Computational Science and


  1. The n-body problem (2/3) Prof. Richard Vuduc Georgia Institute of Technology CSE/CS 8803 PNA: Parallel Numerical Algorithms [L.21] Thursday, March 27, 2008 1

  2. Today’s sources CS 267 at UCB (Demmel & Yelick) Computational Science and Engineering , by Gilbert Strang Lectures 18–21 of M. Ricotti’s Astro 415 course at U. Maryland Jim Stone (Princeton) Andrey Kravtsov (U. Chicago) Mike Heath (UIUC) Changa (ChaNGa, U. Washington; based on CHARM++) 2

  3. Review: Parallel ODE solvers 3

  4. Example: Stellar dynamics in a globular cluster 4

  5. Rewrite as 1st-order ODE F i r i − r j � − G m j = ¨ r i = | r i − r j | 3 m i j � = i ⇓ � � � � d r i v i = F i /m i v i dt n particles ⇒ 6 n -element vector 5

  6. Given: F i r i − r j � = ¨ = − G m j r i 3 m i ( | r i − r j | 2 + ǫ 2 ) 2 j � = i r i ( t 0 ) , v i ( t 0 ) = . . . Solve: t t 0 ≥ � � � � d r i ( t ) v i ( t ) = v i ( t ) F i ( r ) /m i dt n particles ⇒ 6 n -element vector Computational tasks: 1. Solve ODE 2. Inner loop: compute forces 6

  7. IVP ODE solvers Many algorithms Design space Taylor series ( e.g. , Euler) Single vs. multistep Runge-Kutta Explicit vs. implicit Extrapolation No. of req’d function/ derivative evaluations Multistep Multivalue 7

  8. Sources of parallelism Multistage, e.g. , Runge-Kutta: Stage evaluation Evaluating right-hand side, e.g. , forces in gravitational n-body Solving linear/non-linear systems, e.g. , implicit methods Partitioning equations in multiple tasks – waveform relaxation   � � t, y ( k +1) , y ( k )    y ( k +1) f 1 1 2 d 1   ←  dt y ( k +1) � �   t, y ( k ) 1 , y ( k +1) f 2 2 2 8

  9. Leap-frog integrator r 0 r 1 r 2 r 3 r 4 ˙ ˙ ˙ ˙ ˙ r 1 r 3 r 5 r 7 r 9 2 2 2 2 2 h t + h � � 2 ≡ ˙ r k + h · v k + 1 v k + 1 r r k +1 ← 2 2 2 + h · g ( r k +1 ) v k + 3 v k + 1 ← 2 9

  10. Time reversibility of leap-frog Energy Source: http://www.physics.drexel.edu/courses/Comp_Phys/Integrators/leapfrog/ 10

  11. Efficiently computing forces: Particle-mesh (PM) approach 11

  12. Method 1: Direct summation (“Particle-Particle”) Most straightforward and accurate Expensive ⇒ O(N 2 ) Parallelization? 12

  13. Method 2: Particle-Mesh Idea: Gravitational field has a scalar potential ∇ × F ( r ) ≡ 0 = F ( r ) = −∇ φ ( r ) ⇒ Potential is given by Poisson’s equation ∇ 2 φ ( r ) = ρ ( r ) We know how to solve this! Just need a sensible ρ (r) 13

  14. ρ : Create a mesh and assign particles to cells: Nearest grid point Coarse ρ Charge-in-cell or particle-in-cell ( PIC ) Assign “shape” or “cloud” to each particle Smooths ρ Boundary conditions? Periodic or multipole Last step: ρ → Φ → F Finite differencing + interpolation 14

  15. Particle-in-cell: Replace points by distribution function 15

  16. PM: Pros & cons Pro: Poisson is warm and fuzzy! Many accurate and efficient solvers ( e.g. , multigrid, FFT) Con: Limitations of meshes How to choose no. of grid points? Cannot resolve interactions on scales smaller than grid size Hybrid PP-PM (“P 3 M”) methods possible 16

  17. Efficiently computing forces: Tree codes 17

  18. Approximating long-distance interactions D R D D R < θ = center of mass 18

  19. Repeat recursively R D D R 1 D 1 19

  20. Idea: Organize particles in space in a tree 20

  21. Adaptive quad-tree 21

  22. Source: M. Warren & J. Salmon, In Supercomputing 1993. 22

  23. Barnes-Hut algorithm (1986) Algorithm: Build tree For each node , compute center-of-mass and total mass For each particle , traverse tree to compute force on it 23

  24. D R D R < θ 24

  25. Other trees are possible, e.g. , kd-tree 25

  26. Fast multipole method of Greengard & Rokhlin (1987) Differences from Barnes-Hut Computes potential, not force Uses more than center-of- and total-mass ⇒ more accurate & expensive Accesses fixed set of boxes at every level, independent of “D / R” Increasing accuracy BH: Fixed info / box, more boxes FMM: Fixed no. of boxes; more info / box 26

  27. FMM computes compact expression for potential 1 | F ( r ) | = r 2 r ⇓ F ( r ) = −∇ φ ( r ) 27

  28. Potential in 3-D 3-D: − 1 1 φ ( r ) = | r | = − x 2 + y 2 + z 2 � � x � ∂φ � ∂ x, ∂φ ∂ y , ∂φ � r 3 , y r 3 , z F ( r ) = = − − r 3 ∂ z 28

  29. Potential in 2-D 2-D: x 2 + y 2 � φ ( r ) = ln | r | = ln � x � ∂φ � ∂ x, ∂φ � r 2 , y F ( r ) = = − − r 2 ∂ y For n points in the plane: n ( x − x k ) 2 + ( y − y k ) 2 � � φ ( x, y ) = m k ln k =1 29

  30. “Complex” representation n 2-D: ( x − x k ) 2 + ( y − y k ) 2 � � φ ( x, y ) = m k ln k =1 Complex plane: x + iy z ≡ n � φ ( z ) = m k ln | z − z k | k =1 � n � � = Re m k ln( z − z k ) k =1 30

  31. 2-D multipole expansion n � � 1 − z k � � m k ln( z − z k ) = M ln z + m k ln z k =1 k ∞ � z k � d 1 � � = M ln z + m k d z k d =1 � n � ∞ 1 � � m k z d = M ln z + k z d d =1 k =1 � �� � ≡ α d ∞ � Can approx. α d = M ln z + by truncation z d d =1 31

  32. 2-D multipole expansion n � m k z d ≡ α d k k =1 n ∞ α d � � m k ln( z − z k ) = M ln z + z d k =1 d =1 p α d � M ln z + z d + Error( p ) ≈ d =1 � p +1 � max | z k | Error( p ) ∼ | z | 32

  33. � p +1 � max | z k | Error( p ) ∼ | z | Error outside larger box is O( c p +1 ) D √ 2 c ∼ 2 D ≈ . 47 3 33

  34. Outer expansion n � m k z d ≡ α d k k =1 p n α d � � m k ln( z − z k ) M ln z + ≈ z d k =1 d =1 Outer(N) = {M, α 1 , α 2 , …, α p , z N } Use to evaluate ϕ (z) outside node N due to those inside N Centered at z N Evaluation costs is O(p) Cost linear with no. of bits of precision 34

  35. p n α d � � m k ln( z − z k ) M ln z + ≈ z d k =1 d =1 35

  36. p α (1) � d φ 1 ( z ) = M 1 ln( z − z 1 ) + ( z − z 1 ) d d =1 p α (2) � d φ 2 ( z ) = M 2 ln( z − z 2 ) + ( z − z 2 ) d d =1 For z outside dashed black box: φ 1 ( z ) φ 2 ( z ) ∼     α (2) α (1) 1 1 . .     . . = A ( z 1 ) · ⇒ ≈ . .         α (2) α (1) d d Outer(N2) = Outer_shift(Outer(N1), z 2 ) 36

  37. Inner expansion Outer(N) = {M, α 1 , α 2 , …, α p , z N } p n α d � � m k ln( z − z k ) M ln z + ≈ z d k =1 d =1 Similarly, Inner(N) evaluates potential inside N from particles outside. p � β d ( z − z N ) d d =1 37

  38. Inner expansion Need Inner(N4) = Inner(N1) = Inner_shift(Inner(N2), z 1 ) Convert(Outer(N3)) 38

  39. FMM algorithm Build tree Bottom-up traversal to compute Outer(N) Top-down traversal to compute Inner(N) For each leaf N, add contributions of nearest particles directly into Inner(N) 39

  40. FMM algorithm Build tree Bottom-up traversal to compute Outer(N) Top-down traversal to compute Inner(N) For each leaf N, add contributions of nearest particles directly into Inner(N) 40

  41. FMM algorithm Build tree Bottom-up traversal to compute Outer(N) Top-down traversal to compute Inner(N) For each leaf N, add contributions of nearest particles directly into Inner(N) 41

  42. Building Outer(N) 42

  43. FMM algorithm Build tree Bottom-up traversal to compute Outer(N) Top-down traversal to compute Inner(N) For each leaf N, add contributions of nearest particles directly into Inner(N) 43

  44. Building Inner(N) 44

  45. FMM algorithm Build tree Bottom-up traversal to compute Outer(N) Top-down traversal to compute Inner(N) For each leaf N, add contributions of nearest particles directly into Inner(N) 45

  46. Administrivia 46

  47. Upcoming schedule changes Some adjustment of topics (TBD) Tu 4/1 — Esteemed colleague R. Riegel on “Dual tree algorithms in statistics” Th 4/3 — Attend talk by Dr. Douglass Post from DoD HPC Modernization Program, 9:30–10:30am, room MiRC 102A&B No HW 2 (optional assignment possible) Project checkpoint (3 page max): Tu 4/8 47

  48. “In conclusion…” 48

  49. Ideas apply broadly Physical sciences, e.g. , Plasmas Molecular dynamics Electron-beam lithography device simulation Fluid dynamics “Generalized” n-body problems: Talk to your classmate, Ryan Riegel 49

  50. Backup slides 50

  51. 2-D multipole expansion � n � � φ ( z ) = Re m k ln( z − z k ) k =1 ⇓ n 1 − z k � � � � m k ln( z − z k ) = M ln z + m k ln z k =1 k ∞ α d � = M ln z + z d d =1 Approx. by truncating sum 51

Recommend


More recommend