ChaNGa: CHArm N-body GrAvity

SLIDE 1

ChaNGa CHArm N-body GrAvity

SLIDE 2

Thomas Quinn Graeme Lufkin Joachim Stadel Laxmikant Kale Filippo Gioachin Pritish Jetley Celso Mendes Amit Sharma

SLIDE 3

Outline

  • Scientific background
    – How to build a Galaxy
    – Types of Simulations
    – Simulation Challenges
  • ChaNGa and those Challenges
    – Features
    – Tree gravity
    – Load balancing
    – Multistepping
  • Future Challenges
    – Needed Simulations
    – Technology Challenges

SLIDE 4

Image courtesy NASA/WMAP

Cosmology: How does this ...

SLIDE 5

... turn into this?

SLIDE 6

Computational Cosmology

  • CMB gives fluctuations of 1e-5
  • Galaxies are overdense by 1e7
  • It happens through gravitational collapse
  • Making testable predictions from a cosmological hypothesis requires
    – Non-linear, dynamic calculation
    – e.g. computer simulation

SLIDE 7

Simulation process

  • Start with fluctuations based on Dark Matter properties
  • Follow model analytically (good enough to get CMB)
  • Create a realization of these fluctuations in particles.
  • Follow the motions of these particles as they interact via gravity.
  • Compare final distribution of particles with observed properties of galaxies.

SLIDE 8

Simulating galaxies: Procedure

  • 1. Simulate a 100 Mpc volume at 10-100 kpc resolution
  • 2. Pick candidate galaxies for further study
  • 3. Resimulate galaxies with the same large scale structure but with higher resolution, and lower resolution in the rest of the computational volume.
  • 4. At higher resolutions, include gas physics and star formation.

SLIDE 9
SLIDE 10

Figure labels: Gas, Stars, Dark Matter

SLIDE 11

05/02/08 Parallel Programming Laboratory @ UIUC 11

Types of simulations

Figure labels: Zoom In, “Uniform” Volume, Star Cluster

SLIDE 12

Computational Challenges

  • Large spatial dynamic range: > 100 Mpc to < 1 kpc
    – Hierarchical, adaptive gravity solver is needed
  • Large temporal dynamic range: 10 Gyr to 1 Myr
    – Multiple timestep algorithm is needed
  • Gravity is a long range force
    – Hierarchical information needs to go across processor domains

SLIDE 13
The existing code:

  • Multi-Platform
  • Massively Parallel (100s; 1000s on large sims)
  • Treecode with periodic boundary conditions
  • Multi-stepping (but bad load balancing)
  • Hydrodynamics (via SPH) with radiative cooling
  • UV background
  • Star Formation
  • Supernovae feedback into thermal energy

SLIDE 14

ChaNGa Features

  • Tree-based gravity solver
  • High order multipole expansion
  • Periodic boundaries (if needed)
  • Individual multiple timesteps
  • Dynamic load balancing with choice of strategies
  • Checkpointing
  • Visualization
  • Built from the ground up on Charm++
SLIDE 15

Need for high multipole order

SLIDE 16

Space decomposition

TreePiece 1 TreePiece 2 TreePiece 3 ...
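
The slide does not say how particles are assigned to TreePieces. One common approach, shown here only as an assumption, is to sort particles along a space-filling curve and cut the sorted list into contiguous chunks. The sketch below uses a 2D Morton (Z-order) key; `morton_key`, `decompose` and the `bits` parameter are hypothetical names, not ChaNGa identifiers.

```python
def morton_key(x, y, bits=10):
    """Interleave the bits of quantized x and y (both in [0, 1)) into one key."""
    cells = 1 << bits
    ix = min(int(x * cells), cells - 1)
    iy = min(int(y * cells), cells - 1)
    key = 0
    for b in range(bits):
        key |= ((ix >> b) & 1) << (2 * b)       # x bits go to even positions
        key |= ((iy >> b) & 1) << (2 * b + 1)   # y bits go to odd positions
    return key


def decompose(points, n_pieces):
    """Return n_pieces lists of particle indices, contiguous along the curve."""
    order = sorted(range(len(points)), key=lambda i: morton_key(*points[i]))
    n = len(order)
    return [order[k * n // n_pieces:(k + 1) * n // n_pieces]
            for k in range(n_pieces)]


# Four particles, one per quadrant, split into two pieces.
points = [(0.9, 0.9), (0.1, 0.1), (0.1, 0.9), (0.9, 0.1)]
pieces = decompose(points, 2)
```

Because nearby keys tend to be nearby in space, each piece covers a compact region, which keeps most tree traversal local to the piece.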

SLIDE 17

Basic algorithm ...

  • Newtonian gravity interaction
    – Each particle is influenced by all others: O(n²) algorithm
  • Barnes-Hut approximation: O(n log n)
    – Influence from distant particles combined into center of mass
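
The two approaches can be compared in a minimal 2D sketch. It assumes a monopole-only (center of mass) approximation, a softening length `EPS` and an opening angle `theta`; these names and the quadtree layout are illustrative and are not taken from ChaNGa, which according to the feature list uses a high order multipole expansion.

```python
import math
import random

G = 1.0      # gravitational constant in code units (assumption)
EPS = 1e-3   # softening length, avoids singular forces (assumption)


def pair_accel(xi, yi, xj, yj, mj):
    """Softened acceleration at (xi, yi) due to a point mass mj at (xj, yj)."""
    dx, dy = xj - xi, yj - yi
    r2 = dx * dx + dy * dy + EPS * EPS
    f = G * mj / (r2 * math.sqrt(r2))
    return f * dx, f * dy


def direct_accel(i, pos, mass):
    """Direct sum over all other particles: O(n) per particle, O(n^2) total."""
    ax = ay = 0.0
    for j in range(len(pos)):
        if j != i:
            fx, fy = pair_accel(*pos[i], *pos[j], mass[j])
            ax, ay = ax + fx, ay + fy
    return ax, ay


class Node:
    """Square quadtree cell storing its total mass and center of mass."""

    def __init__(self, cx, cy, half, idx, pos, mass):
        self.half, self.idx, self.children = half, idx, []
        self.m = sum(mass[i] for i in idx)
        self.x = sum(mass[i] * pos[i][0] for i in idx) / self.m
        self.y = sum(mass[i] * pos[i][1] for i in idx) / self.m
        if len(idx) > 1 and half > 1e-9:          # split until one particle
            h = half / 2
            for sx in (-1, 1):
                for sy in (-1, 1):
                    sub = [i for i in idx
                           if (pos[i][0] >= cx) == (sx > 0)
                           and (pos[i][1] >= cy) == (sy > 0)]
                    if sub:
                        self.children.append(
                            Node(cx + sx * h, cy + sy * h, h, sub, pos, mass))


def tree_accel(i, node, pos, mass, theta):
    """Barnes-Hut walk: use a cell's center of mass when it looks small."""
    xi, yi = pos[i]
    ax = ay = 0.0
    if not node.children:                          # leaf: sum directly
        for j in node.idx:
            if j != i:
                fx, fy = pair_accel(xi, yi, *pos[j], mass[j])
                ax, ay = ax + fx, ay + fy
        return ax, ay
    dist = math.hypot(node.x - xi, node.y - yi)
    if 2 * node.half < theta * dist:               # far enough: one interaction
        return pair_accel(xi, yi, node.x, node.y, node.m)
    for child in node.children:                    # too close: open the cell
        fx, fy = tree_accel(i, child, pos, mass, theta)
        ax, ay = ax + fx, ay + fy
    return ax, ay


random.seed(1)
pos = [(random.random(), random.random()) for _ in range(200)]
mass = [1.0 / len(pos)] * len(pos)
root = Node(0.5, 0.5, 0.5, list(range(len(pos))), pos, mass)
```

With `theta = 0` every cell is opened and the walk reduces to the direct sum. A larger `theta` trades accuracy for fewer interactions per particle.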

SLIDE 18

... in parallel

  • Remote data
    – Need to fetch from other processors
  • Data reuse
    – Same data needed by more than one particle

SLIDE 19

Overall algorithm

[Diagram: timeline of Processor 1 from start to end of the computation.
TreePieces A, B and C each begin with a prefetch visit of the tree, then
perform local work (low priority), global work and remote work.
On a miss during remote work, the TreePiece sends a node request to the
CacheManager, which checks whether the node is present.
YES: the CacheManager returns the node.
NO: the CacheManager fetches it from Processor n, which replies with the
requested data; the reply is buffered and the TreePiece is called back.
Two elements of the diagram are labeled high priority.]
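
The request path through the CacheManager can be sketched as follows. This is a toy, single-process model under stated assumptions: the class layout, the `request`/`receive` method names and the `fetch` hook are hypothetical and are not the ChaNGa or Charm++ API. It shows only the two properties named on the slides: a remote node is fetched once, and every TreePiece waiting for it is called back when the reply arrives.

```python
class CacheManager:
    """Per-processor software cache for remote tree nodes (hypothetical API)."""

    def __init__(self, fetch):
        self.fetch = fetch      # fetch(key) sends a request to the owner
        self.nodes = {}         # key -> node data already received
        self.waiting = {}       # key -> callbacks of TreePieces awaiting it
        self.fetches = 0        # number of remote requests actually sent

    def request(self, key, callback):
        """Return True on a hit; on a miss, register callback and fetch once."""
        if key in self.nodes:                   # present: return immediately
            callback(self.nodes[key])
            return True
        pending = self.waiting.get(key)
        if pending is not None:                 # already in flight: just wait
            pending.append(callback)
            return False
        self.waiting[key] = [callback]          # first miss: send one request
        self.fetches += 1
        self.fetch(key)
        return False

    def receive(self, key, data):
        """Store the reply and call back every TreePiece that asked for it."""
        self.nodes[key] = data
        for callback in self.waiting.pop(key, []):
            callback(data)


outbox = []                     # stands in for messages to other processors
cache = CacheManager(outbox.append)
got = []

# Three TreePieces miss on the same remote node; only one request goes out.
for piece in ("A", "B", "C"):
    cache.request("node-7", lambda data, piece=piece: got.append((piece, data)))

# The owning processor replies with the requested data.
for key in list(outbox):
    cache.receive(key, {"mass": 2.5})

# A later request for the same node is a hit.
hit = cache.request("node-7", lambda data: got.append(("A again", data)))
```

In a message-driven runtime the TreePiece would continue with other work between `request` and the callback, which is what lets local work fill the time spent waiting on remote data.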

SLIDE 20

Scaling: comparison

Uniform 3M on Tungsten

SLIDE 21

Load balancing with GreedyLB

Zoom In 5M on 1,024 BlueGene/L processors

5.6s 6.1s 4x messages
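
A minimal sketch of a greedy strategy of the kind the name GreedyLB suggests: take objects from heaviest to lightest and give each to the currently least-loaded processor. It illustrates the idea only and is not the Charm++ implementation; `greedy_assign` and the object names are hypothetical.

```python
import heapq


def greedy_assign(loads, n_procs):
    """Map each object to a processor, heaviest object first.

    loads: dict of object name -> measured load (e.g. seconds last step).
    Returns a dict of object name -> processor index.
    """
    heap = [(0.0, p) for p in range(n_procs)]    # (total load, processor)
    heapq.heapify(heap)
    placement = {}
    for obj in sorted(loads, key=loads.get, reverse=True):
        total, proc = heapq.heappop(heap)         # least-loaded processor
        placement[obj] = proc
        heapq.heappush(heap, (total + loads[obj], proc))
    return placement


loads = {"tp0": 5.0, "tp1": 4.0, "tp2": 3.0, "tp3": 3.0, "tp4": 1.0}
placement = greedy_assign(loads, 2)
per_proc = [sum(loads[o] for o, p in placement.items() if p == q)
            for q in range(2)]
```

Note that this sketch looks only at computational load. It ignores where an object currently lives and which objects exchange data, so it can move many objects and separate neighbors.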

SLIDE 22

Load balancing with OrbRefineLB

Zoom in 5M on 1,024 BlueGene/L processors

5.6s 5.0s

SLIDE 23

Scaling with load balancing

Number of Processors x Execution Time per Iteration (s)

SLIDE 24

Timestepping Challenges

  • 1/m particles need m times more force evaluations
  • Naively, simulation cost scales as N^(4/3) ln(N)
    – This is a problem when N ~ 1e9 or greater
  • If each particle has an individual timestep, scaling reduces to N (ln(N))^2
  • A difficult dynamic load balancing problem
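
The two scaling laws above can be compared numerically. The sketch below also counts force evaluations for a simple two-level case, reading the first bullet as "a fraction 1/m of the particles needs m substeps per large step"; that reading, and all function names, are assumptions made for illustration.

```python
import math


def global_step_cost(n):
    """Naive scaling from the slide: N^(4/3) ln(N)."""
    return n ** (4.0 / 3.0) * math.log(n)


def individual_step_cost(n):
    """Scaling with individual timesteps from the slide: N (ln N)^2."""
    return n * math.log(n) ** 2


def force_evals(n, m, individual):
    """Force evaluations per large step when n/m particles need m substeps."""
    if individual:
        # Slow particles are evaluated once, fast particles m times.
        return (n - n / m) * 1 + (n / m) * m
    # A single global timestep forces every particle onto the short step.
    return n * m


ratio = global_step_cost(1e9) / individual_step_cost(1e9)
evals_individual = force_evals(1000, 10, individual=True)
evals_global = force_evals(1000, 10, individual=False)
```

At N = 1e9 the ratio of the two scaling laws is N^(1/3) / ln(N), roughly 48. In the two-level example, individual timesteps need 1,900 force evaluations where a global timestep needs 10,000. The price is that only a small, changing subset of particles is active on most substeps, which is the load balancing problem named in the last bullet.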
SLIDE 25

Timestepping and Load Balancing

SLIDE 26

Cosmo Loadbalancer

  • Use Charm++ measurement based load balancer
  • Modification: provide LB database with information about timestepping.
    – “Large timestep”: balance based on previous large step
    – “Small step”: balance based on previous small step
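
The idea of keeping separate measurements per kind of step can be sketched as below. The class, its methods and the TreePiece loads are hypothetical and are not the Charm++ load balancing database interface; the example only shows why a placement tuned on large-step measurements can be badly unbalanced on small steps.

```python
class StepAwareLoadDB:
    """Measured loads, kept separately for each kind of step (hypothetical)."""

    def __init__(self):
        self.loads = {}     # step kind -> {object: measured load}

    def record(self, kind, obj, seconds):
        self.loads.setdefault(kind, {})[obj] = seconds

    def predict(self, kind):
        """Loads to balance on: those from the previous step of this kind."""
        return dict(self.loads.get(kind, {}))


def proc_loads(placement, loads, n_procs):
    """Total load per processor for a given object -> processor placement."""
    totals = [0.0] * n_procs
    for obj, proc in placement.items():
        totals[proc] += loads.get(obj, 0.0)
    return totals


db = StepAwareLoadDB()
# Large step: every TreePiece has active particles.
for obj in ("tp0", "tp1", "tp2", "tp3"):
    db.record("large", obj, 4.0)
# Small step: only tp0 and tp1 hold many particles on short timesteps.
for obj, seconds in (("tp0", 3.0), ("tp1", 3.0), ("tp2", 0.1), ("tp3", 0.1)):
    db.record("small", obj, seconds)

tuned_for_large = {"tp0": 0, "tp1": 0, "tp2": 1, "tp3": 1}
tuned_for_small = {"tp0": 0, "tp2": 0, "tp1": 1, "tp3": 1}

small = db.predict("small")
bad = proc_loads(tuned_for_large, small, 2)     # one processor does the work
good = proc_loads(tuned_for_small, small, 2)    # work is shared
```

In this toy case the placement tuned for small steps also balances the large step, but in general the two kinds of step need different placements.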

SLIDE 27

Results on 3 rung example

613s 429s 228s

SLIDE 28

Summary

  • Cosmological simulations pose challenges for parallel implementations
    – Non-local data dependencies
    – Hierarchical in space and time
  • ChaNGa has been successful in addressing these challenges using Charm++ features
    – Message priorities
    – New load balancers

SLIDE 29

Future

  • ChaNGa is currently in use for simulations with a high temporal dynamic range: galactic nuclei
  • New Physics
    – Smoothed particle hydrodynamics
  • Better gravity algorithms
    – Fast multipole method
    – New domain decomposition/load balancing strategies
  • Generic tree walk to enable new algorithms
SLIDE 30
SLIDE 31

Have we converged?

Weinberg & Katz (2007)

SLIDE 32

Computing Challenge Summary

  • The Universe is big => we will always be pushing for more resources
  • New algorithm efforts will be made to make efficient use of the resources we have
    – Efforts made to abstract away from machine details
    – Parallelization efforts need to depend on more automated processes.