The improvement of START Kenji Hasegawa (U. Tsukuba, CCS Kobe - - PowerPoint PPT Presentation



slide-1
SLIDE 1

The improvement of START

Kenji Hasegawa (U. Tsukuba, CCS Kobe branch)

Takashi Okamoto (U. Tsukuba, CCS Kobe branch)

Cosmological Radiative Transfer Comparison Project Workshop IV @ Austin, Texas, Dec 11-14, 2012

slide-2
SLIDE 2

Outline

  • Introduction
  • What is “START”
  • Previous Studies with START
  • Improvements
  • New Ray-tracing
  • Test & Scalability
  • Additional Process (Roles of Dust)
  • Preliminary Results
  • Summary

slide-3
SLIDE 3

What is “START”

SPH with Tree-based Accelerated Radiative Transfer (KH & Umemura 2010)

  • Hydrodynamics: SPH (Smoothed Particle Hydrodynamics) method.
  • Non-equilibrium chemistry: e-, H+, H, H-, H2, H2+, He, He+, and He2+.
  • Radiative Transfer: HI, HeI, and HeII ionizing photons, and H2 photodissociating photons.

SPH particles are directly used as grids for RT → Spatial resolution changes adaptively.

RT calculation is accelerated by a tree algorithm.

slide-4
SLIDE 4

What is “START”

1) Make an oct-tree structure for sources.
2) If a cell which contains sources is far enough away from an SPH particle, the cell is regarded as a virtual luminous source.

SPH with Tree-based Accelerated Radiative Transfer (KH & Umemura 2010)

slide-5
SLIDE 5

SPH with Tree-based Accelerated Radiative Transfer (KH & Umemura 2010)

What is “START”

1) Make an oct-tree structure for sources.
2) If a cell which contains sources is far enough away from an SPH particle, the cell is regarded as a virtual luminous source.

The calculation cost is proportional to log(Ns), not Ns.

A cell is regarded as far enough away when

l / d < θcrit

l: size of a cell
d: distance between an SPH particle and a cell

In the limit of θcrit = 0.0, the scheme corresponds to RSPH (Susa 2006)
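
The two steps and the criterion l / d < θcrit can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the START implementation: the names `Cell` and `effective_sources`, the one-source-per-leaf rule, the depth limit, and placing the virtual source at the luminosity-weighted centre of a cell are all hypothetical choices.

```python
import numpy as np

class Cell:
    """Cubic oct-tree cell holding point sources (hypothetical helper)."""
    def __init__(self, center, size, pos, lum, depth=0):
        self.size = size                      # l: edge length of the cell
        self.lum = lum.sum()                  # total luminosity inside the cell
        # Assumption: the virtual source sits at the luminosity-weighted centre.
        self.pos = (pos * lum[:, None]).sum(axis=0) / self.lum
        self.children = []
        if len(lum) > 1 and depth < 20:       # subdivide until one source per leaf
            octant = (pos >= center).astype(int)
            key = octant[:, 0] * 4 + octant[:, 1] * 2 + octant[:, 2]
            for k in np.unique(key):
                m = key == k
                bits = np.array([(k >> 2) & 1, (k >> 1) & 1, k & 1])
                child_center = center + (bits - 0.5) * size / 2
                self.children.append(
                    Cell(child_center, size / 2, pos[m], lum[m], depth + 1))

def effective_sources(cell, x, theta_crit):
    """Return [(position, luminosity)] seen from point x using l/d < theta_crit."""
    d = np.linalg.norm(cell.pos - x)
    if not cell.children or (d > 0 and cell.size / d < theta_crit):
        return [(cell.pos, cell.lum)]         # leaf, or far enough: one virtual source
    out = []
    for child in cell.children:
        out.extend(effective_sources(child, x, theta_crit))
    return out

rng = np.random.default_rng(0)
pos = rng.random((1000, 3))                   # 1000 sources in a unit box
lum = np.ones(1000)
root = Cell(np.full(3, 0.5), 1.0, pos, lum)
x = np.array([2.0, 2.0, 2.0])                 # a distant "SPH particle"
exact = effective_sources(root, x, 0.0)       # theta_crit = 0: every source is kept
grouped = effective_sources(root, x, 0.5)     # distant cells merged
```

With θcrit = 0 the walk never accepts a cell and returns every individual source, which mirrors the limit stated above; a larger θcrit trades accuracy for fewer ray-tracing targets.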

slide-6
SLIDE 6

Similar method for grid-based RT:

ARGOT: Accelerated Radiative Transfer on grids using oct-tree (Okamoto, Yoshikawa & Umemura 2012)

What is “START”

slide-7
SLIDE 7

Previous Work with START

KH, Umemura & Suwa (2010), Umemura, Susa, KH, Suwa, & Semelin (2012)

UV feedback on a secondary collapsing Pop III halo

  • RHD simulation including the transfer of diffuse recombination photons. ⇨ Nsource = NSPH = 2 million

First Star

The secondary core can survive!!

~70pc

slide-8
SLIDE 8

Previous Work with START

KH & Semelin (2012)

UV feedback on galaxies during the Epoch of Reionization

  • RHD simulation including internal UV (ionization and LW) feedback in each galaxy.

[Figure labels: z = 24, 9.5, 7.3, 6.0; 5 cMpc]

slide-9
SLIDE 9

Previous Work with START

KH & Semelin(2012)

[Figure panels: Cosmic SF history; Ionization history. Legend: high resolution run, low resolution run]

Mmin,halo = 2×10^7 Msun; Mmin,halo = 1.6×10^8 Msun

We found:

  • The formation of galaxies during the EoR is controlled by internal UV & SN feedback.
  • Ionization and cosmic SF histories are very sensitive to the mass resolution.

The box size is too small to show the cosmic reionization history...

A much larger number of particles is required.

slide-10
SLIDE 10

What we need are

★ A powerful supercomputer
★ An RHD code which enables us to perform massively parallel simulations

slide-11
SLIDE 11

K Computer

Top500 list Nov. 2012 http://www.top500.org

~82k nodes (650k cores) available
Peak performance ~ 10 PFlops

slide-12
SLIDE 12

Ray-Tracing: Old version.

[Figure labels: DISTANCE; Lv. 1, Lv. 2, Lv. 3]

Ray-tracing is performed from all sources at all levels.

slide-14
SLIDE 14

Ray-Tracing: Old version.

[Figure labels: DISTANCE; Lv. 1, Lv. 2, Lv. 3]

Ray-tracing is performed from all sources at all levels.

The time for MPI communications increases dramatically as Nnode increases.

slide-15
SLIDE 15

Ray-tracing: Improved version

Point: Reuse the information from the lower level.

[Figure labels: DISTANCE; Lv. 1, Lv. 2, Lv. 3]

slide-17
SLIDE 17

Ray-tracing: Improved version

Point: Reuse the information from the lower level.

[Figure labels: DISTANCE; Lv. 1, Lv. 2, Lv. 3]

Not only the MPI time but also the cost of the RT calculation can be reduced.

slide-18
SLIDE 18

TREE WALK

[Figure labels: Lv.1, Lv.2, Lv.3, Lv.4, Lv.5]

* In practice, an oct-tree is utilized.

slide-19
SLIDE 19

TREE WALK

[Figure labels: Lv.1, Lv.2, Lv.3, Lv.4, Lv.5]

* In practice, an oct-tree is utilized.

slide-21
SLIDE 21

TREE WALK

[Figure labels: Lv.1, Lv.2, Lv.3, Lv.4, Lv.5]

Parallelization via OpenMP

* In practice, an oct-tree is utilized.

slide-22
SLIDE 22

Parallelization: Between nodes

★ The size of each domain is adjusted every few steps so that all domains have equivalent calculation cost.

slide-23
SLIDE 23

Parallelization: Between nodes

★ The size of each domain is adjusted every few steps so that all domains have equivalent calculation cost.

★ Each domain asynchronously sends optical depths to downstream domains and receives them from upstream domains. (Same as RSPH by Susa 2006)

slide-25
SLIDE 25

Parallelization: Between nodes

★ The size of each domain is adjusted every few steps so that all domains have equivalent calculation cost. This makes the load balance better.

★ Each domain asynchronously sends optical depths to downstream domains and receives them from upstream domains. (Same as RSPH by Susa 2006)
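
One simple way to give every domain an equivalent calculation cost is to cut the cumulative per-particle cost into equal parts. The sketch below does this along a single axis; it is an illustration only, and the slides do not say how START actually places its domain boundaries. The function name `balanced_boundaries` and the uniform per-particle cost are assumptions.

```python
import numpy as np

def balanced_boundaries(x, cost, n_domains):
    """Split the x-axis so each domain carries roughly equal total cost."""
    order = np.argsort(x)
    csum = np.cumsum(cost[order])
    targets = csum[-1] * np.arange(1, n_domains) / n_domains
    cut = np.searchsorted(csum, targets)      # first index where csum reaches each target
    return x[order][cut]                      # interior boundary positions

rng = np.random.default_rng(1)
x = rng.random(10000) ** 2                    # particles clustered toward x = 0
cost = np.ones_like(x)                        # assumption: measured cost per particle
bounds = balanced_boundaries(x, cost, 4)
domain = np.searchsorted(bounds, x, side="right")
loads = np.bincount(domain, weights=cost, minlength=4)
```

In a real run the `cost` array would come from timings of the previous steps, so repeating the split every few steps lets the boundaries follow the evolving particle distribution.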

slide-26
SLIDE 26

Test of the new method

DATA: the distributions of the SPH and stellar particles at z = 7.0 obtained by a cosmological hydrodynamic simulation. NSPH = 128^3, Ns ~ 300

[Figure panels: Density, Temperature, Ionized fraction; Reference (by RSPH)]

slide-27
SLIDE 27

Test of the new method

DATA: the distributions of the SPH and stellar particles at z = 7.0 obtained by a cosmological hydrodynamic simulation. NSPH = 128^3, Ns ~ 300

[Figure labels: Density, Temperature, Ionized fraction; Reference (by RSPH); Temperature by New START; θcrit = 0.5, 0.7, 0.9; 10 Myr, 20 Myr, 30 Myr]

slide-28
SLIDE 28

Test of the new method

* If we employ an appropriate tolerance parameter, RT can be solved accurately.
* A similar method will be implemented in ARGOT by T. Okamoto.

[Figure labels: Density, Temperature, Ionized fraction; Reference (by RSPH); Temperature by New START; θcrit = 0.5, 0.7, 0.9; 10 Myr, 20 Myr, 30 Myr]

slide-29
SLIDE 29

START scalability: Hydro Part

Cosmological Hydrodynamics

N = 512^3 × 2: Test on K computer

Very good strong scaling up to 8k nodes (64k cores)
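
For reference, strong scaling is usually quantified by the speed-up and the parallel efficiency at fixed problem size. The snippet below is a small sketch of that bookkeeping; the timings are hypothetical placeholders, not measurements from this talk, and `strong_scaling` is an illustrative name.

```python
def strong_scaling(nodes, times):
    """Return (speedup, efficiency) lists relative to the smallest run."""
    n0, t0 = nodes[0], times[0]
    speedup = [t0 / t for t in times]
    # Ideal scaling gives speedup == n / n0, i.e. efficiency == 1.
    efficiency = [s * n0 / n for s, n in zip(speedup, nodes)]
    return speedup, efficiency

# Hypothetical timings (seconds per step), NOT measurements from the talk.
nodes = [1000, 2000, 4000, 8000]
times = [80.0, 41.0, 21.5, 11.5]
speedup, efficiency = strong_scaling(nodes, times)
```

An efficiency that stays close to 1 as the node count grows is what "good strong scaling" means in practice.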

slide-30
SLIDE 30

START scalability: RT Part

Comparison between the improved and old versions

XE6 (Cray) @ Kyoto, NSPH = 256^3, Nsource = 16k

  • Speed-up is a factor of 2.
  • Time for MPI does not increase with increasing Nnode.

slide-31
SLIDE 31

START scalability: RT Part

Dependence on the number of sources: comparison between the runs with Ns = 2k and Ns = 16k

  • Calculation time is insensitive to the number of sources.
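
A short arithmetic check shows why weak sensitivity to Ns is what a log(Ns) cost would predict. The snippet assumes that "2k" and "16k" mean 2,000 and 16,000 sources and that the cost scales purely as log(Ns); both are simplifications, and `relative_cost` is an illustrative name.

```python
import math

def relative_cost(ns_a, ns_b):
    """Cost ratio between two source counts if cost scales as log(Ns)."""
    return math.log(ns_b) / math.log(ns_a)

tree_ratio = relative_cost(2_000, 16_000)     # tree-based scaling, about 1.27
direct_ratio = 16_000 / 2_000                 # scaling if cost were linear in Ns
```

Under these assumptions an eightfold increase in the number of sources raises the cost by only about 27%, whereas a cost linear in Ns would grow eightfold.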

slide-32
SLIDE 32

START scalability: RT Part

Test with 512^3 SPH particles and 16k source particles

  • With 512^3 particles, the scheme still shows good scalability.
  • The scheme is expected to keep good scalability even if we increase the number of nodes.

slide-33
SLIDE 33

Additional Processes

  • Evolution of spectrum: (age, freq.) = (22, 60)

In the previous study (KH & Semelin 2012), we assumed a blackbody spectrum with 50,000 K for stellar sources. High energy photons were overproduced.

Population synthesis by PEGASE

[Figure label: M=Msun]

  • Metal Enrichment
  • Metal cooling
  • Roles of dust grains
  • Molecular formation
  • Absorption of photons
  • Radiation force

These affect STAR FORMATION and REIONIZATION.

slide-34
SLIDE 34

Role of Dust

Absorption

  • Even if H and He atoms are ionized, the dust opacity does not change.
  • The opacity is sensitive to the dust size in the frequency range above the Lyman limit.

Dust data from Draine & Lee (1984)

[Figure labels: H Lyman limit; 30 bins, 30 bins; Z = Zsun]

Mdust = 0.01 MH

Draine et al. (2007)

slide-35
SLIDE 35

Absorption by Dust

without dust

Z = 0.01 Zsun, dust size 0.1 micron

* Found in the Local Group
* Proposed by Nozawa+ (2007)

slide-36
SLIDE 36

Absorption by Dust

without dust

Z = 0.01 Zsun, dust size 0.01 micron

*Typical size of first grains, proposed by Todini & Ferrara (2001)
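
A rough estimate illustrates why the grain size matters at fixed dust mass. For single-size spherical grains the opacity per unit mass scales as 1/a, because the cross-section grows as a^2 while the grain mass grows as a^3. The sketch below is not the opacity model used in the talk: the absorption efficiency Q = 1 (a geometric-limit assumption that breaks down when the grain is small compared with the wavelength), the grain density of 3 g/cm^3, and the linear scaling of Mdust = 0.01 MH with metallicity are all assumptions, and the function name is hypothetical.

```python
import math

def dust_opacity_per_hydrogen_mass(a_cm, dust_to_hydrogen, q_abs=1.0, rho_grain=3.0):
    """Absorption opacity [cm^2 per g of hydrogen] for single-size spherical grains."""
    sigma = q_abs * math.pi * a_cm**2                     # cross-section per grain
    m_grain = 4.0 / 3.0 * math.pi * a_cm**3 * rho_grain   # mass per grain
    return dust_to_hydrogen * sigma / m_grain

MICRON = 1.0e-4  # cm
# Assumption: Mdust/MH = 0.01 at Zsun, scaled linearly to Z = 0.01 Zsun.
ratio = 0.01 * 0.01
kappa_large = dust_opacity_per_hydrogen_mass(0.1 * MICRON, ratio)
kappa_small = dust_opacity_per_hydrogen_mass(0.01 * MICRON, ratio)
```

Under these assumptions the 0.01 micron grains give ten times the opacity of the 0.1 micron grains for the same dust mass.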

slide-37
SLIDE 37

Summary

New method:

  • Good strong scaling (so far up to 2,000 nodes). Probably an NSPH = 1024^3 run is possible using 8k-16k nodes of the K computer (in 1-2 weeks?).
  • Accurate (with small tolerance parameters)

Simulations including Metal Enrichment:

  • The role of metals (especially dust) in the evolution of high-z galaxies and the IGM.
  • Compute SEDs, LF, escape fraction ... of high-z galaxies.