

SLIDE 1

Why is it worth spending 1.5 million CPU-hours on Relativistic Astrophysics?

Miguel Ángel Aloy Torás

Departamento de Astronomía y Astrofísica, Universidad de Valencia. Collaborators: Petar Mimica, Javier Vargas

SciComp09, Barcelona, 19-05-09


SLIDE 2

The topic in a nutshell

We have employed a numerical approach in order to understand the actual physics happening in relativistic astrophysical sources. Over the last 2 years we have used ~10^6 CPU-hours/year of computer time in RES facilities. In this period we have accomplished several tasks in the field of relativistic (magneto-)hydrodynamics.

Tasks:
1. Calculation of the early phases of propagation of ultrarelativistic shells of plasma. These calculations are addressed to model the so-called afterglow phase of gamma-ray bursts (GRBs), which are among the most energetic events in our Universe.
2. Study of the physical parameters of blazars by comparing synthetic-emission models with observations.
3. Influence of the cooling on the collimation properties of relativistic outflows.

Common link: Relativistic (Resistive) (Magneto-)Hydrodynamics: RHD, RMHD, RRMHD.

SLIDE 3

Special Relativistic Hydrodynamics (SRHD): Equations

\[ \partial_t D + \nabla\cdot(D\mathbf{v}) = 0 \quad \text{(mass conservation)} \]
\[ \partial_t \mathbf{S} + \nabla\cdot(\mathbf{S}\otimes\mathbf{v} + p\,\mathrm{I}) = 0 \quad \text{(momentum conservation)} \]
\[ \partial_t \tau + \nabla\cdot(\mathbf{S} - D\mathbf{v}) = 0 \quad \text{(energy conservation)} \]

The state vector and the flux vectors are:

\[ \mathbf{U} = (D, S^1, S^2, S^3, \tau), \qquad \mathbf{F}^i = \left(Dv^i,\; S^1 v^i + p\,\delta^{1i},\; S^2 v^i + p\,\delta^{2i},\; S^3 v^i + p\,\delta^{3i},\; S^i - Dv^i\right) \]

D = ρW: relativistic rest-mass density.
S = ρhW²v: relativistic momentum density.
τ = ρhW²c² − p − ρWc²: relativistic energy density.
v: flow velocity. W = 1/√(1 − v²/c²): Lorentz factor.
ρ: rest-mass density. h = 1 + ε/c² + p/(ρc²): specific enthalpy. ε: specific internal energy. p: pressure.

Relativistic effects: h ≫ 1 (ε ≫ c²), W ≫ 1 (v → c).

Equations of RHD: 5 equations, 5 unknowns + EOS p(ρ, ε).
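The map from primitives (ρ, v, p) to conserved variables (D, S, τ) above is explicit, but the inverse requires root-finding, which is one reason RHD is costlier per zone than Newtonian hydrodynamics. Below is a minimal Python sketch of both directions (an illustration only, assuming an ideal-gas EOS p = (Γ−1)ρε, c = 1 units and Γ = 4/3; variable names are ours, not MRGENESIS internals):

```python
import numpy as np

GAMMA = 4.0 / 3.0  # assumed ideal-gas adiabatic index (c = 1 units)

def prim_to_cons(rho, v, p):
    """Build conserved (D, S, tau) from primitives (rho, v, p)."""
    eps = p / ((GAMMA - 1.0) * rho)      # specific internal energy
    h = 1.0 + eps + p / rho              # specific enthalpy
    W = 1.0 / np.sqrt(1.0 - v * v)       # Lorentz factor
    D = rho * W
    S = rho * h * W**2 * v
    tau = rho * h * W**2 - p - D         # energy density minus rest mass
    return D, S, tau

def cons_to_prim(D, S, tau, p_guess=1.0, tol=1e-12):
    """Recover primitives by Newton iteration on the pressure."""
    p = p_guess
    for _ in range(50):
        v = S / (tau + D + p)            # since S = rho*h*W^2*v and tau+D+p = rho*h*W^2
        W = 1.0 / np.sqrt(1.0 - v * v)
        rho = D / W
        eps = (tau + D * (1.0 - W) + p * (1.0 - W**2)) / (D * W)
        f = (GAMMA - 1.0) * rho * eps - p          # EOS residual
        if abs(f) < tol * max(p, 1.0):
            break
        dp = 1e-8 * max(p, 1.0)                    # numerical derivative keeps it short
        v2 = S / (tau + D + p + dp)
        W2 = 1.0 / np.sqrt(1.0 - v2 * v2)
        eps2 = (tau + D * (1.0 - W2) + (p + dp) * (1.0 - W2**2)) / (D * W2)
        f2 = (GAMMA - 1.0) * (D / W2) * eps2 - (p + dp)
        p -= f * dp / (f2 - f)                     # Newton update
    return rho, v, p

# round-trip check
D, S, tau = prim_to_cons(rho=1.0, v=0.9, p=0.1)
print(cons_to_prim(D, S, tau))  # ~ (1.0, 0.9, 0.1)
```

This recovery step has to run in every zone at every iteration, which is part of why the CPU-hour counts quoted later grow so quickly with resolution.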


SLIDE 4

\[ \frac{\partial \mathbf{U}}{\partial t} + \frac{\partial \mathbf{F}^x(\mathbf{U})}{\partial x} + \frac{\partial \mathbf{F}^y(\mathbf{U})}{\partial y} + \frac{\partial \mathbf{F}^z(\mathbf{U})}{\partial z} = 0, \qquad \nabla\cdot\mathbf{B} = 0, \]

\[ \mathbf{U} = \begin{pmatrix} \rho u^0 \\ (\rho h + b^2)u^0 u^x - b^0 b^x \\ (\rho h + b^2)u^0 u^y - b^0 b^y \\ (\rho h + b^2)u^0 u^z - b^0 b^z \\ (\rho h + b^2)(u^0)^2 - \left(p + \tfrac{b^2}{2}\right) - (b^0)^2 \\ B^x \\ B^y \\ B^z \end{pmatrix}, \quad
\mathbf{F}^x(\mathbf{U}) = \begin{pmatrix} \rho u^x \\ (\rho h + b^2)u^x u^x + \left(p + \tfrac{b^2}{2}\right) - b^x b^x \\ (\rho h + b^2)u^x u^y - b^x b^y \\ (\rho h + b^2)u^x u^z - b^x b^z \\ (\rho h + b^2)u^0 u^x - b^0 b^x \\ 0 \\ B^y v^x - B^x v^y \\ B^z v^x - B^x v^z \end{pmatrix}, \]

\[ \mathbf{F}^y(\mathbf{U}) = \begin{pmatrix} \rho u^y \\ (\rho h + b^2)u^x u^y - b^x b^y \\ (\rho h + b^2)u^y u^y + \left(p + \tfrac{b^2}{2}\right) - b^y b^y \\ (\rho h + b^2)u^y u^z - b^y b^z \\ (\rho h + b^2)u^0 u^y - b^0 b^y \\ B^x v^y - B^y v^x \\ 0 \\ B^z v^y - B^y v^z \end{pmatrix}, \quad
\mathbf{F}^z(\mathbf{U}) = \begin{pmatrix} \rho u^z \\ (\rho h + b^2)u^x u^z - b^x b^z \\ (\rho h + b^2)u^y u^z - b^y b^z \\ (\rho h + b^2)u^z u^z + \left(p + \tfrac{b^2}{2}\right) - b^z b^z \\ (\rho h + b^2)u^0 u^z - b^0 b^z \\ B^x v^z - B^z v^x \\ B^y v^z - B^z v^y \\ 0 \end{pmatrix}. \]

Equations of RMHD: 8 equations, 8 unknowns + EOS p(ρ, ε).

RMHD requires 3 more variables than RHD and differs substantially from RHD at the numerical level. There is a mixture of volume-averaged variables (v, p, etc.) and surface-averaged variables (B). The volumetric variables follow the same numerical scheme as in RHD. The magnetic field components are evolved using constrained transport, which preserves ∇·B = 0 numerically.
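To see why constrained transport preserves ∇·B exactly, here is a minimal 2D Python sketch (an illustration under assumed grid sizes and an arbitrary corner EMF, not the MRGENESIS implementation): face-centred field components are updated from edge-centred EMFs, so each corner EMF enters the two adjacent cells with opposite signs and the discrete divergence change cancels to round-off.

```python
import numpy as np

nx, ny, dx, dy, dt = 64, 64, 1.0, 1.0, 0.1  # assumed grid parameters

# face-centred fields: Bx lives on x-faces, By on y-faces
Bx = np.random.rand(nx + 1, ny)
By = np.random.rand(nx, ny + 1)

def div_B(Bx, By):
    """Discrete divergence evaluated at cell centres."""
    return (Bx[1:, :] - Bx[:-1, :]) / dx + (By[:, 1:] - By[:, :-1]) / dy

def ct_update(Bx, By, Ez):
    """One constrained-transport step; Ez lives on cell corners, shape (nx+1, ny+1).

    In 2D with E = Ez * z_hat, Faraday's law gives
    dBx/dt = -dEz/dy and dBy/dt = +dEz/dx.
    """
    Bx_new = Bx - dt * (Ez[:, 1:] - Ez[:, :-1]) / dy
    By_new = By + dt * (Ez[1:, :] - Ez[:-1, :]) / dx
    return Bx_new, By_new

Ez = np.random.rand(nx + 1, ny + 1)   # any corner EMF works for the demo
d0 = div_B(Bx, By)
Bx, By = ct_update(Bx, By, Ez)
print(np.max(np.abs(div_B(Bx, By) - d0)))  # ~1e-16: div B changes only at round-off
```

The cancellation is exact by construction, independent of how the EMF is computed; a real scheme differs only in how Ez is built from the Riemann-solver fluxes.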

SLIDE 5

Equations of RRMHD

14 equations, 14 unknowns + EOS p(ρ, ε).

\[ \frac{\partial \mathbf{Q}(\mathbf{P})}{\partial t} + \frac{\partial \mathbf{F}^m(\mathbf{P})}{\partial x^m} = \mathbf{S}(\mathbf{P}), \]

where

\[ \mathbf{Q} = \begin{pmatrix} \psi \\ B^i \\ \phi \\ E^i \\ q \\ \rho\gamma \\ e \\ P^i \end{pmatrix}, \quad
\mathbf{P} = \begin{pmatrix} \psi \\ B^i \\ \phi \\ E^i \\ q \\ \rho \\ p \\ u^i \end{pmatrix}, \quad
\mathbf{S} = \begin{pmatrix} -\kappa\psi \\ 0^i \\ q - \kappa\phi \\ -J^i \\ 0 \\ 0 \\ 0 \\ 0^i \end{pmatrix} \]

are the vectors of conserved quantities, primitive quantities and sources, respectively, and

\[ \mathbf{F}^m = \begin{pmatrix} B^m \\ e^{imk} E_k + g^{im}\psi \\ E^m \\ -e^{imk} B_k + g^{im}\phi \\ J^m \\ \rho u^m \\ S^m \\ T^{im} \end{pmatrix}. \]

The source vector is split as S = Sa + Sb, with

\[ \mathbf{S}_a(\mathbf{P}) = \begin{pmatrix} 0 \\ 0^i \\ q \\ -q v^i \\ 0 \\ 0 \\ 0 \\ 0^i \end{pmatrix} \quad \text{and} \quad
\mathbf{S}_b(\mathbf{P}) = \begin{pmatrix} -\kappa\psi \\ 0^i \\ -\kappa\phi \\ -J_c^i \\ 0 \\ 0 \\ 0 \\ 0^i \end{pmatrix}, \]

where J_c = σγ[E + v×B − (E·v)v] is the conduction current. The source term S is potentially stiff (in the limit of high conductivity).

From Komissarov (2007)

RRMHD is treated in a way very similar to RHD (Komissarov 2007), e.g., no constrained transport is needed, since all variables are volumetric. Drawback: two extra scalar fields (ψ, φ) are required. Stiffness: semi-analytic integration of the source terms involving the resistivity and/or the scalar fields (see the sketch below).

[Figure: sketch of the computational domain (size Lx × 2L) threaded by a background field B0.]
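To illustrate why semi-analytic integration tames the stiff part, here is a schematic Python toy (an assumption-laden illustration, not the actual scheme): a source of the form dE/dt = −σ(E − E*) relaxes E toward a target value on the timescale 1/σ, so it can be integrated exactly over a hydro step instead of being resolved explicitly.

```python
import numpy as np

sigma = 1.0e6          # conductivity: huge => stiff relaxation term
dt = 1.0e-3            # hydro time step, >> 1/sigma
E_target = 0.0         # toy stand-in for the ideal-MHD field (-v x B in the real system)

def explicit_step(E, dt, nsub):
    """Naive explicit integration: stable only if nsub >~ sigma*dt substeps."""
    h = dt / nsub
    for _ in range(nsub):
        E = E + h * (-sigma * (E - E_target))
    return E

def exact_step(E, dt):
    """Semi-analytic (exponential) integration of dE/dt = -sigma*(E - E_target)."""
    return E_target + (E - E_target) * np.exp(-sigma * dt)

E0 = 1.0
print(exact_step(E0, dt))              # ~0: well-behaved for any dt
print(explicit_step(E0, dt, nsub=1))   # -999: explicit Euler blows up, |1 - sigma*dt| >> 1
```

In the real scheme the same idea is applied to the Sb terms within an operator-splitting step; the toy only shows why the exponential form removes the σ-dependent time-step restriction.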


SLIDE 6

MRGENESIS: a common framework for RHD, RMHD & RRMHD

MRGENESIS is a multidimensional (1D, 2D or 3D), parallel (MPI) code which allows one to compute problems where RHD or RMHD are relevant. In the newest development, not yet fully integrated with the RMHD branch, RRMHD is also included. It employs:

  • A finite-volume approach.
  • The method of lines: separate semi-discretization of space and time (see the sketch after this list).
  • Time advance: TVD Runge-Kutta methods of 2nd and 3rd order.
  • High-resolution shock-capturing schemes.
  • Inter-cell reconstruction: up to 3rd order using the PPM algorithm.
  • In RMHD: constrained transport.
  • In RRMHD: Munz's method (= Lagrange multipliers) to preserve ∇·B and charge conservation.
  • Several orthogonal coordinate systems (Cartesian, cylindrical, spherical).
  • SPEV for problems where the non-thermal emission of R(M)HD models is sought.
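A minimal sketch of the method-of-lines idea with a TVD (strong-stability-preserving) 3rd-order Runge-Kutta step, applied here to 1D linear advection with a first-order upwind flux for brevity; the flux, grid and CFL values are illustrative assumptions, not MRGENESIS internals.

```python
import numpy as np

nx, a, cfl = 200, 1.0, 0.8          # cells, advection speed, CFL number
dx = 1.0 / nx
dt = cfl * dx / a

def rhs(u):
    """Semi-discrete spatial operator L(u): first-order upwind for du/dt = -a du/dx."""
    return -a * (u - np.roll(u, 1)) / dx

def tvd_rk3(u, dt):
    """Shu-Osher TVD RK3: convex combinations of forward-Euler steps."""
    u1 = u + dt * rhs(u)
    u2 = 0.75 * u + 0.25 * (u1 + dt * rhs(u1))
    return u / 3.0 + (2.0 / 3.0) * (u2 + dt * rhs(u2))

x = np.linspace(0.0, 1.0, nx, endpoint=False)
u0 = np.exp(-100.0 * (x - 0.5) ** 2)         # smooth initial pulse
u = u0.copy()
for _ in range(int(round(1.0 / (a * dt)))):   # advect once around the periodic box
    u = tvd_rk3(u, dt)
print(np.max(np.abs(u - u0)))                 # residual = diffusion of the upwind flux
```

In MRGENESIS the same time integrator would wrap a high-resolution shock-capturing flux and PPM reconstruction instead of the first-order upwind flux used here.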


SLIDE 7

[Figure: normalized wall-clock time vs. number of threads (log-log axes) for: iMac Intel Core 2 Duo 2.8 GHz (Uni. Valencia); iMac Intel Core 2 Duo 2.8 GHz, ifort 11.0 (Uni. Valencia); Intel Xeon 5140 2.33 GHz (Uni. Valencia); Intel Xeon E6405 2.0 GHz (HYBRID cluster, Uni. Split); AMD Opteron 8350 2.9 GHz (Uni. Valencia).]

Parallel performance

Small and multicore systems


SLIDE 8

[Figure: same scaling plot as on Slide 7 (normalized wall-clock time vs. number of threads for the small and multicore systems), now annotated.]

Parallel performance

Small and multicore systems

Best performance: different threads on different processors. Worst performance: different threads on the same processor but on different cores.


SLIDE 9

[Figure: normalized wall-clock time vs. number of threads (log-log axes) for SiCortex SC072 (Megware Computer GmbH), Tirant (Uni. Valencia) and MareNostrum (BSC).]

Parallel performance

Large multiprocessor systems


SLIDE 10

[Figure: normalized wall-clock time vs. number of threads (log-log axes) for all systems: iMac Intel Core 2 Duo 2.8 GHz (Uni. Valencia); iMac Intel Core 2 Duo 2.8 GHz, ifort 11.0 (Uni. Valencia); Intel Xeon 5140 2.33 GHz (Uni. Valencia); SiCortex SC072 (Megware Computer GmbH); Intel Xeon E6405 2.0 GHz (HYBRID cluster, Uni. Split); Tirant (Uni. Valencia); MareNostrum (BSC); AMD Opteron 8350 2.9 GHz (Uni. Valencia).]

Parallel performance

All systems


SLIDE 11

The use of cutting-edge facilities such as MareNostrum (also CeSViMa and Tirant) has allowed us to simulate with unprecedented resolution the early phase of the afterglow associated with one of the most luminous classes of events in the Universe: GRBs. Because of the ultrarelativistic character of the ejecta flow, we have to use huge numerical resolution, i.e., enormous computational resources.

Early afterglow

[Figure: reverse-shock dissipation regimes in the (ξ, σ0) plane (log-log axes), with curves for Γrel = 1.04, 1.1 and 1.25 separating weak/no reverse-shock dissipation from substantial reverse-shock dissipation.]

Long-standing issue: do afterglows result from magnetized or unmagnetized ejecta sweeping up the interstellar medium (ISM)? Our simulations have quantified the approximate magnetization of the ejecta (σ0) that still allows for the production of a reverse shock in the ejecta; such a shock may accelerate particles, whose optical emission (the optical flash) is envisioned to be its signature.

\[ \sigma_0 := \frac{B^2}{4\pi\gamma_0\rho_0 c^2}, \qquad \xi = 0.73\, E_{53}^{1/6}\, n^{-1/6}\, \Delta_{12}^{-1/2}\, \gamma_{2.5}^{-4/3} \]

Mimica, Giannios, Aloy (2009)


SLIDE 12

Evolution of magnetized and non-magnetized afterglows

1 light-day = 2.6×10^15 cm = 174 AU. Earth surface magnetic field strength = 0.03 - 0.06 mT.



SLIDE 15
Early afterglow: technical details

  • Code: MRGENESIS.
  • Simulations: 1D RMHD (spherical coordinates).
  • Domain size: Rtot = 10^18 cm.
  • Number of models: 8.
  • Parameters: σ0 = 0, 1, 3 and ξ = 0.5, 1, which correspond to combinations of parameters such as γ0 = 10, 15, 20; Δ0 = 10^15 cm (thin shell), 10^16 cm (thick shell); E53 = 3.33, 33.3; n0 = 10 cm⁻³.

Our simulations allowed us to verify the existence of scaling laws, which enabled us to extrapolate the results to parameters in the right range for applications to GRB afterglows, namely γ0,AG ~ 100 and Δ0,AG ~ 3×10^12 cm.

[Figure: rest-mass density, σ and Lorentz factor profiles vs. r [10^17 cm] for the rescaling test.]

Fig. 5. Results of a test of the rescaling with γ0. Red, black and blue lines show the results for models with γ0 = 10, 15 and 20, respectively. Full, dashed and dot-dashed lines show the rest-mass density, σ and the fluid Lorentz factor, respectively. Models with γ0 = 10 and 20 have been rescaled using Eqs. (10)-(12). After rescaling, the profiles of all models almost overlap; only around the RS are there large discrepancies. The reason for them is the finite time resolution: we have to rescale our models using the closest available discrete time to the one requested by the transformation expressed in Eq. (11).

[Figure: relative Lorentz factor at the FS (upper panel) and RS (lower panel) vs. normalized observer time tobs, for γ0 = 10, 15, 20 and 100.]

Fig. 6. Similar to Fig. 3, but for the models used in the test of the rescaling hypothesis. Black, blue and red lines in the upper (lower) panel show the relative Lorentz factor of the fluid at the FS (RS) as a function of the normalized time of observation for models with γ0 = 10, 15 and 20, respectively. As expected, the relative Lorentz factor at the forward shock scales with γ0, while at the reverse shock they coincide. The dashed line in the upper panel shows the results of a model obtained by rescaling to γ0 = 100 the model with γ0 = 15.

Mimica, Giannios, Aloy (2009)


SLIDE 16

Early afterglow: technical details

Note on resolution: we need to resolve Δ0 with more than 10^4 zones for γ0 ~ 10. Since Rtot/Δ0 ~ 100 - 1000 ⇒ Ntot ~ 10^6 - 10^7 ⇒ Niter ~ 10^7 - 10^8 (CFL ~ 0.1). Currently we use ~ 500 - 1000 CPUs/job, which need ~ 200 h to finish. Since Δx ∝ 1/γ0, we would need ~ 2000 h on 500 - 1000 CPUs to run realistic models with γ0 ~ 100. But the physical problem, if a realistic parameter set were needed, demands (in 1D!!): Rtot/Δ0,AG ~ 3×10^5 ⇒ Ntot ~ 3×10^9 ⇒ Niter ~ 3×10^10 ⇒ 6×10^5 h on 500 CPUs ⇒ 3×10^4 h on 10000 CPUs ⇒ 3×10^5 h on 10000 CPUs (γ0 ~ 100).
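A tiny Python sketch reproducing the zone-count arithmetic above (the zones-per-shell and CFL values are the ones quoted on the slide; this is just a convenience calculator, not a performance model):

```python
def grid_cost(R_tot, delta0, zones_per_shell=1e4, cfl=0.1):
    """Zones and iterations for a uniform 1D grid resolving the shell width delta0."""
    n_tot = (R_tot / delta0) * zones_per_shell  # zones over the whole domain
    n_iter = n_tot / cfl                        # iterations for a signal to cross the grid
    return n_tot, n_iter

# current runs: R_tot = 1e18 cm, Delta_0 = 1e15 - 1e16 cm
print(grid_cost(1e18, 1e16))   # (1e6, 1e7)
print(grid_cost(1e18, 1e15))   # (1e7, 1e8)
# realistic afterglow parameters: Delta_0,AG ~ 3e12 cm
print(grid_cost(1e18, 3e12))   # (~3.3e9, ~3.3e10)
```

The last line reproduces the Ntot ~ 3×10^9 and Niter ~ 3×10^10 figures quoted above for the realistic parameter set.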

[Figure: log τ and Niter vs. log Δx.]

Fig. A.1. Time τ (upper panel) and the number of iterations Niter (lower panel) needed to resolve the Riemann problem in planar coordinates, as a function of the spatial discretization Δx.

[Figure: τ vs. Γ0.]

Fig. A.2. Time needed to resolve the Riemann problem in planar coordinates as a function of the initial Lorentz factor Γ0, for Δx = 1.5625 × 10⁻⁵.



SLIDE 19

Blazars and pc-scale jets

Blazars and pc-scale jets are observed because they emit huge amounts of non-thermal (synchrotron, IC, etc.) radiation [Kovalev et al. (2007)].

Connecting observations (= non-thermal radiation) with simulations of an R(M)HD (= thermal) flow is an unavoidable task. Physically, this means solving the Boltzmann equation to couple the space-time/momentum evolution of the non-thermal particles to the dynamical evolution of the thermal jet flow:

\[ p^b \frac{\partial f}{\partial x^b} - \Gamma^a_{bc}\, p^b p^c\, \frac{\partial f}{\partial p^a} = \left(\frac{\delta f}{\delta \tau}\right)_{\rm coll}. \]

Solving the Boltzmann equation is a 7D problem (t, x, p)!!! Assuming one needs 100 points per dimension ⇒ 10^14 points!!!

Simplifications: 2D spatial axial symmetry (r, z) + 1D time evolution (t) + 1D momentum evolution (p = p(E)). Total: a 4D problem (t, r, z, E). It CALLS for SUPERCOMPUTING!


SLIDE 20

Blazars and pc-scale jets

We have developed a new algorithm, SPEV, to deal with such a huge 4D problem. It has taken more than 2 years of development to include lepton transport fully coupled to R(M)HD. SPEV properties:

  1. MPI parallelization over sectors of the detection screen and over frequencies: a loosely coupled problem (see the sketch after this list).
  2. Linear scaling up to 1024 CPUs (limited by I/O efficiency and memory/CPU).
  3. Synchrotron emission and synchrotron self-absorption.
  4. Tabulated synchrotron integrals (combinations of Bessel functions) for piecewise power-law electron distribution functions [100 times faster than direct numerical integration].
  5. Complete inclusion of relativistic effects (aberration, time dilation, Doppler boosting).
  6. Lagrangian tracers (particles) are injected at desired positions and their whole evolution is computed.
  7. 4D ray tracing (snapshots of 2D spatial hydro models converted to 3D + time).
  8. I/O intensive (parallel HDF5): the R(M)HD part is output-intensive; the energy-evolution part is input-intensive.

[Figure: sketch of the detector screen (observer) and observer times tobs = t1, t2, t3 for emitting material moving at v ~ c towards the observer.]
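A schematic of the loosely coupled parallelization in item 1, using mpi4py for illustration (the sector/frequency counts are the slide's detector values; the function names and the round-robin assignment are assumptions of this sketch, not SPEV code):

```python
from mpi4py import MPI
import numpy as np

comm = MPI.COMM_WORLD
rank, size = comm.Get_rank(), comm.Get_size()

n_sectors, n_freqs = 270 * 18, 3          # detector cells and frequencies (slide values)
tasks = [(s, f) for s in range(n_sectors) for f in range(n_freqs)]

def render_sector(sector, freq):
    """Placeholder for the per-(sector, frequency) light-curve integration."""
    return np.zeros(128)                   # e.g. 128 observational time intervals

# round-robin task assignment: the tasks are independent, so no communication
# is needed during the computation itself
local = [render_sector(s, f) for i, (s, f) in enumerate(tasks) if i % size == rank]

# a single reduction at the end assembles the result on rank 0
partial = np.sum(local, axis=0) if local else np.zeros(128)
total = comm.reduce(partial, op=MPI.SUM, root=0)
if rank == 0:
    print("assembled light curves, shape:", total.shape)
```

Because each (sector, frequency) pair is independent, scaling is limited mainly by the final reduction and by reading the hydro snapshots, consistent with the I/O-bound behaviour noted in item 2.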



SLIDE 22

Radio imaging of a pc-scale jet

Simulation data. Computing time: 1 week with 512 CPUs. RHD: 1600 × 80 (grid) × 20,000 (time steps) ~ 0.5 TB, plus 2×10^5 particles. Detector: 270 × 18 (screen area) × 128 (obs. time intervals) × 3 (frequencies) ~ 2×10^6 elements.



SLIDE 24

Blazars and pc-scale jets

The parameter space in SPEV simulations spans variations of:
1. Magnetic field strength.
2. Jet-to-external-medium pressure ratio.
3. Initial proportionality between the energy and number densities of the thermal and non-thermal particle distributions.
4. Initial power-law index of the non-thermal particle energy distribution.
5. Hydrodynamic properties of the beam and of the perturbation.

All in all, the minimum dimension of the parameter space is 6! ⇒ Many models need to be run (see the sketch below)!

Supercomputing is needed because of: 1. the large number of models; 2. the large dimensionality of the problem; 3. huge I/O needs.
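The combinatorics is easy to see in a few lines of Python; every value below is a hypothetical placeholder, chosen only to show how fast even a coarse sampling of a 6-dimensional grid grows:

```python
from itertools import product

# hypothetical grids for the six parameters named above
grid = {
    "B_field":    [0.01, 0.03, 0.1],   # magnetic field strength (arbitrary units)
    "p_ratio":    [0.5, 1.0, 2.0],     # jet-to-external-medium pressure ratio
    "eps_nt":     [0.1, 0.3],          # thermal/non-thermal energy proportionality
    "n_nt":       [0.1, 0.3],          # thermal/non-thermal number proportionality
    "pl_index":   [2.0, 2.5],          # initial power-law index
    "beam_speed": [0.95, 0.99],        # stand-in for the beam hydrodynamics
}

models = list(product(*grid.values()))
print(len(models))   # 3*3*2*2*2*2 = 144 runs even for this coarse sampling
```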

SLIDE 25

The problem: the Doppler factor crisis

➡ The size of an emitting region is bounded by the light-crossing distance of the smallest variability timescale (see the sketch below).
➡ Example: the HST-1 knot, located at Rprojected ~ 60 pc from the core of M87:
➡ Radio/optical/X-ray variability: tvar,X-rays < 0.14 yr ⇒ Remiss,X-rays < D tvar,X-rays c ~ 0.044 D pc << Rprojected (D ~ 4 - 20 is the Doppler factor inferred from superluminal motions).
➡ TeV variability: it is correlated with the X-ray variability, but tvar,TeV << tvar,X-rays ⇒ Remiss,TeV << Remiss,X-rays, with Remiss,TeV < D tvar,TeV c ~ 0.002 D pc.
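A couple of lines reproduce the size bounds quoted above (only tvar,X-rays = 0.14 yr is given on the slide; the TeV timescale below is an assumption picked to match the 0.002 D pc figure):

```python
C_PC_PER_YR = 1.0 / 3.26          # light travels ~0.307 pc per year

def size_bound_pc(t_var_yr, doppler=1.0):
    """Causality bound on the emitting-region size: R < D * c * t_var."""
    return doppler * C_PC_PER_YR * t_var_yr

print(size_bound_pc(0.14))        # ~0.043 pc -> the 0.044*D pc X-ray bound
print(size_bound_pc(0.0065))      # ~0.002 pc -> an assumed ~2.4-day TeV timescale
```

Even with D ~ 20 the bound stays far below Rprojected ~ 60 pc, which is the crisis.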

Cooling & collimation of subpc-scale jets

It has been proposed (Bromberg & Levinson 2008; an analytic model) that cooling at subpc scales (the so-called blazar zone) may produce substantial collimation of a jet whose beam crosses a conical shock. We have tested this theoretical hypothesis with MRGENESIS (2D, RHD, spherical coordinates).


SLIDE 26

[Movie: side-by-side comparison of the No Cooling and Strong Cooling models.]



SLIDE 31

[Movie: Strong Cooling vs. No Cooling.]

Effects of strong cooling:
  • Slightly better collimation.
  • Recession of the recollimation shock.
  • The recollimation shock changes geometry from planar (more effective at decelerating the beam) to bi-conical.
  • Slower beam before the shock.

SLIDE 32

Cooling & collimation of subpc-scale jets

The need for supercomputing: huge resolution because of the stiffness of the cooling terms.

➡ Each model needs a minimum resolution of 20000 × 1000 (r × θ).
➡ 72 hours/run for stabilization.
➡ Around 400 CPUs/run at minimum resolution.
➡ In time-dependent models (to be run), a proper time resolution will need a spatial resolution of 40000 × 2000 ⇒ 4 times better spatial resolution ⇒ 8 times longer runs ~ 576 h/run with 400 CPUs ~ 230 h/run with 1000 CPUs.


SLIDE 33

Conclusions

The results of the work related to the tasks developed in the RES can be summed up with the following numbers:

➡ 6 publications in journals:
1. Mimica, P.; Aloy, M.-A.; Agudo, I.; Martí, J. M.; Gómez, J. L.; Miralles, J. A., Spectral Evolution of Superluminal Components in Parsec-Scale Jets, ApJ, 696, 1142 (2009)
2. Mimica, P.; Giannios, D.; Aloy, M. A., Deceleration of arbitrarily magnetized GRB ejecta: the complete evolution, A&A, 494, 879 (2009)
3. Mizuta, A.; Aloy, M.-A., Angular Energy Distribution of Collapsar-Jets, (2008), arXiv:0812.4813
4. Aloy, M. A.; Mimica, P., Observational Effects of Anomalous Boundary Layers in Relativistic Jets, ApJ, 681, 84 (2008)
5. Giannios, D.; Mimica, P.; Aloy, M. A., On the existence of a reverse shock in magnetized gamma-ray burst ejecta, A&A, 478, 747 (2008)
6. Mimica, P.; Aloy, M.-A.; Müller, E., Internal shocks in relativistic outflows: collisions of magnetized shells, A&A, 466, 93 (2007)

➡ 2 conference proceedings:
1. Mizuta, A.; Aloy, M.-A.; Müller, E., Energy Distribution of Relativistic GRB Jets, AIP Conf. Proc. vol. 1000, 467 (2008)
2. Mimica, P.; Giannios, D.; Aloy, M.-A., An RMHD study of the transition between prompt and afterglow GRB phases, (2008), arXiv:0801.1325

➡ Results presented at 4 different conferences (two invited talks).

★ All in all, it has paid off to use more than 1.5 million hours of computing time on MareNostrum.
★ We hope we will get at least a similar amount in the near future!