

SLIDE 1

Novel Bayesian approaches to supernova type Ia cosmology

Roberto Trotta - MCMSki 2014 07/01/14 - www.robertotrotta.com

Wednesday, 8 January 14

SLIDE 2

The cosmological concordance model

The ΛCDM cosmological concordance model is built on three pillars:

1. INFLATION: A burst of exponential expansion in the first ~10^-32 s after the Big Bang, probably powered by a yet unknown scalar field.
2. DARK MATTER: The growth of structure in the Universe and the observed gravitational effects require a massive, neutral, non-baryonic yet unknown particle making up ~25% of the energy density.
3. DARK ENERGY: The accelerated cosmic expansion (together with the flat Universe implied by the Cosmic Microwave Background) requires a smooth yet unknown field with negative equation of state, making up ~70% of the energy density.

The next 5 to 10 years are poised to bring major observational breakthroughs in each of those topics!

SLIDE 3

[Figure: cosmic timeline from the Big Bang to TODAY, marking the end of the visible cosmos, the radiation era, the dark matter era, the dark energy era, and SN Type Ia]

SLIDE 4

The end of the visible Universe

Data from the Planck satellite, 2013

SLIDE 5

10^12 bits, 50×10^6 pixels, 2500 harmonics, 6-parameter model

SLIDE 6

SLIDE 7

Low redshift cosmological probes

Supernovae type Ia (z < 1.5): Kessler et al (SDSS collaboration) (2010)

[Figure: Δ distance modulus of SNIa, compared with the ΛCDM and no-DE predictions]

Baryonic acoustic oscillations (z~0.35): Padmanabhan et al (2012), Mehta et al (2012)

[Figure: correlation function and acoustic scale distance, comparing the CMB (z~1100) and BAO (z~0.35) with ΛCDM]

SLIDE 8

Type Ia supernovae

  • Supernovae: core-collapse or thermonuclear explosions of stars, emitting a large amount of energy (~10^51 erg, cf. L_galaxy ~ 10^44 erg/s) as photons and neutrinos.
  • Supernovae type Ia (SNIa): characterized by the lack of H in their spectrum; the outcome of a CO white dwarf (WD) in a close binary system accreting mass above the Chandrasekhar limit (1.4 solar masses).
  • The nature of the donor star is still disputed: Single Degenerate (WD + main sequence, red giant or He star companion) vs Double Degenerate (WD + WD merger) scenarios (or both).

[Figures: SN1994D (High z SN Team/NASA/HST); single degenerate and double degenerate scenarios (NASA/CXC/M. Weiss)]

SLIDE 9

SNIa cosmology

$$ \mu = m_B - M = 5 \log_{10}\left(\frac{d_L(z, C)}{1\,\mathrm{Mpc}}\right) + 25 $$

  • µ: distance modulus
  • m_B: apparent rest-frame B-band magnitude, from measurements in B, V, I, J, ... bands
  • M: absolute magnitude. Unknown, but IF ~ constant unimportant (“standard candle”). Needs to be corrected via empirical correlations with other observables: M → M + linear corrections (“Phillips relations”)
  • d_L: luminosity distance
  • z: redshift, measured via the spectrum of the host galaxy
  • C: cosmological parameters. Quantities of interest: ΩM, ΩDE, w, w(z), H0 (degenerate with M)

Goal: From the measured multi-band light curves and redshift, infer constraints on the cosmological parameters. But: the devil is in the (statistical) detail!

Our solution: March, RT et al, MNRAS 418(4): 2308-2329, 2011, e-print archive: 1102.3237
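
The distance modulus relation µ = 5 log10(d_L / 1 Mpc) + 25 on this slide can be evaluated numerically once a cosmology is chosen. The sketch below is illustrative only: it assumes a flat universe with a constant dark energy equation of state w, and the function names and default values (H0 = 70 km/s/Mpc, ΩM = 0.3) are hypothetical choices, not taken from the talk.

```python
import numpy as np

C_KM_S = 299792.458  # speed of light in km/s


def luminosity_distance_mpc(z, h0=70.0, omega_m=0.3, w=-1.0, n_steps=2048):
    """Luminosity distance in Mpc.

    Assumptions (not from the slides): flat universe, constant w,
    radiation neglected.
    """
    zs = np.linspace(0.0, z, n_steps)
    omega_de = 1.0 - omega_m  # flatness
    e_of_z = np.sqrt(omega_m * (1.0 + zs) ** 3
                     + omega_de * (1.0 + zs) ** (3.0 * (1.0 + w)))
    integrand = 1.0 / e_of_z
    # Trapezoidal rule for the dimensionless comoving distance integral.
    comoving = np.sum(0.5 * (integrand[1:] + integrand[:-1]) * np.diff(zs))
    return (1.0 + z) * (C_KM_S / h0) * comoving


def distance_modulus(z, **cosmo):
    """mu = 5 log10(d_L / 1 Mpc) + 25."""
    return 5.0 * np.log10(luminosity_distance_mpc(z, **cosmo)) + 25.0
```

Comparing `distance_modulus(0.5)` with `distance_modulus(0.5, omega_m=1.0)` shows the effect that SNIa cosmology exploits: at fixed redshift, a universe with dark energy predicts a larger distance modulus (fainter supernovae) than one without.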

SLIDE 10

SNIa lightcurves

[Figure: SNLS light curve of SNLS-04D3gx, flux vs MJD (53100 to 53200) in the g, r, i, z bands. Guy et al (2007)]

[Figure: CfA3, 185 multi-band optical nearby SNIa. Hicken et al (2009)]

SLIDE 11

Brightness-width relationship

Brighter SNIa are slow decliners

[Figure: SNIa absolute magnitude (brighter to fainter, ~ factor of 3) vs LC decline rate for the low-z calibration sample, in the B, V and I bands; residual scatter ~ 0.2 mag. Phillips (1993)]

[Figure: peak magnitude (MB, B0−µ) vs decline rate Δm15(B), before and after dust correction. Mandel et al (2011)]

SLIDE 12

From lightcurves to distances

  • There are a few different lightcurve (LC) fitters on the market, with different philosophies/statistical approaches:
  • MLCS2k2 (Jha et al, 2007): color (AV) and LC shape (Δ) parameters fitted simultaneously with cosmology. Color correction includes a dust extinction law correction.
  • SALT/SiFTO/SALT2 (Guy et al, 2007): LC shape (x1) and colour (c) correction extracted from LC alongside apparent B-band magnitude (mB) + covariance matrix. The distance modulus is subsequently estimated with cosmological parameters and remaining “intrinsic” scatter.
  • BayeSN (Mandel et al, 2009, 2011): Fully Bayesian hierarchical modeling of LC, including population-level distributions (see later).

$$ \mu = m_B - M + \alpha \times \mathrm{width} - \beta \times \mathrm{colour} $$
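
As a small illustration of the standardization relation µ = mB − M + α × width − β × colour, the sketch below applies it to SALT2-style summary statistics. The function name and the numerical inputs in the usage note are hypothetical; only the formula itself comes from the slide.

```python
import numpy as np


def standardized_distance_modulus(m_b, x1, c, abs_mag, alpha, beta):
    """Distance modulus from mu = m_B - M + alpha * x1 - beta * c.

    x1 plays the role of the light-curve width (stretch) and c the colour.
    Inputs may be scalars or NumPy arrays of equal shape.
    """
    m_b, x1, c = np.asarray(m_b), np.asarray(x1), np.asarray(c)
    return m_b - abs_mag + alpha * x1 - beta * c
```

For example, with illustrative values mB = 24.0, x1 = 1.0, c = 0.1, M = −19.3, α = 0.12 and β = 2.7, the function returns 43.15. The signs encode the empirical trends: slower-declining (larger x1) and bluer (smaller c) supernovae are intrinsically brighter, so their inferred distance is pushed up.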

SLIDE 13

PS1 data

  • Most recent data set from Pan-STARRS1 survey
  • 146 spectroscopically confirmed SNIa
  • Cosmological fit: 112 PS1 at high-z (blue) + 201 low-z SNIa (red)

SLIDE 14

The importance of a principled approach

  • SNIa cosmology is now a mature field. Cosmological inferences (in particular, w(z)) are beginning to be dominated by systematic uncertainties:
  • Do SNIa properties evolve with z?
  • Are there multiple SNIa populations, with different characteristics?
  • Is dust extinction modeling adequate?
  • Can additional observables help in reducing “intrinsic” variability?
  • Can a data-based approach help in guiding theoretical understanding?
  • All of those questions are best addressed from a statistically principled standpoint: a complete (Bayesian) modeling including intrinsic variability, measurement errors, population-level distribution and observational effects can deliver superior insight.

SLIDE 15

Standard Chi2 fits of SALT2 output

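  • Standard analysis minimizes a χ² (typically, C minimized with α, β fixed, then α, β minimized with C fixed), with the likelihood arbitrarily defined as:

$$ -2 \log L = \chi^2 = \sum_i \frac{\left(\mu(z_i, C) - [\hat{m}_{B,i} - M + \alpha \hat{x}_{1,i} - \beta \hat{c}_i]\right)^2}{\sigma_{int}^2 + \sigma_{fit}^2} $$

  • Here \hat{m}_{B,i}, \hat{x}_{1,i}, \hat{c}_i are the observed values (SALT2 fits) and C, M, α, β are the parameters.
  • The fit variance is

$$ \sigma_{fit}^2 = \sigma_{m_B}^2 + \alpha^2 \sigma_{x_1}^2 + \beta^2 \sigma_c^2 + \mathrm{correlations} $$

  • σ²_int represents the “intrinsic” (residual) scatter, determined by requiring Chi2/dof ~ 1.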
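
To make the structure of the standard χ² concrete, here is a minimal sketch of how it could be evaluated. It is not the code used in any of the cited analyses. The function name and argument layout are hypothetical, and the "correlations" term is assumed to be the off-diagonal contribution of a per-supernova 3×3 covariance matrix over (mB, x1, c), so that σ²_fit = vᵀ Ĉ v with v = (1, α, −β).

```python
import numpy as np


def chi2_standard(mu_model, m_b_hat, x1_hat, c_hat, cov,
                  abs_mag, alpha, beta, sigma_int):
    """Standard chi-square of SALT2-style fit outputs.

    cov has shape (N, 3, 3), ordered as (m_B, x1, c) for each supernova
    (an assumed layout for this sketch).
    """
    v = np.array([1.0, alpha, -beta])
    # sigma_fit^2 = v^T C_i v: diagonal terms plus correlations.
    sigma_fit2 = np.einsum("i,nij,j->n", v, cov, v)
    mu_obs = m_b_hat - abs_mag + alpha * x1_hat - beta * c_hat
    return np.sum((mu_model - mu_obs) ** 2 / (sigma_int ** 2 + sigma_fit2))
```

Note that α and β enter both the residual and the denominator, so the quantity being minimized is not a χ² with a fixed variance.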

SLIDE 16

Problems of the standard analysis

  • Form of the likelihood function is unjustified
  • α, β appear in the variance, too: this is a problem of simultaneous estimation of the mean and of the variance. Chi2 not the correct distribution.
  • Incorrectly normalized: the term $-\frac{1}{2} \log\left(\sigma_{int}^2 + \sigma_{fit}^2\right)$ is missing in front. Adding this in results in a (known) 6-sigma bias of β.
  • Chi2/dof ~ 1 prescription prevents by construction model checking and hypothesis testing
  • Marginalization (and use of fast Bayesian MCMC methods) impossible (profile likelihood “fudge” necessary)
  • Principled Bayesian solution required!

$$ -2 \log L = \chi^2 = \sum_i \frac{\left(\mu(z_i, C) - [\hat{m}_{B,i} - M + \alpha \hat{x}_{1,i} - \beta \hat{c}_i]\right)^2}{\sigma_{int}^2 + \sigma_{fit}^2} $$

SLIDE 17

Bayesian hierarchical model

For each SNIa, this relation holds exactly between latent (unobserved) variables:

$$ \mu_i(z_i, C) = m_{B,i} - M_i + \alpha x_{1,i} - \beta c_i $$

Latent variables (INTRINSIC VARIABILITY), with population-level hyperparameters to be estimated from the data:

$$ M_i \sim N(M_0, \sigma_{int}^2), \qquad c_i \sim N(c_\star, R_c), \qquad x_{1,i} \sim N(x_\star, R_x) $$

Observed values (NOISE, SELECTION EFFECTS):

$$ [\hat{m}_{B,i}, \hat{c}_i, \hat{x}_{1,i}] \sim N([m_{B,i}, c_i, x_{1,i}], \hat{C}_i) $$

[Diagram labels: population hyper-parameters (with prior), parameters of interest (with prior), derived variable, observed values]
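
One way to read the hierarchical model is as a recipe for generating data: draw the latent variables from their population distributions, impose the exact latent relation, then add measurement noise. The sketch below follows that recipe under simplifying assumptions that are not in the slides: R_x and R_c are treated as standard deviations, the same observational covariance is used for every supernova, and selection effects are ignored. All names are hypothetical.

```python
import numpy as np


def simulate_snia(mu, alpha, beta, m0, sigma_int,
                  x_star, r_x, c_star, r_c, obs_cov, rng):
    """Draw latent and observed (m_B, c, x1) for distance moduli mu.

    Assumptions for this sketch: r_x, r_c are standard deviations,
    obs_cov (3x3, ordered m_B, c, x1) is shared by all supernovae,
    and there are no selection effects.
    """
    n = len(mu)
    abs_mag = rng.normal(m0, sigma_int, size=n)  # M_i ~ N(M0, sigma_int^2)
    x1 = rng.normal(x_star, r_x, size=n)         # latent stretch
    c = rng.normal(c_star, r_c, size=n)          # latent colour
    # Latent relation mu = m_B - M + alpha*x1 - beta*c, solved for m_B.
    m_b = mu + abs_mag - alpha * x1 + beta * c
    latent = np.column_stack([m_b, c, x1])
    noise = rng.multivariate_normal(np.zeros(3), obs_cov, size=n)
    return latent, abs_mag, latent + noise
```

The latent relation holds exactly for the first return value, while the observed values scatter around it through both the intrinsic term (M_i) and the measurement noise.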

SLIDE 18

Advantages of multi-layer model

  • The Bayesian hierarchical approach allows us to:
  • model explicitly the population-level intrinsic variability of SNIa
  • investigate the impact of multiple SNIa populations (e.g., different progenitor models)
  • determine/include correlations with other observables (galaxy mass, metallicity, age, spectral lines, etc) to reduce residual scatter in Hubble diagram
  • obtain a principled data likelihood that can be used with Bayesian MCMC/MultiNest (marginal posteriors, Bayesian evidence for model selection)
  • derive a fully marginalized posterior on the residual (after colour and stretch correction) intrinsic scatter in the SNIa intrinsic magnitude
  • investigate possible SNIa evolution (e.g., β(z)) and other systematics

SLIDE 19

At the heart of the method...

  • ... lies the fundamental problem of linear regression in the presence of measurement errors on both the dependent and independent variable and intrinsic scatter in the relationship (e.g., Gull 1989, Gelman et al 2004, Kelly 2007):

$$ \mu_i = m_{B,i} - M_i + \alpha x_{1,i} - \beta c_i \quad \text{is analogous to} \quad y_i = b + a x_i $$

POPULATION DISTRIBUTION

$$ x_i \sim p(x|\Psi) = N_{x_i}(x_\star, R_x) $$

INTRINSIC VARIABILITY

$$ y_i | x_i \sim N_{y_i}(b + a x_i, \sigma^2) $$

MEASUREMENT ERROR

$$ \hat{x}_i, \hat{y}_i | x_i, y_i \sim N_{\hat{x}_i, \hat{y}_i}([x_i, y_i], \Sigma^2) $$
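
The three-layer toy model (population distribution, intrinsic variability, measurement error) can be explored with a short simulation. The sketch below is an illustration under stated assumptions, not the method of March et al: measurement errors on x and y are taken to be independent, R_x is treated as a standard deviation, and every parameter except the slope a is held at its true value. In the Gaussian case the latent variables integrate out analytically, giving a bivariate normal for each observed pair, which is what `marginal_loglike` evaluates. All function names are hypothetical.

```python
import numpy as np


def simulate_xy(n, a, b, x_star, r_x, sigma, sigma_x, sigma_y, rng):
    """Simulate noisy (x_hat, y_hat) pairs from the three-layer model."""
    x = rng.normal(x_star, r_x, size=n)   # population distribution
    y = rng.normal(b + a * x, sigma)      # intrinsic variability
    x_hat = rng.normal(x, sigma_x)        # measurement error on x
    y_hat = rng.normal(y, sigma_y)        # measurement error on y
    return x_hat, y_hat


def marginal_loglike(a, x_hat, y_hat, b, x_star, r_x, sigma, sigma_x, sigma_y):
    """Log-likelihood of the slope with latent x_i, y_i integrated out."""
    mean = np.array([x_star, b + a * x_star])
    cov = np.array([
        [r_x**2 + sigma_x**2, a * r_x**2],
        [a * r_x**2, a**2 * r_x**2 + sigma**2 + sigma_y**2],
    ])
    resid = np.column_stack([x_hat, y_hat]) - mean
    quad = np.einsum("ni,ij,nj->n", resid, np.linalg.inv(cov), resid)
    _, logdet = np.linalg.slogdet(cov)
    return -0.5 * np.sum(quad + logdet + 2.0 * np.log(2.0 * np.pi))


def fit_slopes(x_hat, y_hat, grid, **known):
    """Return (ordinary least-squares slope, marginal-likelihood slope)."""
    ols = np.polyfit(x_hat, y_hat, 1)[0]
    loglikes = [marginal_loglike(a, x_hat, y_hat, **known) for a in grid]
    return ols, grid[int(np.argmax(loglikes))]
```

When the noise on x is comparable to the population spread, ordinary least squares on the observed values is biased towards a flatter slope, whereas the marginal likelihood that models the latent population recovers a value close to the input slope.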

SLIDE 20

[Figure (Kelly, 2007): latent y vs latent x (INTRINSIC VARIABILITY); observed y vs observed x (+ MEASUREMENT ERROR); PDF of the latent distribution and of observed x]

  • Modeling the latent distribution of the independent variable accounts for “Malmquist bias”
  • An observed x value far from the origin is more likely to arise from up-scattering (due to noise) of a lower latent x value than from down-scattering of a higher (less probable) x value

SLIDE 21

The key parameter is noise/population variance, σxσy/Rx:

  • σxσy/Rx small: Bayesian marginal posterior identical to profile likelihood
  • σxσy/Rx large: Bayesian marginal posterior broader but less biased than profile likelihood

[Figure: posterior and profile likelihood for the parameters of y_i = b + a x_i in the two regimes, with the true value marked. March, RT et al (2011)]

SLIDE 22

Tests on simulated SNIa data

  • Simulated N=288 SNIa with similar characteristics as SDSS+ESSENCE+SNLS+HST+Nearby sample
  • Reconstruction of cosmological parameters over 100 realizations, comparing Bayesian hierarchical method with standard Chi2.

[Figure: simulated SNIa realization (colour coded according to “survey”). March et al (2011)]

SLIDE 23

Posterior sampling

  • In the Bayesian hierarchical approach, we have
  • 3 cosmological parameters: H0, ΩM, ΩK (w = −1) or H0, ΩM, w (ΩK = 0)
  • 2 stretch/colour correction parameters: α, β
  • 6 population-level parameters: M0, σ²_int, x*, Rx, c*, Rc
  • 3N (= 864) latent variables Mi, x1,i, ci
  • Analytical marginalization over all latent variables and linear population-level parameters is possible in Gaussian case (no selection effects). Sampling of the remaining parameters via MultiNest.
  • Alternatively, Gibbs sampling can be used to sample over all parameters (conditional distributions are Gaussian in the absence of selection effects; including them introduces an additional accept/reject step).
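
To illustrate why Gibbs sampling is natural when all conditional distributions are Gaussian, the sketch below samples a deliberately reduced version of the problem: a single population mean x* with latent values x_i and noisy observations. It is not the sampler used for the SNIa analysis. The known noise level, known population spread (treated as a standard deviation) and flat prior on x* are assumptions of this toy example, and the function name is hypothetical.

```python
import numpy as np


def gibbs_population_mean(x_hat, sigma_x, r_x, n_iter, rng):
    """Gibbs sampler for x* in the model
        x_i ~ N(x*, r_x^2),  x_hat_i ~ N(x_i, sigma_x^2),
    with a flat prior on x* (toy assumptions). Returns the chain of x*.
    """
    n = len(x_hat)
    x_star = np.mean(x_hat)  # starting point
    cond_var = 1.0 / (1.0 / sigma_x**2 + 1.0 / r_x**2)
    chain = np.empty(n_iter)
    for t in range(n_iter):
        # Latent x_i given x* and the data: Gaussian, precision-weighted mean.
        cond_mean = cond_var * (x_hat / sigma_x**2 + x_star / r_x**2)
        x = rng.normal(cond_mean, np.sqrt(cond_var))
        # x* given the latent x_i: Gaussian around their average.
        x_star = rng.normal(np.mean(x), r_x / np.sqrt(n))
        chain[t] = x_star
    return chain
```

In this toy case the marginal posterior of x* is known analytically, a Gaussian centred on the mean of the observations with variance (R_x² + σ_x²)/N, which gives a direct check on the chain.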

SLIDE 24

The Nested Sampling algorithm

[Figure: mapping from the parameter space (θ1, θ2) to the likelihood L(x) as a function of the enclosed prior volume x. Feroz et al (2008), arxiv: 0807.4512; Trotta et al (2008), arxiv: 0809.3792]

  • Skilling (2006) introduced Nested Sampling as an algorithm originally aimed at the efficient computation of the model likelihood.
  • The idea is to map a multi-dimensional integral onto a 1D integral which is easy to compute numerically.
  • The method requires sampling uniformly from the fraction of the prior volume X(µ) above the iso-likelihood level µ:

$$ X(\mu) = \int_{L(\theta) > \mu} P(\theta)\, d\theta $$

$$ P(d) = \int d\theta\, L(\theta) P(\theta) = \int_0^1 X(\mu)\, d\mu $$

(the limits of the last integral assume a likelihood scaled so that µ runs from 0 to 1)
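
The nested sampling idea can be demonstrated on a one-dimensional toy problem where the constrained prior can be sampled exactly. The sketch below is a bare-bones illustration, not MultiNest: it assumes a unit Gaussian likelihood centred on zero and a uniform prior on [−half_width, half_width], so the region above any iso-likelihood level is simply an interval. It accumulates the evidence as a sum of L_i ΔX_i over shrinking prior volumes, using the standard deterministic estimate X_i = exp(−i / n_live). All names are hypothetical.

```python
import numpy as np


def log_like(theta):
    """Unit Gaussian likelihood centred on zero (toy choice)."""
    return -0.5 * theta**2 - 0.5 * np.log(2.0 * np.pi)


def nested_sampling_log_evidence(n_live, n_iter, half_width, rng):
    """Estimate log P(d) for a uniform prior on [-half_width, half_width]."""
    live = rng.uniform(-half_width, half_width, size=n_live)
    live_logl = log_like(live)
    evidence = 0.0
    x_prev = 1.0  # remaining prior volume fraction
    for i in range(1, n_iter + 1):
        worst = int(np.argmin(live_logl))
        x_i = np.exp(-i / n_live)  # expected shrinkage of the prior volume
        evidence += np.exp(live_logl[worst]) * (x_prev - x_i)
        x_prev = x_i
        # Replace the worst point by a prior draw with higher likelihood.
        # Here L > L_worst is exactly the interval |theta| < |theta_worst|.
        radius = abs(live[worst])
        live[worst] = rng.uniform(-radius, radius)
        live_logl[worst] = log_like(live[worst])
    # Contribution of the remaining live points.
    evidence += np.mean(np.exp(live_logl)) * x_prev
    return np.log(evidence)
```

For half_width = 5 the exact evidence is very close to 1/10, so the estimate should be near log(0.1) ≈ −2.30. In realistic multi-dimensional problems the hard part is the replacement step, which is where ellipsoidal approximations come in.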

SLIDE 25

The MultiNest ellipsoidal sampling

  • The MultiNest algorithm (Feroz & Hobson, 2007, 2008) uses a multi-dimensional ellipsoidal decomposition of the remaining set of “live points” to approximate the prior volume above the target iso-likelihood contour.

[Figure: target iso-likelihood contours, ellipsoidal approximation and multi-modal decomposition for decreasing prior fraction X, shown for a multimodal likelihood and a highly degenerate likelihood]

SLIDE 26

Marginal posterior (simulated data)

[Figure: two panels, w = −1 and ΩK = 0, each with the true value marked. Red/empty: Chi2 (68%, 95% CL). Blue/filled: Bayesian (68%, 95% credible regions). March et al (2011)]

Bayesian posterior is noticeably different from the Chi2 CL: which one is “best”?

SLIDE 27

Coverage, bias and mean squared error

  • Coverage of Bayesian 1D marginal posterior CR and of 1D Chi2 profile likelihood CI computed from 100 realizations
  • Bias and mean squared error (MSE) are defined in terms of the estimator θ̂, which is the posterior mean (Bayesian) or the maximum likelihood value (Chi2).

[Figure: coverage. Red: Chi2, Blue: Bayesian. March et al (2011)]

Results:

  • Coverage: generally improved (but still some undercoverage observed)
  • Bias: reduced by a factor ~ 2-3 for most parameters
  • MSE: reduced by a factor 1.5-3.0 for all parameters
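
The comparison over repeated realizations can be summarized with a few lines of code. The formulas on the slide did not survive extraction, so the sketch below uses the usual textbook definitions as an assumption: coverage is the fraction of intervals containing the true value, bias is the mean of θ̂ − θ_true, and MSE is the mean of (θ̂ − θ_true)². The function name is hypothetical.

```python
import numpy as np


def coverage_bias_mse(estimates, lower, upper, truth):
    """Summaries over realizations of a single parameter.

    estimates: point estimates (posterior mean or maximum likelihood value)
    lower, upper: interval bounds (credible or confidence interval)
    truth: the true parameter value used in the simulations
    Textbook definitions are assumed for bias and MSE.
    """
    estimates = np.asarray(estimates, dtype=float)
    lower = np.asarray(lower, dtype=float)
    upper = np.asarray(upper, dtype=float)
    covered = (lower <= truth) & (truth <= upper)
    bias = np.mean(estimates - truth)
    mse = np.mean((estimates - truth) ** 2)
    return float(np.mean(covered)), float(bias), float(mse)
```

With a nominal 68% interval, a coverage well below 0.68 over many realizations signals undercoverage, meaning the quoted errors are too small.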

SLIDE 28

Cosmology results

Combined sample: 288 SNIa, Kessler et al (SDSS collaboration) (2010)

[Figure: marginal posteriors for w = −1. Red: Chi2, Blue: Bayesian. March et al (2011)]

α = 0.12 ± 0.02

β = 2.7 ± 0.1

σ = 0.13 ± 0.01 mag

SLIDE 29

Combined constraints

  • Combined cosmological constraints on matter and dark energy content:

[Figure: two panels, w = −1 and ΩK = 0, each showing the CMB, BAO, SNIa and combined constraints. March, RT et al (2011)]

SLIDE 30

The BayeSN approach

  • Developed by K. Mandel (Mandel et al, 2009, 2011) and collaborators: fully Bayesian approach to LC fitting, including random errors, population structure, intrinsic variations/correlations, dust extinction and reddening, incomplete data

[Diagram: hierarchical model for SN 1...N, with nodes: priors, dust population parameters, LC population parameters, dust (Av, Rv), absolute LC, distance modulus, redshift, apparent LC, observed LC]

SLIDE 31

Some results from BayeSN

  • Dust absorption for each SNIa
  • Population level analysis of correlations
  • Inclusion of NIR LC
  • Hubble diagram: residual scatter reduced by ~2 using optical+NIR LC

Mandel et al (2011)

SLIDE 32

The complete hierarchical model

[Diagram: the complete hierarchical model, applied to a local calibration sample and to a cosmological sample (each with i = 1, ..., M).

  • Population parameters: Ψdust, Ψenv, ΨSN; standardization parameters νt, α, Υ; cosmological parameters C; survey parameters E, C
  • Latent variables: dust absorption (Ψdust,i), absolute light curves M_ibt, apparent light curves m_ibt, environment correlates c_i, distance modulus µ, redshift z_i
  • Data: observed apparent light curves \hat{m}_ibt (nearby and distant), redshift data, light curve summary statistics, optical spectra, near-infrared light curves, SN environmental data]

Red arrows/boxes indicate elements/data that have never been explored before in such a multi-level setting

SLIDE 33

Summary

  • Current and future SNIa surveys are becoming “systematics” limited: better modeling is required to use them as powerful and reliable probes of dark energy
  • Bayesian multi-level models can capture the different layers and sources of uncertainties in SNIa
  • Intrinsic population-level variability can be studied and constrained
  • Full propagation of uncertainties to the level of cosmological parameters becomes possible with a consistent, principled Bayesian approach
  • The Bayesian hierarchical model of March et al outperforms the standard Chi2 approach 2/3 of the time
  • The BayeSN approach of Mandel et al offers a fully Bayesian modeling of SNIa LC

SLIDE 34

Conclusions and future work

  • Extension of our method to include survey selection effects is ongoing
  • Inclusion of multiple SNIa populations, possible redshift-dependence of SNIa properties, and correlation with other observables (galaxy mass, metallicity, spectral lines, etc) is straightforward
  • Bayesian model comparison (LCDM vs modified gravity)

[Figure: reconstruction of β(z) = β0 + β1 z, with the true values marked]

  • Powerful Bayesian methods can take SNIa cosmological inference to a next quantitative step: reduced systematics thanks to better modeling
  • Required to deal with future large (~ 3000) samples

SLIDE 35

Thanks! www.robertotrotta.com astro.ic.ac.uk/icic
