Choosing the Summary Statistics and the Acceptance Rate in Approximate Bayesian Computation (ABC)


SLIDE 1

Introduction Standard algorithms Potential pitfalls with ABC Local Bayesian linear regression Conclusion

Choosing the Summary Statistics and the Acceptance Rate in Approximate Bayesian Computation (ABC)

Michael G.B. Blum

Laboratoire TIMC-IMAG, CNRS, Grenoble

COMPSTAT 2010; Thursday, August 26

SLIDE 2

A typical application of ABC in population genetics
Estimating the time T since the out-of-Africa migration

[Figure: (a) model of human origins: a single-origin population of ancestral size NA, with a recent out-of-Africa migration at time T in the past splitting African and non-African populations; (b) data.]

SLIDE 3

Flowchart of ABC

[Flowchart: simulations draw different values of the parameter T and produce simulated DNA sequences; ABC compares the simulated with the observed DNA sequences to find the most probable values for T.]

SLIDE 4

Rejection algorithm for targeting p(φ|S)

1. Generate a parameter φ according to the prior distribution π;
2. Simulate data D′ according to the model p(D′ | φ);
3. Compute the summary statistic S′ from D′ and accept the simulation if d(S, S′) < δ.

Potential problem: the curse of dimensionality limits the number of statistics that rejection-ABC can handle.
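As a sketch, the three steps above might look like this in Python; the Gaussian toy model, the prior, and the tolerance δ are illustrative assumptions, not part of the slides.

```python
import numpy as np

rng = np.random.default_rng(0)

def rejection_abc(s_obs, prior_sample, simulate, summarize, delta, n_sims=10_000):
    """Rejection ABC: keep draws whose simulated summary lands within delta of s_obs."""
    accepted = []
    for _ in range(n_sims):
        phi = prior_sample()        # 1. draw phi from the prior
        data = simulate(phi)        # 2. simulate data from p(D' | phi)
        s_sim = summarize(data)     # 3. summarize and compare with the observed statistic
        if abs(s_sim - s_obs) < delta:
            accepted.append(phi)
    return np.array(accepted)

# Illustrative toy model: phi is a normal mean, the summary is the sample mean.
post = rejection_abc(
    s_obs=1.0,
    prior_sample=lambda: rng.normal(0.0, 2.0),
    simulate=lambda phi: rng.normal(phi, 1.0, size=20),
    summarize=np.mean,
    delta=0.1,
)
```

The accepted draws approximate the posterior p(φ | S); shrinking δ sharpens the approximation at the cost of fewer acceptances, which is exactly where the curse of dimensionality bites when S has many components.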

SLIDE 5

Regression-adjustment for ABC

Beaumont, Zhang and Balding; Genetics 2002

Local linear regression: φi | Si = m(Si) + εi, with a linear function for m.

Adjustment: φi* = m̂(S) + ε̃i, where m̂ is found with weighted least-squares.

SLIDE 6

Regression-adjustment for ABC

Weighted least-squares: minimize

  ∑_{i=1}^{n} {φi − (β0 + (Si − S)^T β1)}² Wi,  where Wi ∝ K(‖S − Si‖/δ).

Adjustment:

  φi* = β̂0^LS + ε̃i = φi − (Si − S)^T β̂1^LS.
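The weighted fit and adjustment above can be sketched as follows; the simulated linear toy model, the Epanechnikov kernel, and the 10% acceptance rate are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(1)

# Simulated (phi_i, S_i) pairs from an assumed linear-plus-noise toy model.
n = 2000
phi = rng.normal(0.0, 1.0, n)
S = phi + rng.normal(0.0, 0.3, n)   # one-dimensional summary statistic
s_obs = 0.5                          # observed summary statistic

# Epanechnikov kernel weights W_i ∝ K(|s_obs - S_i| / delta).
delta = np.quantile(np.abs(S - s_obs), 0.1)   # bandwidth from a 10% acceptance rate
u = np.abs(S - s_obs) / delta
W = np.where(u < 1, 1 - u**2, 0.0)

# Weighted least squares of phi_i on (S_i - s_obs): intercept beta0, slope beta1.
X = np.column_stack([np.ones(n), S - s_obs])
XtW = X.T * W
beta = np.linalg.solve(XtW @ X, XtW @ phi)    # (beta0_hat, beta1_hat)

# Adjustment: phi*_i = phi_i - (S_i - s_obs) * beta1_hat, on the accepted draws.
keep = W > 0
phi_star = phi[keep] - (S[keep] - s_obs) * beta[1]
```

The adjusted sample phi_star approximates draws from p(φ | S = s_obs): the regression removes the local linear trend, so accepted draws far from s_obs are pulled toward where they would have landed had their statistic matched the observation.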

SLIDE 7

Regression-adjustment for ABC

[Figure: the regression adjustment shifts the accepted parameter values φi to adjusted values φi*.]

Csilléry, Blum, Gaggiotti and François; TREE 2010

SLIDE 8

Asymptotic theorem for ABC

Blum; JASA 2010

1. If there is a local homoscedastic relationship between φ and S, the bias with regression adjustment is smaller than the bias with rejection only.
2. But the rate of convergence of the MSE is Θ(n^{−4/(d+5)}), where d is the dimension of the summary statistics and n is the number of simulations.

SLIDE 9

A Gaussian example to illustrate potential pitfalls with ABC

Toy example 1: estimation of σ²
  σ² ∼ Inv-χ²(d.f. = 1), µ ∼ N(0, σ²), sample size N = 50
  Summary statistics: (S1, …, S5) = (x̄N, s²N, u1, u2, u3), with uj ∼ N(0, 1), j = 1, 2, 3
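The toy model can be simulated as below; this is a sketch, not the author's code, and it uses the standard identity that 1/X has an Inv-χ²(1) distribution when X ∼ χ²(1).

```python
import numpy as np

rng = np.random.default_rng(2)

def simulate_toy1(n_sims, N=50):
    """Toy example 1: sigma^2 ~ Inv-chi^2(1), mu ~ N(0, sigma^2),
    data x_1..x_N ~ N(mu, sigma^2); summaries are the sample mean,
    the sample variance, and three statistics unrelated to the data."""
    sigma2 = 1.0 / rng.chisquare(df=1, size=n_sims)   # Inv-chi^2(1) via 1 / chi^2(1)
    mu = rng.normal(0.0, np.sqrt(sigma2))
    x = rng.normal(mu[:, None], np.sqrt(sigma2)[:, None], size=(n_sims, N))
    S = np.column_stack([
        x.mean(axis=1),                    # S1 = sample mean
        x.var(axis=1, ddof=1),             # S2 = sample variance, informative for sigma^2
        rng.normal(0.0, 1.0, (n_sims, 3))  # S3..S5 = u1, u2, u3 ~ N(0, 1), pure noise
    ])
    return sigma2, S

sigma2, S = simulate_toy1(1000)
```

Only s²N carries information about σ²; u1, u2, u3 exist to show what happens when uninformative statistics are added to the comparison.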

SLIDE 10

A Gaussian example to illustrate potential pitfalls with ABC

[Figure: two panels, "1 summary statistic" and "5 summary statistics", plotting the empirical variance s²N against σ² (0.1 to 100.0), with accepted and rejected simulations marked.]

SLIDE 11

Local Bayesian linear regression

Hjort; Book chapter 2003

Prior for the regression coefficients: β ∼ N(0, α⁻¹ I_{p+1}).

The maximum a posteriori estimate minimizes the regularized weighted least-squares criterion

  E(β) = (1/(2τ²)) ∑_{i=1}^{n} (φi − (Si − S)^T β)² Wi + (α/2) β^T β.

SLIDE 12

Local Bayesian linear regression

Posterior distribution of the regression coefficients:

  β ∼ N(βMAP, V),  βMAP = τ⁻² V X^T Wδ φ,  V⁻¹ = α I_{p+1} + τ⁻² X^T Wδ X.

Regression-adjustment for ABC:

  φi* = φi − (Si − S)^T β̂1,MAP.
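βMAP and the corresponding adjustment are available in closed form; in the sketch below the simulated data, the kernel, and the hyperparameter values α and τ² are illustrative assumptions rather than the values used on the slides.

```python
import numpy as np

rng = np.random.default_rng(3)

# Simulated one-dimensional (phi_i, S_i) pairs from an assumed toy model.
n = 2000
phi = rng.normal(0.0, 1.0, n)
S = phi + rng.normal(0.0, 0.3, n)
s_obs = 0.5

# Epanechnikov kernel weights and design matrix X = [1, S_i - s_obs].
delta = np.quantile(np.abs(S - s_obs), 0.1)
u = np.abs(S - s_obs) / delta
W = np.diag(np.where(u < 1, 1 - u**2, 0.0))
X = np.column_stack([np.ones(n), S - s_obs])

# beta_MAP = tau^-2 V X^T W phi, with V^-1 = alpha I + tau^-2 X^T W X.
alpha, tau2 = 1.0, 0.1                         # illustrative hyperparameter values
V_inv = alpha * np.eye(2) + X.T @ W @ X / tau2
beta_map = np.linalg.solve(V_inv, X.T @ W @ phi) / tau2

# Adjustment phi*_i = phi_i - (S_i - s_obs) * beta1_MAP on the accepted draws.
keep = np.diag(W) > 0
phi_star = phi[keep] - (S[keep] - s_obs) * beta_map[1]
```

Compared with the ordinary weighted least-squares fit, the prior term α I_{p+1} in V⁻¹ shrinks the slope toward zero, which is what protects the adjustment when a summary statistic is uninformative.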

SLIDE 13

The evidence function as an omnibus criterion

Empirical Bayes / evidence approximation:

  p(φ | τ², α, pδ) = ∫ ∏_{i=1}^{n} p(φi | β, τ²)^{Wi} p(β | α) dβ,

where α is the precision hyperparameter, τ² is the variance of the residuals, and pδ is the percentage of accepted simulations.

Maximizing the evidence for
1. choosing pδ;
2. choosing the set of summary statistics.

SLIDE 14

The evidence function as an omnibus criterion

A closed-form formula:

  log p(φ | τ², α, pδ) = ((p+1)/2) log α − (NW/2) log τ² − E(βMAP) − (1/2) log |V⁻¹| − (NW/2) log 2π,

where NW = ∑_i Wi.
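The closed-form log evidence can be evaluated directly from βMAP, V⁻¹, and the weights; the toy comparison below, where an informative summary statistic is scored against a pure-noise one, is an illustrative assumption rather than an experiment from the slides.

```python
import numpy as np

def log_evidence(phi, X, W, alpha, tau2):
    """Closed-form log p(phi | tau^2, alpha, p_delta) for the local Bayesian linear model."""
    p1 = X.shape[1]                                     # p + 1 coefficients
    NW = W.sum()                                        # N_W = sum of the weights
    V_inv = alpha * np.eye(p1) + (X.T * W) @ X / tau2
    beta_map = np.linalg.solve(V_inv, (X.T * W) @ phi) / tau2
    resid = phi - X @ beta_map
    E = (resid**2 * W).sum() / (2 * tau2) + alpha * (beta_map @ beta_map) / 2
    _, logdet = np.linalg.slogdet(V_inv)                # log |V^-1|
    return (p1 / 2 * np.log(alpha) - NW / 2 * np.log(tau2)
            - E - logdet / 2 - NW / 2 * np.log(2 * np.pi))

# Illustrative check: the evidence should prefer an informative statistic.
rng = np.random.default_rng(4)
n = 1000
phi = rng.normal(0.0, 1.0, n)
S_good = phi + rng.normal(0.0, 0.3, n)     # informative summary
S_noise = rng.normal(0.0, 1.0, n)          # pure-noise summary
W = np.ones(n)                              # uniform weights for simplicity
X_good = np.column_stack([np.ones(n), S_good])
X_noise = np.column_stack([np.ones(n), S_noise])
le_good = log_evidence(phi, X_good, W, alpha=1.0, tau2=0.1)
le_noise = log_evidence(phi, X_noise, W, alpha=1.0, tau2=0.1)
```

Because the noise statistic leaves large residuals, its E(βMAP) term dominates and its evidence drops far below that of the informative statistic.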

SLIDE 15

The evidence function as an omnibus criterion

The evidence as a function of the tolerance rate:

  log p(φ | pδ) = max_{(α,τ)} log p(φ | τ², α, pδ).

The evidence as a function of the set of summary statistics:

  log p(φ | S) = max_{(α,τ,pδ)} log p(φ | τ², α, pδ).

SLIDE 16

Iterative algorithm for maximizing the evidence w.r.t. α and τ

Updating the values of the hyperparameters:

  α = γ / (βMAP^T βMAP),  where γ = (p + 1) − α Tr(V) is the effective number of summary statistics;

  τ² = ∑_{i=1}^{n} (φi − (Si − S)^T β)² Wi / (NW − γ).
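The two updates above can be iterated to a fixed point, in the style of standard evidence-approximation algorithms; the simulated data, uniform weights, and starting values in this sketch are illustrative assumptions.

```python
import numpy as np

rng = np.random.default_rng(5)

# Illustrative data: phi depends linearly on a centered summary statistic.
n = 1000
S = rng.normal(0.0, 1.0, n)
phi = 0.5 + 0.9 * S + rng.normal(0.0, 0.3, n)
X = np.column_stack([np.ones(n), S])    # design: intercept + centered statistic
W = np.ones(n)                           # uniform weights for simplicity
NW = W.sum()
p1 = X.shape[1]

alpha, tau2 = 1.0, 1.0                   # starting values for the hyperparameters
for _ in range(100):
    V_inv = alpha * np.eye(p1) + (X.T * W) @ X / tau2
    V = np.linalg.inv(V_inv)
    beta_map = V @ ((X.T * W) @ phi) / tau2
    gamma = p1 - alpha * np.trace(V)            # effective number of parameters
    alpha = gamma / (beta_map @ beta_map)       # update alpha
    resid = phi - X @ beta_map
    tau2 = (resid**2 * W).sum() / (NW - gamma)  # update tau^2
```

At convergence τ² matches the residual variance of the local fit, and γ counts how many of the p + 1 coefficients are actually pinned down by the data rather than by the prior.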

SLIDE 17

Using the evidence for choosing pδ

Toy example 2: φ ∼ U(−c, c), c ∈ ℝ, and S ∼ N(e^φ/(1 + e^φ), σ² = (0.05)²).

[Figure: log evidence (about −150 to 100) as a function of the acceptance rate (0.003 to 1.000).]

SLIDE 18

Using the evidence for choosing pδ

[Figure: four panels (c = 3, 5, 10, 50) plotting S (0.0 to 1.0) against φ, with accepted and rejected simulations marked.]

SLIDE 19

Using the evidence for choosing the summary statistics

Toy example 1: (S1, …, S5) = (x̄N, s²N, u1, u2, u3).

[Figure: 2.5%, 50%, and 97.5% posterior quantiles of σ² (0.1 to 50.0) against the number of summary statistics (1 or 5), together with the choice of a set of statistics based on the evidence ("Variance only" versus "Other").]

SLIDE 20

Transformation of the statistics can matter

Left panel: S1 = log s²N, or (S1, …, S5) = (x̄N, log s²N, u1, u2, u3).
Right panel: S1 = s²N, or (S1, …, S5) = (x̄N, s²N, u1, u2, u3).

[Figure: 2.5%, 50%, and 97.5% posterior quantiles of σ² against the number of summary statistics (1 or 5); the left panel uses the log of the empirical variance (scale 1.05 to 1.25), the right panel the original scale (0.1 to 50.0).]

SLIDE 21

Pros and cons

Cons
- Quite complicated.
- Model (variable) selection for the regression, but not for the density estimation.

Pros
- Similar methodology without regression adjustment.
- Omnibus criterion (choice of the summary statistics and of the tolerance rate pδ).
- Shrinkage of the regression coefficients.

SLIDE 22

Thank you all for your attention.