SLIDE 1

Approximate Bayesian Computation

Michael Gutmann

https://sites.google.com/site/michaelgutmann University of Helsinki and Aalto University

1st December 2015

SLIDE 2

Content

Two parts:

  1. The basics of approximate Bayesian computation (ABC)
  2. ABC methods used in practice

What is ABC? A set of methods for approximate Bayesian inference which can be used whenever sampling from the model is possible. Michael Gutmann ABC 2 / 47

SLIDE 3

Part I Basic ABC

SLIDE 4

Bayesian inference Inference for simulator-based models Recap Simulator-based models

Recap of Bayesian inference

◮ The ingredients for Bayesian parameter inference:
  ◮ Observed data yo ∈ Y ⊂ Rn
  ◮ A statistical model for the data generating process, py|θ, parametrized by θ ∈ Θ ⊂ Rd
  ◮ A prior probability density function (pdf) pθ for the parameters θ
◮ The mechanics of Bayesian inference:

  pθ|y(θ|yo) ∝ py|θ(yo|θ) × pθ(θ)   (1)
  posterior ∝ likelihood function × prior   (2)

◮ Often written without subscripts ("function overloading"):

  p(θ|yo) ∝ p(yo|θ) × p(θ)   (3)

SLIDE 5

Likelihood function

◮ Likelihood function: L(θ) = p(yo|θ)

◮ For discrete random variables:

  L(θ) = p(yo|θ) = Pr(y = yo|θ)   (4)

  The probability that data generated from the model, when using parameter value θ, are equal to yo.

◮ For continuous random variables:

  L(θ) = p(yo|θ) = lim_{ǫ→0} Pr(y ∈ Bǫ(yo)|θ) / Vol(Bǫ(yo))   (5)

  Proportional to the probability that the generated data fall in a small ball Bǫ(yo) around yo.

◮ L(θ) indicates the extent to which different values of the model parameters are consistent with the observed data.

SLIDE 6

Example

◮ Prior: p(θ) = 1/√(2π · 4²) exp(−θ²/(2 · 4²))
◮ Observed data: yo = 2
◮ Model: p(y|θ) = 1/√(2π) exp(−(y − θ)²/2)

Figure: prior pdf, likelihood function, and posterior pdf as functions of the mean θ.

SLIDE 7

Different kinds of statistical models

◮ The statistical model was defined via the family of pdfs p(y|θ).
◮ Statistical models can be specified in other ways as well.
◮ In this lecture: models which are specified via a mechanism (rule) for generating data.
◮ Example: Instead of

  p(y|θ) = 1/√(2π) exp(−(y − θ)²/2)   (6)

  we could have specified the model via

  y = z + θ,  z = √(−2 log(ω)) cos(2πν)   (7)

  where ω and ν are independent random variables uniformly distributed on (0, 1). Advantage?

SLIDE 8

Simulator-based models

◮ Sampling from the model is straightforward. For example:
  1. sampling ωi and νi from the uniform random variables ω and ν,
  2. computing the nonlinear transformation

     yi = f(ωi, νi, θ) = θ + √(−2 log(ωi)) cos(2πνi)

  produces samples yi ∼ p(y|θ).
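The two-step recipe above can be sketched in Python (a minimal illustration of ours; the function and variable names are not from the slides):

```python
import math
import random

def simulate(theta, n, rng=random.Random(0)):
    """Draw n samples from the implicit model y = theta + z, where
    z is standard normal via the Box-Muller transformation."""
    samples = []
    for _ in range(n):
        omega = rng.random()   # omega_i ~ U(0, 1)
        nu = rng.random()      # nu_i ~ U(0, 1)
        z = math.sqrt(-2.0 * math.log(omega)) * math.cos(2.0 * math.pi * nu)
        samples.append(theta + z)
    return samples

ys = simulate(theta=2.0, n=100_000)
mean = sum(ys) / len(ys)  # should be close to theta = 2
```

The simulator never evaluates p(y|θ); it only transforms uniform noise, which is exactly what makes the model "implicit".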

◮ Enables direct modeling of how data are generated.
◮ Names for models specified via a data generating mechanism:
  ◮ Generative models
  ◮ Implicit models
  ◮ Stochastic simulation models
  ◮ Simulator-based models

SLIDE 9

Examples

Simulator-based models are used in:

◮ Astrophysics:

Simulating the formation of galaxies, stars, or planets

◮ Evolutionary biology:

Simulating the evolution of life

◮ Health science:

Simulating the spread of an infectious disease

◮ . . .

Dark matter density simulated by the Illustris collaboration (Figure from http://www.illustris-project.org)


SLIDE 10

Examples (evolutionary biology)

◮ Simulation of different hypothesized evolutionary scenarios
◮ Interaction between early modern humans (Homo sapiens) and their Neanderthal contemporaries in Europe

Immigration of Modern Humans into Europe from the Near East. Light gray: Neanderthal population. Dark: Homo sapiens. From (Currat and Excoffier, PLoS Biology, 2004, 10.1371/journal.pbio.0020421). The numbers in the figures indicate generations. See also Pinhasi et al, The genetic history of Europeans, Trends in Genetics, 2012.

SLIDE 11

Examples (health science)

◮ Simulation of bacterial transmission dynamics in child day care centers (Numminen et al, Biometrics, 2013)

Figure: snapshots of colonization states (individuals × strains) over time.

SLIDE 12

Formal definition of a simulator-based model

◮ Let (Ω, F, P) be a probability space.
◮ A simulator-based model is a collection of (measurable) functions g(·, θ) parametrized by θ,

  ω ∈ Ω → y = g(ω, θ) ∈ Y   (8)

◮ For any fixed θ, yθ = g(·, θ) is a random variable.
◮ Simulation / sampling: drawing ω ∼ P and evaluating y = g(ω, θ).

SLIDE 13

Advantages of simulator-based models

◮ Direct implementation of hypotheses of how the observed data were generated.
◮ Neat interface with physical or biological models of data.
◮ Modeling by replicating the mechanisms of nature which produced the observed/measured data. ("Analysis by synthesis")
◮ Possibility to perform experiments in silico.

SLIDE 14

Disadvantages of simulator-based models

◮ Generally elude analytical treatment.
◮ Can easily be made more complicated than necessary.
◮ Statistical inference is difficult . . . but possible!

— This lecture is about inference for simulator-based models —

SLIDE 15

Bayesian inference Inference for simulator-based models Likelihood function Exact inference Approximate inference

Family of pdfs induced by the simulator

◮ For any fixed θ, the output of the simulator yθ = g(·, θ) is a random variable.
◮ Generally, it is impossible to write down the pdf of yθ analytically in closed form.
◮ No closed-form formulae are available for p(y|θ).
◮ The simulator defines the model pdfs p(y|θ) implicitly.

SLIDE 16

Implicit definition of the model pdfs


SLIDE 17

Implicit definition of the likelihood function

◮ The implicit definition of the model pdfs implies an implicit definition of the likelihood function.
◮ For discrete random variables: L(θ) = Pr(y = yo|θ), the probability that the simulator output equals the observed data.

SLIDE 18

Implicit definition of the likelihood function

◮ For continuous random variables: L(θ) = lim_{ǫ→0} Lǫ(θ), where Lǫ(θ) is proportional to the probability that the simulated data fall in the ball Bǫ(yo).

SLIDE 19

Implicit definition of the likelihood function

◮ To compute the likelihood function, we need the probability that the simulator generates data equal or close to yo,

  Pr(y = yo|θ)  or  Pr(y ∈ Bǫ(yo)|θ)

◮ No analytical expression is available.
◮ But we can empirically test whether simulated data equals yo or is in Bǫ(yo).
◮ This property will be exploited to perform inference for simulator-based models.

SLIDE 20

Exact inference for discrete random variables

◮ For discrete random variables, we can perform exact Bayesian inference without knowing the likelihood function.
◮ Idea: the posterior is obtained by conditioning p(θ, y) on the event y = yo:

  p(θ|yo) = p(θ, yo) / p(yo) = p(θ, y = yo) / p(y = yo)   (9)

◮ Given tuples (θi, yi) where
  ◮ θi ∼ pθ (iid from the prior)
  ◮ yi = g(ωi, θi) (obtained by running the simulator)
  retain only those where yi = yo.
◮ The θi from the retained tuples are samples from the posterior p(θ|yo).

SLIDE 21

Example

◮ Posterior inference of the success probability θ in a Bernoulli trial.
◮ Data: yo = 1
◮ Prior: pθ = 1 on (0, 1)
◮ Data generating process:
  ◮ Given θi ∼ pθ,
  ◮ draw ωi ∼ U(0, 1),
  ◮ set yi = 1 if ωi < θi, and yi = 0 otherwise.
◮ Retain those θi for which yi = yo.
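The exact rejection scheme of this Bernoulli example can be written in a few lines (a minimal sketch; the sample size and seed are our choices). For yo = 1 and a uniform prior, the true posterior is Beta(2, 1), whose mean 2/3 the retained samples should approach:

```python
import random

rng = random.Random(2)

def sample_exact_posterior(y_obs, n_tuples):
    """Exact rejection sampling: keep theta_i ~ prior whenever the
    simulated Bernoulli outcome y_i equals the observed y_obs."""
    accepted = []
    for _ in range(n_tuples):
        theta = rng.random()                  # theta_i ~ Uniform(0, 1) prior
        y = 1 if rng.random() < theta else 0  # simulate one Bernoulli trial
        if y == y_obs:                        # condition on y = y_obs
            accepted.append(theta)
    return accepted

post = sample_exact_posterior(y_obs=1, n_tuples=100_000)
post_mean = sum(post) / len(post)  # should approach 2/3
```

Note that about half the tuples are rejected here; as the slides stress below, this acceptance rate collapses as the data dimension grows.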

SLIDE 22

Example

◮ The method produces samples from the posterior.
◮ Monte Carlo error arises when summarizing the samples as an empirical distribution or computing expectations via sample averages.
◮ Histograms for N simulated tuples (θi, yi):

Figure: estimated posterior pdf of the success probability vs. the true posterior pdf and the prior pdf, for N = 1000, N = 10,000, and N = 100,000.

SLIDE 23

The good and the bad

◮ The method produces samples from p(θ|yo). This is good.
◮ But it is only applicable to discrete random variables.
◮ And even for discrete random variables, it is computationally not feasible in higher dimensions.
◮ Reason: the probability of the event yθ = yo becomes smaller and smaller as the dimension of the data increases.
◮ Out of N simulated tuples, only a small fraction will be accepted.
◮ The small number of accepted samples does not represent the posterior well: large Monte Carlo errors. This is bad.

SLIDE 24

Approximations to make inference feasible

◮ Settle for approximate yet computationally feasible inference.
◮ Introduce two types of approximations:
  1. Instead of working with the whole data, work with lower dimensional summary statistics tθ and to,

     tθ = T(yθ),  to = T(yo).   (10)

  2. Instead of checking tθ = to, check whether ∆θ = d(to, tθ) is less than ǫ. (d may or may not be a metric)
◮ In other words:
  1. Replace Pr(y ∈ Bǫ′(yo)|θ) with Pr(∆θ ≤ ǫ|θ)
  2. Do not take the limit ǫ → 0
◮ This defines an approximate likelihood function L̃ǫ(θ),

  L̃ǫ(θ) ∝ Pr(∆θ ≤ ǫ|θ)   (11)

SLIDE 25

Example

◮ Inference of the mean θ of a Gaussian of variance one.
◮ Pr(y = yo|θ) = 0.
◮ Discrepancy:

  ∆θ = (µ̂o − µ̂θ)²,  µ̂o = (1/n) Σᵢ yoᵢ,  µ̂θ = (1/n) Σᵢ yᵢ,  yᵢ ∼ N(θ, 1)

◮ The discrepancy ∆θ is a random variable.

Figure: realizations of the discrepancy as a function of θ (mean and 0.1, 0.9 quantiles of the stochastic process).

SLIDE 26

Example

Probability that ∆θ is below some threshold ǫ approximates the likelihood function.

Figure: approximate likelihood (rescaled, threshold ǫ = 0.1) vs. the true likelihood (rescaled), with the mean and 0.1, 0.9 quantiles of the discrepancy.

SLIDE 27

Example

◮ Here, T(y) = (1/n) Σᵢ yᵢ is a sufficient statistic for inference of the mean θ.
◮ The only approximation is ǫ > 0.
◮ In general, the summary statistics will not be sufficient.

SLIDE 28

Rejection ABC algorithm

◮ The two approximations yield the rejection algorithm for approximate Bayesian computation (ABC):
  1. Sample θi ∼ pθ
  2. Simulate a data set yi by running the simulator with θi (yi = g(ωi, θi))
  3. Compute the discrepancy ∆i = d(T(yo), T(yi))
  4. Retain θi if ∆i ≤ ǫ
◮ This is the basic ABC algorithm.
◮ It produces samples θ ∼ p̃ǫ(θ|yo),

  p̃ǫ(θ|yo) ∝ pθ(θ) L̃ǫ(θ)   (12)
  L̃ǫ(θ) ∝ Pr(d(T(yo), T(y)) ≤ ǫ | θ) = Pr(∆θ ≤ ǫ | θ)   (13)
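A minimal Python sketch of the four steps for the running Gaussian-mean example (the prior range, tolerance, and sample sizes are our illustrative choices):

```python
import random

rng = random.Random(1)

def simulator(theta, n):
    # y_i = g(omega_i, theta): n draws from N(theta, 1)
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def summary(y):
    # T(y): the sample mean, sufficient for the mean parameter
    return sum(y) / len(y)

def rejection_abc(y_obs, prior_sample, n_sims, eps):
    t_obs = summary(y_obs)
    accepted = []
    for _ in range(n_sims):
        theta = prior_sample()               # 1. theta_i ~ p_theta
        y = simulator(theta, len(y_obs))     # 2. run the simulator
        delta = (t_obs - summary(y)) ** 2    # 3. discrepancy Delta_i
        if delta <= eps:                     # 4. retain if Delta_i <= eps
            accepted.append(theta)
    return accepted

y_obs = simulator(2.0, 50)  # stand-in for observed data with true mean 2
post = rejection_abc(y_obs, lambda: rng.uniform(-10, 10), 20_000, 0.01)
post_mean = sum(post) / len(post)
```

With a uniform prior over (−10, 10), only a percent or so of the 20,000 proposals survive step 4, which previews the efficiency critique in Part II.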

SLIDE 29

Part II ABC methods used in practice

SLIDE 30

Standard algorithms Recent developments Critique of rejection ABC Regression ABC Sequential Monte Carlo ABC

Brief recap

◮ Simulator-based models: models which are specified by a data generating mechanism.
◮ By construction, we can sample from simulator-based models; the likelihood function can generally not be written down.
◮ Rejection ABC: a trial-and-error scheme to find parameter values which produce simulated data resembling the observed data.
◮ Simulated data resemble the observed data if some discrepancy measure is small.

SLIDE 31

Critique of the rejection ABC algorithm

◮ The rejection ABC algorithm works.
◮ But it is computationally not efficient.
◮ The probability of the event ∆θ ≤ ǫ is usually small when θ ∼ pθ, in particular for small ǫ.

SLIDE 32

Critique of the rejection ABC algorithm

◮ In the Gaussian example, the probability of the event ∆θ ≤ ǫ can be computed in closed form. With ∆θ = (µ̂o − µ̂θ)²,

  Pr(∆θ ≤ ǫ) = Φ(√n (µ̂o − θ) + √(nǫ)) − Φ(√n (µ̂o − θ) − √(nǫ)),
  Φ(x) = ∫_{−∞}^{x} 1/√(2π) exp(−u²/2) du

◮ For nǫ small: L̃ǫ(θ) ∝ Pr(∆θ ≤ ǫ) ∝ √ǫ L(θ)
◮ For small ǫ: a good approximation of the likelihood function.
◮ But for small ǫ, Pr(∆θ ≤ ǫ) ≈ 0: very few samples will be accepted.

Figure: approximate likelihood (rescaled, threshold 0.1) vs. the true likelihood (rescaled), with the mean and 0.1, 0.9 quantiles of the discrepancy.
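The closed-form acceptance probability can be checked numerically against a Monte Carlo estimate (a quick sketch; the parameter values are our choices):

```python
import math
import random

def phi(x):
    # standard normal CDF via the error function
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

def accept_prob_closed_form(mu_obs, theta, n, eps):
    # Pr(Delta_theta <= eps) for Delta_theta = (mu_obs - mu_hat)^2,
    # where mu_hat is the mean of n draws from N(theta, 1)
    a = math.sqrt(n) * (mu_obs - theta)
    b = math.sqrt(n * eps)
    return phi(a + b) - phi(a - b)

def accept_prob_monte_carlo(mu_obs, theta, n, eps, trials=50_000, seed=0):
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        mu_hat = sum(rng.gauss(theta, 1.0) for _ in range(n)) / n
        if (mu_obs - mu_hat) ** 2 <= eps:
            hits += 1
    return hits / trials

p_exact = accept_prob_closed_form(mu_obs=2.0, theta=1.8, n=10, eps=0.1)
p_mc = accept_prob_monte_carlo(mu_obs=2.0, theta=1.8, n=10, eps=0.1)
```

Shrinking `eps` in this check makes both probabilities collapse toward zero, which is exactly the acceptance-rate problem described above.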

SLIDE 33

Two widely used algorithms

◮ Two widely used algorithms which improve upon rejection ABC:
  1. Regression ABC (Beaumont et al, Genetics, 2002)
  2. Sequential Monte Carlo ABC (Sisson et al, PNAS, 2007)
◮ Both use rejection ABC as a building block.
◮ Sequential Monte Carlo (SMC) ABC is also known as Population Monte Carlo (PMC) ABC.

SLIDE 34

Two widely used algorithms

◮ Regression ABC consists in running rejection ABC with a relatively large ǫ and then adjusting the obtained samples so that they are closer to samples from the true posterior.
◮ Sequential Monte Carlo ABC consists in sampling θ from an adaptively constructed proposal distribution φ(θ) rather than from the prior, in order to avoid simulating many data sets which are not accepted.

SLIDE 35

Basic idea of regression ABC

◮ The summary statistics tθ = T(yθ) and the parameters θ have a joint distribution.
◮ Let ti be the summary statistics for simulated data yi = g(ωi, θi).
◮ We can learn a regression model between the summary statistics (covariates) and the parameters (response variables),

  θi = f(ti) + ξi   (14)

  where ξi is the error term (a zero mean random variable).
◮ The training data for the regression are typically tuples (θi, ti) produced by rejection ABC with some sufficiently large ǫ.

SLIDE 36

Basic idea of regression ABC

◮ Fitting the regression model to the training data (θi, ti) yields an estimated regression function f̂ and the residuals ξ̂i,

  ξ̂i = θi − f̂(ti)   (15)

◮ Regression ABC consists in replacing θi with θ*ᵢ,

  θ*ᵢ = f̂(to) + ξ̂i = f̂(to) + θi − f̂(ti)   (16)

◮ This corresponds to an adjustment of θi.
◮ If the relation between t and θ is learned correctly, the θ*ᵢ correspond to samples from an approximation with ǫ = 0.
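With a scalar summary statistic and a linear f̂, the adjustment of equation (16) can be sketched as follows (our own minimal implementation; Beaumont et al use weighted local-linear regression, which we omit here):

```python
def linear_regression_abc(thetas, ts, t_obs):
    """Adjust rejection-ABC samples (thetas, with summaries ts) toward t_obs
    via theta*_i = f_hat(t_obs) + theta_i - f_hat(t_i), f_hat linear."""
    n = len(thetas)
    t_mean = sum(ts) / n
    th_mean = sum(thetas) / n
    # ordinary least squares for theta ~ a + b * t
    cov = sum((t - t_mean) * (th - th_mean) for t, th in zip(ts, thetas)) / n
    var = sum((t - t_mean) ** 2 for t in ts) / n
    b = cov / var
    a = th_mean - b * t_mean
    f_hat = lambda t: a + b * t
    # shift each sample by the change in the fitted regression function
    return [f_hat(t_obs) + th - f_hat(t) for th, t in zip(thetas, ts)]
```

As a sanity check: if θi = ti exactly (a noiseless linear relation), every adjusted sample equals t_obs, i.e. the spread due to the nonzero ǫ is removed entirely.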

SLIDE 37

Basic idea of sequential Monte Carlo ABC

◮ We may modify the rejection ABC algorithm and use a proposal distribution φ(θ) instead of the prior pθ:
  1. Sample θi ∼ φ(θ)
  2. Simulate a data set yi by running the simulator with θi (yi = g(ωi, θi))
  3. Compute the discrepancy ∆i = d(T(yo), T(yi))
  4. Retain θi if ∆i ≤ ǫ
◮ The retained samples follow a distribution proportional to φ(θ) L̃ǫ(θ).

SLIDE 38

Basic idea of sequential Monte Carlo ABC

◮ Parameters θi weighted with

  wi = pθ(θi) / φ(θi)   (17)

  follow a distribution proportional to pθ(θ) L̃ǫ(θ).
◮ This can be used to iteratively morph the prior into the posterior:
  ◮ Use a sequence of shrinking thresholds ǫt.
  ◮ Run rejection ABC with ǫ0.
  ◮ Define φt at iteration t based on the weighted samples from the previous iteration (e.g. a Gaussian mixture with means equal to the θi from the previous iteration).
◮ More efficient than rejection ABC: φt(θ) is close to the approximate posterior in the final iterations.
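The iteration can be sketched for the Gaussian-mean example as follows (a simplified illustration: the particle count, the shrinking threshold schedule, and the fixed-width Gaussian-mixture proposal are our choices, not prescriptions from the slides):

```python
import math
import random

rng = random.Random(3)

def norm_pdf(x, m, s):
    return math.exp(-0.5 * ((x - m) / s) ** 2) / (s * math.sqrt(2.0 * math.pi))

def simulator(theta, n):
    return [rng.gauss(theta, 1.0) for _ in range(n)]

def smc_abc(y_obs, eps_schedule, n_particles=300, lo=-10.0, hi=10.0, kernel=0.5):
    t_obs = sum(y_obs) / len(y_obs)
    n = len(y_obs)
    prior_pdf = 1.0 / (hi - lo)                          # uniform prior density
    particles, weights = [], []
    for t, eps in enumerate(eps_schedule):
        new_p, new_w = [], []
        while len(new_p) < n_particles:
            if t == 0:
                theta, w = rng.uniform(lo, hi), 1.0      # phi_0 = prior
            else:
                # phi_t: Gaussian mixture centred on the previous particles
                theta = rng.gauss(rng.choices(particles, weights=weights)[0], kernel)
                if not lo < theta < hi:
                    continue                             # zero prior density
                phi = sum(wj * norm_pdf(theta, pj, kernel)
                          for wj, pj in zip(weights, particles)) / sum(weights)
                w = prior_pdf / phi                      # w_i = p_theta / phi_t
            y = simulator(theta, n)
            if (sum(y) / n - t_obs) ** 2 <= eps:         # Delta_theta <= eps_t
                new_p.append(theta)
                new_w.append(w)
        particles, weights = new_p, new_w
    return particles, weights

y_obs = simulator(2.0, 50)
parts, ws = smc_abc(y_obs, eps_schedule=[1.0, 0.1, 0.01])
post_mean = sum(w * p for w, p in zip(ws, parts)) / sum(ws)
```

At the final threshold 0.01 most proposals are already near the high-likelihood region, so far fewer simulations are wasted than when sampling θ from the prior throughout.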

SLIDE 39

Standard algorithms Recent developments Bayesian optimization for ABC Application

Another approach

◮ Evaluating ∆θ is computationally costly, and we are only interested in small ∆θ (thresholding!).
◮ We could increase the computational efficiency by evaluating ∆θ predominantly where it tends to be small.
◮ Use a combination of probabilistic modeling of ∆θ and optimization to figure out where to evaluate ∆θ.

Figure: realizations of the discrepancy (objective) as a function of θ, with mean and 0.1, 0.9 quantiles.

SLIDE 40

Learning a model of the discrepancy

◮ The approximate likelihood function L̃ǫ(θ) is determined by the distribution of the discrepancy ∆θ,

  L̃ǫ(θ) ∝ Pr(∆θ ≤ ǫ | θ)

◮ If we knew the distribution of ∆θ we could compute L̃ǫ(θ).
◮ In recent work, we proposed to learn a model of ∆θ and to approximate L̃ǫ(θ) by L̂ǫ(θ),

  L̂ǫ(θ) ∝ Pr(∆θ ≤ ǫ | θ),   (18)

  where Pr is the probability under the model of ∆θ. (Gutmann and Corander, Journal of Machine Learning Research, in press, 2015)
◮ The model is learned more accurately in regions where ∆θ tends to be small, using techniques from Bayesian optimization.
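As a deliberately simplified stand-in for this idea (the actual method models ∆θ with a Gaussian process and chooses evaluation points via Bayesian optimization; here we swap in a plain quadratic least-squares surrogate and a toy discrepancy of our own), the evaluate-where-the-model-says-∆θ-is-small loop can be sketched as:

```python
import random

rng = random.Random(4)

def discrepancy(theta):
    # toy stand-in for an expensive, noisy simulator-based discrepancy
    return (theta - 2.0) ** 2 + 0.1 * rng.gauss(0.0, 1.0)

def solve3(A, b):
    # Gauss-Jordan elimination for a 3x3 linear system
    M = [row[:] + [bi] for row, bi in zip(A, b)]
    for c in range(3):
        p = max(range(c, 3), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(3):
            if r != c:
                f = M[r][c] / M[c][c]
                for j in range(c, 4):
                    M[r][j] -= f * M[c][j]
    return [M[i][3] / M[i][i] for i in range(3)]

def fit_quadratic(xs, ys):
    # least-squares fit of Delta ~ a + b*theta + c*theta^2 (normal equations)
    s = lambda k: sum(x ** k for x in xs)
    A = [[len(xs), s(1), s(2)], [s(1), s(2), s(3)], [s(2), s(3), s(4)]]
    rhs = [sum(ys), sum(x * y for x, y in zip(xs, ys)),
           sum(x * x * y for x, y in zip(xs, ys))]
    return solve3(A, rhs)  # coefficients a, b, c

# round 1: explore the parameter space coarsely
xs = [rng.uniform(-5.0, 5.0) for _ in range(10)]
ys = [discrepancy(x) for x in xs]
a, b, c = fit_quadratic(xs, ys)
guess = -b / (2.0 * c)            # minimizer of the fitted surrogate

# round 2: evaluate predominantly where the model says Delta is small
xs += [guess + rng.uniform(-0.5, 0.5) for _ in range(10)]
ys += [discrepancy(x) for x in xs[10:]]
a, b, c = fit_quadratic(xs, ys)
theta_hat = -b / (2.0 * c)        # refined estimate of the minimizer
```

The second round concentrates the expensive evaluations near the region of small discrepancy, which is the core efficiency argument of the slide; the Gaussian-process version additionally quantifies uncertainty to balance exploration and exploitation.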

SLIDE 41

Bayesian optimization

Figure: illustration of Bayesian optimization. Starting from one data point, a belief about the target (the discrepancy as a function of θ) is maintained; an acquisition function, trading off exploration and exploitation, determines where to evaluate next; after acquisition of new data, the belief is updated via Bayes' theorem, and the cycle repeats (shown for 1, 2, and 3 data points).

SLIDE 42

Application to epidemiology of infectious diseases

◮ Inference about bacterial transmission dynamics in child day care centers (Numminen et al, Biometrics, 2013)

Figure: snapshots of colonization states (individuals × strains) over time.

SLIDE 43

Application to epidemiology of infectious diseases

Data: colonization states of sampled attendees of 29 child day care centers (DCCs) in the greater Oslo area.

Example data from a DCC. Each square indicates an attendee colonized with a strain of the bacterium Streptococcus pneumoniae.

SLIDE 44

Application to epidemiology of infectious diseases

◮ Simulator-based model: a latent continuous-time Markov chain for the transmission dynamics in a DCC, together with an observation model (Numminen et al, Biometrics, 2013).
◮ The model has three parameters:
  ◮ β: rate of infections within a DCC
  ◮ Λ: rate of infections outside a DCC
  ◮ θ: possibility to be infected with multiple strains
◮ The likelihood is intractable (only data at a single time point are available).

SLIDE 45

Application to epidemiology of infectious diseases

◮ Comparison of the model-based approach with a sequential/population Monte Carlo ABC approach.
◮ Roughly equal results using 1000 times fewer simulations.
◮ The minimizer of the regression function under the model does not involve choosing a threshold ǫ.

Figure: inferred β vs. log10 of the number of simulated data sets, for the minimizer of the regression function, the model-based posterior, the PMC-ABC posterior, and the final result of Numminen et al. Posterior means: solid lines with markers; credibility intervals: shaded areas or dashed lines.

SLIDE 46

Application to epidemiology of infectious diseases

◮ Comparison of the model-based approach with a sequential/population Monte Carlo ABC approach.

Figure: as on the previous slide, for the parameters Λ and θ. Posterior means are shown as solid lines with markers, credibility intervals as shaded areas or dashed lines.

SLIDE 47

Summary

◮ The topic was Bayesian inference for models specified via a simulator (implicit / generative models).
◮ Introduced approximate Bayesian computation (ABC).
◮ Principle of ABC: find parameter values which yield simulated data resembling the observed data.
◮ Covered three classical algorithms:
  1. Rejection ABC
  2. Regression ABC
  3. Sequential Monte Carlo ABC
◮ Introduced recent work which uses Bayesian optimization to increase the efficiency of the inference.
◮ Not covered: how to choose the summary statistics / the discrepancy measure between simulated and observed data.