SLIDE 1

Approximate Bayesian Computation

Dr. Jarad Niemi

STAT 615 - Iowa State University

December 5, 2017

SLIDE 2

Outline

  • Stochastic kinetic models
  • Approximate Bayesian computation

SLIDE 3

Stochastic kinetic models Terminology

Stochastic kinetic models

Imagine a well-mixed system in thermal equilibrium with

  • N species: S1, . . . , SN with numbers of molecules X1, . . . , XN, where each Xi ∈ Z+,
  • which change according to M reactions: R1, . . . , RM with propensities a1(x), . . . , aM(x).
  • The propensities are given by aj(x) = θj hj(x), where hj(x) is a known function of the system state.
  • If reaction j occurs, the state is updated by the stoichiometry νj with elements νij ∈ {−2, −1, 0, 1, 2}, i.e. reaction orders 0, 1, and 2.

SLIDE 4

Stochastic kinetic models Terminology

Michaelis-Menten System

The Michaelis-Menten system has N = 4 species: Substrate (S), Enzyme (E), Substrate-Enzyme Complex (SE), and Product (P). The M = 3 reactions, together with their propensities and stoichiometries, are

                                   Stoichiometry
Reaction       Propensity       S     E     SE    P
S + E → SE     θ1 XS XE        −1    −1    +1    0
SE → S + E     θ2 XSE          +1    +1    −1    0
SE → P + E     θ3 XSE           0    +1    −1   +1

where θ = (θ1, θ2, θ3) is the parameter of interest.

SLIDE 5

Stochastic kinetic models Terminology

Michaelis-Menten snapshot

[Figure: snapshot of a Michaelis-Menten simulation showing the four species: Substrate, Enzyme, Substrate-Enzyme Complex, and Product]

SLIDE 6

Stochastic kinetic models Gillespie algorithm

Gillespie algorithm

If reaction j ∈ {1, . . . , M} occurs with probability

$$P(\text{reaction } j \text{ within the interval } (t, t+dt] \mid X_t) = a_j(X_t)\,dt + o(dt) \quad \text{as } dt \to 0,$$

then this defines a continuous-time Markov jump process, and a realization from this model can be obtained using the Gillespie algorithm:

  • 1. For j ∈ {1, . . . , M}, calculate aj(Xt).
  • 2. Calculate $a_0(X_t) = \sum_{j=1}^{M} a_j(X_t)$.
  • 3. Simulate a reaction time τ ∼ Exp(a0(Xt)).
  • 4. Simulate a reaction id k ∈ {1, . . . , M} with probability ak(Xt)/a0(Xt).
  • 5. Update X according to νk and advance time by τ (see the sketch after this list).
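As a concrete companion to the steps above, here is a minimal Python sketch; the function name and NumPy interface are illustrative choices, and the Michaelis-Menten setup at the bottom is an assumption meant to roughly mirror the simulated system on the later slides.

```python
import numpy as np

def gillespie(x0, theta, h, nu, t_max, rng=None):
    """Simulate one realization of a stochastic kinetic model.

    x0: length-N initial counts; theta: length-M rates;
    h(x): length-M vector of h_j(x); nu: M x N stoichiometry matrix.
    """
    rng = rng or np.random.default_rng()
    t, x = 0.0, np.array(x0, dtype=int)
    times, states = [t], [x.copy()]
    while True:
        a = theta * h(x)                    # propensities a_j(x) = theta_j * h_j(x)
        a0 = a.sum()                        # a_0(x) = sum_j a_j(x)
        if a0 == 0:                         # no reaction can fire
            break
        t += rng.exponential(1.0 / a0)      # reaction time tau ~ Exp(a_0)
        if t > t_max:
            break
        k = rng.choice(len(a), p=a / a0)    # reaction id k with probability a_k / a_0
        x += nu[k]                          # update state by stoichiometry nu_k
        times.append(t)
        states.append(x.copy())
    return np.array(times), np.array(states)

# Illustrative Michaelis-Menten setup (assumed); state order (S, E, SE, P)
nu = np.array([[-1, -1,  1, 0],   # S + E -> SE
               [ 1,  1, -1, 0],   # SE -> S + E
               [ 0,  1, -1, 1]])  # SE -> P + E
h = lambda x: np.array([x[0] * x[1], x[2], x[2]])
times, states = gillespie([301, 120, 0, 0], np.array([0.001, 0.2, 0.1]), h, nu, 100)
```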

SLIDE 7

Stochastic kinetic models Gillespie algorithm

Michaelis-Menten Gillespie Simulation

[Figure: simulated counts of S, E, SE, and P over time from one Gillespie realization]

SLIDE 8

Stochastic kinetic models Complete observations

Complete observations

Suppose you observe all system transitions:

  • n reactions occur in the interval [0, T],
  • t1, . . . , tn are the reaction times, and
  • r1, . . . , rn are the reaction indicators, ri ∈ {1, . . . , M}.

Then inference can be performed based on the likelihood

$$L(\theta) \propto \prod_{j=1}^{M} \theta_j^{n_j} \exp(-\theta_j I_j)$$

where the number of type-j reactions and the integrated hazard are

$$n_j = \sum_{i=1}^{n} \mathrm{I}(r_i = j), \qquad I_j = \int_0^T h_j(X_t)\,dt = \sum_{i=1}^{n} h_j(X_{t_{i-1}})(t_i - t_{i-1}) + h_j(X_{t_n})\,[T - t_n].$$
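For concreteness, a sketch of computing these sufficient statistics from a complete path, assuming the simulator also records the reaction indicators r_i (0-indexed here) alongside the jump times (with t_0 = 0 first) and the corresponding states:

```python
import numpy as np

def sufficient_stats(times, states, reactions, h, M, T):
    """n_j = # of type-j reactions; I_j = integral of h_j(X_t) over [0, T].

    times/states include the initial point (t_0 = 0, X_0); reactions holds
    the 0-indexed indicator r_i for each of the n jumps.
    """
    n = np.bincount(np.asarray(reactions, dtype=int), minlength=M)
    I = np.zeros(M)
    for i in range(1, len(times)):        # X_t is piecewise constant, so the
        I += h(states[i - 1]) * (times[i] - times[i - 1])  # integral is a sum of rectangles
    I += h(states[-1]) * (T - times[-1])  # final segment from t_n to T
    return n, I
```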

SLIDE 9

Stochastic kinetic models Inference

Inference

Maximum likelihood estimation:

$$\hat{\theta}_j = \frac{n_j}{I_j}$$

Conjugate Bayesian inference:

$$p(\theta) = \prod_{j=1}^{M} \mathrm{Ga}(\theta_j;\, a_j, b_j), \qquad p(\theta \mid X) = \prod_{j=1}^{M} \mathrm{Ga}(\theta_j;\, a_j + n_j,\, b_j + I_j), \qquad E[\theta_j \mid X] = \frac{a_j + n_j}{b_j + I_j}.$$
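A sketch of the corresponding conjugate update given the sufficient statistics above; the prior shape vector a and rate vector b are placeholders to be chosen by the analyst:

```python
import numpy as np

def complete_data_posterior(n, I, a, b, draws=1000, rng=None):
    """Independent Gamma posteriors: theta_j | X ~ Ga(a_j + n_j, b_j + I_j)
    (shape/rate). Returns posterior draws and posterior means."""
    rng = rng or np.random.default_rng()
    shape, rate = a + n, b + I
    samples = rng.gamma(shape, 1.0 / rate, size=(draws, len(shape)))  # NumPy uses scale = 1/rate
    return samples, shape / rate  # E[theta_j | X] = (a_j + n_j) / (b_j + I_j)
```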

SLIDE 10

Stochastic kinetic models Inference

Michaelis-Menten Complete Data Inference

[Figure: complete-data posterior densities for the rates of the three reactions S + E → SE, SE → S + E, and SE → P + E]

SLIDE 11

Stochastic kinetic models Discrete observations

Discrete observations

Suppose you only observe the system at discrete times. For simplicity, observe the system at times t = 1, 2, . . . , T. At these times, we observe the full system state, yt = Xt, but we do not observe the system between these times.

SLIDE 12

Stochastic kinetic models Discrete observations

Michaelis-Menten discrete observations

[Figure: the same trajectories, with the counts of S, E, SE, and P observed only at discrete times]

SLIDE 13

Stochastic kinetic models Inference

Inference

Inference is still performed based on the likelihood L(θ) = p(y|θ), where p(t, y) is the solution of the chemical master equation

$$\frac{\partial}{\partial t} p(t, y) = \sum_{j=1}^{M} \left[ a_j(y - \nu_j)\, p(t, y - \nu_j) - a_j(y)\, p(t, y) \right].$$

For constitutive production, h(Xt) = 1 and a(Xt) = θ, and we still have

$$L(\theta) \propto \theta^n \exp(-\theta I) \quad \text{with} \quad n = y_T - y_0 \quad \text{and} \quad I = \int_0^T 1\,dt = T.$$
SLIDE 14

Stochastic kinetic models Reversible isomerization

Reversible isomerization

[Figure: a zoomed view of simulated counts over time for the reversible isomerization example]

How many reactions occurred in the interval [30, 31]? What is $\int_{30}^{31} X_{SE}\,dt$?

SLIDE 15

Stochastic kinetic models Summary

Summary

With complete observations and independent gamma priors, the posterior is

$$p(\theta \mid X) = \prod_{j=1}^{M} \mathrm{Ga}(\theta_j;\, a_j + n_j,\, b_j + I_j)$$

where

$$n_j = \sum_{i=1}^{n} \mathrm{I}(r_i = j), \qquad I_j = \int_0^T h_j(X_t)\,dt = \sum_{i=1}^{n} h_j(X_{t_{i-1}})(t_i - t_{i-1}) + h_j(X_{t_n})\,[T - t_n].$$

For discrete observations, the likelihood is analytically intractable and therefore no closed form exists for the posterior (or MLEs).

SLIDE 16

Sampling methods

The idea

But we can simulate from the model using the Gillespie algorithm!! Intuitively, if we

  • 1. pick a set of parameters,
  • 2. simulate a realization using these parameters,
  • 3. and it matches our data,
  • 4. then these parameters should be reasonable.

Our goal is to formalize this through

  • 1. Rejection sampling
  • 2. Gibbs sampling

SLIDE 17

Sampling methods

Simulations from the prior

[Figure: prior predictive simulations of the product P over time]

SLIDE 18

Sampling methods Rejection sampling

Rejection sampling

Our objective is samples from the posterior

$$p(\theta \mid y) = \int p(\theta, X \mid y)\,dX \propto \int p(y \mid X)\, p(X \mid \theta)\, p(\theta)\,dX = \int \prod_{t=1}^{T} \mathrm{I}(y_t = X_t)\, p(X \mid \theta)\, p(\theta)\,dX.$$

A rejection sampling procedure is (a code sketch follows the steps):

  • 1. Sample θ ∼ p(θ).
  • 2. Sample X ∼ p(X|θ) a.k.a. Gillespie
  • 3. If yt = Xt for t = 1, 2, . . . , T, then
  • 4. θ is a sample from p(θ|y) and
  • 5. θ, X is a sample from p(θ, X|y).
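A minimal sketch of this exact-match sampler, reusing the illustrative gillespie helper from the earlier slide and assuming independent Gamma priors and full-state observations y_1, . . . , y_T at integer times:

```python
import numpy as np

def rejection_sample(y, prior_a, prior_b, h, nu, x0, n_samples, rng=None):
    """Keep theta only when the simulated path reproduces the data exactly
    at every observation time t = 1, ..., T."""
    rng = rng or np.random.default_rng()
    T = len(y)
    kept = []
    while len(kept) < n_samples:
        theta = rng.gamma(prior_a, 1.0 / prior_b)            # 1. theta ~ p(theta)
        times, states = gillespie(x0, theta, h, nu, T, rng)  # 2. X ~ p(X | theta), Gillespie
        # state at integer time t is the value at the last jump with time <= t
        idx = np.searchsorted(times, np.arange(1, T + 1), side="right") - 1
        if np.array_equal(states[idx], np.asarray(y)):       # 3. y_t = X_t for all t?
            kept.append(theta)                               # 4. accept theta
    return np.array(kept)
```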

SLIDE 19

Sampling methods Gibbs sampling

Gibbs sampling

Our objective is samples from the posterior

$$p(\theta \mid y) = \int p(\theta, X \mid y)\,dX \propto \int p(y \mid X)\, p(X \mid \theta)\, p(\theta)\,dX.$$

A Gibbs sampling procedure is (a code sketch follows below):

  • 1. Start with θ(0), X(0)
  • 2. For k = 1, . . . , K,
  • a. Sample θ(k) ∼ p(θ|X(k−1))
  • b. Sample X(k) ∼ p(X|θ(k), y) a.k.a. rejection sampling

θ(k), X(k) converge to samples from p(θ, X|y)
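A sketch of this scheme under the same assumptions: the θ-update is the conjugate Gamma step from the complete-data slide, and simulate_matching_path is a hypothetical helper that repeats the Gillespie simulation until y_t = X_t at all observation times and returns the jump times, states, and reaction indicators.

```python
import numpy as np

def gibbs_sample(y, prior_a, prior_b, h, nu, x0, M, K, rng=None):
    """Alternate theta | X (conjugate Gamma) and X | theta, y (rejection)."""
    rng = rng or np.random.default_rng()
    T = len(y)
    theta = rng.gamma(prior_a, 1.0 / prior_b)                   # theta^(0) from the prior
    path = simulate_matching_path(x0, theta, h, nu, y, T, rng)  # X^(0) ~ p(X | theta^(0), y)
    draws = []
    for _ in range(K):
        # a. theta^(k) ~ p(theta | X^(k-1)): Ga(a_j + n_j, b_j + I_j)
        n, I = sufficient_stats(*path, h, M, T)
        theta = rng.gamma(prior_a + n, 1.0 / (prior_b + I))
        # b. X^(k) ~ p(X | theta^(k), y): rejection via repeated simulation
        path = simulate_matching_path(x0, theta, h, nu, y, T, rng)
        draws.append(theta)
    return np.array(draws)
```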

SLIDE 20

Approximate Bayesian computation

An approximate posterior

Intuitively, if we

  • 1. pick a set of parameters,
  • 2. simulate a realization using these parameters,
  • 3. and it is similar to our data,
  • 4. then these parameters should be reasonable.

We can formalize this using

Approximate Bayesian computation

SLIDE 21

Approximate Bayesian computation The Approximation

Approximate Bayesian computation (ABC)

Our approximate objective is samples from

$$p(\theta \mid \rho \le \varepsilon) = \int p(\theta, X \mid \rho \le \varepsilon)\,dX \propto \int \mathrm{I}(\rho \le \varepsilon)\, p(X \mid \theta)\, p(\theta)\,dX,$$

where ρ = ρ(y, X) is a measure of the difference between your data y and simulations X. The choice of ε reflects a tension between computability and accuracy:

  • As ε → ∞, $p(\theta \mid \rho \le \varepsilon) \xrightarrow{d} p(\theta)$ and the acceptance probability converges to 1.
  • As ε → 0, $p(\theta \mid \rho \le \varepsilon) \xrightarrow{d} p(\theta \mid y)$ and the acceptance probability decreases.

SLIDE 22

Approximate Bayesian computation ABC rejection sampling

ABC rejection sampling

Let $\rho = \sum_{t=1}^{n} |y_t - X_t|$ and ε = n, i.e. an average discrepancy of at most one per observation.

An ABC rejection sampling procedure is (a code sketch follows the steps):

  • 1. Sample θ ∼ p(θ)
  • 2. Sample X ∼ p(X|θ) a.k.a. Gillespie
  • 3. If ρ(y, X) ≤ ǫ, then
  • 4. θ is a sample from p(θ|ρ ≤ ǫ) and
  • 5. θ, X is a sample from p(θ, X|ρ ≤ ǫ).
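The same sketch as before with the exact-match test relaxed to ρ(y, X) ≤ ε, again reusing the illustrative gillespie helper:

```python
import numpy as np

def abc_rejection(y, prior_a, prior_b, h, nu, x0, eps, n_samples, rng=None):
    """Accept theta whenever the simulated path is within total L1
    distance eps of the data at the observation times."""
    rng = rng or np.random.default_rng()
    T = len(y)
    kept = []
    while len(kept) < n_samples:
        theta = rng.gamma(prior_a, 1.0 / prior_b)            # 1. theta ~ p(theta)
        times, states = gillespie(x0, theta, h, nu, T, rng)  # 2. X ~ p(X | theta), Gillespie
        idx = np.searchsorted(times, np.arange(1, T + 1), side="right") - 1
        rho = np.abs(states[idx] - np.asarray(y)).sum()      # rho = sum_t |y_t - X_t|
        if rho <= eps:                                       # 3. accept iff rho <= eps
            kept.append(theta)
    return np.array(kept)
```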

SLIDE 23

Approximate Bayesian computation ABC Gibbs sampling

ABC Gibbs sampling

Let $\rho = \sum_{t=1}^{n} |y_t - X_t|$ and ε = n, i.e. an average discrepancy of at most one per observation.

An ABC Gibbs sampling procedure is (the modified X-update is sketched below):

  • 1. Start with θ(0), X(0)
  • 2. For k = 1, . . . , K,
  • a. Sample θ(k) ∼ p(θ|X(k−1))
  • b. Sample X(k) ∼ p(X|θ(k), ρ ≤ ε) a.k.a. rejection sampling

θ(k), X(k) converge to samples from p(θ, X|ρ ≤ ε)
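Relative to the exact Gibbs sketch, only the X-update changes: the resimulation step accepts any path with ρ ≤ ε rather than an exact match. A sketch of that replacement step, where gillespie_with_reactions is a hypothetical simulator that also returns the reaction indicators:

```python
import numpy as np

def abc_matching_path(x0, theta, h, nu, y, eps, T, rng):
    """ABC analogue of simulate_matching_path: resimulate until rho <= eps."""
    while True:
        times, states, reactions = gillespie_with_reactions(x0, theta, h, nu, T, rng)
        idx = np.searchsorted(times, np.arange(1, T + 1), side="right") - 1
        if np.abs(states[idx] - np.asarray(y)).sum() <= eps:  # rho = sum_t |y_t - X_t|
            return times, states, reactions
```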

SLIDE 24

Approximate Bayesian computation Gibbs sampling example

Michaelis-Menten system

$$E + S \;\underset{\theta_2}{\overset{\theta_1}{\rightleftharpoons}}\; ES \;\overset{\theta_3}{\longrightarrow}\; E + P$$

Table: Measurements taken from a simulated Michaelis-Menten system with parameters θ1 = 0.001, θ2 = 0.2, and θ3 = 0.1.

Time |   0 |  10 |  20 |  30 |  40 |  50 |  60 |  70 |  80 |  90 | 100
E    | 120 |  71 |  76 |  81 |  80 |  90 |  90 | 104 | 103 | 109 | 109
S    | 301 | 219 | 180 | 150 | 108 |  86 |  61 |  52 |  35 |  29 |  22

SLIDE 25

Approximate Bayesian computation Gibbs sampling example

With ε = 0 (i.e. draws from p(θ|y)):

[Figure: marginal posteriors p(θ1|y), p(θ2|y), and p(θ3|y); the joint posterior of (θ1, θ2); posteriors of the derived quantities KD and KM; and simulated trajectories of Substrate, Product, Enzyme, and ES-complex over time]

SLIDE 26

Approximate Bayesian computation Gibbs sampling example

Since rejection sampling is inherently parallel, the algorithm can be run on a graphics processing unit (GPU):

[Figure: CPU/GPU system time versus expected number of samples, and GPU time per iteration in seconds]

SLIDE 27

Summary

Summary

Bayesian inference in discretely observed SCKMs

  • Goal: p(θ|y) ∝ p(y|θ)p(θ)
  • The likelihood L(θ) = p(y|θ) is analytically intractable
  • Sampling methods are required, e.g. rejection and/or Gibbs
  • The acceptance rate can be unacceptably low

Approximate Bayesian computation (ABC) in SCKMs

  • Goal: p(θ|ρ ≤ ε) ∝ p(ρ ≤ ε|θ)p(θ)
  • ρ = ρ(y, X) measures the difference between the data and a simulation
  • ε balances computability with accuracy
  • Readily accommodates bounded errors, e.g. yt = Xt ± ε

ABC generally

  • More general than SCKMs, e.g. phylogenetic trees
  • Building ρ is an art; sufficient statistics of the data are often used
  • Not useful for unbounded errors, e.g. yt = Xt + εt, εt ∼ N(0, σ²)
  • There is current debate about its usefulness for model selection
