

SLIDE 1

Multilevel Monte Carlo for the continuous time Markov chain models arising in biology

David F. Anderson∗

∗anderson@math.wisc.edu

Department of Mathematics University of Wisconsin - Madison

Work is joint with Des Higham. MCQMC 2012, February 17th, 2012.

SLIDE 2

Overview

◮ The most common stochastic models of biochemical reaction systems are continuous time Markov chains.

◮ Common examples include:

  • 1. Gene regulatory networks.
  • 2. Models of viral infection.
  • 3. General population models (epidemic, predator-prey, etc.)

◮ These models are becoming dominant in molecular biology.

Problem: extend multi-level Monte Carlo to this setting.

SLIDE 3

Why?

  • 1. Why would this be useful?

◮ Short answer: the common simulation methods can be very computationally expensive.

◮ Many reactions happening over the time-frame of interest.

  • 2. Why is this non-trivial?

◮ In the SDE case, we have a useful representation:

X(t) = X(0) + ∫₀ᵗ b(X(s)) ds + ∫₀ᵗ σ(X(s)) dW(s),

Zℓ(t) = Zℓ(0) + ∫₀ᵗ b(Zℓ ∘ η(s)) ds + ∫₀ᵗ σ(Zℓ ∘ η(s)) dW(s).

◮ The two processes are coupled via the Brownian motion W.
◮ It is clear how to simulate the two processes with different time steps.
◮ We need a pathwise representation to get a good coupling that:

2.1 couples closely, and
2.2 is easy to analyze.

SLIDE 4

The Poisson process

A Poisson process, Y(·):
(a) Let {ξi} be i.i.d. exponential random variables with parameter one.
(b) Now, put points down on a line with spacings equal to the ξi.

[Figure: points laid on a time line with spacings ξ1, ξ2, ξ3, . . . , up to time t]

◮ Let Y(t) denote the number of points hit by time t.
◮ In the figure above, Y(t) = 6.
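This construction can be sketched directly in code. The following Python snippet (the editor's illustration, not part of the talk; all function names are made up) builds a unit-rate Poisson process from i.i.d. Exp(1) spacings and counts points up to time t:

```python
import random

def unit_poisson_jumps(t_max, rng):
    """Lay points on [0, t_max] with i.i.d. Exp(1) spacings xi_i."""
    times, t = [], 0.0
    while True:
        t += rng.expovariate(1.0)  # xi_i ~ Exp(1)
        if t > t_max:
            return times
        times.append(t)

def Y(t, jump_times):
    """Y(t) = number of points hit by time t."""
    return sum(1 for s in jump_times if s <= t)

rng = random.Random(0)
t_max = 3.0
n = 20000
# Sanity check: E[Y(t)] = t for a unit-rate Poisson process.
mean = sum(len(unit_poisson_jumps(t_max, rng)) for _ in range(n)) / n
print(round(mean, 2))  # close to 3.0
```

The empirical mean of Y(3) over many paths should sit near 3, since a unit-rate process has E[Y(t)] = t.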

SLIDE 5

The Poisson process

Let Y be a unit-rate Poisson process and λ ≥ 0. Then Y(λt) is a Poisson process with parameter λ.

[Figure: the same time line, now traversed at rate λ up to λt]

There is no reason λ needs to be constant in time, in which case

Yλ(t) ≡ Y(∫₀ᵗ λ(s) ds)

is a non-homogeneous Poisson process with intensity λ(t).
SLIDE 6

Build up model: Random time change representation of Tom Kurtz

Consider the simple system A + B → C, where one molecule each of A and B is converted to one of C. Simple book-keeping: if X(t) = (XA(t), XB(t), XC(t))ᵀ gives the state at time t, then

X(t) = X(0) + R(t) (−1, −1, 1)ᵀ,

where

◮ R(t) is the number of times the reaction has occurred by time t, and
◮ X(0) is the initial condition.

SLIDE 7

Build up model: Random time change representation of Tom Kurtz

Assuming the intensity (or propensity) of the reaction is λ(X(s)) = κ XA(s) XB(s), we can model

R(t) = Y(∫₀ᵗ κ XA(s) XB(s) ds),

where Y is a unit-rate Poisson process. Hence

(XA(t), XB(t), XC(t))ᵀ ≡ X(t) = X(0) + (−1, −1, 1)ᵀ Y(∫₀ᵗ κ XA(s) XB(s) ds).
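Because this system has a single reaction channel, the next jump of Y(∫₀ᵗ λ(X(s)) ds) arrives after an Exp(λ(X)) waiting time, so the chain can be simulated exactly in a few lines. A minimal Python sketch (editor's illustration, with an arbitrary rate constant κ = 0.3):

```python
import random

def simulate_abc(xa, xb, xc, kappa, t_max, rng):
    """Exact simulation of A + B -> C with intensity kappa*XA*XB.
    Each firing applies the state-change vector (-1, -1, +1)."""
    path = [(0.0, (xa, xb, xc))]
    t = 0.0
    while xa > 0 and xb > 0:
        lam = kappa * xa * xb
        t += rng.expovariate(lam)  # next jump of Y(int_0^t lam ds)
        if t > t_max:
            break
        xa, xb, xc = xa - 1, xb - 1, xc + 1
        path.append((t, (xa, xb, xc)))
    return path

rng = random.Random(1)
path = simulate_abc(10, 8, 0, 0.3, 50.0, rng)
print(path[-1][1])  # B is limiting here, so the chain stops once XB = 0
```

Note the book-keeping identities XA(t) + XC(t) = XA(0) and XB(t) + XC(t) = XB(0) hold along every path, exactly as the representation above dictates.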
SLIDE 8

Build up model: Random time change representation of Tom Kurtz

Now consider a network of reactions involving d chemical species S1, . . . , Sd:

νk1 S1 + νk2 S2 + · · · + νkd Sd → ν′k1 S1 + ν′k2 S2 + · · · + ν′kd Sd.

◮ Denote the reaction vector by ζk = ν′k − νk.
◮ The intensity (or propensity) of the kth reaction is λk : Zᵈ≥0 → R.
◮ By analogy with before,

X(t) = X(0) + Σk Yk(∫₀ᵗ λk(X(s)) ds) ζk,

where the Yk are independent, unit-rate Poisson processes.

SLIDE 9

Example

Consider a model of gene transcription and translation:

G → G + M,      rate 25,     (Transcription)
M → M + P,      rate 1000,   (Translation)
P + P → D,      rate 0.001,  (Dimerization)
M → ∅,          rate 0.1,    (Degradation of mRNA)
P → ∅,          rate 1.      (Degradation of Protein)

Then, if X = (XM, XP, XD)ᵀ,

X(t) = X(0) + Y1(25t) (1, 0, 0)ᵀ
       + Y2(1000 ∫₀ᵗ XM(s) ds) (0, 1, 0)ᵀ
       + Y3(0.001 ∫₀ᵗ XP(s)(XP(s) − 1) ds) (0, −2, 1)ᵀ
       + Y4(0.1 ∫₀ᵗ XM(s) ds) (−1, 0, 0)ᵀ
       + Y5(∫₀ᵗ XP(s) ds) (0, −1, 0)ᵀ.
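A minimal stochastic simulation algorithm (Gillespie) sketch for this network, written by the editor as an illustration of the representation above (G = 1 is held fixed, since the gene only acts as a catalyst):

```python
import random

# Rates and state-change vectors for the five reactions above; the
# state is (XM, XP, XD) and the gene count G = 1 is treated as constant.
RATES = (25.0, 1000.0, 0.001, 0.1, 1.0)
ZETA = ((1, 0, 0), (0, 1, 0), (0, -2, 1), (-1, 0, 0), (0, -1, 0))

def intensities(x):
    m, p, _ = x
    return (RATES[0], RATES[1] * m, RATES[2] * p * (p - 1),
            RATES[3] * m, RATES[4] * p)

def ssa(x, t_max, rng):
    """Gillespie's algorithm: exponential holding times, then choose a
    reaction with probability proportional to its intensity."""
    t = 0.0
    while True:
        lam = intensities(x)
        lam0 = sum(lam)
        t += rng.expovariate(lam0)
        if t > t_max:
            return x
        u, acc = rng.random() * lam0, 0.0
        for k, lk in enumerate(lam):
            acc += lk
            if u <= acc:
                break
        x = tuple(xi + z for xi, z in zip(x, ZETA[k]))

rng = random.Random(2)
x_T = ssa((0, 0, 0), 1.0, rng)
print(x_T)  # (XM, XP, XD) at T = 1; XD is typically in the thousands
```

Each path of this model requires on the order of tens of thousands of jump events per unit of time, which is exactly the cost issue the rest of the talk addresses.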

SLIDE 10

Computing Expectations

Problem: approximate Ef(X(T)) to some desired tolerance ε > 0. Easy!

◮ Simulate the CTMC exactly,
◮ generate independent paths X[i](t) and use the unbiased estimator

μn = (1/n) Σ_{i=1}^{n} f(X[i](T)),

◮ stopping when the desired confidence interval is ± ε.

Total computational complexity = (cost per path) × (# paths) = O(N × ε⁻²).
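The crude Monte Carlo estimator with its 95% confidence interval can be sketched generically (editor's illustration; checked here on a toy sampler with a known mean rather than a CTMC, so the target E = 1 is an assumption of the example):

```python
import random

def crude_mc(sampler, n, rng):
    """Crude Monte Carlo: sample mean plus a 95% confidence half-width."""
    xs = [sampler(rng) for _ in range(n)]
    mean = sum(xs) / n
    var = sum((x - mean) ** 2 for x in xs) / (n - 1)
    half = 1.96 * (var / n) ** 0.5  # 95% CI half-width
    return mean, half

rng = random.Random(3)
# Toy target: f(X) = X with X ~ Exp(1), so Ef(X) = 1 exactly.
mean, half = crude_mc(lambda r: r.expovariate(1.0), 200000, rng)
print(f"{mean:.3f} +/- {half:.3f}")
```

The half-width shrinks like n^{-1/2}, which is where the ε⁻² factor in the complexity count comes from.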

SLIDE 11

Tau-leaping: Euler’s method

Recall:

X(t) = X(0) + Σk Yk(∫₀ᵗ λk(X(s)) ds) ζk.

Tau-leaping is essentially an Euler approximation of ∫₀ᵗ λk(X(s)) ds:

Z(h) = Z(0) + Σk Yk(∫₀ʰ λk(Z(s)) ds) ζk
     ≈ Z(0) + Σk Yk(λk(Z(0)) h) ζk
     = Z(0) + Σk Poisson(λk(Z(0)) h) ζk,

the last equality holding in distribution.
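A generic tau-leaping step can be sketched as follows (editor's illustration; the helper names and the toy birth-death network are made up, and the Poisson sampler is a simple arrival-counting one rather than a library routine):

```python
import random

def poisson(lam, rng):
    """Poisson(lam) by counting unit-rate arrivals (fine for small lam)."""
    if lam <= 0.0:
        return 0
    t, k = rng.expovariate(1.0), 0
    while t <= lam:
        k += 1
        t += rng.expovariate(1.0)
    return k

def tau_leap(x, intensities, zeta, h, n_steps, rng):
    """Explicit tau-leaping: freeze each intensity at the left endpoint
    of a step of length h and fire Poisson(lambda_k(Z)*h) copies of k."""
    for _ in range(n_steps):
        lam = intensities(x)
        jumps = [poisson(lk * h, rng) for lk in lam]
        x = tuple(xi + sum(j * z[i] for j, z in zip(jumps, zeta))
                  for i, xi in enumerate(x))
    return x

# Toy birth-death chain: 0 -> S at rate 10, S -> 0 at unit per-capita rate.
rng = random.Random(4)
reps = 400
finals = [tau_leap((0,), lambda x: (10.0, 1.0 * x[0]), ((1,), (-1,)),
                   0.05, 200, rng)[0] for _ in range(reps)]
mean = sum(finals) / reps
print(round(mean, 1))  # the stationary mean of this chain is 10
```

Note this sketch does nothing to prevent negative copy numbers; that is one of the tau-leaping drawbacks discussed below.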
SLIDE 12

Euler’s method

The path-wise representation for Z(t) generated by Euler's method is

Z(t) = X(0) + Σk Yk(∫₀ᵗ λk(Z ∘ η(s)) ds) ζk,

where η(s) = tn if tn ≤ s < tn+1 = tn + h is the step function giving the left endpoints of the time discretization.

SLIDE 13

Return to approximating Ef(X(T))

Let

μn = (1/n) Σ_{i=1}^{n} f(ZL,[i](T)).

We have

MSE(μn) = (Bias)² + (Variance).

If we have an order one method,

Total computational complexity = (cost per path) × (# paths) = O(ε⁻¹ × ε⁻²) = O(ε⁻³).

SLIDE 14

Benefits/drawbacks

Benefits:

  • 1. Can drastically lower the computational complexity of a problem.

Drawbacks:

  • 1. Convergence results usually give only an order of convergence; they cannot give a precise hL.
  • 2. Tau-leaping has problems: what happens if a species count goes negative?
  • 3. We have given up an unbiased estimator.
SLIDE 15

Multi-level Monte Carlo

Usual MLMC:

Ef(X(t)) ≈ E[f(ZL(t)) − f(ZL−1(t))] + Σ_{ℓ=ℓ0+1}^{L−1} E[f(Zℓ(t)) − f(Zℓ−1(t))] + Ef(Zℓ0(t)).

In our setting:

Ef(X(t)) = E[f(X(t)) − f(ZL(t))] + Σ_{ℓ=ℓ0+1}^{L} E[f(Zℓ(t)) − f(Zℓ−1(t))] + Ef(Zℓ0(t)),

which gives an unbiased estimator.

SLIDE 16

Multi-level Monte Carlo: an unbiased estimator

Ef(X(t)) = E[f(X(t)) − f(ZL(t))] + Σ_{ℓ=ℓ0+1}^{L} E[f(Zℓ(t)) − f(Zℓ−1(t))] + Ef(Zℓ0(t)).

Estimators:

QE := (1/nE) Σ_{i=1}^{nE} (f(X[i](T)) − f(ZL,[i](T))),

Qℓ := (1/nℓ) Σ_{i=1}^{nℓ} (f(Zℓ,[i](T)) − f(Zℓ−1,[i](T))), for ℓ ∈ {ℓ0 + 1, . . . , L},

Q0 := (1/n0) Σ_{i=1}^{n0} f(Zℓ0,[i](T)).

Question: what is the coupling, and what is the computational cost?

SLIDE 17

How do we couple Poisson processes

Suppose I want to generate:

◮ a Poisson process with intensity 13.1, and
◮ a Poisson process with intensity 13.

We could let Y1 and Y2 be independent, unit-rate Poisson processes and set

Z13.1(t) = Y1(13.1 t),    Z13(t) = Y2(13 t).

Using this representation, the two processes are independent and, hence, not coupled. The variance of the difference is large:

Var(Z13.1(t) − Z13(t)) = Var(Y1(13.1 t)) + Var(Y2(13 t)) = 26.1 t.

SLIDE 18

How do we generate processes simultaneously

Suppose I want to generate:

◮ a Poisson process with intensity 13.1, and
◮ a Poisson process with intensity 13.

We could instead let Y1 and Y2 be independent, unit-rate Poisson processes and set

Z13.1(t) = Y1(13 t) + Y2(0.1 t),    Z13(t) = Y1(13 t).

The variance of the difference is much smaller:

Var(Z13.1(t) − Z13(t)) = Var(Y2(0.1 t)) = 0.1 t.
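The two constructions can be compared numerically. This Python sketch (editor's illustration; the arrival-counting Poisson sampler is an assumption of the example) estimates Var(Z13.1(1) − Z13(1)) both ways, where the true values are 26.1 and 0.1:

```python
import random

def poisson(lam, rng):
    """Poisson(lam) by counting unit-rate arrivals."""
    if lam <= 0.0:
        return 0
    t, k = rng.expovariate(1.0), 0
    while t <= lam:
        k += 1
        t += rng.expovariate(1.0)
    return k

def sample_var(samples):
    m = sum(samples) / len(samples)
    return sum((s - m) ** 2 for s in samples) / (len(samples) - 1)

rng = random.Random(5)
n, t = 20000, 1.0

# Independent construction: Z13.1 = Y1(13.1 t), Z13 = Y2(13 t).
indep = [poisson(13.1 * t, rng) - poisson(13.0 * t, rng) for _ in range(n)]

# Coupled construction: Z13.1 = Y1(13 t) + Y2(0.1 t), Z13 = Y1(13 t),
# so the difference is just Y2(0.1 t).
coupled = []
for _ in range(n):
    y1 = poisson(13.0 * t, rng)
    y2 = poisson(0.1 * t, rng)
    coupled.append((y1 + y2) - y1)

var_i, var_c = sample_var(indep), sample_var(coupled)
print(round(var_i, 1), round(var_c, 2))  # near 26.1 and 0.1
```

The shared stream Y1 carries all the common randomness, which is exactly what makes the coupled difference cheap to estimate.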

SLIDE 19

How do we generate processes simultaneously

More generally, suppose we want

  • 1. a non-homogeneous Poisson process with intensity f(t), and
  • 2. a non-homogeneous Poisson process with intensity g(t).

Let Y1, Y2, and Y3 be independent, unit-rate Poisson processes and define

Zf(t) = Y1(∫₀ᵗ f(s) ∧ g(s) ds) + Y2(∫₀ᵗ f(s) − (f(s) ∧ g(s)) ds),

Zg(t) = Y1(∫₀ᵗ f(s) ∧ g(s) ds) + Y3(∫₀ᵗ g(s) − (f(s) ∧ g(s)) ds).
SLIDE 20

Back to our processes

X(t) = X(0) + Σk Yk(∫₀ᵗ λk(X(s)) ds) ζk,

Z(t) = X(0) + Σk Yk(∫₀ᵗ λk(Z ∘ η(s)) ds) ζk.

Now couple:

X(t) = X(0) + Σk Yk,1(∫₀ᵗ λk(X(s)) ∧ λk(Zℓ ∘ ηℓ(s)) ds) ζk
            + Σk Yk,2(∫₀ᵗ λk(X(s)) − λk(X(s)) ∧ λk(Zℓ ∘ ηℓ(s)) ds) ζk,

Zℓ(t) = Zℓ(0) + Σk Yk,1(∫₀ᵗ λk(X(s)) ∧ λk(Zℓ ∘ ηℓ(s)) ds) ζk
              + Σk Yk,3(∫₀ᵗ λk(Zℓ ∘ ηℓ(s)) − λk(X(s)) ∧ λk(Zℓ ∘ ηℓ(s)) ds) ζk.

The algorithm for simulating X is equivalent to simulating the CTMC exactly.

SLIDE 21

For approximate processes

Zℓ(t) = Zℓ(0) + Σk Yk,1(∫₀ᵗ λk(Zℓ ∘ ηℓ(s)) ∧ λk(Zℓ−1 ∘ ηℓ−1(s)) ds) ζk
              + Σk Yk,2(∫₀ᵗ λk(Zℓ ∘ ηℓ(s)) − λk(Zℓ ∘ ηℓ(s)) ∧ λk(Zℓ−1 ∘ ηℓ−1(s)) ds) ζk,

Zℓ−1(t) = Zℓ−1(0) + Σk Yk,1(∫₀ᵗ λk(Zℓ ∘ ηℓ(s)) ∧ λk(Zℓ−1 ∘ ηℓ−1(s)) ds) ζk
                  + Σk Yk,3(∫₀ᵗ λk(Zℓ−1 ∘ ηℓ−1(s)) − λk(Zℓ ∘ ηℓ(s)) ∧ λk(Zℓ−1 ∘ ηℓ−1(s)) ds) ζk.

The algorithm for simulation is equivalent to Euler's method. What about the variance and, hence, the computational cost?
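This level-to-level coupling can be sketched for tau-leaping (editor's illustration on a made-up birth-death toy model; the per-reaction split into a shared stream Y_{k,1} at rate λ_fine ∧ λ_coarse plus residual streams follows the display above):

```python
import random

def poisson(lam, rng):
    """Poisson(lam) by counting unit-rate arrivals (fine for small lam)."""
    if lam <= 0.0:
        return 0
    t, k = rng.expovariate(1.0), 0
    while t <= lam:
        k += 1
        t += rng.expovariate(1.0)
    return k

def coupled_pair(x0, intensities, zeta, h, M, n_coarse, rng):
    """One coupled pair (Z_fine, Z_coarse): fine step h, coarse step M*h.
    Per reaction k the shared stream carries rate min(lam_f, lam_c);
    the residual streams carry what is left on each side."""
    zf = zc = x0
    for _ in range(n_coarse):
        lam_c = intensities(zc)          # frozen over the coarse step
        for _ in range(M):
            lam_f = intensities(zf)      # frozen over the fine step
            for k, zk in enumerate(zeta):
                a = min(lam_f[k], lam_c[k])
                shared = poisson(a * h, rng)                    # Y_{k,1}
                fine_only = poisson((lam_f[k] - a) * h, rng)    # Y_{k,2}
                coarse_only = poisson((lam_c[k] - a) * h, rng)  # Y_{k,3}
                zf = tuple(v + (shared + fine_only) * zk[i]
                           for i, v in enumerate(zf))
                zc = tuple(v + (shared + coarse_only) * zk[i]
                           for i, v in enumerate(zc))
    return zf, zc

# Toy birth-death chain: 0 -> S at rate 10, S -> 0 at unit per-capita rate.
rng = random.Random(6)
diffs = []
for _ in range(1000):
    zf, zc = coupled_pair((0,), lambda x: (10.0, 1.0 * x[0]),
                          ((1,), (-1,)), 0.02, 3, 20, rng)
    diffs.append(zf[0] - zc[0])
m = sum(diffs) / len(diffs)
v = sum((d - m) ** 2 for d in diffs) / (len(diffs) - 1)
print(round(m, 2), round(v, 2))
```

The estimated variance of Z_fine(T) − Z_coarse(T) comes out far below what two independent paths would give, which is what makes the level estimators Qℓ cheap.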

SLIDE 22

Multi-level Monte Carlo: chemical kinetic setting

Can prove:

Theorem (Anderson, Higham 2011)
Suppose (X, Zℓ) satisfy the coupling. Then

sup_{t≤T} E|X(t) − Zℓ(t)|² ≤ C1(T) N^{−ρ} hℓ + C2(T) hℓ².

Theorem (Anderson, Higham 2011)
Suppose (Zℓ, Zℓ−1) satisfy the coupling. Then

sup_{t≤T} E|Zℓ(t) − Zℓ−1(t)|² ≤ C1(T) N^{−ρ} hℓ + C2(T) hℓ².

David F. Anderson and Desmond J. Higham, Multi-level Monte Carlo for stochastically modeled chemical kinetic systems. To appear in SIAM Multiscale Modeling and Simulation. Available at arXiv:1107.2181 and at www.math.wisc.edu/~anderson.

SLIDE 23

Multi-level Monte Carlo: an unbiased estimator

For well-chosen n0, nℓ, and nE, we have

Var(Q) = Var(QE + Σ_{ℓ=ℓ0+1}^{L} Qℓ + Q0) = O(ε²),

with

Comp. cost = ε⁻² (N^{−ρ} hL + hL²) N + ε⁻² (h_{ℓ0}^{−1} + ln(ε)² N^{−ρ} + ln(ε⁻¹) (1/(M − 1)) h_{ℓ0}).

SLIDE 24

Multi-level Monte Carlo: an unbiased estimator

Some observations:

  • 1. The weak error plays no role in the analysis: we are free to choose hL.
  • 2. The common problems associated with tau-leaping,

◮ such as negativity of species numbers,

do not matter. Just define the approximate process in a sensible way.

  • 3. The method is unbiased.
SLIDE 25

Example

Consider a model of gene transcription and translation:

G → G + M,      rate 25,
M → M + P,      rate 1000,
P + P → D,      rate 0.001,
M → ∅,          rate 0.1,
P → ∅,          rate 1.

Suppose we:

  • 1. initialize with G = 1, M = 0, P = 0, D = 0,
  • 2. want to estimate the expected number of dimers at time T = 1,
  • 3. to an accuracy of ± 1.0 with 95% confidence.
SLIDE 26

Example

Method: exact algorithm with crude Monte Carlo.

  Approximation   | # paths   | CPU Time                    | # updates
  3,714.2 ± 1.0   | 4,740,000 | 149,000 CPU s (41 hours!)   | 8.27 × 10¹⁰

Method: Euler tau-leaping with crude Monte Carlo.

  Step-size | Approximation   | # paths   | CPU Time   | # updates
  h = 3⁻⁷   | 3,712.3 ± 1.0   | 4,750,000 | 13,374.6 s | 6.2 × 10¹⁰
  h = 3⁻⁶   | 3,707.5 ± 1.0   | 4,750,000 | 6,207.9 s  | 2.1 × 10¹⁰
  h = 3⁻⁵   | 3,693.4 ± 1.0   | 4,700,000 | 2,803.9 s  | 6.9 × 10⁹
  h = 3⁻⁴   | 3,654.6 ± 1.0   | 4,650,000 | 1,219.0 s  | 2.6 × 10⁹

Method: unbiased MLMC with ℓ0 = 2, and M and L detailed below.

  Step-size parameters | Approx.         | CPU Time  | # updates
  M = 3, L = 6         | 3,713.9 ± 1.0   | 1,063.3 s | 1.1 × 10⁹
  M = 3, L = 5         | 3,714.7 ± 1.0   | 1,114.9 s | 9.4 × 10⁸
  M = 3, L = 4         | 3,714.2 ± 1.0   | 1,656.6 s | 1.0 × 10⁹
  M = 4, L = 4         | 3,714.2 ± 1.0   | 1,334.8 s | 1.1 × 10⁹
  M = 4, L = 5         | 3,713.8 ± 1.0   | 1,014.9 s | 1.1 × 10⁹

◮ The exact algorithm with crude Monte Carlo demanded 140 times more CPU time than our unbiased MLMC estimator!

SLIDE 27

Example

Method: exact algorithm with crude Monte Carlo.

  Approximation   | # paths   | CPU Time                    | # updates
  3,714.2 ± 1.0   | 4,740,000 | 149,000 CPU s (41 hours!)   | 8.27 × 10¹⁰

Unbiased multi-level Monte Carlo with M = 3, L = 5, and ℓ0 = 2.

  Level                   | # paths   | CPU Time  | Var. estimator | # updates
  (X, Z_{3⁻⁵})            | 3,900     | 279.6 s   | 0.0658         | 6.8 × 10⁷
  (Z_{3⁻⁵}, Z_{3⁻⁴})      | 30,000    | 49.0 s    | 0.0217         | 8.8 × 10⁷
  (Z_{3⁻⁴}, Z_{3⁻³})      | 150,000   | 71.7 s    | 0.0179         | 1.5 × 10⁸
  (Z_{3⁻³}, Z_{3⁻²})      | 510,000   | 112.3 s   | 0.0319         | 1.7 × 10⁸
  Tau-leap with h = 3⁻²   | 8,630,000 | 518.4 s   | 0.1192         | 4.7 × 10⁸
  Totals                  | N.A.      | 1,031.0 s | 0.2565         | 9.5 × 10⁸

SLIDE 28

Some conclusions about this method

  • 1. Gillespie’s algorithm is by far the most common way to compute expectations:

1.1 means,
1.2 variances,
1.3 probabilities.

  • 2. The new method (MLMC) also performs this task with no bias (it is exact).
  • 3. It will commonly be many orders of magnitude faster than the usual methods.
  • 4. It is applicable to essentially all continuous time Markov chains:

X(t) = X(0) + Σk Yk(∫₀ᵗ λk(X(s)) ds) ζk.

  • 5. It makes no use of any specific structure or scaling in the problem.
SLIDE 29

Another example: Viral infection

Let

  • 1. T = viral template,
  • 2. G = viral genome,
  • 3. S = viral structure,
  • 4. V = virus.

Reactions:

  R1) T + stuff → T + G,   κ1 = 1
  R2) G → T,               κ2 = 0.025
  R3) T + stuff → T + S,   κ3 = 1000
  R4) T → ∅,               κ4 = 0.25
  R5) S → ∅,               κ5 = 2
  R6) G + S → V,           κ6 = 7.5 × 10⁻⁶

◮ R. Srivastava, L. You, J. Summers, and J. Yin, J. Theoret. Biol., 2002.
◮ E. Haseltine and J. Rawlings, J. Chem. Phys., 2002.
◮ K. Ball, T. Kurtz, L. Popovic, and G. Rempala, Annals of Applied Probability, 2006.
◮ W. E, D. Liu, and E. Vanden-Eijnden, J. Comput. Phys., 2006.

SLIDE 30

Another example: Viral infection

The stochastic equations for X = (XG, XS, XT, XV) are

X1(t) = X1(0) + Y1(∫₀ᵗ X3(s) ds) − Y2(0.025 ∫₀ᵗ X1(s) ds) − Y6(7.5 × 10⁻⁶ ∫₀ᵗ X1(s) X2(s) ds),

X2(t) = X2(0) + Y3(1000 ∫₀ᵗ X3(s) ds) − Y5(2 ∫₀ᵗ X2(s) ds) − Y6(7.5 × 10⁻⁶ ∫₀ᵗ X1(s) X2(s) ds),

X3(t) = X3(0) + Y2(0.025 ∫₀ᵗ X1(s) ds) − Y4(0.25 ∫₀ᵗ X3(s) ds),

X4(t) = X4(0) + Y6(7.5 × 10⁻⁶ ∫₀ᵗ X1(s) X2(s) ds).
SLIDE 31

Another example: Viral infection

Reactions:

  R1) T + stuff → T + G,   κ1 = 1
  R2) G → T,               κ2 = 0.025
  R3) T + stuff → T + S,   κ3 = 1000
  R4) T → ∅,               κ4 = 0.25
  R5) S → ∅,               κ5 = 2
  R6) G + S → V,           κ6 = 7.5 × 10⁻⁶

If T > 0,

◮ reactions 3 and 5 are much faster than the others;
◮ by an M/M/∞ queue argument, S is approximately Poisson(500 × T).

We can average out S to get an approximate process Z(t).

SLIDE 32

Another example: Viral infection

The approximate process satisfies

Z1(t) = X1(0) + Y1(∫₀ᵗ Z3(s) ds) − Y2(0.025 ∫₀ᵗ Z1(s) ds) − Y6(3.75 × 10⁻³ ∫₀ᵗ Z1(s) Z3(s) ds),

Z3(t) = X3(0) + Y2(0.025 ∫₀ᵗ Z1(s) ds) − Y4(0.25 ∫₀ᵗ Z3(s) ds),

Z4(t) = X4(0) + Y6(3.75 × 10⁻³ ∫₀ᵗ Z1(s) Z3(s) ds).   (1)

Now use

Ef(X(t)) = E[f(X(t)) − f(Z(t))] + Ef(Z(t)).

SLIDE 33

Another example: Viral infection

The coupled pair is

X(t) = X(0)
  + Y1,1(∫₀ᵗ min{X3(s), Z3(s)} ds) ζ1 + Y1,2(∫₀ᵗ X3(s) − min{X3(s), Z3(s)} ds) ζ1
  + Y2,1(0.025 ∫₀ᵗ min{X1(s), Z1(s)} ds) ζ2 + Y2,2(0.025 ∫₀ᵗ X1(s) − min{X1(s), Z1(s)} ds) ζ2
  + Y3(1000 ∫₀ᵗ X3(s) ds) ζ3
  + Y4,1(0.25 ∫₀ᵗ min{X3(s), Z3(s)} ds) ζ4 + Y4,2(0.25 ∫₀ᵗ X3(s) − min{X3(s), Z3(s)} ds) ζ4
  + Y5(2 ∫₀ᵗ X2(s) ds) ζ5
  + Y6,1(∫₀ᵗ min{λ6(X(s)), Λ6(Z(s))} ds) ζ6 + Y6,2(∫₀ᵗ λ6(X(s)) − min{λ6(X(s)), Λ6(Z(s))} ds) ζ6,

Z(t) = Z(0)
  + Y1,1(∫₀ᵗ min{X3(s), Z3(s)} ds) ζ1 + Y1,3(∫₀ᵗ Z3(s) − min{X3(s), Z3(s)} ds) ζ1
  + Y2,1(0.025 ∫₀ᵗ min{X1(s), Z1(s)} ds) ζ2 + Y2,3(0.025 ∫₀ᵗ Z1(s) − min{X1(s), Z1(s)} ds) ζ2
  + Y4,1(0.25 ∫₀ᵗ min{X3(s), Z3(s)} ds) ζ4 + Y4,3(0.25 ∫₀ᵗ Z3(s) − min{X3(s), Z3(s)} ds) ζ4
  + Y6,1(∫₀ᵗ min{λ6(X(s)), Λ6(Z(s))} ds) ζ6 + Y6,3(∫₀ᵗ Λ6(Z(s)) − min{λ6(X(s)), Λ6(Z(s))} ds) ζ6.
SLIDE 34

Another example: Viral infection

Suppose we want E XV(20), given T(0) = 10 and all other species zero.

Method: exact algorithm with crude Monte Carlo.

  Approximation   | # paths | CPU Time       | # updates
  13.85 ± 0.07    | 75,000  | 24,800 CPU s   | 1.45 × 10¹⁰

Method: Ef(X(t)) = E[f(X(t)) − f(Z(t))] + Ef(Z(t)).

  Approximation   | CPU Time        | # updates
  13.91 ± 0.07    | 1,118.5 CPU s   | 2.41 × 10⁸

Exact + crude Monte Carlo used:

  • 1. 60 times more total steps,
  • 2. 22 times more CPU time.
SLIDE 35

Thanks!

References:

  • 1. David F. Anderson and Desmond J. Higham, Multi-level Monte Carlo for continuous time Markov chains, with applications in biochemical kinetics, to appear in SIAM Multiscale Modeling and Simulation. Available at arXiv:1107.2181 and at www.math.wisc.edu/~anderson.

Funding:

  • 1. NSF grant DMS-1009275.
  • 2. Thanks to Art Owen.