SLIDE 1 Multilevel Monte Carlo for the continuous time Markov chain models arising in biology
David F. Anderson∗
∗anderson@math.wisc.edu
Department of Mathematics University of Wisconsin - Madison
Joint work with Des Higham. MCQMC 2012, February 17th, 2012
SLIDE 2 Overview
◮ Most common stochastic models of biochemical reaction systems are
continuous time Markov chains.
◮ Common examples include:
- 1. Gene regulatory networks.
- 2. Models of viral infection.
- 3. General population models (epidemic, predator-prey, etc.)
◮ These models are becoming dominant in molecular biology.
Problem: extend multi-level Monte Carlo to this setting.
SLIDE 3 Why?
- 1. Why would this be useful?
◮ Short answer: the common simulation methods can be very computationally
expensive.
◮ Many reactions happen over the time-frame of interest.
- 2. Why is this non-trivial?
◮ In the SDE case, we have a useful pathwise representation:
X(t) = X(0) + ∫_0^t b(X(s)) ds + ∫_0^t σ(X(s)) dW(s),
Zℓ(t) = Zℓ(0) + ∫_0^t b(Zℓ ∘ η(s)) ds + ∫_0^t σ(Zℓ ∘ η(s)) dW(s).
◮ The two processes are coupled via the Brownian motion W.
◮ It is clear how to simulate the two processes with different time steps.
◮ We need a pathwise representation in the CTMC setting to get a good coupling that:
2.1 couples closely, and 2.2 is easy to analyze.
SLIDE 4 The Poisson process
A Poisson process, Y(·):
(a) Let {ξi} be i.i.d. exponential random variables with parameter one.
(b) Now, put points down on a line with spacings equal to the ξi.
[Figure: points on a time axis, with successive gaps ξ1, ξ2, ξ3, . . .]
◮ Let Y(t) denote the number of points hit by time t.
◮ In the figure above, Y(t) = 6.
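The construction above is easy to sketch in code. A minimal Python illustration (the function name and the Monte Carlo check at the end are illustrative, not part of the slides):

```python
import numpy as np

def poisson_process_count(t, rng):
    """Y(t): count the points of a unit-rate Poisson process in [0, t]
    by summing i.i.d. Exp(1) spacings xi_i until they pass t."""
    total, count = 0.0, 0
    while True:
        total += rng.exponential(1.0)  # next spacing xi_i
        if total > t:
            return count
        count += 1

rng = np.random.default_rng(0)
samples = [poisson_process_count(10.0, rng) for _ in range(20000)]
print(np.mean(samples))  # sample mean; E[Y(10)] = 10
```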
SLIDE 5 The Poisson process
Let Y be a unit-rate Poisson process and λ ≥ 0. Then Y(λt) is a Poisson process with parameter λ.
[Figure: the same points, now plotted against the rescaled axis λt.]
There is no reason λ needs to be constant in time, in which case
Yλ(t) ≡ Y(∫_0^t λ(s) ds)
is a non-homogeneous Poisson process with intensity λ(t).
SLIDE 6
Build up model: Random time change representation of Tom Kurtz
Consider the simple system
A + B → C,
where one molecule each of A and B is converted to one molecule of C. Simple book-keeping: if X(t) = (XA(t), XB(t), XC(t))^T gives the state at time t, then
X(t) = X(0) + R(t) (−1, −1, 1)^T,
where
◮ R(t) is the number of times the reaction has occurred by time t, and
◮ X(0) is the initial condition.
SLIDE 7 Build up model: Random time change representation of Tom Kurtz
Assuming the intensity (or propensity) of the reaction is λ(X(s)) = κ XA(s)XB(s), we can model
R(t) = Y(∫_0^t κ XA(s)XB(s) ds),
where Y is a unit-rate Poisson process. Hence
X(t) = X(0) + (−1, −1, 1)^T Y(∫_0^t κ XA(s)XB(s) ds).
SLIDE 8 Build up model: Random time change representation of Tom Kurtz
Now consider a network of reactions involving d chemical species, S1, . . . , Sd:
νk1 S1 + νk2 S2 + · · · + νkd Sd → ν′k1 S1 + ν′k2 S2 + · · · + ν′kd Sd.
Denote the reaction vector by ζk = ν′k − νk. The intensity (or propensity) of the kth reaction is λk : Z^d_{≥0} → R. Then
X(t) = X(0) + Σ_k Yk(∫_0^t λk(X(s)) ds) ζk,
where the Yk are independent, unit-rate Poisson processes.
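Gillespie's algorithm (mentioned later in the talk) simulates this representation exactly: wait an exponential time with rate Σ_k λk(x), then fire reaction k with probability λk(x)/Σ_j λj(x). A minimal Python sketch, using the A + B → C system from the previous slide (the function name and the choice κ = 0.1 are illustrative):

```python
import numpy as np

def ssa(x0, zetas, intensities, T, rng):
    """Gillespie's direct method: exact simulation of
    X(t) = X(0) + sum_k Y_k(int_0^t lambda_k(X(s)) ds) zeta_k."""
    x, t = np.array(x0, dtype=float), 0.0
    while True:
        lam = np.array([lk(x) for lk in intensities])
        lam0 = lam.sum()
        if lam0 == 0.0:
            return x                          # absorbing state
        t += rng.exponential(1.0 / lam0)      # time to the next reaction
        if t > T:
            return x                          # next reaction is after T
        k = rng.choice(len(lam), p=lam / lam0)  # which reaction fires
        x += zetas[k]

# A + B -> C with intensity kappa * x_A * x_B (kappa = 0.1 is illustrative)
rng = np.random.default_rng(1)
zetas = [np.array([-1.0, -1.0, 1.0])]
intensities = [lambda x: 0.1 * x[0] * x[1]]
print(ssa([10, 10, 0], zetas, intensities, T=100.0, rng=rng))
```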
SLIDE 9 Example
Consider a model of gene transcription and translation:
G → G + M, rate 25 (Transcription)
M → M + P, rate 1000 (Translation)
P + P → D, rate 0.001 (Dimerization)
M → ∅, rate 0.1 (Degradation of mRNA)
P → ∅, rate 1 (Degradation of Protein).
Then, if X = [XM, XP, XD]^T,
X(t) = X(0) + Y1(25t) [1, 0, 0]^T + Y2(1000 ∫_0^t XM(s) ds) [0, 1, 0]^T + Y3(0.001 ∫_0^t XP(s)(XP(s) − 1) ds) [0, −2, 1]^T + Y4(0.1 ∫_0^t XM(s) ds) [−1, 0, 0]^T + Y5(∫_0^t XP(s) ds) [0, −1, 0]^T.
SLIDE 10 Computing Expectations
Problem: Approximate Ef(X(T)) to some desired tolerance ε > 0. Easy!
◮ Simulate the CTMC exactly,
◮ generate independent paths X[i](t), and use the unbiased estimator
µn = (1/n) Σ_{i=1}^{n} f(X[i](T)),
◮ stopping when the desired confidence interval is ±ε.
Total computational complexity = (cost per path) × (# paths) = O(N × ε⁻²).
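A sketch of this crude Monte Carlo loop, with a Poisson toy model standing in for an exact CTMC sample (the function name, batch size, and toy distribution are illustrative):

```python
import numpy as np

def crude_mc(sample_path, eps, rng, batch=1000, max_n=10**6):
    """Draw i.i.d. samples f(X_[i](T)) until the 95% confidence
    half-width 1.96 * s / sqrt(n) falls below the tolerance eps."""
    vals = []
    while len(vals) < max_n:
        vals.extend(sample_path(rng) for _ in range(batch))
        v = np.asarray(vals)
        half = 1.96 * v.std(ddof=1) / np.sqrt(len(v))
        if half <= eps:
            break
    return v.mean(), half, len(v)

# toy stand-in for an exact CTMC sample: Y(5) ~ Poisson(5)
rng = np.random.default_rng(3)
mean, half, n = crude_mc(lambda r: r.poisson(5.0), eps=0.05, rng=rng)
print(mean, half, n)
```

Note how n grows like ε⁻²: halving eps roughly quadruples the number of paths.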
SLIDE 11 Tau-leaping: Euler’s method
Recall:
X(t) = X(0) + Σ_k Yk(∫_0^t λk(X(s)) ds) ζk.
Tau-leaping is essentially an Euler approximation of ∫_0^t λk(X(s)) ds:
Z(h) = Z(0) + Σ_k Yk(∫_0^h λk(Z(s)) ds) ζk
≈ Z(0) + Σ_k Yk(λk(Z(0)) h) ζk
=ᵈ Z(0) + Σ_k Poisson(λk(Z(0)) h) ζk (equality in distribution).
SLIDE 12 Euler’s method
The pathwise representation for Z(t) generated by Euler's method is
Z(t) = X(0) + Σ_k Yk(∫_0^t λk(Z ∘ η(s)) ds) ζk,
where η(s) = tn if tn ≤ s < tn+1 = tn + h is the step function giving the left endpoints of the time discretization.
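A fixed-step tau-leap sketch of this update (the max(·, 0) guard against negative intensities and the pure-birth test case are illustrative additions, not part of the slides):

```python
import numpy as np

def tau_leap(z0, zetas, intensities, T, h, rng):
    """Euler (tau-leaping) approximation: over each step of length h,
    fire Poisson(lambda_k(Z(t_n)) * h) copies of reaction k."""
    z = np.array(z0, dtype=float)
    for _ in range(int(round(T / h))):
        # intensities frozen at the left endpoint of the step
        lam = [max(lk(z), 0.0) for lk in intensities]
        for k, zeta in enumerate(zetas):
            z += rng.poisson(lam[k] * h) * zeta
    return z

rng = np.random.default_rng(4)
# pure birth: one reaction, rate 2, so E[Z(T)] = 2T
z = tau_leap([0.0], [np.array([1.0])], [lambda x: 2.0], T=10.0, h=0.1, rng=rng)
print(z)
```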
SLIDE 13 Return to approximating Ef(X(T))
Let
µn = (1/n) Σ_{i=1}^{n} f(ZL,[i](T)).
We have MSE(µn) = (Bias)² + (variance). If we have an order one method,
Total computational complexity = (cost per path) × (# paths) = O(ε⁻¹ × ε⁻²) = O(ε⁻³).
SLIDE 14 Benefits/drawbacks
Benefits:
- 1. Can drastically lower the computational complexity of a problem
Drawbacks:
- 1. Convergence results usually give only an order of convergence; they cannot give a precise hL.
- 2. Tau-leaping has problems: what happens if you go negative?
- 3. We have given up the unbiased estimator.
SLIDE 15 Multi-level Monte Carlo
Usual MLMC:
Ef(X(t)) ≈ Ef(ZL(t)) = Σ_{ℓ=ℓ0+1}^{L} E[f(Zℓ(t)) − f(Zℓ−1(t))] + Ef(Zℓ0(t)).
In our setting:
Ef(X(t)) = E[f(X(t)) − f(ZL(t))] + Σ_{ℓ=ℓ0+1}^{L} E[f(Zℓ(t)) − f(Zℓ−1(t))] + Ef(Zℓ0(t)),
which gives an unbiased estimator.
SLIDE 16 Multi-level Monte Carlo: an unbiased estimator
Ef(X(t)) = E[f(X(t)) − f(ZL(t))] + Σ_{ℓ=ℓ0+1}^{L} E[f(Zℓ(t)) − f(Zℓ−1(t))] + Ef(Zℓ0(t)).
Estimators:
Q̂E def= (1/nE) Σ_{i=1}^{nE} (f(X[i](T)) − f(ZL,[i](T))),
Q̂ℓ def= (1/nℓ) Σ_{i=1}^{nℓ} (f(Zℓ,[i](T)) − f(Zℓ−1,[i](T))), for ℓ ∈ {ℓ0 + 1, . . . , L},
Q̂0 def= (1/n0) Σ_{i=1}^{n0} f(Zℓ0,[i](T)).
Question: what is the coupling and the computational cost?
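Since the levels are run independently, the combined estimate and its standard error follow directly from the per-level sample arrays. A sketch on toy data (the function name and the synthetic level means are illustrative):

```python
import numpy as np

def mlmc_combine(level_samples):
    """Combine per-level sample arrays into the estimator
    Q = Q_E + sum_l Q_l + Q_0 and its standard error. Each array holds
    the i.i.d. per-path differences (or plain values at the coarsest
    level); the levels are assumed independent, so variances add."""
    est = sum(s.mean() for s in level_samples)
    var = sum(s.var(ddof=1) / len(s) for s in level_samples)
    return est, np.sqrt(var)

# toy data: three independent levels with known means 1.0, 0.5, 0.25
rng = np.random.default_rng(6)
levels = [rng.normal(m, 0.1, 10000) for m in (1.0, 0.5, 0.25)]
est, se = mlmc_combine(levels)
print(est, se)  # estimate near 1.75
```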
SLIDE 17
How do we couple Poisson processes?
Suppose I want to generate:
◮ a Poisson process with intensity 13.1, and
◮ a Poisson process with intensity 13.
We could let Y1 and Y2 be independent, unit-rate Poisson processes and set
Z13.1(t) = Y1(13.1 t),   Z13(t) = Y2(13 t).
Using this representation, the processes are independent and hence not coupled. The variance of the difference is large:
Var(Z13.1(t) − Z13(t)) = Var(Y1(13.1 t)) + Var(Y2(13 t)) = 26.1 t.
SLIDE 18
How do we generate the processes simultaneously?
Suppose I want to generate:
◮ a Poisson process with intensity 13.1, and
◮ a Poisson process with intensity 13.
We could instead let Y1 and Y2 be independent, unit-rate Poisson processes and set
Z13.1(t) = Y1(13 t) + Y2(0.1 t),   Z13(t) = Y1(13 t).
The variance of the difference is much smaller:
Var(Z13.1(t) − Z13(t)) = Var(Y2(0.1 t)) = 0.1 t.
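This variance reduction is easy to check numerically, since at a fixed time t the counts are ordinary Poisson random variables (the seed and sample size are arbitrary):

```python
import numpy as np

rng = np.random.default_rng(5)
n, t = 200000, 1.0

# independent copies: Var(Z13.1 - Z13) = 13.1 t + 13 t = 26.1 t
d_indep = rng.poisson(13.1 * t, n) - rng.poisson(13.0 * t, n)

# coupled copies share Y1(13 t); the difference is just Y2(0.1 t)
shared = rng.poisson(13.0 * t, n)
d_coupled = (shared + rng.poisson(0.1 * t, n)) - shared  # = Poisson(0.1 t)

print(d_indep.var(), d_coupled.var())  # roughly 26.1 vs 0.1
```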
SLIDE 19 How do we generate processes simultaneously?
More generally, suppose we want
- 1. a non-homogeneous Poisson process with intensity f(t), and
- 2. a non-homogeneous Poisson process with intensity g(t).
Let Y1, Y2, and Y3 be independent, unit-rate Poisson processes and define
Zf(t) = Y1(∫_0^t f(s) ∧ g(s) ds) + Y2(∫_0^t f(s) − (f(s) ∧ g(s)) ds),
Zg(t) = Y1(∫_0^t f(s) ∧ g(s) ds) + Y3(∫_0^t g(s) − (f(s) ∧ g(s)) ds).
SLIDE 20 Back to our processes
X(t) = X(0) + Σ_k Yk(∫_0^t λk(X(s)) ds) ζk,
Z(t) = X(0) + Σ_k Yk(∫_0^t λk(Z ∘ η(s)) ds) ζk.
Now couple:
X(t) = X(0) + Σ_k Yk,1(∫_0^t λk(X(s)) ∧ λk(Zℓ ∘ ηℓ(s)) ds) ζk
+ Σ_k Yk,2(∫_0^t λk(X(s)) − λk(X(s)) ∧ λk(Zℓ ∘ ηℓ(s)) ds) ζk,
Zℓ(t) = Zℓ(0) + Σ_k Yk,1(∫_0^t λk(X(s)) ∧ λk(Zℓ ∘ ηℓ(s)) ds) ζk
+ Σ_k Yk,3(∫_0^t λk(Zℓ ∘ ηℓ(s)) − λk(X(s)) ∧ λk(Zℓ ∘ ηℓ(s)) ds) ζk.
The algorithm for simulating this pair is equivalent to simulating the exact CTMC.
SLIDE 21 For approximate processes
Zℓ(t) = Zℓ(0) + Σ_k Yk,1(∫_0^t λk(Zℓ ∘ ηℓ(s)) ∧ λk(Zℓ−1 ∘ ηℓ−1(s)) ds) ζk
+ Σ_k Yk,2(∫_0^t λk(Zℓ ∘ ηℓ(s)) − λk(Zℓ ∘ ηℓ(s)) ∧ λk(Zℓ−1 ∘ ηℓ−1(s)) ds) ζk,
Zℓ−1(t) = Zℓ−1(0) + Σ_k Yk,1(∫_0^t λk(Zℓ ∘ ηℓ(s)) ∧ λk(Zℓ−1 ∘ ηℓ−1(s)) ds) ζk
+ Σ_k Yk,3(∫_0^t λk(Zℓ−1 ∘ ηℓ−1(s)) − λk(Zℓ ∘ ηℓ(s)) ∧ λk(Zℓ−1 ∘ ηℓ−1(s)) ds) ζk.
The algorithm for simulating this pair is equivalent to Euler's method. What about the variance and, hence, the computational cost?
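A sketch of one coupled pair (Zℓ, Zℓ−1) with nested steps hℓ = hℓ−1/M: each reaction channel is split into a shared Poisson stream and two residual streams, and the coarse intensity is frozen over each coarse step. The pure-death test case and the max(·, 0) negativity guard are illustrative additions:

```python
import numpy as np

def coupled_tau_leap(z0, zetas, intensities, T, h_coarse, M, rng):
    """One coupled sample (Z_l, Z_{l-1}); fine step h = h_coarse / M.
    Per channel: a shared Poisson stream drives both levels, and each
    level gets its own residual stream, as on the slide."""
    zf = np.array(z0, dtype=float)   # fine level Z_l
    zc = np.array(z0, dtype=float)   # coarse level Z_{l-1}
    h = h_coarse / M
    for _ in range(int(round(T / h_coarse))):
        lam_c = [max(lk(zc), 0.0) for lk in intensities]  # frozen per coarse step
        for _ in range(M):
            lam_f = [max(lk(zf), 0.0) for lk in intensities]
            for k, zeta in enumerate(zetas):
                a = min(lam_f[k], lam_c[k])          # shared intensity
                shared = rng.poisson(a * h)
                zf += (shared + rng.poisson((lam_f[k] - a) * h)) * zeta
                zc += (shared + rng.poisson((lam_c[k] - a) * h)) * zeta
    return zf, zc

# pure death at unit per-capita rate; the two levels stay tightly coupled
rng = np.random.default_rng(7)
zf, zc = coupled_tau_leap([100.0], [np.array([-1.0])], [lambda x: x[0]],
                          T=1.0, h_coarse=0.1, M=3, rng=rng)
print(zf, zc)
```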
SLIDE 22 Multi-level Monte Carlo: chemical kinetic setting
Can prove:
Theorem (Anderson, Higham 2011)
Suppose (X, Zℓ) satisfy the coupling. Then
sup_{t≤T} E|X(t) − Zℓ(t)|² ≤ C1(T) N⁻ρ hℓ + C2(T) hℓ².
Theorem (Anderson, Higham 2011)
Suppose (Zℓ, Zℓ−1) satisfy the coupling. Then
sup_{t≤T} E|Zℓ(t) − Zℓ−1(t)|² ≤ C1(T) N⁻ρ hℓ + C2(T) hℓ².
David F. Anderson and Desmond J. Higham, Multi-level Monte Carlo for continuous time Markov chains, with applications in biochemical kinetics. To appear in SIAM: Multiscale Modeling and Simulation. Available at arXiv.org:1107.2181 and at www.math.wisc.edu/~anderson.
SLIDE 23 Multi-level Monte Carlo: an unbiased estimator
For well-chosen n0, nℓ, and nE, we have
Var(Q̂) = Var(Q̂E) + Σ_{ℓ=ℓ0+1}^{L} Var(Q̂ℓ) + Var(Q̂0) = O(ε²),
with
Comp. cost = O( ε⁻² [ (N⁻ρ hL + hL²) hℓ0⁻¹ + ln(ε)² N⁻ρ + ln(ε⁻¹) (1/((M − 1) hℓ0)) ] ).
SLIDE 24 Multi-level Monte Carlo: an unbiased estimator
Some observations:
- 1. The weak error plays no role in the analysis: we are free to choose hL.
- 2. Common problems associated with tau-leaping,
◮ e.g. negativity of species numbers,
do not matter: just define the approximate process in a sensible way.
- 3. The method is unbiased.
SLIDE 25 Example
Consider a model of gene transcription and translation:
G → G + M, rate 25
M → M + P, rate 1000
P + P → D, rate 0.001
M → ∅, rate 0.1
P → ∅, rate 1.
Suppose:
- 1. initialize with: G = 1, M = 0, P = 0, D = 0,
- 2. want to estimate the expected number of dimers at time T = 1,
- 3. to an accuracy of ± 1.0 with 95% confidence.
SLIDE 26
Example
Method: Exact algorithm with crude Monte Carlo.
Approximation | # paths | CPU Time | # updates
3,714.2 ± 1.0 | 4,740,000 | 149,000 CPU s (41 hours!) | 8.27 × 10¹⁰

Method: Euler tau-leaping with crude Monte Carlo.
Step-size | Approximation | # paths | CPU Time | # updates
h = 3⁻⁷ | 3,712.3 ± 1.0 | 4,750,000 | 13,374.6 s | 6.2 × 10¹⁰
h = 3⁻⁶ | 3,707.5 ± 1.0 | 4,750,000 | 6,207.9 s | 2.1 × 10¹⁰
h = 3⁻⁵ | 3,693.4 ± 1.0 | 4,700,000 | 2,803.9 s | 6.9 × 10⁹
h = 3⁻⁴ | 3,654.6 ± 1.0 | 4,650,000 | 1,219.0 s | 2.6 × 10⁹

Method: unbiased MLMC with ℓ0 = 2, and M and L detailed below.
Step-size parameters | Approximation | CPU Time | # updates
M = 3, L = 6 | 3,713.9 ± 1.0 | 1,063.3 s | 1.1 × 10⁹
M = 3, L = 5 | 3,714.7 ± 1.0 | 1,114.9 s | 9.4 × 10⁸
M = 3, L = 4 | 3,714.2 ± 1.0 | 1,656.6 s | 1.0 × 10⁹
M = 4, L = 4 | 3,714.2 ± 1.0 | 1,334.8 s | 1.1 × 10⁹
M = 4, L = 5 | 3,713.8 ± 1.0 | 1,014.9 s | 1.1 × 10⁹
◮ the exact algorithm with crude Monte Carlo demanded 140 times more
CPU time than our unbiased MLMC estimator!
SLIDE 27 Example
Method: Exact algorithm with crude Monte Carlo.
Approximation | # paths | CPU Time | # updates
3,714.2 ± 1.0 | 4,740,000 | 149,000 CPU s (41 hours!) | 8.27 × 10¹⁰

Unbiased multi-level Monte Carlo with M = 3, L = 5, and ℓ0 = 2.
Level | # paths | CPU Time | Variance | # updates
(X, Z3⁻⁵) | 3,900 | 279.6 s | 0.0658 | 6.8 × 10⁷
(Z3⁻⁵, Z3⁻⁴) | 30,000 | 49.0 s | 0.0217 | 8.8 × 10⁷
(Z3⁻⁴, Z3⁻³) | 150,000 | 71.7 s | 0.0179 | 1.5 × 10⁸
(Z3⁻³, Z3⁻²) | 510,000 | 112.3 s | 0.0319 | 1.7 × 10⁸
Tau-leap with h = 3⁻² | 8,630,000 | 518.4 s | 0.1192 | 4.7 × 10⁸
Totals | N.A. | 1,031.0 s | 0.2565 | 9.5 × 10⁸
SLIDE 28 Some conclusions about this method
- 1. Gillespie’s algorithm is by far the most common way to compute
expectations:
1.1 Means. 1.2 Variances. 1.3 Probabilities.
- 2. The new method (MLMC) also performs this task with no bias (exact).
- 3. Will commonly be many orders of magnitude faster than usual methods.
- 4. Applicable to essentially all continuous time Markov chains:
X(t) = X(0) + Σ_k Yk(∫_0^t λk(X(s)) ds) ζk.
- 5. Makes no use of any specific structure or scaling in the problem.
SLIDE 29 Another example: Viral infection
Let
- 1. T = viral template.
- 2. G = viral genome.
- 3. S = viral structure.
- 4. V = virus.
Reactions:
R1) T + stuff → T + G, κ1 = 1
R2) G → T, κ2 = 0.025
R3) T + stuff → T + S, κ3 = 1000
R4) T → ∅, κ4 = 0.25
R5) S → ∅, κ5 = 2
R6) G + S → V, κ6 = 7.5 × 10⁻⁶
◮ R. Srivastava, L. You, J. Summers, and J. Yin, J. Theoret. Biol., 2002. ◮ E. Haseltine and J. Rawlings, J. Chem. Phys, 2002. ◮ K. Ball, T. Kurtz, L. Popovic, and G. Rempala, Annals of Applied Probability, 2006. ◮ W. E, D. Liu, and E. Vanden-Eijden, J. Comput. Phys, 2006.
SLIDE 30 Another example: Viral infection
Stochastic equations for X = (XG, XS, XT, XV) are
X1(t) = X1(0) + Y1(∫_0^t X3(s) ds) − Y2(0.025 ∫_0^t X1(s) ds) − Y6(7.5 × 10⁻⁶ ∫_0^t X1(s)X2(s) ds),
X2(t) = X2(0) + Y3(1000 ∫_0^t X3(s) ds) − Y5(2 ∫_0^t X2(s) ds) − Y6(7.5 × 10⁻⁶ ∫_0^t X1(s)X2(s) ds),
X3(t) = X3(0) + Y2(0.025 ∫_0^t X1(s) ds) − Y4(0.25 ∫_0^t X3(s) ds),
X4(t) = X4(0) + Y6(7.5 × 10⁻⁶ ∫_0^t X1(s)X2(s) ds).
SLIDE 31
Another example: Viral infection
Reactions:
R1) T + stuff → T + G, κ1 = 1
R2) G → T, κ2 = 0.025
R3) T + stuff → T + S, κ3 = 1000
R4) T → ∅, κ4 = 0.25
R5) S → ∅, κ5 = 2
R6) G + S → V, κ6 = 7.5 × 10⁻⁶
If T > 0,
◮ reactions 3 and 5 are much faster than the others.
◮ M/M/∞ queue ⇒ S is approximately Poisson(500 × T).
We can average out to get an approximate process Z(t).
SLIDE 32 Another example: Viral infection
The approximate process satisfies
Z1(t) = X1(0) + Y1(∫_0^t Z3(s) ds) − Y2(0.025 ∫_0^t Z1(s) ds) − Y6(3.75 × 10⁻³ ∫_0^t Z1(s)Z3(s) ds),
Z3(t) = X3(0) + Y2(0.025 ∫_0^t Z1(s) ds) − Y4(0.25 ∫_0^t Z3(s) ds),
Z4(t) = X4(0) + Y6(3.75 × 10⁻³ ∫_0^t Z1(s)Z3(s) ds). (1)
Now use Ef(X(t)) = E[f(X(t)) − f(Z(t))] + Ef(Z(t)).
SLIDE 33 Another example: Viral infection
X(t) = X(0) + Y1,1(∫_0^t min{X3(s), Z3(s)} ds) ζ1 + Y1,2(∫_0^t X3(s) − min{X3(s), Z3(s)} ds) ζ1
+ Y2,1(0.025 ∫_0^t min{X1(s), Z1(s)} ds) ζ2 + Y2,2(0.025 ∫_0^t X1(s) − min{X1(s), Z1(s)} ds) ζ2
+ Y3(1000 ∫_0^t X3(s) ds) ζ3
+ Y4,1(0.25 ∫_0^t min{X3(s), Z3(s)} ds) ζ4 + Y4,2(0.25 ∫_0^t X3(s) − min{X3(s), Z3(s)} ds) ζ4
+ Y5(2 ∫_0^t X2(s) ds) ζ5
+ Y6,1(∫_0^t min{λ6(X(s)), Λ6(Z(s))} ds) ζ6 + Y6,2(∫_0^t λ6(X(s)) − min{λ6(X(s)), Λ6(Z(s))} ds) ζ6,

Z(t) = Z(0) + Y1,1(∫_0^t min{X3(s), Z3(s)} ds) ζ1 + Y1,3(∫_0^t Z3(s) − min{X3(s), Z3(s)} ds) ζ1
+ Y2,1(0.025 ∫_0^t min{X1(s), Z1(s)} ds) ζ2 + Y2,3(0.025 ∫_0^t Z1(s) − min{X1(s), Z1(s)} ds) ζ2
+ Y4,1(0.25 ∫_0^t min{X3(s), Z3(s)} ds) ζ4 + Y4,3(0.25 ∫_0^t Z3(s) − min{X3(s), Z3(s)} ds) ζ4
+ Y6,1(∫_0^t min{λ6(X(s)), Λ6(Z(s))} ds) ζ6 + Y6,3(∫_0^t Λ6(Z(s)) − min{λ6(X(s)), Λ6(Z(s))} ds) ζ6,

where λ6(X(s)) = 7.5 × 10⁻⁶ X1(s)X2(s), Λ6(Z(s)) = 3.75 × 10⁻³ Z1(s)Z3(s), and ζk is the reaction vector of Rk.
SLIDE 34 Another example: Viral infection
Suppose we want E XV(20), given T(0) = 10 and all other counts zero.
Method: Exact algorithm with crude Monte Carlo.
Approximation | # paths | CPU Time | # updates
13.85 ± 0.07 | 75,000 | 24,800 CPU s | 1.45 × 10¹⁰
Method: Ef(X(t)) = E[f(X(t)) − f(Z(t))] + Ef(Z(t)).
Approximation | CPU Time | # updates
13.91 ± 0.07 | 1,118.5 CPU s | 2.41 × 10⁸
Exact + crude Monte Carlo used:
- 1. 60 times more total steps.
- 2. 22 times more CPU time.
SLIDE 35 Thanks!
References:
- 1. David F. Anderson and Desmond J. Higham, Multi-level Monte Carlo for continuous time Markov chains, with applications in biochemical kinetics, to appear in SIAM: Multiscale Modeling and Simulation. Available at arXiv.org:1107.2181 and at www.math.wisc.edu/~anderson.
Funding:
- 1. NSF Grant DMS-1009275.
- 2. Thanks to Art Owen.