SLIDE 1
Simulation of rare events by Adaptive Multilevel Splitting algorithms
Charles-Edouard Bréhier
CNRS & Institut Camille Jordan, Université Lyon 1
Joint works with
- M. Gazeau (Borealis AI, Toronto), L. Goudenège (CNRS, Centrale Paris),
- T. Lelièvre (Ecole des Ponts, Paris), M. Rousset (Inria, Rennes)
- F. Bouchet, C. Herbert, T. Lestang (ENS Lyon, Physics Dept), F. Ragone
(Department of Earth and Environmental Sciences, Milano)
SLIDE 2
Motivations: rare events in metastable dynamics
Goal: simulation of metastable stochastic processes = small noise perturbations of dynamics having multiple stationary points.
SLIDE 3
Motivations: rare events in metastable dynamics
Goal: simulation of metastable stochastic processes = small noise perturbations of dynamics having multiple stationary points. Examples:
◮ SDE: $dX_t = -\nabla V(X_t)\,dt + \sqrt{2\epsilon}\,dW(t)$
◮ Double-well potential: $V(x) = \frac{x^4}{4} - \frac{x^2}{2}$.
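To make the setting concrete, here is a minimal Euler-Maruyama sketch of this SDE (my illustration, not from the slides; the step size h, noise level eps, and horizon are arbitrary choices):

```python
import numpy as np

def simulate_double_well(x0=-1.0, eps=0.1, h=1e-3, n_steps=200_000, seed=0):
    """Euler-Maruyama for dX = -V'(X) dt + sqrt(2*eps) dW, V(x) = x^4/4 - x^2/2."""
    rng = np.random.default_rng(seed)
    grad_V = lambda x: x**3 - x          # V'(x) = x^3 - x
    x = np.empty(n_steps + 1)
    x[0] = x0
    for i in range(n_steps):
        x[i + 1] = x[i] - grad_V(x[i]) * h + np.sqrt(2 * eps * h) * rng.standard_normal()
    return x

# For small eps the process stays near one well for a long time (metastability).
traj = simulate_double_well()
print("fraction of time spent near the right well:", np.mean(traj > 0.5))
```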
SLIDE 4
Motivations: rare events in metastable dynamics
Goal: simulation of metastable stochastic processes = small noise perturbations of dynamics having multiple stationary points. Examples:
◮ SPDE: Allen-Cahn equation
$\frac{\partial u(t,x)}{\partial t} = \gamma \Delta u(t,x) - \nabla V(u(t,x)) + \sqrt{2\epsilon}\,\xi(t,x)$
SLIDE 5 Transitions between metastable states
One goal is to estimate $p = \mathbb{P}_{x_0}(\tau_B < \tau_A)$,
[Figure: metastable states A and B]
where A and B are metastable states.
◮ Direct numerical simulations: prohibitive, need $N \sim C\,p^{-1}$ independent realizations.
◮ Analytic approach: Large Deviations, Potential Theory.
In the limit $\epsilon \to 0$: $\epsilon \log(p(\epsilon)) \underset{\epsilon\to 0}{\longrightarrow} -C$ and $p(\epsilon) \underset{\epsilon\to 0}{\sim} c\,e^{-C/\epsilon}$.
Transition times are of order $e^{C/\epsilon}$.
SLIDE 6 Splitting algorithms
Introduced in the 1950s:
- H. Kahn and T. E. Harris.
Estimation of particle transmission by random sampling. National Bureau of Standards Applied Mathematics Series, 12:27–30, 1951.
SLIDE 7 Splitting algorithms
Introduced in the 1950s:
- H. Kahn and T. E. Harris.
Estimation of particle transmission by random sampling. National Bureau of Standards Applied Mathematics Series, 12:27–30, 1951.
Revival in the 1990s and 2000s:
- P. Glasserman, P. Heidelberger, P. Shahabuddin, and T. Zajic.
Multilevel splitting for estimating rare event probabilities.
- Oper. Res., 47(4):585–600, 1999.
with many variants:
- M. Villén-Altamirano and J. Villén-Altamirano.
RESTART: A method for accelerating rare event simulations. In Proceedings of the Thirteenth International Teletraffic Congress (Copenhagen, Denmark, June 19–26), Queueing, Performance and Control in ATM: ITC-13 Workshops, pages 71–76. North-Holland, Amsterdam–New York, 1991.
- S.-K. Au and J. L. Beck.
Estimation of small failure probabilities in high dimensions by subset simulation. Probabilistic Engineering Mechanics, 16:263–277, 2001.
- J. Skilling.
Nested sampling for general Bayesian computation. Bayesian Anal., 1(4):833–859 (electronic), 2006.
- P. Del Moral, A. Doucet, and A. Jasra.
Sequential Monte Carlo samplers.
- J. R. Stat. Soc. Ser. B Stat. Methodol., 68(3):411–436, 2006.
SLIDE 8 Adaptive Multilevel Splitting algorithms
Introduced in:
- F. Cérou and A. Guyader.
Adaptive multilevel splitting for rare event analysis.
- Stoch. Anal. Appl., 25(2):417–443, 2007.
Analysis of the method: a series of recent works (non-exhaustive):
- E. Simonnet.
Combinatorial analysis of the adaptive last particle method. Statistics and Computing, pages 1–20, 2014.
- C.-E. Bréhier, T. Lelièvre, and M. Rousset.
Analysis of adaptive multilevel splitting algorithms in an idealized setting. ESAIM Probability and Statistics, 19:361–394, 2015.
- F. Cérou and A. Guyader.
Fluctuation analysis of adaptive multilevel splitting.
- Ann. Appl. Probab., 26(6):3319–3380, 2016.
C.-E. Bréhier, M. Gazeau, L. Goudenège, T. Lelièvre, and M. Rousset. Unbiasedness of some generalized adaptive multilevel splitting algorithms.
- Ann. Appl. Probab., 26(6):3559–3601, 2016.
SLIDE 9 Applications of AMS algorithms
- J. Rolland and E. Simonnet.
Statistical behaviour of adaptive multilevel splitting algorithms in simple models. Journal of Computational Physics, 283:541 – 558, 2015.
- D. Aristoff, T. Lelièvre, C. G. Mayne, and I. Teo.
Adaptive multilevel splitting in molecular dynamics simulations. In CEMRACS 2013—modelling and simulation of complex systems: stochastic and deterministic approaches, volume 48 of ESAIM Proc. Surveys, pages 215–225. EDP Sci., Les Ulis, 2015.
Several recently defended and ongoing PhD theses:
- R. Poncet (Polytechnique 10-2017), H. Louvin (CEA 10-2017), L. J. Lopes
(Ponts-Sanofi), T. Lestang (ENS Lyon, Physics).
- H. Louvin, E. Dumonteil, T. Lelièvre, M. Rousset, and C. M. Diop.
Adaptive multilevel splitting for Monte Carlo particle transport. In EPJ Web of Conferences, volume 153, page 06006. EDP Sciences, 2017.
- L. J. Lopes, C. Chipot, and T. Lelièvre.
Adaptive multilevel splitting method: Isomerization of the alanine dipeptide. 2017.
Collaboration with F. Bouchet, C. Herbert, T. Lestang, F. Ragone.
SLIDE 10
Principle of splitting algorithms
Principle: an iterative algorithm.
◮ System of interacting replicas: copies of the Markov trajectories are simulated (in parallel).
◮ Adaptive Splitting: selection of replicas using a score function.
◮ Partial resampling: new copies are simulated, by branching replicas (mutation).
◮ Weighting: remain consistent when removing/creating replicas.
An important property: the (Markov) model is a black box.
Simpler setting: estimation of $p = \mathbb{P}(Y > a)$, with
◮ $Y$ a positive real-valued random variable (with continuous cdf),
◮ $a$ a threshold.
SLIDE 11
General notions: estimators, consistency, efficiency
An estimator of $\theta \in \mathbb{R}$ is a random variable $\hat\theta$.
SLIDE 12 General notions: estimators, consistency, efficiency
An estimator of $\theta \in \mathbb{R}$ is a random variable $\hat\theta$. About consistency:
◮ The bias of the estimator is defined as $b(\hat\theta) = \mathbb{E}[\hat\theta] - \theta$.
◮ We say that $\hat\theta$ is an unbiased estimator of $\theta$ if $b(\hat\theta) = 0$.
◮ Let $(\hat\theta_N)_{N\in\mathbb{N}}$ be a sequence of estimators of $\theta$. It is called consistent if $\hat\theta_N \underset{N\to\infty}{\longrightarrow} \theta$ in probability. It is called asymptotically unbiased if $b(\hat\theta_N) \underset{N\to\infty}{\longrightarrow} 0$.
SLIDE 13 General notions: estimators, consistency, efficiency
An estimator of $\theta \in \mathbb{R}$ is a random variable $\hat\theta$. About consistency:
◮ The bias of the estimator is defined as $b(\hat\theta) = \mathbb{E}[\hat\theta] - \theta$.
◮ We say that $\hat\theta$ is an unbiased estimator of $\theta$ if $b(\hat\theta) = 0$.
◮ Let $(\hat\theta_N)_{N\in\mathbb{N}}$ be a sequence of estimators of $\theta$. It is called consistent if $\hat\theta_N \underset{N\to\infty}{\longrightarrow} \theta$ in probability. It is called asymptotically unbiased if $b(\hat\theta_N) \underset{N\to\infty}{\longrightarrow} 0$.
About efficiency: the mean-square error decomposes as
$\mathbb{E}|\hat\theta - \theta|^2 = \big(\mathbb{E}\hat\theta - \theta\big)^2 + \mathbb{E}|\hat\theta - \mathbb{E}\hat\theta|^2 = b(\hat\theta)^2 + \mathrm{Var}(\hat\theta)$,
the sum of the squared bias (consistency error) and the variance (statistical error).
Example: $\hat m_N = \frac{1}{N}\sum_{n=1}^{N} Z_n$ with $(Z_n)_{n\in\mathbb{N}}$ i.i.d. is a consistent unbiased estimator of $m = \mathbb{E}[Z_1]$.
Statistical error: $\mathrm{Var}(\hat m_N) = \frac{\mathrm{Var}(Z_1)}{N}$. More precisely: Central Limit Theorem.
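As a quick illustration (mine, not from the slides), the empirical mean and its CLT-based 95% confidence interval:

```python
import numpy as np

rng = np.random.default_rng(1)
N = 10_000
Z = rng.exponential(scale=2.0, size=N)          # i.i.d. samples, here m = E[Z_1] = 2

m_hat = Z.mean()                                # consistent, unbiased estimator of m
half_width = 1.96 * Z.std(ddof=1) / np.sqrt(N)  # CLT-based 95% half-width
print(f"m_hat = {m_hat:.4f} +/- {half_width:.4f}")
```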
SLIDE 14 The rare event case
Estimation of $p = \mathbb{P}(X \in A) = \mathbb{E}[\mathbf{1}_{X\in A}]$?
Direct approach: crude Monte-Carlo simulation. Take $Z = \mathbf{1}_{X\in A}$, i.i.d. realizations $Z_1 = \mathbf{1}_{X_1\in A}, \ldots, Z_n = \mathbf{1}_{X_n\in A}, \ldots$, and compute the empirical average $\hat p_N = \frac{1}{N}\sum_{n=1}^{N} Z_n = \frac{1}{N}\sum_{n=1}^{N} \mathbf{1}_{X_n\in A}$. Then $\hat p_N$ is a consistent unbiased estimator of $p = \mathbb{E}[Z]$.
Relative error: $\frac{\sqrt{\mathbb{E}|\hat p_N - p|^2}}{p} = \frac{\sqrt{p(1-p)}}{p\sqrt{N}} \underset{p\to 0}{\sim} \frac{1}{\sqrt{pN}}$.
SLIDE 15 The rare event case
Estimation of $p = \mathbb{P}(X \in A) = \mathbb{E}[\mathbf{1}_{X\in A}]$?
Direct approach: crude Monte-Carlo simulation. Take $Z = \mathbf{1}_{X\in A}$, i.i.d. realizations $Z_1 = \mathbf{1}_{X_1\in A}, \ldots, Z_n = \mathbf{1}_{X_n\in A}, \ldots$, and compute the empirical average $\hat p_N = \frac{1}{N}\sum_{n=1}^{N} Z_n = \frac{1}{N}\sum_{n=1}^{N} \mathbf{1}_{X_n\in A}$. Then $\hat p_N$ is a consistent unbiased estimator of $p = \mathbb{E}[Z]$.
Relative error: $\frac{\sqrt{\mathbb{E}|\hat p_N - p|^2}}{p} = \frac{\sqrt{p(1-p)}}{p\sqrt{N}} \underset{p\to 0}{\sim} \frac{1}{\sqrt{pN}}$.
Variance reduction strategy: find a family of simulatable estimators $\hat\theta$ to reduce the cost. Requirements (to also remain unbiased): $\mathrm{Var}(\hat\theta) \ll \mathrm{Var}(\hat p_N)$ and $\mathbb{E}[\hat\theta] = p$, where simulations of $\hat\theta$ and $\hat p_N$ have the same average cost.
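A small numerical illustration of why crude Monte-Carlo fails for rare events (a sketch of mine, on a Bernoulli toy model):

```python
import numpy as np

rng = np.random.default_rng(2)
N = 100_000
# The relative error of the crude estimator scales like 1/sqrt(pN): once p*N
# is of order 1, the estimate is dominated by noise.
for p in [1e-1, 1e-3, 1e-5]:
    p_hat = rng.binomial(1, p, size=N).mean()
    print(f"p = {p:.0e}:  p_hat = {p_hat:.2e},  1/sqrt(pN) = {1 / np.sqrt(p * N):.2f}")
```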
SLIDE 16 Splitting algorithm
Case study: $p = \mathbb{P}(Y > a)$, with $Y > 0$.
Splitting: $p = \prod_{j=1}^{N} p_j$, with a non-decreasing sequence of levels $0 = a_0 < a_1 < \ldots < a_{N-1} < a_N = a$ and $p_j = \mathbb{P}(Y > a_j \mid Y > a_{j-1})$.
Estimator: $\hat p = \prod_{j=1}^{N} \hat p_{j,n_{rep}}$ ($N$ independent estimators, with $n_{rep}$ independent replicas per level) is unbiased.
Optimal choice of the levels $a_j$: such that $p_1 = \ldots = p_N = p^{1/N}$.
Relative error, as $N \to +\infty$: $\sim \frac{\sqrt{|\log(p)|}}{\sqrt{N}}$.
SLIDE 17 Splitting algorithm
Case study: $p = \mathbb{P}(Y > a)$, with $Y > 0$.
Splitting: $p = \prod_{j=1}^{N} p_j$, with a non-decreasing sequence of levels $0 = a_0 < a_1 < \ldots < a_{N-1} < a_N = a$ and $p_j = \mathbb{P}(Y > a_j \mid Y > a_{j-1})$.
Estimator: $\hat p = \prod_{j=1}^{N} \hat p_{j,n_{rep}}$ ($N$ independent estimators, with $n_{rep}$ independent replicas per level) is unbiased.
Optimal choice of the levels $a_j$: such that $p_1 = \ldots = p_N = p^{1/N}$.
Relative error, as $N \to +\infty$: $\sim \frac{\sqrt{|\log(p)|}}{\sqrt{N}}$.
Adaptive version: compute a random sequence $Z^{(0)}, Z^{(1)}, \ldots$ of levels, with $\hat p_j = 1 - \frac{k}{n_{rep}}$.
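A minimal sketch of this adaptive version in the idealized case k = 1 (the "last particle" variant), for an exponential Y, where the conditional law of Y given Y > z is explicit by the memoryless property; the function names are mine:

```python
import numpy as np

def last_particle(a, n_rep=100, seed=3):
    """Adaptive splitting (k = 1) for p = P(Y > a), Y ~ Exp(1).

    At each iteration the lowest replica defines the level and is resampled
    conditionally on exceeding it (memoryless: z + fresh Exp(1)); each level
    contributes a factor 1 - 1/n_rep to the estimator.
    """
    rng = np.random.default_rng(seed)
    Y = rng.exponential(size=n_rep)
    q = 0
    while True:
        j = np.argmin(Y)
        z = Y[j]                       # current level = smallest score
        if z > a:
            break
        Y[j] = z + rng.exponential()   # resample Y conditionally on Y > z
        q += 1
    return (1 - 1 / n_rep) ** q

p_true = np.exp(-10.0)                 # P(Y > 10) for Exp(1)
estimates = [last_particle(10.0, seed=s) for s in range(200)]
print(f"AMS estimate: {np.mean(estimates):.3e}   true p: {p_true:.3e}")
```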
SLIDE 18 The Markov chain setting
Model: $X = (X_t)_{t\in\mathbb{N}}$ is a Markov chain on a state space $S$.
(Example: $X_{n+1} = X_n - h\nabla V(X_n) + \sqrt{2\beta^{-1}h}\,w_n$, the explicit Euler-Maruyama discretization of the Stochastic Differential Equation $dX_t = -\nabla V(X_t)\,dt + \sqrt{2\beta^{-1}}\,dW_t$.)
Goal: Estimate with Monte-Carlo simulation the average $\mathbb{E}(\psi(X_t, t \geq 0))$, with $\psi(X_t, t \geq 0) = \mathbf{1}_{\tau_B<\tau_A}\,\varphi(X_t, t \geq 0)$, where $\tau_C = \inf\{t \mid X_t \in C \subset S\}$, $C = A$ or $B$ (e.g. metastable states).
Rare event regime: $\mathbb{P}(\tau_B < \tau_A) \ll 1$.
[Figure: metastable states A and B]
SLIDE 19 (Generalized) Adaptive Multilevel Splitting: the abstract setting
Goal: Build unbiased estimators of $\mathbb{E}[\mathbf{1}_{\tau_B<\tau_A}\,\varphi(X)]$.
Key tool: a given reaction coordinate $\xi : S \to \mathbb{R}$. Only requirement: there is a $z_{max} \in \mathbb{R}$ such that $B \subset \{\xi(\cdot) > z_{max}\}$.
Terminology: $z = \xi(x)$ is called the level of the state $x \in S$; $\mathrm{score}(X_t, t \geq 0) := \sup_{t \leq \tau_A} \xi(X_t)$ is the score of the trajectory $(X_t)_{t \geq 0}$.
SLIDE 20 (Generalized) Adaptive Multilevel Splitting: the abstract setting
Goal: Build unbiased estimators of $\mathbb{E}[\mathbf{1}_{\tau_B<\tau_A}\,\varphi(X)]$.
Key tool: a given reaction coordinate $\xi : S \to \mathbb{R}$. Only requirement: there is a $z_{max} \in \mathbb{R}$ such that $B \subset \{\xi(\cdot) > z_{max}\}$.
Terminology: $z = \xi(x)$ is called the level of the state $x \in S$; $\mathrm{score}(X_t, t \geq 0) := \sup_{t \leq \tau_A} \xi(X_t)$ is the score of the trajectory $(X_t)_{t \geq 0}$.
Algorithm: based on:
◮ Multiple Replicas: several copies of the Markov chain are simulated in parallel.
◮ Adaptive Splitting: splitting is performed for the replicas with the highest scores (selection of replicas).
◮ Partial re-sampling: trajectories of new replicas are re-sampled partially (branching) forward in time (mutation of replicas). Formally: a kernel $(z, x) \in \mathbb{R} \times S^{\mathbb{N}} \mapsto \pi_z(x, \cdot) \in \mathrm{Proba}(S^{\mathbb{N}})$.
SLIDE 21 (Generalized) Adaptive Multilevel Splitting: algorithm
Parameters (the picture's parameters are in red):
◮ $n_{rep}(= 3)$: number of replicas.
◮ $k(= 1) \in \{1, \ldots, n_{rep} - 1\}$: rank of the order statistic.
Initialization: $q = 0$.
◮ Sample $n_{rep}$ i.i.d. replicas $(X^{(n)})_{n \in I^{(0)}}$, with labels $I^{(0)} = \{1, \ldots, n_{rep}\}$, according to the given Markov transitions.
◮ Compute $Z^{(0)}$ := the $k$-th order statistic of the sample scores.
◮ Set: Active replicas = $I^{(0)}_{>Z^{(0)}} = \{n \in I^{(0)} \mid s(X^{(n)}) > Z^{(0)}\}$, the labels of the replicas with score strictly higher than $Z^{(0)}$.
[Figure: trajectories between A and B; passive and active replicas relative to the level $Z^{(0)}$]
SLIDE 22 (Generalized) Adaptive Multilevel Splitting: algorithm
Iterate on $q$, starting from $q = 0$.
Iteration step 0: Stopping criterion
◮ If $Z^{(q)} > z_{max}$, or if $I^{(q)}_{>Z^{(q)}} = \emptyset$ (extinction), stop and set $Q_{iter} = q$.
Iteration step 1: Splitting
◮ Create independently $K_{q+1} = n_{rep} - \mathrm{card}(I^{(q)}_{>Z^{(q)}})$ $(= 1)$ new replicas, with states chosen uniformly among the active replicas, i.e. with labels in $I^{(q)}_{>Z^{(q)}}$. Thus: $n_{rep}(= 3)$ replicas are now active.
◮ Denote by $I^{(q+1)}$ the new total set of labels (active and passive).
[Figure: a passive replica is replaced by a new replica branched from an active one]
SLIDE 23
(Generalized) Adaptive Multilevel Splitting: algorithm
Iteration step 2: Partial re-sampling of trajectories
For each newly created replica, independently:
◮ Start from the level entry state $X_{\tau_{Z^{(q)}}}$, where $\tau_z := \inf\{t \geq 0 \mid \xi(X_t) > z\}$.
◮ Complete the trajectory using the Markov chain transition.
◮ Re-index the replicas with $q + 1$ instead of $q$.
[Figure: the new replica is completed forward in time from its level entry state]
SLIDE 24 (Generalized) Adaptive Multilevel Splitting: algorithm
Iteration step 3: Level computation
◮ Define $Z^{(q+1)}$ as the $k(= 1)$-th order statistic of the active and new replicas' scores.
◮ Update the set of active replicas: $I^{(q+1)}_{>Z^{(q+1)}} = \{n \in I^{(q+1)} \mid s(X^{(n)}) > Z^{(q+1)}\}$.
[Figure: the new level $Z^{(q+1)}$]
Update q := q + 1.
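The loop of the last four slides can be summarized in code. Below is a compact sketch of mine (not the authors' implementation): scores are trajectory maxima of $\xi$, all replicas with score $\leq Z$ are killed (so ties give $K_{q+1} \geq k$, as the unbiasedness result requires), and each iteration contributes the factor $R^{(q)}$. The driver uses the Brownian-drift chain of the 1D example later in the talk:

```python
import numpy as np

def ams(step, x0, in_A, in_B, xi, z_max, n_rep=100, k=1, seed=0):
    """Sketch of the generalized AMS estimator of p = P(tau_B < tau_A).

    step(x, rng): one Markov transition; trajectories run until A or B.
    Score = max of xi along the trajectory. All replicas with score <= Z
    are killed at each iteration (ties imply K_{q+1} >= k).
    """
    rng = np.random.default_rng(seed)

    def simulate(path):                        # run the chain until A or B
        while not (in_A(path[-1]) or in_B(path[-1])):
            path.append(step(path[-1], rng))
        return path

    paths = [simulate([x0]) for _ in range(n_rep)]
    scores = np.array([max(map(xi, p)) for p in paths])
    weight = 1.0
    while True:
        Z = np.sort(scores)[k - 1]             # k-th order statistic of scores
        survivors = np.flatnonzero(scores > Z)
        if Z > z_max or len(survivors) == 0:   # stopping criterion / extinction
            break
        weight *= len(survivors) / n_rep       # factor R^(q) = 1 - K_{q+1}/n_rep
        for i in np.flatnonzero(scores <= Z):  # branch each killed replica
            src = paths[rng.choice(survivors)]
            t = next(t for t, x in enumerate(src) if xi(x) > Z)  # level entry
            paths[i] = simulate(src[: t + 1])
            scores[i] = max(map(xi, paths[i]))
    return weight * np.mean([in_B(p[-1]) for p in paths])

# Brownian drift X_{i+1} = X_i - dt + sqrt(2 dt / beta) * N(0,1), beta = 8;
# the slides report p ~ 3.6e-4 for this test case.
beta, dt = 8.0, 0.1
step = lambda x, rng: x - dt + np.sqrt(2 * dt / beta) * rng.standard_normal()
est = [ams(step, 1.0, lambda x: x < 0.1, lambda x: x > 1.9, lambda x: x,
           z_max=1.9, seed=s) for s in range(20)]
print(f"mean AMS estimate: {np.mean(est):.3e}")
```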
SLIDE 25 Main result: Unbiased estimators
Estimator of $\mathbb{E}[\mathbf{1}_{\tau_B<\tau_A}\,\varphi(X)]$:
$\hat\psi = R^{(0)} \times \ldots \times R^{(Q_{iter}-1)}\,\frac{1}{n_{rep}} \sum_n \mathbf{1}_{\tau_B^{(n,Q_{iter})} < \tau_A^{(n,Q_{iter})}}\,\varphi(X^{(n,Q_{iter})}),$
where $R^{(q)} := 1 - \frac{K_{q+1}}{n_{rep}} = \frac{\text{number of active replicas before splitting}}{\text{number of active replicas after splitting}}$.
Key remark: $K_{q+1} \geq k$, but it may happen that $K_{q+1} > k$. Even $K_{q+1} = n_{rep}$ may happen: extinction, the simulation is stopped and $\hat\psi = 0$.
Theorem (B., Gazeau, Goudenège, Lelièvre, Rousset 2016)
Assume that $Q_{iter} < +\infty$ almost surely, and that $\varphi$ is bounded (on path space). Then $\hat\psi$ is an unbiased estimator of $\mathbb{E}[\mathbf{1}_{\tau_B<\tau_A}\,\varphi(X)]$, for any choice of the reaction coordinate $\xi$ and of $n_{rep}$ and $k$: $\mathbb{E}\,\hat\psi = \mathbb{E}[\mathbf{1}_{\tau_B<\tau_A}\,\varphi(X)]$.
SLIDE 26 Towards proof: a more general statement
More generally: at each iteration $q$, the algorithm provides an unbiased estimator of $\mathbb{E}[\psi(X)]$, of the form $\hat\psi^{(q)} = \sum_n G^{(n,q)}\,\psi(X^{(n,q)})$, where
◮ all the replicas computed in the algorithm are taken into account (including the passive ones),
◮ the weights have the form $G^{(n,q)} = \frac{1}{n_{rep}}\,R^{(0)} \cdots R^{(Q(n)\wedge q - 1)}$, where $Q(n)$ is the iteration at which the replica with label $n$ is made passive.
We recover the last theorem from this general statement:
◮ if $\psi = \mathbf{1}_{\tau_B<\tau_A}\,\varphi$, only active replicas can contribute;
◮ $q \to Q_{iter}$: martingale and stopping time argument.
SLIDE 27 Proof: Level = new ”time index“
◮ First key point: define the doubly-indexed filtration
$\mathcal{F}_{q,z} = \sigma\big( X^{(n)}_t,\ t \leq \inf\{t' \mid \xi(X^{(n)}_{t'}) > \max(z, Z^{(q)})\} \big)$,
which contains all the information on the trajectories at iteration $q$, up to strict entry above the level $\max(z, Z^{(q)})$.
◮ Second key point: check that at iteration $q$, all replicas are conditionally independent given $\mathcal{F}_{q,z}$, with distribution given by the Markov transition.
◮ Third key point: define the appropriate martingales
$M_{q,z}(\psi) := \mathbb{E}\Big[ \sum_{n \in I^{(q)}} G^{(n,q)}\,\psi(X^{(n,q)}) \,\Big|\, \mathcal{F}_{q,z} \Big]$
with respect to the above filtration.
Using the martingale property gives the unbiasedness of our estimator. Adaptive case: $Z^{(q)}$ and $Q_{iter}$ are stopping times.
SLIDE 28 Efficiency analysis
Simplification: $k = 1$, $\varphi = 1$. Then $\hat\psi = \hat p$ is an estimator of $p = \mathbb{P}(\tau_B < \tau_A)$.
Theorem (Cérou, Delyon, Guyader, Rousset, preprint 2016)
When $n_{rep} \to \infty$, a Central Limit Theorem holds: $\sqrt{n_{rep}}(\hat p - p) \underset{n_{rep}\to\infty}{\longrightarrow} \mathcal{N}(0, \sigma^2(\xi))$.
Moreover, $\sigma^2(\xi) \in [-p^2\log(p),\ 2p(1-p)]$. The asymptotic variance $\sigma^2(\xi)$ is minimized by choosing $\xi$ to be the committor function $\xi^\star(x) = \mathbb{P}_x(\tau_B < \tau_A) = \mathbb{P}(\tau_B < \tau_A \mid X_0 = x)$. Note that we are estimating $p = \xi^\star(x_0)$, so this is not very useful in practice. This observation is the general rule in variance reduction strategies.
SLIDE 29 Efficiency analysis
Simplification: $k = 1$, $\varphi = 1$. Then $\hat\psi = \hat p$ is an estimator of $p = \mathbb{P}(\tau_B < \tau_A)$.
Theorem (Cérou, Delyon, Guyader, Rousset, preprint 2016)
When $n_{rep} \to \infty$, a Central Limit Theorem holds: $\sqrt{n_{rep}}(\hat p - p) \underset{n_{rep}\to\infty}{\longrightarrow} \mathcal{N}(0, \sigma^2(\xi))$.
Moreover, $\sigma^2(\xi) \in [-p^2\log(p),\ 2p(1-p)]$. The asymptotic variance $\sigma^2(\xi)$ is minimized by choosing $\xi$ to be the committor function $\xi^\star(x) = \mathbb{P}_x(\tau_B < \tau_A) = \mathbb{P}(\tau_B < \tau_A \mid X_0 = x)$. Note that we are estimating $p = \xi^\star(x_0)$, so this is not very useful in practice. This observation is the general rule in variance reduction strategies.
When $k = 1$, with $\xi = \xi^\star$, and neglecting time discretization issues, proofs of such results follow from elementary arguments (the ideal case):
◮ $\hat p = \left(1 - \frac{1}{n_{rep}}\right)^{Q_{iter}}$,
◮ $Q_{iter}$ follows a Poisson distribution with parameter $-n_{rep}\log(p)$.
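This ideal case can be checked numerically: since $\mathbb{E}[t^Q] = e^{\lambda(t-1)}$ for $Q \sim \mathrm{Poisson}(\lambda)$, taking $t = 1 - 1/n_{rep}$ and $\lambda = -n_{rep}\log(p)$ gives exactly $p$ (unbiasedness). A quick sketch, with illustrative values:

```python
import numpy as np

rng = np.random.default_rng(4)
p, n = 1e-4, 100
Q = rng.poisson(-n * np.log(p), size=1_000_000)   # Q_iter ~ Poisson(-n log p)
p_hat = (1 - 1 / n) ** Q                          # ideal-case AMS estimator
# E[t^Q] = exp(lambda*(t - 1)) with t = 1 - 1/n equals p exactly: unbiased.
print(f"mean of p_hat: {p_hat.mean():.3e}   target p: {p:.3e}")
```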
SLIDE 30 Numerical simulations
Practical consequences of the unbiasedness:
◮ the algorithm is easy to parallelize: a large number $N$ of independent Monte-Carlo realizations, each with a "small" number $n_{rep}$ of interacting replicas;
◮ comparing the results obtained with different reaction coordinates $\xi$ helps gain confidence in the estimation (overlap of confidence intervals).
Main numerical results:
◮ Test case in dimension 1: importance of a correct implementation dealing with equalities of scores, for discrete-time Markov chains. We have checked unbiasedness, changing $n_{rep}$ and $k$.
◮ Test cases in dimension 2: behaviour of the statistical error when the reaction coordinate $\xi$ changes, at fixed $n_{rep}$. Apparent bias phenomenon [Glasserman, Heidelberger, Shahabuddin, Zajic, 1998]. This is not due to the smallness of the probability, but to the presence of multiple channels.
◮ Test case in infinite dimension: the Allen-Cahn stochastic PDE.
SLIDE 31 Equalities of scores: example (nrep = 4)
The level $Z^{(q)}$ is given by the green replica. The replica selected at the splitting step is the blue replica.
SLIDE 32 Equalities of scores: example (nrep = 4)
The partial resampling step produces the black replica.
The blue and the black replicas have the same score.
SLIDE 33 1D example: Brownian drift
[Rolland, Simonnet 2015], [B. et al. 2015, ESAIM Proc.]
Brownian drift dynamics: $dX_t = -dt + \sqrt{2\beta^{-1}}\,dW_t$.
Euler-Maruyama discretization, $\Delta t = 0.1$: $X_{i+1} = X_i - \Delta t + \sqrt{2\beta^{-1}\Delta t}\,w_i$.
Initial condition: $X_0 = 1$. Estimated probability: $p = \mathbb{P}(\tau_{1.9} < \tau_{0.1})$,
where $\tau_{1.9} = \inf\{i \geq 1,\ X_i > 1.9\}$ and $\tau_{0.1} = \inf\{i \geq 1,\ X_i < 0.1\}$.
Empirical average over $N = 6\cdot 10^6$ realizations of the AMS algorithm: $\bar p_N = \frac{1}{N}\sum_{m=1}^{N} \hat p^{n,k}_m$.
Relative error: $\epsilon_N$.
SLIDE 34
Numerical results: β = 8
AMS + Monte-Carlo, N = 6.106: n 10 50 50 100 k 1 1 10 1 pN 3.597 10−4 3.596 10−4 3.596 10−4 3.597 10−4 ǫN 0.09 % 0.03 % 0.03 % 0.02 % Crude Monte-Carlo experiment N′ = 6.108: pCMC
N′
= 3.595 10−4 , ǫN′ = 0.2%. With natural but biased versions (k = 1) version AMS(+) AMS(+) AMS(−) n 10 100 100 pN 2.958 10−4 3.257 10−4 1.741 10−4
SLIDE 35
Numerical results, β = 24
AMS + Monte-Carlo, N = 6.106: n 10 50 50 100 k 1 1 10 1 pN 1.20 10−10 1.22 10−10 1.21 10−10 1.21 10−10 ǫN 6 % 0.6 % 0.8 % 0.2 % No crude Monte-Carlo reference. With natural but biased versions (k = 1) version AMS(+) AMS(+) AMS(−) n 10 100 100 pN 5.079 10−11 6.049 10−11 1.397 10−12 Taking care of equalities of scores is crucial to get unbiased estimators. With an unappropriate implementation: bias, with several orders of magnitude.
SLIDE 36 2D example: potential function $V : \mathbb{R}^2 \to \mathbb{R}$, with multiple reactive trajectories
[Park, Sener, Lu, Schulten, 2003], [Metzner, Schütte and Vanden-Eijnden, 2006], [Cérou, Guyader, Lelièvre, Pommier 2011], [Rolland, Simonnet 2015]
[Figure: 2D potential V with wells A, B and C]
◮ Dynamics: $dX_t = -\nabla V(X_t)\,dt + \sqrt{2\beta^{-1}}\,dW_t$ (discrete version: $h = 0.05$).
◮ A = small ball centered at $x_A = (-1, 0)$ = front deepest energy well.
◮ B = small ball centered at $x_B = (+1, 0)$ = back deepest energy well.
◮ The minimal energy path visits C = side energy well.
SLIDE 37 2D example
[Figure: 2D potential V with wells A, B and C]
Three examples of reaction coordinates:
- 1. $\xi_1(x, y) = x$ = abscissa,
- 2. $\xi_2(x, y) = -\|(x, y) - x_B\|$ = (minus the) norm to the final point B,
- 3. $\xi_3(x, y) = \|(x, y) - x_A\|$ = norm to the initial point A.
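Written as plain functions (a small sketch of mine, with $x_A$, $x_B$ as above), these can be passed as the `xi` argument of the AMS sketch shown earlier:

```python
import numpy as np

x_A, x_B = np.array([-1.0, 0.0]), np.array([1.0, 0.0])

xi_1 = lambda x: x[0]                          # 1. abscissa
xi_2 = lambda x: -np.linalg.norm(x - x_B)      # 2. minus the distance to B
xi_3 = lambda x: np.linalg.norm(x - x_A)       # 3. distance to A
```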
SLIDE 38 Monte-Carlo experiment, 2D example
Estimation of $p = \mathbb{P}(\tau_B < \tau_A)$. AMS algorithm: $n_{rep} = 100$, $k = 1$. Each independent Monte-Carlo realization provides $\hat p_m$, for $1 \leq m \leq N_{max} = 10^6$.
The empirical average over $N \leq N_{max}$ realizations: $\bar p_N = \frac{1}{N}\sum_{m=1}^{N} \hat p_m$.
Confidence interval: $[\bar p_N - \delta_N, \bar p_N + \delta_N]$, with empirical half-width $\delta_N = \frac{1.96}{\sqrt{N}} \sqrt{\frac{1}{N}\sum_{m=1}^{N} (\hat p_m)^2 - (\bar p_N)^2}$.
We represent the evolution of $\bar p_N - \delta_N$, $\bar p_N$, $\bar p_N + \delta_N$ as $1 \leq N \leq N_{max}$ varies.
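A sketch of this computation (mine), given the array of i.i.d. AMS realizations $\hat p_m$:

```python
import numpy as np

def running_confidence_band(p_hats):
    """Running empirical average bar_p_N and 95% half-width delta_N."""
    p_hats = np.asarray(p_hats, dtype=float)
    N = np.arange(1, len(p_hats) + 1)
    mean = np.cumsum(p_hats) / N                      # bar_p_N
    second_moment = np.cumsum(p_hats**2) / N
    delta = 1.96 / np.sqrt(N) * np.sqrt(np.maximum(second_moment - mean**2, 0.0))
    return mean, delta

# Usage: plot mean - delta, mean, mean + delta against N (1 <= N <= N_max).
```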
SLIDE 39 Numerical results, 2D example
[Figure: evolution of the confidence intervals for the three reaction coordinates (abscissa, norm to final point, norm to initial point), $1 \leq N \leq 6\cdot 10^6$; $\beta = 9.33$.]
SLIDE 40 Numerical results, 2D example
[Figure: evolution of the confidence intervals for the three reaction coordinates (abscissa, norm to final point, norm to initial point), $1 \leq N \leq 6\cdot 10^6$; $\beta = 10$.]
SLIDE 41 Numerical results, 2D example
[Figures: confidence intervals for the three reaction coordinates (abscissa, norm to final point, norm to initial point); $\beta = 9.33$ (left) and $\beta = 10$ (right).]
Interpretation:
◮ For N sufficiently large, the confidence intervals overlap.
◮ For N too small, this may fail: the reaction coordinate affects the fluctuations of the empirical average $\bar p_N$ around $p$. Terminology: "apparent bias". This happens when:
◮ multiple pathways exist to go from A to B;
◮ for given $\xi$ and $n_{rep}$, for most realizations, all replicas active at the end go through only one of the two channels;
◮ one of these scenarios is rare;
◮ the values of $\hat p$ associated with each of these two scenarios are very different.
◮ Practical consequence of the unbiasedness result: changing the reaction coordinate is essential to gain confidence in the numerical results.
SLIDE 42 Going further: Allen-Cahn SPDE (B. et al. 2015, Bouchet et al. 2015)
$\frac{\partial u}{\partial t} = \gamma\,\frac{\partial^2 u}{\partial x^2} - \nabla\mathcal{V}(u) + \sqrt{2\epsilon}\,\dot W(t, x)$
Potential function: $\mathcal{V}(u) = \frac{u^4}{4} - \frac{u^2}{2}$. Reaction coordinate: $\xi(u) = \int_0^1 u(x)\,dx$.
The shape of the reactive trajectories from $u = -1$ to $u = +1$ depends on $\gamma$. Left: $\gamma = 0.5$. Right: $\gamma = 0.05$. $\beta = 300$, discretization in dimension 500.
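A possible finite-difference discretization of this SPDE (a sketch under assumptions of mine: Neumann boundary conditions, $\epsilon = 1/\beta$, explicit Euler with the standard space-time white noise scaling, variance $\Delta t/\Delta x$ per grid cell):

```python
import numpy as np

def allen_cahn_step(u, dt, dx, gamma, eps, rng):
    """One explicit Euler step for du = (gamma*u_xx - (u^3 - u)) dt + sqrt(2 eps) dW.

    Stability of the explicit scheme requires dt <= dx^2 / (2*gamma).
    """
    lap = np.empty_like(u)
    lap[1:-1] = (u[2:] - 2 * u[1:-1] + u[:-2]) / dx**2
    lap[0] = (u[1] - u[0]) / dx**2              # Neumann boundary (assumed)
    lap[-1] = (u[-2] - u[-1]) / dx**2
    noise = np.sqrt(2 * eps * dt / dx) * rng.standard_normal(u.size)
    return u + dt * (gamma * lap - (u**3 - u)) + noise

rng = np.random.default_rng(5)
M = 500                                         # discretization dimension (slide)
dx, dt = 1.0 / M, 1e-6                          # dt below the stability threshold
u = -np.ones(M)                                 # start in the metastable state u = -1
for _ in range(10_000):
    u = allen_cahn_step(u, dt, dx, gamma=0.5, eps=1 / 300.0, rng=rng)
print(f"reaction coordinate xi(u) = mean(u) = {u.mean():.3f}")
```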
SLIDE 43 Apparent bias phenomenon in a related 2D model
Figure: Potential $V_\gamma$. Left: $\gamma = 1$. Right: $\gamma = 0.1$.
Simulation using AMS with 4 reaction coordinates. In the two-channel case (right), one of them breaks the symmetry of the channels: $\xi(x, y) = x$ (abscissa).
SLIDE 44 Apparent bias phenomenon in a related 2D model
◮ With one channel (left): good behavior, even for an extremely small $p \approx 10^{-9}$.
◮ The apparent bias phenomenon appears only when there are at least two channels.
[Figures: confidence intervals for the four reaction coordinates (abscissa, norm to final point, norm to initial point, magnetisation).]
Figure: Parameters: $n_{rep} = 100$, $k = 1$, $\beta = 80$. Left: $\gamma = 1$. Right: $\gamma = 0.1$.
SLIDE 45 Application in physics: estimation of return times
Collaboration with F. Bouchet, C. Herbert, T. Lestang, F. Ragone.
Time-series of a stationary ergodic process:
[Figure: time series $A(t)$ of a stationary ergodic process (vertical scale in units of $\sigma$).]
Rare event ⇒ (approximately) Poisson statistics.
The (average) return time $r(a)$ between two events "crossing the threshold $a$" satisfies
$p_T(a) = \mathbb{P}\big(\max_{0 \leq t \leq T} X(t) > a\big) \approx \frac{T}{r(a)}$,
for $\tau_c \leq T \ll r(a)$: we consider block maxima.
SLIDE 46 Application in physics: estimation of return times
Collaboration with F. Bouchet, C. Herbert, T. Lestang, F. Ragone.
Time-series of a stationary ergodic process:
[Figure: time series $A(t)$ of a stationary ergodic process (vertical scale in units of $\sigma$).]
Rare event ⇒ (approximately) Poisson statistics.
The (average) return time $r(a)$ between two events "crossing the threshold $a$" satisfies
$p_T(a) = \mathbb{P}\big(\max_{0 \leq t \leq T} X(t) > a\big) \approx \frac{T}{r(a)}$,
for $\tau_c \leq T \ll r(a)$: we consider block maxima.
Estimator: $\hat r(a) = -T/\log(1 - \hat p_T(a))$. Not unbiased: $\mathbb{E}[\hat r(a)] \neq r(a)$, but consistent.
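A sketch of the block-maxima estimator (my illustration, direct-sampling version; the same formula applies when $\hat p_T(a)$ comes from AMS). The OU parameters follow the next slide, but the threshold here is a moderate $a = 2$, reachable by direct sampling (the slide's $a = 5$ requires AMS):

```python
import numpy as np

def return_time(series, dt, T_block, a):
    """Block-maxima estimator r_hat(a) = -T / log(1 - p_hat_T(a))."""
    n_per_block = int(T_block / dt)
    n_blocks = len(series) // n_per_block
    blocks = series[: n_blocks * n_per_block].reshape(n_blocks, n_per_block)
    p_hat = np.mean(blocks.max(axis=1) > a)   # fraction of blocks exceeding a
    if p_hat <= 0.0:
        return np.inf                          # no exceedance observed
    return -T_block / np.log(1.0 - p_hat)

# Ornstein-Uhlenbeck sample path dX = -X dt + sigma dW, sigma = 1/sqrt(2):
rng = np.random.default_rng(6)
dt, sigma, n = 0.01, 1 / np.sqrt(2), 2_000_000
x = np.empty(n)
x[0] = 0.0
for i in range(n - 1):
    x[i + 1] = x[i] * (1 - dt) + sigma * np.sqrt(dt) * rng.standard_normal()
print(f"r(2.0) approx {return_time(x, dt, T_block=50.0, a=2.0):.0f}")
```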
SLIDE 47 Application in physics: estimation of return times
Collaboration with F. Bouchet, C. Herbert, T. Lestang, F. Ragone.
Idea: apply the AMS algorithm to estimate $\hat p(a)$. More generally (justified by the Theorem above):
◮ one (or a few) realization(s) of the AMS algorithm gives estimators $\hat p(x), \hat r(x)$ of $p(x), r(x)$ for all $x \leq a$;
◮ instead of fixing $a$, fix a maximum return time $r = R$.
Example of plot:
[Figure: return time $r(a)$ versus threshold $a$ (in units of $\sigma$); AMS, theory, and direct sampling.]
Dynamics: Ornstein-Uhlenbeck, $dX_t = -X_t\,dt + \sigma\,dW(t)$, $\sigma = \frac{1}{\sqrt{2}}$.
Parameters: 100 replicas in AMS, 100 realizations, $a = 5$, $T = 5\tau_c$.
SLIDE 48
Application in physics: estimation of return times
Collaboration with F. Bouchet, C. Herbert, T. Lestang, F. Ragone.
Yet another plot:
[Figure: return time $r(a)$ versus threshold $a$; GKTL, AMS, and direct sampling.]
For time-averaged positions $\bar X_\theta(t) = \frac{1}{\theta}\int_t^{t+\theta} X(s)\,ds$.
Parameters for AMS: 100 replicas, $a = 2$, $T = 50$. Comparison with the Giardina-Kurchan-Tailleur-Lecomte algorithm. Cost: $O(10^6 \tau_c)$. Cost of the direct numerical simulation: a time series of length $10^9 \tau_c$.
SLIDE 49 Conclusion
Discussed today:
◮ Unbiased estimators in a Markov chain setting.
◮ Practical consequences:
  ◮ We can choose different numbers of replicas and different reaction coordinates without changing the expectation of the estimator.
  ◮ We can interpret and deal with the apparent bias phenomenon.
◮ An application: estimation of return times.
Main reference:
C.-E. Bréhier, M. Gazeau, L. Goudenège, T. Lelièvre, and M. Rousset. Unbiasedness of some generalized adaptive multilevel splitting algorithms.
- Ann. Appl. Probab., 26(6):3559–3601, 2016.
Perspectives:
◮ Provide an efficiency analysis in the general framework.
◮ Propose new algorithms which fit into the general framework, and new applications.
Thanks for your attention.