SLIDE 1

Simulation of rare events by Adaptive Multilevel Splitting algorithms Charles-Edouard Bréhier

CNRS & Institut Camille Jordan, Université Lyon 1

Joint works with

  • M. Gazeau (Borealis AI, Toronto), L. Goudenège (CNRS, Centrale Paris),
  • T. Lelièvre (École des Ponts, Paris), M. Rousset (Inria, Rennes),
  • F. Bouchet, C. Herbert, T. Lestang (ENS Lyon, Physics Dept), F. Ragone (Department of Earth and Environmental Sciences, Milano).

SLIDE 2

Motivations: rare events in metastable dynamics

Goal: simulation of metastable stochastic processes = small noise perturbations of dynamics having multiple stationary points.

SLIDE 3

Motivations: rare events in metastable dynamics

Examples:

◮ SDE: $dX_t = -\nabla V(X_t)\,dt + \sqrt{2\epsilon}\,dW_t$

Double-well potential: $V(x) = \frac{x^4}{4} - \frac{x^2}{2}$.
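As an illustration, here is a minimal Euler-Maruyama sketch of this double-well SDE (my own illustration, not from the original slides; the step size and horizon are arbitrary choices):

```python
import numpy as np

def simulate_double_well(x0=-1.0, eps=0.1, dt=1e-3, n_steps=100_000, seed=0):
    """Euler-Maruyama scheme for dX_t = -V'(X_t) dt + sqrt(2*eps) dW_t,
    with the double-well potential V(x) = x^4/4 - x^2/2, so V'(x) = x^3 - x."""
    rng = np.random.default_rng(seed)
    x = np.empty(n_steps + 1)
    x[0] = x0
    for i in range(n_steps):
        drift = -(x[i] ** 3 - x[i])  # -V'(x)
        x[i + 1] = x[i] + drift * dt + np.sqrt(2 * eps * dt) * rng.standard_normal()
    return x
```

For small $\epsilon$ the path stays near one well for very long stretches: this metastability is exactly what makes direct simulation of transitions prohibitive.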

SLIDE 4

Motivations: rare events in metastable dynamics

Examples (continued):

◮ SPDE: Allen-Cahn equation

$\frac{\partial u(t,x)}{\partial t} = \gamma \Delta u(t,x) - \nabla V(u(t,x)) + \sqrt{2\epsilon}\,\xi(t,x)$

SLIDE 5

Transitions between metastable states

One goal is to estimate $p = \mathbb{P}_{x_0}(\tau_B < \tau_A)$, where A and B are metastable states.

[Figure: the two metastable states A and B.]

◮ Direct numerical simulations: prohibitive, one needs $N \sim C\,p^{-1}$ independent realizations.

◮ Analytic approach: Large Deviations, Potential Theory. In the limit $\epsilon \to 0$: $\epsilon \log(p(\epsilon)) \underset{\epsilon \to 0}{\to} -C$ and $p(\epsilon) \underset{\epsilon \to 0}{\sim} c\, e^{-C/\epsilon}$. Transition times are of order $e^{C/\epsilon}$.

SLIDE 6

Splitting algorithms

Introduced in the 1950s:

  • H. Kahn and T. E. Harris. Estimation of particle transmission by random sampling. National Bureau of Standards, 12:27–30, 1951.

SLIDE 7

Splitting algorithms

Revival in the 1990s and 2000s:

  • P. Glasserman, P. Heidelberger, P. Shahabuddin, and T. Zajic. Multilevel splitting for estimating rare event probabilities. Oper. Res., 47(4):585–600, 1999.

with many variants:

  • M. Villén-Altamirano and J. Villén-Altamirano. RESTART: a method for accelerating rare event simulations. In Proceedings of the Thirteenth International Teletraffic Congress (ITC-13 Workshops: Queueing, Performance and Control in ATM), Copenhagen, Denmark, June 1991, pages 71–76. North-Holland, Amsterdam-New York, 1991.

  • S. K. Au and J. L. Beck. Estimation of small failure probabilities in high dimensions by subset simulation. Probabilistic Engineering Mechanics, 16(4):263–277, 2001.

  • J. Skilling. Nested sampling for general Bayesian computation. Bayesian Anal., 1(4):833–859, 2006.

  • P. Del Moral, A. Doucet, and A. Jasra. Sequential Monte Carlo samplers. J. R. Stat. Soc. Ser. B Stat. Methodol., 68(3):411–436, 2006.
SLIDE 8

Adaptive Multilevel Splitting algorithms

Introduced in

  • F. Cérou and A. Guyader. Adaptive multilevel splitting for rare event analysis. Stoch. Anal. Appl., 25(2):417–443, 2007.

Analysis of the method: a series of recent works (non-exhaustive):

  • E. Simonnet. Combinatorial analysis of the adaptive last particle method. Statistics and Computing, pages 1–20, 2014.
  • C.-E. Bréhier, T. Lelièvre, and M. Rousset. Analysis of adaptive multilevel splitting algorithms in an idealized setting. ESAIM Probab. Stat., 19:361–394, 2015.
  • F. Cérou and A. Guyader. Fluctuation analysis of adaptive multilevel splitting. Ann. Appl. Probab., 26(6):3319–3380, 2016.
  • C.-E. Bréhier, M. Gazeau, L. Goudenège, T. Lelièvre, and M. Rousset. Unbiasedness of some generalized adaptive multilevel splitting algorithms. Ann. Appl. Probab., 26(6):3559–3601, 2016.
SLIDE 9

Applications of AMS algorithms

  • J. Rolland and E. Simonnet. Statistical behaviour of adaptive multilevel splitting algorithms in simple models. Journal of Computational Physics, 283:541–558, 2015.
  • D. Aristoff, T. Lelièvre, C. G. Mayne, and I. Teo. Adaptive multilevel splitting in molecular dynamics simulations. In CEMRACS 2013—Modelling and Simulation of Complex Systems: Stochastic and Deterministic Approaches, volume 48 of ESAIM Proc. Surveys, pages 215–225. EDP Sci., Les Ulis, 2015.

Several recently defended and ongoing PhD theses: R. Poncet (Polytechnique, 10/2017), H. Louvin (CEA, 10/2017), L. J. Lopes (Ponts-Sanofi), T. Lestang (ENS Lyon, Physics).

  • H. Louvin, E. Dumonteil, T. Lelièvre, M. Rousset, and C. M. Diop. Adaptive multilevel splitting for Monte Carlo particle transport. In EPJ Web of Conferences, volume 153, page 06006. EDP Sciences, 2017.
  • L. J. Lopes, C. Chipot, and T. Lelièvre. Adaptive multilevel splitting method: isomerization of the alanine dipeptide. 2017.

Collaboration with F. Bouchet, C. Herbert, T. Lestang, F. Ragone.

SLIDE 10

Principle of splitting algorithms

Principles: iterative algorithm

◮ System of interacting replicas: copies of the Markov trajectories are simulated (in parallel).

◮ Adaptive splitting: selection of replicas using a score function.

◮ Partial resampling: new copies are simulated by branching replicas (mutation).

◮ Weighting: remain consistent when removing/creating replicas.

An important property: the (Markov) model is a black box.

Simpler setting: estimation of $p = \mathbb{P}(Y > a)$, with

◮ $Y$ a positive real-valued random variable (with continuous cdf),
◮ $a$ a threshold.

SLIDE 11

General notions: estimators, consistency, efficiency

An estimator of $\theta \in \mathbb{R}$ is a random variable $\hat\theta$.

SLIDE 12

General notions: estimators, consistency, efficiency

About consistency:

◮ The bias of the estimator is defined as $b(\hat\theta) = \mathbb{E}[\hat\theta] - \theta$.

◮ We say that $\hat\theta$ is an unbiased estimator of $\theta$ if $b(\hat\theta) = 0$.

◮ Let $(\hat\theta_N)_{N \in \mathbb{N}}$ be a sequence of estimators of $\theta$. It is called consistent if $\hat\theta_N \to \theta$ in probability as $N \to \infty$. It is called asymptotically unbiased if $b(\hat\theta_N) \to 0$ as $N \to \infty$.

SLIDE 13

General notions: estimators, consistency, efficiency

About efficiency: the mean-square error satisfies

$\mathbb{E}|\hat\theta - \theta|^2 = \big(\mathbb{E}[\hat\theta] - \theta\big)^2 + \mathbb{E}|\hat\theta - \mathbb{E}\hat\theta|^2 = b(\hat\theta)^2 + \mathrm{Var}(\hat\theta)$,

the sum of the bias squared (consistency error) and the variance (statistical error).

Example: $\hat m_N = \frac{1}{N}\sum_{n=1}^{N} Z_n$, with $(Z_n)_{n \in \mathbb{N}}$ i.i.d., is a consistent unbiased estimator of $m = \mathbb{E}[Z_1]$.

Statistical error: $\mathrm{Var}(\hat m_N) = \frac{\mathrm{Var}(Z_1)}{N}$. More precisely: Central Limit Theorem.

SLIDE 14

The rare event case

Estimation of $p = \mathbb{P}(X \in A) = \mathbb{E}[\mathbf{1}_{X \in A}]$?

Direct approach: crude Monte-Carlo simulation. Take $Z = \mathbf{1}_{X \in A}$ and i.i.d. realizations $Z_1, \ldots, Z_n = \mathbf{1}_{X_n \in A}, \ldots$, and compute the empirical average

$\hat p_N = \frac{1}{N}\sum_{n=1}^{N} Z_n = \frac{1}{N}\sum_{n=1}^{N} \mathbf{1}_{X_n \in A}$.

Then $\hat p_N$ is a consistent unbiased estimator of $p = \mathbb{E}[Z]$. Relative error:

$\frac{\sqrt{\mathbb{E}|\hat p_N - p|^2}}{p} = \frac{\sqrt{p(1-p)}}{p\sqrt{N}} \underset{p \to 0}{\sim} \frac{1}{\sqrt{pN}}$.
SLIDE 15

The rare event case

Variance reduction strategy: find a family of simulatable estimators $\hat\theta$ to reduce the cost. Requirements (to also remain unbiased):

$\mathrm{Var}(\hat\theta) \ll \mathrm{Var}(\hat p_N)$ and $\mathbb{E}[\hat\theta] = p$,

where simulations of $\hat\theta$ and $\hat p_N$ have the same average cost.
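To make the crude Monte-Carlo baseline concrete, here is a minimal sketch (not from the slides; `sample_indicator` is a hypothetical user-supplied sampler returning one realization of $\mathbf{1}_{X \in A}$):

```python
import numpy as np

def crude_monte_carlo(sample_indicator, N, seed=0):
    """Crude Monte-Carlo estimator of p = E[1_{X in A}], together with the
    estimated relative error sqrt(p(1-p)/N)/p ~ 1/sqrt(p*N) for small p."""
    rng = np.random.default_rng(seed)
    z = np.fromiter((sample_indicator(rng) for _ in range(N)), dtype=float, count=N)
    p_hat = z.mean()
    rel_err = np.sqrt(p_hat * (1 - p_hat) / N) / p_hat if p_hat > 0 else np.inf
    return p_hat, rel_err
```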

SLIDE 16

Splitting algorithm

Case study: $p = \mathbb{P}(Y > a)$, with $Y > 0$. Splitting:

$p = \prod_{j=1}^{N} \mathbb{P}(Y > a_j \,|\, Y > a_{j-1}) = \prod_{j=1}^{N} p_j$,

with a non-decreasing sequence of levels $0 = a_0 < a_1 < \ldots < a_{N-1} < a_N = a$.

Estimator: $\hat p = \prod_{j=1}^{N} \hat p_{j,n_{\mathrm{rep}}}$ ($N$ independent estimators, with $n_{\mathrm{rep}}$ independent replicas per level) is unbiased.

Optimal choice of the levels $a_j$: such that $p_1 = \ldots = p_N = p^{1/N}$. Relative error, as $N \to +\infty$: $\sim \sqrt{|\log(p)|}/\sqrt{N}$.
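A toy instance of this product decomposition (my sketch, not the slides' code): for $Y \sim \mathrm{Exp}(1)$, the conditional law of $Y$ given $Y > z$ is $z + \mathrm{Exp}(1)$ by the memoryless property, so each factor $p_j$ can be estimated independently; equally spaced levels are then optimal, since $p_j = e^{-a/N}$ for every $j$:

```python
import numpy as np

def fixed_level_splitting_exp(a, n_levels, n_rep, seed=0):
    """Fixed-level splitting estimator of p = P(Y > a) for Y ~ Exp(1),
    with the optimal equally spaced levels a_j = j * a / n_levels."""
    rng = np.random.default_rng(seed)
    levels = np.linspace(0.0, a, n_levels + 1)
    p_hat = 1.0
    for j in range(n_levels):
        # memoryless property: Y | Y > a_j has the law a_j + Exp(1)
        y = levels[j] + rng.exponential(size=n_rep)
        p_hat *= np.mean(y > levels[j + 1])
    return p_hat  # unbiased: product of independent unbiased factors
```

The exact value $p = e^{-a}$ is available in this toy case, which makes it convenient for checking the error behaviour numerically.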

SLIDE 17

Splitting algorithm

Adaptive version: compute a random sequence $Z^{(0)}, Z^{(1)}, \ldots$ of levels, such that at each step $p_j = 1 - \frac{k}{n_{\mathrm{rep}}}$.
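The adaptive version in the same toy case (a sketch under the same $Y \sim \mathrm{Exp}(1)$ assumption; the levels are now the random $k$-th order statistics of the scores):

```python
import numpy as np

def adaptive_splitting_exp(a, n_rep=100, k=1, seed=0):
    """Adaptive splitting estimator of p = P(Y > a), Y ~ Exp(1): at each step
    the level is the k-th smallest score, and killed replicas are resampled
    conditionally above it (here via the memoryless property)."""
    rng = np.random.default_rng(seed)
    y = rng.exponential(size=n_rep)
    estimate = 1.0
    while True:
        z = np.sort(y)[k - 1]               # current (random) level
        if z > a:
            break
        killed = y <= z                     # kill every score <= level
        n_killed = int(killed.sum())
        if n_killed == n_rep:               # extinction
            return 0.0
        estimate *= 1.0 - n_killed / n_rep  # factor 1 - K/n_rep
        y[killed] = z + rng.exponential(size=n_killed)
    return estimate * np.mean(y > a)
```

Averaging many independent realizations should fluctuate around $p = e^{-a}$, consistently with the unbiasedness results discussed below.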

SLIDE 18

The Markov chain setting

Model: $X = (X_t)_{t \in \mathbb{N}}$ is a Markov chain on a state space $S$.

(Example: $X_{n+1} = X_n - h \nabla V(X_n) + \sqrt{2\beta^{-1}h}\, w_n$, the explicit Euler-Maruyama discretization of the Stochastic Differential Equation $dX_t = -\nabla V(X_t)\,dt + \sqrt{2\beta^{-1}}\,dW_t$.)

Goal: estimate with Monte-Carlo simulation the average $\mathbb{E}[\psi(X_t, t \geq 0)]$, with $\psi(X_t, t \geq 0) = \mathbf{1}_{\tau_B < \tau_A}\, \varphi(X_t, t \geq 0)$, where $\tau_C = \inf\{t \,|\, X_t \in C \subset S\}$, $C = A$ or $B$ (e.g. metastable states).

Rare event regime: $\mathbb{P}(\tau_B < \tau_A) \ll 1$.

[Figure: the two metastable states A and B.]

SLIDE 19

(Generalized) Adaptive Multilevel Splitting: the abstract setting

Goal: build unbiased estimators of $\mathbb{E}[\mathbf{1}_{\tau_B < \tau_A}\, \varphi(X_t, t \geq 0)]$.

Key tool: a given reaction coordinate $\xi : S \to \mathbb{R}$. Only requirement: there is a $z_{\max} \in \mathbb{R}$ such that $B \subset \{\xi(\cdot) > z_{\max}\}$.

Terminology: $z = \xi(x)$ is called the level of the state $x \in S$; $\mathrm{score}(X_t, t \geq 0) := \sup_{t \leq \tau_A} \xi(X_t)$ is the score of the trajectory $(X_t)_{t \geq 0}$.

SLIDE 20

(Generalized) Adaptive Multilevel Splitting: the abstract setting

Algorithm: based on

◮ Multiple replicas: several copies of the Markov chain are simulated in parallel.

◮ Adaptive splitting: splitting is performed for the replicas with highest scores (selection of replicas).

◮ Partial re-sampling: trajectories of new replicas are re-sampled partially (branching) forward in time (mutation of replicas).

Formally: a kernel $(z, x) \in \mathbb{R} \times S^N \mapsto \pi_z(x, \cdot) \in \mathrm{Proba}(S^N)$.

SLIDE 21

(Generalized) Adaptive Multilevel Splitting: algorithm

Parameters (the picture's parameters are shown in red):

◮ $n_{\mathrm{rep}}(= 3)$: number of replicas.
◮ $k(= 1) \in \{1, \ldots, n_{\mathrm{rep}} - 1\}$: rank of the order statistic.

Initialization: $q = 0$

◮ Sample $n_{\mathrm{rep}}$ i.i.d. replicas $(X^{(n,0)})_{n \in I^{(0)}}$, with labels $I^{(0)} = \{1, \ldots, n_{\mathrm{rep}}\}$, according to the given Markov transitions.

◮ Compute $Z^{(0)} :=$ the $k$-th order statistic of the sample scores.

◮ Set: active replicas $= I^{(0)}_{>Z^{(0)}} = \{n \in I^{(0)} \,|\, s(X^{(n)}) > Z^{(0)}\}$, the labels of replicas with score strictly higher than $Z^{(0)}$.

[Figure: replica trajectories between A and B, with passive and active replicas relative to the level $Z^{(0)}$.]

SLIDE 22

(Generalized) Adaptive Multilevel Splitting: algorithm

Iterate on q, starting from q = 0.

Iteration, step 0: stopping criterion

◮ If $Z^{(q)} > z_{\max}$, or if $I^{(q)}_{>Z^{(q)}} = \emptyset$ (extinction), stop and set $Q_{\mathrm{iter}} = q$.

Iteration, step 1: splitting

◮ Create independently $K_{q+1} = n_{\mathrm{rep}} - \mathrm{card}(I^{(q)}_{>Z^{(q)}})\,(= 1)$ new replicas, with states chosen uniformly among the active replicas, i.e. those with labels in $I^{(q)}_{>Z^{(q)}}$. Thus $n_{\mathrm{rep}}(= 3)$ replicas are now active.

◮ Denote by $I^{(q+1)}$ the new total set of labels (active and passive).

[Figures: before and after splitting; a new replica is branched from an active one.]

SLIDE 23

(Generalized) Adaptive Multilevel Splitting: algorithm

Iteration, step 2: partial re-sampling of trajectories. For each newly created replica, independently:

◮ Start from the level entry state $X_{\tau_{Z^{(q)}}}$, where $\tau_z := \inf(t \geq 0 \,|\, \xi(X_t) > z)$.

◮ Complete the trajectory using the Markov chain transition.

◮ Re-index replicas with q + 1 instead of q.

[Figure: the new replica branches at the level entry state and is completed forward in time.]

SLIDE 24

(Generalized) Adaptive Multilevel Splitting: algorithm

Iteration, step 3: level computation

◮ Define $Z^{(q+1)}$ as the $k(= 1)$-th order statistic of the scores of the active and new replicas.

◮ Update the set of active replicas: $I^{(q+1)}_{>Z^{(q+1)}} = \{n \in I^{(q+1)} \,|\, s(X^{(n)}) > Z^{(q+1)}\}$.

[Figure: the level is updated from $Z^{(q)}$ to $Z^{(q+1)}$.]

Update q := q + 1.
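Putting the initialization and the three iteration steps together, here is a compact runnable sketch of one AMS realization for a discrete-time Markov chain (my own illustration of the algorithm just described, not the authors' code; `step`, `in_A`, `in_B` and `xi` are hypothetical user-supplied callables):

```python
import numpy as np

def run_until_AB(x0, step, in_A, in_B, rng):
    """Simulate the chain from x0 until it enters A or B; return the path."""
    path = [x0]
    while not (in_A(path[-1]) or in_B(path[-1])):
        path.append(step(path[-1], rng))
    return path

def ams_estimate(x0, step, in_A, in_B, xi, z_max, n_rep, k, rng):
    """One realization of the AMS estimator of p = P(tau_B < tau_A)."""
    paths = [run_until_AB(x0, step, in_A, in_B, rng) for _ in range(n_rep)]
    scores = np.array([max(map(xi, p)) for p in paths])
    weight = 1.0                              # running product of the R^(q)
    while True:
        z = np.sort(scores)[k - 1]            # level Z^(q): k-th order statistic
        if z > z_max:                         # stopping criterion
            break
        killed = np.flatnonzero(scores <= z)  # kill ALL scores <= Z^(q) (ties!)
        if killed.size == n_rep:              # extinction: estimator is 0
            return 0.0
        weight *= 1.0 - killed.size / n_rep   # R^(q) = 1 - K_{q+1}/n_rep
        alive = np.flatnonzero(scores > z)
        for i in killed:
            parent = paths[rng.choice(alive)]  # branch a uniform active replica
            # copy the parent until its first entry strictly above level z
            t = next(t for t, x in enumerate(parent) if xi(x) > z)
            prefix = parent[: t + 1]
            paths[i] = prefix + run_until_AB(prefix[-1], step, in_A, in_B, rng)[1:]
            scores[i] = max(map(xi, paths[i]))
    hits_B = np.mean([in_B(p[-1]) for p in paths])
    return weight * hits_B
```

The returned value `weight * hits_B` matches the estimator $\hat\psi$ defined on the next slide (with $\varphi = 1$), and killing all replicas with score $\leq Z^{(q)}$, rather than exactly $k$ of them, is precisely the tie handling that the unbiasedness theorem requires.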

SLIDE 25

Main result: Unbiased estimators

Estimator of $\mathbb{E}[\mathbf{1}_{\tau_B < \tau_A}\, \varphi(X)]$:

$\hat\psi = R^{(0)} \times \ldots \times R^{(Q_{\mathrm{iter}}-1)} \; \frac{1}{n_{\mathrm{rep}}} \sum_{n \in I^{(Q_{\mathrm{iter}})}} \mathbf{1}_{\tau_B^{(n,Q_{\mathrm{iter}})} < \tau_A^{(n,Q_{\mathrm{iter}})}}\, \varphi(X^{(n,Q_{\mathrm{iter}})})$,

where $R^{(q)} := 1 - \frac{K_{q+1}}{n_{\mathrm{rep}}} = \frac{\text{number of active replicas before splitting}}{\text{number of active replicas after splitting}}$.

Key remark: $K_{q+1} \geq k$, but it may happen that $K_{q+1} > k$. Even $K_{q+1} = n_{\mathrm{rep}}$ may happen: extinction, the simulation is stopped and $\hat\psi = 0$.

Theorem (B., Gazeau, Goudenège, Lelièvre, Rousset 2016)

Assume that $Q_{\mathrm{iter}} < +\infty$ almost surely, and that $\varphi : \mathcal{P} \to \mathbb{R}$ is bounded. Then $\hat\psi$ is an unbiased estimator of $\mathbb{E}[\mathbf{1}_{\tau_B < \tau_A}\, \varphi(X)]$, for any choice of the reaction coordinate $\xi$ and of $n_{\mathrm{rep}}$ and $k$:

$\mathbb{E}[\hat\psi] = \mathbb{E}[\mathbf{1}_{\tau_B < \tau_A}\, \varphi(X)]$.
SLIDE 26

Towards proof: a more general statement

More generally: at each iteration q, the algorithm provides an unbiased estimator of $\mathbb{E}[\psi(X)]$, of the form

$\hat\psi^{(q)} = \sum_{n \in I^{(q)}} G^{(n,q)}\, \psi(X^{(n,q)})$,

where

◮ all the replicas computed in the algorithm are taken into account (including passive ones);

◮ the weights have the form $G^{(n,q)} = \frac{1}{n_{\mathrm{rep}}}\, R^{(0)} \cdots R^{(Q(n) \wedge q - 1)}$, where $Q(n)$ is the iteration at which the replica with label n is made passive.

We recover the last theorem from this general statement:

◮ if $\psi = \mathbf{1}_{\tau_B < \tau_A}\, \varphi$, only active replicas can contribute;
◮ $q \to Q_{\mathrm{iter}}$: martingale and stopping time argument.

SLIDE 27

Proof: level = new "time index"

◮ First key point: define the doubly indexed filtration

$\mathcal{F}_{q,z} = \sigma\Big( X_t^{(n,q)},\; t \leq \inf\{t' \,|\, \xi(X_{t'}^{(n,q)}) > \max(z, Z^{(q)})\},\; n \in I^{(q)} \Big)$,

which is all the information on the trajectories at iteration q, up to strict entry above the level $\max(z, Z^{(q)})$.

◮ Second key point: check that at iteration q, all replicas are conditionally independent given $\mathcal{F}_{q,z}$, with distribution given by the Markov transition.

◮ Third key point: define the appropriate martingales

$M_{q,z}(\psi) := \mathbb{E}\Big[ \sum_{n \in I^{(q)}} G^{(n,q)}\, \psi(X^{(n,q)}) \,\Big|\, \mathcal{F}_{q,z} \Big]$

with respect to the above filtration. Using the martingale property gives the unbiasedness of our estimator.

Adaptive case: $Z^{(q)}$ and $Q_{\mathrm{iter}}$ are stopping times.

SLIDE 28

Efficiency analysis

Simplification: $k = 1$, $\varphi = 1$. Then $\hat\psi = \hat p$ is an estimator of $p = \mathbb{P}(\tau_B < \tau_A)$.

Theorem (Cérou, Delyon, Guyader, Rousset, preprint 2016)

When $n_{\mathrm{rep}} \to \infty$, there is a Central Limit Theorem result:

$\sqrt{n_{\mathrm{rep}}}\,(\hat p - p) \underset{n_{\mathrm{rep}} \to \infty}{\to} \mathcal{N}\big(0, \sigma^2(\xi)\big)$.

Moreover, $\sigma^2(\xi) \in [-p^2 \log(p),\, 2p(1-p)]$.

The asymptotic variance $\sigma^2(\xi)$ is minimized by choosing the committor function $\xi(x) = \xi^\star(x) = \mathbb{P}_x(\tau_B < \tau_A) = \mathbb{P}(\tau_B < \tau_A \,|\, X_0 = x)$. Note that we are estimating $p = \xi^\star(x_0)$, so this is not very useful in practice; this observation is the general rule in variance reduction strategies.
SLIDE 29

Efficiency analysis

When $k = 1$, with $\xi = \xi^\star$, and neglecting time discretization issues, proofs of such results follow from elementary arguments (ideal case):

◮ $\hat p = \left(1 - \frac{1}{n}\right)^{Q_{\mathrm{iter}}}$;

◮ $Q_{\mathrm{iter}}$ follows a Poisson distribution with parameter $-n \log(p)$.
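In this ideal case, unbiasedness can be checked in one line from the Poisson generating function $\mathbb{E}[s^{Q}] = e^{\lambda(s-1)}$ (a standard fact, not stated on the slide):

$$\mathbb{E}[\hat p] = \mathbb{E}\Big[\big(1 - \tfrac{1}{n}\big)^{Q_{\mathrm{iter}}}\Big] = \exp\Big(-n\log(p)\,\big(\big(1 - \tfrac{1}{n}\big) - 1\big)\Big) = \exp(\log p) = p.$$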

SLIDE 30

Numerical simulations

Practical consequences of the unbiasedness:

◮ The algorithm is easy to parallelize: a large number N of independent Monte-Carlo realizations, each with a "small" number $n_{\mathrm{rep}}$ of interacting replicas.

◮ Comparing the results obtained with different reaction coordinates ξ helps gain confidence in the estimation (overlap of confidence intervals).

Main numerical results:

◮ Test case in dimension 1: importance of a correct implementation to deal with equalities of scores, for discrete-time Markov chains. We have checked unbiasedness, varying $n_{\mathrm{rep}}$ and k.

◮ Test cases in dimension 2: behaviour of the statistical error when the reaction coordinate ξ changes, at fixed $n_{\mathrm{rep}}$. Apparent bias phenomenon [Glasserman, Heidelberger, Shahabuddin, Zajic, 1998]. This is not due to the smallness of the probability, but to the presence of multiple channels.

◮ Test case in infinite dimension: Allen-Cahn stochastic PDE.

SLIDE 31

Equalities of scores: example (nrep = 4)

The level Z (q) is given by the green replica. The replica selected at the splitting step is the blue replica.


SLIDE 32

Equalities of scores: example (nrep = 4)

The partial resampling step produces the black replica.


The blue and the black replicas have the same score.

SLIDE 33

1D example: Brownian drift

[Rolland, Simonnet 2015], [B. et al. 2015, ESAIM Proc.]

Brownian drift dynamics: $dX_t = -dt + \sqrt{2\beta^{-1}}\,dB_t$.

Euler-Maruyama discretization, $\Delta t = 0.1$: $X_{i+1} = X_i - \Delta t + \sqrt{2\Delta t\,\beta^{-1}}\, G_i$.

Initial condition: $X_0 = 1$. Estimated probability: $p = \mathbb{P}(\tau_{1.9} < \tau_{0.1})$, where $\tau_{1.9} = \inf\{i \geq 1,\, X_i > 1.9\}$ and $\tau_{0.1} = \inf\{i \geq 1,\, X_i < 0.1\}$.

Empirical average over $N = 6 \cdot 10^6$ realizations of the AMS algorithm: $\bar p_N = \frac{1}{N}\sum_{m=1}^{N} \hat p^{\,n,k}_m$. Relative error: $\epsilon_N$.
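With the `ams_estimate` sketch given after SLIDE 24, this test case reads as follows (an illustration with the slide's parameters; the number of realizations is scaled down from $6 \cdot 10^6$):

```python
import numpy as np

rng = np.random.default_rng(42)
beta, dt = 8.0, 0.1

step = lambda x, rng: x - dt + np.sqrt(2 * dt / beta) * rng.standard_normal()
in_A = lambda x: x < 0.1          # tau_{0.1}
in_B = lambda x: x > 1.9          # tau_{1.9}
xi = lambda x: x                  # reaction coordinate: the position itself

estimates = [ams_estimate(1.0, step, in_A, in_B, xi,
                          z_max=1.9, n_rep=50, k=1, rng=rng)
             for _ in range(1000)]
print(np.mean(estimates))  # fluctuates around p ~ 3.6e-4 for beta = 8 (next slide)
```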

SLIDE 34

Numerical results: β = 8

AMS + Monte-Carlo, $N = 6 \cdot 10^6$:

  n     | 10           | 50           | 50           | 100
  k     | 1            | 1            | 10           | 1
  p_N   | 3.597 · 10^-4 | 3.596 · 10^-4 | 3.596 · 10^-4 | 3.597 · 10^-4
  eps_N | 0.09 %       | 0.03 %       | 0.03 %       | 0.02 %

Crude Monte-Carlo experiment, $N' = 6 \cdot 10^8$: $p^{\mathrm{CMC}}_{N'} = 3.595 \cdot 10^{-4}$, $\epsilon_{N'} = 0.2\%$.

With natural but biased versions (k = 1):

  version | AMS(+)       | AMS(+)       | AMS(-)
  n       | 10           | 100          | 100
  p_N     | 2.958 · 10^-4 | 3.257 · 10^-4 | 1.741 · 10^-4

SLIDE 35

Numerical results, β = 24

AMS + Monte-Carlo, $N = 6 \cdot 10^6$:

  n     | 10            | 50            | 50            | 100
  k     | 1             | 1             | 10            | 1
  p_N   | 1.20 · 10^-10 | 1.22 · 10^-10 | 1.21 · 10^-10 | 1.21 · 10^-10
  eps_N | 6 %           | 0.6 %         | 0.8 %         | 0.2 %

No crude Monte-Carlo reference is available.

With natural but biased versions (k = 1):

  version | AMS(+)        | AMS(+)        | AMS(-)
  n       | 10            | 100           | 100
  p_N     | 5.079 · 10^-11 | 6.049 · 10^-11 | 1.397 · 10^-12

Taking care of equalities of scores is crucial to get unbiased estimators. With an inappropriate implementation: a bias of several orders of magnitude.

SLIDE 36

2D example: potential function V : R2 → R, with multiple reactive trajectories

[Park, Sener, Lu, Schulten, 2003], [Metzner, Schütte, Vanden-Eijnden, 2006], [Cérou, Guyader, Lelièvre, Pommier 2011], [Rolland, Simonnet 2015]

[Figure: level sets of the 2D potential V, with wells A, B and C.]

◮ Dynamics: $dX_t = -\nabla V(X_t)\,dt + \sqrt{2\beta^{-1}}\,dW_t$ (discrete version: h = 0.05).

◮ A = small ball centered at $x_A = (-1, 0)$, the front deepest energy well.
◮ B = small ball centered at $x_B = (+1, 0)$, the back deepest energy well.
◮ The minimal energy path visits the side energy well C.

SLIDE 37

2D example

[Figure: level sets of the 2D potential V, with wells A, B and C.]

Three examples of reaction coordinates:

  1. $\xi_1(x) = x_1$, the abscissa;
  2. $\xi_2(x) = -\|x - x_B\|$, minus the distance to the final point B;
  3. $\xi_3(x) = \|x - x_A\|$, the distance to the initial point A.
SLIDE 38

Monte-Carlo experiment, 2D example

Estimation of $p = \mathbb{P}(\tau_B < \tau_A)$. AMS algorithm: $n_{\mathrm{rep}} = 100$, $k = 1$.

Each independent Monte-Carlo realization provides $\hat p_m$, for $1 \leq m \leq N_{\max} = 10^6$. The empirical average over $N \leq N_{\max}$ realizations:

$\bar p_N = \frac{1}{N}\sum_{m=1}^{N} \hat p_m$.

Confidence interval: $[\bar p_N - \delta_N,\, \bar p_N + \delta_N]$, with

$\delta_N = \frac{1.96}{\sqrt{N}} \times \sqrt{\frac{1}{N}\sum_{m=1}^{N} (\hat p_m)^2 - (\bar p_N)^2}$.

We represent the evolution of $\bar p_N - \delta_N$, $\bar p_N$, $\bar p_N + \delta_N$ as $1 \leq N \leq N_{\max}$ varies.
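The running confidence band on the following plots can be computed directly from the stream of realizations (a sketch matching the formula above):

```python
import numpy as np

def running_confidence_band(p_hats):
    """Empirical mean p_bar_N and 95% band for N = 1..len(p_hats), with
    delta_N = 1.96/sqrt(N) * sqrt(mean of squares - square of mean)."""
    p_hats = np.asarray(p_hats, dtype=float)
    N = np.arange(1, p_hats.size + 1)
    mean = np.cumsum(p_hats) / N
    second_moment = np.cumsum(p_hats ** 2) / N
    std = np.sqrt(np.maximum(second_moment - mean ** 2, 0.0))
    delta = 1.96 * std / np.sqrt(N)
    return mean - delta, mean, mean + delta
```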

SLIDE 39

Numerical results, 2D example

[Figure: evolution of the confidence band for the three reaction coordinates (abscissa, norm to final point, norm to initial point); β = 9.33.]

SLIDE 40

Numerical results, 2D example

[Figure: evolution of the confidence band for the three reaction coordinates (abscissa, norm to final point, norm to initial point); β = 10.]

SLIDE 41

Numerical results, 2D example

[Figure: the two confidence-band plots above (β = 9.33 and β = 10), side by side.]

Interpretation:

◮ For N sufficiently large, the confidence intervals overlap.

◮ For N too small, this may be false: the reaction coordinate affects the fluctuations of the empirical average $\bar p_N$ around p. Terminology: "apparent bias". This happens when:

  ◮ Multiple pathways exist to go from A to B.
  ◮ For given ξ and $n_{\mathrm{rep}}$, for most realizations, all the replicas active at the end go through only one of the two channels.
  ◮ One of these scenarios is rare.
  ◮ The values of $\hat p$ associated with each of these two scenarios are very different.

◮ Practical consequence of the unbiasedness result: changing the reaction coordinate is essential to gain confidence in the numerical results.

SLIDE 42

Going further: Allen-Cahn SPDE (B. et al. 2015, Bouchet et al. 2015)

$\frac{\partial u}{\partial t} = \gamma \frac{\partial^2 u}{\partial x^2} - \nabla V(u) + \sqrt{2\beta^{-1}}\,\dot W(t, x)$

Potential function: $V(u) = \frac{u^4}{4} - \frac{u^2}{2}$. Reaction coordinate: $\xi(u) = \int_0^1 u(x)\,dx$.

The shape of the reactive trajectories from u = −1 to u = +1 depends on γ. Left: γ = 0.5. Right: γ = 0.05. β = 300, discretization in dimension 500.
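For completeness, one explicit Euler step of a finite-difference discretization of this SPDE (a rough sketch under my own choices: periodic boundary conditions and the usual $1/\sqrt{h}$ grid scaling of space-time white noise; the slides do not specify the scheme):

```python
import numpy as np

def allen_cahn_step(u, gamma, beta, dt, h, rng):
    """One explicit Euler step for du = (gamma*u_xx - V'(u)) dt + sqrt(2/beta) dW,
    with V(u) = u^4/4 - u^2/2, on a periodic grid of mesh size h."""
    lap = (np.roll(u, -1) - 2.0 * u + np.roll(u, 1)) / h ** 2
    drift = gamma * lap - (u ** 3 - u)  # -V'(u)
    noise = np.sqrt(2.0 * dt / (beta * h)) * rng.standard_normal(u.size)
    return u + dt * drift + noise

def magnetisation(u, h):
    """Reaction coordinate xi(u) = integral of u over [0, 1]."""
    return h * u.sum()
```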

SLIDE 43

Apparent bias phenomenon in a related 2D model

[Figure: level sets of the potential $V_\gamma$. Left: γ = 1. Right: γ = 0.1.]

Simulation using AMS with 4 reaction coordinates. In the two-channel case (right), one of them breaks the symmetry of the channels: ξ(x, y) = x (abscissa).

SLIDE 44

Apparent bias phenomenon in a related 2D model

◮ With one channel (left): good behavior, even for an extremely small $p \approx 10^{-9}$.

◮ The apparent bias phenomenon appears only when there are at least two channels.

[Figure: confidence bands for four reaction coordinates (abscissa, norm to final point, norm to initial point, magnetisation). Parameters: $n_{\mathrm{rep}} = 100$, $k = 1$, β = 80. Left: γ = 1. Right: γ = 0.1.]

SLIDE 45

Application in physics: estimation of return times

Collaboration with F. Bouchet, C. Herbert, T. Lestang, F. Ragone.

Time-series of a stationary ergodic process:

[Figure: sample time-series A(t), with the levels ±2σ and ±4σ marked.]

Rare event ⟹ (approximately) Poisson statistics. The (average) return time r(a) between two events "crossing the threshold a" satisfies

$p_T(a) = \mathbb{P}\big(\max_{0 \leq t \leq T} X(t) > a\big) = 1 - e^{-T/r(a)}$,

for $\tau_c \leq T \ll r(a)$: we consider block maxima.

SLIDE 46

Application in physics: estimation of return times

Estimator: $\hat r(a) = -T / \log(1 - \hat p_T(a))$. Not unbiased ($\mathbb{E}[\hat r(a)] \neq r(a)$), but consistent.
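A direct implementation of this block-maxima estimator from a long time series (a sketch; the series is assumed sampled at unit time step):

```python
import numpy as np

def return_time(series, T_block, a):
    """Consistent estimator r_hat(a) = -T / log(1 - p_hat_T(a)), where
    p_hat_T(a) is the fraction of length-T blocks whose maximum exceeds a."""
    n_blocks = series.size // T_block
    blocks = series[: n_blocks * T_block].reshape(n_blocks, T_block)
    p_hat = np.mean(blocks.max(axis=1) > a)
    if p_hat == 0.0:
        return np.inf                   # threshold never crossed in the sample
    if p_hat == 1.0:
        return np.nan                   # T is not << r(a): blocks are too long
    return -T_block / np.log1p(-p_hat)  # log1p(-p) = log(1 - p)
```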

SLIDE 47

Application in physics: estimation of return times

Idea: apply the AMS algorithm to estimate $\hat p(a)$. More generally (justified by the Theorem above):

◮ one (or a few) realization(s) of the AMS algorithm gives estimators $\hat p(x)$, $\hat r(x)$ of p(x), r(x) for all $x \leq a$;

◮ instead of prescribing a, one can fix a maximum return time r = R.

Example of plot:

[Figure: return time r(a) (from $10^0$ to $10^{15}$) versus threshold a (from 2σ to 6σ): AMS, theory, direct sampling.]

Dynamics: Ornstein-Uhlenbeck, $dX_t = -X_t\,dt + \sigma\,dW(t)$, with $\sigma = \frac{1}{\sqrt{2}}$.

Parameters: 100 replicas in AMS, 100 realizations, a = 5, $T = 5\tau_c$.

SLIDE 48

Application in physics: estimation of return times

Yet another plot:

[Figure: return time r(a) (from $10^0$ to $10^{15}$) versus threshold a (from 2σ to 8σ): GKTL, AMS, direct sampling.]

For time-averaged positions $\bar X_\theta(t) = \frac{1}{\theta}\int_t^{t+\theta} X(s)\,ds$.

Parameters for AMS: 100 replicas, a = 2, T = 50. Comparison with the Giardina-Kurchan-Tailleur-Lecomte (GKTL) algorithm.

Cost: $O(10^6 \tau_c)$. Cost of the direct numerical simulation: a time-series of length $10^9 \tau_c$.

SLIDE 49

Conclusion

Discussed today:

◮ Unbiased estimators in a Markov chain setting.

◮ Practical consequences:
  ◮ We can choose different numbers of replicas and different reaction coordinates without changing the expectation of the estimator.
  ◮ We can interpret and deal with the apparent bias phenomenon.

◮ An application: estimation of return times.

Main reference:

C.-E. Bréhier, M. Gazeau, L. Goudenège, T. Lelièvre, and M. Rousset. Unbiasedness of some generalized adaptive multilevel splitting algorithms. Ann. Appl. Probab., 26(6):3559–3601, 2016.

Perspectives:

◮ Provide an efficiency analysis in the general framework.

◮ Propose new algorithms which fit into the general framework, and new applications.

Thanks for your attention.