SLIDE 1

An introduction to particle rare event simulation

  • P. Del Moral

INRIA Bordeaux Sud-Ouest & IMB & CMAP. Computation of transition trajectories and rare events in non-equilibrium systems, ENS Lyon, June 2012.

Some hyper-refs

Feynman-Kac Formulae: Genealogical and Interacting Particle Systems with Applications. Springer (2004).

Sequential Monte Carlo Samplers. JRSS B (2006), joint work with A. Doucet & A. Jasra.

A Backward Particle Interpretation of Feynman-Kac Formulae. M2AN (2010), joint work with A. Doucet & S.S. Singh.

On the Concentration Properties of Interacting Particle Processes. Foundations & Trends in Machine Learning [170p.] (2012), joint work with Peng Hu & Liming Wu. [+ Refs]

More references on the website: Feynman-Kac models and particle systems. [+ Links]

SLIDE 2

  • Introduction
  • Feynman-Kac models
  • Some rare event models
  • Stochastic analysis

SLIDE 3

  • Introduction
      - Some basic notation
      - Importance sampling
      - Acceptance-rejection samplers
  • Feynman-Kac models
  • Some rare event models
  • Stochastic analysis

SLIDE 4

Basic notation

$\mathcal{P}(E)$ probability measures, $\mathcal{B}(E)$ bounded functions on $E$.

◮ $(\mu, f) \in \mathcal{P}(E) \times \mathcal{B}(E) \ \longrightarrow\ \mu(f) = \int \mu(dx)\, f(x)$

◮ $Q(x_1, dx_2)$ integral operators, $x_1 \in E_1$, $x_2 \in E_2$:

$Q(f)(x_1) = \int Q(x_1, dx_2)\, f(x_2)$

$[\mu Q](dx_2) = \int \mu(dx_1)\, Q(x_1, dx_2)$

$\big({\Rightarrow}\ [\mu Q](f) = \mu[Q(f)]\big)$

◮ Boltzmann-Gibbs transformation (positive and bounded potential function $G$):

$\mu(dx) \ \longrightarrow\ \Psi_G(\mu)(dx) = \frac{1}{\mu(G)}\, G(x)\, \mu(dx)$

SLIDE 5

Importance sampling and optimal twisted measures

$\mathbb{P}(X \in A) = \mathbb{P}_X(A) = 10^{-10}$. Find $\mathbb{P}_Y$ s.t. $\mathbb{P}_Y(A) = \mathbb{P}(Y \in A) \simeq 1$.

Crude Monte Carlo sampling with $Y^i$ i.i.d. $\sim \mathbb{P}_Y$:

$\mathbb{P}_Y\Big( \frac{d\mathbb{P}_X}{d\mathbb{P}_Y}\, 1_A \Big) = \mathbb{P}_X(A) \simeq \mathbb{P}^N_X(A) := \frac{1}{N} \sum_{1 \le i \le N} \frac{d\mathbb{P}_X}{d\mathbb{P}_Y}(Y^i)\, 1_A(Y^i)$

Optimal twisted measure = conditional distribution:

$\text{Variance} = 0 \ \Longleftrightarrow\ \mathbb{P}_Y = \Psi_{1_A}(\mathbb{P}_X) = \mathrm{Law}(X \mid X \in A)$

$\Downarrow$

Perfect or MCMC samplers = acceptance-rejection techniques, BUT very often with very small acceptance rates.
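
A minimal numerical sketch (not from the slides; a Gaussian toy case is assumed) of the estimator $\mathbb{P}^N_X(A)$ for the tail event $A = \{x > a\}$, with the mean-shifted proposal $\mathbb{P}_Y = \mathcal{N}(a, 1)$ standing in for the twisted measure:

    import numpy as np
    from scipy.stats import norm

    def crude_mc(a, N, rng):
        """Crude Monte Carlo estimate of P(X > a), X ~ N(0, 1)."""
        return np.mean(rng.standard_normal(N) > a)

    def importance_sampling(a, N, rng):
        """IS estimate with proposal P_Y = N(a, 1), which puts mass ~ 1/2 on A."""
        y = a + rng.standard_normal(N)
        w = norm.pdf(y) / norm.pdf(y, loc=a)      # dP_X/dP_Y evaluated at Y^i
        return np.mean(w * (y > a))

    rng = np.random.default_rng(0)
    a = 6.0                                       # P(X > 6) is about 1e-9
    print(crude_mc(a, 10**6, rng))                # almost surely returns 0.0
    print(importance_sampling(a, 10**6, rng))     # accurate with the same budget
    print(norm.sf(a))                             # exact value for comparison

The optimal choice $\mathbb{P}_Y = \mathrm{Law}(X \mid X \in A)$ would drive the variance to zero; the shifted Gaussian is a tractable stand-in for it.
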

SLIDES 6-7

Conditional distributions and Feynman-Kac models

Example: Markov chain models $X_n$ restricted to subsets $A_n$:

$\mathbf{X} = (X_0, \ldots, X_n) \in \mathbf{A} = (A_0 \times \cdots \times A_n)$

Conditional distributions:

$\mathrm{Law}(\mathbf{X} \mid \mathbf{X} \in \mathbf{A}) = \mathrm{Law}\big((X_0, \ldots, X_n) \mid X_p \in A_p,\ p < n\big) = \mathbb{Q}_n$

and $\mathrm{Proba}(X_p \in A_p,\ p < n) = \mathcal{Z}_n$,

given by the Feynman-Kac measures

$d\mathbb{Q}_n := \frac{1}{\mathcal{Z}_n} \Big\{ \prod_{0 \le p < n} G_p(X_p) \Big\}\, d\mathbb{P}_n$

with $\mathbb{P}_n = \mathrm{Law}(X_0, \ldots, X_n)$ and $G_p = 1_{A_p}$, $p < n$.

SLIDE 8

  • Introduction
  • Feynman-Kac models
      - Nonlinear evolution equation
      - Interacting particle samplers
      - Continuous time models
      - Particle estimates
  • Some rare event models
  • Stochastic analysis

SLIDE 9

Feynman-Kac models (general $G_n(X_n)$ and $X_n \in E_n$)

Flow of $n$-marginals: $\eta_n(f) = \gamma_n(f)/\gamma_n(1)$ with

$\gamma_n(f) := \mathbb{E}\Big[ f(X_n) \prod_{0 \le p < n} G_p(X_p) \Big]$   (so that $\gamma_n(1) = \mathcal{Z}_n$)

Nonlinear evolution equation:

$\eta_{n+1} = \Psi_{G_n}(\eta_n) M_{n+1}, \qquad \mathcal{Z}_{n+1} = \eta_n(G_n) \times \mathcal{Z}_n$

with the Markov transitions $M_{n+1}(x_n, dx_{n+1}) = \mathbb{P}(X_{n+1} \in dx_{n+1} \mid X_n = x_n)$.

Note: $[X_n = (X'_0, \ldots, X'_n)$ and $G_n(X_n) = G'(X'_n)] \Rightarrow \eta_n = \mathbb{Q}'_n$ (the path-space model recovers the Feynman-Kac path measure of the primed model).

SLIDE 10

Interacting particle samplers

Nonlinear evolution equation:

$\eta_{n+1} = \Psi_{G_n}(\eta_n) M_{n+1}, \qquad \mathcal{Z}_{n+1} = \eta_n(G_n) \times \mathcal{Z}_n$

Sequential particle simulation technique: $M_n$-propositions $\oplus$ $G_n$-acceptance-rejection with recycling.

  • Genetic type branching particle model:

$\xi_n = (\xi^i_n)_{1 \le i \le N} \xrightarrow{\ G_n\text{-selection}\ } \widehat{\xi}_n = (\widehat{\xi}^i_n)_{1 \le i \le N} \xrightarrow{\ M_{n+1}\text{-mutation}\ } \xi_{n+1} = (\xi^i_{n+1})_{1 \le i \le N}$

Note: $[X_n = (X'_0, \ldots, X'_n)$ and $G_n(X_n) = G'(X'_n)] \Rightarrow$ genealogical tree model.
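
As a concrete illustration (not from the slides), here is a minimal NumPy sketch of this genetic scheme for an assumed toy model: a Gaussian random walk with indicator potentials, so that the unbiased product estimate $\mathcal{Z}^N_n = \prod_{0 \le p < n} \eta^N_p(G_p)$ approximates the probability that the walk never leaves $[-1, 1]$:

    import numpy as np

    def particle_feynman_kac(M_sample, G, x0_sample, N, n_steps, rng):
        """Genetic-type particle approximation of eta_n, returning the
        unbiased estimate Z^N_n = prod_{p<n} eta^N_p(G_p)."""
        xi = x0_sample(N, rng)                    # xi ~ eta_0
        logZ = 0.0
        for _ in range(n_steps):
            g = G(xi)                             # potentials G_p(xi^i)
            if g.sum() == 0.0:
                return xi, 0.0                    # whole system killed
            logZ += np.log(g.mean())              # log eta^N_p(G_p)
            # G_p-selection: multinomial resampling proportional to g
            xi = xi[rng.choice(N, size=N, p=g / g.sum())]
            # M_{p+1}-mutation
            xi = M_sample(xi, rng)
        return xi, np.exp(logZ)

    # Assumed toy model: Gaussian random walk killed outside [-1, 1].
    rng = np.random.default_rng(0)
    M_step = lambda x, rng: x + 0.3 * rng.standard_normal(x.shape)
    G_ind = lambda x: (np.abs(x) <= 1.0).astype(float)
    x0 = lambda N, rng: np.zeros(N)
    _, Z = particle_feynman_kac(M_step, G_ind, x0, N=10_000, n_steps=25, rng=rng)
    print(Z)    # estimate of P(|X_p| <= 1 for all p < 25)
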

SLIDES 11-12

$\supset$ Continuous time models $\supset$ Langevin diffusions

$X_n := X'_{[t_n, t_{n+1})}$ and $G_n(X_n) = \exp\Big( \int_{t_n}^{t_{n+1}} V_s(X'_s)\, ds \Big)$

OR Euler approximations (Langevin diffusion Metropolis-Hastings moves) OR fully continuous time particle models.

  • Schrödinger operators:

$\frac{d}{dt} \gamma_t(f) = \gamma_t(L^V_t(f))$ with $L^V_t = L'_t + V_t$

$\gamma_t(1) = \mathbb{E}\Big[ \exp \int_0^t V_s(X'_s)\, ds \Big] = \exp\Big( \int_0^t \eta_s(V_s)\, ds \Big)$ with $\eta_t = \gamma_t/\gamma_t(1)$

Master equation: $\eta_t = \mathrm{Law}(\overline{X}_t) \ \Rightarrow\ \frac{d}{dt} \eta_t(f) = \eta_t(L_{t,\eta_t}(f))$

(e.g. $V_t = -U_t \le 0$)

$L_{t,\eta_t}(f)(x) = \underbrace{L'_t(f)(x)}_{\text{free exploration}} + \underbrace{U_t(x)}_{\text{acceptance rate}} \int (f(y) - f(x))\, \underbrace{\eta_t(dy)}_{\text{interacting jump law}}$

$\Downarrow$

Particle model: survival-acceptance rates $\oplus$ recycling jumps.

SLIDE 13

Genealogical tree evolution $(N, n) = (3, 3)$

[Diagram: three particle paths over three generations, with selection and recycling arrows forming the genealogical tree.]

Some particle estimates ($\delta_a(dx) \leftrightarrow \delta(x - a)\, dx$):

◮ Individuals $\xi^i_n$ "almost" i.i.d. with law $\eta_n \simeq \eta^N_n = \frac{1}{N} \sum_{1 \le i \le N} \delta_{\xi^i_n}$

◮ Ancestral lines "almost" i.i.d. with law $\mathbb{Q}_n \simeq \frac{1}{N} \sum_{1 \le i \le N} \delta_{\mathrm{line}_n(i)}$

◮ Normalizing constants: $\mathcal{Z}_{n+1} = \prod_{0 \le p \le n} \eta_p(G_p) \simeq_{N \uparrow \infty} \mathcal{Z}^N_{n+1} = \prod_{0 \le p \le n} \eta^N_p(G_p)$   (unbiased)

SLIDES 14-32

Graphical illustration (one animation frame per slide): $\eta_n \simeq \eta^N_n := \frac{1}{N} \sum_{1 \le i \le N} \delta_{\xi^i_n}$

SLIDES 33-34

How to use the full ancestral tree model?

$G_{n-1}(x_{n-1})\, M_n(x_{n-1}, dx_n) \stackrel{\text{hyp}}{=} H_n(x_{n-1}, x_n)\, \nu_n(dx_n)$

$\Rightarrow$ Backward Markov model:

$\mathbb{Q}_n(d(x_0, \ldots, x_n)) = \eta_n(dx_n)\, M_{n,\eta_{n-1}}(x_n, dx_{n-1}) \cdots M_{1,\eta_0}(x_1, dx_0)$

with $M_{n,\eta_{n-1}}(x_n, dx_{n-1}) \propto \eta_{n-1}(dx_{n-1})\, H_n(x_{n-1}, x_n)$.

Particle approximation:

$\mathbb{Q}^N_n(d(x_0, \ldots, x_n)) = \eta^N_n(dx_n)\, M_{n,\eta^N_{n-1}}(x_n, dx_{n-1}) \cdots M_{1,\eta^N_0}(x_1, dx_0)$

Ex.: additive functionals $\overline{f}_n(x_0, \ldots, x_n) = \frac{1}{n+1} \sum_{0 \le p \le n} f_p(x_p)$:

$\mathbb{Q}^N_n(\overline{f}_n) := \frac{1}{n+1} \sum_{0 \le p \le n} \eta^N_n\, M_{n,\eta^N_{n-1}} \cdots M_{p+1,\eta^N_p}(f_p)$   (matrix operations)
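
A minimal NumPy sketch (not from the slides) of this backward estimate for an additive functional, assuming the particle clouds $\xi_p \sim \eta^N_p$ and potential values $G_p(\xi_p)$ were stored during a forward run (e.g. the genetic sketch after slide 10), and that the mutation admits a known density $m_p$, which plays the role of $H_p$ up to the reference measure:

    import numpy as np

    def backward_additive(xi, G_vals, trans_density, f):
        """Backward particle estimate of Q^N_n(f_bar) for the normalized
        additive functional f_bar = (1/(n+1)) sum_p f(x_p).

        xi[p]     : (N,) array, particle cloud ~ eta^N_p
        G_vals[p] : (N,) array, potentials G_p evaluated at xi[p]
        trans_density(x_prev, x) : mutation density m_p(x_prev, x)
        f         : test function applied array-wise, e.g. f = lambda x: x
        """
        n = len(xi) - 1
        F = f(xi[0])                              # F_0(j) = f(xi_0^j)
        for p in range(1, n + 1):
            # H[i, j] = G_{p-1}(xi_{p-1}^j) m_p(xi_{p-1}^j, xi_p^i); row
            # normalization gives the backward matrix M_{p, eta^N_{p-1}}.
            H = G_vals[p - 1][None, :] * trans_density(xi[p - 1][None, :],
                                                       xi[p][:, None])
            B = H / H.sum(axis=1, keepdims=True)
            F = f(xi[p]) + B @ F                  # the "matrix operations"
        return F.mean() / (n + 1)                 # eta^N_n(F_n) / (n + 1)

    # With the Gaussian random-walk mutation of the earlier sketch one can take
    # trans_density = lambda xp, x: np.exp(-(x - xp)**2 / (2 * 0.3**2));
    # unnormalized densities are fine since constants cancel in B.
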
SLIDE 35

  • Introduction
  • Feynman-Kac models
  • Some rare event models
      - Self-avoiding walks
      - Level crossing probabilities
      - Particle absorption models
      - Quasi-invariant measures
      - Doob h-processes
      - Semigroup gradient estimates
      - Boltzmann-Gibbs measures
  • Stochastic analysis

SLIDE 36

Self-avoiding walks in $\mathbb{Z}^d$

Feynman-Kac model with $\mathbf{X}_n = (X_0, \ldots, X_n)$ and $G_n(\mathbf{X}_n) = 1_{X_n \notin \{X_0, \ldots, X_{n-1}\}}$

  • Conditional distributions:

$\mathbb{Q}_n = \mathrm{Law}\big((X_0, \ldots, X_n) \mid X_p \neq X_q,\ \forall\, 0 \le p < q < n\big)$

and $\mathcal{Z}_n = \mathrm{Proba}\big(X_p \neq X_q,\ \forall\, 0 \le p < q < n\big)$
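
A minimal sketch (an assumed toy implementation, not from the slides) of this model on $\mathbb{Z}^2$: path-valued particles, mutation = one simple-random-walk step, acceptance-rejection with recycling for the indicator potential, and the product of acceptance rates as the estimate of $\mathcal{Z}_n$:

    import numpy as np

    def saw_probability(n_steps, N, rng, d=2):
        """Particle estimate of Z_n = P(a simple random walk on Z^d stays
        self-avoiding for n_steps steps)."""
        steps = np.vstack([np.eye(d, dtype=int), -np.eye(d, dtype=int)])
        paths = [[(0,) * d] for _ in range(N)]     # path-valued particles
        visited = [{(0,) * d} for _ in range(N)]
        logZ = 0.0
        for _ in range(n_steps):
            # Mutation: one random-walk step; a colliding particle is killed.
            moves = steps[rng.integers(0, 2 * d, size=N)]
            alive = []
            for i in range(N):
                new = tuple(int(c) for c in np.asarray(paths[i][-1]) + moves[i])
                if new not in visited[i]:
                    paths[i].append(new)
                    visited[i].add(new)
                    alive.append(i)
            if not alive:
                return 0.0
            logZ += np.log(len(alive) / N)         # log eta^N_p(G_p)
            # Selection with recycling: killed particles restart from survivors.
            alive_set = set(alive)
            idx = [i if i in alive_set else alive[rng.integers(len(alive))]
                   for i in range(N)]
            paths = [list(paths[j]) for j in idx]
            visited = [set(visited[j]) for j in idx]
        return np.exp(logZ)

    rng = np.random.default_rng(0)
    print(saw_probability(n_steps=20, N=5000, rng=rng))
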

SLIDE 37

Level crossing probabilities (1)

Two rare event problems: $P(V_n(X_n) \ge a)$ and $P(X \text{ hits } A_n \text{ before } B)$.

◮ Level crossing at a fixed given time:

$P(V_n(X_n) \ge a) = \mathbb{E}\big[ f_n(X_n)\, e^{V_n(X_n)} \big] = \mathbb{E}\Big[ f_n(X_n) \prod_{0 \le p < n} G_p(\mathcal{X}_p) \Big]$

with

◮ the Markov chain on transition space: $\mathcal{X}_n = (X_n, X_{n+1})$ and $G_n(\mathcal{X}_n) = \exp\big[ V_{n+1}(X_{n+1}) - V_n(X_n) \big]$

◮ the test functions: $f_n(X_n) = 1_{V_n(X_n) \ge a}\, e^{-V_n(X_n)}$

SLIDE 38

Level crossing probabilities (2)

◮ Excursion level crossing: $A_n \downarrow$, with $B$ a non-critical recurrent subset.

$P(X \text{ hits } A_n \text{ before } B) = \mathbb{E}\Big[ \prod_{0 \le p \le n} 1_{A_p}(X_{T_p}) \Big], \qquad T_n := \inf\{p \ge T_{n-1} : X_p \in (A_n \cup B)\}$

Feynman-Kac model:

$\mathbb{E}\Big[ \prod_{0 \le p \le n} 1_{A_p}(X_{T_p}) \Big] = \mathbb{E}\Big[ \prod_{0 \le p < n} G_p(\mathcal{X}_p) \Big]$

with $\mathcal{X}_n = (X_p)_{p \in [T_n, T_{n+1}]}$ and $G_n(\mathcal{X}_n) = 1_{A_{n+1}}(X_{T_{n+1}})$

SLIDE 39

Absorption models

◮ Sub-Markov semigroups: $Q_n(x, dy) = G_{n-1}(x)\, M_n(x, dy)$, on $E^c_n = E_n \cup \{c\}$

◮ Absorbed Markov chain:

$X^c_n \in E^c_n \xrightarrow{\ \text{absorption} \sim (1 - G_n)\ } \widehat{X}^c_n \xrightarrow{\ \text{exploration} \sim M_{n+1}\ } X^c_{n+1}$

$\Downarrow$

$\mathbb{Q}_n = \mathrm{Law}\big((X^c_0, \ldots, X^c_n) \mid T^{\text{absorption}} \ge n\big)$ and $\mathcal{Z}_n = \mathrm{Proba}\big(T^{\text{absorption}} \ge n\big)$

SLIDE 40

Homogeneous models $(G_n, M_n) = (G, M)$

◮ Reversibility condition: $\mu(dx)\, M(x, dy) = \mu(dy)\, M(y, dx)$

$\mathrm{Proba}\big(T^{\text{absorption}} \ge n\big) \simeq \lambda^n$ with $\lambda =$ top eigenvalue of $Q(x, dy) = G(x)\, M(x, dy)$

◮ $Q(h) = \lambda h$

◮ Quasi-invariant measure: $\mathbb{P}(X^c_n \in dx \mid T^{\text{absorption}} > n) \to_{n \uparrow \infty} \frac{1}{\mu(h)}\, h(x)\, \mu(dx)$

◮ Doob $h$-process $X^h$:

$M_h(x, dy) = \frac{1}{\lambda}\, h^{-1}(x)\, Q(x, dy)\, h(y) = \frac{Q(x, dy)\, h(y)}{Q(h)(x)} = \frac{M(x, dy)\, h(y)}{M(h)(x)}$
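
A quick numerical check (an assumed finite-state toy example, not from the slides): with $Q = G \cdot M$ a sub-Markov matrix, the top eigenpair gives $\lambda$ and $h$, survival probabilities decay like $\lambda^n$, and the Doob $h$-transform $M_h$ is a genuine Markov matrix:

    import numpy as np

    # Toy model: M reversible w.r.t. mu proportional to (1, 2, 1).
    M = np.array([[0.50, 0.50, 0.00],
                  [0.25, 0.50, 0.25],
                  [0.00, 0.50, 0.50]])
    G = np.array([1.0, 0.9, 0.4])
    Q = G[:, None] * M                        # Q(x, y) = G(x) M(x, y)

    eigvals, eigvecs = np.linalg.eig(Q)
    k = np.argmax(eigvals.real)               # Perron (top) eigenvalue
    lam = eigvals[k].real
    h = np.abs(eigvecs[:, k].real)            # Q h = lam h, with h > 0

    Mh = Q * h[None, :] / (lam * h[:, None])  # Doob h-process transitions
    print(Mh.sum(axis=1))                     # each row sums to 1

    mu0 = np.ones(3) / 3                      # survival probs decay like lam^n
    p20 = mu0 @ np.linalg.matrix_power(Q, 20) @ np.ones(3)
    p21 = mu0 @ np.linalg.matrix_power(Q, 21) @ np.ones(3)
    print(lam, p21 / p20)                     # the two values nearly coincide
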

SLIDE 41

Homogeneous models $(G_n, M_n) = (G, M)$

$\mathbb{Q}_n(d(x_0, \ldots, x_n)) \propto \mathbb{P}\big((X^h_0, \ldots, X^h_n) \in d(x_0, \ldots, x_n)\big)\, h^{-1}(x_n)$

◮ Invariant measure $\mu_h = \mu_h M_h$ and normalized additive functionals:

$\overline{F}_n(x_0, \ldots, x_n) = \frac{1}{n+1} \sum_{0 \le p \le n} f(x_p) \ \Longrightarrow\ \mathbb{Q}_n(\overline{F}_n) \simeq_n \mu_h(f)$

◮ If $G = G^\theta$ depends on some $\theta \in \mathbb{R}$, with $f := \frac{\partial}{\partial \theta} \log G^\theta$:

$\frac{\partial}{\partial \theta} \log \lambda_\theta \simeq_n \frac{1}{n+1}\, \frac{\partial}{\partial \theta} \log \mathcal{Z}^\theta_{n+1} = \mathbb{Q}_n(\overline{F}_n)$

NB: similar expression when $M^\theta$ depends on some $\theta \in \mathbb{R}$.

SLIDE 42

Semigroup gradient estimates

$X_{n+1}(x) = F_n(X_n(x), W_n)$, $X_0(x) = x \in \mathbb{R}^d$, $P_n(f)(x) := \mathbb{E}\big(f(X_n(x))\big)$

First variational equation:

$\frac{\partial X_{n+1}}{\partial x}(x) = A_n(x, W_n)\, \frac{\partial X_n}{\partial x}(x)$ with $A^{(i,j)}_n(x, w) = \frac{\partial F^i_n(\cdot, w)}{\partial x_j}(x)$

Random process on the sphere: $U_0 = u_0 \in \mathbb{S}^{d-1}$ and

$U_{n+1} = \frac{A_n(X_n, W_n)\, U_n}{\| A_n(X_n, W_n)\, U_n \|} = \frac{\frac{\partial X_{n+1}}{\partial x}(x)\, u_0}{\big\| \frac{\partial X_{n+1}}{\partial x}(x)\, u_0 \big\|}$

Feynman-Kac model: $\mathcal{X}_n = (X_n, U_n, W_n)$ and $G_n(x, u, w) = \| A_n(x, w)\, u \|$

$\nabla P_{n+1}(f)(x) \cdot u_0 = \mathbb{E}\Big[ \big( \nabla f(X_{n+1}) \cdot U_{n+1} \big) \prod_{0 \le p \le n} G_p(\mathcal{X}_p) \Big]$   $\Big($ since $\prod_{0 \le p \le n} G_p(\mathcal{X}_p) = \big\| \frac{\partial X_{n+1}}{\partial x}(x)\, u_0 \big\| \Big)$

SLIDE 43

Boltzmann-Gibbs measures

$\eta_n(dx) := \frac{1}{\mathcal{Z}_n}\, e^{-\beta_n V(x)}\, \lambda(dx)$ with $\beta_n \uparrow$

◮ For any MCMC transition $M_n$ with target $\eta_n$: $\eta_n = \eta_n M_n$

◮ Updating of the temperature parameter:

$\eta_{n+1} = \Psi_{G_n}(\eta_n)$ with $G_n = e^{-(\beta_{n+1} - \beta_n) V}$

Proof: $e^{-\beta_{n+1} V} = e^{-(\beta_{n+1} - \beta_n) V} \times e^{-\beta_n V}$

Consequence: $\eta_{n+1} = \eta_{n+1} M_{n+1} = \Psi_{G_n}(\eta_n) M_{n+1}$ and ($\beta_0 = 0$)

$\lambda\big(e^{-\beta_n V}\big) = \mathcal{Z}_n = \prod_{0 \le p < n} \eta_p(G_p)$
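
A minimal SMC-sampler sketch of this tempering scheme (an assumed toy target, not from the slides): $\lambda = \mathcal{N}(0, 1)$ and $V(x) = x^2/2$, for which $\mathcal{Z}_n = \lambda(e^{-\beta_n V}) = (1 + \beta_n)^{-1/2}$ gives an exact ruler:

    import numpy as np

    def tempering_smc(V, betas, N, n_mcmc, step, rng):
        """SMC sampler for eta_n prop. to exp(-beta_n V) d(lambda), with
        lambda = N(0, 1); returns the estimate of Z_n = lambda(exp(-beta_n V))."""
        x = rng.standard_normal(N)                # eta_0 = lambda (beta_0 = 0)
        logZ = 0.0
        for b_prev, b in zip(betas[:-1], betas[1:]):
            # Temperature update: weight by G_p = exp(-(b - b_prev) V), resample.
            w = np.exp(-(b - b_prev) * V(x))
            logZ += np.log(w.mean())              # log eta^N_p(G_p)
            x = x[rng.choice(N, size=N, p=w / w.sum())]
            # MCMC moves M_{p+1} targeting eta_{p+1} prop. to exp(-b V(x) - x^2/2).
            for _ in range(n_mcmc):
                y = x + step * rng.standard_normal(N)
                log_acc = -b * (V(y) - V(x)) + 0.5 * (x**2 - y**2)
                accept = np.log(rng.uniform(size=N)) < log_acc
                x = np.where(accept, y, x)
        return np.exp(logZ)

    rng = np.random.default_rng(0)
    V = lambda x: 0.5 * x**2
    betas = np.linspace(0.0, 3.0, 31)
    print(tempering_smc(V, betas, N=5000, n_mcmc=5, step=0.8, rng=rng))
    print((1.0 + 3.0) ** -0.5)                    # exact Z_n for this toy model
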

SLIDE 44

Restriction models

$\eta_n(dx) := \frac{1}{\mathcal{Z}_n}\, 1_{A_n}(x)\, \lambda(dx)$ with $A_n \downarrow$

◮ For any MCMC transition $M_n$ with target $\eta_n$: $\eta_n = \eta_n M_n$

◮ Updating of the subset:

$\eta_{n+1} = \Psi_{G_n}(\eta_n)$ with $G_n = 1_{A_{n+1}}$

Proof: $1_{A_{n+1}} = 1_{A_{n+1}} \times 1_{A_n}$

Consequence: $\eta_{n+1} = \eta_{n+1} M_{n+1} = \Psi_{G_n}(\eta_n) M_{n+1}$ and ($\lambda(A_0) = 1$)

$\lambda(A_n) = \mathcal{Z}_n = \prod_{0 \le p < n} \eta_p(G_p)$

SLIDE 45

Product models

$\eta_n(dx) := \frac{1}{\mathcal{Z}_n} \Big\{ \prod_{p=0}^n h_p(x) \Big\}\, \lambda(dx)$ with $h_p \ge 0$

◮ For any MCMC transition $M_n$ with target $\eta_n$: $\eta_n = \eta_n M_n$

◮ Updating of the product:

$\eta_{n+1} = \Psi_{G_n}(\eta_n)$ with $G_n = h_{n+1}$

Proof: $\prod_{p=0}^{n+1} h_p = h_{n+1} \times \prod_{p=0}^{n} h_p$

Consequence: $\eta_{n+1} = \eta_{n+1} M_{n+1} = \Psi_{G_n}(\eta_n) M_{n+1}$ and ($h_0 = 1$)

$\lambda\Big( \prod_{p=0}^n h_p \Big) = \mathcal{Z}_n = \prod_{0 \le p < n} \eta_p(G_p)$

SLIDE 46

  • Introduction
  • Feynman-Kac models
  • Some rare event models
  • Stochastic analysis
      - A brief review on genetic style algorithms
      - Stochastic linearization models
      - Current population models
      - Particle free energy
      - Genealogical tree models
      - Backward particle models

SLIDES 47-48

Equivalent particle algorithms

  Sequential Monte Carlo  | Sampling             | Resampling
  Particle Filters        | Prediction           | Updating
  Genetic Algorithms      | Mutation             | Selection
  Evolutionary Population | Exploration          | Branching
  Diffusion Monte Carlo   | Free evolutions      | Absorption
  Quantum Monte Carlo     | Walkers motions      | Reconfiguration
  Sampling Algorithms     | Transition proposals | Accept-reject-recycle

More botanical names: bootstrapping, spawning, cloning, pruning, replenishment, multi-level splitting, enrichment, go with the winner, ...

1950 $\le$ heuristic style algorithms $\le$ 1996 $\le$ particle Feynman-Kac models

Convergence analysis: CLT, LDP, $L_p$-estimates, empirical processes, moderate deviations, propagation of chaos, exact weak expansions, ...

Concentration analysis = exponential deviation probability estimates.

SLIDES 49-50

Stochastic linearization / Mean field particle models

◮ Discrete time models ($\eta_n = \mathrm{Law}(\overline{X}_n)$):

$\eta_n = \eta_{n-1} K_{n, \eta_{n-1}}$, simulated through the transitions $\xi^i_n \sim K_{n, \eta^N_{n-1}}(\xi^i_{n-1}, dx_n)$

$\eta^N_n = \eta^N_{n-1} K_{n, \eta^N_{n-1}} + \frac{1}{\sqrt{N}}\, W^N_n$

Theorem: $(W^N_n)_{n \ge 0} \to (W_n)_{n \ge 0}$, independent centered Gaussian fields.

◮ Continuous time models ($\eta_t = \mathrm{Law}(\overline{X}_t)$):

$\frac{d}{dt} \eta_t(f) = \eta_t(L_{t, \eta_t}(f))$, simulated through particles $\xi^i_t$ with generator $L_{t, \eta^N_t}$

$d\eta^N_t(f) = \eta^N_t(L_{t, \eta^N_t}(f))\, dt + \frac{1}{\sqrt{N}}\, dM^N_t(f)$

Theorem: $M^N_t(f) \to M_t(f)$, a Gaussian martingale with $d\langle M(f) \rangle_t = \eta_t\big(\Gamma_{L_{t, \eta_t}}(f, f)\big)\, dt$.

SLIDE 51

Current population models

Constants $(c_1, c_2)$ related to (bias, variance); $c$ a universal constant, independent of the time horizon. Test functions $f_n$ with $\|f_n\| \le 1$; $\forall\, (x \ge 0,\ n \ge 0,\ N \ge 1)$.

◮ The probability of the event

$\big| \big[\eta^N_n - \eta_n\big](f) \big| \le \frac{c_1}{N}\, (1 + x + \sqrt{x}) + \frac{c_2}{\sqrt{N}}\, \sqrt{x}$

is greater than $1 - e^{-x}$.

◮ For $x = (x_i)_{1 \le i \le d}$, the cells $(-\infty, x] = \prod_{i=1}^d (-\infty, x_i]$ in $E_n = \mathbb{R}^d$, with

$F_n(x) = \eta_n\big(1_{(-\infty, x]}\big)$ and $F^N_n(x) = \eta^N_n\big(1_{(-\infty, x]}\big)$:

the probability of the event

$\sqrt{N}\, \big\| F^N_n - F_n \big\| \le c\, \sqrt{d\, (x + 1)}$

is greater than $1 - e^{-x}$.

SLIDE 52

Particle free energy models

Constants $(c_1, c_2)$ related to (bias, variance); $c$ a universal constant, independent of the time horizon; $\forall\, (x \ge 0,\ n \ge 0,\ N \ge 1)$.

◮ Unbiasedness property:

$\mathbb{E}\Big[ \eta^N_n(f_n) \prod_{0 \le p < n} \eta^N_p(G_p) \Big] = \mathbb{E}\Big[ f_n(X_n) \prod_{0 \le p < n} G_p(X_p) \Big]$

◮ For any $\epsilon \in \{+1, -1\}$, the probability of the event

$\frac{\epsilon}{n} \log \frac{\mathcal{Z}^N_n}{\mathcal{Z}_n} \le \frac{c_1}{N}\, (1 + x + \sqrt{x}) + \frac{c_2}{\sqrt{N}}\, \sqrt{x}$

is greater than $1 - e^{-x}$.

Note ($0 \le \epsilon \le 1 \Rightarrow (1 - e^{-\epsilon}) \vee (e^{\epsilon} - 1) \le 2\epsilon$):

$e^{-\epsilon} \le \frac{z_N}{z} \le e^{\epsilon} \ \Rightarrow\ \Big| \frac{z_N}{z} - 1 \Big| \le 2\epsilon$

SLIDE 53

Genealogical tree models ($:= \eta^N_n$ in path space)

Constants $(c_1, c_2)$ related to (bias, variance); $c$ a universal constant, independent of the time horizon. Test functions $f_n$ with $\|f_n\| \le 1$; $\forall\, (x \ge 0,\ n \ge 0,\ N \ge 1)$.

◮ The probability of the event

$\big| \big[\eta^N_n - \mathbb{Q}_n\big](f) \big| \le c_1\, \frac{n+1}{N}\, (1 + x + \sqrt{x}) + c_2\, \sqrt{\frac{n+1}{N}}\, \sqrt{x}$

is greater than $1 - e^{-x}$.

◮ $\mathcal{F}_n =$ indicator functions $f_n$ of cells in $E_n = (\mathbb{R}^{d_0} \times \cdots \times \mathbb{R}^{d_n})$. The probability of the event

$\sup_{f_n \in \mathcal{F}_n} \big| \eta^N_n(f_n) - \mathbb{Q}_n(f_n) \big| \le c\, (n+1)\, \sqrt{\frac{\big( \sum_{0 \le p \le n} d_p \big)(x + 1)}{N}}$

is greater than $1 - e^{-x}$.

SLIDE 54

Backward particle models

Constants $(c_1, c_2)$ related to (bias, variance); $c$ a universal constant, independent of the time horizon. $\overline{f}_n$ a normalized additive functional with $\|f_p\| \le 1$; $\forall\, (x \ge 0,\ n \ge 0,\ N \ge 1)$.

◮ The probability of the event

$\big| \big[\mathbb{Q}^N_n - \mathbb{Q}_n\big](\overline{f}_n) \big| \le \frac{c_1}{N}\, \big(1 + (x + \sqrt{x})\big) + c_2\, \sqrt{\frac{x}{N (n+1)}}$

is greater than $1 - e^{-x}$.

◮ $\overline{f}_{a,n}$ the normalized additive functional with $f_p = 1_{(-\infty, a]}$, $a \in \mathbb{R}^d = E_n$. The probability of the event

$\sup_{a \in \mathbb{R}^d} \big| \big[\mathbb{Q}^N_n - \mathbb{Q}_n\big](\overline{f}_{a,n}) \big| \le c\, \sqrt{\frac{d\, (x + 1)}{N}}$

is greater than $1 - e^{-x}$.