Fluid Limits for some MCMC samplers Gersende FORT, CNRS, Paris, - - PowerPoint PPT Presentation

Fluid Limits for some MCMC samplers. Gersende FORT, CNRS, Paris, France. Joint work with Sean MEYN (Univ. of Illinois, Urbana, USA), Eric MOULINES (GET, France), Pierre PRIOURET (University Paris VI, France).


slide-1
SLIDE 1

Fluid Limits for some MCMC samplers

◮ Gersende FORT, CNRS, Paris, France.

Joint work with

◮ Sean MEYN (Univ. of Illinois, Urbana, USA),

◮ Eric MOULINES (GET, France),

◮ Pierre PRIOURET (University Paris VI, France).

slide-3
SLIDE 3

Outline of the talk

We are interested in

◮ the existence and stability of fluid limits for skip-free Markov chains;

◮ their use in the study of (some) MCMC samplers.

We will discuss

1. Fluid limits for skip-free Markov chains

◮ the existence of fluid limits
◮ their characterization
◮ their stability and the stability of the Markov chain.

2. Applications to Metropolis-Hastings Markov chains

◮ convergence of the samplers
◮ how to tune the parameters?

slide-5
SLIDE 5

MCMC samplers / Hastings-Metropolis

Sample from a (complex, unnormalized) distribution $\pi$ on $\mathbb{R}^d$ when exact sampling is not possible: define an ergodic Markov chain $(\Phi_n, n \geq 0)$ with unique stationary distribution $\propto \pi$.

  • Ex. Hastings-Metropolis algorithm

Given $\Phi_t$, define $\Phi_{t+1}$ by

· $\Phi_{t+1/2} \sim Q(\Phi_t, \cdot)$;
· $\Phi_{t+1} = \Phi_{t+1/2}$ with prob. $\alpha(\Phi_t, \Phi_{t+1/2})$, and $\Phi_{t+1} = \Phi_t$ with prob. $1 - \alpha(\Phi_t, \Phi_{t+1/2})$,

where

$$\alpha(x, z) = 1 \wedge \frac{\pi(z)\, Q(z, x)}{\pi(x)\, Q(x, z)}.$$
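A minimal sketch of one Hastings-Metropolis transition as described on this slide, written in Python with the standard library only. The names `mh_step`, `run_chain`, `log_pi`, `propose` and `log_q` are hypothetical and are not part of the talk; the usage example assumes a one-dimensional standard Gaussian target and a Gaussian random-walk proposal.

```python
import math
import random


def mh_step(x, log_pi, propose, log_q, rng):
    """One Hastings-Metropolis transition from the state x.

    log_pi(x)       : log of the unnormalized target density pi.
    propose(x, rng) : draws z ~ Q(x, .).
    log_q(x, z)     : log density of Q(x, .) evaluated at z.
    """
    z = propose(x, rng)
    # log of pi(z) Q(z, x) / (pi(x) Q(x, z))
    log_ratio = log_pi(z) + log_q(z, x) - log_pi(x) - log_q(x, z)
    # accept z with probability alpha(x, z) = min(1, ratio)
    if rng.random() < math.exp(min(0.0, log_ratio)):
        return z
    return x


def run_chain(x0, n, sigma, seed=0):
    """Illustrative run: pi = N(0, 1) (assumption), Q(x, .) = N(x, sigma^2)."""
    rng = random.Random(seed)

    def log_pi(x):
        return -0.5 * x * x

    def propose(x, rng):
        return x + sigma * rng.gauss(0.0, 1.0)

    def log_q(x, z):
        return -0.5 * ((z - x) / sigma) ** 2

    path = [x0]
    for _ in range(n):
        path.append(mh_step(path[-1], log_pi, propose, log_q, rng))
    return path
```

Working with log densities avoids overflow in the ratio; since only the ratio of $\pi$ values enters, the normalizing constant of $\pi$ is never needed.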

slide-7
SLIDE 7

MCMC samplers / Hastings-Metropolis

Problems :

◮ (⋆) Convergence? (ergodicity)

$$\kappa(n)\, \left| E_x[g(\Phi_n)] - \pi(g) \right| \to 0 \qquad \forall x,\ g \in\ ?$$

◮ Limit theorems

$$n^{-1} \sum_{k=1}^{n} g(\Phi_k) \to_{a.s.} \pi(g), \qquad \frac{1}{\sqrt{n}} \sum_{k=1}^{n} \{ g(\Phi_k) - \pi(g) \} \to_{d} N(0, \sigma_g^2).$$

◮ (⋆) How to tune the parameters, i.e. (here) the proposal kernel $Q(x, y)$?

Hereafter, illustrations in the case

· symmetric HM: $Q(x, y) = q(|x - y|)$
· $q(z) \sim \sigma\, N_d(0, I)[z]$
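A minimal sketch of the symmetric case used in the illustrations, assuming a Gaussian increment with scale `sigma` on each coordinate. Because $Q(x, z) = Q(z, x)$, the acceptance probability reduces to $1 \wedge \pi(z)/\pi(x)$. The function name `rwm_ergodic_average` and its arguments are hypothetical; it returns the ergodic average $n^{-1} \sum_{k=1}^{n} g(\Phi_k)$ from the first limit theorem above.

```python
import math
import random


def rwm_ergodic_average(log_pi, g, x0, n, sigma, seed=0):
    """Average g along a symmetric random-walk HM chain in R^d.

    log_pi : log of the unnormalized target density, takes a list of floats.
    g      : real-valued test function of the state.
    """
    rng = random.Random(seed)
    x = list(x0)
    lp = log_pi(x)
    total = 0.0
    for _ in range(n):
        # proposal: current state plus a sigma * N_d(0, I) increment
        z = [xi + sigma * rng.gauss(0.0, 1.0) for xi in x]
        lz = log_pi(z)
        # symmetric proposal: alpha(x, z) = min(1, pi(z) / pi(x))
        if rng.random() < math.exp(min(0.0, lz - lp)):
            x, lp = z, lz
        total += g(x)
    return total / n
```

For example, with a standard Gaussian target on $\mathbb{R}^2$ and $g(x) = |x|^2$, the returned value should be close to $\pi(g) = 2$ for large `n`.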

slide-9
SLIDE 9

Existence of fluid limits (a)

↪ Define a normalized process

(i) in the initial point: $\eta_r(0; x) = \frac{1}{r} \Phi_0 = x$, i.e. $\Phi_0 = r x$;

(ii) in time and space: $\eta_r(t; x) = \frac{1}{r} \Phi_{\lfloor t r \rfloor}$, i.e. $\eta_r(t; x) = \frac{1}{r} \Phi_k$ on $\left[ \frac{k}{r} ; \frac{k+1}{r} \right)$.

◮ Distributions

· $P_x$: distribution of the Markov chain with initial distribution $\delta_x$.
· $Q_{r;x}$: image probability of $P_x$ by $\eta_r(\cdot\,; x)$, a probability on the space of càdlàg functions $\mathbb{R}_+ \to X$.
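A minimal sketch of the normalization in Python, assuming the chain has already been simulated and stored as a list `path` with `path[k]` holding $\Phi_k$ as a list of floats. The function name `eta` is hypothetical.

```python
import math


def eta(path, r, t):
    """Return eta_r(t; x) = Phi_{floor(t r)} / r for a stored path.

    The path is expected to start at Phi_0 = r x, so that eta(path, r, 0) = x.
    Note: floor(t * r) is computed in floating point, so values of t * r that
    are mathematically integers may round to the neighbouring index.
    """
    k = math.floor(t * r)
    if k < 0 or k >= len(path):
        raise IndexError("path too short for the requested time")
    return [c / r for c in path[k]]
```

The process is piecewise constant: every $t$ in $[k/r, (k+1)/r)$ reads the same state $\Phi_k$.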

slide-13
SLIDE 13

Existence of fluid limits (b)

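◮ Definition: $Q_x$ is a fluid limit if there exist $\{r_n\}_n \to +\infty$ and $\{x_n\}_n \to x$ s.t. $Q_{r_n; x_n} \Rightarrow Q_x$ on the space of càdlàg functions $\mathbb{R}_+ \to X$.

$$\Phi_{k+1} = \Phi_k + E[\Phi_{k+1} | \mathcal{F}_k] - \Phi_k + \Phi_{k+1} - E[\Phi_{k+1} | \mathcal{F}_k] = \Phi_k + \underbrace{E_x[\Phi_{k+1} - \Phi_k | \mathcal{F}_k]}_{\Delta(\Phi_k)} + \underbrace{(\Phi_{k+1} - E_x[\Phi_{k+1} | \mathcal{F}_k])}_{\epsilon_{k+1},\ \text{martingale increment}}.$$

◮ Result: if

· $\exists p > 1$, $\lim_{K \to +\infty} \sup_{x \in X} E_x\left[ |\epsilon_1|^p\, \mathbb{1}_{|\epsilon_1| > K} \right] = 0$,
· $\sup_{x \in X} |\Delta(x)| < \infty$,

then fluid limits exist and are probabilities on the space of continuous functions (whatever the initial point on the unit sphere).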
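A minimal numerical sketch of the drift $\Delta(x) = E_x[\Phi_1 - \Phi_0]$ for a one-dimensional symmetric HM chain with Gaussian increment, computed by a midpoint rule rather than by simulation. The function name `drift` and the parameters `n` and `width` are hypothetical; the quadrature truncates the increment to `width` standard deviations, which is an approximation.

```python
import math


def drift(x, log_pi, sigma, n=4000, width=8.0):
    """Approximate Delta(x) = integral of z q(z) alpha(x, x + z) dz.

    q is the N(0, sigma^2) density and alpha(x, y) = min(1, pi(y) / pi(x)).
    A rejected proposal contributes a zero increment, so only accepted
    moves enter the integral.
    """
    h = 2.0 * width * sigma / n  # quadrature step
    lp = log_pi(x)
    total = 0.0
    for j in range(n):
        z = -width * sigma + (j + 0.5) * h  # midpoint of cell j
        q = math.exp(-0.5 * (z / sigma) ** 2) / (sigma * math.sqrt(2.0 * math.pi))
        alpha = math.exp(min(0.0, log_pi(x + z) - lp))
        total += z * q * alpha * h
    return total
```

Since $\alpha \leq 1$, this drift is bounded by $E|Z| = \sigma \sqrt{2/\pi}$ for every $x$, which is the kind of uniform bound on $\Delta$ required by the result above. For a standard Gaussian target (an assumption made only for illustration) the drift vanishes at $0$ by symmetry and points toward the origin far out in the tail.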

slide-14
SLIDE 14

Example 1 : (regular case)

[Figure: level curves of the target density, followed by three panels for r = 100, r = 1000 and r = 5000.]

$\pi(x, y) \propto (1 + x^2 + y^2 + x^8 y^2) \exp(-(x^2 + y^2))$, $q \sim N(0, 4)$.

slide-15
SLIDE 15

Example 2 : (irregular case)

[Figure: level curves of the target density, followed by three panels for r = 100, r = 1000 and r = 5000.]

$\pi(x, y) \propto N(0, \Gamma_1^{-1}) + N(0, \Gamma_2^{-1})$, $q \sim N(0, 1)$.

slide-16
SLIDE 16

Characterisation of the fluid limits

↪ Can we describe the distributions $Q_x$?

[Figure: $\pi(x, y) \propto$ mixture of Gaussians, $q \sim N(0, I)$, r = 5000, T = 5.]

slide-18
SLIDE 18

Characterization (b)

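$$\Phi_{k+1} = \Phi_k + \underbrace{(E_x[\Phi_{k+1} | \mathcal{F}_k] - \Phi_k)}_{\Delta(\Phi_k)} + \underbrace{(\Phi_{k+1} - E_x[\Phi_{k+1} | \mathcal{F}_k])}_{\epsilon_{k+1},\ \text{martingale increment}}$$

◮ For the normalized process

$$\eta_r\left(\tfrac{k+1}{r}, x\right) = \tfrac{1}{r} \Phi_{k+1} = \eta_r\left(\tfrac{k}{r}, x\right) + \tfrac{1}{r}\, \Delta\left(r\, \eta_r\left(\tfrac{k}{r}, x\right)\right) + \tfrac{1}{r}\, \epsilon_{k+1} = \eta_r\left(\tfrac{k}{r}, x\right) + \tfrac{1}{r}\, h\left(\eta_r\left(\tfrac{k}{r}, x\right)\right) + \tfrac{1}{r}\, (\xi_k + \epsilon_{k+1})$$

where $h(x) = \lim_{r \to +\infty} \Delta(r x)$, so that $\xi_k = \Delta\left(r\, \eta_r\left(\tfrac{k}{r}, x\right)\right) - h\left(\eta_r\left(\tfrac{k}{r}, x\right)\right)$.

◮ Thus the dynamics

$$\mu\left(\tfrac{k+1}{r}\right) = \mu\left(\tfrac{k}{r}\right) + \tfrac{1}{r}\, h\left(\mu\left(\tfrac{k}{r}\right)\right)$$

→ ODE: $\dot{\mu}(t) = h(\mu(t))$, in an additive noise. The recursion is the Euler scheme with step $1/r$ for this ODE.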
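A minimal sketch of the noise-free recursion $\mu((k+1)/r) = \mu(k/r) + h(\mu(k/r))/r$, i.e. the Euler scheme for $\dot{\mu} = h(\mu)$. The function names `euler_flow` and `radial_drift` are hypothetical, and the radial field $h(x) = -c\, x/|x|$ is an assumption chosen only because its flow is known in closed form ($|\mu(t)| = |x| - c t$ until the origin is reached).

```python
import math


def euler_flow(h, x, r, t_end):
    """Iterate mu((k+1)/r) = mu(k/r) + h(mu(k/r)) / r from mu(0) = x.

    Returns the list of states on the grid 0, 1/r, 2/r, ..., t_end.
    """
    mu = list(x)
    path = [list(mu)]
    for _ in range(int(round(t_end * r))):
        step = h(mu)
        mu = [m + s / r for m, s in zip(mu, step)]
        path.append(list(mu))
    return path


def radial_drift(c):
    """Illustrative field h(x) = -c x / |x| (hypothetical, not from the talk)."""
    def h(x):
        norm = math.sqrt(sum(xi * xi for xi in x))
        if norm == 0.0:
            return [0.0 for _ in x]  # h is not defined at 0; stay put there
        return [-c * xi / norm for xi in x]
    return h
```

Starting from $x = (1, 0)$ with $c = 0.4$, the state at $t = 1$ is $(0.6, 0)$ up to rounding, matching the closed-form flow.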

slide-21
SLIDE 21

Characterisation (c)

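◮ Theorem. If

· the fluid limit exists,
· there exists an open cone $O$ of $X \setminus \{0\}$,
· $h : O \to X$ s.t. $\sup_{x \in H} \left| r^{\beta} \Delta(r x) - |x|^{-\beta} h(x) \right| \to 0$ as $r \to +\infty$, for any compact $H \subseteq O$,

then for all $0 \leq s \leq t$, on $\{\eta,\ \eta(u) \in O,\ s \leq u \leq t\}$,

$$\sup_{s \leq u \leq t} \left| \eta(u) - \eta(s) - \int_s^u h \circ \eta(v)\, dv \right| = 0, \qquad Q^{\beta}_x\text{-a.s.}$$

◮ i.e. the fluid limit $Q^{\beta}_x$ is a Dirac mass at the point $\eta$ satisfying

$$\eta(u) = \eta(s) + \int_s^u h \circ \eta(v)\, dv, \qquad s \leq u \leq t,$$

whenever $\eta([s, t]) \subset O$.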

slide-22
SLIDE 22

Example 3 : Super-exponential case, O = X \ {0}

[Figure, four panels. Upper left: level curves of $\pi$. Upper right: rejection area. Lower left: level curves, $\Delta$ and $h$. Lower right: process $\eta^{\beta}_x$ and flow of the ODE.]

slide-24
SLIDE 24

Example 4 : Super-exponential case, $O \subsetneq X \setminus \{0\}$

[Figure, three panels: level curves of $\pi$; level curves, $\Delta$ and $h$; process $\eta^{\beta}_x$ and flow of the ODE.]

↪ There exists $T_0 < \infty$ s.t. for all $x \in X$, $|x| = 1$, and any fluid limit $Q_x$, $Q_x\left(\eta,\ \eta([0, T_0]) \cap O \neq \emptyset\right) = 1$.

slide-27
SLIDE 27

[Appl 1] Stability : fluid limit → Markov Chain

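◮ A fluid limit is stable if $\exists\, T > 0$ and $0 < \rho < 1$ s.t. $\forall x$, $|x| = 1$,

$$Q_x\left(\eta,\ \inf_{[0, T]} |\eta(\cdot)| \leq \rho\right) = 1.$$

◮ If

· the chain is irreducible and aperiodic, and compact sets are petite,
· the fluid limits exist,
· the fluid limits are stable,

then polynomial ergodicity holds:

$$(n + 1)^{q - 1} \sup_{\{f,\ |f| \leq 1 + |x|^{p - q}\}} \left| E_x[f(\Phi_n)] - \pi(f) \right| \to 0, \qquad 1 \leq q \leq p.$$

◮ Stability ODE / Stability fluid limit

(i) Case 1: $O = X \setminus \{0\}$.
$\exists\, T, \rho$, $\forall x$, $|x| = 1$, $\inf_{[0, T]} |\mu(\cdot\,; x)| \leq \rho < 1$.

(ii) Case 2: $O \subsetneq X \setminus \{0\}$ and $Q_x\left(\eta,\ \eta([0, T_0]) \cap O \neq \emptyset\right) = 1$.
$\forall K > 0$, $\exists\, T_K, \rho_K$, $\forall x \in O$, $|x| \leq K$, $\inf_{[0, T_K \wedge T_x]} |\mu(\cdot\,; x)| \leq \rho_K < 1$.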
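A minimal numerical sketch of the Case 1 criterion, assuming $X = \mathbb{R}^2$ and approximating the ODE flow by an Euler scheme. It only tests a finite grid of starting points on the unit circle and a finite time grid, so it is a sanity check and not a proof of stability. All function names are hypothetical, and the radial field $h(x) = -c\, x/|x|$ is an illustrative assumption.

```python
import math


def min_norm_along_flow(h, x, r, t_end):
    """Smallest |mu(t; x)| over the Euler grid of [0, t_end] with step 1/r."""
    mu = list(x)
    best = math.sqrt(sum(m * m for m in mu))
    for _ in range(int(round(t_end * r))):
        mu = [m + s / r for m, s in zip(mu, h(mu))]
        best = min(best, math.sqrt(sum(m * m for m in mu)))
    return best


def ode_is_stable(h, t_end, rho, r=1000, n_dirs=16):
    """Check inf_{[0, T]} |mu(.; x)| <= rho for n_dirs unit vectors x of R^2."""
    for j in range(n_dirs):
        theta = 2.0 * math.pi * j / n_dirs
        x = [math.cos(theta), math.sin(theta)]
        if min_norm_along_flow(h, x, r, t_end) > rho:
            return False
    return True


def radial_drift(c):
    """Illustrative field h(x) = -c x / |x| (hypothetical, not from the talk)."""
    def h(x):
        norm = math.sqrt(sum(xi * xi for xi in x))
        if norm == 0.0:
            return [0.0 for _ in x]  # h is not defined at 0; stay put there
        return [-c * xi / norm for xi in x]
    return h
```

With $c = 0.4$ the flow satisfies $|\mu(t; x)| = 1 - 0.4\, t$, so the check succeeds for $T = 2$, $\rho = 0.5$ and fails for $T = 1$, $\rho = 0.5$.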

slide-28
SLIDE 28

[Appl 2] How to choose the parameters of the algorithms ?

◮ Hybrid Hastings-Metropolis :

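$$P(x, dy) = \sum_{k=1}^{d} \omega_k\, P_k(x, dy), \qquad \sum_{k=1}^{d} \omega_k = 1.$$

· choose a direction $i \in \{1, \cdots, d\}$ with prob. $\omega_i$;
· update the $i$-th component with an $\mathbb{R}$-valued HM (proposal $N(0, \sigma_i^2)$).

◮ Under conditions · · ·

$$h_i(x) = \frac{1}{\sqrt{2\pi}}\, \omega_i\, \sigma_i\ \mathrm{sign}\left( \lim_{r \to \infty} \frac{\nabla_i \ln \pi(r x)}{|\nabla \ln \pi(r x)|} \right).$$

◮ “Parameters”: $(\omega_k, \sigma_k)_{1 \leq k \leq d}$, for ex.

$$\omega_i = \frac{1}{d}, \qquad \sigma_i = c_i \quad \text{or} \quad \sigma_i = \ell \lim_{r \to \infty} \frac{|\nabla_i \ln \pi(r x)|}{|\nabla \ln \pi(r x)|}.$$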
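A minimal sketch of one transition of the hybrid sampler described on this slide: pick a coordinate $i$ with probability $\omega_i$, then apply a one-dimensional symmetric HM move to that coordinate with proposal $N(0, \sigma_i^2)$. The function name `hybrid_hm_step` and its arguments are hypothetical.

```python
import math
import random


def hybrid_hm_step(x, log_pi, weights, sigmas, rng):
    """One transition of P = sum_k omega_k P_k.

    x       : current state, list of d floats.
    log_pi  : log of the unnormalized target density.
    weights : selection probabilities (omega_1, ..., omega_d), summing to 1.
    sigmas  : proposal standard deviations (sigma_1, ..., sigma_d).
    """
    # choose the direction i with probability omega_i
    i = rng.choices(range(len(x)), weights=weights)[0]
    z = list(x)
    z[i] += sigmas[i] * rng.gauss(0.0, 1.0)
    # symmetric one-dimensional proposal: accept with prob. min(1, pi(z) / pi(x))
    if rng.random() < math.exp(min(0.0, log_pi(z) - log_pi(x))):
        return z
    return x
```

Tuning then amounts to choosing the two lists `weights` and `sigmas`, for instance `weights = [1/d] * d` with constant or direction-dependent scales as suggested above.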

slide-29
SLIDE 29

Example 5 : Gaussian on $\mathbb{R}^2$, diagonal dispersion matrix

[Figure: level curves of the target density and accompanying plots.]

$\Gamma = \mathrm{diag}(1, 4)$

slide-30
SLIDE 30

Example 6 : Gaussian on $\mathbb{R}^2$, non-diagonal dispersion matrix

[Figure: level curves of the target density and accompanying plots.]

$$\Gamma^{-1} = \begin{pmatrix} 3 & 1 \\ 1 & 2 \end{pmatrix}$$

slide-31
SLIDE 31

Example 7 : Gaussian on $\mathbb{R}^2$, non-diagonal dispersion matrix

[Figure: level curves of the target density and accompanying plots.]

$$\Gamma^{-1} \propto \begin{pmatrix} 1 & -0.9 \\ -0.9 & 1 \end{pmatrix}, \qquad \sigma_1^2 = 0.5, \quad \sigma_2^2 = 2/0.9$$

slide-32
SLIDE 32

Conclusion

◮ Existence of fluid limits for skip-free Markov chains.

◮ [Not detailed] Case when, for some $0 < \beta < 1$, $\eta_r(t; x) = \frac{1}{r} \Phi_{\lfloor t r^{1+\beta} \rfloor}$ ↪ ergodicity at a lower rate.

◮ Characterization of the fluid limit.

◮ Stable fluid limits → ergodic Markov chains, but · · ·

◮ more information on the Markov chain · · · other normalization (diffusion)?