Scaling Analysis of MCMC algorithms, Alexandre Thiéry (University of Warwick)



SLIDE 1

The Scaling Analysis Method High Dimensional MCMC Concentration near a manifold

Scaling Analysis of MCMC algorithms

Alexandre Thiéry (University of Warwick)

MCQMC, February 2012

Collaboration with Andrew Stuart (Warwick), Gareth Roberts (Warwick), Natesh Pillai (Harvard) and Alex Beskos (UCL).

Funded by CRISM.

SLIDE 2

Outline

1. The Scaling Analysis Method
2. High Dimensional MCMC
3. Concentration near a manifold

SLIDE 3

Outline

1. The Scaling Analysis Method
2. High Dimensional MCMC
3. Concentration near a manifold

SLIDE 4

Purposes

  • Analysis of asymptotic complexity [Roberts and co-workers, 1997]
  • Avoid spectral gaps, log-Sobolev inequalities, etc.
  • Provide more intuition on the behaviour of algorithms
  • Easy-to-follow guidelines for tuning MCMC algorithms

SLIDE 5

Sequence of MCMC algorithms

  • Sequence of target distributions $\pi^\alpha$ indexed by a parameter $\alpha$.
  • Sequence of MCMC proposals: (almost always) local proposals of the form $x^\star = a(\alpha)\, x + \sigma(\alpha)\, Z$.
  • Sequence of MCMC chains indexed by the parameter $\alpha$: $x^\alpha = (x^{1,\alpha}, x^{2,\alpha}, x^{3,\alpha}, \ldots)$.
  • We are interested in the limit $\alpha \to \alpha_\infty$.

SLIDE 6

Example of Limiting Regime

  • $\alpha$ = dimension of the state space. Interest in $\alpha \to \alpha_\infty = \infty$.
  • Consider a target distribution with density of the form $\pi^\alpha(x) \propto \exp\big(-\Psi(x)/\alpha\big)$. Interest in $\alpha \to \alpha_\infty = 0$.

SLIDE 7

Interpolation

  • Choose a time discretisation parameter $\delta = \delta(\alpha)$ such that $\delta \to 0$ as $\alpha \to \alpha_\infty$.
  • Define the accelerated process $z^\alpha$ by $z^\alpha(t) = x^{t/\delta(\alpha),\,\alpha}$.
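A minimal sketch of the interpolation step, assuming a scalar-valued chain stored as an array. The slide only pins down the accelerated process at the grid times kδ; linear interpolation between grid points is an assumption made here so that the path is continuous, and the helper name `accelerated_path` is hypothetical.

```python
import numpy as np

def accelerated_path(chain, delta, times):
    """Evaluate the accelerated process z(t) = x_{t/delta} at the given times.

    chain : 1-D array with chain[k] = x_k (scalar-valued chain, an assumption)
    delta : time discretisation parameter delta(alpha)
    times : times t in [0, delta * (len(chain) - 1)]
    """
    chain = np.asarray(chain, dtype=float)
    grid = delta * np.arange(len(chain))   # step k of the chain sits at time k * delta
    # Linear interpolation between grid points (assumption) keeps the path continuous.
    return np.interp(times, grid, chain)

# Three chain steps with delta = 0.5 cover the time interval [0, 1].
z = accelerated_path([0.0, 2.0, 4.0], delta=0.5, times=[0.0, 0.25, 0.5, 1.0])
```

Halving δ while keeping the time horizon fixed doubles the number of chain steps that the path consumes, which is the sense in which the process is accelerated.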

SLIDE 8

Scaling Limit

SLIDE 9

Scaling Limit

A scaling limit result is a theorem of the form:

Theorem (Scaling Limit)
$$\lim_{\alpha \to \alpha_\infty} z^\alpha = z$$

  • The convergence is on the path space $C([0, T], H)$.
  • The limiting process is typically a non-trivial diffusion, jump or Lévy process.

SLIDE 10

Interpretation

  • The limiting process $z(t)$ takes time $T_{\mathrm{mix}}$ to mix.
  • Using the approximation $x^{k,\alpha} = z^\alpha(k\delta) \approx z(k\delta)$, it follows that the chain $x^\alpha$ takes roughly $k \approx T_{\mathrm{mix}}/\delta(\alpha)$ steps to mix.
  • Consequently, as $\alpha \to \alpha_\infty$, the complexity of the MCMC algorithm grows as $\delta(\alpha)^{-1}$.
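The step count can be written as a short worked equation. The two step-size choices in the comments are the ones stated in the RWM and MALA theorems of this talk; nothing else is assumed.

```latex
% k steps of the chain correspond to time k * delta(alpha) for the limiting process,
% so the chain has mixed once k * delta(alpha) reaches T_mix:
\[
  k\,\delta(\alpha) \approx T_{\mathrm{mix}}
  \quad\Longrightarrow\quad
  k \approx \frac{T_{\mathrm{mix}}}{\delta(\alpha)} .
\]
% RWM:  \delta(N) \propto N^{-1}    gives  k \approx T_{\mathrm{mix}}\, N
% MALA: \delta(N) \propto N^{-1/3}  gives  k \approx T_{\mathrm{mix}}\, N^{1/3}
```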

SLIDE 11

Mixing

SLIDE 12

Outline

1. The Scaling Analysis Method
2. High Dimensional MCMC
3. Concentration near a manifold

SLIDE 13

Some motivations: Bayesian Inverse Problems

  • Consider an infinite dimensional Hilbert space $H$.
  • Reconstruction of unknown data $x \in H$ from a noisy observation $y = F(x) + \text{(noise)}$.
  • Suppose that the noise is Gaussian and put a Gaussian prior $\pi_0 = N(0, C)$ on the data $x$ to be estimated.
  • The posterior probability distribution $\pi$ (living on $H$) is given by
$$\frac{d\pi}{d\pi_0}(x) \propto e^{-\Phi(x)} \quad \text{where} \quad \Phi(x) = \tfrac{1}{2}\,\|F(x) - y\|_\Gamma^2 .$$
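A small sketch of the potential for Gaussian noise, under stated assumptions: the observation y is finite dimensional, Γ is the noise covariance matrix, the weighted norm is read as ‖v‖²_Γ = vᵀΓ⁻¹v (a common convention, not spelled out on the slide), and Φ is the least-squares misfit. The function name `potential` and the toy forward map are hypothetical.

```python
import numpy as np

def potential(x, y, forward, gamma):
    """Phi(x) = 0.5 * || F(x) - y ||_Gamma^2, with ||v||_Gamma^2 = v^T Gamma^{-1} v."""
    residual = forward(x) - y
    # Solve Gamma w = residual rather than forming the inverse explicitly.
    return 0.5 * residual @ np.linalg.solve(gamma, residual)

# Toy check: identity forward map and unit noise covariance (both assumptions).
phi_value = potential(np.array([1.0, 2.0]), np.zeros(2), lambda u: u, np.eye(2))
```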

SLIDE 14

Temperature Field Reconstruction

SLIDE 15

Some motivations: Conditioned Diffusions

  • Consider a diffusion with constant volatility coefficient (see Lamperti) on the interval $I = [0, T]$,
$$dX = -\nabla U(X)\,dt + \sigma\,dW \quad \text{with} \quad X_0 = x_-,\ X_T = x_+ .$$
  • Call $\pi$ the law of $(X_t)_{t \in I} \in H = L^2(I)$.
  • The law of the diffusion $X$ is absolutely continuous (Girsanov) with respect to the Wiener bridge measure $\pi_0$,
$$dY = \sigma\,dW \quad \text{with} \quad Y_0 = x_-,\ Y_T = x_+ .$$
  • One can explicitly write down (without stochastic integral) the change of probability
$$\frac{d\pi}{d\pi_0}(x) \propto e^{-\Phi(x)} .$$

SLIDE 16

Conditioned Diffusion

SLIDE 17

Finite Dimensional Discretisation

  • Let $\varphi_1, \varphi_2, \ldots, \varphi_k, \ldots$ be the eigenfunctions of the covariance operator $C$.
  • Let $P^N(\cdot)$ denote the orthogonal projection, in $H$, onto $\mathrm{span}(\varphi_1, \ldots, \varphi_N)$, and let $\pi_0^N \overset{D}{\sim} P^N(\pi_0)$.
  • The finite dimensional (but living on $H$) posterior $\pi^N$ is given by
$$\frac{d\pi^N}{d\pi_0^N}(x) \propto e^{-\Phi(P^N x)} .$$
  • One can implement all the algorithms in $\mathbb{R}^N$ but analyse them in $H$.
  • Other (more natural) discretisations are possible.
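A minimal sketch of this discretisation in coordinates, assuming states are stored as coefficient vectors in the eigenbasis of C. The spectrum λ_k = k⁻² and the function names `project` and `sample_projected_prior` are hypothetical choices for illustration only.

```python
import numpy as np

def project(coeffs, n):
    """P^N in eigen-coordinates: keep the first n coefficients, zero the rest."""
    out = np.zeros_like(coeffs)
    out[:n] = coeffs[:n]
    return out

def sample_projected_prior(eigenvalues, n, rng):
    """Draw P^N(xi) with xi ~ N(0, C), where C phi_k = eigenvalues[k] * phi_k."""
    zeta = rng.standard_normal(n)               # independent N(0, 1) coefficients
    return np.sqrt(eigenvalues[:n]) * zeta      # truncated Karhunen-Loeve expansion

rng = np.random.default_rng(0)
lam = 1.0 / np.arange(1, 101) ** 2              # hypothetical spectrum lambda_k = k^-2
xi_n = sample_projected_prior(lam, 10, rng)     # coordinates of P^10(xi) in R^10
```

Working in eigen-coordinates makes C diagonal, which is why algorithms can be implemented in R^N while the analysis is carried out in H.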

SLIDE 18

Random Walk Metropolis (RWM) algorithm

  • RWM for the target distribution $\pi^\alpha = \pi^N$:
$$x^\star = x + \sqrt{\delta(N)}\; P^N(\xi) \quad \text{with} \quad \xi \overset{D}{\sim} \pi_0 = N(0, C).$$
  • Discretisation of Brownian motion with covariance $P^N(C)$ between $t$ and $t + \delta(N)$.

Diffusion Limit
Take $\delta(N) = N^{-p}$ for any $p \ge 1$. The limit
$$\lim_{N \to \infty} z^N = z$$
exists and is a non-trivial ergodic diffusion process.
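The RWM proposal above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: states are coefficient vectors in the eigenbasis of C, the spectrum λ_k = k⁻² is hypothetical, the accept/reject step is the standard Metropolis rule for the Lebesgue density of the projected target on R^N, and the toy run takes Φ ≡ 0.

```python
import numpy as np

def rwm_step(x, delta, lam, phi, rng):
    """One RWM step targeting pi^N, written in the eigenbasis of C.

    x   : current state, coordinates in R^N
    lam : first N eigenvalues of C (C is diagonal in these coordinates)
    phi : potential Phi as a function of the coordinates
    """
    xi = np.sqrt(lam) * rng.standard_normal(x.size)      # P^N(xi), xi ~ N(0, C)
    proposal = x + np.sqrt(delta) * xi

    def log_target(u):
        # log density of pi^N w.r.t. Lebesgue measure on R^N, up to a constant
        return -phi(u) - 0.5 * np.sum(u ** 2 / lam)

    accept = np.log(rng.uniform()) < log_target(proposal) - log_target(x)
    return (proposal if accept else x), bool(accept)

rng = np.random.default_rng(1)
n = 50
lam = 1.0 / np.arange(1, n + 1) ** 2                     # hypothetical spectrum
x = np.sqrt(lam) * rng.standard_normal(n)                # start from the prior
accepts = []
for _ in range(200):
    x, ok = rwm_step(x, delta=1.0 / n, lam=lam, phi=lambda u: 0.0, rng=rng)
    accepts.append(ok)
acceptance_rate = np.mean(accepts)
```

With δ = 1/N the empirical acceptance rate should settle strictly between 0 and 1. Rerunning with a fixed δ and growing N is a quick way to watch it collapse.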

SLIDE 19

Scaling Analysis of RWM

J. Mattingly, N. Pillai and A.M. Stuart, 2011

Theorem
Consider RWM with increment $\delta(N) \approx N^{-1}$.
  • The limit $z^N \Rightarrow z$ holds weakly in $C([0, T], H^s)$.
  • The limit process $z$ is an $H$-valued Langevin diffusion that is reversible with respect to $\pi$.
  • For $\delta(N) \propto N^{-1}$, the limiting acceptance probability satisfies $0 < p < 1$.
  • For $\delta(N) \propto N^{-(1+\varepsilon)}$, the limiting acceptance probability is $p = 1$.
  • For $\delta(N) \propto N^{-(1-\varepsilon)}$, the acceptance probability is exponentially small.
  • The complexity of RWM grows as $O(N)$ as $N \to \infty$.

SLIDE 20

Langevin Diffusion

  • Probability distribution with density $\pi(x) \propto e^{-L(x)}$.
  • $dx = -\nabla L(x)\,dt + \sqrt{2}\,dW$ is $\pi$-reversible.
  • $dx = -M \nabla L(x)\,dt + \sqrt{2M}\,dW$ is $\pi$-reversible.
  • Case $\frac{d\pi}{d\pi_0}(x) \propto e^{-\Phi(x)}$ with $\pi_0 = N(0, C)$.
  • Informally, $\pi(x) \propto e^{-\frac{1}{2}\langle x, C^{-1}x\rangle - \Phi(x)}$.
  • Because $\nabla\big(\tfrac{1}{2}\langle x, C^{-1}x\rangle + \Phi(x)\big) = C^{-1}x + \nabla\Phi(x)$, taking $M = C$ shows that
$$dx = -\big(x + C\nabla\Phi(x)\big)\,dt + \sqrt{2C}\,dW = \mathrm{drift}(x)\,dt + \sqrt{2C}\,dW$$
is $\pi$-reversible. Notice the diffusion term.

SLIDE 21

MALA algorithm

  • MALA for the target distribution $\pi^\alpha = \pi^N$:
$$x^\star = x + \mathrm{drift}(x)\,\delta(N) + \sqrt{2\,\delta(N)}\; P^N(\xi) \quad \text{with} \quad \xi \overset{D}{\sim} N(0, C).$$
  • Euler discretisation of the Langevin diffusion between $t$ and $t + \delta(N)$.

Diffusion Limit
Take $\delta(N) = N^{-p}$ for any $p \ge \tfrac{1}{3}$. The limit
$$\lim_{N \to \infty} z^N = z$$
exists and is a non-trivial ergodic diffusion process.
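A hedged sketch of one MALA step, assuming the Euler convention x⋆ = x + δ·drift(x) + √(2δ)·ξ with drift(x) = −(x + C∇Φ(x)) and ξ ~ N(0, C), which reduces to the Gaussian-case formula (1 − δ)x + √(2δ)ξ when Φ ≡ 0. States are coefficient vectors in the eigenbasis of C. The Metropolis-Hastings correction is the standard one for a Gaussian proposal and is not spelled out on the slide; all function names are hypothetical.

```python
import numpy as np

def drift(x, lam, grad_phi):
    """drift(x) = -(x + C grad Phi(x)), with C diagonal in the eigenbasis."""
    return -(x + lam * grad_phi(x))

def mala_step(x, delta, lam, phi, grad_phi, rng):
    """One MALA step: Euler proposal followed by a Metropolis-Hastings test."""
    def mean(u):
        return u + delta * drift(u, lam, grad_phi)

    def log_target(u):
        # log Lebesgue density of pi^N on R^N, up to a constant
        return -phi(u) - 0.5 * np.sum(u ** 2 / lam)

    def log_q(u, v):
        # log density at v of the proposal N(mean(u), 2 * delta * C), up to a constant
        return -np.sum((v - mean(u)) ** 2 / lam) / (4.0 * delta)

    xi = np.sqrt(lam) * rng.standard_normal(x.size)       # xi ~ N(0, C)
    proposal = mean(x) + np.sqrt(2.0 * delta) * xi
    log_ratio = (log_target(proposal) + log_q(proposal, x)
                 - log_target(x) - log_q(x, proposal))
    accept = np.log(rng.uniform()) < log_ratio
    return (proposal if accept else x), bool(accept)

rng = np.random.default_rng(2)
lam = 1.0 / np.arange(1, 11) ** 2                         # hypothetical spectrum
x0 = np.ones(10)
zero_grad = lambda u: np.zeros_like(u)                    # Phi = 0, Gaussian target
x1, ok = mala_step(x0, 0.1, lam, lambda u: 0.0, zero_grad, rng)
```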

SLIDE 22

Scaling Analysis of MALA

N. Pillai, A.M. Stuart and A.T., 2011

Theorem
Consider MALA with increment $\delta(N) \approx N^{-1/3}$.
  • The limit $z^N \Rightarrow z$ holds weakly in $C([0, T], H^s)$.
  • The limit process $z$ is an $H$-valued Langevin diffusion that is reversible with respect to $\pi$.
  • For $\delta(N) \propto N^{-1/3}$, the limiting acceptance probability satisfies $0 < p < 1$.
  • For $\delta(N) \propto N^{-(1/3+\varepsilon)}$, the limiting acceptance probability is $p = 1$.
  • For $\delta(N) \propto N^{-(1/3-\varepsilon)}$, the acceptance probability is exponentially small.
  • The complexity of MALA grows as $O(N^{1/3})$ as $N \to \infty$.

SLIDE 23

What is going wrong?

  • Consider RWM and MALA for the Gaussian target $\pi = \pi_0 = N(0, C)$ with $\Phi \equiv 0$:
      (RWM) $x^\star = x + \sqrt{\delta}\,\xi$ with $\xi \overset{D}{\sim} N(0, C)$.
      (MALA) $x^\star = (1 - \delta)\,x + \sqrt{2\delta}\,\xi$ with $\xi \overset{D}{\sim} N(0, C)$.
  • Consequently, if $x \overset{D}{\sim} \pi = N(0, C)$ we have
      (RWM) $x^\star \overset{D}{\sim} N(0, (1 + \delta)\,C)$.
      (MALA) $x^\star \overset{D}{\sim} N(0, (1 + \delta^2)\,C)$.
  • In the infinite dimensional setting, the Gaussian measures $N(0, C)$ and $N(0, (1 + \varepsilon)\,C)$ are mutually singular.
  • RWM and MALA are NOT well-defined on $H$.
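Both proposal covariances follow from a one-line computation, using only that x and ξ are independent with law N(0, C):

```latex
\begin{align*}
\text{(RWM)}\quad  \mathrm{Cov}(x^\star) &= C + \delta\, C = (1 + \delta)\, C, \\
\text{(MALA)}\quad \mathrm{Cov}(x^\star) &= (1 - \delta)^2\, C + 2\delta\, C
                                          = (1 - 2\delta + \delta^2 + 2\delta)\, C
                                          = (1 + \delta^2)\, C .
\end{align*}
```

In both cases the covariance is a multiple of C different from C itself for every δ > 0, which is what makes the proposed point singular with respect to the target in infinite dimension.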

SLIDE 24

A Robust Algorithm

  • Target density $\frac{d\pi}{d\pi_0}(x) \propto e^{-\Phi(x)}$.
  • The 'right' proposal (called pCN) should be
$$x^\star = \sqrt{1 - \delta}\; x + \sqrt{\delta}\; P^N(\xi).$$
  • It is well-defined on $H$ and preserves $\pi_0 = N(0, C)$.

Theorem (N. Pillai, A.M. Stuart and A.T. (2011))
Under growth conditions on the potential $\Phi$, the pCN algorithm is robust: for any fixed parameter $\delta > 0$, the average acceptance probability stays bounded away from 0.

  • RWM complexity grows as $O(N)$
  • MALA complexity grows as $O(N^{1/3})$
  • pCN complexity is $O(1)$
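A minimal pCN sketch under stated assumptions: states are coefficient vectors in the eigenbasis of C, the spectrum is hypothetical, and the acceptance probability min(1, exp(Φ(x) − Φ(x⋆))) is the usual rule for a prior-reversible proposal, which the slide does not spell out. The proposal preserves the prior because (1 − δ)C + δC = C.

```python
import numpy as np

def pcn_step(x, delta, lam, phi, rng):
    """One pCN step for d(pi)/d(pi_0) proportional to exp(-Phi), pi_0 = N(0, C).

    Requires 0 < delta <= 1. C is diagonal with entries lam in these coordinates.
    """
    xi = np.sqrt(lam) * rng.standard_normal(x.size)          # P^N(xi), xi ~ N(0, C)
    proposal = np.sqrt(1.0 - delta) * x + np.sqrt(delta) * xi
    # The proposal is reversible w.r.t. the prior, so only Phi enters the ratio.
    accept = np.log(rng.uniform()) < phi(x) - phi(proposal)
    return (proposal if accept else x), bool(accept)

rng = np.random.default_rng(4)
lam = 1.0 / np.arange(1, 201) ** 2                           # hypothetical spectrum
x = np.sqrt(lam) * rng.standard_normal(lam.size)             # start from the prior
flags = []
for _ in range(100):
    x, ok = pcn_step(x, delta=0.5, lam=lam, phi=lambda u: 0.0, rng=rng)
    flags.append(ok)
```

With Φ ≡ 0 every proposal is accepted whatever the dimension, which is the simplest instance of the dimension-free behaviour stated in the theorem.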

SLIDE 25

Optimal Proposal Design Principle

Designing proposals which are well-defined on the infinite dimensional parameter space results in MCMC methods which do not suffer from the curse of dimensionality.

SLIDE 26

Outline

1. The Scaling Analysis Method
2. High Dimensional MCMC
3. Concentration near a manifold

SLIDE 27

Low Dimensional Observation

$$y = F(x) + \text{(noise)}$$

  • unknown data $x \in \mathbb{R}^N$
  • noisy observation $y \in \mathbb{R}^d$
  • low dimensional measurement, $d \ll N$

SLIDE 28

y = F(x) + (noise)

  • In the small noise limit, the posterior distribution concentrates near the manifold $M = \{x : y = F(x)\}$.
  • In general $\dim M = N - d$.

SLIDE 29

Scaling Analysis Framework

  • Gaussian noise with variance $\varepsilon^2 \times \mathrm{Id}$
  • Prior measure $\pi_0(dx)$
  • Posterior measure $\pi^\alpha = \pi^\varepsilon$,
$$\frac{d\pi^\varepsilon}{d\pi_0}(x) \propto \exp\Big(-\frac{\|F(x) - y\|^2}{2\varepsilon^2}\Big).$$
  • MCMC exploration with Random Walk proposals $x^\star = x + \sqrt{\delta(\varepsilon)}\;\xi$
  • Behaviour of MCMC when $\varepsilon \to 0$?

SLIDE 30

Concentration

SLIDE 31

Limiting Manifold

SLIDE 32

Scaling Limit

Theorem (Work in progress)
Consider RWM with increment $\delta(\varepsilon) \approx \varepsilon^2$.
  • The limit $z^\varepsilon \Rightarrow z$ holds weakly in $C([0, T], \mathbb{R}^N)$.
  • The limit process $z$ is an $M$-valued Langevin diffusion.
  • Rigorously proved for a flat manifold $M$.
  • Complexity grows as $\varepsilon^{-2}$.
  • Proof: two-scales approach.
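A toy numerical illustration of why the step size has to shrink like ε², under assumptions that are not from the talk: prior N(0, I) on R², forward map F(x) = x[0], observation y = 0, so that the limiting manifold {x : x[0] = 0} is flat. The function name `rwm_acceptance` is hypothetical, and this is a sketch rather than the experiment behind the theorem.

```python
import numpy as np

def rwm_acceptance(eps, delta, n_steps, rng, dim=2):
    """Acceptance rate of RWM on the small-noise posterior of a toy problem.

    Toy setting (all assumptions): prior N(0, I), F(x) = x[0], y = 0, so that
    d(pi^eps)/d(pi_0) is proportional to exp(-x[0]^2 / (2 eps^2)).
    """
    def log_target(u):
        return -0.5 * u[0] ** 2 / eps ** 2 - 0.5 * np.sum(u ** 2)

    x = np.zeros(dim)                          # start on the manifold
    accepted = 0
    for _ in range(n_steps):
        proposal = x + np.sqrt(delta) * rng.standard_normal(dim)
        if np.log(rng.uniform()) < log_target(proposal) - log_target(x):
            x, accepted = proposal, accepted + 1
    return accepted / n_steps

rng = np.random.default_rng(3)
eps = 0.01
rate_scaled = rwm_acceptance(eps, delta=eps ** 2, n_steps=2000, rng=rng)
rate_fixed = rwm_acceptance(eps, delta=1.0, n_steps=2000, rng=rng)
```

With δ = ε² the steps match the width of the posterior across the manifold and a sizeable fraction of proposals is accepted. With a fixed δ almost every proposal lands far from the manifold and is rejected. The price of δ = ε² is that moving an O(1) distance along the manifold takes of order ε⁻² steps.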

SLIDE 33

Thank You!