Adaptive and Interacting Markov chain Monte Carlo
Gersende FORT
(PowerPoint PPT presentation)


SLIDE 1

Adaptive and Interacting Markov chain Monte Carlo

Gersende FORT
LTCI, CNRS & Telecom ParisTech, Paris, France

Talk based on joint works with Eric Moulines (Telecom ParisTech, France), Pierre Priouret (Univ. Paris 6, France), Pierre Vandekerkhove (Univ. Marne-la-Vallée, France), Amandine Schreck (Telecom ParisTech, France), Benjamin Jourdain (ENPC, France), Estelle Kuhn (INRA, France), Tony Lelièvre (ENPC, France) and Gabriel Stoltz (ENPC, France).

SLIDE 2

Introduction

Hastings-Metropolis algorithm (1/2)

Given:
  • a target density π on X ⊆ R^d (to simplify the talk),
  • a proposal transition kernel q(x, y),
define {X_k, k ≥ 0} iteratively as
  (i) draw Y ∼ q(X_k, ·);
  (ii) compute α(X_k, Y) = 1 ∧ [π(Y) q(Y, X_k)] / [π(X_k) q(X_k, Y)];
  (iii) set X_{k+1} = Y with prob. α(X_k, Y), and X_{k+1} = X_k with prob. 1 − α(X_k, Y).
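The three steps above fit in a few lines of code. A minimal sketch for a one-dimensional random-walk version, assuming a standard Gaussian target π and a Gaussian proposal (both illustrative choices, not from the slides); with a symmetric proposal the ratio q(Y, X_k)/q(X_k, Y) cancels in step (ii):

```python
import numpy as np

def pi(x):
    # Hypothetical target: unnormalized standard Gaussian density.
    return np.exp(-0.5 * x**2)

def metropolis_hastings(n_steps, x0=0.0, step=1.0, rng=None):
    """Random-walk Hastings-Metropolis: q(x, .) = N(x, step^2) is symmetric,
    so the proposal ratio cancels in the acceptance probability."""
    rng = rng or np.random.default_rng(0)
    chain = np.empty(n_steps + 1)
    chain[0] = x0
    for k in range(n_steps):
        y = chain[k] + step * rng.standard_normal()              # (i) draw Y ~ q(X_k, .)
        alpha = min(1.0, pi(y) / pi(chain[k]))                   # (ii) acceptance probability
        chain[k + 1] = y if rng.uniform() < alpha else chain[k]  # (iii) accept / reject
    return chain

samples = metropolis_hastings(20_000)
print(np.mean(samples[5_000:]), np.var(samples[5_000:]))  # roughly 0 and 1
```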

SLIDE 3

Introduction

Hastings-Metropolis algorithm (2/2)

Then (X_k)_{k≥0} is a Markov chain with transition kernel P:

  P(x, A) = ∫_A α(x, y) q(x, y) λ(dy) + 1_A(x) ∫ (1 − α(x, y)) q(x, y) λ(dy)

Under conditions on π and q:
  • Ergodic behavior: P^k(x, ·) →_d π
  • Explicit control of ergodicity: ‖P^k(x, ·) − π‖_TV ≤ B(x, k)
  • Law of Large Numbers: (1/n) Σ_{k=1}^n f(X_k) →_{a.s.} ∫ f π dλ
  • Central Limit Theorem: √n ( (1/n) Σ_{k=1}^n f(X_k) − ∫ f π dλ ) →_d N(0, σ_f²)

SLIDE 4

Introduction

Example: efficiency of a Gaussian Random Walk Hastings-Metropolis

When λ ≡ Lebesgue on R and q(x, ·) ≡ N(x, θ), efficiency is compared through the (estimated) lag-s autocovariance function

  γ_s = E[X_0 X_s] − (E[X_0])²   when X_0 ∼ π.

[Figure: for 3 different values of θ, [top] a path (X_k, k ≥ 1), [bottom] s ↦ γ(s)/γ(0)]

→ Online adaptation of the design parameter θ.
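The lag-s autocovariance used in this comparison is easy to estimate from a single path. A small helper, with hypothetical names (the slides do not specify an implementation):

```python
import numpy as np

def autocovariance(x, max_lag):
    """Estimate gamma_s = E[X_0 X_s] - (E[X_0])^2 from one path, for s = 0..max_lag."""
    x = np.asarray(x, dtype=float)
    xc = x - x.mean()
    n = len(x)
    return np.array([np.dot(xc[: n - s], xc[s:]) / n for s in range(max_lag + 1)])

# Sanity check: i.i.d. noise decorrelates immediately, so gamma_s / gamma_0
# should be near 0 for every s >= 1.
rng = np.random.default_rng(1)
g = autocovariance(rng.standard_normal(100_000), 5)
rho = g / g[0]
print(rho)
```

A slowly decaying ratio γ(s)/γ(0), as in the poorly tuned panels of the figure, signals strongly correlated samples and hence an inefficient choice of θ.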

SLIDE 5

Outline

  Introduction
  Examples of adaptive and interacting MCMC
    The Adaptive Metropolis sampler
    The Wang-Landau sampler
    The Equi-Energy sampler
  Convergence results
    Unfortunately ...
    Ergodic behavior
    Central Limit Theorems
  Conclusion
  Bibliography

SLIDE 6

Examples of adaptive and interacting MCMC: The Adaptive Metropolis sampler

Example 1: Adaptive Metropolis (1/2)

Proposed by Haario et al. (2001): learn on the fly the optimal covariance of the Gaussian proposal distribution.

Define a process {X_k, k ≥ 0} such that
  (i) update the chain: P(X_{k+1} ∈ A | F_k) ≡ one step of Gaussian HM, with covariance matrix θ_k;
  (ii) update the estimate of the covariance matrix: θ_{k+1} = function(k, θ_k, X_{k+1}).

SLIDE 7

Examples of adaptive and interacting MCMC: The Adaptive Metropolis sampler

Example 1: Adaptive Metropolis (2/2)

The general framework: let P_θ be a Gaussian Hastings-Metropolis kernel; θ is the covariance matrix of the Gaussian proposal distribution. For any θ: π P_θ = π.

The adaptive algorithm:
  (i) sample X_{k+1} | F_k ∼ P_{θ_k}(X_k, ·);
  (ii) update the parameter θ_{k+1} by using θ_k, X_{k+1}.

Here, θ is a covariance matrix.
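One common way to realize step (ii) is a Robbins-Monro recursion on the running mean and covariance, in the spirit of Haario et al. (2001). A sketch under assumed choices (the step size γ = 1/(k+1) and the function name are illustrative, not fixed by the talk):

```python
import numpy as np

def am_update(k, mean_k, cov_k, x_new):
    """One recursive update of the running mean and covariance theta_k used by
    Adaptive Metropolis; the recursion's limit is the covariance of the samples."""
    gamma = 1.0 / (k + 1)                 # diminishing adaptation step size
    d = x_new - mean_k
    mean_next = mean_k + gamma * d
    cov_next = cov_k + gamma * (np.outer(d, d) - cov_k)
    return mean_next, cov_next

# The Gaussian proposal at step k would then use a covariance such as
# (2.38**2 / dim) * cov_k + eps * I, the classical Adaptive Metropolis scaling.
```

Because γ → 0, the adaptation is diminishing, which is exactly the property used later in the ergodicity decomposition.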

SLIDE 8

Examples of adaptive and interacting MCMC: The Wang-Landau sampler

Example 2: Wang-Landau (1/4)

Proposed by Wang and Landau (2001) for sampling systems in molecular dynamics; many metastable states ↔ many local modes separated by deep valleys.

Idea: let X_1, ..., X_d be a partition of X. Set

  π_{θ⋆}(x) ∝ Σ_{i=1}^d [π(x) / θ⋆(i)] 1_{X_i}(x),   θ⋆(i) = π(X_i).

The idea is to obtain samples (approx.) under π_{θ⋆}. Then, by an importance ratio, these samples will approximate π. Roughly:

  (1/n) Σ_{k=1}^n δ_{X_k} ≈ π_{θ⋆}   ⟹   (1/n) Σ_{k=1}^n Σ_{i=1}^d θ⋆(i) 1_{X_k ∈ X_i} δ_{X_k} ≈ π (up to a normalizing constant).

WL is an algorithm which provides an estimation of θ⋆, and samples approx. distributed under π_{θ⋆}.
SLIDE 9

Examples of adaptive and interacting MCMC: The Wang-Landau sampler

Example 2: Wang-Landau (2/4)

Define {X_k, k ≥ 0} iteratively:
  (i) sample X_{k+1} | F_k ∼ MCMC sampler with target distribution π_{θ_k};
  (ii) update the parameter θ_{k+1} = function(k, θ_k, X_{k+1}).

The parameter {θ_k, k ≥ 0} is updated through a Stochastic Approximation procedure

  θ_{n+1} = θ_n + γ_{n+1} h(θ_n) + γ_{n+1} noise_{n+1}

with mean field h such that, if {θ_k, k ≥ 0} converges, its limiting value is θ⋆.
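A toy realization of this Stochastic Approximation update, under the simplifying (idealized) assumption that step (i) produces an exact draw from π_{θ_n}; the multiplicative form of the update, the step-size schedule and the numerical values are illustrative, not from the talk:

```python
import numpy as np

def wl_update(theta, stratum, gamma):
    """Increase the weight of the visited stratum, then renormalize so that theta
    stays a probability on {1, ..., d}. The mean field vanishes exactly when all
    strata are visited equally often, i.e. at theta = theta_star."""
    theta = theta * (1.0 + gamma * (np.arange(len(theta)) == stratum))
    return theta / theta.sum()

# Idealized demo: under pi_theta the chain visits stratum i with probability
# proportional to p[i] / theta[i], so theta should drift toward p = theta_star.
p = np.array([0.7, 0.2, 0.1])          # hypothetical stratum masses pi(X_i)
rng = np.random.default_rng(3)
theta = np.full(3, 1.0 / 3.0)
for n in range(100_000):
    q = p / theta
    visit = rng.choice(3, p=q / q.sum())
    theta = wl_update(theta, visit, gamma=1.0 / (n + 10) ** 0.6)
print(theta)
```

The renormalization keeps θ_n on the simplex; over-weighted strata are visited less often under π_{θ_n}, which is the self-correcting mechanism behind the recursion.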

SLIDE 10

Examples of adaptive and interacting MCMC: The Wang-Landau sampler

Example 2: Wang-Landau (3/4)

Figure: [left] level curves of π; [center] target density π; [right] partition of the state space.

Figure: [left] the sequences (θ_k(i))_k; [right] the limiting values θ⋆(i).

SLIDE 11

Examples of adaptive and interacting MCMC: The Wang-Landau sampler

Example 2: Wang-Landau (3/4)

Figure: [left] level curves of π; [center] target density π; [right] partition of the state space.

Figure (beta = 4): [left] Wang-Landau, T = 110 000; [right] Hastings-Metropolis, T = 2·10^6; the red line is at x = 110 000.

SLIDE 12

Examples of adaptive and interacting MCMC: The Wang-Landau sampler

Example 2: Wang-Landau (4/4)

The general framework: let π_θ be a distribution and let P_θ be an MCMC sampler with target distribution π_θ. For any θ: π_θ P_θ = π_θ.

The adaptive algorithm:
  (i) sample X_{k+1} | F_k ∼ P_{θ_k}(X_k, ·);
  (ii) update the parameter θ_{k+1} by using θ_k, X_{k+1}.

Here, θ = (θ(1), ..., θ(d)) is a probability on {1, ..., d}.

SLIDE 13

Examples of adaptive and interacting MCMC: The Equi-Energy sampler

Example 3: Equi-Energy (1/3)

Proposed by Kou et al. (2006) to sample a multimodal target density π. Based on an auxiliary process designed to admit π^{1/T} (T > 1) as target distribution.

[Figure: current state; target distribution; local move; tempered distribution; equi-energy jump; energy boundaries 1 and 2]

The transition kernel X_k → X_{k+1} is

  P_{θ_k}(X_k, ·) = (1 − ε) Q(X_k, ·) + ε Q̃_{θ_k}(X_k, ·)

where Q is an MCMC kernel with target π, and Q̃_{θ_k} is a kernel depending on the empirical distribution θ_k of the auxiliary process.
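A deliberately simplified sketch of this mixture kernel. It omits the energy-ring restriction and uses a plain Metropolis ratio for the jump, so the names and the acceptance rule here are assumptions, not the exact sampler of Kou et al.:

```python
import numpy as np

def ee_step(x, aux_samples, pi, local_move, rng, eps=0.1):
    """With prob. 1 - eps: a local MCMC move Q targeting pi.
    With prob. eps: an 'equi-energy style' jump toward a point drawn from the
    empirical distribution theta_k of the auxiliary (tempered) process."""
    if len(aux_samples) == 0 or rng.uniform() >= eps:
        return local_move(x)                              # local Hastings-Metropolis move
    y = aux_samples[rng.integers(len(aux_samples))]       # jump proposal from theta_k
    alpha = min(1.0, pi(y) / pi(x))                       # simplified acceptance; the true
    return y if rng.uniform() < alpha else x              # EE ratio also involves pi^{1/T}
```

In the real sampler the jump proposal is restricted to auxiliary points lying in the same energy ring as x, which is what lets the chain move between distant modes efficiently.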

SLIDE 14

Examples of adaptive and interacting MCMC: The Equi-Energy sampler

Example 3: Equi-Energy (2/3)

Target density: a mixture of 20 two-dimensional Gaussians, π ∝ Σ_{i=1}^{20} N_2(µ_i, Σ_i); 5 processes with target distributions π^{1/T_k} (T_1 = 1).

[Figure: draws and means of the components, for the target density at temperatures 1 to 5, and for a plain Hastings-Metropolis run]

SLIDE 15

Examples of adaptive and interacting MCMC: The Equi-Energy sampler

Example 3: Equi-Energy (3/3)

The general framework: let P_θ be the kernel associated to an EE-transition when the equi-energy jump uses a point sampled under the distribution θ. Under assumptions, for any θ: ∃ π_θ s.t. π_θ P_θ = π_θ.

The adaptive algorithm:
  (i) sample X_{k+1} | F_k ∼ P_{θ_k}(X_k, ·);
  (ii) update the distribution θ_{k+1} by using θ_k and (auxiliary process)_{k+1}.

Here, θ_k is an empirical distribution on X.

SLIDE 16

Outline: Convergence results

  Unfortunately ...
  Ergodic behavior
  Central Limit Theorems

SLIDE 17

Convergence results: Unfortunately ...

Unfortunately, adaption can destroy the convergence. Consider the following adapted Markov chain.

Let θ ∈ (0, 1). A Markov chain with transition matrix

  P_θ = [ 1 − θ    θ
            θ    1 − θ ]

converges to the stationary distribution π = (1/2, 1/2).

SLIDE 18

Convergence results: Unfortunately ...

(Same two-state setting.) Fix t_0, t_1 ∈ (0, 1). Define an adapted chain as follows:

  X_{k+1} | F_k ∼ P_{t_0}(X_k, ·) if X_k = 0,   P_{t_1}(X_k, ·) if X_k = 1,

i.e. X_{k+1} | F_k ∼ P_{θ_k}(X_k, ·) with θ_k = t_{X_k}.

SLIDE 19

Convergence results: Unfortunately ...

(Same setting, θ_k = t_{X_k}.) Then (X_k)_k is again a Markov chain, with transition matrix

  [ 1 − t_0    t_0
      t_1    1 − t_1 ]

but it converges to the distribution π̃ ∝ (t_1, t_0) ≠ π.
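The failure is easy to reproduce numerically. A small simulation of the adapted two-state chain (the values t_0 = 0.1, t_1 = 0.4 are arbitrary illustrative choices):

```python
import numpy as np

# From state 0 flip with prob. t0, from state 1 flip with prob. t1.
# The limiting law is proportional to (t1, t0), not the (1/2, 1/2)
# that every FIXED kernel P_theta would give.
t0, t1 = 0.1, 0.4
rng = np.random.default_rng(0)
x, counts = 0, np.zeros(2)
for _ in range(200_000):
    flip = rng.uniform() < (t0 if x == 0 else t1)
    x = 1 - x if flip else x
    counts[x] += 1
freq = counts / counts.sum()
print(freq)  # close to (t1, t0)/(t0 + t1) = (0.8, 0.2)
```

The chain spends most of its time in state 0 simply because it adapts to a "sticky" kernel there; no fixed P_θ behaves this way.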

SLIDE 20

Convergence results: Ergodic behavior

Ergodicity (1/2)

Roberts and Rosenthal (2007); F., Moulines and Priouret (2012)

  E[f(X_t)] − π_{θ⋆}(f) = E[ f(X_t) − E[f(X_t) | F_{t−ℓ}] ]
                        + E[ E[f(X_t) | F_{t−ℓ}] − P^ℓ_{θ_{t−ℓ}} f(X_{t−ℓ}) ]
                        + E[ P^ℓ_{θ_{t−ℓ}} f(X_{t−ℓ}) − π_{θ_{t−ℓ}}(f) ]
                        + E[ π_{θ_{t−ℓ}}(f) − π_{θ⋆}(f) ]

Convergence when:
  • the first term is null;
  • the second term is small when adaption is diminishing;
  • the third term is small when the transition kernels (P_θ, θ ∈ Θ) are ergodic (enough), at a rate which is uniform (enough) in θ (containment condition).

SLIDE 21

Convergence results: Ergodic behavior

Ergodicity (2/2)

The last term: E[ π_{θ_{t−ℓ}}(f) − π_{θ⋆}(f) ]

  1. Case 1: π_θ = π for any θ.
     (ex. Adaptive Metropolis)
  2. Case 2: explicit expression of π_θ.
     (ex. Wang-Landau; F., Jourdain, Kuhn, Lelièvre & Stoltz (2012))
  3. Case 3: NO expression of π_θ, BUT we have an expression of P_θ.
     → F., Moulines & Priouret (2012): check if π_θ inherits the smooth-in-θ conditions on the kernel P_θ.
     (ex. Equi-Energy sampler; F., Moulines & Priouret (2012) and Schreck, F. & Moulines (2013))
SLIDE 22

Convergence results: Central Limit Theorems

Central Limit Theorem (1/2)

  Σ_{k=1}^n ( f(X_k) − π_{θ⋆}(f) ) = Σ_{k=1}^n ( f(X_k) − π_{θ_{k−1}}(f) ) + Σ_{k=1}^n ( π_{θ_{k−1}}(f) − π_{θ⋆}(f) )

In the case of a (non-adaptive) Markov chain, i.e. P_θ = P, the variance in the CLT is given by

  σ²(f) = ∫ π(dx) (Λf)²(x) − ∫ π(dx) (PΛf)²(x)

where Λf is the solution to the Poisson equation f − π(f) = Λf − PΛf.

For adapted and interacting MCMC, is it true that

  σ²(f) = ∫ π_{θ⋆}(dx) (Λ_{θ⋆}f)²(x) − ∫ π_{θ⋆}(dx) (P_{θ⋆}Λ_{θ⋆}f)²(x) ?

→ Not always: adaption/interaction may introduce an additional term.
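Whether or not an extra term appears, the limiting variance σ²(f) can be estimated from a single run without solving the Poisson equation, e.g. by batch means. A sketch; the AR(1) check below is an illustration with a known answer (σ² = 1/(1 − a)² for X_{k+1} = aX_k + N(0, 1)), not an example from the talk:

```python
import numpy as np

def batch_means_variance(x, n_batches=100):
    """Batch-means estimate of the asymptotic variance sigma_f^2 in the CLT
    sqrt(n) * (mean(x) - pi(f)) -> N(0, sigma_f^2): split the path into batches
    and rescale the variance of the batch means by the batch length."""
    x = np.asarray(x, dtype=float)
    b = len(x) // n_batches
    means = x[: b * n_batches].reshape(n_batches, b).mean(axis=1)
    return b * means.var(ddof=1)

# AR(1) check: with a = 0.5 the true asymptotic variance is 1/(1 - 0.5)^2 = 4.
rng = np.random.default_rng(5)
a, n = 0.5, 200_000
x = np.empty(n)
x[0] = 0.0
for k in range(n - 1):
    x[k + 1] = a * x[k] + rng.standard_normal()
est = batch_means_variance(x)
print(est)
```

The batch length must be large compared to the mixing time of the chain, otherwise the estimator is biased downward.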

SLIDE 23

Convergence results: Central Limit Theorems

Central Limit Theorem (2/2)

Recall: X_{k+1} | F_k ∼ P_{θ_k}(X_k, ·), π_{θ_k} P_{θ_k} = π_{θ_k}, and

  Σ_{k=1}^n ( f(X_k) − π_{θ⋆}(f) ) = Σ_{k=1}^n ( f(X_k) − π_{θ_{k−1}}(f) ) + Σ_{k=1}^n ( π_{θ_{k−1}}(f) − π_{θ⋆}(f) )

General conditions are provided by F., Moulines, Priouret, Vandekerkhove (2012).

For Adaptive Metropolis: Saksman, Vihola (2010); F., Moulines, Priouret, Vandekerkhove (2012). Here π_θ = π for any θ.
  Step 1: show that lim_n θ_n = θ⋆ w.p. 1.
  Step 2: NO additional term:
    σ²(f) = ∫ π(dx) (Λ_{θ⋆}f)²(x) − ∫ π(dx) (P_{θ⋆}Λ_{θ⋆}f)²(x)

SLIDE 24

Convergence results: Central Limit Theorems

Central Limit Theorem (2/2)

Recall: X_{k+1} | F_k ∼ P_{θ_k}(X_k, ·), π_{θ_k} P_{θ_k} = π_{θ_k}, and

  Σ_{k=1}^n ( f(X_k) − π_{θ⋆}(f) ) = Σ_{k=1}^n ( f(X_k) − π_{θ_{k−1}}(f) ) + Σ_{k=1}^n ( π_{θ_{k−1}}(f) − π_{θ⋆}(f) )

General conditions are provided by F., Moulines, Priouret, Vandekerkhove (2012).

For Wang-Landau: F., Jourdain, Kuhn, Lelièvre & Stoltz (2012). Here π_θ P_θ = π_θ.
  Step 1: show that lim_n θ_n = θ⋆ w.p. 1.
  Step 2: NO additional term:
    σ²(f) = ∫ π_{θ⋆}(dx) (Λ_{θ⋆}f)²(x) − ∫ π_{θ⋆}(dx) (P_{θ⋆}Λ_{θ⋆}f)²(x)
  since the Stochastic Approximation update of θ_n implies that, rapidly enough,
    |π_{θ_n}(f) − π_{θ⋆}(f)| ≤ C ‖θ_n − θ⋆‖ → 0.

SLIDE 25

Convergence results: Central Limit Theorems

Central Limit Theorem (2/2)

Recall: X_{k+1} | F_k ∼ P_{θ_k}(X_k, ·), π_{θ_k} P_{θ_k} = π_{θ_k}, and

  Σ_{k=1}^n ( f(X_k) − π_{θ⋆}(f) ) = Σ_{k=1}^n ( f(X_k) − π_{θ_{k−1}}(f) ) + Σ_{k=1}^n ( π_{θ_{k−1}}(f) − π_{θ⋆}(f) )

General conditions are provided by F., Moulines, Priouret, Vandekerkhove (2012).

For Equi-Energy: F., Moulines, Priouret & Vandekerkhove (2012). Here π_θ P_θ = π_θ.
  Step 1: show that lim_n θ_n = θ⋆ in some sense (convergence of measures).
  Step 2: additional term:
    σ²(f) = ∫ π_{θ⋆}(dx) (Λ_{θ⋆}f)²(x) − ∫ π_{θ⋆}(dx) (P_{θ⋆}Λ_{θ⋆}f)²(x) + γ²(f)
  where γ²(f) collects the fluctuations of the auxiliary process (Y_j)_j:
    n^{−1/2} Σ_{j=1}^{⌊nt⌋} ( f(Y_j) − θ⋆(f) ) →_d γ(f) B_t,   (B_t) a standard Brownian motion.

SLIDE 26

Outline: Conclusion

SLIDE 27

Conclusion

There exist tools in the literature to prove the validity of adaptive and interacting MCMC: results on the asymptotic behavior of the algorithms.

Open questions: what about explicit rates of convergence, and explicit control of the errors after a fixed number of iterations? How to define a measure of efficiency?

SLIDE 28

Outline: Bibliography

SLIDE 29

Bibliography

Adaptive MCMC algorithms (surveys)
  Andrieu, C. and Robert, C. (2001). Controlled Markov chain Monte Carlo methods for optimal sampling. Tech. Rep. 125, Cahiers du Ceremade.
  Andrieu, C. and Thoms, J. (2008). A tutorial on adaptive MCMC. Statistics and Computing 18:343-373.
  Atchadé, Y., Fort, G., Moulines, E. and Priouret, P. (2011). Adaptive Markov chain Monte Carlo: Theory and Methods. In Bayesian Time Series Models, Cambridge Univ. Press, Chapter 2, 33-53.
  Atchadé, Y.F. and Rosenthal, J.S. (2005). On adaptive Markov chain Monte Carlo algorithms. Bernoulli 11:815-828.
  Roberts, G. and Rosenthal, J. (2009). Examples of adaptive MCMC. J. Comp. Graph. Stat. 18:349-367.
  Rosenthal, J.S. (2009). Optimal Proposal Distributions and Adaptive MCMC. In MCMC Handbook, Chapman & Hall/CRC Press.
SLIDE 30

Bibliography

Convergence of adaptive and interacting MCMC
  Roberts, G.O. and Rosenthal, J.S. (2007). Coupling and ergodicity of adaptive MCMC. J. Appl. Probab. 44:458-475.
  Fort, G., Moulines, E. and Priouret, P. (2012). Convergence of interacting MCMC: ergodicity and law of large numbers. Ann. Statist. 39:3262-3289.
  Fort, G., Moulines, E., Priouret, P. and Vandekerkhove, P. (2013). Convergence of interacting MCMC: Central Limit Theorem. Bernoulli.
  Latuszynski, K. and Rosenthal, J.S. (2013). The containment condition and AdapFail algorithms. Submitted.

Convergence of stochastic approximation schemes
  Benveniste, A., Métivier, M. and Priouret, P. (1987). Adaptive Algorithms and Stochastic Approximations. Springer-Verlag.
  Andrieu, C., Moulines, E. and Priouret, P. (2005). Stability of stochastic approximation under verifiable conditions. SIAM Journal on Control and Optimization 44:283-312.
  Fort, G. (2013). Central Limit Theorems for Stochastic Approximation with controlled Markov chain dynamics. Submitted.
SLIDE 31

Bibliography

Convergence of Adaptive Metropolis
  Saksman, E. and Vihola, M. (2010). On the ergodicity of the adaptive Metropolis algorithm on unbounded domains. Ann. Appl. Probab. 20:2178-2203.
  Fort, G., Moulines, E. and Priouret, P. (2012). Convergence of interacting MCMC: ergodicity and law of large numbers. Ann. Statist. 39:3262-3289.
  Fort, G., Moulines, E., Priouret, P. and Vandekerkhove, P. (2013). Convergence of interacting MCMC: Central Limit Theorem. Bernoulli.

Convergence of the Equi-Energy sampler
  Hua, X. and Kou, S.C. (2011). Convergence of the Equi-Energy Sampler and Its Application to the Ising Model. Stat. Sin. 24:1687-1711.
  Fort, G., Moulines, E. and Priouret, P. (2012). Convergence of interacting MCMC: ergodicity and law of large numbers. Ann. Statist. 39:3262-3289.
  Fort, G., Moulines, E., Priouret, P. and Vandekerkhove, P. (2013). Convergence of interacting MCMC: Central Limit Theorem. Bernoulli.
  Schreck, A., Fort, G. and Moulines, E. (2012). Adaptive Equi-energy sampler: convergence and illustration. Accepted in Transactions on Modeling and Computer Simulation.

SLIDE 32

Bibliography

Methodology and convergence analysis of Wang-Landau
  Liang, F. (2005). A general Wang-Landau algorithm for Monte Carlo computation. J. Am. Stat. Assoc. 100:1311-1327.
  Liang, F., Liu, C. and Carroll, R.J. (2007). Stochastic approximation in Monte Carlo computation. J. Am. Stat. Assoc. 102:305-320.
  Atchadé, Y. and Liu, J.S. (2010). The Wang-Landau algorithm for Monte Carlo computation in general state spaces. Stat. Sinica 20(1):209-233.
  Fort, G., Jourdain, B., Kuhn, E., Lelièvre, T. and Stoltz, G. (2013). Convergence of the Wang-Landau algorithm. In revision.
    • Application of Wang-Landau to statistics; convergence results (on the samples (X_t)_t) under the assumption that the algorithm is "stable".
  Fort, G., Jourdain, B., Kuhn, E., Lelièvre, T. and Stoltz, G. (2013). Efficiency of the Wang-Landau algorithm. In revision.
    • Sufficient conditions for (i) the stability, the convergence and the rate of convergence of the sequence of weights {θ_n, n ≥ 0}; (ii) the convergence of the sampling method.

Discussion on the efficiency of the Wang-Landau algorithm
  Bornn, L., Jacob, P., Del Moral, P. and Doucet, A. (2013). An Adaptive Wang-Landau Algorithm for Automatic Density Exploration. To appear in Journal of Computational and Graphical Statistics.
    • New methods for (i) an adaptive binning strategy to automate the difficult task of partitioning the state space, (ii) the use of interacting parallel chains to improve the convergence speed and the use of computational resources, and (iii) the use of adaptive proposal distributions.
  Jacob, P. and Ryder, R. (2013). The Wang-Landau algorithm reaches the flat histogram criterion in finite time. To appear in Ann. Appl. Probab.
    • The linearized version of the update of the weight vector θ_t satisfies in finite time the uniformity criterion required in the original Wang-Landau algorithm. This is not guaranteed for some non-linear updates.