Stochastic approximation for adaptive Markov chain Monte Carlo algorithms




  1. Stochastic approximation for adaptive Markov chain Monte Carlo algorithms
     Gersende FORT, LTCI / CNRS - TELECOM ParisTech, France

  2. I. Examples of adaptive and interacting MCMC samplers
     1. Adaptive Hastings-Metropolis algorithm [Haario et al. 1999]
     2. Equi-Energy algorithm [Kou et al. 2006]
     3. Wang-Landau algorithm [Wang & Landau, 2001]

  3. Adaptive Hastings-Metropolis algorithm
     ◮ Symmetric Random Walk Hastings-Metropolis algorithm
     Goal: sample a Markov chain with stationary distribution π on R^d, where π is known up to a normalizing constant.
     Iterative mechanism: given the current sample X_n, propose a move to X_n + Y, with Y drawn from a symmetric proposal density q; accept the move with probability
        α(X_n, X_n + Y) = 1 ∧ π(X_n + Y) / π(X_n)
     and set X_{n+1} = X_n + Y; otherwise, X_{n+1} = X_n.

  4. Adaptive Hastings-Metropolis algorithm (cont.)
     Design parameter: how to choose the proposal distribution q? For example, when q(· − x) = N_d(x; θ), how to scale the proposal, i.e. how to choose the covariance matrix θ?
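The symmetric random-walk step above can be sketched in a few lines of Python. This is a minimal illustration, not the presenter's code; `log_pi` is an assumed unnormalized log-density, and the Gaussian proposal covariance plays the role of θ:

```python
import numpy as np

def rwmh(log_pi, x0, n_steps, cov, seed=None):
    """Symmetric random-walk Hastings-Metropolis sampler (sketch).

    log_pi : unnormalized log-density of the target pi on R^d
    cov    : covariance theta of the Gaussian increment proposal
    """
    rng = np.random.default_rng(seed)
    d = len(x0)
    chain = np.empty((n_steps + 1, d))
    chain[0] = x0
    chol = np.linalg.cholesky(cov)                      # to draw Y ~ N_d(0, theta)
    for n in range(n_steps):
        y = chain[n] + chol @ rng.standard_normal(d)    # propose X_n + Y
        # accept with probability 1 ∧ pi(X_n + Y) / pi(X_n)
        if np.log(rng.uniform()) < log_pi(y) - log_pi(chain[n]):
            chain[n + 1] = y
        else:
            chain[n + 1] = chain[n]                     # reject: stay at X_n
    return chain
```

Working in log space keeps the ratio numerically stable, and the unknown normalizing constant of π cancels in the acceptance ratio.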

  5. [Figure: trace plots of the chain for three choices of the proposal variance, illustrating the "goldilock principle": too small, too large, better variance.]

  6. ◮ Adaptive Hastings-Metropolis algorithm(s)
     Based on theoretical results [Gelman et al. 1996; ...], when the proposal is Gaussian N_d(x, θ), choose θ as the covariance structure of π [Haario et al. 1999]: θ ∝ Σ_π. In practice Σ_π is unknown, and this quantity is computed "online" from the past samples of the chain:
        θ_{n+1} = (n / (n+1)) θ_n + (1 / (n+1)) { (X_{n+1} − μ_{n+1})(X_{n+1} − μ_{n+1})^T + κ Id_d }
     where μ_{n+1} is the empirical mean and κ > 0 prevents a badly scaled matrix.

  7. ◮ Adaptive Hastings-Metropolis algorithm(s) (cont.)
     OR choose θ such that the mean acceptance rate converges to α⋆ [Andrieu & Robert 2001]. In practice this θ is unknown, and the parameter is adapted during the run of the algorithm: θ_n = τ_n Id with
        log τ_{n+1} = log τ_n + γ_{n+1} (α_{n+1} − α⋆)
     where α_n is the mean acceptance rate. OR ...
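Both adaptation rules reduce to one-line online updates. A hedged sketch (the function names are mine; κ, γ_{n+1} and α⋆ are the tuning constants from the slide, with α⋆ = 0.234 the usual random-walk target):

```python
import numpy as np

def update_haario(theta, mu, x_new, n, kappa=1e-6):
    """One online update of the proposal covariance, Haario et al. (1999) style.

    theta, mu : current covariance estimate theta_n and empirical mean mu_n
    x_new     : the new sample X_{n+1}; n is the number of samples seen so far
    kappa     : small ridge, prevents a badly scaled matrix
    """
    mu_new = mu + (x_new - mu) / (n + 1)                 # updated empirical mean
    outer = np.outer(x_new - mu_new, x_new - mu_new)
    theta_new = (n / (n + 1)) * theta \
        + (outer + kappa * np.eye(len(mu))) / (n + 1)
    return theta_new, mu_new

def update_scale(log_tau, alpha_new, gamma, alpha_star=0.234):
    """Robbins-Monro update driving the mean acceptance rate to alpha_star."""
    return log_tau + gamma * (alpha_new - alpha_star)
```

Adapting log τ rather than τ keeps the scale positive without any projection step.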

  8. ◮ In practice, simultaneous adaptation of the design parameter and simulation. Given the current value of the chain X_n and the design parameter θ_n:
     Draw the next sample X_{n+1} from the transition kernel P_{θ_n}(X_n, ·).
     Update the design parameter: θ_{n+1} = Ξ_{n+1}(θ_n, X_{n+1}, ·).

  9. ◮ In this MCMC context, we are interested in the behavior of the chain {X_n, n ≥ 0}, e.g.:
     Convergence of the marginals: E[f(X_n)] → π(f) for f bounded.
     Law of large numbers: n^{−1} Σ_{k=1}^n f(X_k) → π(f) (a.s. or in P).
     Central limit theorem.
     But we have π P_θ = π for any θ: all the transition kernels have the same invariant distribution π, so stability / convergence of the adaptation process {θ_n, n ≥ 0} is not the main issue.
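The two-step scheme above fits in one generic loop; a sketch with hypothetical callables `kernel` (one draw from P_{θ_n}(X_n, ·)) and `update` (the map Ξ_{n+1}):

```python
def adaptive_mcmc(kernel, update, x0, theta0, n_steps, rng=None):
    """Generic simultaneous adaptation scheme (sketch).

    kernel(x, theta, rng)  -> one draw from P_theta(x, .)
    update(theta, x_new, n) -> theta_{n+1} = Xi_{n+1}(theta_n, X_{n+1}, ...)
    """
    x, theta = x0, theta0
    chain = [x]
    for n in range(n_steps):
        x = kernel(x, theta, rng)      # X_{n+1} ~ P_{theta_n}(X_n, .)
        theta = update(theta, x, n)    # adapt the design parameter
        chain.append(x)
    return chain, theta
```

Any of the adaptations above (covariance estimate, scale, or an occupation measure) plugs in as `update` without changing the loop.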

  10. Equi-Energy sampler
      ◮ Proposed by Kou et al. (2006) for the simulation of a multi-modal density π. How to define a sampler that allows both local moves, for a local exploration of the density, and large jumps, in order to visit other modes of the target?

  11. Equi-Energy sampler (cont.)
      ◮ Idea: (a) build an auxiliary process that moves between the modes far more easily, and (b) define the process of interest by running a "classical" MCMC algorithm and, sometimes, choosing a value of the auxiliary process as the new value of the process of interest: draw a point at random + acceptance-rejection mechanism.

  12. Equi-Energy sampler (cont.)
      How to define such an auxiliary process? Answer: as a process with stationary distribution π^β (β ∈ (0, 1)), a tempered version of the target π.

  13. ◮ An example: a K-stage Equi-Energy sampler.
      Target density: a mixture of 20 two-dimensional Gaussians, π = Σ_{i=1}^{20} N_2(μ_i, Σ_i).
      K auxiliary processes, with targets π^{1/T_i} where T_1 > T_2 > ... > T_{K+1} = 1.
      [Figure: draws and component means for the target density at temperatures 1 to 5, and for a plain Hastings-Metropolis sampler.]

  14. ◮ Algorithm (2 stages). Repeat:
      Update the adaptation process
         θ_n = (1/n) Σ_{k=0}^{n−1} δ_{Y_k}
      where {Y_n, n ≥ 0} is the auxiliary process with stationary distribution π^β.
      Update the process of interest with the transition X_{n+1} ∼ P_{θ_n}(X_n, ·), where
         P_{θ_n}(x, A) = (1 − ε) P(x, A) + ε { ∫_A α(x, y) θ_n(dy) + δ_x(A) ∫ (1 − α(x, y)) θ_n(dy) }
      and the term in braces is the accept/reject mechanism.
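One transition of P_{θ_n} can be sketched as below. The slide leaves the acceptance probability α(x, y) generic; this sketch uses the equi-energy ratio (π(y)/π(x))^{1−β} from Kou et al. (2006), `local_kernel` stands for the π-invariant kernel P, and all names are mine:

```python
import numpy as np

def ee_step(x, log_pi, aux_samples, eps, beta, local_kernel, rng):
    """One draw from P_theta_n(x, .) (sketch).

    With prob. 1 - eps: a local move via the pi-invariant kernel P.
    With prob. eps: draw y uniformly from theta_n, the occupation measure of
    the auxiliary process {Y_k} targeting pi^beta, then accept or reject it.
    """
    if rng.uniform() > eps or not aux_samples:
        return local_kernel(x, rng)
    y = aux_samples[rng.integers(len(aux_samples))]   # y ~ theta_n(dy)
    # equi-energy acceptance ratio (pi(y)/pi(x))^(1-beta), in log space
    log_ratio = (1.0 - beta) * (log_pi(y) - log_pi(x))
    return y if np.log(rng.uniform()) < log_ratio else x
```

The empirical measure θ_n = n^{−1} Σ_k δ_{Y_k} is represented simply by the list `aux_samples` of past auxiliary draws.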
