SLIDE 1

Stochastic approximation for adaptive Markov chain Monte Carlo algorithms

Gersende FORT

LTCI / CNRS - TELECOM ParisTech, France

SLIDE 2

Examples of adaptive MCMC samplers

  • I. Examples of adaptive and interacting MCMC samplers
  • 1. Adaptive Hastings-Metropolis algorithm [Haario et al. 1999]
  • 2. Equi-Energy algorithm [Kou et al. 2006]
  • 3. Wang-Landau algorithm [Wang & Landau, 2001]
SLIDE 3

Adaptive Hastings-Metropolis algorithm

◮ Symmetric Random Walk Hastings-Metropolis algorithm

Goal: sample a Markov chain with stationary distribution π on Rd, where π is known up to a normalizing constant.

Iterative mechanism: given the current sample Xn,

  • propose a move to Xn + Y, with Y ∼ q(·);
  • accept the move with probability α(Xn, Xn + Y) = 1 ∧ π(Xn + Y)/π(Xn) and set Xn+1 = Xn + Y; otherwise, Xn+1 = Xn.
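The mechanism above can be sketched in a few lines (a minimal sketch: the Gaussian increment distribution, the scale `sigma` and the standard-Gaussian example target are illustrative assumptions, not prescribed by the slide):

```python
import numpy as np

def rwm(log_pi, x0, n_iter, sigma, rng):
    """Symmetric random-walk Hastings-Metropolis sampler.

    log_pi: unnormalized log-density of the target pi (the normalizing
    constant cancels in the acceptance ratio).
    """
    d = len(x0)
    chain = np.empty((n_iter + 1, d))
    chain[0] = x0
    n_accept = 0
    for n in range(n_iter):
        x = chain[n]
        y = x + sigma * rng.standard_normal(d)   # propose X_n + Y, Y ~ N(0, sigma^2 I)
        # accept with probability 1 ^ pi(y)/pi(x), computed on the log scale
        if np.log(rng.uniform()) < log_pi(y) - log_pi(x):
            chain[n + 1] = y
            n_accept += 1
        else:
            chain[n + 1] = x
    return chain, n_accept / n_iter

# illustrative target: unnormalized standard Gaussian on R^2
rng = np.random.default_rng(0)
chain, acc = rwm(lambda x: -0.5 * x @ x, np.zeros(2), 20000, sigma=2.4, rng=rng)
```

The choice of `sigma` is exactly the design question the next slides address.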

SLIDE 4

Adaptive Hastings-Metropolis algorithm


Design parameter: how to choose the proposal distribution q? For example, when q(· − x) = Nd(x, θ), how should the proposal be scaled, i.e. how should the covariance matrix θ be chosen?

SLIDE 5

Adaptive Hastings-Metropolis algorithm

The “Goldilocks principle”: a proposal variance that is too small or too large yields poor mixing; a well-chosen variance does better.

[Figure: for three choices of the proposal variance (too small, too large, well scaled), trace plots of the chain and the corresponding autocorrelation diagnostics.]

SLIDE 6

Adaptive Hastings-Metropolis algorithm

◮ Adaptive Hastings-Metropolis algorithm(s)

Based on theoretical results [Gelman et al. 1996; · · · ], when the proposal is Gaussian Nd(x, θ), choose θ as the covariance structure of π [Haario et al. 1999]: θ ∝ Σπ. In practice, Σπ is unknown and this quantity is computed “online” from the past samples of the chain:

θn+1 = n/(n+1) θn + 1/(n+1) ( (Xn+1 − µn+1)(Xn+1 − µn+1)T + κ Idd )

where µn+1 is the empirical mean and κ > 0 prevents a badly scaled matrix.
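This online recursion can be sketched as follows (the function name and the running-mean update are mine; the κ Idd regularization term is from the slide):

```python
import numpy as np

def am_update(theta_n, mu_n, x_new, n, kappa=1e-6):
    """One step of the online recursion
    theta_{n+1} = n/(n+1) * theta_n + 1/(n+1) * (dev dev^T + kappa * I),
    where dev = x_new - mu_{n+1} and mu is the running empirical mean."""
    mu_next = mu_n + (x_new - mu_n) / (n + 1)    # empirical mean update
    dev = x_new - mu_next
    d = len(x_new)
    theta_next = (n * theta_n + np.outer(dev, dev) + kappa * np.eye(d)) / (n + 1)
    return theta_next, mu_next
```

Unrolling the recursion shows that θn is (up to the κ term) the running average of the outer products of the centered samples, so it tracks Σπ.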

SLIDE 7

Adaptive Hastings-Metropolis algorithm


OR choose θ such that the mean acceptance rate converges to α⋆ [Andrieu & Robert 2001]. In practice this θ is unknown, and the parameter is adapted during the run of the algorithm: θn = τn Id with

log τn+1 = log τn + γn+1 (αn+1 − α⋆)

where αn is the mean acceptance rate. OR · · ·
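This log-scale stochastic approximation update can be sketched as follows (hedged: γn+1 = 1/(n+1) is one common step-size choice, the per-iteration acceptance indicator stands in for the acceptance rate since its conditional mean is that rate, and α⋆ = 0.234 is only an illustrative target value):

```python
import math

def tune_log_scale(log_tau, accepted, n, alpha_star=0.234):
    """log tau_{n+1} = log tau_n + gamma_{n+1} (alpha_{n+1} - alpha_star),
    with the acceptance indicator standing in for the acceptance rate."""
    gamma = 1.0 / (n + 1)
    return log_tau + gamma * (float(accepted) - alpha_star)
```

Driving this recursion with any acceptance mechanism whose rate decreases in τ pushes τn toward the scale whose acceptance rate is α⋆.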

SLIDE 8

Adaptive Hastings-Metropolis algorithm

◮ In practice, simultaneous adaptation of the design parameter and simulation. Given the current value of the chain Xn and the design parameter θn:

  • draw the next sample Xn+1 from the transition kernel Pθn(Xn, ·);
  • update the design parameter: θn+1 = Ξn+1(θn, Xn+1, ·).

SLIDE 9

Adaptive Hastings-Metropolis algorithm

◮ In this MCMC context, we are interested in the behavior of the chain {Xn, n ≥ 0}, e.g.:

  • convergence of the marginals: E[f(Xn)] → π(f) for f bounded;
  • law of large numbers: n−1 Σ_{k=1}^n f(Xk) → π(f) (a.s. or in probability);
  • central limit theorem.

But here πPθ = π for any θ: all the transition kernels have the same invariant distribution π, so stability / convergence of the adaptation process {θn, n ≥ 0} is not the main issue.

SLIDE 10

Equi-Energy sampler

◮ Proposed by Kou et al. (2006) for the simulation of a multi-modal density π. How can one define a sampler that allows both local moves, for a local exploration of the density, and large jumps, in order to visit other modes of the target?

SLIDE 11

Equi-Energy sampler

◮ Idea: (a) build an auxiliary process that moves between the modes far more easily, and (b) define the process of interest by running a “classical” MCMC algorithm and, sometimes, choosing a value of the auxiliary process as the new value of the process of interest: draw a point at random + acceptance-rejection mechanism.

SLIDE 12

Equi-Energy sampler

How to define such an auxiliary process? Answer: as a process with stationary distribution πβ (β ∈ (0, 1)), a tempered version of the target π.

SLIDE 13

Equi-Energy sampler

◮ An example: a K-stage Equi-Energy sampler.

[Figure: target density (a mixture of two-dimensional Gaussians), with draws and the means of the components.]

Target density: π ∝ Σ_{i=1}^{20} N2(µi, Σi).

K auxiliary processes, with targets π^{1/Ti}, where T1 > T2 > · · · > TK+1 = 1.

[Figure: draws and component means for the target density at temperatures 1 through 5, and for the plain Hastings-Metropolis sampler.]

SLIDE 14

Equi-Energy sampler

◮ Algorithm (2 stages). Repeat:

  • Update the adaptation process

θn = (1/n) Σ_{k=0}^{n−1} δ_{Yk}

where {Yn, n ≥ 0} is the auxiliary process with stationary distribution πβ.

  • Update the process of interest with transition Xn+1 ∼ Pθn(Xn, ·), where

Pθn(x, A) = (1 − ǫ) P(x, A) + ǫ { ∫_A α(x, y) θn(dy) + δx(A) ∫ (1 − α(x, y)) θn(dy) }

and α(x, y) is the accept/reject mechanism.
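One transition of this kernel can be sketched as follows (the empirical measure θn is stored as the list of past auxiliary draws, so sampling from θn means picking one of them uniformly; the specific ratio α(x, y) = 1 ∧ (π(y)/π(x))^{1−β} is the standard equi-energy choice and is my assumption, since the slide leaves α implicit):

```python
import numpy as np

def ee_step(x, aux_samples, log_pi, beta, eps, local_step, rng):
    """One transition of the 2-stage sampler: with probability 1-eps take a
    local MH move P(x, .); with probability eps propose a draw from theta_n
    (the empirical measure of the auxiliary chain) and accept/reject it."""
    if rng.uniform() > eps or not aux_samples:
        return local_step(x)                              # local kernel P(x, .)
    y = aux_samples[rng.integers(len(aux_samples))]       # draw from theta_n
    # assumed equi-energy ratio: 1 ^ (pi(y)/pi(x))**(1 - beta), on the log scale
    log_alpha = (1.0 - beta) * (log_pi(y) - log_pi(x))
    return y if np.log(rng.uniform()) < log_alpha else x
```

Running the auxiliary chain alongside and appending its draws to `aux_samples` reproduces the recursion θn = (1/n) Σ δ_{Yk}.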

SLIDE 15

Equi-Energy sampler


◮ In this example, πPθ ≠ π BUT πP_{πβ} = π, i.e. asymptotically, when θn “is” πβ, the process of interest {Xn, n ≥ 0} behaves like a Markov chain with invariant distribution π.

SLIDE 16

Equi-Energy sampler


◮ In this example, πPθ ≠ π BUT πP_{πβ} = π, i.e. asymptotically, when θn “is” πβ, the process of interest {Xn, n ≥ 0} behaves like a Markov chain with invariant distribution π. In this MCMC context, we are again interested in the behavior of {Xn, n ≥ 0}, but convergence of θn is crucial, since the algorithm is designed to “sample from π” only when θn = πβ.

SLIDE 17

Wang-Landau algorithm

◮ Proposed by Wang & Landau (2001) to favor the moves between elements of a partition of the state space, when the weights of these elements are unknown.

Goal: sample a chain on ⋃_{i=1}^d (Xi × {i}) with stationary distribution

Π(Ai × {i}) = (1/d) ∫_{Ai} (π(x) / θ⋆(i)) 1_{Xi}(x) dx,

when θ⋆ is unknown, and/or estimate the normalizing constants θ⋆(i).

SLIDE 18

Wang-Landau algorithm


Tool: a family of transition kernels Pθ on ⋃_{i=1}^d (Xi × {i}), where θ = (θ(1), · · · , θ(d)) is a probability on {1, · · · , d}, with invariant distribution known up to a normalizing constant:

Πθ(Ai × {i}) = ( Σ_{j=1}^d θ⋆(j)/θ(j) )^{−1} ∫_{Ai} (π(x) / θ(i)) 1_{Xi}(x) dx.

SLIDE 19

Wang-Landau algorithm

◮ Algorithm: repeat

  • Draw (Xn+1, In+1) ∼ Pθn((Xn, In), ·).
  • Update the adaptation process: θn+1(i) ∝ θn(i) + γn+1 θn(i) 1_{In+1}(i).
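The proportional weight update can be sketched as follows (hypothetical function name; the explicit renormalization is what the proportionality sign leaves implicit):

```python
import numpy as np

def wl_update(theta, i_new, gamma):
    """theta_{n+1}(i) proportional to theta_n(i) + gamma * theta_n(i) * 1{I_{n+1} = i}:
    multiply the visited stratum's weight by (1 + gamma), then renormalize
    so that theta stays a probability on {1, ..., d}."""
    theta = theta.copy()
    theta[i_new] *= 1.0 + gamma
    return theta / theta.sum()

# one update from the uniform weights, after visiting stratum 2
theta = wl_update(np.full(4, 0.25), i_new=2, gamma=0.5)
```

The visited stratum is up-weighted, which (through Pθ) makes it less attractive at the next step; this is the mechanism behind the mean-field computation on the next slide.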

SLIDE 20

Wang-Landau algorithm

◮ In this MCMC context, we are also interested in the convergence of the sequence {θn, n ≥ 0}: to first order,

θn+1(i) ≈ θn(i) + γn+1 θn(i) ( 1_{In+1}(i) − θn(In+1) )

and, when (Xn, In) ∼ Πθn,

E[ θn(i) ( 1_{In+1}(i) − θn(In+1) ) | Fn ] = ( Σ_{j=1}^d θ⋆(j)/θn(j) )^{−1} (θ⋆(i) − θn(i)),

i.e. {θn, n ≥ 0} should converge to θ⋆!

SLIDE 21

Conclusion (I)

In adaptive MCMC, given a family of transition kernels {Pθ, θ ∈ Θ}, ergodic with invariant distribution πθ, we define a bivariate process {(Xn, θn), n ≥ 0} such that

  • P(Xn+1 ∈ · | Fn) = Pθn(Xn, ·);
  • θn is updated so that it should converge to θ⋆.

Two cases: πθ = π for any θ, OR πθ⋆ = π.

What kind of conditions on the adaptation mechanism ensure that the process {Xn, n ≥ 0} converges to the target distribution π? In the sequel, “convergence” means “convergence of the marginals”: E[f(Xn)] → π(f) for f bounded.

SLIDE 22

Conclusion (II)

Three examples illustrating different situations:

1. Adaptive Hastings-Metropolis: all the kernels Pθ have the same invariant measure π.

2. Equi-Energy sampler: each kernel Pθ has its own invariant measure πθ. We know that πθ exists, but we have no explicit expression for it (regularity in θ · · · ).

3. Wang-Landau: each kernel Pθ has its own invariant measure πθ, and we have an expression for πθ (as a function of θ).

SLIDE 23

  • II. Convergence of adaptive / interacting MCMC samplers

(Joint work with E. Moulines (Telecom ParisTech, France) and P. Priouret (Paris VI, France))

SLIDE 24

Adaptation can destroy convergence

◮ Consider a family of transition kernels on {0, 1}:

Pθ = [[1−θ, θ], [θ, 1−θ]],  θ ∈ (0, 1).

Then, for any θ ∈ (0, 1), πPθ = π with π = (1/2, 1/2).

◮ Choose t0, t1 ∈ (0, 1). Define the adaptive process:

Xn+1 ∼ Pθn(Xn, ·),  θn+1 = t_{Xn+1}.

Then {Xn, n ≥ 0} is a Markov chain with transition kernel [[1−t0, t0], [t1, 1−t1]], whose invariant distribution is proportional to (t1, t0): convergence to π fails whenever t0 ≠ t1.
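This counterexample is easy to check numerically (a sketch; the values t0 = 0.1, t1 = 0.4 are illustrative, giving occupation frequencies proportional to (0.4, 0.1), i.e. (0.8, 0.2) rather than (1/2, 1/2)):

```python
import numpy as np

def naive_adaptive_chain(t0, t1, n_iter, rng):
    """Adaptive two-state chain with theta_{n+1} = t_{X_{n+1}}: the pair is an
    ordinary Markov chain with kernel [[1-t0, t0], [t1, 1-t1]], so the
    occupation frequencies converge to (t1, t0)/(t0 + t1), not (1/2, 1/2)."""
    x, theta = 0, t0
    counts = np.zeros(2)
    for _ in range(n_iter):
        if rng.uniform() < theta:   # P_theta flips the state with probability theta
            x = 1 - x
        theta = (t0, t1)[x]         # adapt: theta_{n+1} = t_{X_{n+1}}
        counts[x] += 1
    return counts / n_iter

rng = np.random.default_rng(3)
freq = naive_adaptive_chain(0.1, 0.4, 200000, rng)
```

Each Pθ individually preserves (1/2, 1/2); it is the adaptation itself that shifts the invariant law.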

SLIDE 25

Conditions for convergence of the marginals

We write

E[f(Xn)] − π(f) = E[ f(Xn) − P^N_{θn−N} f(Xn−N) ]
  + E[ P^N_{θn−N} f(Xn−N) − πθn−N(f) ]
  + E[ πθn−N(f) ] − π(f),

whence three groups of conditions:

1. Term 1: due to the adaptation; comparison of the adapted process with a “frozen” chain (which is no longer adapted).

2. Term 2: ergodicity of the transition kernels Pθ.

3. Term 3: only when πθ ≠ π; this is the most delicate term · · · especially when the expression of πθ is not known.

SLIDE 26

Conditions for convergence of the marginals

E[f(Xn)] − π(f) = E[ f(Xn) − P^N_{θn−N} f(Xn−N) ] + E[ P^N_{θn−N} f(Xn−N) − πθn−N(f) ] + E[ πθn−N(f) ] − π⋆(f)

◮ [Term 3] When πθ ≠ π⋆: conditions for limn πθn(f) = π⋆(f). Since

πθ⋆+∆(f) − πθ⋆(f) = πθ⋆ (Pθ⋆+∆ − Pθ⋆) (I − Pθ⋆)^{−1} (I − πθ⋆)(f) + “remainder”,

the convergence of {πθn(f), n ≥ 0} to πθ⋆(f) is a consequence of the convergence of the kernels Pθn to Pθ⋆.

SLIDE 27

Conditions for convergence of the marginals

Favorable case: convergence in “operator norm”. Otherwise, finer results are needed. For example, if we only know that

∀x ∈ X and measurable A, ∃Ωx,A with P(Ωx,A) = 1 such that ∀ω ∈ Ωx,A, limn Pθn(ω)(x, A) = Pθ⋆(x, A),

what can we deduce about limn πθn(f)?

SLIDE 28

Conditions for convergence of the marginals

Starting from:

∀x ∈ X and measurable A, ∃Ωx,A with P(Ωx,A) = 1 such that ∀ω ∈ Ωx,A, limn Pθn(ω)(x, A) = Pθ⋆(x, A).

SLIDE 29

Conditions for convergence of the marginals

the steps are:

Step 1. ∀x ∈ X, ∃Ωx with P(Ωx) = 1 such that ∀ω ∈ Ωx, Pθn(ω)(x, ·) → Pθ⋆(x, ·) in distribution.

֒→ Tool: separable metric space X (e.g. Polish).

SLIDE 30

Conditions for convergence of the marginals

Step 2. ∃Ω′ with P(Ω′) = 1 such that ∀ω ∈ Ω′ and ∀x ∈ X, Pθn(ω)(x, ·) → Pθ⋆(x, ·) in distribution.

֒→ Tool: Polish space X + equicontinuity of {Pθf − Pθ⋆f, θ ∈ Θ}.

SLIDE 31

Conditions for convergence of the marginals

Step 3. ∃Ω⋆ with P(Ω⋆) = 1 such that ∀ω ∈ Ω⋆, P^k_{θn(ω)}(x, ·) → P^k_{θ⋆}(x, ·) in distribution.

֒→ Tool: Feller properties of the kernels {Pθ, θ ∈ Θ}.

SLIDE 32

Conditions for convergence of the marginals

Then

|πθn(f) − πθ⋆(f)| ≤ |P^k_{θn} f(x) − πθn(f)| + |P^k_{θ⋆} f(x) − πθ⋆(f)| + |P^k_{θn} f(x) − P^k_{θ⋆} f(x)|

֒→ Tool: ergodicity.

SLIDE 33

Conditions for convergence of the marginals

E[f(Xn)] − π(f) = E[ f(Xn) − P^N_{θn−N} f(Xn−N) ] + E[ P^N_{θn−N} f(Xn−N) − πθn−N(f) ] + E[ πθn−N(f) ] − π(f)

◮ [Term 2] Condition on the ergodicity of the transition kernels. “Usually”, the transition kernels {Pθ, θ ∈ Θ} are geometrically ergodic:

sup_{f : |f|∞ ≤ 1} |P^n_θ f(x) − πθ(f)| ≤ Cθ ρθ^n V(x),  ρθ ∈ (0, 1),

BUT the rate of convergence may depend upon θ · · · in such a way that ρθ → 1 when θ → ∂Θ. Therefore, the rate at which θn → ∂Θ has to be controlled.
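For the two-state kernel Pθ from the counterexample above, both the geometric rate and its degeneracy at the boundary can be checked directly (a sketch; the closed form TV = ½|1 − 2θ|^n follows from the eigendecomposition of Pθ and is stated here as background, not from the slide):

```python
import numpy as np

def tv_to_pi(t, n):
    """Total-variation distance between P_theta^n(0, .) and pi = (1/2, 1/2)
    for the two-state kernel [[1-t, t], [t, 1-t]]."""
    P = np.array([[1.0 - t, t], [t, 1.0 - t]])
    Pn = np.linalg.matrix_power(P, n)
    return 0.5 * np.abs(Pn[0] - 0.5).sum()

# geometric decay with rate rho_theta = |1 - 2 theta| ...
rates = [tv_to_pi(0.1, n) for n in (1, 2, 3)]
# ... which degrades (rho_theta -> 1) as theta -> 0, the boundary of Theta
```

Here ∂Θ corresponds to θ → 0 or θ → 1, where |1 − 2θ| → 1 and ergodicity is lost, exactly the phenomenon the slide describes.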

SLIDE 34

Conditions for convergence of the marginals

In practice, ergodicity is controlled via drift + minorization conditions:

If PθV ≤ λθV + bθ and Pθ(x, ·) ≥ δθ νθ(·), then ‖P^n_θ(x, ·) − πθ‖TV ≤ Cθ ρθ^n V(x),

where Cθ ∨ (1 − ρθ)^{−1} ≤ C ( bθ ∨ δθ^{−1} ∨ (1 − λθ)^{−1} )^3.

SLIDE 35

Conditions for convergence of the marginals

The non-deterioration of this control as θ → ∂Θ is handled case by case:

  • force the parameter θ to “stay away from the boundary” (e.g. reprojection onto a compact set);
  • let the adaptation procedure run freely, but control the growth [Vihola & Saksman, 2010], [Vihola, 2010].

Ex.: for adaptive HM, [Vihola & Saksman, 2010] show that

Cθ ∨ (1 − ρθ)^{−1} ≤ c √(det θ)  and  ∀τ > 0, sup_n n^{−τ} |θn| < +∞ a.s.

SLIDE 36

Conditions for convergence of the marginals

E[f(Xn)] − π(f) = E[ f(Xn) − P^N_{θn−N} f(Xn−N) ] + E[ P^N_{θn−N} f(Xn−N) − πθn−N(f) ] + E[ πθn−N(f) ] − π(f)

◮ [Term 1] Condition on the adaptation mechanism, since

| E[ f(Xn) − P^N_{θn−N} f(Xn−N) ] | ≤ Σ_{j=1}^{N−1} (N − j) E[ sup_x ‖ Pθn−N+j(x, ·) − Pθn−N+j−1(x, ·) ‖TV ],

where the total-variation term is the “distance” between two successive transition kernels. Therefore, the adaptation has to be diminishing.

SLIDE 37

Adaptation and Ergodicity

E[f(Xn)] − π(f) = E[ f(Xn) − P^N_{θn−N} f(Xn−N) ] + E[ P^N_{θn−N} f(Xn−N) − π(f) ]

◮ Example: Xn+1 ∼ Pθn(Xn, ·) with θn = n^{−1/4} and Pθ = [[1−θ, θ], [θ, 1−θ]]. In this case, since θn → 0,

| E[ P^N_{θn−N} f(Xn−N) − π(f) ] | ≤ |1 − 2θn−N|^N → 1 when N is fixed.

SLIDE 38

Adaptation and Ergodicity

For the same example, since θn → 0,

| E[ P^{Nn}_{θn−Nn} f(Xn−Nn) − π(f) ] | ≤ |1 − 2θn−Nn|^{Nn} → 0 for a convenient choice of Nn.

Therefore, we choose N depending upon n, with Nn → +∞, and the adaptation has to be such that

Σ_{j=1}^{Nn−1} (Nn − j) E[ sup_x ‖ Pθn−Nn+j(x, ·) − Pθn−Nn+j−1(x, ·) ‖TV ] → 0,

where the total-variation term is the “distance” between two successive transition kernels.

֒→ The “rate” of adaptation depends on the ergodic behavior of the transition kernels.
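The interplay between θn = n^{−1/4} and the window Nn can be checked numerically (a sketch; the particular choice Nn = ⌊√n⌋ is mine, any Nn with Nn θn → ∞ would serve to drive the ergodicity bound to 0):

```python
def ergodicity_bound(n, N):
    """|1 - 2 theta_{n-N}|**N with theta_n = n**(-1/4), the bound on the
    distance to pi after N steps of the frozen kernel P_{theta_{n-N}}."""
    return abs(1.0 - 2.0 * (n - N) ** -0.25) ** N

# fixed window N = 10: the bound creeps back up towards 1 as n grows
fixed = [ergodicity_bound(n, 10) for n in (10**3, 10**6)]
# growing window N_n = floor(sqrt(n)): the bound tends to 0
growing = [ergodicity_bound(n, int(n ** 0.5)) for n in (10**3, 10**6)]
```

Since θn−Nn ≈ n^{−1/4} while Nn ≈ n^{1/2}, the exponent Nn θn−Nn ≈ n^{1/4} → ∞, which is why the growing window succeeds where the fixed one fails.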

SLIDE 39

  • III. Conclusion
SLIDE 40

Tools for convergence of adaptive MCMC samplers:

1. Markov chain theory (ergodicity, Poisson equation, · · · )
2. Stochastic approximation (stability/convergence, control of “non-stability”)

SLIDE 41

◮ When the transition kernels have the same invariant distribution π: ergodicity of the transition kernels + diminishing adaptation.

Ex.: convergence of {θn, n ≥ 0} is not required, BUT control of the “divergence to ∂Θ” is needed.
SLIDE 42

◮ When the kernels have their own invariant distribution πθ and πθ⋆ = π: ergodicity + diminishing adaptation + convergence of θn to θ⋆.

SLIDE 43

Stochastic approximation procedures

Is it necessary to modify the adaptation so that the stochastic approximation procedure is recurrent, or a.s. bounded (stability), or converges to the set of interest? For example, by introducing a reprojection onto a fixed compact set, or a reprojection onto increasing compact sets combined with a “truncation” of the chain.

֒→ It is not always useful to force recurrence / stability, since we know how to accommodate non-stability of the parameter · · ·
֒→ Research in progress aims at avoiding these reprojections / truncations.

SLIDE 44

Some references

1. Adaptive MCMC (methodologies)

  • C. Andrieu, J. Thoms. Statistics and Computing, 2008.
  • J.S. Rosenthal. MCMC Handbook, 2009.
  • Y. Atchadé, G. Fort, E. Moulines, P. Priouret. Time Series book, 2010.

2. General results for convergence of adaptive MCMC

  • C. Andrieu, E. Moulines. Ann. Appl. Probab., 2006.
  • G.O. Roberts, J.S. Rosenthal. J. Appl. Probab., 2007.
  • Y. Atchadé, G. Fort. Stoch. Processes Appl., 2010.
  • G. Fort, E. Moulines, P. Priouret. Preprint, 2010.

3. Convergence of some adaptive MCMC

  • C. Andrieu, A. Jasra, A. Doucet, P. Del Moral. Preprint, 2007.
  • Y. Atchadé. Statistica Sinica, 2010.
  • E. Saksman, M. Vihola. Ann. Appl. Probab., 2010.
  • M. Vihola. Preprint, 2010.