

SLIDE 1

Partial ordering of inhomogeneous Markov chains with applications to Markov chain Monte Carlo methods

Jimmy Olsson

Department of Mathematics, KTH Royal Institute of Technology, Stockholm, Sweden
Based on joint work with Florian Maire and Randal Douc

MCMski IV, 7 January 2014, Chamonix


SLIDE 2

Outline

1. Introduction
2. Main result
3. Applications
   Data augmentation-type MCMC methods
   Pseudo-marginal methods
4. Conclusion



SLIDE 4

Markov Chain Monte Carlo (MCMC) methods

Let π be some target distribution on a state space (X, 𝒳) and assume that π is known up to a multiplicative constant only. Given π, MCMC methods allow a Markov chain (Xn)n with stationary distribution π to be generated. Expectations

\[ \pi(f) := \int f(x) \, \pi(\mathrm{d}x) \]

are estimated using sample averages

\[ \hat\pi_n(f) := \frac{1}{n} \sum_{k=0}^{n-1} f(X_k). \]

Recall that a Markov transition kernel P on (X, 𝒳) is π-reversible if

\[ \pi(\mathrm{d}x) \, P(x, \mathrm{d}y) = \pi(\mathrm{d}y) \, P(y, \mathrm{d}x). \]
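To make the estimator concrete, a minimal random-walk Metropolis sketch in Python; the standard-normal target log_pi, the step size, and the objective function f are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(0)

def log_pi(x):
    # Unnormalized log-target: standard normal, known up to a constant.
    return -0.5 * x**2

def metropolis_chain(n, x0=0.0, step=1.0):
    # Random-walk Metropolis: the proposal is symmetric, so the
    # acceptance ratio reduces to pi(y)/pi(x), and the resulting
    # kernel is pi-reversible.
    xs = np.empty(n)
    x = x0
    for k in range(n):
        y = x + step * rng.normal()
        if np.log(rng.random()) < log_pi(y) - log_pi(x):
            x = y
        xs[k] = x
    return xs

xs = metropolis_chain(100_000)
f = lambda z: z**2                    # objective function f
print("pi_hat_n(f) =", f(xs).mean())  # sample average; pi(f) = 1 here
```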


SLIDE 5

Comparison of π-reversible Markov chains

To measure the performance of an MCMC sampler with transition kernel P and stationary distribution π, we consider the asymptotic variance

\[ v(f, P) := \lim_{n \to \infty} \operatorname{Var}\bigl( \sqrt{n} \, \hat\pi_n(f) \bigr) = \lim_{n \to \infty} \frac{1}{n} \operatorname{Var}\Bigl( \sum_{k=0}^{n-1} f(X_k) \Bigr). \]

For given π-reversible kernels P0 and P1, we would like to find easily checked conditions under which v(f, P0) ≥ v(f, P1) for all f belonging to some large class of objective functions.
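In practice this limit can be approximated from a single trajectory, for instance by batch means; a minimal sketch, assuming the chain xs and objective f from the previous sketch and an illustrative batch count:

```python
import numpy as np

def batch_means_variance(fx, n_batches=50):
    # Estimate v(f, P) from one trajectory f(X_0), ..., f(X_{n-1}):
    # split it into batches and rescale the variance of the batch means.
    m = len(fx) // n_batches              # batch length
    means = fx[: m * n_batches].reshape(n_batches, m).mean(axis=1)
    return m * means.var(ddof=1)

# e.g., with the chain from the previous sketch:
# v_hat = batch_means_variance(f(xs))
```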


SLIDE 6

Comparison of π-reversible Markov chains (cont’d)

Definition. Let P0 and P1 be π-reversible Markov kernels on (X, 𝒳). We say that P1 dominates P0

(i) on the off-diagonal, written P1 ⪰ P0, if for all (x, A) ∈ X × 𝒳,

\[ P_1(x, A \setminus \{x\}) \ge P_0(x, A \setminus \{x\}); \]

(ii) in the covariance ordering, written P1 ⪰cov P0, if for all f ∈ L²(π),

\[ \iint f(x) f(y) \, P_1(x, \mathrm{d}y) \, \pi(\mathrm{d}x) \le \iint f(x) f(y) \, P_0(x, \mathrm{d}y) \, \pi(\mathrm{d}x). \]
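On a finite state space both orderings reduce to matrix checks, as the following hedged numpy sketch illustrates; the tolerance and the positive-semidefiniteness reformulation of (ii) (the matrix diag(π)P is symmetric for π-reversible P) are the only ingredients added here:

```python
import numpy as np

def dominates_off_diagonal(P1, P0):
    # On a finite state space, P1(x, A \ {x}) >= P0(x, A \ {x}) for all
    # x and A reduces to an entrywise inequality off the diagonal.
    off = ~np.eye(P0.shape[0], dtype=bool)
    return np.all(P1[off] >= P0[off])

def dominates_covariance(P1, P0, pi, tol=1e-10):
    # The integral in (ii) is f^T diag(pi) P f, so the covariance
    # ordering holds iff diag(pi) (P0 - P1) is positive semidefinite;
    # it is symmetric because both kernels are pi-reversible.
    A = np.diag(pi) @ (P0 - P1)
    A = 0.5 * (A + A.T)                   # guard against round-off
    return np.all(np.linalg.eigvalsh(A) >= -tol)
```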


SLIDE 7

Comparison of π-reversible Markov chains (cont’d)

With these definitions the following chain of implications holds true.

Theorem (P. H. Peskun [2] and L. Tierney [3]). Let P0 and P1 be π-reversible Markov kernels on (X, 𝒳). Then

\[ P_1 \succeq P_0 \;\Longrightarrow\; P_1 \succeq_{\mathrm{cov}} P_0 \;\Longrightarrow\; v(f, P_0) \ge v(f, P_1) \quad \text{for all } f \in L^2(\pi). \]
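A two-state sanity check of the theorem; the kernels below are assumed examples, and the fundamental-matrix identity v(f, P) = f̄ᵀ diag(π)(2Z − I)f̄ with Z = (I − P + 𝟙πᵀ)⁻¹, a standard finite-state formula for π-stationary chains, is used only for illustration:

```python
import numpy as np

def asym_var(f, P, pi):
    # v(f, P) = Var_pi(f) + 2 sum_{n>=1} Cov(f(X_0), f(X_n)), computed
    # via the fundamental matrix Z = (I - P + Pi)^{-1}, which gives
    # v = fbar^T diag(pi) (2Z - I) fbar for centred fbar = f - pi(f).
    n = len(pi)
    fbar = f - pi @ f
    Z = np.linalg.inv(np.eye(n) - P + np.outer(np.ones(n), pi))
    return fbar @ np.diag(pi) @ (2 * Z - np.eye(n)) @ fbar

# Two-state toy: P1 puts more mass off the diagonal than P0, both are
# reversible w.r.t. the uniform pi, and the variances order accordingly.
pi = np.array([0.5, 0.5])
P0 = np.array([[0.7, 0.3], [0.3, 0.7]])
P1 = np.array([[0.4, 0.6], [0.6, 0.4]])
f = np.array([0.0, 1.0])
assert asym_var(f, P1, pi) <= asym_var(f, P0, pi)
```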



SLIDE 9

Comparison of inhomogeneous chains

Theorem. Let $(X_n^{(0)})_n$ and $(X_n^{(1)})_n$ be Markov chains evolving as

\[ \pi \sim X_0^{(i)} \xrightarrow{P_i} X_1^{(i)} \xrightarrow{Q_i} X_2^{(i)} \xrightarrow{P_i} X_3^{(i)} \xrightarrow{Q_i} \cdots \]

where
(i) the Pi and Qi are π-reversible,
(ii) P1 ⪰ P0 and Q1 ⪰ Q0.

Then for all f ∈ L²(π) satisfying a weak summability condition,

\[ \lim_{n \to \infty} \frac{1}{n} \operatorname{Var}\Bigl( \sum_{k=0}^{n-1} f\bigl(X_k^{(1)}\bigr) \Bigr) \le \lim_{n \to \infty} \frac{1}{n} \operatorname{Var}\Bigl( \sum_{k=0}^{n-1} f\bigl(X_k^{(0)}\bigr) \Bigr). \]
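A simulation sketch of the theorem's conclusion on a two-state space; the kernels, run length, and replication count are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(1)

def alternating_chain(P, Q, pi, n):
    # Inhomogeneous chain X_0 -P-> X_1 -Q-> X_2 -P-> ..., started
    # from the stationary distribution pi on a finite state space.
    x = rng.choice(len(pi), p=pi)
    xs = np.empty(n, dtype=int)
    for k in range(n):
        xs[k] = x
        K = P if k % 2 == 0 else Q
        x = rng.choice(len(pi), p=K[x])
    return xs

def scaled_var(P, Q, pi, f, n=500, reps=2000):
    # Monte Carlo estimate of (1/n) Var(sum_{k<n} f(X_k)).
    sums = [f[alternating_chain(P, Q, pi, n)].sum() for _ in range(reps)]
    return np.var(sums, ddof=1) / n

pi = np.array([0.5, 0.5])
P0 = Q0 = np.array([[0.8, 0.2], [0.2, 0.8]])
P1 = Q1 = np.array([[0.3, 0.7], [0.7, 0.3]])   # dominates off-diagonally
f = np.array([0.0, 1.0])
print(scaled_var(P1, Q1, pi, f), "<~", scaled_var(P0, Q0, pi, f))
```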


SLIDE 10

The summability condition

The result holds for all f ∈ L²(π) such that, for i ∈ {0, 1},

\[ \sum_{k=1}^{\infty} \Bigl( \bigl| \operatorname{Cov}\bigl(f(X_0^{(i)}), f(X_k^{(i)})\bigr) \bigr| + \bigl| \operatorname{Cov}\bigl(f(X_1^{(i)}), f(X_{k+1}^{(i)})\bigr) \bigr| \Bigr) < \infty. \tag{∗} \]

The condition (∗)
- implies that the asymptotic variances exist and are finite,
- holds when each product PiQi is V-geometrically ergodic,
- is not a necessary condition.


SLIDE 11

Two slides on the proof

The proof of L. Tierney [3] uses spectral theory. It is, however, possible under the summability condition to replicate Tierney's result without spectral theory, by

1. showing that for π-reversible kernels P,

\[ v(f, P) = \pi(f^2) - (\pi f)^2 + 2 \sum_{n=1}^{\infty} \operatorname{Cov}\bigl(f(X_0), f(X_n)\bigr), \]

2. setting, using the notation ⟨f, g⟩ := ∫ f(x) g(x) π(dx),

\[ P_\alpha := (1 - \alpha) P_0 + \alpha P_1 \quad \text{and} \quad w_\lambda(f, P_\alpha) := \sum_{n=1}^{\infty} \lambda^n \langle f, P_\alpha^n f \rangle, \]


SLIDE 12

Two slides on the proof (cont’d)

3. and showing that for all λ ∈ (0, 1), α ↦ w_λ(f, P_α) is decreasing, yielding w_λ(f, P1) ≤ w_λ(f, P0). To this aim, show that there is a function $f^*_\alpha \in L^2(\pi)$ such that

\[ \frac{\partial w_\lambda(f, P_\alpha)}{\partial \alpha} = \bigl\langle f^*_\alpha, \lambda (P_1 - P_0) f^*_\alpha \bigr\rangle = \lambda \bigl( \langle f^*_\alpha, P_1 f^*_\alpha \rangle - \langle f^*_\alpha, P_0 f^*_\alpha \rangle \bigr) \le 0, \]

4. and finally applying the dominated convergence theorem as λ → 1.

Very roughly, the proof of the inhomogeneous case follows the same lines, by splitting the sums into even and odd terms (with distributions governed by Pi and Qi, respectively).
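A numeric illustration of step 3 on a finite state space, truncating the series defining w_λ; the two-state kernels and λ = 0.9 are assumptions chosen so that P1 dominates P0 on the off-diagonal:

```python
import numpy as np

def w_lambda(f, P, pi, lam, n_terms=200):
    # Truncation of w_lambda(f, P) = sum_{n>=1} lam^n <f, P^n f>,
    # with <f, g> = sum_x f(x) g(x) pi(x).
    D = np.diag(pi)
    total, Pnf = 0.0, f.copy()
    for n in range(1, n_terms + 1):
        Pnf = P @ Pnf
        total += lam**n * (f @ D @ Pnf)
    return total

pi = np.array([0.5, 0.5])
P0 = np.array([[0.8, 0.2], [0.2, 0.8]])
P1 = np.array([[0.3, 0.7], [0.7, 0.3]])        # P1 dominates P0
f = np.array([-1.0, 1.0])
ws = [w_lambda(f, (1 - a) * P0 + a * P1, pi, lam=0.9)
      for a in np.linspace(0.0, 1.0, 5)]
assert all(w1 <= w0 + 1e-12 for w0, w1 in zip(ws, ws[1:]))
```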




SLIDE 15

Data augmentation

In many applications the density of the target π is analytically intractable or too expensive to evaluate. A common way of coping with this problem is to augment the data by an auxiliary variable U and consider the extended target

\[ \tilde\pi(\mathrm{d}y \times \mathrm{d}u) := \pi(\mathrm{d}y) \, R(y, \mathrm{d}u), \]

where R is some Markov kernel; by construction, the extended target has the desired distribution π as its marginal. A typical example is Bayesian inference in models with latent variables (such as hidden Markov models and mixture models), where π and U play the roles of the posterior and the unobserved data, respectively.
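A toy sketch of the construction; all densities below are illustrative assumptions (a Gaussian marginal target and a Gaussian auxiliary kernel):

```python
def log_pi(y):
    # Assumed marginal target: standard normal, up to a constant.
    return -0.5 * y**2

def log_r(y, u):
    # Assumed auxiliary kernel R(y, du): N(y, 1) density in u.
    return -0.5 * (u - y)**2

def log_pi_tilde(y, u):
    # Extended target pi~(dy x du) = pi(dy) R(y, du); integrating out
    # u recovers pi, so pi is its marginal by construction.
    return log_pi(y) + log_r(y, u)
```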


SLIDE 16

Metropolis-Hastings (MH) for data augmentation

Algorithm (Freeze). Given $(Y_k^{(1)}, U_k)$,

1. generate $Y^* \sim s((Y_k^{(1)}, U_k), y) \, \mathrm{d}y$,
2. generate $U^* \sim t((Y_k^{(1)}, U_k, Y^*), u) \, \mathrm{d}u$,
3. let

\[ (Y_{k+1}^{(1)}, U_{k+1}) \leftarrow \begin{cases} (Y^*, U^*) & \text{w.\,pr.\ } \alpha(Y_k^{(1)}, U_k, Y^*, U^*), \\ (Y_k^{(1)}, U_k) & \text{otherwise.} \end{cases} \]

Here $\alpha(Y_k^{(1)}, U_k, Y^*, U^*)$ equals

\[ 1 \wedge \frac{\pi(Y^*) \, r(Y^*, U^*) \, s((Y^*, U^*), Y_k^{(1)}) \, t((Y^*, U^*, Y_k^{(1)}), U_k)}{\pi(Y_k^{(1)}) \, r(Y_k^{(1)}, U_k) \, s((Y_k^{(1)}, U_k), Y^*) \, t((Y_k^{(1)}, U_k, Y^*), U^*)}. \]
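A hedged Python sketch of one Freeze transition; the primitives log_pi, log_r, sample_s/log_s, and sample_t/log_t are user-supplied stand-ins for the densities π, r, s, and t above:

```python
import numpy as np

rng = np.random.default_rng(2)

def freeze_step(y, u, log_pi, log_r, sample_s, log_s, sample_t, log_t):
    # One Freeze transition: propose Y* from s and U* from t, then
    # accept the pair with the probability alpha displayed above.
    y_new = sample_s((y, u))               # 1. Y* ~ s((Y_k, U_k), .)
    u_new = sample_t((y, u, y_new))        # 2. U* ~ t((Y_k, U_k, Y*), .)
    log_alpha = (log_pi(y_new) + log_r(y_new, u_new)
                 + log_s((y_new, u_new), y) + log_t((y_new, u_new, y), u)
                 - log_pi(y) - log_r(y, u)
                 - log_s((y, u), y_new) - log_t((y, u, y_new), u_new))
    if np.log(rng.random()) < log_alpha:   # 3. accept/reject
        return y_new, u_new
    return y, u
```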


SLIDE 17

The Systematic Refreshment Algorithm

In some cases it is possible to sample from R(·, du) = r(·, u) du; one may then systematically "refresh" $U_k$ by a random draw $\tilde U$ according to R.

Algorithm (Systematic Refreshment). Given $Y_k^{(2)}$,

1. generate $\tilde U \sim r(Y_k^{(2)}, u) \, \mathrm{d}u$,
2. generate $Y^* \sim s((Y_k^{(2)}, \tilde U), y) \, \mathrm{d}y$,
3. generate $U \sim t((Y_k^{(2)}, \tilde U, Y^*), u) \, \mathrm{d}u$,
4. let

\[ Y_{k+1}^{(2)} \leftarrow \begin{cases} Y^* & \text{w.\,pr.\ } \alpha(Y_k^{(2)}, \tilde U, Y^*, U), \\ Y_k^{(2)} & \text{otherwise.} \end{cases} \]
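The corresponding sketch for one Systematic Refreshment transition, under the same assumed primitives plus a sampler sample_r for R:

```python
import numpy as np

rng = np.random.default_rng(2)

def refresh_step(y, log_pi, log_r, sample_r,
                 sample_s, log_s, sample_t, log_t):
    # One Systematic Refreshment transition: redraw the auxiliary
    # variable from R, then perform a Freeze-type move from (Y_k, U~).
    u_t = sample_r(y)                      # 1. U~ ~ r(Y_k, .)
    y_new = sample_s((y, u_t))             # 2. Y* ~ s((Y_k, U~), .)
    u_new = sample_t((y, u_t, y_new))      # 3. U ~ t((Y_k, U~, Y*), .)
    log_alpha = (log_pi(y_new) + log_r(y_new, u_new)
                 + log_s((y_new, u_new), y) + log_t((y_new, u_new, y), u_t)
                 - log_pi(y) - log_r(y, u_t)
                 - log_s((y, u_t), y_new) - log_t((y, u_t, y_new), u_new))
    if np.log(rng.random()) < log_alpha:   # 4. accept/reject
        return y_new
    return y
```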


SLIDE 18

The Systematic Refreshment Algorithm (cont’d)

In contrast to the Freeze Algorithm, the marginal process $(Y_n^{(2)})_n$ is a Markov chain, which can be proved to be π-reversible. It is, however, not a standard MH chain, due to the use of the auxiliary variables. It can be shown that the algorithm covers the frameworks of, e.g., randomized MCMC and generalized multiple-try Metropolis. When comparing the performances of the Systematic Refreshment and Freeze Algorithms, the classical results of P. H. Peskun [2] and L. Tierney [3] do not apply, as $(Y_n^{(1)})_n$ is not even a Markov chain.


SLIDE 19

Inhomogeneous embedding

Key observation: the chains $(Y_n^{(1)})_n$ and $(Y_n^{(2)})_n$ can be embedded into inhomogeneous Markov chains $(X_n^{(1)})_n$ and $(X_n^{(2)})_n$ defined by

\[ X_{2k}^{(i)} = \begin{pmatrix} Y_k^{(i)} \\ U_k^{(i)} \end{pmatrix} \xrightarrow{P_i} X_{2k+1}^{(i)} = \begin{pmatrix} \check Y_k^{(i)} \\ \check U_k^{(i)} \end{pmatrix} \xrightarrow{Q_i} X_{2k+2}^{(i)} = \begin{pmatrix} Y_{k+1}^{(i)} \\ U_{k+1}^{(i)} \end{pmatrix} \xrightarrow{P_i} \cdots \]

where
- P1 is the identity kernel,
- Q1 = Q2 describe transitions according to the Freeze Algorithm,
- P2 modifies ("refreshes") the second component according to $\check U_k^{(2)} \sim R(Y_k^{(2)}, \cdot)$ while keeping $\check Y_k^{(2)} = Y_k^{(2)}$.


SLIDE 20

Inhomogeneous embedding (cont’d)

Here the Pi and Qi are π̃-reversible, as
- the identity kernel P1 is reversible w.r.t. any probability measure,
- Q1 = Q2 are π̃-reversible as standard MH kernels,
- P2 is π̃-reversible as a Gibbs sub-step transition kernel for the target π̃(dy × du) = π(dy) R(y, du).

In addition,
- P2 ⪰ P1, as P1 has no off-diagonal mass,
- trivially, Q2 ⪰ Q2 = Q1.


SLIDE 21

Freeze vs Systematic Refreshment

Thus, for the output $(Y_n^{(1)})_n$ and $(Y_n^{(2)})_n$ of the Freeze and Systematic Refreshment Algorithms, respectively, we obtain the following from our main result.

Corollary. For all f ∈ L²(π) satisfying the summability assumption, it holds that

\[ \lim_{n \to \infty} \frac{1}{n} \operatorname{Var}\Bigl( \sum_{k=0}^{n-1} f\bigl(Y_k^{(2)}\bigr) \Bigr) \le \lim_{n \to \infty} \frac{1}{n} \operatorname{Var}\Bigl( \sum_{k=0}^{n-1} f\bigl(Y_k^{(1)}\bigr) \Bigr). \]
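A sketch of how the corollary can be probed empirically, assuming step functions such as the freeze_step and refresh_step sketches above, wrapped for a concrete model:

```python
import numpy as np

def compare(freeze_kernel, refresh_kernel, f, y0, u0, n=100_000):
    # Run both samplers on the same model and compare batch-means
    # estimates of the asymptotic variance of f (smaller is better).
    y, u = y0, u0
    f1 = np.empty(n)
    for k in range(n):
        y, u = freeze_kernel(y, u)
        f1[k] = f(y)
    y, f2 = y0, np.empty(n)
    for k in range(n):
        y = refresh_kernel(y)
        f2[k] = f(y)
    def bm(fx, b=100):                       # batch-means estimator
        m = len(fx) // b
        return m * fx[: m * b].reshape(b, m).mean(axis=1).var(ddof=1)
    return bm(f2), bm(f1)  # expect bm(f2) <= bm(f1) by the corollary
```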



SLIDE 23

Pseudo-marginal methods

Nevertheless, sampling from R is infeasible in general. Hence, pseudo-marginal methods use importance sampling based on some proposal $\tilde r(y, u) \, \mathrm{d}u$ and the corresponding importance weight

\[ w_u(y) := \frac{r(y, u)}{\tilde r(y, u)}. \]

Due to the use of importance sampling, the output is not π-reversible (except when $w_u(y) \equiv 1$).


SLIDE 24

Pseudo-marginal methods (cont’d)

The Monte Carlo within Metropolis algorithm is covered by this framework, with $\tilde U$ and $U$ corresponding to the Monte Carlo samples used for approximating $\pi(Y_k)$ and $\pi(Y^*)$, respectively. It shares the good mixing properties of the Systematic Refreshment Algorithm, at the cost of bias for finite Monte Carlo sample sizes [1]. A way of coping with the bias is to also propagate ("recycle") the Monte Carlo approximations through the algorithm, as in the grouped independence MH algorithm. We are then back to the Freeze Algorithm!


SLIDE 25

The Random Refreshment Algorithm

Algorithm (Random Refreshment). Given $(Y_k^{(3)}, U_k^{(3)})$,

1. (i) generate $\tilde U^* \sim \tilde r(Y_k^{(3)}, u) \, \mathrm{d}u$,
   (ii) let

\[ \tilde U \leftarrow \begin{cases} \tilde U^* & \text{w.\,pr.\ } \varrho(Y_k^{(3)}, U_k^{(3)}, \tilde U^*), \\ U_k^{(3)} & \text{otherwise,} \end{cases} \tag{∗} \]

2. generate $Y^* \sim s((Y_k^{(3)}, \tilde U), y) \, \mathrm{d}y$,
3. generate $U^* \sim t((Y_k^{(3)}, \tilde U, Y^*), u) \, \mathrm{d}u$,
4. let

\[ (Y_{k+1}^{(3)}, U_{k+1}^{(3)}) \leftarrow \begin{cases} (Y^*, U^*) & \text{w.\,pr.\ } \alpha(Y_k^{(3)}, \tilde U, Y^*, U^*), \\ (Y_k^{(3)}, U_k^{(3)}) & \text{otherwise,} \end{cases} \]

where, in (∗),

\[ \varrho(Y_k^{(3)}, U_k^{(3)}, \tilde U^*) := 1 \wedge \frac{w_{\tilde U^*}(Y_k^{(3)})}{w_{U_k^{(3)}}(Y_k^{(3)})}. \]
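A hedged sketch of one Random Refreshment transition; log_w, sample_r_tilde, and the Freeze-type acceptance function log_alpha are user-supplied stand-ins:

```python
import numpy as np

rng = np.random.default_rng(3)

def random_refresh_step(y, u, log_w, sample_r_tilde,
                        sample_s, sample_t, log_alpha):
    # One Random Refreshment transition: the auxiliary variable is
    # refreshed only with probability rho, driven by the importance
    # weights w_u(y) = r(y, u) / r~(y, u).
    u_star = sample_r_tilde(y)             # 1(i).  U~* ~ r~(Y_k, .)
    log_rho = min(0.0, log_w(y, u_star) - log_w(y, u))
    u_t = u_star if np.log(rng.random()) < log_rho else u   # 1(ii)
    y_star = sample_s((y, u_t))            # 2. Y* ~ s((Y_k, U~), .)
    u_new = sample_t((y, u_t, y_star))     # 3. U* ~ t((Y_k, U~, Y*), .)
    if np.log(rng.random()) < log_alpha(y, u_t, y_star, u_new):  # 4.
        return y_star, u_new
    return y, u
```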


SLIDE 26

Random Refreshment vs Freeze

Using a similar inhomogeneous embedding, one may establish the following for the output $(Y_n^{(1)})_n$ and $(Y_n^{(3)})_n$ of the Freeze and Random Refreshment Algorithms, respectively.

Corollary.
(i) The output of the Random Refreshment Algorithm is indeed π̃-reversible.
(ii) For all f ∈ L²(π) satisfying the summability assumption,

\[ \lim_{n \to \infty} \frac{1}{n} \operatorname{Var}\Bigl( \sum_{k=0}^{n-1} f\bigl(Y_k^{(3)}\bigr) \Bigr) \le \lim_{n \to \infty} \frac{1}{n} \operatorname{Var}\Bigl( \sum_{k=0}^{n-1} f\bigl(Y_k^{(1)}\bigr) \Bigr). \]



SLIDE 28

Conclusion

We have successfully extended the results of Peskun [2] and Tierney [3] to inhomogeneous Markov chains evolving alternately according to two different Markov transition kernels. This configuration covers several popular MCMC algorithms, such as randomized MCMC, the multiple-try Metropolis algorithm, pseudo-marginal algorithms, the sampler of Carlin and Chib, etc. As illustrated by our novel Random Refreshment Algorithm in the context of pseudo-marginal methods, the main result can also be used for designing new algorithms and for improving existing ones in terms of asymptotic variance.


SLIDE 29

References I

[1] C. Andrieu and G. O. Roberts. "The pseudo-marginal approach for efficient Monte Carlo computations". The Annals of Statistics 37.2 (2009), pp. 697–725.

[2] P. H. Peskun. "Optimum Monte-Carlo sampling using Markov chains". Biometrika 60.3 (1973), pp. 607–612.

[3] L. Tierney. "A note on Metropolis-Hastings kernels for general state spaces". Annals of Applied Probability 8.1 (1998), pp. 1–9.
