Partial ordering of inhomogeneous Markov chains with applications to Markov chain Monte Carlo methods — presentation transcript

Jimmy Olsson, Department of Mathematics, KTH Royal Institute of Technology, Stockholm, Sweden


  1. Title. Partial ordering of inhomogeneous Markov chains with applications to Markov chain Monte Carlo methods. Jimmy Olsson, Department of Mathematics, KTH Royal Institute of Technology, Stockholm, Sweden. Based on joint work with Florian Maire and Randal Douc. MCMski IV, 7 January 2014, Chamonix.

  2. Outline. 1 Introduction; 2 Main result; 3 Applications (data augmentation-type MCMC methods, pseudo-marginal methods); 4 Conclusion.

  3. Outline (repeated as section divider: Introduction).

  4. Markov chain Monte Carlo (MCMC) methods. Let $\pi$ be a target distribution on a state space $(\mathsf{X}, \mathcal{X})$ and assume that $\pi$ is known only up to a multiplicative constant. Given $\pi$, MCMC methods allow a Markov chain $(X_n)_n$ with stationary distribution $\pi$ to be generated. Expectations
$$\pi(f) := \int f(x)\,\pi(dx)$$
are estimated using sample averages
$$\hat{\pi}_n(f) := \frac{1}{n} \sum_{k=0}^{n-1} f(X_k).$$
Recall that a Markov transition kernel $P$ on $(\mathsf{X}, \mathcal{X})$ is $\pi$-reversible if
$$\pi(dx)\,P(x, dy) = \pi(dy)\,P(y, dx).$$
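As an illustration (not part of the slides), here is a minimal random-walk Metropolis sampler in Python; it assumes a one-dimensional target known only up to a constant and estimates $\pi(f)$ by the sample average $\hat{\pi}_n(f)$:

```python
import numpy as np

def rw_metropolis(log_target, x0, n_steps, step=1.0, seed=0):
    """Random-walk Metropolis: needs the target density only up to a constant."""
    rng = np.random.default_rng(seed)
    x, lp = x0, log_target(x0)
    chain = np.empty(n_steps)
    for k in range(n_steps):
        y = x + step * rng.normal()          # symmetric Gaussian proposal
        lq = log_target(y)
        if np.log(rng.uniform()) < lq - lp:  # accept w. pr. 1 ^ pi(y)/pi(x)
            x, lp = y, lq
        chain[k] = x
    return chain

# Unnormalized standard normal target; estimate pi(f) for f(x) = x^2.
chain = rw_metropolis(lambda x: -0.5 * x**2, x0=0.0, n_steps=50_000)
print(chain.mean(), (chain**2).mean())       # sample averages, approx. 0 and 1
```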

  5. Comparison of $\pi$-reversible Markov chains. To measure the performance of an MCMC sampler with transition kernel $P$ and stationary distribution $\pi$ we consider the asymptotic variance
$$v(f, P) := \lim_{n \to \infty} \operatorname{Var}\left(\sqrt{n}\,\hat{\pi}_n(f)\right) = \lim_{n \to \infty} n \operatorname{Var}\left(\frac{1}{n} \sum_{k=0}^{n-1} f(X_k)\right).$$
For given $\pi$-reversible kernels $P_0$ and $P_1$ we would like to find easily checked conditions for when $v(f, P_0) \geq v(f, P_1)$ for all $f$ belonging to some large class of objective functions.
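In practice $v(f, P)$ can be estimated from a single chain; a rough batch-means sketch (my addition, reusing the chain from the previous example):

```python
import numpy as np

def batch_means_asvar(fx, n_batches=50):
    """Estimate v(f, P) = lim n * Var(pi_n(f)) by the batch-means method."""
    b = len(fx) // n_batches                 # batch length
    batch_avgs = fx[: b * n_batches].reshape(n_batches, b).mean(axis=1)
    return b * batch_avgs.var(ddof=1)        # batch length * var of batch means

# e.g. asymptotic variance of f(x) = x^2 along the previous chain:
# v_hat = batch_means_asvar(chain**2)
```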

  6. Comparison of $\pi$-reversible Markov chains (cont'd). Definition. Let $P_0$ and $P_1$ be $\pi$-reversible Markov kernels on $(\mathsf{X}, \mathcal{X})$. We say that $P_1$ dominates $P_0$
(i) on the off-diagonal, written $P_1 \succeq P_0$, if for all $(x, A) \in \mathsf{X} \times \mathcal{X}$,
$$P_1(x, A \setminus \{x\}) \geq P_0(x, A \setminus \{x\});$$
(ii) in the covariance ordering, written $P_1 \succeq_{\mathrm{cov}} P_0$, if for all $f \in L^2(\pi)$,
$$\iint f(x) f(y)\,P_1(x, dy)\,\pi(dx) \leq \iint f(x) f(y)\,P_0(x, dy)\,\pi(dx).$$

  7. Comparison of $\pi$-reversible Markov chains (cont'd). With these definitions the following chain of implications holds. Theorem (P. H. Peskun [2] and L. Tierney [3]). Let $P_0$ and $P_1$ be $\pi$-reversible Markov kernels on $(\mathsf{X}, \mathcal{X})$. Then
$$P_1 \succeq P_0 \;\Rightarrow\; P_1 \succeq_{\mathrm{cov}} P_0 \;\Rightarrow\; v(f, P_0) \geq v(f, P_1) \quad \forall f \in L^2(\pi).$$
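On a finite state space the theorem can be checked numerically: the covariance-ordering integral becomes $f^\top \operatorname{diag}(\pi)\, P f$, and $v(f, P)$ has a closed form via the fundamental matrix. A small sketch with two hand-built $\pi$-reversible kernels (my example, not from the slides):

```python
import numpy as np

pi = np.array([0.2, 0.3, 0.5])                  # stationary distribution
# pi-reversible kernels built from symmetric flows pi_i * P_ij; P1 doubles
# the off-diagonal flow of P0, so P1 dominates P0 on the off-diagonal.
P0 = np.array([[0.60, 0.15, 0.25],
               [0.10, 0.70, 0.20],
               [0.10, 0.12, 0.78]])
P1 = np.array([[0.20, 0.30, 0.50],
               [0.20, 0.40, 0.40],
               [0.20, 0.24, 0.56]])
D, I = np.diag(pi), np.eye(3)
f = np.array([1.0, -1.0, 2.0])
fbar = f - pi @ f                               # centered: pi(fbar) = 0

def cov_term(P):                                # integral f(x) f(y) P(x,dy) pi(dx)
    return f @ D @ P @ f

def asvar(P):                                   # v(f, P) via fundamental matrix
    Z = np.linalg.inv(I - P + np.outer(np.ones(3), pi))
    return fbar @ D @ (2 * Z - I) @ fbar

print(np.all(P1 >= P0 - I))                     # off-diagonal dominance: True
print(cov_term(P1) <= cov_term(P0))             # covariance ordering:    True
print(asvar(P0) >= asvar(P1))                   # smaller asympt. var.:   True
```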

  8. Outline (repeated as section divider: Main result).

  9. Comparison of inhomogeneous chains. Theorem. Let $(X_n^{(0)})_n$ and $(X_n^{(1)})_n$ be Markov chains evolving as
$$\pi \sim X_0^{(i)} \xrightarrow{P_i} X_1^{(i)} \xrightarrow{Q_i} X_2^{(i)} \xrightarrow{P_i} X_3^{(i)} \xrightarrow{Q_i} \cdots$$
where (i) the $P_i$ and $Q_i$ are $\pi$-reversible, and (ii) $P_1 \succeq_{\mathrm{cov}} P_0$ and $Q_1 \succeq_{\mathrm{cov}} Q_0$. Then for all $f \in L^2(\pi)$ satisfying a weak summability condition,
$$\lim_{n \to \infty} n \operatorname{Var}\left(\frac{1}{n} \sum_{k=0}^{n-1} f\big(X_k^{(1)}\big)\right) \leq \lim_{n \to \infty} n \operatorname{Var}\left(\frac{1}{n} \sum_{k=0}^{n-1} f\big(X_k^{(0)}\big)\right).$$
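The inequality can be illustrated by simulation; a sketch (my construction) reusing the dominated pair $P_0, P_1$ from the previous example and, for brevity, taking $Q_i = P_i$:

```python
import numpy as np

rng = np.random.default_rng(1)
pi = np.array([0.2, 0.3, 0.5])
P0 = np.array([[0.60, 0.15, 0.25], [0.10, 0.70, 0.20], [0.10, 0.12, 0.78]])
P1 = np.array([[0.20, 0.30, 0.50], [0.20, 0.40, 0.40], [0.20, 0.24, 0.56]])
Q0, Q1 = P0, P1                          # any dominated pair works; reuse P_i
f = np.array([1.0, -1.0, 2.0])

def scaled_var(P, Q, n=1000, reps=2000):
    """Monte Carlo estimate of n * Var(pi_n(f)) for the alternating chain."""
    means = np.empty(reps)
    for r in range(reps):
        x = rng.choice(3, p=pi)          # X_0 ~ pi
        total = 0.0
        for k in range(n):
            total += f[x]
            K = P if k % 2 == 0 else Q   # alternate P, Q, P, Q, ...
            x = rng.choice(3, p=K[x])
        means[r] = total / n
    return n * means.var()

print(scaled_var(P0, Q0))                # larger
print(scaled_var(P1, Q1))                # smaller, as the theorem predicts
```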

  10. The summability condition. The result holds for all $f \in L^2(\pi)$ such that for $i \in \{0, 1\}$,
$$\sum_{k=1}^{\infty} \left( \left|\operatorname{Cov}\big(f(X_0^{(i)}), f(X_k^{(i)})\big)\right| + \left|\operatorname{Cov}\big(f(X_1^{(i)}), f(X_{k+1}^{(i)})\big)\right| \right) < \infty. \quad (*)$$
The condition $(*)$
- implies that the asymptotic variances exist and are finite;
- holds when each product $P_i Q_i$ is $V$-geometrically ergodic;
- is not a necessary condition.

  11. Two slides on the proof. The proof of L. Tierney [3] uses spectral theory. Under the summability condition it is, however, possible to replicate Tierney's result without spectral theory by
1. showing that for $\pi$-reversible kernels $P$,
$$v(f, P) = \pi f^2 - (\pi f)^2 + 2 \sum_{n=1}^{\infty} \operatorname{Cov}_P\big(f(X_0), f(X_n)\big),$$
2. setting, using the notation $\langle f, g \rangle := \int f(x) g(x)\,\pi(dx)$,
$$P_\alpha := (1 - \alpha) P_0 + \alpha P_1 \quad \text{and} \quad w_\lambda(f, P_\alpha) := \sum_{n=1}^{\infty} \lambda^n \langle f, P_\alpha^n f \rangle,$$

  12. Two slides on the proof (cont'd).
3. and showing that for all $\lambda \in (0, 1)$, $\alpha \mapsto w_\lambda(f, P_\alpha)$ is decreasing, yielding $w_\lambda(f, P_1) \leq w_\lambda(f, P_0)$. To this aim, show that there is a function $f_\alpha^* \in L^2(\pi)$ such that
$$\frac{\partial}{\partial \alpha} w_\lambda(f, P_\alpha) = \langle f_\alpha^*, \lambda (P_1 - P_0) f_\alpha^* \rangle = \lambda \left( \langle f_\alpha^*, P_1 f_\alpha^* \rangle - \langle f_\alpha^*, P_0 f_\alpha^* \rangle \right) \leq 0.$$
4. Finally, apply the dominated convergence theorem as $\lambda \to 1$.
Very roughly, the proof of the inhomogeneous case follows the same lines, splitting sums into even and odd terms (with distributions governed by $P_i$ and $Q_i$, respectively).

  13. Outline (repeated as section divider: Applications).

  14. Outline (repeated as section divider: data augmentation-type MCMC methods).

  15. Data augmentation. In many applications the density of the target $\pi$ is analytically intractable or too expensive to evaluate. A common way of coping with this problem is to augment the data by an auxiliary variable $U$ and consider the extended target
$$\tilde{\pi}(dy \times du) := \pi(dy)\,R(y, du),$$
where $R$ is some Markov kernel; the extended target has the desired distribution $\pi$ as marginal distribution. A typical example is Bayesian inference in models with latent variables (such as HMMs and mixture models), where $\pi$ and $U$ play the roles of the posterior and the unobserved data, respectively.
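For concreteness (my illustration, not from the slides): for a two-component Gaussian mixture target, the component label is a natural auxiliary variable, and $R(y, \cdot)$ can be taken as the conditional law of the label given $y$; the $Y$-marginal of $\tilde{\pi}$ is then $\pi$ by construction.

```python
import numpy as np

rng = np.random.default_rng(2)
w, mu = np.array([0.3, 0.7]), np.array([-2.0, 2.0])   # mixture weights, means

def R(y):
    """R(y, du): conditional probability of each component label given y."""
    p = w * np.exp(-0.5 * (y - mu) ** 2)
    return p / p.sum()

# One draw (Y, U) from the extended target pi~(dy x du) = pi(dy) R(y, du):
y = rng.normal(mu[rng.choice(2, p=w)])   # Y ~ pi, using its mixture form
u = rng.choice(2, p=R(y))                # U | Y ~ R(Y, .)
# Marginally Y ~ pi, whatever kernel R is chosen.
```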

  16. Metropolis-Hastings (MH) for data augmentation. Algorithm (Freeze). Given $(Y_k^{(1)}, U_k)$,
1. generate $Y^* \sim s\big((Y_k^{(1)}, U_k), y\big)\,dy$,
2. generate $U^* \sim t\big((Y_k^{(1)}, U_k, Y^*), u\big)\,du$,
3. let
$$(Y_{k+1}^{(1)}, U_{k+1}) \leftarrow \begin{cases} (Y^*, U^*) & \text{w. pr. } \alpha(Y_k^{(1)}, U_k, Y^*, U^*), \\ (Y_k^{(1)}, U_k) & \text{otherwise.} \end{cases}$$
Here $\alpha(Y_k^{(1)}, U_k, Y^*, U^*)$ equals
$$1 \wedge \frac{\pi(Y^*)\,r(Y^*, U^*)\,s\big((Y^*, U^*), Y_k^{(1)}\big)\,t\big((Y^*, U^*, Y_k^{(1)}), U_k\big)}{\pi(Y_k^{(1)})\,r(Y_k^{(1)}, U_k)\,s\big((Y_k^{(1)}, U_k), Y^*\big)\,t\big((Y_k^{(1)}, U_k, Y^*), U^*\big)}.$$
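A schematic Python rendering of one Freeze step (my sketch; the callables log_pi, log_r, log_s, log_t and the samplers sample_s, sample_t are hypothetical stand-ins for the densities $\pi, r, s, t$ on the slide):

```python
import numpy as np

rng = np.random.default_rng(3)

def freeze_step(y, u, log_pi, log_r, log_s, log_t, sample_s, sample_t):
    """One MH 'Freeze' step targeting pi~(dy x du) = pi(dy) r(y, u) du."""
    y_new = sample_s(y, u)                       # Y* ~ s((Y, U), .)
    u_new = sample_t(y, u, y_new)                # U* ~ t((Y, U, Y*), .)
    log_alpha = (log_pi(y_new) + log_r(y_new, u_new)        # numerator
                 + log_s(y_new, u_new, y) + log_t(y_new, u_new, y, u)
                 - log_pi(y) - log_r(y, u)                  # denominator
                 - log_s(y, u, y_new) - log_t(y, u, y_new, u_new))
    if np.log(rng.uniform()) < log_alpha:        # accept w. pr. 1 ^ exp(...)
        return y_new, u_new                      # move
    return y, u                                  # freeze: keep (Y, U)
```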

  17. The systematic refreshment algorithm. In some cases it is possible to sample from $R(\cdot, du) = r(\cdot, u)\,du$; then $U_k$ can be systematically "refreshed" by a random draw $\tilde{U}$ from $R$. Algorithm (Systematic Refreshment). Given $Y_k^{(2)}$,
1. generate $\tilde{U} \sim r\big(Y_k^{(2)}, u\big)\,du$,
2. generate $Y^* \sim s\big((Y_k^{(2)}, \tilde{U}), y\big)\,dy$,
3. generate $U \sim t\big((Y_k^{(2)}, \tilde{U}, Y^*), u\big)\,du$,
4. let
$$Y_{k+1}^{(2)} \leftarrow \begin{cases} Y^* & \text{w. pr. } \alpha(Y_k^{(2)}, \tilde{U}, Y^*, U), \\ Y_k^{(2)} & \text{otherwise.} \end{cases}$$
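The corresponding sketch of one systematic-refreshment step (same hypothetical callables as above, plus sample_r for drawing from $R$); note that only $Y$ is carried forward between iterations:

```python
import numpy as np

rng = np.random.default_rng(4)

def refresh_step(y, log_pi, log_r, log_s, log_t, sample_r, sample_s, sample_t):
    """One systematic-refreshment step: U is redrawn from R before proposing."""
    u_tilde = sample_r(y)                        # refresh: U~ ~ r(Y, .)
    y_new = sample_s(y, u_tilde)                 # Y* ~ s((Y, U~), .)
    u_new = sample_t(y, u_tilde, y_new)          # U  ~ t((Y, U~, Y*), .)
    log_alpha = (log_pi(y_new) + log_r(y_new, u_new)
                 + log_s(y_new, u_new, y) + log_t(y_new, u_new, y, u_tilde)
                 - log_pi(y) - log_r(y, u_tilde)
                 - log_s(y, u_tilde, y_new) - log_t(y, u_tilde, y_new, u_new))
    if np.log(rng.uniform()) < log_alpha:        # accept w. pr. alpha
        return y_new
    return y                                     # otherwise keep Y
```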
