Advanced Simulation - Lecture 5
George Deligiannidis
February 1st, 2016
Irreducibility and aperiodicity
Definition. Given a distribution µ over X, a Markov chain is µ-irreducible if

∀x ∈ X, ∀A : µ(A) > 0, ∃t ∈ N such that K^t(x, A) > 0.

A µ-irreducible Markov chain with transition kernel K is periodic if there exists a partition X_1, ..., X_d of the state space, for some d ≥ 2, such that for all i, j, t, s:

P(X_{t+s} ∈ X_j | X_t ∈ X_i) = 1 if j = i + s mod d, and 0 otherwise.

Otherwise the chain is aperiodic.
Lecture 5 Continuous State Markov Chains 2 / 40
Recurrence and Harris Recurrence
For any measurable set A of X, let

η_A = ∑_{k=1}^∞ I_A(X_k) = # of visits to A.

Definition. A µ-irreducible Markov chain is recurrent if for any measurable set A ⊂ X with µ(A) > 0, E_x(η_A) = ∞ for all x ∈ A. A µ-irreducible Markov chain is Harris recurrent if for any measurable set A ⊂ X with µ(A) > 0, P_x(η_A = ∞) = 1 for all x ∈ X. Harris recurrence is stronger than recurrence.
Invariant Distribution and Reversibility
Definition. A distribution with density π is invariant or stationary for a Markov kernel K if

∫_X π(x) K(x, y) dx = π(y).

A Markov kernel K is π-reversible if for every bounded measurable function f,

∫∫ f(x, y) π(x) K(x, y) dx dy = ∫∫ f(y, x) π(x) K(x, y) dx dy.
Detailed balance
In practice it is easier to check the detailed balance condition:

∀x, y ∈ X: π(x) K(x, y) = π(y) K(y, x).

Lemma. If detailed balance holds, then π is invariant for K and K is π-reversible.

Example: the Gaussian AR(1) process X_t = ρ X_{t−1} + ε_t with ε_t ∼ N(0, τ²) is π-reversible and π-invariant for π(x) = N(x; 0, τ²/(1 − ρ²)) when |ρ| < 1.
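The detailed balance condition is easy to verify numerically for the Gaussian AR(1) kernel K(x, y) = N(y; ρx, τ²). A minimal sketch (the values of ρ, τ² and the grid are illustrative choices):

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

rho, tau2 = 0.7, 1.0
stat_var = tau2 / (1 - rho ** 2)   # variance of pi(x) = N(0, tau^2 / (1 - rho^2))

def pi(x):
    return normal_pdf(x, 0.0, stat_var)

def K(x, y):                        # AR(1) kernel: y | x ~ N(rho * x, tau^2)
    return normal_pdf(y, rho * x, tau2)

# detailed balance: pi(x) K(x, y) = pi(y) K(y, x) for all x, y
grid = [i / 10 for i in range(-30, 31)]
max_err = max(abs(pi(x) * K(x, y) - pi(y) * K(y, x)) for x in grid for y in grid)
print(max_err)   # numerically zero
```

This works because π(x)K(x, y) ∝ exp(−(x² − 2ρxy + y²)/(2τ²)) is symmetric in x and y.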
Checking for recurrence
It is often straightforward to check irreducibility, or to exhibit an invariant measure, but not to check recurrence directly.

Proposition. If the chain is µ-irreducible and admits an invariant measure, then the chain is recurrent.

Remark: a chain that is µ-irreducible and admits an invariant probability measure is called positive.
Law of Large Numbers
Theorem. If K is a π-irreducible, π-invariant Markov kernel, then for any π-integrable function ϕ : X → R:

lim_{t→∞} (1/t) ∑_{i=1}^t ϕ(X_i) = ∫_X ϕ(x) π(x) dx

almost surely, for π-almost all starting values x.

Theorem. If K is a π-irreducible, π-invariant, Harris recurrent Markov kernel, then the same convergence holds almost surely for any starting value x.
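The law of large numbers can be illustrated on the Gaussian AR(1) chain, for which ∫ x² π(x) dx = τ²/(1 − ρ²) is known exactly. A sketch (the parameters and chain length are arbitrary choices):

```python
import random

random.seed(1)
rho, tau = 0.5, 1.0
target = tau ** 2 / (1 - rho ** 2)   # E_pi[X^2] = 4/3 for these parameters

x, total, n = 0.0, 0.0, 200_000
for _ in range(n):
    x = rho * x + random.gauss(0.0, tau)   # one step of the AR(1) kernel
    total += x ** 2                        # ergodic average of phi(x) = x^2
print(total / n, target)                   # the two values should be close
```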
Lecture 5 Limit theorems 7 / 40
Convergence
Theorem. Suppose the kernel K is π-irreducible, π-invariant and aperiodic. Then

lim_{t→∞} ∫_X |K^t(x, y) − π(y)| dy = 0

for π-almost all starting values x.

Under some additional conditions, one can prove that a chain is geometrically ergodic, i.e. there exist ρ < 1 and a function M : X → R⁺ such that for all measurable sets A:

|K^n(x, A) − π(A)| ≤ M(x) ρ^n for all n ∈ N.

In other words, we can obtain a rate of convergence.
Central Limit Theorem
Theorem. Under regularity conditions, for a Harris recurrent, π-invariant Markov chain,

√t ( (1/t) ∑_{i=1}^t ϕ(X_i) − ∫_X ϕ(x) π(x) dx ) →_D N(0, σ²(ϕ)) as t → ∞,

where the asymptotic variance can be written

σ²(ϕ) = V_π[ϕ(X_1)] + 2 ∑_{k=2}^∞ Cov_π[ϕ(X_1), ϕ(X_k)].

This formula shows that positive correlations increase the asymptotic variance, compared to i.i.d. samples for which the variance would be V_π(ϕ(X)).
Central Limit Theorem
Example: for the Gaussian AR(1) model, π(x) = N(x; 0, τ²/(1 − ρ²)) for |ρ| < 1, and

Cov(X_1, X_k) = ρ^{k−1} V[X_1] = ρ^{k−1} τ²/(1 − ρ²).

Therefore, with ϕ(x) = x,

σ²(ϕ) = τ²/(1 − ρ²) (1 + 2 ∑_{k=1}^∞ ρ^k) = τ²/(1 − ρ²) · (1 + ρ)/(1 − ρ) = τ²/(1 − ρ)²,

which increases as ρ → 1.
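The value σ²(ϕ) = τ²/(1 − ρ)² can be checked by simulation with a simple batch-means estimate: the variance of block means, scaled by the block length, approaches the asymptotic variance. A sketch (the block sizes and parameters are arbitrary choices):

```python
import random

random.seed(2)
rho, tau = 0.5, 1.0
sigma2_theory = tau ** 2 / (1 - rho) ** 2   # = 4 for rho = 0.5

x = 0.0
for _ in range(1000):                       # burn-in toward stationarity
    x = rho * x + random.gauss(0.0, tau)

L, m = 1000, 1000                           # block length, number of blocks
block_means = []
for _ in range(m):
    s = 0.0
    for _ in range(L):
        x = rho * x + random.gauss(0.0, tau)
        s += x
    block_means.append(s / L)

mean = sum(block_means) / m
sigma2_hat = L * sum((b - mean) ** 2 for b in block_means) / (m - 1)
print(sigma2_hat, sigma2_theory)            # batch-means estimate vs 4
```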
Markov chain Monte Carlo
We are interested in sampling from a distribution π, for instance a posterior distribution in a Bayesian framework. Markov chains with π as invariant distribution can be constructed to approximate expectations with respect to π. For example, the Gibbs sampler generates a Markov chain targeting π defined on Rd using the full conditionals π(xi | x1, . . . , xi−1, xi+1, . . . , xd).
Lecture 5 MCMC 11 / 40
Gibbs Sampling
Assume you are interested in sampling from π(x) = π(x_1, x_2, ..., x_d), x ∈ R^d.

Notation: x_{−i} := (x_1, ..., x_{i−1}, x_{i+1}, ..., x_d).

Systematic scan Gibbs sampler. Let (X_1^(1), ..., X_d^(1)) be the initial state, then iterate for t = 2, 3, ...:

1. Sample X_1^(t) ∼ π_{X1|X−1}(· | X_2^(t−1), ..., X_d^(t−1)).
...
j. Sample X_j^(t) ∼ π_{Xj|X−j}(· | X_1^(t), ..., X_{j−1}^(t), X_{j+1}^(t−1), ..., X_d^(t−1)).
...
d. Sample X_d^(t) ∼ π_{Xd|X−d}(· | X_1^(t), ..., X_{d−1}^(t)).
Lecture 5 MCMC Gibbs Sampling 12 / 40
Gibbs Sampling
Is the joint distribution π uniquely specified by the conditional distributions π_{Xi|X−i}? Does the Gibbs sampler provide a Markov chain with the correct stationary distribution π? If yes, does the Markov chain converge towards this invariant distribution? It will turn out to be the case under some mild conditions.
Hammersley-Clifford Theorem I
Theorem. Consider a distribution whose density π(x_1, x_2, ..., x_d) is such that supp(π) = ⊗_{i=1}^d supp(π_{Xi}). Then for any (z_1, ..., z_d) ∈ supp(π), we have

π(x_1, x_2, ..., x_d) ∝ ∏_{j=1}^d [ π_{Xj|X−j}(x_j | x_{1:j−1}, z_{j+1:d}) / π_{Xj|X−j}(z_j | x_{1:j−1}, z_{j+1:d}) ].

Remark: the support condition above is the positivity condition. Equivalently, if π_{Xi}(x_i) > 0 for i = 1, ..., d, then π(x_1, ..., x_d) > 0.
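The theorem can be checked numerically on a toy example. Below is a sketch for a standard bivariate normal with correlation r = 0.6 (an arbitrary illustrative choice): the Hammersley-Clifford product, divided by the true joint density, should be constant in (x_1, x_2).

```python
import math

def normal_pdf(x, mean, var):
    return math.exp(-(x - mean) ** 2 / (2 * var)) / math.sqrt(2 * math.pi * var)

r = 0.6                                  # correlation of a standard bivariate normal

def joint(x1, x2):                       # density of N(0, [[1, r], [r, 1]])
    det = 1 - r ** 2
    q = (x1 ** 2 - 2 * r * x1 * x2 + x2 ** 2) / det
    return math.exp(-q / 2) / (2 * math.pi * math.sqrt(det))

def cond(x, given):                      # X_i | X_j = given ~ N(r * given, 1 - r^2)
    return normal_pdf(x, r * given, 1 - r ** 2)

z1, z2 = 0.3, -0.8                       # any point in the support works
def hc_product(x1, x2):                  # Hammersley-Clifford product for d = 2
    return (cond(x1, z2) / cond(z1, z2)) * (cond(x2, x1) / cond(z2, x1))

ratios = [hc_product(a, b) / joint(a, b) for a in (-2, 0.5, 1.7) for b in (-1.2, 0.4, 2.1)]
print(ratios)   # all equal: the product recovers the joint up to a constant
```

Indeed, the product telescopes to π(x_1, x_2)/π(z_1, z_2), so the constant is 1/π(z_1, z_2).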
Proof of Hammersley-Clifford Theorem
Proof. We have

π(x_{1:d−1}, x_d) = π_{Xd|X−d}(x_d | x_{1:d−1}) π(x_{1:d−1}),
π(x_{1:d−1}, z_d) = π_{Xd|X−d}(z_d | x_{1:d−1}) π(x_{1:d−1}).

Therefore

π(x_{1:d}) = π(x_{1:d−1}, z_d) · π(x_{1:d−1}, x_d) / π(x_{1:d−1}, z_d)
= π(x_{1:d−1}, z_d) · [π(x_{1:d−1}, x_d)/π(x_{1:d−1})] / [π(x_{1:d−1}, z_d)/π(x_{1:d−1})]
= π(x_{1:d−1}, z_d) · π_{Xd|X−d}(x_d | x_{1:d−1}) / π_{Xd|X−d}(z_d | x_{1:d−1}).
Proof (continued). Similarly, we have

π(x_{1:d−1}, z_d) = π(x_{1:d−2}, z_{d−1}, z_d) · π(x_{1:d−1}, z_d) / π(x_{1:d−2}, z_{d−1}, z_d)
= π(x_{1:d−2}, z_{d−1}, z_d) · [π(x_{1:d−1}, z_d)/π(x_{1:d−2}, z_d)] / [π(x_{1:d−2}, z_{d−1}, z_d)/π(x_{1:d−2}, z_d)]
= π(x_{1:d−2}, z_{d−1}, z_d) · π_{Xd−1|X−(d−1)}(x_{d−1} | x_{1:d−2}, z_d) / π_{Xd−1|X−(d−1)}(z_{d−1} | x_{1:d−2}, z_d),

hence

π(x_{1:d}) = π(x_{1:d−2}, z_{d−1}, z_d) · [π_{Xd−1|X−(d−1)}(x_{d−1} | x_{1:d−2}, z_d) / π_{Xd−1|X−(d−1)}(z_{d−1} | x_{1:d−2}, z_d)] × [π_{Xd|X−d}(x_d | x_{1:d−1}) / π_{Xd|X−d}(z_d | x_{1:d−1})].
Proof (continued). Since z ∈ supp(π), we have π_{Xi}(z_i) > 0 for all i; we may also assume π_{Xi}(x_i) > 0 for all i. By the positivity condition, all the conditional densities introduced above are then positive, since

π_{Xj|X−j}(x_j | x_1, ..., x_{j−1}, z_{j+1}, ..., z_d) ∝ π(x_1, ..., x_{j−1}, x_j, z_{j+1}, ..., z_d) > 0,

so every ratio in the product is well defined. Iterating the previous step over j = d, d − 1, ..., 1 yields the theorem.
Example: Non-Integrable Target
Consider the following conditionals on R⁺:

π_{X1|X2}(x_1 | x_2) = x_2 exp(−x_2 x_1), π_{X2|X1}(x_2 | x_1) = x_1 exp(−x_1 x_2).

We might expect that these full conditionals define a joint probability density π(x_1, x_2). Hammersley-Clifford would give

π(x_1, x_2) ∝ [π_{X1|X2}(x_1 | z_2) / π_{X1|X2}(z_1 | z_2)] · [π_{X2|X1}(x_2 | x_1) / π_{X2|X1}(z_2 | x_1)]
= [z_2 exp(−z_2 x_1) · x_1 exp(−x_1 x_2)] / [z_2 exp(−z_2 z_1) · x_1 exp(−x_1 z_2)]
∝ exp(−x_1 x_2).

However ∫∫ exp(−x_1 x_2) dx_1 dx_2 = ∞, so π_{X1|X2}(x_1 | x_2) = x_2 exp(−x_2 x_1) and π_{X2|X1}(x_2 | x_1) = x_1 exp(−x_1 x_2) are not compatible: no joint probability density admits them as its full conditionals.
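One way to see the non-integrability numerically: the mass of exp(−x_1 x_2) over the square [0, M]² grows without bound as M increases (roughly like 2 log M). A sketch using the inner integral in closed form and a simple trapezoidal rule (grid sizes are arbitrary choices):

```python
import math

def mass_on_square(M, n=200_000):
    # integral of exp(-x1*x2) over [0, M]^2;
    # the inner integral over x2 is (1 - exp(-M*x1)) / x1,
    # integrated over x1 on a log-spaced grid (integrand is bounded by M near 0)
    xs = [10 ** (-8 + (math.log10(M) + 8) * i / n) for i in range(n + 1)]
    f = [(1 - math.exp(-M * x)) / x for x in xs]
    return sum((xs[i + 1] - xs[i]) * (f[i] + f[i + 1]) / 2 for i in range(n))

masses = [mass_on_square(M) for M in (10, 100, 1000)]
print(masses)   # keeps growing with M, so the total mass is infinite
```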
Example: Positivity condition violated
Figure: Gibbs sampling targeting π(x, y) ∝ 1_{[−1,0]×[−1,0] ∪ [0,1]×[0,1]}(x, y). Started in one square, the chain can never reach the other, so it is not π-irreducible.
Invariance of the Gibbs sampler I
The kernel of the Gibbs sampler (case d = 2) is

K(x^(t−1), x^(t)) = π_{X1|X2}(x_1^(t) | x_2^(t−1)) π_{X2|X1}(x_2^(t) | x_1^(t)).

Case d > 2:

K(x^(t−1), x^(t)) = ∏_{j=1}^d π_{Xj|X−j}(x_j^(t) | x_{1:j−1}^(t), x_{j+1:d}^(t−1)).

Proposition. The systematic scan Gibbs sampler kernel admits π as invariant distribution.
Invariance of the Gibbs sampler II
Proof for d = 2. We have

∫ K(x, y) π(x) dx = ∫∫ π(y_2 | y_1) π(y_1 | x_2) π(x_1, x_2) dx_1 dx_2
= π(y_2 | y_1) ∫ π(y_1 | x_2) π(x_2) dx_2
= π(y_2 | y_1) π(y_1) = π(y_1, y_2) = π(y).
Irreducibility and Recurrence
Proposition. Assume π satisfies the positivity condition; then the Gibbs sampler yields a π-irreducible and recurrent Markov chain.

Proof.
Irreducibility. Let X ⊂ R^d be such that π(X) = 1. Write K for the kernel and let A ⊂ X be such that π(A) > 0. Then for any x ∈ X,

K(x, A) = ∫_A K(x, y) dy = ∫_A π_{X1|X−1}(y_1 | x_2, ..., x_d) × ··· × π_{Xd|X−d}(y_d | y_1, ..., y_{d−1}) dy.
Proof (continued). Thus if K(x, A) = 0 for some x ∈ X and some A with π(A) > 0, we must have

π_{X1|X−1}(y_1 | x_2, ..., x_d) × ··· × π_{Xd|X−d}(y_d | y_1, ..., y_{d−1}) = 0

for π-almost all y = (y_1, ..., y_d) ∈ A. By the Hammersley-Clifford theorem we must then also have

π(y_1, ..., y_d) ∝ ∏_{j=1}^d π_{Xj|X−j}(y_j | y_{1:j−1}, x_{j+1:d}) / π_{Xj|X−j}(x_j | y_{1:j−1}, x_{j+1:d}) = 0

for almost all y = (y_1, ..., y_d) ∈ A, and thus π(A) = 0, a contradiction.
Proof (continued).
Recurrence. Recurrence follows from irreducibility and the fact that π is invariant, by the earlier proposition.
CLT for Gibbs Sampler
Theorem. Assume the positivity condition is satisfied. Then for any π-integrable function ϕ : X → R:

lim_{t→∞} (1/t) ∑_{i=1}^t ϕ(X^(i)) = ∫_X ϕ(x) π(x) dx

for π-almost all starting values X^(1).
Example: Bivariate Normal Distribution
Let X := (X_1, X_2) ∼ N(µ, Σ) where µ = (µ_1, µ_2) and

Σ = ( σ_1²  ρ
      ρ    σ_2² ).

The Gibbs sampler proceeds as follows in this case:

1. Sample X_1^(t) ∼ N( µ_1 + ρ/σ_2² (X_2^(t−1) − µ_2), σ_1² − ρ²/σ_2² ).
2. Sample X_2^(t) ∼ N( µ_2 + ρ/σ_1² (X_1^(t) − µ_1), σ_2² − ρ²/σ_1² ).

By proceeding this way, we generate a Markov chain (X^(t)) whose successive samples are correlated. If successive values of X^(t) are strongly correlated, then we say that the Markov chain mixes slowly.
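The two-step sampler above fits in a few lines. A minimal sketch in pure Python (µ = 0 and σ_1 = σ_2 = 1 are illustrative choices, so ρ is the correlation; chain length is arbitrary):

```python
import random

random.seed(3)
rho = 0.9                                  # covariance = correlation since sigma1 = sigma2 = 1
cond_sd = (1 - rho ** 2) ** 0.5            # sd of each full conditional

x1, x2 = 0.0, 0.0
xs1, xs2 = [], []
for _ in range(100_000):
    x1 = random.gauss(rho * x2, cond_sd)   # X1 | X2 ~ N(rho * x2, 1 - rho^2)
    x2 = random.gauss(rho * x1, cond_sd)   # X2 | X1 ~ N(rho * x1, 1 - rho^2)
    xs1.append(x1)
    xs2.append(x2)

n = len(xs1)
m1, m2 = sum(xs1) / n, sum(xs2) / n
corr = (sum(a * b for a, b in zip(xs1, xs2)) / n - m1 * m2) / (
    (sum(a * a for a in xs1) / n - m1 ** 2) ** 0.5
    * (sum(b * b for b in xs2) / n - m2 ** 2) ** 0.5
)
print(m1, m2, corr)   # means near 0, correlation near rho
```

With ρ = 0.9 the first-coordinate chain is an AR(1) with coefficient ρ² = 0.81, which is why the figures below show slow mixing for large ρ.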
Bivariate Normal Distribution
Figure: Case where ρ = 0.1, first 100 steps.
Bivariate Normal Distribution
Figure: Case where ρ = 0.99, first 100 steps.
Bivariate Normal Distribution
Figure: Histograms of the first component of the chain after 1,000 iterations. Small ρ on the left, large ρ on the right.
Bivariate Normal Distribution
Figure: Histograms of the first component of the chain after 10,000 iterations. Small ρ on the left, large ρ on the right.
Bivariate Normal Distribution
Figure: Histograms of the first component of the chain after 100,000 iterations. Small ρ on the left, large ρ on the right.
Gibbs Sampling and Auxiliary Variables
Gibbs sampling requires sampling from π_{Xj|X−j}. In many scenarios, we can introduce a set of auxiliary variables Z_1, ..., Z_p and work with an "extended" distribution of joint density π(x_1, ..., x_d, z_1, ..., z_p) such that

∫ π(x_1, ..., x_d, z_1, ..., z_p) dz_1 ... dz_p = π(x_1, ..., x_d)

and whose full conditionals are easy to sample. Examples: mixture models, capture-recapture models, Tobit models, probit models, etc.
Mixtures of Normals
Figure: Density of a mixture of three normal populations, together with the three component densities.

Independent data y_1, ..., y_n:

Y_i | θ ∼ ∑_{k=1}^K p_k N(µ_k, σ_k²),

where θ = (p_1, ..., p_K, µ_1, ..., µ_K, σ_1², ..., σ_K²).
Bayesian Model
Likelihood function:

p(y_1, ..., y_n | θ) = ∏_{i=1}^n p(y_i | θ) = ∏_{i=1}^n ∑_{k=1}^K p_k / √(2πσ_k²) exp( −(y_i − µ_k)² / (2σ_k²) ).

Let us fix K = 2, σ_k² = 1 and p_k = 1/K for all k.

Prior model:

p(θ) = ∏_{k=1}^K p(µ_k), where µ_k ∼ N(α_k, β_k).

Let us fix α_k = 0, β_k = 1 for all k. It is not obvious how to sample from p(µ_1 | µ_2, y_1, ..., y_n).
Auxiliary Variables for Mixture Models
Associate to each Y_i an auxiliary variable Z_i ∈ {1, ..., K} such that

P(Z_i = k | θ) = p_k and Y_i | Z_i = k, θ ∼ N(µ_k, σ_k²),

so that

p(y_i | θ) = ∑_{k=1}^K P(Z_i = k | θ) N(y_i; µ_k, σ_k²).

The extended posterior is given by

p(θ, z_1, ..., z_n | y_1, ..., y_n) ∝ p(θ) ∏_{i=1}^n P(z_i | θ) p(y_i | z_i, θ).

Gibbs alternately samples from P(z_{1:n} | y_{1:n}, µ_{1:K}) and p(µ_{1:K} | y_{1:n}, z_{1:n}).
Gibbs Sampling for Mixture Model
We have P(z_{1:n} | y_{1:n}, θ) = ∏_{i=1}^n P(z_i | y_i, θ), where

P(z_i | y_i, θ) = P(z_i | θ) p(y_i | z_i, θ) / ∑_{k=1}^K P(z_i = k | θ) p(y_i | z_i = k, θ).

Let n_k = ∑_{i=1}^n 1_{k}(z_i) and n_k ȳ_k = ∑_{i=1}^n y_i 1_{k}(z_i); then

µ_k | z_{1:n}, y_{1:n} ∼ N( n_k ȳ_k / (1 + n_k), 1 / (1 + n_k) ).
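These two conditionals give a complete Gibbs sampler for the fixed-variance, equal-weights mixture above. A minimal sketch (data simulated from the example ½N(−2, 1) + ½N(2, 1); the initial values and chain length are arbitrary choices):

```python
import math
import random

random.seed(4)
# simulated data: 200 points from (1/2) N(-2, 1) + (1/2) N(2, 1)
data = [random.gauss(-2.0 if random.random() < 0.5 else 2.0, 1.0) for _ in range(200)]

mu = [-1.0, 1.0]                      # initial values for the two component means
kept = []
for it in range(2000):
    # step 1: allocations z_i | y_i, mu  (p_k = 1/2 and sigma_k^2 = 1 cancel out)
    z = []
    for y in data:
        w0 = math.exp(-0.5 * (y - mu[0]) ** 2)
        w1 = math.exp(-0.5 * (y - mu[1]) ** 2)
        z.append(0 if random.random() < w0 / (w0 + w1) else 1)
    # step 2: mu_k | z, y ~ N(n_k * ybar_k / (1 + n_k), 1 / (1 + n_k))
    for k in (0, 1):
        nk = sum(1 for zi in z if zi == k)
        sk = sum(y for y, zi in zip(data, z) if zi == k)   # n_k * ybar_k
        mu[k] = random.gauss(sk / (1 + nk), math.sqrt(1.0 / (1 + nk)))
    if it >= 500:                     # discard burn-in
        kept.append(tuple(mu))

post_means = sorted(sum(s[k] for s in kept) / len(kept) for k in (0, 1))
print(post_means)   # close to (-2, 2)
```

Sorting the posterior means sidesteps the label ordering; with well-separated components the chain rarely switches labels in practice.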
Mixtures of Normals
Figure: 200 points sampled from (1/2)N(−2, 1) + (1/2)N(2, 1).
Mixtures of Normals
Figure: Histograms of the parameters µ_1 and µ_2 obtained by 10,000 iterations of Gibbs sampling.
Mixtures of Normals
Figure: Traceplot of the parameters µ_1 and µ_2 over 10,000 iterations of Gibbs sampling.
Gibbs sampling in practice
Many posterior distributions can be automatically decomposed into conditional distributions by computer programs. This is the idea behind BUGS (Bayesian inference Using Gibbs Sampling) and JAGS (Just Another Gibbs Sampler).