Advanced Simulation - Lecture 6 George Deligiannidis February 3rd, - PowerPoint PPT Presentation

Advanced Simulation - Lecture 6 George Deligiannidis February 3rd, 2016

Irreducibility and Recurrence Proposition Assume π satisfies the positivity condition, then the Gibbs sampler yields a π − irreducible and recurrent Markov chain. Proof. Recurrence. Will follow from irreducibility and the fact that π is invariant. Irreducibility. Let X ⊂ R d , such that π ( X ) = 1. Write K for the kernel and let A ⊂ X such that π ( A ) > 0. Then for any x ∈ X � K ( x , A ) = A K ( x , y ) d y � = A π X 1 | − 1 ( y 1 | x 2 , . . . , x d ) × · · · × π X d | X − d ( y d | y 1 , . . . , y d − 1 ) d y . Lecture 6 Gibbs Sampling Asymptotics 2 / 1

Proof. Thus if for some x ∈ X and A with π ( A ) > 0 we have K ( x , A ) = 0, we must have that π X 1 | X − 1 ( y 1 | x 2 , . . . , x d ) × · · · × π X d | X − d ( y d | y 1 , . . . , y d − 1 ) = 0, for π -almost all y = ( y 1 , . . . , y d ) ∈ A . Therefore we must also have that � � � � y 1: j − 1 , x j + 1: d y j π X j | X − j d ∏ � � = 0, π ( y 1 , x 2 , ..., y d ) ∝ � � y 1: j − 1 , x j + 1: d π X j | X − j x j j = 1 for almost all y = ( y 1 , . . . , y d ) ∈ A and thus π ( A ) = 0 obtaining a contradiction. Lecture 6 Gibbs Sampling Asymptotics 3 / 1

LLN for Gibbs Sampler Theorem Assume the positivity condition is satisfied then we have for any integrable function ϕ : X → R : � � X ( i ) � t lim 1 ∑ ϕ = X ϕ ( x ) π ( x ) dx t i = 1 for π − almost all starting value X ( 1 ) . Lecture 6 Gibbs Sampling Asymptotics 4 / 1

Example: Bivariate Normal Distribution Let X : = ( X 1 , X 2 ) ∼ N ( µ , Σ ) where µ = ( µ 1 , µ 2 ) and � σ 2 � ρ Σ = 1 . σ 2 ρ 2 The Gibbs sampler proceeds as follows in this case � � � � 1 Sample X ( t ) X ( t − 1 ) µ 1 + ρ / σ 2 , σ 2 1 − ρ 2 / σ 2 ∼ N − µ 2 1 2 2 2 � � � � 2 Sample X ( t ) X ( t ) µ 2 + ρ / σ 2 , σ 2 2 − ρ 2 / σ 2 ∼ N 1 − µ 1 . 2 1 1 By proceeding this way, we generate a Markov chain X ( t ) whose successive samples are correlated. If successive values of X ( t ) are strongly correlated, then we say that the Markov chain mixes slowly. Lecture 6 Gibbs Sampling Asymptotics 5 / 1

Bivariate Normal Distribution ● ● 2 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 y ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● −2 ● ● −2 0 2 x Figure: Case where ρ = 0.1, first 100 steps. Lecture 6 Gibbs Sampling Asymptotics 6 / 1

Bivariate Normal Distribution 2 ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● 0 ● ● y ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ●● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● ● −2 −2 0 2 x Figure: Case where ρ = 0.99, first 100 steps. Lecture 6 Gibbs Sampling Asymptotics 7 / 1

Bivariate Normal Distribution 0.4 0.6 0.3 density density 0.4 0.2 0.2 0.1 0.0 0.0 −2 0 2 −2 0 2 X X (a) Figure A (b) Figure B Figure: Histogram of the first component of the chain after 1000 iterations. Small ρ on the left, large ρ on the right. Lecture 6 Gibbs Sampling Asymptotics 8 / 1

Bivariate Normal Distribution 0.4 0.4 0.3 0.3 density density 0.2 0.2 0.1 0.1 0.0 0.0 −2 0 2 −2 0 2 X X (a) b (b) b Figure: Histogram of the first component of the chain after 10000 iterations. Small ρ on the left, large ρ on the right. Lecture 6 Gibbs Sampling Asymptotics 9 / 1

Bivariate Normal Distribution 0.4 0.4 0.3 0.3 density density 0.2 0.2 0.1 0.1 0.0 0.0 −2 0 2 −2 0 2 X X (a) Figure A (b) Figure B Figure: Histogram of the first component of the chain after 100000 iterations. Small ρ on the left, large ρ on the right. Lecture 6 Gibbs Sampling Asymptotics 10 / 1

Gibbs Sampling and Auxiliary Variables Gibbs sampling requires sampling from π X j | X − j . In many scenarios, we can include a set of auxiliary variables Z 1 , ..., Z p and have an “extended” distribution of joint density � � π x 1 , ..., x d , z 1 , ..., z p such that � � � dz 1 ... dz d = π ( x 1 , ..., x d ) . π x 1 , ..., x d , z 1 , ..., z p which is such that its full conditionals are easy to sample. Mixture models, Capture-recapture models, Tobit models, Probit models etc. Lecture 6 Gibbs Sampling Asymptotics 11 / 1

Mixtures of Normals mixture population 1 0.5 population 2 population 3 0.4 density 0.3 0.2 0.1 0.0 -2 -1 0 1 Independent data y 1 , ..., y n t K � � ∑ µ k , σ 2 Y i | θ ∼ p k N k k = 1 � � p 1 , ..., p K , µ 1 , ..., µ K , σ 2 1 , ..., σ 2 where θ = . K Lecture 6 Gibbs Sampling Asymptotics 12 / 1

Bayesian Model Likelihood function  2 � � n n K − ( y i − µ k ) p k   . ∏ ∏ ∑ p ( y 1 , ..., y n | θ ) = p ( y i | θ ) = � exp 2 σ 2 2 πσ 2 i = 1 i = 1 k = 1 k k Let’s fix K = 2, σ 2 k = 1 and p k = 1/ K for all k . Prior model K ∏ p ( θ ) = p ( µ k ) k = 1 where µ k ∼ N ( α k , β k ) . Let us fix α k = 0, β k = 1 for all k . Not obvious how to sample p ( µ 1 | µ 2 , y 1 , . . . , y n ) . Lecture 6 Gibbs Sampling Asymptotics 13 / 1

Auxiliary Variables for Mixture Models Associate to each Y i an auxiliary variable Z i ∈ { 1, ..., K } such that � � µ k , σ 2 P ( Z i = k | θ ) = p k and Y i | Z i = k , θ ∼ N k so that K � � y i ; µ k , σ 2 ∑ p ( y i | θ ) = P ( Z i = k ) N k k = 1 The extended posterior is given by n p ( θ , z 1 , ..., z n | y 1 , ..., y n ) ∝ p ( θ ) ∏ P ( z i | θ ) p ( y i | z i , θ ) . i = 1 Gibbs samples alternately P ( z 1: n | y 1: n , µ 1: K ) p ( µ 1: K | y 1: n , z 1: n ) . Lecture 6 Gibbs Sampling Asymptotics 14 / 1

Gibbs Sampling for Mixture Model We have n P ( z 1: n | y 1: n , θ ) = ∏ P ( z i | y i , θ ) i = 1 where P ( z i | θ ) p ( y i | z i , θ ) P ( z i | y i , θ ) = ∑ K k = 1 P ( z i = k | θ ) p ( y i | z i = k , θ ) Let n k = ∑ n i = 1 1 { k } ( z i ) , n k y k = ∑ n i = 1 y i 1 { k } ( z i ) then � n k y k � 1 µ k | z 1: n , y 1: n ∼ N , . 1 + n k 1 + n k Lecture 6 Gibbs Sampling Asymptotics 15 / 1

Mixtures of Normals 0.2 density 0.1 0.0 −5.0 −2.5 0.0 2.5 5.0 observations Figure: 200 points sampled from 1 2 N ( − 2, 1 ) + 1 2 N ( 2, 1 ) . Lecture 6 Gibbs Sampling Asymptotics 16 / 1

Mixtures of Normals 3 density 2 1 0 −2 0 2 µ 1 3 density 2 1 0 −2 0 2 µ 2 Figure: Histogram of the parameters obtained by 10, 000 iterations of Gibbs sampling. Lecture 6 Gibbs Sampling Asymptotics 17 / 1

Mixtures of Normals 2 1 value 0 −1 −2 0 2500 5000 7500 10000 iteration µ 1 µ 2 variable Figure: Traceplot of the parameters obtained by 10, 000 iterations of Gibbs sampling. Lecture 6 Gibbs Sampling Asymptotics 18 / 1

Gibbs sampling in practice Many posterior distributions can be automatically decomposed into conditional distributions by computer programs. This is the idea behind BUGS (Bayesian inference Using Gibbs Sampling), JAGS (Just another Gibbs Sampler). Lecture 6 Gibbs Sampling Asymptotics 19 / 1

Outline Given a target π ( x ) = π ( x 1 , x 2 , ..., x d ) , Gibbs sampling � � � � x − j works by sampling from π X j | X − j x j for j = 1, ..., d . Sampling exactly from one of these full conditionals might be a hard problem itself. Even if it is possible, the Gibbs sampler might converge slowly if components are highly correlated. If the components are not highly correlated then Gibbs sampling performs well, even when d → ∞ , e.g. with an error increasing “only” polynomially with d . Metropolis–Hastings algorithm (1953, 1970) is a more general algorithm that can bypass these problems. Additionally Gibbs can be recovered as a special case. Lecture 6 Gibbs Sampling Asymptotics 20 / 1

Metropolis–Hastings algorithm Target distribution on X = R d of density π ( x ) . Proposal distribution: for any x , x ′ ∈ X , we have q ( x ′ | x ) ≥ 0 and � X q ( x ′ | x ) dx ′ = 1. Starting with X ( 1 ) , for t = 2, 3, ... � ·| X ( t − 1 ) � 1 Sample X ⋆ ∼ q . 2 Compute X ( t − 1 ) � � � X ⋆ �   � � X ⋆ | X ( t − 1 ) � π ( X ⋆ ) q  1,  . = min α � X ( t − 1 ) � � X ⋆ | X ( t − 1 ) � π q � X ⋆ | X ( t − 1 ) � , set X ( t ) = X ⋆ , 3 Sample U ∼ U [ 0,1 ] . If U ≤ α otherwise set X ( t ) = X ( t − 1 ) . Lecture 6 Metropolis–Hastings 21 / 1

Metropolis–Hastings algorithm 2 ● y 0 −2 −2 0 2 x Figure: Metropolis–Hastings on a bivariate Gaussian target. Lecture 6 Metropolis–Hastings 22 / 1

Advanced Simulation - Lecture 6 George Deligiannidis February 3rd, - PowerPoint PPT Presentation

Advanced Simulation - Lecture 6 George Deligiannidis February 3rd, 2016 Irreducibility and Recurrence Proposition Assume satisfies the positivity condition, then the Gibbs sampler yields a irreducible and recurrent Markov chain.

Outline Narcisse Ngada DESY, MKK 1) What is simulation ? 14.05.2014 2) Why simulation ? 3)

Grid simulation (AliEn) Outline GRID simulation Simulation tool Ptolemy (Berkeley)

T7 Cloud Simulation On-demand access simulation December 2016 T7 Cloud Simulation December 2016

Simulation Simulation CHAPTER 1 INTRODUCTION TO SIMULATION 2 MODELING CHAPTER 1 INTRODUCTION

Automated Configuration of Co-simulation with Domain Specific Hints Co-simulation on the rise

MD3311 Simulation Results Joschua Dilly 28.01.2019 MD3311 Simulation Results 2 Introduction

Surgical Simulation: Surgical Simulation: We dont need simulation. We dont need

Why Bayesian methods in Simulation? Simulation Simulation Model Inputs BAYESIAN IDEAS

Statistical Simulation in Python Tushar Shanker Data Scientist DataCamp Statistical Simulation

Chapter 2 Simulation Examples Banks, Carson, Nelson & Nicol Discrete-Event System Simulation

Simulation of stationary processes Timo Tiihonen 2014 Tactical aspects of simulation

Architecture without explicit locks for logic Importance Of Simulation simulation on SIMD

Simulation Monte Carlo Monte Carlo simulation Outcome of a single stochastic simulation run

Malaysian Healthy Ageing Society Plenary Lecture Plenary Lecture Plenary Lecture Plenary

1/88 Presentation: Advanced Techniques 2/88 Presentation: Advanced Techniques 3/88

Advanced Nutrition Course Advanced Nutrition Course 6 Week Advanced Nutrition Live Online

Course Overview and Introduction Probabilistic Graphical Models Sharif University of Technology

19-11-2019 Department of Large Animal Sciences The multivariate normal distribution Anders

Monte Carlo Methods Lecture notes for MAP001169 Based on Script by Martin Sk old adopted by

Multivariate Normal Distribution Max Turgeon STAT 4690Applied Multivariate Analysis Building

Multiparameter models (cont.) Dr. Jarad Niemi STAT 544 - Iowa State University February 21, 2019

Common Probability Distributions Several simple probability distributions are useful in may

CS354 Nathan Sprague October 13, 2020 Probabilistic State Representations: Continuous

MATLAB Tutorial http://www.mathworks.co.kr/help/pdf_doc/matlab/getstart.pdf Introduction of

Advanced Simulation - Lecture 6 George Deligiannidis February 3rd, - PowerPoint PPT Presentation

Advanced Simulation - Lecture 6 George Deligiannidis February 3rd, 2016 Irreducibility and Recurrence Proposition Assume satisfies the positivity condition, then the Gibbs sampler yields a irreducible and recurrent Markov chain.

Outline Narcisse Ngada DESY, MKK 1) What is simulation ? 14.05.2014 2) Why simulation ? 3)

Grid simulation (AliEn) Outline GRID simulation Simulation tool Ptolemy (Berkeley)

T7 Cloud Simulation On-demand access simulation December 2016 T7 Cloud Simulation December 2016

Simulation Simulation CHAPTER 1 INTRODUCTION TO SIMULATION 2 MODELING CHAPTER 1 INTRODUCTION

Automated Configuration of Co-simulation with Domain Specific Hints Co-simulation on the rise

MD3311 Simulation Results Joschua Dilly 28.01.2019 MD3311 Simulation Results 2 Introduction

Surgical Simulation: Surgical Simulation: We dont need simulation. We dont need

Why Bayesian methods in Simulation? Simulation Simulation Model Inputs BAYESIAN IDEAS

Statistical Simulation in Python Tushar Shanker Data Scientist DataCamp Statistical Simulation

Chapter 2 Simulation Examples Banks, Carson, Nelson &amp; Nicol Discrete-Event System Simulation

Simulation of stationary processes Timo Tiihonen 2014 Tactical aspects of simulation

Architecture without explicit locks for logic Importance Of Simulation simulation on SIMD

Simulation Monte Carlo Monte Carlo simulation Outcome of a single stochastic simulation run

Malaysian Healthy Ageing Society Plenary Lecture Plenary Lecture Plenary Lecture Plenary

1/88 Presentation: Advanced Techniques 2/88 Presentation: Advanced Techniques 3/88

Advanced Nutrition Course Advanced Nutrition Course 6 Week Advanced Nutrition Live Online

Course Overview and Introduction Probabilistic Graphical Models Sharif University of Technology

19-11-2019 Department of Large Animal Sciences The multivariate normal distribution Anders

Monte Carlo Methods Lecture notes for MAP001169 Based on Script by Martin Sk old adopted by

Multivariate Normal Distribution Max Turgeon STAT 4690Applied Multivariate Analysis Building

Multiparameter models (cont.) Dr. Jarad Niemi STAT 544 - Iowa State University February 21, 2019

Common Probability Distributions Several simple probability distributions are useful in may

CS354 Nathan Sprague October 13, 2020 Probabilistic State Representations: Continuous

MATLAB Tutorial http://www.mathworks.co.kr/help/pdf_doc/matlab/getstart.pdf Introduction of

Chapter 2 Simulation Examples Banks, Carson, Nelson & Nicol Discrete-Event System Simulation