
Advanced Simulation - Lecture 6

Patrick Rebeschini, January 31st, 2018


Markov chain Monte Carlo

We are interested in sampling from a distribution π, for instance a posterior distribution in a Bayesian framework. Markov chains with π as invariant distribution can be constructed to approximate expectations with respect to π. For example, the Gibbs sampler generates a Markov chain targeting π defined on Rd using the full conditionals π(xi | x1, . . . , xi−1, xi+1, . . . , xd).


Gibbs Sampling

Assume you are interested in sampling from $\pi(x) = \pi(x_1, x_2, \ldots, x_d)$. Notation: $x_{-i} := (x_1, \ldots, x_{i-1}, x_{i+1}, \ldots, x_d)$.

Systematic scan Gibbs sampler. Let $(X_1^{(1)}, \ldots, X_d^{(1)})$ be the initial state, then iterate for $t = 2, 3, \ldots$:

1. Sample $X_1^{(t)} \sim \pi_{X_1 \mid X_{-1}}\big(\cdot \mid X_2^{(t-1)}, \ldots, X_d^{(t-1)}\big)$.

$\vdots$

j. Sample $X_j^{(t)} \sim \pi_{X_j \mid X_{-j}}\big(\cdot \mid X_1^{(t)}, \ldots, X_{j-1}^{(t)}, X_{j+1}^{(t-1)}, \ldots, X_d^{(t-1)}\big)$.

$\vdots$

d. Sample $X_d^{(t)} \sim \pi_{X_d \mid X_{-d}}\big(\cdot \mid X_1^{(t)}, \ldots, X_{d-1}^{(t)}\big)$.
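A minimal sketch of this scheme in Python (an illustration, not the lecture's code): the user supplies a routine that draws from each full conditional $\pi_{X_j \mid X_{-j}}$, and the sampler sweeps through the coordinates in order at every iteration.

```python
import numpy as np

def systematic_scan_gibbs(sample_conditional, x_init, n_iter):
    """Generic systematic scan Gibbs sampler.

    sample_conditional(j, x) must return a draw from pi(x_j | x_{-j})
    given the current values of the other coordinates; it is the
    problem-specific ingredient supplied by the user.
    """
    x = np.array(x_init, dtype=float)
    d = x.size
    chain = np.empty((n_iter, d))
    for t in range(n_iter):
        for j in range(d):              # sweep through coordinates 1, ..., d
            x[j] = sample_conditional(j, x)
        chain[t] = x
    return chain
```

The bivariate normal and mixture examples later in the lecture fit this template once their full conditionals are written down.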


Gibbs Sampling

Is the joint distribution π uniquely specified by its conditional distributions $\pi_{X_i \mid X_{-i}}$? Does the Gibbs sampler provide a Markov chain with the correct stationary distribution π? If yes, does the Markov chain converge towards this invariant distribution? Under some mild conditions, the answer to all three questions turns out to be yes.


Hammersley-Clifford Theorem

Theorem. Consider a distribution whose density $\pi(x_1, x_2, \ldots, x_d)$ is such that $\mathrm{supp}(\pi) = \bigotimes_{i=1}^{d} \mathrm{supp}(\pi_{X_i})$. Then for any $(z_1, \ldots, z_d) \in \mathrm{supp}(\pi)$, we have
$$\pi(x_1, x_2, \ldots, x_d) \propto \prod_{j=1}^{d} \frac{\pi_{X_j \mid X_{-j}}(x_j \mid x_{1:j-1}, z_{j+1:d})}{\pi_{X_j \mid X_{-j}}(z_j \mid x_{1:j-1}, z_{j+1:d})}.$$

Proof: we have
$$\pi(x_{1:d-1}, x_d) = \pi_{X_d \mid X_{-d}}(x_d \mid x_{1:d-1})\, \pi(x_{1:d-1}),$$
$$\pi(x_{1:d-1}, z_d) = \pi_{X_d \mid X_{-d}}(z_d \mid x_{1:d-1})\, \pi(x_{1:d-1}).$$
Therefore
$$\pi(x_{1:d}) = \pi(x_{1:d-1}, z_d)\, \frac{\pi_{X_d \mid X_{-d}}(x_d \mid x_{1:d-1})}{\pi_{X_d \mid X_{-d}}(z_d \mid x_{1:d-1})}.$$


Hammersley-Clifford Theorem

Similarly, we have
$$\pi(x_{1:d-1}, z_d) = \pi_{X_{d-1} \mid X_{-(d-1)}}(x_{d-1} \mid x_{1:d-2}, z_d)\, \pi(x_{1:d-2}, z_d),$$
$$\pi(x_{1:d-2}, z_{d-1}, z_d) = \pi_{X_{d-1} \mid X_{-(d-1)}}(z_{d-1} \mid x_{1:d-2}, z_d)\, \pi(x_{1:d-2}, z_d),$$
hence
$$\pi(x_{1:d}) = \pi(x_{1:d-2}, z_{d-1}, z_d)\, \frac{\pi_{X_{d-1} \mid X_{-(d-1)}}(x_{d-1} \mid x_{1:d-2}, z_d)}{\pi_{X_{d-1} \mid X_{-(d-1)}}(z_{d-1} \mid x_{1:d-2}, z_d)} \times \frac{\pi_{X_d \mid X_{-d}}(x_d \mid x_{1:d-1})}{\pi_{X_d \mid X_{-d}}(z_d \mid x_{1:d-1})}.$$
By iterating, we obtain the theorem, where the multiplicative constant is exactly $\pi(z_1, \ldots, z_d)$.
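The identity is easy to check numerically. The sketch below (an illustration, not from the slides; the bivariate normal target, the reference point $z$ and the test points are arbitrary choices) evaluates the product of conditional ratios for $d = 2$ and verifies that dividing it by the joint density gives the same constant $1/\pi(z_1, z_2)$ at every test point.

```python
import numpy as np
from scipy.stats import norm, multivariate_normal

# Numerical check of the Hammersley-Clifford identity (d = 2) for a
# bivariate normal with standardized marginals and correlation rho.
rho = 0.7
cov = np.array([[1.0, rho], [rho, 1.0]])
joint = multivariate_normal(mean=[0.0, 0.0], cov=cov)

def cond_pdf(xi, xj):
    # density of X_i | X_j = xj for this target: N(rho * xj, 1 - rho^2)
    return norm.pdf(xi, loc=rho * xj, scale=np.sqrt(1 - rho**2))

z1, z2 = 0.3, -0.5                       # any reference point in the support
for x1, x2 in [(0.0, 0.0), (1.0, -1.0), (2.0, 0.5)]:
    hc = (cond_pdf(x1, z2) / cond_pdf(z1, z2)) * (cond_pdf(x2, x1) / cond_pdf(z2, x1))
    # hc / pi(x1, x2) equals the same constant 1 / pi(z1, z2) for every x
    print(hc / joint.pdf([x1, x2]))
```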


Example: Non-Integrable Target

Consider the following conditionals on $\mathbb{R}^+$:
$$\pi_{X_1 \mid X_2}(x_1 \mid x_2) = x_2 \exp(-x_2 x_1), \qquad \pi_{X_2 \mid X_1}(x_2 \mid x_1) = x_1 \exp(-x_1 x_2).$$
We might expect that these full conditionals define a joint probability density $\pi(x_1, x_2)$. Hammersley-Clifford would give
$$\pi(x_1, x_2) \propto \frac{\pi_{X_1 \mid X_2}(x_1 \mid z_2)}{\pi_{X_1 \mid X_2}(z_1 \mid z_2)} \cdot \frac{\pi_{X_2 \mid X_1}(x_2 \mid x_1)}{\pi_{X_2 \mid X_1}(z_2 \mid x_1)} = \frac{z_2 \exp(-z_2 x_1)\, x_1 \exp(-x_1 x_2)}{z_2 \exp(-z_2 z_1)\, x_1 \exp(-x_1 z_2)} \propto \exp(-x_1 x_2).$$
However, $\iint \exp(-x_1 x_2)\, dx_1\, dx_2$ is not finite, so the conditionals $\pi_{X_1 \mid X_2}(x_1 \mid x_2) = x_2 \exp(-x_2 x_1)$ and $\pi_{X_2 \mid X_1}(x_2 \mid x_1) = x_1 \exp(-x_1 x_2)$ are not compatible: no joint probability distribution has these full conditionals.
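Nothing stops us from running the Gibbs recursions anyway; the chain simply has no stationary distribution to converge to. A small illustration (not from the slides): both conditionals are exponential distributions, and $\log X_1^{(t)}$ performs a mean-zero random walk, so the chain drifts instead of settling down.

```python
import numpy as np

# "Gibbs sampler" with the incompatible conditionals
#   X1 | X2 ~ Exp(rate = X2),  X2 | X1 ~ Exp(rate = X1).
# Since no joint probability density exists, log(X1) behaves like a
# random walk and the values wander over many orders of magnitude.
rng = np.random.default_rng(0)
x1, x2 = 1.0, 1.0
for t in range(1, 100001):
    x1 = rng.exponential(scale=1.0 / x2)   # draw from X1 | X2 = x2
    x2 = rng.exponential(scale=1.0 / x1)   # draw from X2 | X1 = x1
    if t % 20000 == 0:
        print(t, x1, x2)
```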


Example: Positivity condition violated

Figure: Gibbs sampling targeting $\pi(x, y) \propto 1_{[-1,0]\times[-1,0]\,\cup\,[0,1]\times[0,1]}(x, y)$.

Here $\mathrm{supp}(\pi)$ is not the product of the marginal supports, so the positivity condition fails: a chain started in one of the two squares can never reach the other, and the Gibbs sampler is not irreducible.
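A sketch of how such a chain behaves (an illustration; the starting point and seed are arbitrary): each full conditional is uniform over the square containing the current value of the other coordinate, so the chain stays in the square it started in.

```python
import numpy as np

# Gibbs sampling for pi(x, y) uniform on [-1,0]x[-1,0] union [0,1]x[0,1].
# Given x > 0 the conditional of y is uniform on [0,1]; given x < 0 it is
# uniform on [-1,0] (and symmetrically for x | y), so the chain never
# switches squares.
rng = np.random.default_rng(1)
x, y = 0.5, 0.5                       # start in the positive square
for t in range(10000):
    y = rng.uniform(0.0, 1.0) if x > 0 else rng.uniform(-1.0, 0.0)  # y | x
    x = rng.uniform(0.0, 1.0) if y > 0 else rng.uniform(-1.0, 0.0)  # x | y
print(x > 0 and y > 0)                # True: the chain never left [0,1]x[0,1]
```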


Invariance of the Gibbs sampler

The kernel of the Gibbs sampler (case $d = 2$) is
$$K(x^{(t-1)}, x^{(t)}) = \pi_{X_1 \mid X_2}\big(x_1^{(t)} \mid x_2^{(t-1)}\big)\, \pi_{X_2 \mid X_1}\big(x_2^{(t)} \mid x_1^{(t)}\big).$$
Case $d > 2$:
$$K(x^{(t-1)}, x^{(t)}) = \prod_{j=1}^{d} \pi_{X_j \mid X_{-j}}\big(x_j^{(t)} \mid x_{1:j-1}^{(t)}, x_{j+1:d}^{(t-1)}\big).$$

Proposition: The systematic scan Gibbs sampler kernel admits π as invariant distribution.

Proof for $d = 2$. We have
$$\int K(x, y)\, \pi(x)\, dx = \int \pi(y_2 \mid y_1)\, \pi(y_1 \mid x_2)\, \pi(x_1, x_2)\, dx_1\, dx_2 = \pi(y_2 \mid y_1) \int \pi(y_1 \mid x_2)\, \pi(x_2)\, dx_2 = \pi(y_2 \mid y_1)\, \pi(y_1) = \pi(y_1, y_2) = \pi(y).$$
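Invariance can also be checked empirically. A sketch (illustration only; the bivariate normal target and its covariance are arbitrary choices): start from exact draws $X \sim \pi$, apply one full Gibbs sweep, and compare sample moments before and after.

```python
import numpy as np

# Empirical invariance check for d = 2 with a bivariate normal target
# (standardized marginals, covariance rho).
rng = np.random.default_rng(2)
rho = 0.8
cov = np.array([[1.0, rho], [rho, 1.0]])
X = rng.multivariate_normal([0.0, 0.0], cov, size=100000)    # X ~ pi exactly
Y = X.copy()
# one full Gibbs sweep: X1 | X2 then X2 | X1
Y[:, 0] = rho * Y[:, 1] + np.sqrt(1 - rho**2) * rng.standard_normal(len(Y))
Y[:, 1] = rho * Y[:, 0] + np.sqrt(1 - rho**2) * rng.standard_normal(len(Y))
print(np.cov(X.T))   # close to cov
print(np.cov(Y.T))   # also close to cov: the sweep leaves pi invariant
```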


Irreducibility and Recurrence

Proposition: Assume π satisfies the positivity condition. Then the Gibbs sampler yields a π-irreducible and recurrent Markov chain.

Theorem. Assume the positivity condition is satisfied. Then for any integrable function $\varphi: \mathsf{X} \to \mathbb{R}$,
$$\lim_{t \to \infty} \frac{1}{t} \sum_{i=1}^{t} \varphi\big(X^{(i)}\big) = \int_{\mathsf{X}} \varphi(x)\, \pi(x)\, dx$$
for π-almost all starting values $X^{(1)}$.
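This is what justifies approximating expectations by averages along a single chain. A short sketch (illustration only; the bivariate normal target of the next slide with mean zero, unit variances, covariance 0.5, and a deliberately poor starting point are arbitrary choices): the ergodic averages of $X_1$ and $X_1^2$ approach the true values 0 and 1.

```python
import numpy as np

# Ergodic averages along a single Gibbs chain for a bivariate normal
# target with standardized marginals and covariance rho = 0.5.
rng = np.random.default_rng(3)
rho, t_max = 0.5, 100000
x1, x2 = 5.0, -5.0                     # start far from the target mean
running = np.zeros(2)
for t in range(t_max):
    x1 = rho * x2 + np.sqrt(1 - rho**2) * rng.standard_normal()  # X1 | X2
    x2 = rho * x1 + np.sqrt(1 - rho**2) * rng.standard_normal()  # X2 | X1
    running += np.array([x1, x1**2])
print(running / t_max)                 # close to the true values (0, 1)
```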


Example: Bivariate Normal Distribution

Let $X := (X_1, X_2) \sim \mathcal{N}(\mu, \Sigma)$ where $\mu = (\mu_1, \mu_2)$ and
$$\Sigma = \begin{pmatrix} \sigma_1^2 & \rho \\ \rho & \sigma_2^2 \end{pmatrix}.$$
The Gibbs sampler proceeds as follows in this case:

1. Sample $X_1^{(t)} \sim \mathcal{N}\big(\mu_1 + \rho/\sigma_2^2\, (X_2^{(t-1)} - \mu_2),\ \sigma_1^2 - \rho^2/\sigma_2^2\big)$.
2. Sample $X_2^{(t)} \sim \mathcal{N}\big(\mu_2 + \rho/\sigma_1^2\, (X_1^{(t)} - \mu_1),\ \sigma_2^2 - \rho^2/\sigma_1^2\big)$.

By proceeding this way, we generate a Markov chain $(X^{(t)})$ whose successive samples are correlated. If successive values of $X^{(t)}$ are strongly correlated, we say that the Markov chain mixes slowly.
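A sketch of these two updates, together with a quick mixing diagnostic (illustration only): the lag-1 autocorrelation of $X_1$, computed for the two covariance values used in the figures below, is small when ρ is close to 0 and close to 1 when ρ is close to 1.

```python
import numpy as np

def gibbs_bivariate_normal(mu, sigma2, rho, t_max, x_init, rng):
    """Gibbs sampler for N(mu, Sigma), Sigma = [[sigma2[0], rho], [rho, sigma2[1]]],
    using the two conditional updates above."""
    x = np.array(x_init, dtype=float)
    chain = np.empty((t_max, 2))
    for t in range(t_max):
        x[0] = mu[0] + rho / sigma2[1] * (x[1] - mu[1]) + \
               np.sqrt(sigma2[0] - rho**2 / sigma2[1]) * rng.standard_normal()
        x[1] = mu[1] + rho / sigma2[0] * (x[0] - mu[0]) + \
               np.sqrt(sigma2[1] - rho**2 / sigma2[0]) * rng.standard_normal()
        chain[t] = x
    return chain

# Lag-1 autocorrelation of X1 for a weakly and a strongly correlated target.
rng = np.random.default_rng(4)
for rho in (0.1, 0.99):
    chain = gibbs_bivariate_normal([0.0, 0.0], [1.0, 1.0], rho, 10000, [0.0, 0.0], rng)
    x1 = chain[:, 0]
    print(rho, np.corrcoef(x1[:-1], x1[1:])[0, 1])
```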


Bivariate Normal Distribution

Figure: Trajectory of the chain in the (x, y) plane, case where ρ = 0.1, first 100 steps.


Bivariate Normal Distribution

Figure: Trajectory of the chain in the (x, y) plane, case where ρ = 0.99, first 100 steps.


Bivariate Normal Distribution

Figure: Histogram of the first component of the chain after 1,000 iterations. Small ρ on the left, large ρ on the right.


Bivariate Normal Distribution

Figure: Histogram of the first component of the chain after 10,000 iterations. Small ρ on the left, large ρ on the right.


Bivariate Normal Distribution

Figure: Histogram of the first component of the chain after 100,000 iterations. Small ρ on the left, large ρ on the right.


Gibbs Sampling and Auxiliary Variables

Gibbs sampling requires sampling from $\pi_{X_j \mid X_{-j}}$. In many scenarios, we can introduce a set of auxiliary variables $Z_1, \ldots, Z_p$ and work with an "extended" distribution of joint density $\pi(x_1, \ldots, x_d, z_1, \ldots, z_p)$ such that
$$\int \pi(x_1, \ldots, x_d, z_1, \ldots, z_p)\, dz_1 \cdots dz_p = \pi(x_1, \ldots, x_d),$$
and whose full conditionals are easy to sample. Examples: mixture models, capture-recapture models, Tobit models, probit models, etc.


Mixtures of Normals

[Figure: density of a mixture of three normal populations, shown together with the three population densities.]

Independent data $y_1, \ldots, y_n$:
$$Y_i \mid \theta \sim \sum_{k=1}^{K} p_k\, \mathcal{N}\big(\mu_k, \sigma_k^2\big),$$
where $\theta = \big(p_1, \ldots, p_K, \mu_1, \ldots, \mu_K, \sigma_1^2, \ldots, \sigma_K^2\big)$.


Bayesian Model

Likelihood function:
$$p(y_1, \ldots, y_n \mid \theta) = \prod_{i=1}^{n} p(y_i \mid \theta) = \prod_{i=1}^{n} \left[ \sum_{k=1}^{K} \frac{p_k}{\sqrt{2\pi\sigma_k^2}} \exp\left( -\frac{(y_i - \mu_k)^2}{2\sigma_k^2} \right) \right].$$
Let's fix $K = 2$, $\sigma_k^2 = 1$ and $p_k = 1/K$ for all $k$.

Prior model:
$$p(\theta) = \prod_{k=1}^{K} p(\mu_k), \qquad \mu_k \sim \mathcal{N}(\alpha_k, \beta_k).$$
Let us fix $\alpha_k = 0$, $\beta_k = 1$ for all $k$. It is not obvious how to sample from $p(\mu_1 \mid \mu_2, y_1, \ldots, y_n)$.
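To see why, one can write down the unnormalized log posterior of $(\mu_1, \mu_2)$: each observation contributes the log of a sum over components, so the conditional of $\mu_1$ has no standard form. A sketch (illustration only; the toy data set is an arbitrary choice, and additive constants are dropped):

```python
import numpy as np

def log_posterior(mu, y):
    """Unnormalized log posterior of (mu_1, mu_2) for K = 2, sigma_k^2 = 1,
    p_k = 1/2 and prior mu_k ~ N(0, 1); additive constants are omitted."""
    mu = np.asarray(mu, dtype=float)                  # shape (2,)
    log_prior = -0.5 * np.sum(mu**2)
    comp = -0.5 * (y[:, None] - mu[None, :])**2       # (n, 2) Gaussian log-kernels
    log_lik = np.sum(np.log(0.5 * np.exp(comp).sum(axis=1)))
    return log_prior + log_lik

rng = np.random.default_rng(5)
y = rng.normal(loc=np.repeat([-2.0, 2.0], 100))       # toy data, 200 points
print(log_posterior([-2.0, 2.0], y), log_posterior([0.0, 0.0], y))
```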


Auxiliary Variables for Mixture Models

Associate to each $Y_i$ an auxiliary variable $Z_i \in \{1, \ldots, K\}$ such that
$$P(Z_i = k \mid \theta) = p_k \quad \text{and} \quad Y_i \mid Z_i = k, \theta \sim \mathcal{N}\big(\mu_k, \sigma_k^2\big),$$
so that
$$p(y_i \mid \theta) = \sum_{k=1}^{K} P(Z_i = k \mid \theta)\, \mathcal{N}\big(y_i; \mu_k, \sigma_k^2\big).$$
The extended posterior is given by
$$p(\theta, z_1, \ldots, z_n \mid y_1, \ldots, y_n) \propto p(\theta) \prod_{i=1}^{n} P(z_i \mid \theta)\, p(y_i \mid z_i, \theta).$$
The Gibbs sampler alternates between sampling from $P(z_{1:n} \mid y_{1:n}, \mu_{1:K})$ and from $p(\mu_{1:K} \mid y_{1:n}, z_{1:n})$.


Gibbs Sampling for Mixture Model

We have
$$P(z_{1:n} \mid y_{1:n}, \theta) = \prod_{i=1}^{n} P(z_i \mid y_i, \theta), \quad \text{where} \quad P(z_i \mid y_i, \theta) = \frac{P(z_i \mid \theta)\, p(y_i \mid z_i, \theta)}{\sum_{k=1}^{K} P(z_i = k \mid \theta)\, p(y_i \mid z_i = k, \theta)}.$$
Let $n_k = \sum_{i=1}^{n} 1_{\{k\}}(z_i)$ and $n_k \bar{y}_k = \sum_{i=1}^{n} y_i\, 1_{\{k\}}(z_i)$. Then
$$\mu_k \mid z_{1:n}, y_{1:n} \sim \mathcal{N}\left( \frac{n_k \bar{y}_k}{1 + n_k},\ \frac{1}{1 + n_k} \right).$$
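Putting the two conditionals together gives the following sketch of the sampler (illustration only, with $K = 2$, $\sigma_k^2 = 1$, $p_k = 1/2$ and the $\mathcal{N}(0, 1)$ prior fixed above; the synthetic data set mirrors the example on the next slides):

```python
import numpy as np

def gibbs_mixture(y, t_max, rng):
    """Two-block Gibbs sampler for the two-component normal mixture."""
    n = len(y)
    mu = np.array([-1.0, 1.0])                     # arbitrary initial means
    trace = np.empty((t_max, 2))
    for t in range(t_max):
        # 1. z_i | y_i, mu : proportional to p_k * N(y_i; mu_k, 1)
        logw = -0.5 * (y[:, None] - mu[None, :])**2
        w = np.exp(logw - logw.max(axis=1, keepdims=True))
        w /= w.sum(axis=1, keepdims=True)
        z = (rng.uniform(size=n) < w[:, 1]).astype(int)
        # 2. mu_k | z, y ~ N(n_k * ybar_k / (1 + n_k), 1 / (1 + n_k))
        for k in (0, 1):
            nk = np.sum(z == k)
            sk = np.sum(y[z == k])
            mu[k] = sk / (1 + nk) + rng.standard_normal() / np.sqrt(1 + nk)
        trace[t] = mu
    return trace

rng = np.random.default_rng(6)
y = np.concatenate([rng.normal(-2.0, 1.0, 100), rng.normal(2.0, 1.0, 100)])
trace = gibbs_mixture(y, 10000, rng)
print(trace[1000:].mean(axis=0))                   # posterior means of mu_1, mu_2
```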


Mixtures of Normals

Figure: 200 points sampled from $\tfrac{1}{2}\mathcal{N}(-2, 1) + \tfrac{1}{2}\mathcal{N}(2, 1)$ (histogram of the observations with the mixture density).


Mixtures of Normals

Figure: Histograms of the parameters $\mu_1$ and $\mu_2$ obtained by 10,000 iterations of Gibbs sampling.


Mixtures of Normals

Figure: Traceplot of the parameters $\mu_1$ and $\mu_2$ obtained by 10,000 iterations of Gibbs sampling.


Gibbs sampling in practice

Many posterior distributions can be automatically decomposed into conditional distributions by computer programs. This is the idea behind BUGS (Bayesian inference Using Gibbs Sampling) and JAGS (Just Another Gibbs Sampler).
