Advanced Simulation - Lecture 16. Patrick Rebeschini, March 7th, 2018.



SLIDE 1

Advanced Simulation - Lecture 16

Patrick Rebeschini March 7th, 2018

Patrick Rebeschini Lecture 16 1/ 25

SLIDE 2

Outline

Particle methods for static problems. Evidence estimation. The End!


SLIDE 3

Sequence of posterior distributions

Consider the problem of estimating

∫ ϕ(x) π(x) dx

for a test function ϕ : X → R and target distribution π on X. Introduce intermediate distributions π_0, π_1, . . . , π_T such that: π_0 is easy to sample from, π_t and π_{t+1} are not too different, and π_T = π.


SLIDE 4

Sequence of posterior distributions

Case where π(θ) ∝ p(θ) p(y_{1:T} | θ). Introduce the partial posterior π_t(θ) ∝ p(θ) p(y_{1:t} | θ). We have π_0(θ) = p(θ), the prior distribution. We have π_T(θ) = π(θ) = p(θ | y_{1:T}), the full posterior distribution. When the data are i.i.d., we can expect π_t(θ) to be similar to π_{t+1}(θ), at least when t is large.


SLIDE 5

Sequence of posterior distributions

General target distribution π(x). Introduce a simple parametric distribution q. We can introduce π_t(x) ∝ π(x)^{γ_t} q(x)^{1−γ_t} where 0 = γ_0 < γ_1 < . . . < γ_T = 1. Then π_0 = q and π_T = π. If γ_t is close to γ_{t+1}, then π_t is close to π_{t+1}.

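As a small illustration of the tempered path (not from the slides): the bridge densities are one line of NumPy/SciPy. The names `log_pi_t`, `log_pi`, `log_q` and the particular Gaussian choices are our own, purely for illustration.

```python
import numpy as np
from scipy.stats import norm

# Hypothetical 1-d instance: target pi = N(3, 0.5^2), easy distribution q = N(0, 2^2).
def log_pi(x):
    return norm.logpdf(x, loc=3.0, scale=0.5)

def log_q(x):
    return norm.logpdf(x, loc=0.0, scale=2.0)

def log_pi_t(x, gamma):
    """Unnormalised log density of pi_t ∝ pi(x)^gamma * q(x)^(1 - gamma)."""
    return gamma * log_pi(x) + (1.0 - gamma) * log_q(x)

# An evenly spaced temperature ladder 0 = gamma_0 < ... < gamma_T = 1.
gammas = np.linspace(0.0, 1.0, 11)
```

At γ = 0 the bridge is exactly q, and at γ = 1 it is exactly π, matching the endpoints on the slide.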

SLIDE 6

Sequential Importance Sampling: algorithm

At time t = 0: Sample X^i ∼ π_0(·) for i ∈ {1, . . . , N}, and set the weights w_0^i = 1/N for all i ∈ {1, . . . , N}.

At time t ≥ 1: Compute the weights

w_t^i = w_{t−1}^i × ω_t^i = w_{t−1}^i × π_t(X^i) / π_{t−1}(X^i).

At all times, (w_t^i, X^i)_{i=1}^N approximates π_t.

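The algorithm above fits in a few lines. This is a minimal sketch, assuming NumPy; the function name `sis` and the argument convention (a list of unnormalised log densities) are our own.

```python
import numpy as np

rng = np.random.default_rng(0)

def sis(log_pi_seq, sample_pi0, N=1000):
    """Sequential importance sampling: particles are drawn once from pi_0
    and never move; only the weights are updated by pi_t / pi_{t-1}."""
    x = sample_pi0(N)                      # X^i ~ pi_0
    logw = np.full(N, -np.log(N))          # w_0^i = 1/N (log scale for stability)
    for t in range(1, len(log_pi_seq)):
        # incremental weight omega_t^i = pi_t(X^i) / pi_{t-1}(X^i)
        logw += log_pi_seq[t](x) - log_pi_seq[t - 1](x)
    w = np.exp(logw - logw.max())
    return x, w / w.sum()                  # normalised weights
```

For instance, bridging from N(0, 1) towards N(1, 1) through intermediate Gaussians gives a weighted sample whose weighted mean approximates 1.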

SLIDE 7

Sequential Importance Sampling: diagnostics

As already seen for importance sampling, we can compute the effective sample size

ESS_t = ( Σ_{i=1}^N w_t^i )^2 / Σ_{i=1}^N (w_t^i)^2 = 1 / Σ_{i=1}^N (W_t^i)^2,

where W_t^i denotes the normalised weights. ESS_t = N if W_t^i = N^{−1} for all i. If there exists i such that W_t^i ≈ 1, and W_t^j ≈ 0 for all j ≠ i, then ESS_t ≈ 1. As a rule of thumb, the higher the ESS, the better our approximation.

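The ESS formula above is one line of code. A minimal sketch, assuming NumPy; the helper name `ess` is ours.

```python
import numpy as np

def ess(w):
    """Effective sample size ESS = (sum_i w_i)^2 / sum_i w_i^2,
    which equals 1 / sum_i W_i^2 for normalised weights W_i."""
    w = np.asarray(w, dtype=float)
    return w.sum() ** 2 / (w ** 2).sum()
```

Note the formula is invariant to rescaling the weights, so it can be applied to unnormalised weights directly.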

SLIDE 8

Toy model

Data y_t ∼ N(µ, 1), 1000 points generated with µ⋆ = 2. Prior µ ∼ N(0, 1). Sequence of partial posteriors π_t(µ) ∝ p(µ) Π_{s=1}^t p(y_s | µ). Incremental weights: π_t(µ) / π_{t−1}(µ) ∝ p(y_t | µ). We can look at the evolution of the effective sample size with t.

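The toy model can be run as plain sequential importance sampling to reproduce the ESS decay discussed next. A sketch, assuming NumPy; the variable names (`ess_trace`, etc.) and particle count are our own choices.

```python
import numpy as np

rng = np.random.default_rng(1)
y = rng.normal(2.0, 1.0, size=1000)     # 1000 observations with mu* = 2

N = 1024
mu = rng.normal(0.0, 1.0, size=N)       # particles drawn from the N(0, 1) prior
logw = np.zeros(N)
ess_trace = []
for t in range(len(y)):
    # incremental weight pi_t / pi_{t-1} ∝ p(y_t | mu), a N(mu, 1) likelihood term
    logw += -0.5 * (y[t] - mu) ** 2
    w = np.exp(logw - logw.max())
    ess_trace.append(w.sum() ** 2 / (w ** 2).sum())
```

As the figure on the next slide shows, without resampling the ESS collapses towards 1 as observations accumulate.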

SLIDE 9

Toy model


Figure: Effective sample size against “time”, using sequential importance sampling.


SLIDE 10

Sequential Importance Sampling with Resampling

At time t = 0: Sample X_0^i ∼ π_0(·) for i ∈ {1, . . . , N}, and set the weights w_0^i = 1/N.

At time t ≥ 1:
Resample (w_{t−1}^i, X_{t−1}^i) into (N^{−1}, X̄_{t−1}^i).
Define X_t^i = X̄_{t−1}^i.
Compute the weights w_t^i = ω_t^i = π_t(X_t^i) / π_{t−1}(X_t^i).

Problem: there are fewer and fewer unique values in (X_t^i)_{i=1}^N.

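The resampling step maps weighted particles to equally weighted ones. A minimal multinomial-resampling sketch, assuming NumPy; the function name `resample` is ours (other schemes such as systematic resampling exist).

```python
import numpy as np

rng = np.random.default_rng(2)

def resample(x, w):
    """Multinomial resampling: turn weighted particles (w^i, X^i)
    into equally weighted ones (1/N, Xbar^i)."""
    N = len(x)
    idx = rng.choice(N, size=N, p=w)     # draw ancestors proportionally to w
    return x[idx], np.full(N, 1.0 / N)
```

The output necessarily contains duplicated values, which is exactly the degeneracy problem the slide points out.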

SLIDE 11

Toy model


Figure: Effective sample size against “time”, using sequential importance sampling with resampling.


SLIDE 12

Toy model


Figure: Number of unique values against “time”, using sequential importance sampling with resampling.


SLIDE 13

Move steps

Consider particles (N^{−1}, X̄_t^i)_{i=1}^N, approximating π_t. The ESS is maximal (= N), but multiple values within (X̄_t^i)_{i=1}^N are identical. How can we diversify the particles while still approximating π_t? Apply a Markov kernel K_t to each X̄_t^i:

∀i ∈ {1, . . . , N}  X_t^i ∼ K_t(X̄_t^i, dx).

Can we find K_t such that (N^{−1}, X_t^i) still approximates π_t? e.g.

(1/N) Σ_{i=1}^N ϕ(X_t^i) → ∫ ϕ(x) π_t(x) dx  almost surely as N → ∞.


SLIDE 14

Move steps

Assume that K_t leaves π_t invariant:

∫ π_t(x) K_t(x, y) dx = π_t(y).

If X̄_t^i ∼ π_t, then X_t^i ∼ K_t(X̄_t^i, dy) also follows π_t. Also, if X̄ ∼ q and w(x) = π(x)/q(x), so that E_q[w(X̄) ϕ(X̄)] = E_π[ϕ(X)], then, for X ∼ K(X̄, dx), E_{qK}[w(X̄) ϕ(X)] = E_π[ϕ(X)].


SLIDE 15

Move steps

Result: if (w^i, X̄^i) approximates π, then we can apply a π-invariant kernel to each X̄^i and not change the weights. Draw X^i ∼ K(X̄^i, dx) for all i ∈ {1, . . . , N}. Keep the weights unchanged. (w^i, X^i) still approximates π.

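A standard choice of π-invariant kernel is a few random-walk Metropolis steps applied to each particle. A sketch assuming NumPy; the name `mh_move` and the step-size defaults are ours.

```python
import numpy as np

rng = np.random.default_rng(3)

def mh_move(x, log_target, step=0.5, n_steps=5):
    """Apply n_steps random-walk Metropolis moves independently to each
    particle; the kernel leaves log_target invariant, so the importance
    weights do not change."""
    lp = log_target(x)
    for _ in range(n_steps):
        prop = x + step * rng.standard_normal(x.shape)
        lp_prop = log_target(prop)
        accept = np.log(rng.uniform(size=x.shape)) < lp_prop - lp
        x = np.where(accept, prop, x)
        lp = np.where(accept, lp_prop, lp)
    return x
```

Starting from a fully degenerate cloud (all particles equal), a handful of moves restores particle diversity while preserving the target.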

SLIDE 16

Sequential Monte Carlo Sampler: algorithm

At time t = 0: Sample X_0^i ∼ π_0(·) for i ∈ {1, . . . , N}, and set the weights w_0^i = 1/N.

At time t ≥ 1:
Resample (w_{t−1}^i, X_{t−1}^i) into (N^{−1}, X̄_{t−1}^i).
Draw X_t^i ∼ K_{t−1}(X̄_{t−1}^i, dx).
Compute the weights w_t^i = ω_t^i = π_t(X_t^i) / π_{t−1}(X_t^i).

At all times, (w_t^i, X_t^i)_{i=1}^N approximates π_t.

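Putting the pieces together, the algorithm above can be sketched in NumPy. This is our own simplified sketch: one random-walk MH move per step plays the role of K_{t−1}, and the move/reweight/resample cycle is written so that resampling at the end of step t matches resampling at the start of step t+1 on the slide.

```python
import numpy as np

rng = np.random.default_rng(4)

def smc_sampler(log_pi_seq, sample_pi0, N=2000, step=0.5):
    """SMC sampler sketch: move with a random-walk MH kernel leaving
    pi_{t-1} invariant, reweight by pi_t / pi_{t-1}, then resample."""
    x = sample_pi0(N)
    for t in range(1, len(log_pi_seq)):
        lp_prev, lp_new = log_pi_seq[t - 1], log_pi_seq[t]
        # move: one MH step, pi_{t-1}-invariant, weights stay 1/N
        prop = x + step * rng.standard_normal(N)
        accept = np.log(rng.uniform(size=N)) < lp_prev(prop) - lp_prev(x)
        x = np.where(accept, prop, x)
        # reweight by pi_t / pi_{t-1}, then resample back to equal weights
        logw = lp_new(x) - lp_prev(x)
        w = np.exp(logw - logw.max())
        w /= w.sum()
        x = x[rng.choice(N, size=N, p=w)]
    return x
```

On a Gaussian bridge from N(0, 1) to N(3, 1), the final cloud approximates the endpoint distribution.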

SLIDE 17

Adaptive resampling

Problem: if K_t is a Metropolis–Hastings kernel with invariant distribution π_t, its computational cost is typically linear in t. Thus applying K_t at every step t ∈ {1, . . . , T} yields an overall cost of

Σ_{t=1}^T O(t) = O(T^2).

We can save time by performing the resample-move step only when necessary. Use the effective sample size to decide whether to trigger a resample-move step.


SLIDE 18

Sequential Monte Carlo Sampler: algorithm

At time t = 0: Sample X_0^i ∼ π_0(·) for i ∈ {1, . . . , N}, and set the weights w_0^i = 1/N.

At time t ≥ 1: If ESS < Ñ, then:
Resample (w_{t−1}^i, X_{t−1}^i) into (N^{−1}, X̄_{t−1}^i).
Draw X_t^i ∼ K_{t−1}(X̄_{t−1}^i, dx).

Compute the weights w_t^i = w_{t−1}^i × π_t(X_t^i) / π_{t−1}(X_t^i).

At all times, (w_t^i, X_t^i)_{i=1}^N approximates π_t.

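The adaptive variant above can be sketched as follows, again assuming NumPy and a single random-walk MH move as the kernel; the threshold parameter `tau` (so that Ñ = tau·N) and all names are our own choices.

```python
import numpy as np

rng = np.random.default_rng(5)

def smc_adaptive(log_pi_seq, sample_pi0, N=2000, step=0.5, tau=0.5):
    """SMC sampler with adaptive resampling: trigger a resample-move step
    only when ESS < tau * N; otherwise simply accumulate the weights."""
    x = sample_pi0(N)
    logw = np.full(N, -np.log(N))
    n_resamples = 0
    for t in range(1, len(log_pi_seq)):
        lp_prev, lp_new = log_pi_seq[t - 1], log_pi_seq[t]
        w = np.exp(logw - logw.max())
        w /= w.sum()
        if 1.0 / (w ** 2).sum() < tau * N:             # ESS below threshold
            n_resamples += 1
            x = x[rng.choice(N, size=N, p=w)]          # resample
            prop = x + step * rng.standard_normal(N)   # one pi_{t-1}-invariant MH move
            accept = np.log(rng.uniform(size=N)) < lp_prev(prop) - lp_prev(x)
            x = np.where(accept, prop, x)
            logw = np.full(N, -np.log(N))              # weights reset to 1/N
        logw += lp_new(x) - lp_prev(x)                 # w_t^i = w_{t-1}^i * pi_t / pi_{t-1}
    return x, logw, n_resamples
```

On the same Gaussian bridge as before, the resample-move step fires only occasionally rather than at every step, which is the point of the adaptation.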

SLIDE 19

Toy model


Figure: Effective sample size against “time”, using sequential Monte Carlo sampling. Dashed lines indicate resampling times.


SLIDE 20

Toy model


Figure: Number of unique values against “time”, using sequential Monte Carlo sampling. Dashed lines indicate resampling times.


SLIDE 21

Toy model


Figure: Posterior approximation (in black), and true posterior (in red), after 1000 observations.


SLIDE 22

Evidence estimation

Assume (w_t^i, X_t^i)_{i=1}^N approximates the posterior distribution π_t(θ) ∝ p(θ) p(y_{1:t} | θ) at time t. Then the estimator

Σ_{i=1}^N w_t^i p(y_{t+1} | X_t^i) / Σ_{i=1}^N w_t^i

converges to

∫ p(y_{t+1} | θ) π_t(θ) dθ = ∫ p(y_{t+1} | θ) p(θ) p(y_{1:t} | θ) / p(y_{1:t}) dθ = (1 / p(y_{1:t})) ∫ p(y_{1:t+1} | θ) p(θ) dθ = p(y_{t+1} | y_{1:t}).

This is similar to the likelihood estimator in particle filters.

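The predictive decomposition p(y_{1:T}) = Π_t p(y_{t+1} | y_{1:t}) can be checked on the conjugate toy model, where the evidence is available in closed form. A sketch assuming NumPy; for simplicity the particle set here is plain importance sampling from the prior with no resampling, a deliberate simplification of the full SMC scheme.

```python
import numpy as np

rng = np.random.default_rng(6)

# Conjugate toy model: y_t ~ N(mu, 1) with prior mu ~ N(0, 1), so the
# evidence p(y_{1:t}) is available in closed form for checking.
y = rng.normal(2.0, 1.0, size=50)

N = 5000
mu = rng.normal(0.0, 1.0, size=N)          # particles from the prior pi_0
w = np.full(N, 1.0 / N)
log_evidence = 0.0
for t in range(len(y)):
    # hat p(y_{t+1} | y_{1:t}) = sum_i w_t^i p(y_{t+1} | X_t^i) / sum_i w_t^i
    like = np.exp(-0.5 * (y[t] - mu) ** 2) / np.sqrt(2.0 * np.pi)
    log_evidence += np.log((w * like).sum() / w.sum())
    w = w * like                            # reweight (no resampling in this sketch)
```

Summing the log predictive estimates gives an estimate of log p(y_{1:T}), which can be compared against the exact Gaussian evidence (jointly, y_{1:n} ∼ N(0, I + 11ᵀ)).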

SLIDE 23

Toy model

We compare the SMC estimator with the estimator obtained by importance sampling from the prior distribution:

∀i ∈ {1, . . . , N}  X^i ∼ p(θ),  L_t^i = p(y_{1:t} | X^i),

p̂_N(y_{1:t}) = (1/N) Σ_{i=1}^N L_t^i.

We plot the relative error

r_t^N = |p̂_N(y_{1:t}) − p(y_{1:t})| / |p(y_{1:t})|.


SLIDE 24

Toy model


Figure: Relative error using SMC samplers and importance sampling from the prior; 10 independent experiments.


SLIDE 25

The End

Inversion, Transformation, Composition, Accept-reject, Importance Sampling, Metropolis–Hastings, Gibbs sampling, Adaptive Multiple Importance Sampling, Reversible Jump, Slice Sampling, Sequential Importance Sampling, Particle filter, Pseudo-marginal Metropolis–Hastings, Sequential Monte Carlo Sampler. . .

What about Nested Sampling, Path Sampling, Hamiltonian MCMC, Pinball Sampler, Sequential Quasi-Monte Carlo, Multiple-Try Metropolis–Hastings, Coupling From the Past, Multilevel Splitting, Wang–Landau algorithm, Free Energy MCMC, Stochastic Gradient Langevin Dynamics, Firefly MCMC, Configurational Bias Monte Carlo. . .? Good luck for the exams!
