
Introduction to Markov Chain Monte Carlo

Olivier Le Maître¹, with Omar Knio (KAUST)

¹Centre de Mathématiques Appliquées, CNRS
École Polytechnique, Palaiseau, France
https://perso.limsi.fr/olm/
olivier.le-maitre@polytechnique.edu

DCSE Fall school on Reduced Order Modeling and UQ


Bayesian inference

Recall Bayes' formula: Ppost(q|D) ∝ L(D|q) Pprior(q). The likelihood involves discrepancies between observations and model predictions: L(D|q) = H(D − U(q)). Generally, there is no closed-form expression for U(q), but Ppost(q|D) can be evaluated pointwise. The objective is to generate samples q ∼ Ppost(q|D).
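As a concrete illustration, evaluating the posterior pointwise only requires the unnormalized product L(D|q) Pprior(q). A minimal Python sketch, assuming a Gaussian discrepancy H and a standard-normal prior (illustrative choices, not from the slides):

```python
import numpy as np

def log_post(q, D, U, sigma=1.0):
    """Unnormalized log-posterior: log L(D|q) + log P_prior(q).

    Assumes a Gaussian discrepancy H and a standard-normal prior,
    both illustrative; U(q) is the (possibly expensive) model prediction.
    """
    resid = np.asarray(D) - np.asarray(U(q))   # discrepancy D - U(q)
    log_like = -0.5 * np.sum(resid**2) / sigma**2
    log_prior = -0.5 * np.sum(np.asarray(q)**2)
    return log_like + log_prior   # normalizing constant never needed
```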


Table of contents

1. Markov chain Monte Carlo
   - Overview
   - Metropolis Algorithm
   - Quality of the MCMC chain
2. Density Estimation
   - Histogram Method
   - Kernel Density Estimation
   - Smoothing parameter


Background

Markov chain Monte Carlo (MCMC) methods are a class of algorithms for simulating direct draws from some complex distribution of interest. The name reflects the construction: one uses the previous sample value to randomly (Monte Carlo) generate the next sample value, thus creating a Markov chain. A Markov chain is indeed defined as a process where the transition probability between the current and following state is only a function of the current state. There are many different flavors of generating a Markov chain; here we focus on the Metropolis-Hastings algorithm: a random walk using a proposal density and a rule for accepting or rejecting proposed moves.


MCMC: features

The states of the chain after a large number of steps are used as samples of the desired distribution. The quality of the sample improves with the number of steps. The more difficult problem is to determine how many steps are needed to converge to the stationary distribution within an acceptable error: usually one needs at least ∼10,000 samples. A good chain is one that has rapid mixing, i.e. the stationary distribution is reached quickly starting from an arbitrary position, and the target probability is explored well and efficiently. A common application of these algorithms is the numerical calculation of multi-dimensional integrals. Let us look in detail at the Metropolis algorithm and how it generates samples from a given distribution.


Metropolis (MH)

The MH algorithm can draw samples from a target probability distribution π, requiring only the knowledge of a function proportional to the target PDF. It uses a proposal distribution P to generate (Markov chain) candidates that are accepted or rejected according to a certain rule. Let P be a Gaussian for simplicity.

1. Let ξt be the current state.
2. Draw a candidate ξ′ from a Gaussian centered on the current state: ξ′ ∼ N(ξt, Cov), where Cov is chosen a priori.
3. Calculate the ratio r = π(ξ′)/π(ξt).
4. Draw a random number α ∼ U(0, 1).
5. The chain moves (i.e. the candidate is accepted or rejected) according to:

   ξt+1 = ξ′ if α < r, otherwise ξt+1 = ξt.

6. Repeat for the next step t.

(The accompanying figures illustrate the proposal and the chain states at t = 0, 1, 2.)
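A minimal NumPy sketch of these six steps. The function name `metropolis` and the log-space acceptance test are implementation choices, not from the slides; working with log π avoids under- and overflow when the density spans many orders of magnitude:

```python
import numpy as np

def metropolis(log_pi, xi0, cov, n_steps, rng=None):
    """Random-walk Metropolis sampler for an unnormalized target pi.

    log_pi : function returning log pi(xi) up to an additive constant
    xi0    : starting state; cov : proposal covariance, chosen a priori
    Returns the chain (n_steps + 1 states) and the acceptance rate.
    """
    rng = np.random.default_rng() if rng is None else rng
    xi = np.atleast_1d(np.asarray(xi0, dtype=float))
    chain = np.empty((n_steps + 1, xi.size))
    chain[0] = xi
    log_p = log_pi(xi)
    accepted = 0
    for t in range(n_steps):
        xi_new = rng.multivariate_normal(xi, cov)      # step 2: candidate
        log_p_new = log_pi(xi_new)
        # Steps 3-5: accept if alpha < r, tested in log space for stability.
        if np.log(rng.uniform()) < log_p_new - log_p:
            xi, log_p = xi_new, log_p_new
            accepted += 1
        chain[t + 1] = xi                              # step 6: next step
    return chain, accepted / n_steps
```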


Example

Suppose that you want to test MCMC by sampling a certain bimodal PDF π, proportional to a mixture of two bivariate Gaussians: π ∝ 0.5 N(µ1, Σ1) + 0.5 N(µ2, Σ2). Method: the Metropolis algorithm.

1. Choose the proposal distribution: e.g. a Gaussian with covariance Σprop = 0.1 I2.
2. Choose a starting point: ξ0 = (10, 10).
3. Run the machinery for n steps: draw a candidate, accept or reject it, repeat.

(The figures show the chain after t = 0, 1, 2 steps.)

4. Plot the n samples (here n = 5000).

The proposal amplitude, 0.1, must be varied to obtain good mixing and fast convergence. The number of samples, n, must be as large as possible to obtain reliable statistics. What about sensitivity to n and the proposal amplitude?
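A sketch of this setup, reusing the `metropolis` function above. The mixture parameters µ1, µ2, Σ1, Σ2 are not given on the slides, so the values below are assumed for illustration; `logsumexp` keeps the bimodal density well defined far from the modes:

```python
import numpy as np
from scipy.stats import multivariate_normal
from scipy.special import logsumexp

# Assumed mixture parameters (the slides leave them unspecified).
mu1, mu2 = np.array([-1.0, -1.0]), np.array([1.0, 1.0])
S1 = S2 = 0.2 * np.eye(2)

def log_pi(x):
    # log of pi ~ 0.5 N(mu1, S1) + 0.5 N(mu2, S2), computed in log space.
    return logsumexp([multivariate_normal.logpdf(x, mu1, S1),
                      multivariate_normal.logpdf(x, mu2, S2)],
                     b=[0.5, 0.5])

chain, acc = metropolis(log_pi, xi0=[10.0, 10.0],
                        cov=0.1 * np.eye(2), n_steps=5000)
print(f"acceptance rate: {acc:.2f}")
```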


Sensitivity to number of steps

The proposal distribution has covariance Σprop = 0.1 I2. Results for three different totals of steps: n = 500, 5000, and 25000. The larger n, the better the approximation. (Figures: sample clouds for n = 500, n = 5000, n = 25000.)


Sensitivity to proposal amplitude

The proposal amplitude must be tuned to obtain good exploration of the space and fast convergence of the chain toward the high-probability regions. Results are shown for 0.005 I2, 0.1 I2, and 50 I2. The smaller the proposal amplitude, the larger the number of accepted moves; large proposals lead to small acceptance and slow exploration of the space. Ideally, the acceptance rate should be between 30% and 60%.

- amplitude = 0.005: acceptance rate ∼ 95%
- amplitude = 0.1: acceptance rate ∼ 50%
- amplitude = 50: acceptance rate ∼ 4%
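Continuing the sketch above, one way to probe this sensitivity is a sweep over proposal amplitudes, checking each acceptance rate against the 30-60% target:

```python
# Reuses metropolis and log_pi from the sketches above.
for amp in (0.005, 0.1, 50.0):
    _, rate = metropolis(log_pi, xi0=[10.0, 10.0],
                         cov=amp * np.eye(2), n_steps=5000)
    print(f"amplitude {amp:>7}: acceptance rate ~ {100 * rate:.0f}%")
```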


To evaluate the mixing properties of a chain: visually, it should look like white noise; the autocovariance should decay rapidly; the acceptance rate should be 30% to 60%. Before computing statistics, the initial steps before convergence should be dropped: these steps are referred to as the "burn-in" period. The burn-in period is estimated from the autocorrelation as the step at which it drops to, and becomes oscillatory around, zero: in this case it is about 3000 steps.

(Figures: chain for ξ2 showing the bimodality; autocovariance of the ξ2 chain; chain samples after omitting the first 3000 steps.)
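A sketch of the corresponding diagnostics, reusing `chain` from above; the 3000-step cutoff is read off the autocovariance plot rather than computed automatically:

```python
def autocovariance(x, max_lag):
    """Empirical autocovariance of a 1-d chain, used to judge mixing."""
    x = np.asarray(x, dtype=float) - np.mean(x)
    n = x.size
    return np.array([np.dot(x[:n - k], x[k:]) / n for k in range(max_lag)])

acov = autocovariance(chain[:, 1], max_lag=500)   # chain of xi_2
burn_in = 3000                   # estimated from the autocovariance plot
post = chain[burn_in:]           # samples retained for statistics
```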

Table of contents

1. Markov chain Monte Carlo
   - Overview
   - Metropolis Algorithm
   - Quality of the MCMC chain
2. Density Estimation
   - Histogram Method
   - Kernel Density Estimation
   - Smoothing parameter


Sampling strategies

Given a vector of QoI (quantities of interest) S ∈ Rn, we would like to obtain probabilistic information regarding S. We have seen that the mean and covariance of S are given by $E[S] = S_0$ and $E[SS^T] \in \mathbb{R}^{n \times n}$. Higher moments can also be derived, in particular the coefficient of variation and the skewness and flatness factors, ... But there is no direct means to assess probabilities of events, for instance $P(a \le S_i < b)$, $P(S > s)$, $P(\{S_i < a\} \cap \{S_j > b\})$, ... Such probabilities must be estimated by sampling strategies.


Assume we want to estimate the probability of the event S ∈ R: P(S ∈ R). We can apply the generic recipe:

1. generate a sample set {ξ(i), i = 1, ..., M} of ξ, where ξ(i) ∼ pξ;
2. construct the sample set {S(i) := S(ξ(i)), i = 1, ..., M} by solving the model;
3. estimate the probability using the empirical estimator

$$P(S \in R) = \lim_{M \to \infty} \frac{\#\{S^{(i)} \in R\}}{M}.$$

In other words, we make use of the relative frequency to estimate probabilities. This approach raises several concerns regarding convergence, the estimation of low-probability events, ... Observe: the empirical estimates are random variables, since they are based on a random sample set.
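A sketch of the relative-frequency estimator. The event R = {s : s1 > 0 and s2 > 0} is a hypothetical example, and the post-burn-in MCMC samples `post` from above stand in for the sample set:

```python
in_R = (post[:, 0] > 0) & (post[:, 1] > 0)   # indicator of S^(i) in R
p_hat = in_R.mean()                          # relative-frequency estimate
print(f"P(S in R) ~ {p_hat:.3f} from M = {post.shape[0]} samples")
```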


Density estimation

We denote by 𝕊 the sample space, that is P(S ∈ 𝕊) = 1, and assume that S has a (smooth) density denoted π:

$$\pi : \mathbb{S} \to \mathbb{R}^+, \qquad \int_{\mathbb{S}} \pi(s)\,ds = 1.$$

The question becomes: how do we approximate π from a sample set {S(i)}?


The simplest density estimation: histogram method

Consider the 1-d case first, that is n = 1 and 𝕊 ⊂ R. We partition 𝕊 into uniform bins of size h. Let $b_i = [x_i - h/2, x_i + h/2)$ be the bin centered on $x_i$, and define

$$p_i = \frac{\#\{S^{(j)} \in b_i\}}{M},$$

the relative frequency of the i-th bin. Observe: $\sum_i p_i = 1$. The density is then estimated by

$$\pi(s) \approx \pi_h(s) = \frac{1}{h} \sum_i p_i \, \mathbb{I}_{b_i}(s).$$

$\pi_h$ is a piecewise constant approximation of π, satisfying

$$\forall s \in \mathbb{S},\ \pi_h(s) \ge 0, \qquad \int_{\mathbb{S}} \pi_h(s)\,ds = \sum_i \int_{b_i} \pi_h(s)\,ds = \sum_i \frac{1}{h}\, p_i h = 1.$$
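A minimal 1-d sketch of this estimator; the helper name `histogram_density` is illustrative:

```python
def histogram_density(samples, h):
    """1-d histogram estimate: returns bin edges and pi_h = p_i / h."""
    samples = np.asarray(samples, dtype=float)
    edges = np.arange(samples.min(), samples.max() + h, h)  # uniform bins b_i
    counts, _ = np.histogram(samples, bins=edges)
    return edges, counts / (samples.size * h)   # piecewise-constant pi_h

edges, pi_h = histogram_density(post[:, 0], h=0.25)  # e.g. on the MCMC samples
```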


Histogram method

The histogram method is simple and intuitive. It can easily be extended to 𝕊 ⊂ Rn, using for instance hyper-rectangular bins or any other partition of 𝕊.

But:

- the approximation depends on the position (centroid $x_i$) of the bins (and on their orientation),
- it is susceptible to artifacts (choice of the bin size, outliers, ...),
- control of accuracy is difficult in high dimension.

A less arbitrary and more robust approach is needed.


Example: Gaussian mixture, 5000 samples.

(Figures: histogram estimates of the density π(s) for bin sizes h = 1, 1/2, 1/4, 1/8, 1/16, 1/32.)


Kernel density estimation (KDE)

Recall the definition of the probability that S falls in some region R:

$$P(S \in R) = \int_R \pi(s)\,ds \doteq P_R.$$

Then, if we have M vectors independently drawn at random from π, the probability that k of these vectors fall in R is

$$P(k|M) = \binom{M}{k} P_R^k (1 - P_R)^{M-k}.$$

It can be shown that the mean and variance of the ratio k/M are

$$E[k/M] = P_R, \qquad V[k/M] = E\!\left[\left(\frac{k}{M} - P_R\right)^2\right] = \frac{P_R(1 - P_R)}{M}.$$

Therefore, as M becomes large, the mean fraction of the points falling within R is (as one would expect) k/M ≈ P_R.


Non-parametric density estimation

If R is small enough, π should not vary much over R, and one would expect

$$P_R = \int_R \pi(s')\,ds' \approx \pi(s)V, \qquad \forall s \in R,$$

where V is the volume of R. Consequently, we shall consider the approximation

$$\pi(s \in R) \approx \frac{k}{MV}.$$

Of course, we want V small enough for the constant approximation to be valid, and we need M large enough for the relative frequency k/M to make sense. Clearly, M should increase as V is decreased. This calls for a compromise.


Parzen window

Consider the window function

$$K(u) = \begin{cases} 1 & \text{if } |u| < 1/2, \\ 0 & \text{otherwise} \end{cases}$$

(in n dimensions, K(u) = 1 if |uj| < 1/2 for j = 1, ..., n). In words, K = 1 for u in the unit hypercube centered at the origin. This function is known as the Parzen window. The quantity K((s − S(i))/h) is equal to 1 if the random sample S(i) is inside the hypercube of side h centered at s. The total number of samples inside this hypercube is

$$k = \sum_{i=1}^{M} K\!\left(\frac{s - S^{(i)}}{h}\right).$$

Consequently, we shall consider the approximation

$$\pi_{KDE}(s) = \frac{1}{Mh^n} \sum_{i=1}^{M} K\!\left(\frac{s - S^{(i)}}{h}\right).$$

The Parzen window resembles the histogram method, except that the bins are centered on s.
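A direct transcription of this formula, vectorized over query points; the function name and array layout are implementation choices:

```python
def parzen_kde(s, samples, h):
    """Parzen-window estimate pi_KDE at query points s.

    s : (q, n) query points; samples : (M, n) sample set; h : side length.
    """
    s = np.atleast_2d(np.asarray(s, dtype=float))
    samples = np.atleast_2d(np.asarray(samples, dtype=float))
    M, n = samples.shape
    u = (s[:, None, :] - samples[None, :, :]) / h   # (q, M, n)
    inside = np.all(np.abs(u) < 0.5, axis=-1)       # K = 1 inside the cube
    return inside.sum(axis=1) / (M * h**n)
```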


KDE using Parzen window

Example: Gaussian mixture, 5000 samples.

(Figures: Parzen-window KDE estimates of π(s) for h = 1, 1/2, 1/4, 1/8, 1/16, 1/32.)


Role of the kernel

To understand the role of K, compute the expectation of πKDE:

$$E[\pi_{KDE}(s)] = \frac{1}{h^n} \frac{1}{M} \sum_{i=1}^{M} E\!\left[K\!\left(\frac{s - S^{(i)}}{h}\right)\right] = \frac{1}{h^n} \int_{\mathbb{S}} K\!\left(\frac{s - s'}{h}\right) \pi(s')\,ds'.$$

E[πKDE(s)] is thus the convolution of the true density π with the kernel K. As h → 0, the kernel tends to δ (the Dirac delta function) in the sense of distributions, so πKDE(s) → π(s). This observation paves the way to better choices of kernel, in particular smooth functions K(s) with the property ∫ K(s) ds = 1. Typically K : 𝕊 → R is chosen radially symmetric, positive, and unimodal. A common choice is the Gaussian kernel:

$$K(s) = \frac{1}{(2\pi)^{n/2}} \exp\!\left(-\frac{s^T s}{2}\right).$$
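The same estimator with the Gaussian kernel, mirroring the formulas above (names are illustrative). SciPy's `scipy.stats.gaussian_kde` offers an off-the-shelf version with automatic bandwidth selection:

```python
def gaussian_kde_est(s, samples, h):
    """pi_KDE with the Gaussian kernel (2*pi)^(-n/2) exp(-u.u / 2)."""
    s = np.atleast_2d(np.asarray(s, dtype=float))
    samples = np.atleast_2d(np.asarray(samples, dtype=float))
    M, n = samples.shape
    u = (s[:, None, :] - samples[None, :, :]) / h
    K = np.exp(-0.5 * np.sum(u**2, axis=-1)) / (2 * np.pi) ** (n / 2)
    return K.sum(axis=1) / (M * h**n)
```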

KDE with Gaussian kernel

Example: Gaussian mixture, 5000 samples.

(Figures: Gaussian-kernel KDE estimates of π(s) for h = 1, 1/2, 1/4, 1/8, 1/16, 1/32.)


Smoothing parameter

$$\pi_{KDE}(s) = \frac{1}{Mh^n} \sum_{i=1}^{M} K\!\left(\frac{s - S^{(i)}}{h}\right).$$

It remains to fix h (the bandwidth, or core radius), which is a critical parameter: large h gives too much smoothing; small h gives too many spikes. The best h minimizes the error between the estimated and true densities. Using the mean square error measure, we get

$$\epsilon^2 \doteq E\!\left[(\pi_{KDE}(s) - \pi(s))^2\right] = \left(E[\pi_{KDE}(s) - \pi(s)]\right)^2 + V[\pi_{KDE}(s)].$$

This is a bias-variance tradeoff:
- large h reduces the variance but increases the bias;
- small h reduces the bias but increases the variance.

How to pick h?


Setting of the smoothing parameter

A priori selection. Assuming that the true distribution π is Gaussian and that K is the Gaussian kernel, one can obtain an explicit minimizer of ǫ:

$$h_{opt} = 1.06\,\sigma M^{-1/5},$$

where σ is the standard deviation of the distribution π.

Data-based selection. For general distributions, better results are obtained using

$$h_{data} = 0.9\,A M^{-1/5}, \qquad A = \min\!\left(\sigma, \frac{IQR}{1.34}\right),$$

where IQR is the interquartile range, defined as the difference between the 75th and 25th percentiles.
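Both rules fit in a small helper; a 1-d sketch with an illustrative name:

```python
def bandwidth_rules(samples):
    """h_opt (Gaussian assumption) and h_data (IQR-based) for 1-d data."""
    x = np.asarray(samples, dtype=float)
    M, sigma = x.size, x.std(ddof=1)
    iqr = np.percentile(x, 75) - np.percentile(x, 25)
    h_opt = 1.06 * sigma * M ** (-1 / 5)                   # Gaussian reference
    h_data = 0.9 * min(sigma, iqr / 1.34) * M ** (-1 / 5)  # general case
    return h_opt, h_data
```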


KDE with Gaussian kernel

Example: Gaussian mixture, 5000 samples.

(Figures: Gaussian-kernel KDE with h = 0.188 from the Gaussian assumption and h = 0.160 from the IQR-based rule.)

Closing remarks:
- These ideas can be extended to the multivariate case (isotropic kernels).
- A variable h can be used to adapt to the local concentration of observations.
- Many implementations are available (e.g. MATLAB).