CSci 8980: Advanced Topics in Graphical Models. MCMC, Gibbs Sampling


Instructor: Arindam Banerjee
September 27, 2007

Outline: Basics, MCMC, Gibbs Sampling, Auxiliary Variable Samplers, Problems


  • Importance Sampling (Contd.)
    - Choose the q(x) that minimizes the variance of the estimator Î_n(f):
        var_q(f(x) w(x)) = E_q[f^2(x) w^2(x)] - I^2(f)
    - Applying Jensen's inequality and optimizing gives
        q*(x) = |f(x)| p(x) / ∫ |f(x)| p(x) dx
    - Efficient sampling focuses on regions where |f(x)| p(x) is high
    - Super-efficient sampling: the variance can be lower than even with q(x) = p(x)
    - Exploited to evaluate probabilities of rare events: q(x) ∝ I_E(x) p(x)
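The rare-event use of importance sampling can be sketched numerically. The example below is not from the slides; the target, event, and proposal are chosen for illustration. It estimates P(X > 4) under a standard normal by taking q supported only on the event region, mimicking q(x) ∝ I_E(x) p(x):

```python
import numpy as np

rng = np.random.default_rng(0)

n, c = 100_000, 4.0

# Proposal concentrated on the rare event {x > c}:
# a shifted exponential q(x) = exp(-(x - c)) for x > c.
x = c + rng.exponential(1.0, size=n)

# Importance weights w(x) = p(x) / q(x), computed in log space,
# with p the standard normal density.
log_p = -0.5 * x**2 - 0.5 * np.log(2.0 * np.pi)
log_q = -(x - c)
w = np.exp(log_p - log_q)

# Since f = I_E and q covers E, mean(w) estimates P(X > 4).
est = w.mean()
print(est)
```

A naive Monte Carlo estimate with the same n would see the event only a handful of times; here every sample lands in the event region and only the weights vary.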


  • Markov Chains
    - Use a Markov chain to explore the state space
    - A Markov chain on a discrete space is a process with
        p(x_i | x_{i-1}, ..., x_1) = T(x_i | x_{i-1})
    - The chain is homogeneous if T is invariant for all i
    - The chain stabilizes into an invariant distribution if it is
        1. Irreducible: the transition graph is connected
        2. Aperiodic: it does not get trapped in cycles
    - A sufficient condition ensuring p(x) is the invariant distribution is detailed balance:
        p(x_i) T(x_{i-1} | x_i) = p(x_{i-1}) T(x_i | x_{i-1})
    - MCMC samplers: invariant distribution = target distribution
    - Samplers are designed for fast convergence
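A minimal numeric sketch of these conditions (the 3-state transition matrix below is made up for illustration): every entry of T is positive, so the chain is irreducible and aperiodic, and repeated application of T drives any initial distribution to the invariant one.

```python
import numpy as np

# Rows are T(x_i | x_{i-1}); all entries positive, rows sum to 1.
T = np.array([[0.5, 0.3, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.3, 0.4]])

# Start from a point mass and iterate the chain's one-step update.
mu = np.array([1.0, 0.0, 0.0])
for _ in range(200):
    mu = mu @ T

# mu now satisfies the invariance condition mu = mu T.
print(mu)
```

Note that detailed balance is sufficient, not necessary: a chain like this converges to its invariant distribution whether or not T is reversible.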

  • Markov Chains (Contd.)
    - Random walker on the web:
        - Irreducibility: should be able to reach all pages
        - Aperiodicity: does not get stuck in a loop
    - PageRank uses T = L + E
        - L = link matrix for the web graph
        - E = uniform random matrix, to ensure irreducibility and aperiodicity
    - The invariant distribution p(x) represents the rank of webpage x
    - In continuous spaces, T becomes an integral kernel K:
        ∫ p(x_i) K(x_{i+1} | x_i) dx_i = p(x_{i+1})
    - p(x) is the corresponding eigenfunction
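The random-walker picture can be sketched with power iteration on a toy 4-page graph (made up here). The slide's T = L + E is written below in the common damped form T = αL + (1 - α)E with α = 0.85, which keeps every transition probability positive:

```python
import numpy as np

# Toy web graph: page j links to the pages in links[j].
links = {0: [1, 2], 1: [2], 2: [0], 3: [2]}
n = 4

# L = row-stochastic link matrix (no dangling pages in this example;
# pages without out-links would need special handling).
L = np.zeros((n, n))
for j, outs in links.items():
    for k in outs:
        L[j, k] = 1.0 / len(outs)

alpha = 0.85                      # damping factor (illustrative choice)
E = np.full((n, n), 1.0 / n)      # uniform jumps: irreducible, aperiodic
T = alpha * L + (1.0 - alpha) * E

# Power iteration: the invariant distribution is the PageRank vector.
p = np.full(n, 1.0 / n)
for _ in range(100):
    p = p @ T
print(p)   # page 2, with three in-links, collects the most rank
```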

  • The Metropolis-Hastings Algorithm
    - Most popular MCMC method
    - Based on a proposal distribution q(x* | x)
    - Algorithm: for i = 0, ..., n-1:
        1. Sample u ~ U(0, 1)
        2. Sample x* ~ q(x* | x_i)
        3. Set
            x_{i+1} = x*    if u < A(x_i, x*) = min{ 1, [p(x*) q(x_i | x*)] / [p(x_i) q(x* | x_i)] }
            x_{i+1} = x_i   otherwise
    - The transition kernel is
        K_MH(x_{i+1} | x_i) = q(x_{i+1} | x_i) A(x_i, x_{i+1}) + δ_{x_i}(x_{i+1}) r(x_i)
      where r(x_i) is the term associated with rejection:
        r(x_i) = ∫ q(x | x_i) (1 - A(x_i, x)) dx
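A sketch of this loop on a toy problem (the heavy-tailed target and the deliberately asymmetric proposal are illustrative choices, not from the slides). Because the proposal shrinks toward zero, the correction q(x_i | x*) / q(x* | x_i) in A does not cancel:

```python
import numpy as np

rng = np.random.default_rng(1)

def log_p(x):
    # Unnormalized log target: standard Cauchy.
    return -np.log1p(x * x)

def log_q(x_new, x_old):
    # Asymmetric proposal q(x* | x) = N(x*; 0.5 x, 1), up to a constant.
    return -0.5 * (x_new - 0.5 * x_old) ** 2

x = 0.0
samples = []
for i in range(50_000):
    x_star = 0.5 * x + rng.normal()          # x* ~ q(x* | x_i)
    log_A = min(0.0, log_p(x_star) + log_q(x, x_star)
                     - log_p(x) - log_q(x_star, x))
    if np.log(rng.uniform()) < log_A:        # u < A(x_i, x*)
        x = x_star
    samples.append(x)

samples = np.array(samples[5_000:])          # discard burn-in
print(np.median(samples))   # the Cauchy target is symmetric about 0
```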

  • The Metropolis-Hastings Algorithm (Contd.)
    - By construction,
        p(x_i) K_MH(x_{i+1} | x_i) = p(x_{i+1}) K_MH(x_i | x_{i+1})
    - This implies p(x) is the invariant distribution
    - Basic properties:
        - Irreducibility: ensure the support of q contains the support of p
        - Aperiodicity: ensured, since rejection is always a possibility
    - Independent sampler: q(x* | x_i) = q(x*), so that
        A(x_i, x*) = min{ 1, [p(x*) q(x_i)] / [p(x_i) q(x*)] }
    - Metropolis sampler: symmetric q(x* | x_i) = q(x_i | x*), so that
        A(x_i, x*) = min{ 1, p(x*) / p(x_i) }
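In the Metropolis special case the acceptance really is just the ratio of target values. A random-walk sketch on a toy Gaussian target (all parameters are illustrative):

```python
import numpy as np

rng = np.random.default_rng(2)

def p(x):
    # Unnormalized target: N(1, 0.5^2).
    return np.exp(-0.5 * ((x - 1.0) / 0.5) ** 2)

# Symmetric proposal q(x* | x) = N(x*; x, 0.3^2): the q terms cancel,
# leaving A(x_i, x*) = min(1, p(x*) / p(x_i)).
x = 0.0
samples = np.empty(40_000)
for i in range(len(samples)):
    x_star = x + 0.3 * rng.normal()
    if rng.uniform() < min(1.0, p(x_star) / p(x)):
        x = x_star
    samples[i] = x

post = samples[5_000:]                  # discard burn-in
print(post.mean(), post.std())          # roughly 1.0 and 0.5
```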


  • Simulated Annealing
    - Problem: find the global maximum of p(x)
    - Initial idea: run MCMC, estimate p̂(x), compute the max
    - Issue: the chain may not come close to the mode(s)
    - Instead, simulate a non-homogeneous Markov chain whose invariant distribution at iteration i is p_i(x) ∝ p^{1/T_i}(x)
    - The sample update follows
        x_{i+1} = x*    if u < A(x_i, x*) = min{ 1, [p^{1/T_i}(x*) q(x_i | x*)] / [p^{1/T_i}(x_i) q(x* | x_i)] }
        x_{i+1} = x_i   otherwise
    - T_i decreases following a cooling schedule, with lim_{i→∞} T_i = 0
    - The cooling schedule needs a proper choice, e.g., T_i = 1 / (C log(i + T_0))
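A sketch of the annealed update on a toy bimodal target (the target, proposal width, and constants C, T_0 are illustrative). The chain starts at the wrong mode; as T_i falls, the annealed acceptance increasingly favors the global maximum:

```python
import numpy as np

rng = np.random.default_rng(3)

def p(x):
    # Toy target: global maximum near x = 3, lower local maximum near x = -2.
    return np.exp(-0.5 * (x - 3.0) ** 2) + 0.6 * np.exp(-0.5 * (x + 2.0) ** 2)

x = np.float64(-2.0)                    # start at the wrong mode on purpose
best_x, best_p = x, p(x)
C, T0 = 1.0, 2.0
for i in range(20_000):
    T = 1.0 / (C * np.log(i + T0))      # logarithmic cooling schedule
    x_star = x + 2.0 * rng.normal()     # symmetric proposal, q terms cancel
    # Annealed acceptance min(1, (p(x*)/p(x_i))^(1/T)), in log space.
    log_A = (np.log(p(x_star)) - np.log(p(x))) / T
    if np.log(rng.uniform()) < log_A:
        x = x_star
    if p(x) > best_p:                   # track the best point visited
        best_x, best_p = x, p(x)

print(best_x)   # near the global maximum at 3
```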


  • Monte Carlo EM
    - The E-step involves computing an expectation over the latent variables z:
        Q(θ, θ_n) = ∫ log p(x, z | θ) p(z | x, θ_n) dz
    - Estimate the expectation using MCMC
    - Draw samples using MH with acceptance probability
        A(z, z*) = min{ 1, [p(x | z*, θ_n) p(z* | θ_n) q(z | z*)] / [p(x | z, θ_n) p(z | θ_n) q(z* | z)] }
    - Several variants:
        - Stochastic EM: draw one sample
        - Monte Carlo EM: draw multiple samples
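A sketch of the stochastic-EM variant on a toy two-component Gaussian mixture with unit variances and unknown means (the model, data, and initialization are made up here). For this model the posterior p(z | x, θ_n) is available in closed form, so the completions are drawn from it directly; the MH acceptance above is for the general case where it is not:

```python
import numpy as np

rng = np.random.default_rng(4)

# Toy data: equal-weight mixture of N(-3, 1) and N(3, 1).
x = np.concatenate([rng.normal(-3.0, 1.0, 300), rng.normal(3.0, 1.0, 300)])

mu = np.array([-1.0, 1.0])      # initial guess for the two means
for _ in range(50):
    # Stochastic E-step: draw z_i ~ p(z_i | x_i, theta_n).
    log_r = -0.5 * (x[:, None] - mu[None, :]) ** 2
    r = np.exp(log_r - log_r.max(axis=1, keepdims=True))
    r /= r.sum(axis=1, keepdims=True)
    z = (rng.uniform(size=len(x)) < r[:, 1]).astype(int)

    # M-step: maximize Q over the means using the sampled completions.
    for k in (0, 1):
        if np.any(z == k):
            mu[k] = x[z == k].mean()

print(mu)   # close to the true means [-3, 3]
```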

  • Mixtures of MCMC Kernels
    - Powerful property of MCMC: samplers can be combined
    - Let K_1, K_2 be kernels with invariant distribution p
        - The mixture kernel α K_1 + (1 - α) K_2, α ∈ [0, 1], converges to p
        - The cycle kernel K_1 K_2 converges to p
    - Mixtures can use global and local proposals:
        - Global proposals explore the entire space (with probability α)
        - Local proposals discover finer details (with probability 1 - α)
    - Example: the target has many narrow peaks
        - The global proposal finds the peaks
        - Local proposals explore the neighborhood of each peak (random walk)
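A sketch of a two-kernel mixture in exactly this situation (the two-peak target, α = 0.1, and the proposal scales are illustrative): a wide independent sampler serves as the global kernel and a small random walk as the local kernel.

```python
import numpy as np

rng = np.random.default_rng(5)

def log_p(x):
    # Two narrow, well-separated peaks at -5 and +5.
    return np.logaddexp(-0.5 * ((x + 5.0) / 0.2) ** 2,
                        -0.5 * ((x - 5.0) / 0.2) ** 2)

def log_q_global(x):
    # Global independent proposal q(x*) = N(0, 10^2), up to a constant.
    return -0.5 * (x / 10.0) ** 2

alpha = 0.1
x = 5.0
samples = np.empty(100_000)
for i in range(len(samples)):
    if rng.uniform() < alpha:
        # Global kernel K1 (independent sampler): can jump between peaks.
        x_star = 10.0 * rng.normal()
        log_A = (log_p(x_star) + log_q_global(x)
                 - log_p(x) - log_q_global(x_star))
    else:
        # Local kernel K2 (symmetric random walk): explores the current peak.
        x_star = x + 0.2 * rng.normal()
        log_A = log_p(x_star) - log_p(x)
    if np.log(rng.uniform()) < log_A:
        x = x_star
    samples[i] = x

frac_left = float(np.mean(samples < 0))
print(frac_left)   # near 0.5: both peaks get visited
```

The local kernel alone would essentially never cross the gap between the peaks; the occasional global proposal is what makes the combined sampler explore both modes.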

  • Cycles of MCMC Kernels
    - Split a multivariate state into blocks
    - Each block can be updated separately
    - Convergence is faster if correlated variables are blocked together
    - The transition kernel is
        K_MHCycle(x^{(i+1)} | x^{(i)}) = ∏_{j=1}^{n_b} K_MH^{(j)}( x^{(i+1)}_{b_j} | x^{(i)}_{b_j}, x^{(i+1)}_{-[b_j]} )
    - Trade-off on block size: if blocks are small, the chain takes a long time to explore the space
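A sketch of a cycle kernel with two single-coordinate blocks on a strongly correlated bivariate Gaussian (the target and step size are illustrative). Each sweep applies one Metropolis update per block, conditioning on the most recent value of the other block:

```python
import numpy as np

rng = np.random.default_rng(6)

# Target: zero-mean bivariate Gaussian with correlation 0.9.
rho = 0.9
Prec = np.linalg.inv(np.array([[1.0, rho], [rho, 1.0]]))

def log_p(v):
    return -0.5 * v @ Prec @ v

v = np.zeros(2)
samples = np.empty((50_000, 2))
for i in range(len(samples)):
    for j in (0, 1):                    # blocks b_1 = {x_1}, b_2 = {x_2}
        w = v.copy()
        w[j] += 0.5 * rng.normal()      # symmetric per-block proposal
        if np.log(rng.uniform()) < log_p(w) - log_p(v):
            v = w
    samples[i] = v

keep = samples[5_000:]                  # discard burn-in
corr = float(np.corrcoef(keep.T)[0, 1])
print(corr)   # near the target correlation 0.9
```

With such strong correlation, these tiny single-coordinate blocks crawl along the ridge; grouping the two coordinates into one block, as the slide suggests, would mix faster.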