Gibbs sampling


  1. Gibbs sampling. Dr. Jarad Niemi, Iowa State University. March 29, 2018.

  2. Outline: Two-component Gibbs sampler; Full conditional distribution; K-component Gibbs sampler; Blocked Gibbs sampler; Metropolis-within-Gibbs; Slice sampler; Latent variable augmentation.

  3. Two-component Gibbs sampler. Suppose our target distribution is p(θ | y) with θ = (θ_1, θ_2), and we can sample from p(θ_1 | θ_2, y) and p(θ_2 | θ_1, y). Beginning with an initial value (θ_1^(0), θ_2^(0)), an iteration of the Gibbs sampler involves
     1. Sampling θ_1^(t) ∼ p(θ_1 | θ_2^(t−1), y).
     2. Sampling θ_2^(t) ∼ p(θ_2 | θ_1^(t), y).
  Thus, in order to run a Gibbs sampler, we need to derive the full conditional distributions for θ_1 and θ_2, i.e. the distribution of each parameter conditional on everything else.

  4. Two-component Gibbs sampler: Bivariate normal example. Let our target be θ ∼ N_2(0, Σ) with
     Σ = [ 1  ρ ]
         [ ρ  1 ].
  Then the conditional distributions are
     θ_1 | θ_2 ∼ N(ρ θ_2, 1 − ρ^2)
     θ_2 | θ_1 ∼ N(ρ θ_1, 1 − ρ^2).
  Assuming initial value (θ_1^(0), θ_2^(0)), the Gibbs sampler proceeds as follows:
     Iteration 1: sample θ_1^(1) ∼ N(ρ θ_2^(0), 1 − ρ^2), then sample θ_2^(1) ∼ N(ρ θ_1^(1), 1 − ρ^2).
     ...
     Iteration t: sample θ_1^(t) ∼ N(ρ θ_2^(t−1), 1 − ρ^2), then sample θ_2^(t) ∼ N(ρ θ_1^(t), 1 − ρ^2).

  5. Two-component Gibbs sampler: Bivariate normal example. R code for the bivariate normal Gibbs sampler:

     gibbs_bivariate_normal = function(theta0, n_points, rho) {
       theta = matrix(theta0, nrow=n_points, ncol=2, byrow=TRUE)
       v = sqrt(1-rho^2)
       for (i in 2:n_points) {
         theta[i,1] = rnorm(1, rho*theta[i-1,2], v)
         theta[i,2] = rnorm(1, rho*theta[i  ,1], v)
       }
       return(theta)
     }

     theta = gibbs_bivariate_normal(c(-3,3), n<-20, rho=rho<-0.9)

  6. Two-component Gibbs sampler: Bivariate normal example. [Figure: the Gibbs sampler draws plotted in the (θ_1, θ_2) plane.]

  7. Two-component Gibbs sampler: Normal model. Suppose Y_i ∼ N(µ, σ^2) independently and we assume the prior
     µ ∼ N(m, C) and σ^2 ∼ Inv-χ^2(v, s^2).
  Note: this is NOT the conjugate prior. The full posterior we are interested in is
     p(µ, σ^2 | y) ∝ (σ^2)^(−n/2) exp( −Σ_{i=1}^n (y_i − µ)^2 / (2σ^2) ) exp( −(µ − m)^2 / (2C) ) × (σ^2)^(−(v/2+1)) exp( −v s^2 / (2σ^2) ).
  To run the Gibbs sampler, we need to derive µ | σ^2, y and σ^2 | µ, y.

  8. Two-component Gibbs sampler: Normal model. Derive µ | σ^2, y. Recall
     p(µ, σ^2 | y) ∝ (σ^2)^(−n/2) exp( −Σ_{i=1}^n (y_i − µ)^2 / (2σ^2) ) exp( −(µ − m)^2 / (2C) ) × (σ^2)^(−(v/2+1)) exp( −v s^2 / (2σ^2) ).
  Now find µ | σ^2, y:
     p(µ | σ^2, y) ∝ p(µ, σ^2 | y)
                   ∝ exp( −Σ_{i=1}^n (y_i − µ)^2 / (2σ^2) ) exp( −(µ − m)^2 / (2C) )
                   ∝ exp( −(1/2) [ 1/(σ^2/n) + 1/C ] ( µ^2 − 2µ [ 1/(σ^2/n) + 1/C ]^(−1) [ ȳ/(σ^2/n) + m/C ] ) )
  thus µ | σ^2, y ∼ N(m′, C′) where
     m′ = C′ ( ȳ/(σ^2/n) + m/C )
     C′ = ( 1/(σ^2/n) + 1/C )^(−1).

  9. Two-component Gibbs sampler: Normal model. Derive σ^2 | µ, y. Recall
     p(µ, σ^2 | y) ∝ (σ^2)^(−n/2) exp( −Σ_{i=1}^n (y_i − µ)^2 / (2σ^2) ) exp( −(µ − m)^2 / (2C) ) × (σ^2)^(−(v/2+1)) exp( −v s^2 / (2σ^2) ).
  Now find σ^2 | µ, y:
     p(σ^2 | µ, y) ∝ p(µ, σ^2 | y)
                   ∝ (σ^2)^(−([v+n]/2+1)) exp( −[ v s^2 + Σ_{i=1}^n (y_i − µ)^2 ] / (2σ^2) )
  and thus σ^2 | µ, y ∼ Inv-χ^2(v′, (s′)^2) where
     v′ = v + n
     v′ (s′)^2 = v s^2 + Σ_{i=1}^n (y_i − µ)^2.
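  The scaled inverse chi-squared distribution Inv-χ^2(v′, (s′)^2) is the same as an Inverse-Gamma(v′/2, v′(s′)^2/2) distribution, which is what the R code on the next slides exploits. A minimal sketch of the draw (the numeric values of vp and vpsp2 below are illustrative placeholders, not from the slides):

     # If sigma2 ~ Inv-chi^2(v', (s')^2), equivalently Inverse-Gamma(v'/2, v'(s')^2/2),
     # a draw is the reciprocal of a gamma draw with shape v'/2 and rate v'(s')^2/2.
     vp     = 11        # v' = v + n (illustrative value)
     vpsp2  = 9.3       # v'(s')^2 = v*s^2 + sum((y-mu)^2) (illustrative value)
     sigma2 = 1 / rgamma(1, shape = vp/2, rate = vpsp2/2)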

  10. Two-component Gibbs sampler: Normal model. R code for the Gibbs sampler (setup):

     # Data and prior
     y = rnorm(10)
     m = 0; C = 10
     v = 1; s = 1

     # Initial values
     mu = 0
     sigma2 = 1

     # Save structures
     n_iter = 1000
     mu_keep = rep(NA, n_iter)
     sigma_keep = rep(NA, n_iter)

     # Pre-calculate
     n = length(y)
     sum_y = sum(y)
     vp = v+n
     vs2 = v*s^2

  11. Two-component Gibbs sampler: Normal model. R code for the Gibbs sampler (sampling loop):

     # Gibbs sampler
     for (i in 1:n_iter) {
       # Sample mu
       Cp = 1/(n/sigma2+1/C)
       mp = Cp*(sum_y/sigma2+m/C)
       mu = rnorm(1, mp, sqrt(Cp))

       # Sample sigma
       vpsp2 = vs2 + sum((y-mu)^2)
       sigma2 = 1/rgamma(1, vp/2, vpsp2/2)

       # Save iterations
       mu_keep[i] = mu
       sigma_keep[i] = sqrt(sigma2)
     }
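  Once the loop has run, the saved draws can be summarized directly. A minimal sketch (discarding the first 100 draws as burn-in is an arbitrary illustrative choice, not from the slides):

     # Posterior means and 95% credible intervals from the saved draws
     burn = 100
     mean(mu_keep[-(1:burn)]);    quantile(mu_keep[-(1:burn)],    c(0.025, 0.975))
     mean(sigma_keep[-(1:burn)]); quantile(sigma_keep[-(1:burn)], c(0.025, 0.975))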

  12. Two-component Gibbs sampler: Normal model. Posteriors. [Figure: traceplots and posterior density estimates of mu and sigma based on the 1000 saved draws.]

  13. K-component Gibbs sampler. Suppose θ = (θ_1, ..., θ_K); then an iteration of a K-component Gibbs sampler is
     θ_1^(t) ∼ p(θ_1 | θ_2^(t−1), ..., θ_K^(t−1), y)
     θ_2^(t) ∼ p(θ_2 | θ_1^(t), θ_3^(t−1), ..., θ_K^(t−1), y)
     ...
     θ_k^(t) ∼ p(θ_k | θ_1^(t), ..., θ_{k−1}^(t), θ_{k+1}^(t−1), ..., θ_K^(t−1), y)
     ...
     θ_K^(t) ∼ p(θ_K | θ_1^(t), ..., θ_{K−1}^(t), y).
  The distributions above are called the full conditional distributions. If some of the θ_k are vectors, then this is called a block Gibbs sampler.

  14. K-component Gibbs sampler: Hierarchical normal model. Let
     Y_ij ∼ N(µ_i, σ^2) and µ_i ∼ N(η, τ^2), independently,
  for i = 1, ..., I, j = 1, ..., n_i, with n = Σ_{i=1}^I n_i, and prior
     p(η, τ^2, σ^2) ∝ IG(τ^2; a_τ, b_τ) IG(σ^2; a_σ, b_σ).
  The full conditionals are
     p(µ | η, σ^2, τ^2, y) = Π_{i=1}^I p(µ_i | η, σ^2, τ^2, y_i)
     µ_i | η, σ^2, τ^2, y_i ∼ N( [1/(σ^2/n_i) + 1/τ^2]^(−1) [ ȳ_i/(σ^2/n_i) + η/τ^2 ], [1/(σ^2/n_i) + 1/τ^2]^(−1) )
     η | µ, σ^2, τ^2, y ∼ N( µ̄, τ^2/I )
     σ^2 | µ, η, τ^2, y ∼ IG( a_σ + n/2, b_σ + Σ_{i=1}^I Σ_{j=1}^{n_i} (y_ij − µ_i)^2 / 2 )
     τ^2 | µ, η, σ^2, y ∼ IG( a_τ + I/2, b_τ + Σ_{i=1}^I (µ_i − η)^2 / 2 )
  where ȳ_i = Σ_{j=1}^{n_i} y_ij / n_i and µ̄ = Σ_{i=1}^I µ_i / I.
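  Since each of these full conditionals is a standard distribution, a block Gibbs sampler can cycle through them directly. Below is a minimal sketch in R; the function name, the data layout (y as a list of length I whose i-th element holds the n_i observations of group i), and the initial values are illustrative assumptions, not from the slides:

     # Block Gibbs sampler for the hierarchical normal model with IG priors on sigma^2 and tau^2
     gibbs_hierarchical = function(y, n_iter, a_sigma, b_sigma, a_tau, b_tau) {
       I    = length(y)
       n_i  = sapply(y, length)
       n    = sum(n_i)
       ybar = sapply(y, mean)

       # Initial values (assumptions)
       mu = ybar; eta = mean(ybar); sigma2 = 1; tau2 = 1

       keep = matrix(NA, n_iter, I + 3)
       for (t in 1:n_iter) {
         # Sample each mu_i from its normal full conditional
         Cp = 1 / (n_i / sigma2 + 1 / tau2)
         mp = Cp * (n_i * ybar / sigma2 + eta / tau2)
         mu = rnorm(I, mp, sqrt(Cp))

         # Sample eta from N(mean(mu), tau2/I)
         eta = rnorm(1, mean(mu), sqrt(tau2 / I))

         # Sample sigma2 from its inverse-gamma full conditional
         sse    = sum(sapply(1:I, function(i) sum((y[[i]] - mu[i])^2)))
         sigma2 = 1 / rgamma(1, a_sigma + n / 2, b_sigma + sse / 2)

         # Sample tau2 from its inverse-gamma full conditional
         tau2 = 1 / rgamma(1, a_tau + I / 2, b_tau + sum((mu - eta)^2) / 2)

         keep[t, ] = c(mu, eta, sigma2, tau2)
       }
       colnames(keep) = c(paste0("mu", 1:I), "eta", "sigma2", "tau2")
       return(keep)
     }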

  15. Metropolis-within-Gibbs. We have discussed two Markov chain approaches to sample from a target distribution: the Metropolis-Hastings algorithm and Gibbs sampling. Gibbs sampling assumed we can sample from p(θ_k | θ_{−k}, y) for all k, but what if we cannot sample from all of these full conditional distributions? For those p(θ_k | θ_{−k}, y) that cannot be sampled directly, a single iteration of the Metropolis-Hastings algorithm can be substituted.

  16. Metropolis-within-Gibbs: Bivariate normal with ρ = 0.9. Reconsider the bivariate normal example, substituting a Metropolis step in place of a Gibbs step:

     gibbs_and_metropolis = function(theta0, n_points, rho) {
       theta = matrix(theta0, nrow=n_points, ncol=2, byrow=TRUE)
       v = sqrt(1-rho^2)
       for (i in 2:n_points) {
         theta[i,1] = rnorm(1, rho*theta[i-1,2], v)

         # Now do a random-walk Metropolis step
         theta_prop = rnorm(1, theta[i-1,2], 2.4*v) # optimal proposal variance
         logr = dnorm(theta_prop,   rho*theta[i,1], v, log=TRUE) -
                dnorm(theta[i-1,2], rho*theta[i,1], v, log=TRUE)
         theta[i,2] = ifelse(log(runif(1))<logr, theta_prop, theta[i-1,2])
       }
       return(theta)
     }

     theta = gibbs_and_metropolis(c(-3,3), n, rho)
     length(unique(theta[,2]))/length(theta[,2]) # acceptance rate
     [1] 0.5

  17. Metropolis-within-Gibbs. [Figure: draws from the Metropolis-within-Gibbs sampler plotted in the (θ_1, θ_2) plane.]

  18. Metropolis-within-Gibbs: Hierarchical normal model. Let
     Y_ij ∼ N(µ_i, σ^2) and µ_i ∼ N(η, τ^2), independently,
  for i = 1, ..., I, j = 1, ..., n_i, with n = Σ_{i=1}^I n_i, and prior
     p(η, τ, σ) ∝ Ca⁺(τ; 0, b_τ) Ca⁺(σ; 0, b_σ).
  The full conditionals are exactly the same except
     p(σ | µ, η, τ^2, y) ∝ IG( σ^2; n/2, Σ_{i=1}^I Σ_{j=1}^{n_i} (y_ij − µ_i)^2 / 2 ) Ca⁺(σ; 0, b_σ)
     p(τ^2 | µ, η, σ^2, y) ∝ IG( τ^2; I/2, Σ_{i=1}^I (µ_i − η)^2 / 2 ) Ca⁺(τ; 0, b_τ)
  where ȳ_i = Σ_{j=1}^{n_i} y_ij / n_i and µ̄ = Σ_{i=1}^I µ_i / I.
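  Under the half-Cauchy priors these two full conditionals are no longer standard distributions, so each can be updated with a single Metropolis step inside the Gibbs sampler. A minimal sketch for σ, assuming y is a list of length I of group observation vectors and mu is the current vector of group means; the function names, random-walk step size, and data layout are illustrative assumptions, not from the slides:

     # Log of the full conditional of sigma (up to a constant):
     # normal likelihood terms involving sigma plus the log half-Cauchy prior
     log_target_sigma = function(sigma, mu, y, b_sigma) {
       if (sigma <= 0) return(-Inf)                     # sigma is restricted to be positive
       sse = sum(sapply(seq_along(y), function(i) sum((y[[i]] - mu[i])^2)))
       -length(unlist(y)) * log(sigma) - sse / (2 * sigma^2) +
         dcauchy(sigma, 0, b_sigma, log = TRUE)
     }

     # One random-walk Metropolis update for sigma within the Gibbs sampler
     metropolis_sigma = function(sigma, mu, y, b_sigma, step = 0.1) {
       sigma_prop = rnorm(1, sigma, step)               # symmetric proposal
       logr = log_target_sigma(sigma_prop, mu, y, b_sigma) -
              log_target_sigma(sigma,      mu, y, b_sigma)
       if (log(runif(1)) < logr) sigma_prop else sigma  # accept or keep current value
     }

  An analogous step can be used for τ, replacing the likelihood contribution with the terms involving the µ_i.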
