Introduction to Bayesian Statistics, Lecture 7: Multiparameter models (III)



  1. Introduction to Bayesian Statistics, Lecture 7: Multiparameter models (III). Rung-Ching Tsai, Department of Mathematics, National Taiwan Normal University. April 15, 2015.

  2. Multiparameter model: the multinomial model
  • y = (y_1, ..., y_J) ~ multinomial(n; θ_1, ..., θ_J) with Σ_{j=1}^J y_j = n; use the Bayesian approach to estimate θ = (θ_1, ..., θ_J):
    ◦ Likelihood:
        p(y | θ) ∝ ∏_{j=1}^J θ_j^{y_j}
    ◦ Prior of θ: choose the conjugate prior, a Dirichlet distribution Dirichlet(α_1, ..., α_J):
        p(θ | α) ∝ ∏_{j=1}^J θ_j^{α_j − 1},  with Σ_{j=1}^J θ_j = 1,
      where the Dirichlet is a multivariate generalization of the beta distribution.
    ◦ Posterior of θ:
        p(θ | y) ∝ p(θ) p(y | θ) ∝ ∏_{j=1}^J θ_j^{α_j + y_j − 1},  i.e., θ | y ~ Dirichlet(α_1 + y_1, ..., α_J + y_J)
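The Dirichlet-multinomial update above is a one-line computation, and posterior draws of θ are directly available. A minimal sketch in Python; the counts, prior, and draw count below are illustrative, not from the lecture:

```python
import numpy as np

def dirichlet_posterior(alpha, y):
    """Conjugate update: Dirichlet(alpha) prior + multinomial counts y
    gives a Dirichlet(alpha + y) posterior."""
    return np.asarray(alpha, dtype=float) + np.asarray(y, dtype=float)

# illustrative data: J = 3 categories, n = 100 trials
alpha = np.array([1.0, 1.0, 1.0])   # uniform Dirichlet prior
y = np.array([45, 35, 20])          # observed counts, summing to n

post = dirichlet_posterior(alpha, y)
post_mean = post / post.sum()       # E[theta_j | y] = (alpha_j + y_j) / sum_k(alpha_k + y_k)

rng = np.random.default_rng(0)
draws = rng.dirichlet(post, size=5000)   # posterior draws of theta, each row sums to 1
```

Each row of `draws` is a simulated probability vector θ | y, which can be used to summarize any function of θ (ratios, differences, ranks) without further algebra.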

  3. Multiparameter model: the multivariate normal model
  • y_1, ..., y_n iid ~ MVN(μ, Σ), Σ known; use the Bayesian approach to estimate μ.
    ◦ Choose a conjugate prior for μ: μ ~ MVN(μ_0, Λ_0), with
        p(μ) ∝ |Λ_0|^{−1/2} exp( −(1/2) (μ − μ_0)^T Λ_0^{−1} (μ − μ_0) )
    ◦ Likelihood of μ:
        p(y_1, ..., y_n | μ, Σ) ∝ |Σ|^{−n/2} exp( −(1/2) Σ_{i=1}^n (y_i − μ)^T Σ^{−1} (y_i − μ) )
                                = |Σ|^{−n/2} exp( −(1/2) tr(Σ^{−1} S_0) ),
      where S_0 = Σ_{i=1}^n (y_i − μ)(y_i − μ)^T.

  4. Multiparameter model: multivariate normal, Σ known
  • y_1, ..., y_n iid ~ MVN(μ, Σ), Σ known; use the Bayesian approach to estimate μ.
    ◦ Find the posterior distribution of μ:
        p(μ | y_1, ..., y_n, Σ) ∝ p(μ) p(y_1, ..., y_n | μ, Σ)
          ∝ exp( −(1/2) [ (μ − μ_0)^T Λ_0^{−1} (μ − μ_0) + Σ_{i=1}^n (y_i − μ)^T Σ^{−1} (y_i − μ) ] )
          ∝ exp( −(1/2) (μ − μ_n)^T Λ_n^{−1} (μ − μ_n) ),
      that is, μ | y_1, ..., y_n, Σ ~ MVN(μ_n, Λ_n), where
        μ_n = (Λ_0^{−1} + n Σ^{−1})^{−1} (Λ_0^{−1} μ_0 + n Σ^{−1} ȳ)  and  Λ_n^{−1} = Λ_0^{−1} + n Σ^{−1}.
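The precision-weighted update formulas translate directly into code. A sketch, with made-up prior and data values for illustration:

```python
import numpy as np

def mvn_posterior_known_sigma(y, mu0, Lambda0, Sigma):
    """Posterior of mu when y_i ~ MVN(mu, Sigma), Sigma known,
    and prior mu ~ MVN(mu0, Lambda0):
        Lambda_n^{-1} = Lambda0^{-1} + n Sigma^{-1}
        mu_n = Lambda_n (Lambda0^{-1} mu0 + n Sigma^{-1} ybar)
    """
    y = np.atleast_2d(y)
    n, ybar = y.shape[0], y.mean(axis=0)
    L0inv, Sinv = np.linalg.inv(Lambda0), np.linalg.inv(Sigma)
    Lambda_n = np.linalg.inv(L0inv + n * Sinv)          # posterior covariance
    mu_n = Lambda_n @ (L0inv @ np.asarray(mu0) + n * Sinv @ ybar)
    return mu_n, Lambda_n

# illustrative: d = 2, identity prior and data covariance, n = 3 observations
y = np.array([[1.0, 2.0], [3.0, 0.0], [2.0, 1.0]])
mu_n, Lambda_n = mvn_posterior_known_sigma(
    y, mu0=np.zeros(2), Lambda0=np.eye(2), Sigma=np.eye(2))
```

With Λ_0 = Σ = I this reduces to μ_n = (μ_0 + n ȳ)/(1 + n) and Λ_n = I/(1 + n), the familiar shrinkage of the sample mean toward the prior mean.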

  5. Multiparameter model: multivariate normal, Σ known
  • μ | y_1, ..., y_n, Σ ~ MVN(μ_n, Λ_n), where
      μ_n = (Λ_0^{−1} + n Σ^{−1})^{−1} (Λ_0^{−1} μ_0 + n Σ^{−1} ȳ)  and  Λ_n^{−1} = Λ_0^{−1} + n Σ^{−1}.
  • Partition μ = (μ^{(1)}, μ^{(2)}), μ_n = (μ_n^{(1)}, μ_n^{(2)}), and
      Λ_n = [ Λ_n^{(11)}  Λ_n^{(12)} ; Λ_n^{(21)}  Λ_n^{(22)} ].
    ◦ Posterior marginal distribution of a subvector of μ:
        μ^{(1)} | y_1, ..., y_n, Σ ~ MVN( μ_n^{(1)}, Λ_n^{(11)} )
    ◦ Posterior conditional distribution of a subvector of μ:
        μ^{(1)} | μ^{(2)}, y_1, ..., y_n, Σ ~ MVN( μ_n^{(1)} + β^{1|2} (μ^{(2)} − μ_n^{(2)}), Λ^{1|2} ),
      where β^{1|2} = Λ_n^{(12)} (Λ_n^{(22)})^{−1} and Λ^{1|2} = Λ_n^{(11)} − Λ_n^{(12)} (Λ_n^{(22)})^{−1} Λ_n^{(21)}.
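These are the standard Gaussian partitioning identities, and they are easy to compute with block indexing. A small sketch; the partition indices and the 2×2 numbers below are illustrative:

```python
import numpy as np

def gaussian_conditional(mu, Lam, idx1, idx2, x2):
    """For x ~ MVN(mu, Lam) partitioned into blocks 1 and 2, return the
    mean and covariance of x1 | x2:
        mean = mu1 + Lam12 Lam22^{-1} (x2 - mu2)
        cov  = Lam11 - Lam12 Lam22^{-1} Lam21   (Schur complement)
    """
    mu, Lam = np.asarray(mu), np.asarray(Lam)
    mu1, mu2 = mu[idx1], mu[idx2]
    L11 = Lam[np.ix_(idx1, idx1)]
    L12 = Lam[np.ix_(idx1, idx2)]
    L22 = Lam[np.ix_(idx2, idx2)]
    beta = L12 @ np.linalg.inv(L22)          # regression coefficients beta^{1|2}
    mean = mu1 + beta @ (np.asarray(x2) - mu2)
    cov = L11 - beta @ L12.T
    return mean, cov

# illustrative 2-d case: condition the first coordinate on the second
mean, cov = gaussian_conditional(
    mu=np.zeros(2), Lam=np.array([[2.0, 1.0], [1.0, 2.0]]),
    idx1=[0], idx2=[1], x2=[1.0])
```

Here β^{1|2} = 1/2, so observing μ^{(2)} = 1 pulls the conditional mean of μ^{(1)} to 0.5 and shrinks its variance from 2 to 1.5.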

  6. Multiparameter model: multivariate normal, Σ known
  • μ | y_1, ..., y_n, Σ ~ MVN(μ_n, Λ_n), where
      μ_n = (Λ_0^{−1} + n Σ^{−1})^{−1} (Λ_0^{−1} μ_0 + n Σ^{−1} ȳ)  and  Λ_n^{−1} = Λ_0^{−1} + n Σ^{−1}.
  • Let ỹ ~ MVN(μ, Σ) be a new observation.
    ◦ Posterior predictive distribution of ỹ, Σ known:
        p(ỹ, μ | y_1, ..., y_n) = N(ỹ | μ, Σ) N(μ | μ_n, Λ_n)
      is the exponential of a quadratic form in (ỹ, μ), hence ỹ | y ~ N(μ_n, Σ + Λ_n), where
        E(ỹ | y) = E( E(ỹ | μ, y) | y ) = E(μ | y) = μ_n
        var(ỹ | y) = E( var(ỹ | μ, y) | y ) + var( E(ỹ | μ, y) | y ) = E(Σ | y) + var(μ | y) = Σ + Λ_n.
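The identity var(ỹ | y) = Σ + Λ_n can be checked by Monte Carlo composition: draw μ from its posterior, then ỹ given μ. The posterior quantities below are made up for a 2-d illustration:

```python
import numpy as np

rng = np.random.default_rng(1)

# made-up posterior quantities for a 2-d illustration
mu_n = np.array([0.5, -0.2])
Lambda_n = np.array([[0.20, 0.05], [0.05, 0.10]])
Sigma = np.array([[1.0, 0.3], [0.3, 1.0]])

# composition: mu | y ~ MVN(mu_n, Lambda_n), then y_tilde | mu ~ MVN(mu, Sigma)
n_draws = 20000
mus = rng.multivariate_normal(mu_n, Lambda_n, size=n_draws)
y_tilde = mus + rng.multivariate_normal(np.zeros(2), Sigma, size=n_draws)

emp_mean = y_tilde.mean(axis=0)   # should approach mu_n
emp_cov = np.cov(y_tilde.T)       # should approach Sigma + Lambda_n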

  7. Multiparameter model: multivariate normal, Σ known
  • y_1, ..., y_n iid ~ MVN(μ, Σ), Σ known; use the Bayesian approach to estimate μ.
    ◦ Prior for μ: choose a non-informative prior, p(μ) ∝ 1.
    ◦ Likelihood of μ:
        p(y_1, ..., y_n | μ, Σ) ∝ |Σ|^{−n/2} exp( −(1/2) Σ_{i=1}^n (y_i − μ)^T Σ^{−1} (y_i − μ) )
                                = |Σ|^{−n/2} exp( −(1/2) tr(Σ^{−1} S_0) ),
      where S_0 = Σ_{i=1}^n (y_i − μ)(y_i − μ)^T.
    ◦ Posterior for μ:
        p(μ | y_1, ..., y_n, Σ) ∝ p(μ) p(y_1, ..., y_n | μ, Σ) ∝ p(y_1, ..., y_n | μ, Σ),
      i.e., μ | Σ, y_1, ..., y_n ~ MVN(ȳ, Σ/n).

  8. Multivariate normal model, Σ unknown
  • y_1, ..., y_n iid ~ MVN(μ, Σ), both μ and Σ unknown; use the Bayesian approach to estimate (μ, Σ).
    ◦ Take a conjugate prior for (μ, Σ): p(μ, Σ) = p(Σ) p(μ | Σ), with
        Σ ~ Inv-Wishart_{ν_0}(Λ_0^{−1})
        μ | Σ ~ MVN(μ_0, Σ/κ_0),
      i.e., the joint prior density is
        p(μ, Σ) ∝ |Σ|^{−((ν_0 + d)/2 + 1)} exp( −(1/2) tr(Λ_0 Σ^{−1}) − (κ_0/2) (μ − μ_0)^T Σ^{−1} (μ − μ_0) ).
      We label this the N-Inverse-Wishart(μ_0, Λ_0/κ_0; ν_0, Λ_0).
    ◦ Likelihood:
        p(y_1, ..., y_n | μ, Σ) ∝ |Σ|^{−n/2} exp( −(1/2) tr(Σ^{−1} S_0) ),
      where S_0 = Σ_{i=1}^n (y_i − μ)(y_i − μ)^T.

  9. Joint posterior distribution, p(μ, Σ | y_1, ..., y_n)
  • y_1, ..., y_n iid ~ MVN(μ, Σ)
    ◦ Prior of (μ, Σ): (μ, Σ) ~ N-Inverse-Wishart(μ_0, Λ_0/κ_0; ν_0, Λ_0)
    ◦ The joint posterior distribution of (μ, Σ):
        p(μ, Σ | y_1, ..., y_n) ∝ p(μ, Σ) p(y_1, ..., y_n | μ, Σ)
          ∝ |Σ|^{−((ν_0 + d)/2 + 1)} exp( −(1/2) tr(Λ_0 Σ^{−1}) − (κ_0/2) (μ − μ_0)^T Σ^{−1} (μ − μ_0) )
            × |Σ|^{−n/2} exp( −(1/2) tr(Σ^{−1} S_0) )
          = N-Inv-Wishart(μ_n, Λ_n/κ_n; ν_n, Λ_n),   (1)
      where
        • μ_n = κ_0/(κ_0 + n) μ_0 + n/(κ_0 + n) ȳ
        • κ_n = κ_0 + n
        • ν_n = ν_0 + n
        • Λ_n = Λ_0 + S + κ_0 n/(κ_0 + n) (ȳ − μ_0)(ȳ − μ_0)^T, with S = Σ_{i=1}^n (y_i − ȳ)(y_i − ȳ)^T.
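The four update formulas can be packaged as a single function. A sketch; the toy data and hyperparameter values are illustrative:

```python
import numpy as np

def niw_posterior(y, mu0, kappa0, nu0, Lambda0):
    """Normal-Inverse-Wishart conjugate update:
        mu_n     = (kappa0 mu0 + n ybar) / (kappa0 + n)
        kappa_n  = kappa0 + n
        nu_n     = nu0 + n
        Lambda_n = Lambda0 + S + kappa0 n/(kappa0+n) (ybar-mu0)(ybar-mu0)^T
    with S the centered sum of squares of the data.
    """
    y = np.atleast_2d(y)
    n, ybar = y.shape[0], y.mean(axis=0)
    S = (y - ybar).T @ (y - ybar)
    kappa_n = kappa0 + n
    nu_n = nu0 + n
    mu_n = (kappa0 * np.asarray(mu0) + n * ybar) / kappa_n
    d = ybar - np.asarray(mu0)
    Lambda_n = Lambda0 + S + (kappa0 * n / kappa_n) * np.outer(d, d)
    return mu_n, kappa_n, nu_n, Lambda_n

# illustrative: d = 2, n = 2 observations, weak prior
y = np.array([[0.0, 0.0], [2.0, 2.0]])
mu_n, kappa_n, nu_n, Lambda_n = niw_posterior(
    y, mu0=np.zeros(2), kappa0=1.0, nu0=4, Lambda0=np.eye(2))
```

Note how Λ_n combines three pieces: the prior scale Λ_0, the within-sample scatter S, and a term that penalizes disagreement between ȳ and the prior mean μ_0.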

  10. Conditional posterior distribution, p(μ | Σ, y_1, ..., y_n)
  • p(μ, Σ | y_1, ..., y_n) = p(μ | Σ, y_1, ..., y_n) p(Σ | y_1, ..., y_n)
  • The conditional posterior density of μ given Σ is proportional to the joint posterior density (1) with Σ held constant:
      μ | Σ, y_1, ..., y_n ~ MVN(μ_n, Σ/κ_n)

  11. Marginal posterior distribution, p(Σ | y_1, ..., y_n)
  • p(μ, Σ | y_1, ..., y_n) = p(μ | Σ, y_1, ..., y_n) p(Σ | y_1, ..., y_n)
  • p(Σ | y_1, ..., y_n) requires averaging the joint distribution p(μ, Σ | y_1, ..., y_n) over μ; as a result,
      Σ | y_1, ..., y_n ~ Inv-Wishart_{ν_n}(Λ_n^{−1}),
    where Λ_n = Λ_0 + S + κ_0 n/(κ_0 + n) (ȳ − μ_0)(ȳ − μ_0)^T, with S = Σ_{i=1}^n (y_i − ȳ)(y_i − ȳ)^T.

  12. Marginal posterior distribution of μ, p(μ | y_1, ..., y_n)
  • Estimand of interest: μ.
  • To obtain the marginal posterior distribution of μ:
    ◦ Analytically, our result for the univariate normal generalizes to the multivariate case:
        μ | y_1, ..., y_n ~ t_{ν_n − d + 1}( μ_n, Λ_n / (κ_n (ν_n − d + 1)) ),
      where
        • μ_n = κ_0/(κ_0 + n) μ_0 + n/(κ_0 + n) ȳ
        • κ_n = κ_0 + n, ν_n = ν_0 + n
        • Λ_n = Λ_0 + S + κ_0 n/(κ_0 + n) (ȳ − μ_0)(ȳ − μ_0)^T, with S = Σ_{i=1}^n (y_i − ȳ)(y_i − ȳ)^T.
    ◦ By simulation:
      • first draw Σ from p(Σ | y_1, ..., y_n), with Σ | y_1, ..., y_n ~ Inv-Wishart_{ν_n}(Λ_n^{−1}),
      • then draw μ from p(μ | Σ, y_1, ..., y_n), with μ | Σ, y_1, ..., y_n ~ MVN(μ_n, Σ/κ_n).
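The two-step simulation translates directly to code using SciPy's inverse-Wishart sampler (whose `scale` argument plays the role of Λ_n in this notation). The hyperparameter values below are made up for illustration:

```python
import numpy as np
from scipy.stats import invwishart

rng = np.random.default_rng(2)

# illustrative posterior hyperparameters (d = 2)
mu_n = np.array([0.0, 1.0])
kappa_n = 12.0
nu_n = 14
Lambda_n = np.array([[2.0, 0.5], [0.5, 1.5]])

# step 1: Sigma | y ~ Inv-Wishart_{nu_n}(Lambda_n^{-1})
# (in scipy's parameterization: invwishart(df=nu_n, scale=Lambda_n))
Sigmas = invwishart.rvs(df=nu_n, scale=Lambda_n, size=2000, random_state=rng)

# step 2: mu | Sigma, y ~ MVN(mu_n, Sigma / kappa_n)
mus = np.array([rng.multivariate_normal(mu_n, S / kappa_n) for S in Sigmas])
```

The resulting `mus` are draws from the marginal posterior of μ, so their empirical distribution should match the multivariate t on the slide without ever evaluating a t density.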

  13. The multivariate normal model: non-informative prior
  • y_1, ..., y_n iid ~ MVN(μ, Σ), both μ and Σ unknown; use the Bayesian approach to estimate (μ, Σ).
    ◦ A common non-informative prior is the Jeffreys prior density:
        p(μ, Σ) ∝ |Σ|^{−(d+1)/2},
      which is the limit of the conjugate prior density as κ_0 → 0, ν_0 → −1, |Λ_0| → 0.
    ◦ The marginal and conditional posterior densities are then
        Σ | y_1, ..., y_n ~ Inv-Wishart_{n−1}(S^{−1}),
        μ | Σ, y_1, ..., y_n ~ MVN(ȳ, Σ/n).
    ◦ Marginal posterior of μ:
        μ | y_1, ..., y_n ~ t_{n−d}( ȳ, S / (n (n − d)) ).
