
Probabilistic Graphical Models - PowerPoint PPT Presentation



  1. Probabilistic Graphical Models: Gaussian Network Models. Fall 2019, Siamak Ravanbakhsh.

  2. Learning objectives: the multivariate Gaussian density and its different parametrizations; marginalization and conditioning; expression as Markov & Bayesian networks.

  3. Univariate Gaussian density: p(x; μ, σ) = (1/√(2πσ²)) exp(−(x − μ)²/(2σ²)), with μ ∈ ℝ and σ² > 0. It is motivated by the central limit theorem and is the maximum-entropy distribution with a fixed variance. Moments: E[X] = μ and E[X²] − E[X]² = σ².
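As a quick numeric check of the formula above (not part of the slides), the short Python sketch below evaluates the density directly and compares it with scipy.stats.norm; the helper name univariate_gaussian_pdf and the numbers are just for illustration.

```python
import numpy as np
from scipy.stats import norm

def univariate_gaussian_pdf(x, mu, sigma):
    """Evaluate p(x; mu, sigma) = exp(-(x - mu)^2 / (2 sigma^2)) / sqrt(2 pi sigma^2)."""
    return np.exp(-(x - mu) ** 2 / (2 * sigma ** 2)) / np.sqrt(2 * np.pi * sigma ** 2)

x, mu, sigma = 1.3, 0.5, 2.0
print(univariate_gaussian_pdf(x, mu, sigma))   # direct evaluation of the slide's formula
print(norm.pdf(x, loc=mu, scale=sigma))        # SciPy reference value (should agree)
```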

  4. Multivariate Gaussian: x ∈ ℝⁿ is a column vector (convention). p(x; μ, Σ) = (1/√|2πΣ|) exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ)). Compare to the univariate p(x; μ, σ) = (1/√(2πσ²)) exp(−(x − μ)²/(2σ²)). Note that 1/√|2πΣ| = (2π)^(−n/2) |Σ|^(−1/2).

  5. Multivariate Gaussian: sufficient statistics. p(x; μ, Σ) = (1/√|2πΣ|) exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ)), where μ = E[X] and Σ is the n × n covariance matrix: Σᵢ,ᵢ = Var(Xᵢ), Σᵢ,ⱼ = Cov(Xᵢ, Xⱼ), or equivalently Σ = E[XXᵀ] − E[X]E[X]ᵀ. The Gaussian only captures these two statistics.
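A minimal sketch (my own, not from the slides) that evaluates this multivariate density directly and checks it against scipy.stats.multivariate_normal; the helper mvn_pdf and the particular μ, Σ, x below are made-up illustration values.

```python
import numpy as np
from scipy.stats import multivariate_normal

def mvn_pdf(x, mu, Sigma):
    """Moment-form density: exp(-0.5 (x-mu)^T Sigma^{-1} (x-mu)) / sqrt(|2 pi Sigma|)."""
    d = x - mu
    n = len(mu)
    norm_const = np.sqrt((2 * np.pi) ** n * np.linalg.det(Sigma))
    return np.exp(-0.5 * d @ np.linalg.solve(Sigma, d)) / norm_const

mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])
x = np.array([0.3, -1.5])
print(mvn_pdf(x, mu, Sigma))                           # direct formula
print(multivariate_normal(mean=mu, cov=Sigma).pdf(x))  # SciPy reference (should agree)
```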

  6. Multivariate Gaussian: covariance matrix. Since yᵀ Σ y = yᵀ E[(X − E[X])(X − E[X])ᵀ] y = E[(yᵀ(X − E[X]))²] ≥ 0 (move the expectation outside the quadratic form).

  7. Multivariate Gaussian: covariance matrix. Since yᵀ Σ y = yᵀ E[(X − E[X])(X − E[X])ᵀ] y = E[(yᵀ(X − E[X]))²] ≥ 0, the covariance matrix is symmetric positive definite (PD): yᵀ Σ y > 0 for all y with ∥y∥ > 0, written Σ ≻ 0. The inverse of a PD matrix is PD, so the precision matrix Λ = Σ⁻¹ ≻ 0. Σ is diagonalized by orthonormal matrices.

  8. Multivariate Gaussian: covariance matrix. As above, Σ is symmetric positive definite (Σ ≻ 0) and the precision matrix Λ = Σ⁻¹ ≻ 0; Σ is diagonalized by orthonormal matrices: Σ = Q D Qᵀ, with D diagonal and Q orthogonal (rows and columns of unit norm, Qᵀ Q = Q Qᵀ = I; a rotation and/or reflection).

  9. Multivariate Gaussian: covariance matrix. Σ = Q D Qᵀ, with D diagonal (scaling) and Q orthogonal (rows and columns of unit norm, Qᵀ Q = Q Qᵀ = I; rotation and reflection). A multivariate Gaussian therefore scales along the axes of some rotated/reflected coordinate system.

  10. Multivariate Gaussian: example. For p(x; μ, Σ) = (1/√|2πΣ|) exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ)) with
      Σ = [4, 2; 2, 1.5] ≈ Q D Qᵀ, Q ≈ [−.87, −.48; −.48, .87], D ≈ [5.1, 0; 0, .39],
      the columns of Q are the new bases.

  11. Multivariate Gaussian: example. Σ = [4, 2; 2, 1.5] ≈ Q D Qᵀ with Q ≈ [−.87, −.48; −.48, .87] and D ≈ [5.1, 0; 0, .39]; the columns of Q are the new bases. Alternatively, Q ≈ [cos(208°), sin(208°); sin(208°), −cos(208°)], i.e. a reflection of the coordinates about the line making the angle θ/2 = 104° with the first axis.
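The decomposition in the example can be reproduced numerically; the NumPy sketch below (my own check, using the covariance matrix as reconstructed above) recovers the eigenvalues and eigenvectors.

```python
import numpy as np

# Covariance matrix from the slide's example (as reconstructed above).
Sigma = np.array([[4.0, 2.0],
                  [2.0, 1.5]])

# Symmetric eigendecomposition: Sigma = Q D Q^T with Q orthogonal, D diagonal.
eigvals, Q = np.linalg.eigh(Sigma)   # eigh returns eigenvalues in ascending order
D = np.diag(eigvals)

print(eigvals)       # approximately [0.39, 5.11]
print(Q)             # columns are the new bases (unit-norm eigenvectors)
print(Q @ D @ Q.T)   # reconstructs Sigma (up to round-off)
```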

  12. Multivariate Gaussian: from univariates. Given n independent univariate standard Gaussians, X ∼ N(0, I).

  13. Multivariate Gaussian: from univariates. Given n independent univariate standard Gaussians, X ∼ N(0, I): scale them by D^(1/2) (i.e. by √Dᵢᵢ along each axis), so D^(1/2) X ∼ N(0, D); then rotate/reflect using Q, so Q D^(1/2) X ∼ N(0, Q D Qᵀ) = N(0, Σ).

  14. Multivariate Gaussian: from univariates. Given n independent univariate standard Gaussians X ∼ N(0, I), scale them by D^(1/2): D^(1/2) X ∼ N(0, D); rotate/reflect using Q: Q D^(1/2) X ∼ N(0, Q D Qᵀ) = N(0, Σ). More generally, X ∼ N(μ, Σ) ⇒ A X + b ∼ N(Aμ + b, A Σ Aᵀ).
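A minimal sketch of this construction (not from the slides; μ and Σ are illustration values): draw standard normals, scale and rotate them using the eigendecomposition of Σ, and check the empirical mean and covariance.

```python
import numpy as np

rng = np.random.default_rng(0)

mu = np.array([1.0, -2.0])
Sigma = np.array([[4.0, 2.0],
                  [2.0, 1.5]])

# Sigma = Q D Q^T, so A = Q D^{1/2} maps N(0, I) samples to N(0, Sigma).
eigvals, Q = np.linalg.eigh(Sigma)
A = Q @ np.diag(np.sqrt(eigvals))

Z = rng.standard_normal((100_000, 2))   # rows are iid N(0, I) samples
X = Z @ A.T + mu                        # A x + b pattern: X ~ N(mu, A A^T) = N(mu, Sigma)

print(X.mean(axis=0))           # approximately mu
print(np.cov(X, rowvar=False))  # approximately Sigma
```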

  15. Parametrization. Moment form (mean parametrization): p(x; μ, Σ) = (1/√|2πΣ|) exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ)).

  16. Parametrization. Moment form (mean parametrization): p(x; μ, Σ) = (1/√|2πΣ|) exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ)), with η = Σ⁻¹ μ the potential vector (local potential) and Λ = Σ⁻¹ the precision matrix.

  17. Parametrization. Moment form (mean parametrization): p(x; μ, Σ) = (1/√|2πΣ|) exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ)), with η = Σ⁻¹ μ the potential vector (local potential) and Λ = Σ⁻¹ the precision matrix. Information form (canonical parametrization): p(x; η, Λ) = √(|Λ|/(2π)ⁿ) exp(−½ xᵀ Λ x + ηᵀ x − ½ ηᵀ Λ⁻¹ η).

  18. Parametrization. Moment form (mean parametrization): p(x; μ, Σ) = (1/√|2πΣ|) exp(−½ (x − μ)ᵀ Σ⁻¹ (x − μ)). Information form (canonical parametrization): p(x; η, Λ) = √(|Λ|/(2π)ⁿ) exp(−½ xᵀ Λ x + ηᵀ x − ½ ηᵀ Λ⁻¹ η). The two are related by η = Σ⁻¹ μ, Λ = Σ⁻¹ and, conversely, μ = Λ⁻¹ η, Σ = Λ⁻¹. The relationship between the two types of parametrization goes beyond Gaussians.
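A minimal sketch (my own, with illustration values) converting between the two parametrizations and checking that the information-form density matches the moment-form one from scipy.stats.multivariate_normal; info_form_pdf is an illustrative helper.

```python
import numpy as np
from scipy.stats import multivariate_normal

mu = np.array([1.0, -2.0])
Sigma = np.array([[2.0, 0.5],
                  [0.5, 1.0]])

# Moment form -> information form: Lambda = Sigma^{-1}, eta = Lambda mu.
Lam = np.linalg.inv(Sigma)
eta = Lam @ mu

def info_form_pdf(x, eta, Lam):
    """p(x; eta, Lam) = sqrt(|Lam|/(2 pi)^n) exp(-0.5 x^T Lam x + eta^T x - 0.5 eta^T Lam^{-1} eta)."""
    n = len(eta)
    log_const = 0.5 * (np.linalg.slogdet(Lam)[1] - n * np.log(2 * np.pi))
    return np.exp(log_const - 0.5 * x @ Lam @ x + eta @ x
                  - 0.5 * eta @ np.linalg.solve(Lam, eta))

x = np.array([0.3, -1.5])
print(info_form_pdf(x, eta, Lam))                      # information form
print(multivariate_normal(mean=mu, cov=Sigma).pdf(x))  # moment form (should agree)

# Information form -> moment form: Sigma = Lambda^{-1}, mu = Lambda^{-1} eta.
print(np.linalg.solve(Lam, eta))   # recovers mu
```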

  19. Marginalization. The moment form X ∼ N(μ, Σ) is useful for marginalization: partition X = [X_A; X_B], μ = [μ_A; μ_B], Σ = [Σ_AA, Σ_AB; Σ_BA, Σ_BB]; then the marginal is X_A ∼ N(μ_m, Σ_m).

  20. Marginalization. In the moment form, with X = [X_A; X_B], μ = [μ_A; μ_B] and Σ = [Σ_AA, Σ_AB; Σ_BA, Σ_BB], the marginal is simply X_A ∼ N(μ_m, Σ_m) with μ_m = μ_A and Σ_m = Σ_AA.

  21. Marginalization. In the moment form, with X = [X_A; X_B], μ = [μ_A; μ_B] and Σ = [Σ_AA, Σ_AB; Σ_BA, Σ_BB], the marginal is X_A ∼ N(μ_A, Σ_AA). Marginalization can also be seen as a linear transformation: with A = [I_AA, 0], X ∼ N(μ, Σ) ⇒ A X ∼ N(Aμ, A Σ Aᵀ) = N(μ_A, Σ_AA).
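A minimal sketch (made-up blocks, my own check) showing that the selection matrix A = [I_AA, 0] recovers the marginal mean and covariance blocks.

```python
import numpy as np

mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.5, 0.2],
                  [0.5, 1.0, 0.3],
                  [0.2, 0.3, 1.5]])

# Marginalize onto the first two coordinates (the "A" block).
idx_A = [0, 1]
A = np.hstack([np.eye(2), np.zeros((2, 1))])   # A = [I_AA, 0]

mu_m = A @ mu               # equals mu[idx_A]
Sigma_m = A @ Sigma @ A.T   # equals Sigma[A, A] block

print(mu_m, mu[idx_A])
print(Sigma_m)
print(Sigma[np.ix_(idx_A, idx_A)])
```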

  22. Marginal independencies: moment form. Covariance means dependence, and vice versa: Xᵢ ⊥ Xⱼ | ∅ ⇔ Σᵢ,ⱼ = Cov(Xᵢ, Xⱼ) = 0. Why? Marginalize N(μ, Σ) to get [Xᵢ; Xⱼ] ∼ N([μᵢ; μⱼ], [σᵢ², 0; 0, σⱼ²]), which factorizes as N(xᵢ; μᵢ, σᵢ²) · N(xⱼ; μⱼ, σⱼ²).

  23. Marginal independencies: moment form. Covariance means dependence, and vice versa: Xᵢ ⊥ Xⱼ | ∅ ⇔ Σᵢ,ⱼ = Cov(Xᵢ, Xⱼ) = 0; marginalizing N(μ, Σ) gives [Xᵢ; Xⱼ] ∼ N([μᵢ; μⱼ], [σᵢ², 0; 0, σⱼ²]) = N(xᵢ; μᵢ, σᵢ²) · N(xⱼ; μⱼ, σⱼ²). The Gaussian is special in this sense: for general distributions, zero covariance does not imply independence. Correlation is the normalized covariance: ρ(Xᵢ, Xⱼ) = Cov(Xᵢ, Xⱼ) / √(Var(Xᵢ) Var(Xⱼ)). (image from Wikipedia)
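A small sketch (my own; the covariance values are illustration only) computing the correlation matrix from a covariance matrix with the formula above.

```python
import numpy as np

Sigma = np.array([[4.0, 2.0],
                  [2.0, 1.5]])

# rho(X_i, X_j) = Cov(X_i, X_j) / sqrt(Var(X_i) Var(X_j))
std = np.sqrt(np.diag(Sigma))
rho = Sigma / np.outer(std, std)
print(rho)   # unit diagonal; off-diagonal correlation is 2 / sqrt(4 * 1.5) ~= 0.816
```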

  24. Conditional independencies: information form. Zeros of the precision matrix mean conditional independence: Xᵢ ⊥ Xⱼ | X − {Xᵢ, Xⱼ} ⇔ Λᵢ,ⱼ = 0, so the nonzero pattern of Λ acts as the adjacency matrix of the Markov network (Gaussian MRF). For example, in the information form p(x; η, Λ) = √(|Λ|/(2π)ⁿ) exp(−½ xᵀ Λ x + ηᵀ x − ½ ηᵀ Λ⁻¹ η), take
      Λ = [Λ₁,₁, 0, Λ₁,₃, 0; 0, Λ₂,₂, Λ₂,₃, 0; Λ₃,₁, Λ₃,₂, Λ₃,₃, Λ₃,₄; 0, 0, Λ₄,₃, Λ₄,₄];
      the corresponding Markov network has edges X₁-X₃, X₂-X₃ and X₃-X₄. Why?

  25. Conditional independencies: information form. Zeros of the precision matrix mean conditional independence: Xᵢ ⊥ Xⱼ | X − {Xᵢ, Xⱼ} ⇔ Λᵢ,ⱼ = 0, and the nonzero pattern of Λ is the adjacency matrix of the Markov network (Gaussian MRF). Why? Write the information-form density as a product of factors: the corresponding (log-)potentials are the pairwise ψᵢ,ⱼ(xᵢ, xⱼ) = −xᵢ Λᵢ,ⱼ xⱼ and the singleton ψᵢ(xᵢ) = −½ Λᵢ,ᵢ xᵢ² + ηᵢ xᵢ. A pairwise factor between Xᵢ and Xⱼ appears only when Λᵢ,ⱼ ≠ 0, so for the example Λ above the Markov network has edges X₁-X₃, X₂-X₃ and X₃-X₄.

  26. Gaussian MRF: information form. p(x; η, Λ) = √(|Λ|/(2π)ⁿ) exp(−½ xᵀ Λ x + ηᵀ x − ½ ηᵀ Λ⁻¹ η); for the example Λ = [Λ₁,₁, 0, Λ₁,₃, 0; 0, Λ₂,₂, Λ₂,₃, 0; Λ₃,₁, Λ₃,₂, Λ₃,₃, Λ₃,₄; 0, 0, Λ₄,₃, Λ₄,₄] the corresponding (log-)potentials are ψᵢ,ⱼ(xᵢ, xⱼ) = −xᵢ Λᵢ,ⱼ xⱼ and ψᵢ(xᵢ) = −½ Λᵢ,ᵢ xᵢ² + ηᵢ xᵢ (Markov network edges X₁-X₃, X₂-X₃, X₃-X₄). Λ should be positive definite; otherwise the partition function Z = ∫ exp(−½ xᵀ Λ x + ηᵀ x) dx over ℝⁿ is not well-defined.
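A minimal sketch (the numeric Λ entries are made up, chosen to be positive definite) that builds a precision matrix with the sparsity pattern above, reads off the Markov network edges, and numerically checks one implied conditional independence (X₁ ⊥ X₂ given the rest).

```python
import numpy as np

# Precision matrix with the slide's sparsity pattern (hypothetical values, chosen PD).
Lam = np.array([[2.0, 0.0, 0.6, 0.0],
                [0.0, 1.5, 0.4, 0.0],
                [0.6, 0.4, 2.5, 0.7],
                [0.0, 0.0, 0.7, 1.8]])
assert np.all(np.linalg.eigvalsh(Lam) > 0)   # Lambda must be positive definite

# Edges of the Gaussian MRF = nonzero off-diagonal entries of Lambda.
n = Lam.shape[0]
edges = [(i, j) for i in range(n) for j in range(i + 1, n) if Lam[i, j] != 0]
print(edges)   # [(0, 2), (1, 2), (2, 3)] -> X1-X3, X2-X3, X3-X4

# Check X1 _|_ X2 | X3, X4: the conditional precision of (X1, X2) is the top-left
# block of Lambda, which is diagonal, so the conditional covariance is diagonal too.
print(np.linalg.inv(Lam[:2, :2]))
```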

  27. Conditioning: information form. Marginalization is easy in the moment form; conditioning is easy in the information form. Partition X = [X_A; X_B], η = [η_A; η_B], Λ = [Λ_AA, Λ_AB; Λ_BA, Λ_BB]. Then X_A | X_B = x_B is Gaussian with information-form parameters (η_{A|B}, Λ_{A|B}), where Λ_{A|B} = Λ_AA and η_{A|B} = η_A − Λ_AB x_B. Why? Collect the terms of −½ xᵀ Λ x + ηᵀ x that involve x_A. (figure: Markov network with X_B connected to X_A1, X_A2 and X_A3)
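A minimal numeric check (my own, illustration values) of the conditioning rule: convert to information form, apply Λ_{A|B} = Λ_AA and η_{A|B} = η_A − Λ_AB x_B, and compare with the standard moment-form conditional formulas.

```python
import numpy as np

mu = np.array([1.0, -2.0, 0.5])
Sigma = np.array([[2.0, 0.5, 0.2],
                  [0.5, 1.0, 0.3],
                  [0.2, 0.3, 1.5]])
A, B = [0, 1], [2]          # condition X_A on X_B = x_B
x_B = np.array([1.0])

# Information form of the joint.
Lam = np.linalg.inv(Sigma)
eta = Lam @ mu

# Conditioning in information form: Lam_{A|B} = Lam_AA, eta_{A|B} = eta_A - Lam_AB x_B.
Lam_cond = Lam[np.ix_(A, A)]
eta_cond = eta[A] - Lam[np.ix_(A, B)] @ x_B
mu_cond = np.linalg.solve(Lam_cond, eta_cond)   # back to moment form
Sigma_cond = np.linalg.inv(Lam_cond)

# Reference: moment-form conditioning formulas.
S_AB, S_BB = Sigma[np.ix_(A, B)], Sigma[np.ix_(B, B)]
mu_ref = mu[A] + S_AB @ np.linalg.solve(S_BB, x_B - mu[B])
Sigma_ref = Sigma[np.ix_(A, A)] - S_AB @ np.linalg.solve(S_BB, S_AB.T)

print(mu_cond, mu_ref)      # should agree
print(Sigma_cond)
print(Sigma_ref)            # should agree
```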
