1. Graphical models from an algebraic perspective. Elina Robeva (MIT). ICERM Nonlinear Algebra Bootcamp, September 11, 2018.

2. Overview
   • Undirected graphical models
     • Definition and parametric description
     • Markov properties and implicit description
     • Discrete and Gaussian
   • Directed graphical models
     • Definition and parametric description
     • Markov properties, d-separation, and implicit description
     • Discrete and Gaussian
     • Model equivalence
   • Mixed graphical models

3. Undirected graphical models. Let $G = (V, E)$ be an undirected graph and $\mathcal{C}(G)$ the set of maximal cliques of $G$. Let $(X_v : v \in V) \in \mathcal{X} := \prod_{v \in V} \mathcal{X}_v$ be a random vector. Notation: $\mathcal{X}_A = \prod_{v \in A} \mathcal{X}_v$, $X_A = (X_v : v \in A)$, $x_A = (x_v : v \in A)$. For each $C \in \mathcal{C}(G)$ let $\phi_C : \mathcal{X}_C \to \mathbb{R}_{\geq 0}$ be a continuous function called a clique potential. The undirected graphical model (or Markov random field) corresponding to $G$ and $\mathcal{X}$ is the set of all probability density functions on $\mathcal{X}$ of the form
$$p(x) = \frac{1}{Z} \prod_{C \in \mathcal{C}(G)} \phi_C(x_C),$$
where
$$Z = \int_{\mathcal{X}} \prod_{C \in \mathcal{C}(G)} \phi_C(x_C) \, d\mu(x)$$
is the normalizing constant.


4. Undirected graphical models. Example (star graph on vertices 1, 2, 3, 4 with edges 12, 13, 14, centered at 1):
$$p(x_1, x_2, x_3, x_4) = \frac{1}{Z}\, \phi_{12}(x_1, x_2)\, \phi_{13}(x_1, x_3)\, \phi_{14}(x_1, x_4).$$
Example (graph on vertices 1, ..., 5 with maximal cliques $\{1,2,3\}$, $\{2,5\}$, $\{3,4\}$, $\{4,5\}$):
$$p(x_1, x_2, x_3, x_4, x_5) = \frac{1}{Z}\, \phi_{123}(x_1, x_2, x_3)\, \phi_{25}(x_2, x_5)\, \phi_{34}(x_3, x_4)\, \phi_{45}(x_4, x_5).$$
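As a quick numerical companion to the first example, here is a minimal sketch in Python (the potential tables are made up, not from the slides) that builds the factored density on binary states and computes the normalizing constant by brute force:

    import itertools
    import numpy as np

    # Hypothetical clique potentials for the star graph with edges
    # 12, 13, 14; each variable is binary, each potential a 2x2 table.
    rng = np.random.default_rng(0)
    phi12, phi13, phi14 = (rng.uniform(0.1, 1.0, size=(2, 2)) for _ in range(3))

    def unnormalized(x):
        x1, x2, x3, x4 = x
        return phi12[x1, x2] * phi13[x1, x3] * phi14[x1, x4]

    # Z sums the product of clique potentials over the whole state space.
    states = list(itertools.product((0, 1), repeat=4))
    Z = sum(unnormalized(x) for x in states)
    p = {x: unnormalized(x) / Z for x in states}
    assert abs(sum(p.values()) - 1.0) < 1e-12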

5. Discrete undirected graphical models. Suppose that $\mathcal{X}_v = [r_v]$ with $r_v \in \mathbb{N}$, so $X \in \mathcal{X} = \prod_{v \in V} [r_v]$. We use parameters
$$\theta^{C}_{x_C} := \phi_C(x_C), \qquad C \in \mathcal{C}(G), \quad x_C \in \mathcal{X}_C.$$
Then we get the rational parametrization
$$p_x = \frac{1}{Z(\theta)} \prod_{C \in \mathcal{C}(G)} \theta^{C}_{x_C}.$$
The graphical model corresponding to $G$ consists of all discrete distributions $p = (p_x : x \in \mathcal{X})$ that factor in this way. Example (star graph with center 1): let $r_1 = r_2 = r_3 = r_4 = 2$. The parametrization has the form
$$p_{x_1 x_2 x_3 x_4} = \frac{1}{Z(\theta)}\, \theta^{(12)}_{x_1 x_2}\, \theta^{(13)}_{x_1 x_3}\, \theta^{(14)}_{x_1 x_4}.$$
The ideal $I_G$ is the ideal of the image of this parametrization.

6. Discrete undirected graphical models. Continuing the example ($r_1 = \dots = r_4 = 2$, star graph with center 1), the ideal $I_G$ can be computed in Macaulay2:

    -- parameter tables: a, b, c stand for theta^(12), theta^(13), theta^(14)
    S = QQ[a_(1,1)..a_(2,2), b_(1,1)..b_(2,2), c_(1,1)..c_(2,2)]
    -- one unknown p_(x1,x2,x3,x4) per joint state of the four binary variables
    R = QQ[p_(1,1,1,1)..p_(2,2,2,2)]
    L = {}
    for i from 0 to 15 do (
        s = last baseName (vars R)_(0,i);
        L = append(L, a_(s_0,s_1) * b_(s_0,s_2) * c_(s_0,s_3))
    )
    -- the monomial map sending p_x to theta^(12)_{x1 x2} theta^(13)_{x1 x3} theta^(14)_{x1 x4};
    -- its kernel is the vanishing ideal I_G of the model
    phi = map(S, R, L)
    I = ker phi

Output:
$$I_G = \langle \text{2-minors of } M_1 \rangle + \langle \text{2-minors of } M_2 \rangle + \langle \text{2-minors of } M_3 \rangle + \langle \text{2-minors of } M_4 \rangle,$$
where
$$M_1 = \begin{pmatrix} p_{0000} & p_{0001} & p_{0010} & p_{0011} \\ p_{0100} & p_{0101} & p_{0110} & p_{0111} \end{pmatrix}, \qquad M_2 = \begin{pmatrix} p_{1000} & p_{1001} & p_{1010} & p_{1011} \\ p_{1100} & p_{1101} & p_{1110} & p_{1111} \end{pmatrix},$$
$$M_3 = \begin{pmatrix} p_{0000} & p_{0001} & p_{0100} & p_{0101} \\ p_{0010} & p_{0011} & p_{0110} & p_{0111} \end{pmatrix}, \qquad M_4 = \begin{pmatrix} p_{1000} & p_{1001} & p_{1100} & p_{1101} \\ p_{1010} & p_{1011} & p_{1110} & p_{1111} \end{pmatrix}.$$
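As a sanity check on this output, the following sketch (Python/SymPy, stepping outside the slides' Macaulay2) verifies symbolically that the $2 \times 2$ minors of $M_1$ vanish on the parametrization; the factor $1/Z(\theta)$ can be dropped because the minors are homogeneous:

    import itertools
    import sympy as sp

    # Symbolic parameter tables a, b, c (theta^(12), theta^(13), theta^(14)),
    # indexed by {0, 1} to match the labels p_0000, ..., p_1111 above.
    a, b, c = ({xy: sp.Symbol(f"{n}{xy[0]}{xy[1]}")
                for xy in itertools.product((0, 1), repeat=2)} for n in "abc")

    # Unnormalized parametrized distribution; 1/Z(theta) cancels in the minors.
    p = {x: a[x[0], x[1]] * b[x[0], x[2]] * c[x[0], x[3]]
         for x in itertools.product((0, 1), repeat=4)}

    # M1: rows x2 = 0, 1 and columns (x3, x4), all with x1 = 0.
    M1 = sp.Matrix([[p[0, x2, x3, x4]
                     for x3, x4 in itertools.product((0, 1), repeat=2)]
                    for x2 in (0, 1)])

    # Every 2x2 minor of M1 is identically zero on the model.
    for cols in itertools.combinations(range(4), 2):
        assert sp.expand(M1[:, list(cols)].det()) == 0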

7. Gaussian undirected graphical models. Let $X = (X_v : v \in V) \sim \mathcal{N}(\mu, \Sigma)$ be a Gaussian random vector and $K = \Sigma^{-1}$. The density of $X$ is
$$p(x) = \frac{1}{Z} \exp\left( -\tfrac{1}{2}(x - \mu)^T K (x - \mu) \right).$$
When does it factorize according to $G = (V, E)$, i.e. $p(x) = \frac{1}{Z} \prod_{C \in \mathcal{C}(G)} \phi_C(x_C)$? Expanding the quadratic form,
$$p(x) = \frac{1}{Z} \prod_{v \in V} \exp\left( -\tfrac{1}{2}(x_v - \mu_v)^2 K_{vv} \right) \prod_{u \neq v} \exp\left( -\tfrac{1}{2}(x_v - \mu_v)(x_u - \mu_u) K_{vu} \right).$$
Hence the density factorizes according to $G = (V, E)$ if and only if $K_{uv} = 0$ for all $(u, v) \notin E$. The parametric description of the Gaussian graphical model with respect to $G = (V, E)$ is therefore
$$\mathcal{M}_G = \{ \Sigma = K^{-1} : K \succ 0 \text{ and } K_{uv} = 0 \text{ for all } (u, v) \notin E \}.$$
The ideal of the model $I_G$ is the ideal of the image of this parametrization.
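To see a point of $\mathcal{M}_G$ concretely, here is a small NumPy sketch (entries chosen arbitrarily, not from the slides) for the star graph:

    import numpy as np

    # Star graph with center 1 and leaves 2, 3, 4 (0-indexed below).
    edges = [(0, 1), (0, 2), (0, 3)]
    n = 4

    # A concentration matrix K supported on the graph: off-diagonal entries
    # vanish off the edge set; the diagonal makes K positive definite.
    K = 2.0 * np.eye(n)
    for u, v in edges:
        K[u, v] = K[v, u] = -0.3
    assert np.all(np.linalg.eigvalsh(K) > 0)  # K is positive definite

    # The corresponding model point Sigma = K^{-1} is generically dense:
    # the graph constrains Sigma through polynomial relations, not zeros.
    Sigma = np.linalg.inv(K)
    assert np.allclose(np.linalg.inv(Sigma), K)  # inverting back recovers the zeros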

8. Markov properties and conditional independence for undirected graphical models. A different way to define undirected graphical models is via conditional independence statements. Let $G = (V, E)$. For $A, B, C \subseteq V$, say that $A$ and $B$ are separated by $C$ if every path between $a \in A$ and $b \in B$ goes through a vertex in $C$. The global Markov property associated to $G$ consists of all conditional independence statements $X_A \perp\!\!\!\perp X_B \mid X_C$ for all disjoint sets $A, B, C$ such that $C$ separates $A$ and $B$. Example (star graph with center 1 and leaves 2, 3, 4), global Markov property:
$$X_2 \perp\!\!\!\perp X_3 \mid X_1, \qquad X_2 \perp\!\!\!\perp X_4 \mid X_1, \qquad X_3 \perp\!\!\!\perp X_4 \mid X_1.$$
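The separation condition is easy to test mechanically; here is a sketch (again Python, with the graph hard-coded as an adjacency list) that checks it by breadth-first search avoiding $C$:

    from collections import deque

    def separated(adj, A, B, C):
        # C separates A and B iff no path from A to B avoids every vertex of C.
        start = set(A) - set(C)
        seen, queue = set(start), deque(start)
        while queue:
            v = queue.popleft()
            if v in B:
                return False  # reached B while dodging C
            for w in adj[v]:
                if w not in seen and w not in C:
                    seen.add(w)
                    queue.append(w)
        return True

    # Star graph with center 1 and leaves 2, 3, 4.
    adj = {1: [2, 3, 4], 2: [1], 3: [1], 4: [1]}
    assert separated(adj, {2}, {3}, {1})        # gives X2 _||_ X3 | X1
    assert not separated(adj, {2}, {3}, set())  # 2 and 3 are connected through 1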

9. Conditional independence for discrete distributions. For discrete random variables, conditional independence yields polynomial equations in $(p_x : x \in \mathcal{X})$. How?
Example: if $V = \{1, 2\}$ and $\mathcal{X} = [m_1] \times [m_2]$, then $X_1 \perp\!\!\!\perp X_2$ is the same as
$$p_{ij} = p_{i+}\, p_{+j} \quad \text{for all } i \in [m_1],\ j \in [m_2].$$
Equivalently, the matrix
$$P = (p_{ij}) = \begin{pmatrix} p_{1+} \\ \vdots \\ p_{m_1 +} \end{pmatrix} \begin{pmatrix} p_{+1} & \cdots & p_{+ m_2} \end{pmatrix}$$
has rank 1. So, equivalently, its $2 \times 2$ minors vanish, i.e.
$$p_{ij}\, p_{k\ell} - p_{i\ell}\, p_{kj} = 0 \quad \text{for all } i, k \in [m_1],\ j, \ell \in [m_2].$$
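Numerically, the rank-1 characterization is immediate to check; a small sketch with a made-up pair of marginals:

    import itertools
    import numpy as np

    # Joint distribution built as the outer product of its marginals, so
    # X1 _||_ X2 by construction (m1 = 3, m2 = 2; the values are arbitrary).
    p_row = np.array([0.2, 0.5, 0.3])  # marginal of X1
    p_col = np.array([0.6, 0.4])       # marginal of X2
    P = np.outer(p_row, p_col)

    # Independence <=> rank(P) = 1 <=> all 2x2 minors of P vanish.
    assert np.linalg.matrix_rank(P) == 1
    for (i, k), (j, l) in itertools.product(
            itertools.combinations(range(3), 2),
            itertools.combinations(range(2), 2)):
        assert abs(P[i, j] * P[k, l] - P[i, l] * P[k, j]) < 1e-12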

10. Conditional independence for discrete distributions. Proposition. Let $X$ be a discrete random vector with sample space $\mathcal{X} = \prod_{i=1}^{n} [m_i]$. Then for disjoint sets $A, B, C \subset [n]$, we have $X_A \perp\!\!\!\perp X_B \mid X_C$ if and only if
$$p_{i_A i_B i_C +}\, p_{j_A j_B i_C +} - p_{i_A j_B i_C +}\, p_{j_A i_B i_C +} = 0$$
for all $i_A \neq j_A \in \mathcal{X}_A$, $i_B \neq j_B \in \mathcal{X}_B$, $i_C \in \mathcal{X}_C$. (Here the subscript $+$ denotes summing over the coordinates outside $A \cup B \cup C$.)
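To spell the proposition out in the smallest interesting case, here is a sketch (Python/SymPy, a hypothetical binary example with $A = \{1\}$, $B = \{2\}$, $C = \{3\}$, so the $+$-marginalization is trivial) that lists the quadrics for $X_1 \perp\!\!\!\perp X_2 \mid X_3$:

    import itertools
    import sympy as sp

    # Unknowns p_{x1 x2 x3} for three binary variables.
    p = {x: sp.Symbol(f"p{x[0]}{x[1]}{x[2]}")
         for x in itertools.product((0, 1), repeat=3)}

    # One quadric per choice of i_A < j_A, i_B < j_B and slice i_C.
    quadrics = [p[iA, iB, iC] * p[jA, jB, iC] - p[iA, jB, iC] * p[jA, iB, iC]
                for (iA, jA), (iB, jB) in itertools.product(
                    itertools.combinations((0, 1), 2), repeat=2)
                for iC in (0, 1)]

    for q in quadrics:
        print(q)  # p000*p110 - p010*p100 and p001*p111 - p011*p101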
