Probabilistic Graphical Models (CMSC 691, UMBC)


  1. Probabilistic Graphical Models CMSC 691 UMBC

  2. Two Problems for Graphical Models. $p(x_1, x_2, x_3, \ldots, x_N) = \frac{1}{Z} \prod_C \psi_C(x_c)$. Problem 1: finding the normalizer. Problem 2: computing the marginals.

  3. Two Problems for Graphical Models. $p(x_1, x_2, x_3, \ldots, x_N) = \frac{1}{Z} \prod_C \psi_C(x_c)$. Finding the normalizer: $Z = \sum_x \prod_c \psi_c(x_c)$. Computing the marginals.

  4. Two Problems for Graphical Models. $p(x_1, x_2, x_3, \ldots, x_N) = \frac{1}{Z} \prod_C \psi_C(x_c)$. Finding the normalizer: $Z = \sum_x \prod_c \psi_c(x_c)$. Computing the marginals: sum over all variable combinations, with the $x_n$ coordinate fixed, $Z_n(v) = \sum_{x : x_n = v} \prod_c \psi_c(x_c)$. Example: 3 variables, fix the 2nd dimension: $Z_2(v) = \sum_{x_1} \sum_{x_3} \prod_c \psi_c(x = (x_1, v, x_3))$.

  5. Two Problems for Graphical Models. $p(x_1, x_2, x_3, \ldots, x_N) = \frac{1}{Z} \prod_C \psi_C(x_c)$. Finding the normalizer: $Z = \sum_x \prod_c \psi_c(x_c)$. Computing the marginals: sum over all variable combinations, with the $x_n$ coordinate fixed, $Z_n(v) = \sum_{x : x_n = v} \prod_c \psi_c(x_c)$. Example: 3 variables, fix the 2nd dimension: $Z_2(v) = \sum_{x_1} \sum_{x_3} \prod_c \psi_c(x = (x_1, v, x_3))$. Q: Why are these difficult? A: Many different combinations.
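The two sums on the slides above can be written out by brute force for a tiny model. The sketch below is only an illustration: the three binary variables, the two pairwise cliques, and the potential values are all made-up assumptions, not taken from the lecture.

```python
from itertools import product

# Hypothetical toy model: three binary variables, two pairwise cliques.
# Each entry is (variable indices in the clique, potential function).
cliques = [
    ((0, 1), lambda a, b: 2.0 if a == b else 1.0),
    ((1, 2), lambda b, c: 3.0 if b == c else 1.0),
]
domain = (0, 1)
num_vars = 3


def unnormalized(x):
    """Product of clique potentials psi_c(x_c) for one full assignment x."""
    score = 1.0
    for idx, psi in cliques:
        score *= psi(*(x[i] for i in idx))
    return score


def normalizer():
    """Z: sum of the potential product over every joint assignment."""
    return sum(unnormalized(x) for x in product(domain, repeat=num_vars))


def fixed_sum(n, v):
    """Z_n(v): the same sum, restricted to assignments with x[n] == v."""
    return sum(
        unnormalized(x)
        for x in product(domain, repeat=num_vars)
        if x[n] == v
    )


Z = normalizer()
# Marginal of the 2nd variable (index 1), as in the slide's example.
marginal_x2 = {v: fixed_sum(1, v) / Z for v in domain}
```

Both functions loop over `len(domain) ** num_vars` assignments (8 here), which is the "many different combinations" problem: the count grows exponentially with the number of variables.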

  6. Probabilistic Graphical Models. A graph G that represents a probability distribution over random variables $X_1, \ldots, X_N$.

  7. Probabilistic Graphical Models. A graph G that represents a probability distribution over random variables $X_1, \ldots, X_N$. Graph G = (vertices V, edges E). Distribution $p(X_1, \ldots, X_N)$.

  8. Probabilistic Graphical Models. A graph G that represents a probability distribution over random variables $X_1, \ldots, X_N$. Graph G = (vertices V, edges E). Distribution $p(X_1, \ldots, X_N)$. Vertices ↔ random variables. Edges show dependencies among random variables.

  9. Probabilistic Graphical Models. A graph G that represents a probability distribution over random variables $X_1, \ldots, X_N$. Graph G = (vertices V, edges E). Distribution $p(X_1, \ldots, X_N)$. Vertices ↔ random variables. Edges show dependencies among random variables. Two main flavors: directed graphical models and undirected graphical models.

  10. Outline Directed Graphical Models Undirected Graphical Models Factor Graphs

  11. Directed Graphical Models. A directed (acyclic) graph G=(V,E) that represents a probability distribution over random variables $X_1, \ldots, X_N$. Joint probability factorizes into factors of $X_i$ conditioned on the parents of $X_i$.

  12. Directed Graphical Models. A directed (acyclic) graph G=(V,E) that represents a probability distribution over random variables $X_1, \ldots, X_N$. Joint probability factorizes into factors of $X_i$ conditioned on the parents of $X_i$. Benefit: the independence properties are transparent.

  13. Directed Graphical Models. A directed (acyclic) graph G=(V,E) that represents a probability distribution over random variables $X_1, \ldots, X_N$. Joint probability factorizes into factors of $X_i$ conditioned on the parents of $X_i$. A graph/joint distribution that follows this is a Bayesian network.

  14. Bayesian Networks: Directed Acyclic Graphs. [Figure: DAG over nodes $x_1, \ldots, x_5$.] $p(x_1, x_2, x_3, \ldots, x_N) = \prod_i p(x_i \mid \pi(x_i))$. Annotations: $\pi$ is "parents of"; topological sort.

  15. Bayesian Networks: Directed Acyclic Graphs. [Figure: DAG over nodes $x_1, \ldots, x_5$.] $p(x_1, x_2, x_3, \ldots, x_N) = \prod_i p(x_i \mid \pi(x_i))$. $p(x_1, x_2, x_3, x_4, x_5) = \,???$

  16. Bayesian Networks: Directed Acyclic Graphs. [Figure: DAG over nodes $x_1, \ldots, x_5$.] $p(x_1, x_2, x_3, x_4, x_5) = p(x_1)\, p(x_3)\, p(x_2 \mid x_1, x_3)\, p(x_4 \mid x_2, x_3)\, p(x_5 \mid x_2, x_4)$.

  17. Bayesian Networks: Directed Acyclic Graphs. [Figure: DAG over nodes $x_1, \ldots, x_5$.] $p(x_1, x_2, x_3, \ldots, x_N) = \prod_i p(x_i \mid \pi(x_i))$. Exact inference in general DAGs is NP-hard. Inference in trees can be exact.
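The five-variable factorization on slide 16 can be checked numerically. This is a minimal sketch under stated assumptions: the variables are taken to be binary and the conditional probability tables are filled with arbitrary random numbers; only the parent sets come from the slide.

```python
from itertools import product
import random

# Parent sets read off the slide's factorization (keys are variable indices).
parents = {1: (), 3: (), 2: (1, 3), 4: (2, 3), 5: (2, 4)}

random.seed(0)  # arbitrary seed; the table entries below are made up

# cpt[i][parent_values] = p(x_i = 1 | parents of x_i), binary variables assumed.
cpt = {
    i: {pv: random.random() for pv in product((0, 1), repeat=len(pa))}
    for i, pa in parents.items()
}


def joint(x):
    """p(x_1..x_5) as the product of p(x_i | parents); x maps index -> 0/1."""
    prob = 1.0
    for i, pa in parents.items():
        p_one = cpt[i][tuple(x[j] for j in pa)]
        prob *= p_one if x[i] == 1 else 1.0 - p_one
    return prob


# Summing the joint over all 2**5 assignments should give 1.
total = sum(
    joint(dict(zip((1, 2, 3, 4, 5), values)))
    for values in product((0, 1), repeat=5)
)
```

Because every factor is itself a normalized conditional distribution, `total` comes out as 1 (up to floating-point error) without any global normalizer.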

  18. Directed Graphical Model Notation. [Figure: DAG over nodes $x_1, \ldots, x_5$.] Shaded nodes are observed R.V.s. Unshaded nodes are unobserved (latent) R.V.s.

  19. D-Separation: Testing for Conditional Independence. Variables X & Y are conditionally independent given Z if all (undirected) paths from (any variable in) X to (any variable in) Y are d-separated by Z. X & Y are d-separated if, for all paths P, one of the following is true: (i) P has a chain with an observed middle node (X → Z → Y); (ii) P has a fork with an observed parent node (X ← Z → Y); (iii) P includes a "v-structure" or "collider" with all unobserved descendants (X → Z ← Y).

  20. D-Separation: Testing for Conditional Independence. Variables X & Y are conditionally independent given Z if all (undirected) paths from (any variable in) X to (any variable in) Y are d-separated by Z. X & Y are d-separated if, for all paths P, one of the following is true: (i) P has a chain with an observed middle node (X → Z → Y): observing Z blocks the path from X to Y; (ii) P has a fork with an observed parent node (X ← Z → Y): observing Z blocks the path from X to Y; (iii) P includes a "v-structure" or "collider" with all unobserved descendants (X → Z ← Y): not observing Z blocks the path from X to Y.

  21. D-Separation: Testing for Conditional Independence. Variables X & Y are conditionally independent given Z if all (undirected) paths from (any variable in) X to (any variable in) Y are d-separated by Z. X & Y are d-separated if, for all paths P, one of the following is true: (i) P has a chain with an observed middle node (X → Z → Y): observing Z blocks the path from X to Y; (ii) P has a fork with an observed parent node (X ← Z → Y): observing Z blocks the path from X to Y; (iii) P includes a "v-structure" or "collider" with all unobserved descendants (X → Z ← Y): not observing Z blocks the path from X to Y. For the collider: $p(x, y, z) = p(x)\, p(y)\, p(z \mid x, y)$, so $p(x, y) = \sum_z p(x)\, p(y)\, p(z \mid x, y) = p(x)\, p(y)$.
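The collider identity on slide 21 can be verified with numbers. The sketch below assumes binary variables and made-up probability tables for X → Z ← Y; it checks that X and Y are independent when Z is summed out, and measures how far they are from independent once Z is observed.

```python
from itertools import product

# Made-up tables for a collider X -> Z <- Y (binary variables assumed).
p_x = {0: 0.3, 1: 0.7}
p_y = {0: 0.6, 1: 0.4}
p_z1 = {(0, 0): 0.1, (0, 1): 0.8, (1, 0): 0.7, (1, 1): 0.2}  # p(z=1 | x, y)


def joint(x, y, z):
    """p(x, y, z) = p(x) p(y) p(z | x, y)."""
    pz = p_z1[(x, y)] if z == 1 else 1.0 - p_z1[(x, y)]
    return p_x[x] * p_y[y] * pz


def marginal_xy(x, y):
    """p(x, y): sum the joint over the unobserved collider z."""
    return sum(joint(x, y, z) for z in (0, 1))


def conditional_xy(x, y, z):
    """p(x, y | z): condition on an observed collider value."""
    evidence = sum(joint(a, b, z) for a, b in product((0, 1), repeat=2))
    return joint(x, y, z) / evidence


def dependence_gap(z):
    """Largest |p(x, y | z) - p(x | z) p(y | z)| over all x, y."""
    px = {x: sum(conditional_xy(x, y, z) for y in (0, 1)) for x in (0, 1)}
    py = {y: sum(conditional_xy(x, y, z) for x in (0, 1)) for y in (0, 1)}
    return max(
        abs(conditional_xy(x, y, z) - px[x] * py[y])
        for x, y in product((0, 1), repeat=2)
    )


independent_when_hidden = all(
    abs(marginal_xy(x, y) - p_x[x] * p_y[y]) < 1e-12
    for x, y in product((0, 1), repeat=2)
)
gap_when_observed = dependence_gap(1)
```

With these particular tables the gap after observing Z is clearly nonzero, which matches the rule that a collider blocks the path only while it is unobserved.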

  22. Markov Blanket: the set of nodes needed to form the complete conditional for a variable $x_i$. $p(x_i \mid x_{j \neq i}) = \frac{p(x_1, \ldots, x_N)}{\int p(x_1, \ldots, x_N)\, dx_i} = \frac{\prod_k p(x_k \mid \pi(x_k))}{\int \prod_k p(x_k \mid \pi(x_k))\, dx_i}$ (factorization of graph) $= \frac{\prod_{k : k = i \text{ or } i \in \pi(x_k)} p(x_k \mid \pi(x_k))}{\int \prod_{k : k = i \text{ or } i \in \pi(x_k)} p(x_k \mid \pi(x_k))\, dx_i}$ (factor out terms not dependent on $x_i$). The Markov blanket of a node x is its parents, children, and children's parents. (In this example, shading does not show observed/latent.)
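The "parents, children, and children's parents" rule is easy to compute from a parent map. The sketch below is illustrative only: `markov_blanket` is a hypothetical helper name, and the graph reuses the parent sets from the five-variable factorization on slide 16.

```python
def markov_blanket(parents, node):
    """Return the parents, children, and children's other parents of node.

    `parents` maps each node to a tuple of its parent nodes.
    """
    children = {c for c, pa in parents.items() if node in pa}
    co_parents = {p for c in children for p in parents[c]}
    return (set(parents[node]) | children | co_parents) - {node}


# Parent sets from the factorization
# p(x1) p(x3) p(x2 | x1, x3) p(x4 | x2, x3) p(x5 | x2, x4).
parents = {1: (), 3: (), 2: (1, 3), 4: (2, 3), 5: (2, 4)}

blanket_of_2 = markov_blanket(parents, 2)
blanket_of_1 = markov_blanket(parents, 1)
```

For node 2 the blanket covers every other node in this small graph, while node 1 only needs its child 2 and that child's other parent 3.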

  23. Outline Directed Graphical Models Undirected Graphical Models Factor Graphs

  24. Undirected Graphical Models. An undirected graph G=(V,E) that represents a probability distribution over random variables $X_1, \ldots, X_N$. Joint probability factorizes based on cliques in the graph.

  25. Undirected Graphical Models. An undirected graph G=(V,E) that represents a probability distribution over random variables $X_1, \ldots, X_N$. Joint probability factorizes based on cliques in the graph. Common name: Markov Random Fields.

  26. Undirected Graphical Models. An undirected graph G=(V,E) that represents a probability distribution over random variables $X_1, \ldots, X_N$. Joint probability factorizes based on cliques in the graph. Common name: Markov Random Fields. Undirected graphs can have an alternative formulation as Factor Graphs.

  27. Markov Random Fields: Undirected Graphs. $p(x_1, x_2, x_3, \ldots, x_N)$

  28. Markov Random Fields: Undirected Graphs. Clique: a subset of nodes, where nodes are pairwise connected. Maximal clique: a clique that cannot add a node and remain a clique. $p(x_1, x_2, x_3, \ldots, x_N)$
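The two definitions above translate directly into checks on an adjacency map. This is a brute-force sketch for a made-up four-node graph (the edge list and function names are assumptions for illustration); it tries every subset, so it is only suitable for very small graphs.

```python
from itertools import combinations

# Hypothetical undirected graph: a triangle a-b-c plus a pendant edge c-d.
edges = [("a", "b"), ("a", "c"), ("b", "c"), ("c", "d")]
adjacent = {}
for u, v in edges:
    adjacent.setdefault(u, set()).add(v)
    adjacent.setdefault(v, set()).add(u)


def is_clique(nodes):
    """True if every pair of nodes is connected by an edge."""
    return all(v in adjacent[u] for u, v in combinations(nodes, 2))


def is_maximal_clique(nodes):
    """True if nodes form a clique and no outside node can be added."""
    nodes = set(nodes)
    if not nodes or not is_clique(nodes):
        return False
    return not any(is_clique(nodes | {w}) for w in adjacent.keys() - nodes)


def maximal_cliques():
    """Enumerate maximal cliques by testing every non-empty subset."""
    found = []
    for size in range(1, len(adjacent) + 1):
        for subset in combinations(sorted(adjacent), size):
            if is_maximal_clique(subset):
                found.append(set(subset))
    return found
```

On this graph `{"a", "b"}` is a clique but not a maximal one, since `c` can still be added.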

  29. Markov Random Fields: Undirected Graphs. Clique: a subset of nodes, where nodes are pairwise connected. Maximal clique: a clique that cannot add a node and remain a clique. $p(x_1, x_2, x_3, \ldots, x_N) = \frac{1}{Z} \prod_C \psi_C(x_c)$. Annotations: $Z$ is the global normalization; the product runs over maximal cliques; $\psi_C$ is a potential function (not necessarily a probability!); $x_c$ are the variables part of the clique C.

  30. Markov Random Fields: Undirected Graphs. Clique: a subset of nodes, where nodes are pairwise connected. Maximal clique: a clique that cannot add a node and remain a clique. $p(x_1, x_2, x_3, \ldots, x_N) = \frac{1}{Z} \prod_C \psi_C(x_c)$. Annotations: $Z$ is the global normalization; the product runs over maximal cliques; $\psi_C$ is a potential function (not necessarily a probability!); $x_c$ are the variables part of the clique C.

  31. Markov Random Fields: Undirected Graphs. Clique: a subset of nodes, where nodes are pairwise connected. Maximal clique: a clique that cannot add a node and remain a clique. $p(x_1, x_2, x_3, \ldots, x_N) = \frac{1}{Z} \prod_C \psi_C(x_c)$. Annotations: $Z$ is the global normalization; the product runs over maximal cliques; $\psi_C$ is a potential function (not necessarily a probability!); $x_c$ are the variables part of the clique C. Q: What restrictions should we place on the potentials $\psi_C$?
