Graphical Models - CS 6355: Structured Prediction (PowerPoint PPT Presentation)


  1. Graphical Models CS 6355: Structured Prediction

  2. So far… We discussed sequence labeling tasks:
• HMM: Hidden Markov Models
• MEMM: Maximum Entropy Markov Models
• CRF: Conditional Random Fields
All these models use a linear chain structure to describe the interactions between random variables.
[Slide figure: linear chains over states y_{t-1}, y_t with observations x_t, drawn for HMM, MEMM, and CRF]
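The shared linear-chain factorization can be made concrete for the HMM case; a minimal sketch with a made-up two-tag model (all probabilities invented for illustration):

```python
# Joint probability of a linear-chain HMM:
#   P(x, y) = P(y_1) P(x_1 | y_1) * prod_{t>1} P(y_t | y_{t-1}) P(x_t | y_t)
# Toy two-tag model; all numbers are invented for illustration.

initial = {"N": 0.6, "V": 0.4}                        # P(y_1)
transition = {("N", "N"): 0.3, ("N", "V"): 0.7,       # P(y_t | y_{t-1})
              ("V", "N"): 0.8, ("V", "V"): 0.2}
emission = {("N", "dog"): 0.9, ("N", "barks"): 0.1,   # P(x_t | y_t)
            ("V", "dog"): 0.2, ("V", "barks"): 0.8}

def hmm_joint(words, tags):
    p = initial[tags[0]] * emission[(tags[0], words[0])]
    for t in range(1, len(words)):
        p *= transition[(tags[t - 1], tags[t])] * emission[(tags[t], words[t])]
    return p

print(hmm_joint(["dog", "barks"], ["N", "V"]))  # 0.6 * 0.9 * 0.7 * 0.8 ≈ 0.3024
```

The MEMM and CRF differ in how the local factors are parameterized and normalized, but they factor over the same chain.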

  3. This lecture Graphical models:
– Directed: Bayesian Networks
– Undirected: Markov Networks (Markov Random Fields)
• Representations
• Inference
• Learning

  4. Probabilistic Graphical Models
• Languages that represent probability distributions over multiple random variables
– Directed or undirected graphs
• Encode conditional independence assumptions
– Or equivalently, encode a factorization of the joint probability
• General machinery for
– Algorithms for computing marginal and conditional probabilities (recall that we have been looking at most probable states so far)
– Exploiting graph structure
– An “inference engine”
– Can introduce prior probability distributions, because parameters are also random variables

  5. Bayesian Network Decompose the joint probability via a directed acyclic graph:
– Nodes represent random variables
– Edges represent conditional dependencies
– Each node is associated with a conditional probability table
P(X₁, X₂, …, Xₙ) = ∏ᵢ P(Xᵢ ∣ Parents(Xᵢ))

  6. Bayesian Network Decompose the joint probability via a directed acyclic graph:
– Nodes represent random variables
– Edges represent conditional dependencies
– Each node is associated with a conditional probability table
P(X₁, X₂, …, Xₙ) = ∏ᵢ P(Xᵢ ∣ Parents(Xᵢ))
Example from Russell and Norvig

  7. Bayesian Network Decompose the joint probability via a directed acyclic graph:
– Nodes represent random variables
– Edges represent conditional dependencies
– Each node is associated with a conditional probability table
P(X₁, X₂, …, Xₙ) = ∏ᵢ P(Xᵢ ∣ Parents(Xᵢ))
Joint probability: P(B, E, A, J, M) = P(B) ⋅ P(E) ⋅ P(A ∣ B, E) ⋅ P(J ∣ A) ⋅ P(M ∣ A)
(B = Burglary, E = Earthquake, A = Alarm, J = JohnCalls, M = MaryCalls)
Example from Russell and Norvig

  8. Bayesian Network Decompose the joint probability via a directed acyclic graph:
– Nodes represent random variables
– Edges represent conditional dependencies
– Each node is associated with a conditional probability table
P(X₁, X₂, …, Xₙ) = ∏ᵢ P(Xᵢ ∣ Parents(Xᵢ))
Joint probability: P(B, E, A, J, M) = P(B) ⋅ P(E) ⋅ P(A ∣ B, E) ⋅ P(J ∣ A) ⋅ P(M ∣ A)
The network and its parameters are a compact representation of the joint probability distribution.
Example from Russell and Norvig
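The factorization on this slide can be written out directly in code; a sketch of the Russell and Norvig burglary-alarm network, with CPT values taken from their textbook (the helper names `pr` and `joint` are my own):

```python
from itertools import product

# CPTs of the Russell & Norvig burglary-alarm network; each entry is the
# probability that the variable is true.
P_B = 0.001                                         # P(Burglary)
P_E = 0.002                                         # P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,     # P(Alarm | B, E)
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}                     # P(JohnCalls | Alarm)
P_M = {True: 0.70, False: 0.01}                     # P(MaryCalls | Alarm)

def pr(p_true, value):
    """P(variable = value) given the probability that it is true."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    # P(B, E, A, J, M) = P(B) P(E) P(A | B, E) P(J | A) P(M | A)
    return (pr(P_B, b) * pr(P_E, e) * pr(P_A[(b, e)], a)
            * pr(P_J[a], j) * pr(P_M[a], m))

# The 10 CPT numbers define all 2^5 = 32 joint probabilities, which sum to 1.
total = sum(joint(*world) for world in product([True, False], repeat=5))
print(round(total, 10))  # 1.0
```

The compactness is visible here: ten numbers specify a distribution over 32 joint assignments, and larger networks gain exponentially more.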

  9. Bayesian Network Decompose the joint probability via a directed acyclic graph:
– Nodes represent random variables
– Edges represent conditional dependencies
– Each node is associated with a conditional probability table
P(X₁, X₂, …, Xₙ) = ∏ᵢ P(Xᵢ ∣ Parents(Xᵢ))
We can ask questions like:
• “What is the probability that Mary calls if there is an earthquake?”
• “If John called and Mary did not call, what is the probability that there was a burglary?”
Example from Russell and Norvig
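For a network this small, such queries can be answered by summing the joint distribution over the unobserved variables; a brute-force enumeration sketch over the Russell and Norvig alarm network (CPT values from the textbook; function names are mine):

```python
from itertools import product

# Russell & Norvig burglary-alarm CPTs (probability of "true").
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def pr(p_true, value):
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    return (pr(P_B, b) * pr(P_E, e) * pr(P_A[(b, e)], a)
            * pr(P_J[a], j) * pr(P_M[a], m))

def query(target, evidence):
    """P(target = true | evidence) by summing the joint over all worlds."""
    names = ("b", "e", "a", "j", "m")
    numerator = denominator = 0.0
    for values in product([True, False], repeat=5):
        world = dict(zip(names, values))
        if any(world[var] != val for var, val in evidence.items()):
            continue  # inconsistent with the evidence
        p = joint(*values)
        denominator += p
        if world[target]:
            numerator += p
    return numerator / denominator

# "If John called and Mary did not call, how likely is a burglary?"
print(query("b", {"j": True, "m": False}))
```

Enumeration is exponential in the number of variables; the point of the inference algorithms later in the course is to exploit the graph structure to do better.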

  10. Independence Assumptions of a BN If X, Y, Z are random variables, we write
• X ⊥ Y to say “X is independent of Y” and
• X ⊥ Y ∣ Z to say “X is independent of Y given Z”
Example from Daphne Koller

  11. Independence Assumptions of a BN If X, Y, Z are random variables, we write
• X ⊥ Y to say “X is independent of Y” and
• X ⊥ Y ∣ Z to say “X is independent of Y given Z”
Example from Daphne Koller

  12. Independence Assumptions of a BN If X, Y, Z are random variables, we write
• X ⊥ Y to say “X is independent of Y” and
• X ⊥ Y ∣ Z to say “X is independent of Y given Z”
Local independencies: A node is independent of its non-descendants given its parents:
Xᵢ ⊥ NonDescendants(Xᵢ) ∣ Parents(Xᵢ)
Example from Daphne Koller

  13. Independence Assumptions of a BN If X, Y, Z are random variables, we write
• X ⊥ Y to say “X is independent of Y” and
• X ⊥ Y ∣ Z to say “X is independent of Y given Z”
Local independencies: A node is independent of its non-descendants given its parents:
Xᵢ ⊥ NonDescendants(Xᵢ) ∣ Parents(Xᵢ)
Examples:
– Flu ⊥ Hayfever ∣ Season
– Congestion ⊥ Season ∣ Flu, Hayfever
Example from Daphne Koller

  14. Independence Assumptions of a BN If X, Y, Z are random variables, we write
• X ⊥ Y to say “X is independent of Y” and
• X ⊥ Y ∣ Z to say “X is independent of Y given Z”
Local independencies: A node is independent of its non-descendants given its parents:
Xᵢ ⊥ NonDescendants(Xᵢ) ∣ Parents(Xᵢ)
Examples:
– Flu ⊥ Hayfever ∣ Season
– Congestion ⊥ Season ∣ Flu, Hayfever
The parents of a node shield it from the influence of its ancestors and other non-descendants… but information about descendants can still influence beliefs about a node.
Example from Daphne Koller
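Both halves of that statement can be checked numerically by enumeration. A sketch using Russell and Norvig's alarm network (CPT values from the textbook; helper names are mine): Burglary and Earthquake have no parents and are non-descendants of each other, so they are marginally independent, yet observing their common child Alarm makes them dependent (explaining away):

```python
from itertools import product

# Russell & Norvig burglary-alarm CPTs (probability of "true").
P_B, P_E = 0.001, 0.002
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}
P_M = {True: 0.70, False: 0.01}

def pr(p_true, value):
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    return (pr(P_B, b) * pr(P_E, e) * pr(P_A[(b, e)], a)
            * pr(P_J[a], j) * pr(P_M[a], m))

def marginal(**fixed):
    """Sum the joint over all worlds consistent with the fixed values."""
    names = ("b", "e", "a", "j", "m")
    return sum(joint(*values) for values in product([True, False], repeat=5)
               if all(dict(zip(names, values))[k] == v for k, v in fixed.items()))

# B ⊥ E marginally: P(B, E) equals P(B) P(E).
print(abs(marginal(b=True, e=True) - marginal(b=True) * marginal(e=True)) < 1e-12)

# But observing the common child A couples them ("explaining away"):
p_a = marginal(a=True)
coupled = marginal(b=True, e=True, a=True) / p_a
factored = (marginal(b=True, a=True) / p_a) * (marginal(e=True, a=True) / p_a)
print(abs(coupled - factored) > 1e-3)
```

Both checks print `True`: the alarm is a descendant of both Burglary and Earthquake, so conditioning on it breaks their independence.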

  15. Independence Assumptions of a BN If X, Y, Z are random variables, we write
• X ⊥ Y to say “X is independent of Y” and
• X ⊥ Y ∣ Z to say “X is independent of Y given Z”
Topological independencies: A node is independent of all other nodes given its parents, children, and children’s parents, together called the node’s Markov blanket:
Xᵢ ⊥ Xⱼ ∣ MarkovBlanket(Xᵢ)
Example from Daphne Koller

  16. Independence Assumptions of a BN If X, Y, Z are random variables, we write
• X ⊥ Y to say “X is independent of Y” and
• X ⊥ Y ∣ Z to say “X is independent of Y given Z”
Topological independencies: A node is independent of all other nodes given its parents, children, and children’s parents, together called the node’s Markov blanket:
Xᵢ ⊥ Xⱼ ∣ MarkovBlanket(Xᵢ)
Example: The Markov blanket of Hayfever is the set {Season, Congestion, Flu}. If we know these variables, Hayfever is independent of MusclePain.
Example from Daphne Koller

  17. Independence Assumptions of a BN If X, Y, Z are random variables, we write
• X ⊥ Y to say “X is independent of Y” and
• X ⊥ Y ∣ Z to say “X is independent of Y given Z”
Topological independencies: A node is independent of all other nodes given its parents, children, and children’s parents, together called the node’s Markov blanket:
Xᵢ ⊥ Xⱼ ∣ MarkovBlanket(Xᵢ)
Example: The Markov blanket of Hayfever is the set {Season, Congestion, Flu}. If we know these variables, Hayfever is independent of MusclePain.
The Markov blanket of a node shields it from the influence of any other node.
Example from Daphne Koller
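Reading a Markov blanket off the graph is mechanical; a sketch that computes it from a map of parent sets, using the Season/Flu/Hayfever network of Koller's example (the graph structure is my reconstruction from the independencies stated on these slides, and `markov_blanket` is my own helper):

```python
# Parent sets of the Season/Flu/Hayfever network (structure reconstructed
# from the independencies stated on the slides).
parents = {
    "Season": set(),
    "Flu": {"Season"},
    "Hayfever": {"Season"},
    "Congestion": {"Flu", "Hayfever"},
    "MusclePain": {"Flu"},
}

def markov_blanket(node):
    """Parents, children, and the children's other parents of `node`."""
    children = {n for n, ps in parents.items() if node in ps}
    co_parents = {p for child in children for p in parents[child]} - {node}
    return parents[node] | children | co_parents

print(sorted(markov_blanket("Hayfever")))  # ['Congestion', 'Flu', 'Season']
```

Note that Flu enters the blanket of Hayfever only as a co-parent of Congestion; MusclePain, a child of Flu alone, stays outside it, matching the slide's example.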

  18. Independence Assumptions of a BN
• Local independencies: A node is independent of its non-descendants given its parents: (Xᵢ ⊥ NonDescendants(Xᵢ) ∣ Parents(Xᵢ))
• Topological independencies: A node is independent of all other nodes given its parents, children, and children’s parents, that is, given its Markov blanket: (Xᵢ ⊥ Xⱼ ∣ MB(Xᵢ)) for all j ≠ i
• More general notions of independence exist.
Example from Daphne Koller

  19. Independence Assumptions of a BN
• Local independencies: A node is independent of its non-descendants given its parents: (Xᵢ ⊥ NonDescendants(Xᵢ) ∣ Parents(Xᵢ))
• Topological independencies: A node is independent of all other nodes given its parents, children, and children’s parents, that is, given its Markov blanket: (Xᵢ ⊥ Xⱼ ∣ MB(Xᵢ)) for all j ≠ i
• More general notions of independence exist.
Where do the independence assumptions come from?
Example from Daphne Koller

  20. Independence Assumptions of a BN
• Local independencies: A node is independent of its non-descendants given its parents: (Xᵢ ⊥ NonDescendants(Xᵢ) ∣ Parents(Xᵢ))
• Topological independencies: A node is independent of all other nodes given its parents, children, and children’s parents, that is, given its Markov blanket: (Xᵢ ⊥ Xⱼ ∣ MB(Xᵢ)) for all j ≠ i
• More general notions of independence exist.
Where do the independence assumptions come from? Domain knowledge.
Example from Daphne Koller

  21. We have seen Bayesian networks before
• The naïve Bayes model is a simple Bayesian network
– The naïve Bayes assumption is an example of an independence assumption
• The hidden Markov model is another Bayesian network
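Concretely, naïve Bayes factorizes as P(y, x₁, …, xₙ) = P(y) ∏ᵢ P(xᵢ ∣ y): a Bayesian network in which the label is the sole parent of every feature. A toy sketch (all numbers invented for illustration):

```python
# Naive Bayes as a Bayesian network: the label y is the only parent of every
# feature x_i, so P(y, x_1, ..., x_n) = P(y) * prod_i P(x_i | y).
# Toy spam model; every number is invented for illustration.
P_y = {"spam": 0.3, "ham": 0.7}
P_x_given_y = {                       # P(word appears | y), one CPT per feature
    "offer": {"spam": 0.8, "ham": 0.1},
    "meeting": {"spam": 0.1, "ham": 0.6},
}

def posterior(features):
    """P(y | x), by normalizing the factored joint over the labels."""
    scores = {}
    for label, p_label in P_y.items():
        p = p_label
        for word, present in features.items():
            q = P_x_given_y[word][label]
            p *= q if present else 1.0 - q
        scores[label] = p
    z = sum(scores.values())
    return {label: p / z for label, p in scores.items()}

print(posterior({"offer": True, "meeting": False}))
```

The independence assumption is exactly what the graph says: given the label, each feature is independent of all the others.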
