CS 6355: Structured Prediction
Graphical Models
So far, we discussed sequence labeling tasks:
– HMM: Hidden Markov Models
– MEMM: Maximum Entropy Markov Models
– CRF: Conditional Random Fields
All of these models use a linear chain structure.
[Figure: the linear-chain structures of the HMM, MEMM, and CRF, each drawn over labels y_{t-1}, y_t and input x_t]
Graphical models give us:
– Directed or undirected graphs
– Algorithms for computing marginal and conditional probabilities
– An “inference engine”
– A way to introduce prior probability distributions
Example from Russell and Norvig
Joint probability (with B = Burglary, E = Earthquake, A = Alarm, J = JohnCalls, M = MaryCalls):

P(B, E, A, J, M) = P(B) · P(E) · P(A ∣ B, E) · P(J ∣ A) · P(M ∣ A)

The network and its parameters are a compact representation of the joint probability distribution.
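As a concrete sketch, here is that factorization as code. The CPT numbers below are the ones commonly quoted for this textbook example; treat them as illustrative.

# Minimal sketch of the Russell & Norvig alarm network.
P_B = {True: 0.001, False: 0.999}                # P(Burglary)
P_E = {True: 0.002, False: 0.998}                # P(Earthquake)
P_A = {(True, True): 0.95, (True, False): 0.94,  # P(Alarm=true | B, E)
       (False, True): 0.29, (False, False): 0.001}
P_J = {True: 0.90, False: 0.05}                  # P(JohnCalls=true | A)
P_M = {True: 0.70, False: 0.01}                  # P(MaryCalls=true | A)

def bernoulli(p_true, value):
    """Probability that a boolean variable takes `value`."""
    return p_true if value else 1.0 - p_true

def joint(b, e, a, j, m):
    """P(B, E, A, J, M) = P(B) P(E) P(A | B, E) P(J | A) P(M | A)."""
    return (P_B[b] * P_E[e]
            * bernoulli(P_A[(b, e)], a)
            * bernoulli(P_J[a], j)
            * bernoulli(P_M[a], m))

# e.g. alarm went off and both neighbors called, with no burglary
# and no earthquake:
print(joint(False, False, True, True, True))  # ~ 0.00063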
We can ask questions like:
– “What is the probability that there was an earthquake?”
– “Given that both John and Mary called, what is the probability that there was a burglary?”
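A brute-force way to answer the burglary query, continuing the sketch above (it reuses the joint() function defined there): condition on the calls and marginalize out Alarm and Earthquake.

from itertools import product

def prob_burglary_given_calls():
    """P(Burglary | JohnCalls = true, MaryCalls = true) by enumeration."""
    p = {True: 0.0, False: 0.0}
    for b, e, a in product([True, False], repeat=3):
        p[b] += joint(b, e, a, True, True)   # joint() from the sketch above
    return p[True] / (p[True] + p[False])

print(prob_burglary_given_calls())  # ~ 0.284, the textbook answer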
Example from Daphne Koller
If X, Y, Z are random variables, we write X ⊥ Y ∣ Z to say that X is conditionally independent of Y given Z. In this network:
– Flu ⊥ Hayfever ∣ Season
– Congestion ⊥ Season ∣ Flu, Hayfever
Parents of a node shield it from the influence of its ancestors and other non-descendants, but information about descendants can still influence beliefs about a node.
Y ⊥ (all other variables) ∣ MarkovBlanket(Y)
If we know these variables, Hayfever is independent of MusclePain.
The Markov blanket of a node shields it from the influence of any other node:
– A node is conditionally independent of its non-descendants given its parents.
– A node is conditionally independent of all other nodes given its parents, children, and children’s parents, that is, given its Markov blanket.
(X_i ⊥ NonDescendants(X_i) ∣ Parents(X_i))
(X_i ⊥ X_j ∣ MB(X_i)) for all j ≠ i
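As a sanity check that the factorization really implies the first statement, here is a small enumeration over the Season/Flu/Hayfever fragment of the network above; the CPT numbers are made up for illustration.

from itertools import product

P_season = {"dry": 0.6, "wet": 0.4}
P_flu = {"dry": 0.1, "wet": 0.3}   # P(Flu = true | Season), made up
P_hay = {"dry": 0.4, "wet": 0.1}   # P(Hayfever = true | Season), made up

def joint3(season, flu, hay):
    pf = P_flu[season] if flu else 1 - P_flu[season]
    ph = P_hay[season] if hay else 1 - P_hay[season]
    return P_season[season] * pf * ph

# Flu ⊥ Hayfever | Season means P(Flu | Season, Hayfever) = P(Flu | Season):
for s, h in product(P_season, [True, False]):
    p_cond = joint3(s, True, h) / (joint3(s, True, h) + joint3(s, False, h))
    assert abs(p_cond - P_flu[s]) < 1e-12
print("Flu ⊥ Hayfever | Season holds under the factorization")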
Where do the independence assumptions come from? Domain knowledge.
Bad news: Inference in a Bayesian network is #P-hard (i.e., as hard as counting the number of satisfying solutions of a CNF formula).
More bad news: Even approximate inference in a Bayesian network is NP-hard!
Good news: Efficient algorithms exist for networks with special structures.
Two problems with a directed model here: the edges have no natural direction, and unintended dependencies show up. In the grid example, X8 is independent of everything else given its Markov blanket (the other circled nodes in the figure).
Example from Kevin Murphy
Markov Random Fields:
– Nodes are random variables
– Edges (hyper-edges) define dependencies
Cliques: {AB}, {BC}, {CD}, {AD}
The joint probability decouples over the cliques, with a potential function f(x_c) for each clique c:

P(x) = (1/Z) ∏_{c ∈ Cliques} f(x_c)

For the square graph above:

P(A, B, C, D) = (1/Z) f1(A, B) f2(B, C) f3(C, D) f4(A, D)

This is a Gibbs distribution if all factors are positive.
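A minimal sketch of this Gibbs distribution for the square graph, computing Z by brute force; the potential values are made up (agreeing neighbors score 2, disagreeing score 1).

from itertools import product

def f(x, y):
    # Shared pairwise potential for the square MRF A-B-C-D-A, made up.
    return 2.0 if x == y else 1.0

states = list(product([0, 1], repeat=4))            # s = (A, B, C, D)
score = {s: f(s[0], s[1]) * f(s[1], s[2]) * f(s[2], s[3]) * f(s[3], s[0])
         for s in states}
Z = sum(score.values())                             # the partition function
P = {s: v / Z for s, v in score.items()}            # Gibbs distribution

print(Z)                # 82.0 for these potentials
print(P[(0, 0, 0, 0)])  # all-agree states are the most likely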
Again: where do the independence assumptions come from? Domain knowledge.
Normalize: P(x) = (1/Z) ∏_{c ∈ Cliques} f(x_c), where Z = ∑_x ∏_{c ∈ Cliques} f(x_c). Z is called the partition function: a sum over all assignments to the random variables.

The factor f(x_c, µ) is often written as exp(µ^T x_c): a log-linear model.

Which cliques? A factor graph makes the factorization explicit, with factors instead of cliques.

[Figure: a factor graph over variables 1–5, with factor nodes connecting the variables each factor depends on]

Z: Zustandssumme, “sum over states”, more commonly called the partition function.
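A sketch of a single log-linear factor; the feature map and the weights µ below are made up, and the factor is written over explicit pair features rather than the raw x_c of the slide.

import math

mu = [1.2, -0.3]   # made-up weights

def phi(x, y):
    # Two features of a binary pair: "values agree" and "both are 1".
    return [1.0 if x == y else 0.0, 1.0 if (x and y) else 0.0]

def factor(x, y):
    # f(x_c) = exp(mu . phi(x_c)), the log-linear form.
    return math.exp(sum(m * p for m, p in zip(mu, phi(x, y))))

print(factor(1, 1))  # exp(1.2 - 0.3) = exp(0.9)
print(factor(0, 1))  # exp(0) = 1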
Each factor can be written as f(x_c) = exp(−E(x_c)), where E(x_c) is the energy of clique c existing in state x_c; lower-energy states are more probable.
Both kinds of graphical model give us:
– A set of conditional independence relations
– i.e., a skeleton that shows how a joint probability distribution is factorized

Converting between them:
– A BN can be converted into an MRF with a normalization constant of one
– An MRF can also be converted into a BN, but this may lead to a very large network
See the chapter on undirected graphical models in Koller and Friedman’s book
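As an illustration of the BN-to-MRF direction, here is a sketch of moralization, the standard conversion; the dict-of-parents graph representation is my own choice here.

from itertools import combinations

def moralize(parents):
    """parents: {node: list of its parents} for a DAG.
    Returns the undirected edge set of the moral graph: keep every
    parent-child edge and "marry" all co-parents of each node."""
    edges = set()
    for child, pars in parents.items():
        for p in pars:
            edges.add(frozenset((p, child)))
        for p, q in combinations(pars, 2):
            edges.add(frozenset((p, q)))
    return edges

# The alarm network from earlier: Alarm has co-parents B and E, so
# moralization adds a B-E edge.
bn = {"B": [], "E": [], "A": ["B", "E"], "J": ["A"], "M": ["A"]}
print(sorted(tuple(sorted(e)) for e in moralize(bn)))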
Inference (more on this in future lectures):

[Figure: a chain graphical model over variables 1–5]

– Message passing: each node sends its neighbors messages about what it thinks the neighbor’s state should be.
– What makes an ordering good? Inference is NP-hard in general, but works for simple graphs.
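A sketch of sum-product message passing on a chain like the one in the figure; the potentials are made-up numbers, and whether this matches the exact algorithm the slides present is my assumption. On a chain (a tree) this computes exact marginals.

import numpy as np

# Chain of 5 binary variables: 1 - 2 - 3 - 4 - 5.
n = 5
pair = np.array([[2.0, 1.0], [1.0, 2.0]])  # neighbors prefer to agree (made up)
unary = [np.ones(2) for _ in range(n)]
unary[0] = np.array([3.0, 1.0])            # nudge variable 1 toward state 0

# Forward pass: message arriving at node i from the left.
fwd = [np.ones(2) for _ in range(n)]
for i in range(1, n):
    fwd[i] = pair.T @ (fwd[i - 1] * unary[i - 1])

# Backward pass: message arriving at node i from the right.
bwd = [np.ones(2) for _ in range(n)]
for i in range(n - 2, -1, -1):
    bwd[i] = pair @ (bwd[i + 1] * unary[i + 1])

for i in range(n):
    b = unary[i] * fwd[i] * bwd[i]
    print(i + 1, b / b.sum())   # exact marginal of variable i+1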