Graphical Models - Part II Oliver Schulte - CMPT 726 Bishop PRML - PowerPoint PPT Presentation

Graphical Models - Part II
Oliver Schulte - CMPT 726
Bishop PRML Ch. 8

Outline
• Markov Random Fields
• Inference

Conditional Independence in Graphs
[Figure: example graphs over nodes a, b, c]
• Recall that for Bayesian networks, conditional independence was a bit complicated
  • d-separation with head-to-head links
• We would like to construct a graphical representation in which conditional independence reduces to straightforward path checking

Markov Random Fields
[Figure: undirected graph over A, B, C]
• Markov random fields (MRFs) contain one node per variable
• Undirected graph over these nodes
• Conditional independence is given by simple graph separation: a path is blocked when a node on it is observed
• e.g. in the above graph, $A \perp\!\!\!\perp B \mid C$
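The separation check above can be sketched as a breadth-first search that refuses to walk through observed nodes. This is a minimal sketch; the function name `is_separated` and the three-node chain A - C - B are assumptions made for illustration, not the graph from the slide.

```python
from collections import deque

def is_separated(adjacency, a, b, observed):
    """Return True if every path from a to b passes through an observed node."""
    blocked = set(observed)
    visited = {a}
    frontier = deque([a])
    while frontier:
        node = frontier.popleft()
        for nbr in adjacency[node]:
            if nbr == b:
                return False  # reached b without crossing an observed node
            if nbr not in visited and nbr not in blocked:
                visited.add(nbr)
                frontier.append(nbr)
    return True

# Hypothetical undirected chain A - C - B (an assumption for illustration).
graph = {"A": {"C"}, "B": {"C"}, "C": {"A", "B"}}

print(is_separated(graph, "A", "B", observed={"C"}))   # C blocks the only path
print(is_separated(graph, "A", "B", observed=set()))   # nothing blocks the path
```

On this chain the first call reports separation and the second does not, matching the reading that A and B are conditionally independent only once C is observed.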

Markov Blanket
• With this simple check for conditional independence, the Markov blanket is also simple
• Recall that the Markov blanket MB of $x_i$ is the set of nodes such that $x_i$ is conditionally independent of the rest of the graph given MB
• In an MRF, the Markov blanket of a node is its set of neighbours

MRF Factorization
• Remember that graphical models define a factorization of the joint distribution
• What should the factorization be so that we end up with the simple conditional independence check?
• For $x_i$ and $x_j$ not connected by an edge in the graph: $x_i \perp\!\!\!\perp x_j \mid \mathbf{x}_{\setminus\{i,j\}}$
• So there should not be any factor $\psi(x_i, x_j)$ in the factorized form of the joint

Cliques
[Figure: undirected graph over nodes $x_1$, $x_2$, $x_3$, $x_4$]
• A clique in a graph is a subset of nodes such that there is a link between every pair of nodes in the subset
• A maximal clique is a clique to which one cannot add another node and have the set remain a clique
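To make the definition of a maximal clique concrete, the following is a minimal sketch that enumerates maximal cliques with the basic Bron-Kerbosch recursion (a standard algorithm, not one named in the slides). The edge set is an assumption: a four-node graph with links 1-2, 1-3, 2-3, 2-4, 3-4.

```python
def maximal_cliques(adjacency):
    """Enumerate all maximal cliques using the basic Bron-Kerbosch recursion."""
    cliques = []

    def expand(current, candidates, excluded):
        # current is a clique; it is maximal when nothing can extend it.
        if not candidates and not excluded:
            cliques.append(frozenset(current))
            return
        for v in list(candidates):
            expand(current | {v},
                   candidates & adjacency[v],
                   excluded & adjacency[v])
            candidates = candidates - {v}
            excluded = excluded | {v}

    expand(set(), set(adjacency), set())
    return cliques

# Assumed example graph over x1..x4 (edges chosen for illustration).
adjacency = {1: {2, 3}, 2: {1, 3, 4}, 3: {1, 2, 4}, 4: {2, 3}}
print(maximal_cliques(adjacency))
```

For this assumed graph the maximal cliques are {1, 2, 3} and {2, 3, 4}; the pair {1, 2} is a clique but not maximal, since node 3 can be added.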

MRF Joint Distribution
• Note that nodes in a clique cannot be made conditionally independent from each other
• So defining factors $\psi(\cdot)$ on nodes in a clique is “safe”
• The joint distribution for a Markov random field is:
  $p(x_1, \ldots, x_K) = \frac{1}{Z} \prod_C \psi_C(x_C)$
  where $x_C$ is the set of nodes in clique $C$, and the product runs over all maximal cliques
• Each $\psi_C(x_C) \geq 0$
• $Z$ is a normalization constant

MRF Joint - Terminology
• The joint distribution for a Markov random field is:
  $p(x_1, \ldots, x_K) = \frac{1}{Z} \prod_C \psi_C(x_C)$
• Each $\psi_C(x_C) \geq 0$ is called a potential function
• $Z$, the normalization constant, is called the partition function:
  $Z = \sum_{\mathbf{x}} \prod_C \psi_C(x_C)$
• $Z$ is very costly to compute, since it is a sum/integral over all possible states for all variables in $\mathbf{x}$
• We do not always need to evaluate it, though: it cancels when computing conditional probabilities
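A small brute-force sketch illustrates both points above: the partition function sums over every joint state, and a conditional probability can be computed from unnormalized products alone. The three-variable binary chain and the `agree` potential are assumptions chosen for illustration.

```python
from itertools import product

def unnormalized(x, potentials):
    """Product of clique potentials evaluated at the full assignment x."""
    value = 1.0
    for scope, psi in potentials:
        value *= psi(*(x[i] for i in scope))
    return value

def partition_function(potentials, num_vars, states=(0, 1)):
    """Brute-force Z: sum the unnormalized product over every joint state."""
    return sum(unnormalized(x, potentials)
               for x in product(states, repeat=num_vars))

# Hypothetical example: chain x0 - x1 - x2 whose potentials favour agreement.
def agree(a, b):
    return 2.0 if a == b else 1.0

potentials = [((0, 1), agree), ((1, 2), agree)]
Z = partition_function(potentials, num_vars=3)

# p(x0 = 0 | x1 = 0, x2 = 0): a ratio of unnormalized values, so Z is not needed.
numerator = unnormalized((0, 0, 0), potentials)
denominator = sum(unnormalized((x0, 0, 0), potentials) for x0 in (0, 1))
conditional = numerator / denominator

print(Z, conditional)
```

The sum for Z ranges over 2^3 states here; with K binary variables it ranges over 2^K, which is why the brute-force approach does not scale.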

MRF Joint Distribution Example
[Figure: undirected graph over nodes $x_1$, $x_2$, $x_3$, $x_4$]
• The joint distribution for a Markov random field is:
  $p(x_1, \ldots, x_4) = \frac{1}{Z} \prod_C \psi_C(x_C) = \frac{1}{Z} \psi_{123}(x_1, x_2, x_3)\, \psi_{234}(x_2, x_3, x_4)$
• Note that maximal cliques subsume smaller ones: $\psi_{123}(x_1, x_2, x_3)$ could include $\psi_{12}(x_1, x_2)$, though sometimes smaller cliques are explicitly used for clarity
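As a worked instance of the subsumption remark above, an explicit pairwise factor can be folded into a maximal-clique potential without changing the joint. The primed symbol $\psi'_{123}$ is introduced here only for illustration.

```latex
\begin{align*}
p(x_1, \ldots, x_4)
  &= \frac{1}{Z}\, \psi_{12}(x_1, x_2)\, \psi_{123}(x_1, x_2, x_3)\, \psi_{234}(x_2, x_3, x_4) \\
  &= \frac{1}{Z}\, \psi'_{123}(x_1, x_2, x_3)\, \psi_{234}(x_2, x_3, x_4),
  \qquad
  \psi'_{123}(x_1, x_2, x_3) = \psi_{12}(x_1, x_2)\, \psi_{123}(x_1, x_2, x_3).
\end{align*}
```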

Hammersley-Clifford
• The definition of the joint:
  $p(x_1, \ldots, x_K) = \frac{1}{Z} \prod_C \psi_C(x_C)$
• Note that we started with particular conditional independences
• We then formulated the factorization based on clique potentials
• This formulation resulted in the right conditional independences
• The converse is true as well: any strictly positive distribution with the conditional independences given by the undirected graph can be represented using a product of clique potentials
• This is the Hammersley-Clifford theorem

Energy Functions
• Often use the exponential, which is non-negative, to define potential functions:
  $\psi_C(x_C) = \exp\{-E_C(x_C)\}$
• Minus sign by convention
• $E_C(x_C)$ is called an energy function
• From physics, low energy = high probability
• This exponential representation is known as the Boltzmann distribution

Energy Functions - Intuition
• Joint distribution nicely rearranges as
  $p(x_1, \ldots, x_K) = \frac{1}{Z} \prod_C \psi_C(x_C) = \frac{1}{Z} \exp\left\{ -\sum_C E_C(x_C) \right\}$
• Intuition about potential functions: the $\psi_C$ describe good (low energy) sets of states for adjacent nodes
• An example of this is next

Image Denoising
• Consider the problem of trying to correct (denoise) an image that has been corrupted
• Assume the image is binary
• Observed (noisy) pixel values $y_i \in \{-1, +1\}$
• Unobserved true pixel values $x_i \in \{-1, +1\}$
• Another application: face sketch synthesis from photos, http://people.csail.mit.edu/xgwang/sketch.html

Image Denoising - Graphical Model
[Figure: graphical model with observed nodes $y_i$ and true pixel nodes $x_i$]
• Cliques containing each true pixel value $x_i \in \{-1, +1\}$ and observed value $y_i \in \{-1, +1\}$
  • Observed pixel value is usually the same as the true pixel value
  • Energy function $-\eta x_i y_i$, $\eta > 0$: lower energy (better) if $x_i = y_i$
• Cliques containing adjacent true pixel values $x_i$, $x_j$
  • Nearby pixel values are usually the same
  • Energy function $-\beta x_i x_j$, $\beta > 0$: lower energy (better) if $x_i = x_j$

Image Denoising - Graphical Model
[Figure: graphical model with observed nodes $y_i$ and true pixel nodes $x_i$]
• Complete energy function:
  $E(\mathbf{x}, \mathbf{y}) = -\beta \sum_{\{i,j\}} x_i x_j - \eta \sum_i x_i y_i$
• Joint distribution:
  $p(\mathbf{x}, \mathbf{y}) = \frac{1}{Z} \exp\{-E(\mathbf{x}, \mathbf{y})\}$
• Or, as potential functions $\psi_n(x_i, x_j) = \exp(\beta x_i x_j)$, $\psi_p(x_i, y_i) = \exp(\eta x_i y_i)$:
  $p(\mathbf{x}, \mathbf{y}) = \frac{1}{Z} \prod_{\{i,j\}} \psi_n(x_i, x_j) \prod_i \psi_p(x_i, y_i)$
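The complete energy function above can be evaluated directly on a small image. This is a minimal sketch that assumes a 4-connected pixel grid (each pixel linked to its horizontal and vertical neighbours) and images stored as lists of lists; the values of `beta` and `eta` in the example are arbitrary.

```python
def energy(x, y, beta, eta):
    """E(x, y) = -beta * sum_{i,j} x_i x_j - eta * sum_i x_i y_i on a 2D grid."""
    rows, cols = len(x), len(x[0])
    pair_sum = 0
    for r in range(rows):
        for c in range(cols):
            # Count each neighbouring pair once: look down and right only.
            if r + 1 < rows:
                pair_sum += x[r][c] * x[r + 1][c]
            if c + 1 < cols:
                pair_sum += x[r][c] * x[r][c + 1]
    obs_sum = sum(x[r][c] * y[r][c] for r in range(rows) for c in range(cols))
    return -beta * pair_sum - eta * obs_sum

# Hypothetical 2x2 observation with one pixel that disagrees with the rest.
y = [[1, 1], [1, -1]]
clean = [[1, 1], [1, 1]]

print(energy(y, y, beta=1.0, eta=1.0))      # keep the noisy image as-is
print(energy(clean, y, beta=1.0, eta=1.0))  # flip the odd pixel
```

With these assumed settings the all-ones candidate has lower energy (-6.0) than copying the observation (-4.0), so the smoothness term outweighs the single disagreement with the data.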

Image Denoising - Inference
• The denoising query is $\arg\max_{\mathbf{x}} p(\mathbf{x} \mid \mathbf{y})$
• Two approaches:
  • Iterated conditional modes (ICM): hill climbing in $\mathbf{x}$, one variable $x_i$ at a time
    • Simple to compute: the Markov blanket is just the observation plus neighbouring pixels
  • Graph cuts: formulate as a max-flow/min-cut problem, exact inference (for this graph)
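ICM as described above can be sketched in a few lines. The sketch assumes a 4-connected grid, initialisation at the observed image, raster-order sweeps, and a cap of `max_sweeps` passes; none of these choices is specified in the slides.

```python
def icm(y, beta, eta, max_sweeps=10):
    """Iterated conditional modes for the binary denoising MRF (pixels in {-1, +1})."""
    rows, cols = len(y), len(y[0])
    x = [row[:] for row in y]  # start from the noisy observation
    for _ in range(max_sweeps):
        changed = False
        for r in range(rows):
            for c in range(cols):
                # Markov blanket of x_i: its grid neighbours and its observation y_i.
                nbr_sum = sum(
                    x[rr][cc]
                    for rr, cc in ((r - 1, c), (r + 1, c), (r, c - 1), (r, c + 1))
                    if 0 <= rr < rows and 0 <= cc < cols
                )
                # Local energy for x_i = s is -s * field, so pick the sign of field.
                field = beta * nbr_sum + eta * y[r][c]
                best = 1 if field > 0 else -1 if field < 0 else x[r][c]
                if best != x[r][c]:
                    x[r][c] = best
                    changed = True
        if not changed:
            break  # local optimum: no single-pixel change lowers the energy
    return x

# Hypothetical 3x3 image of +1 pixels with a flipped centre pixel.
noisy = [[1, 1, 1], [1, -1, 1], [1, 1, 1]]
print(icm(noisy, beta=1.0, eta=1.0))
```

Each update only lowers or preserves the total energy, so the procedure is hill climbing and can stop at a local rather than global optimum.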

Converting Directed Graphs into Undirected Graphs
[Figure: directed chain and undirected chain over $x_1, x_2, \ldots, x_{N-1}, x_N$]
• Consider a simple directed chain graph:
  $p(\mathbf{x}) = p(x_1)\, p(x_2 \mid x_1)\, p(x_3 \mid x_2) \cdots p(x_N \mid x_{N-1})$
• Can convert to an undirected graph:
  $p(\mathbf{x}) = \frac{1}{Z} \psi_{1,2}(x_1, x_2)\, \psi_{2,3}(x_2, x_3) \cdots \psi_{N-1,N}(x_{N-1}, x_N)$
  where $\psi_{1,2} = p(x_1)\, p(x_2 \mid x_1)$, all other $\psi_{k-1,k} = p(x_k \mid x_{k-1})$, and $Z = 1$
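The chain conversion above can be checked numerically for N = 3. The probability tables below are made-up values for a binary chain, used only to confirm that the potentials reproduce the directed joint and that Z comes out as 1.

```python
from itertools import product

# Hypothetical binary chain x1 -> x2 -> x3 (numbers are illustrative only).
p_x1 = {0: 0.6, 1: 0.4}
p_next = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.2, 1: 0.8}}  # p_next[prev][cur]

def directed_joint(x1, x2, x3):
    return p_x1[x1] * p_next[x1][x2] * p_next[x2][x3]

def psi_12(x1, x2):
    # The first potential absorbs both p(x1) and p(x2 | x1).
    return p_x1[x1] * p_next[x1][x2]

def psi_23(x2, x3):
    return p_next[x2][x3]

states = list(product((0, 1), repeat=3))
Z = sum(psi_12(a, b) * psi_23(b, c) for a, b, c in states)
max_gap = max(abs(psi_12(a, b) * psi_23(b, c) - directed_joint(a, b, c))
              for a, b, c in states)

print(Z, max_gap)
```

Here `Z` equals 1 up to floating-point rounding and `max_gap` is zero, since the product of potentials is the directed factorization regrouped.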

Converting Directed Graphs into Undirected Graphs
• The chain was straightforward because for each conditional $p(x_i \mid \mathrm{pa}_i)$, the nodes $\{x_i\} \cup \mathrm{pa}_i$ were contained in one clique
• Hence we could define that clique potential to include that conditional
• For a general directed graph we can force this to occur by “marrying” the parents
  • Add links between all parents in $\mathrm{pa}_i$
  • This process is known as moralization, and it creates a moral graph

Strong Morals
[Figure: directed graph over $x_1$, $x_2$, $x_3$, $x_4$ (left) and its moral graph (right)]
• Start with the directed graph on the left
• Add undirected edges between all parents of each node
• Remove directionality from the original edges
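The two moralization steps above can be sketched as follows. The input format (a dict mapping each node to a list of its parents) and the example graph, in which x1, x2 and x3 are all parents of x4, are assumptions for illustration.

```python
from itertools import combinations

def moralize(parents):
    """Build the moral graph from a dict mapping each node to its list of parents."""
    undirected = {node: set() for node in parents}
    for child, pa in parents.items():
        # Drop directionality from the original parent -> child edges.
        for p in pa:
            undirected[child].add(p)
            undirected[p].add(child)
        # "Marry" the parents: link every pair of parents of the same child.
        for p, q in combinations(pa, 2):
            undirected[p].add(q)
            undirected[q].add(p)
    return undirected

# Assumed example: x1, x2, x3 are all parents of x4.
parents = {"x1": [], "x2": [], "x3": [], "x4": ["x1", "x2", "x3"]}
moral = moralize(parents)
print(moral)
```

For this assumed graph the moral graph is fully connected, so all four nodes fall into a single clique that can hold the conditional of x4 given its three parents.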

Constructing Potential Functions
[Figure: directed graph over $x_1$, $x_2$, $x_3$, $x_4$ and its moral graph]
• Initialize all potential functions to be 1
• With the moral graph, for each $p(x_i \mid \mathrm{pa}_i)$, there is at least one clique which contains all of $\{x_i\} \cup \mathrm{pa}_i$
• Multiply $p(x_i \mid \mathrm{pa}_i)$ into the potential function for one of these cliques
• $Z = 1$ again since:
  $p(\mathbf{x}) = \prod_C \psi_C(x_C) = \prod_i p(x_i \mid \mathrm{pa}_i)$
  which is already normalized

Equivalence Between Graph Types
• Note that the moralized undirected graph loses some of the conditional independence statements of the directed graph
• Further, there are certain conditional independence assumptions which can be represented by directed graphs but cannot be represented by undirected graphs, and vice versa
• Directed graph: $A \perp\!\!\!\perp B \mid \emptyset$, $A \not\perp\!\!\!\perp B \mid C$ cannot be represented using an undirected graph
• Undirected graph: $A \not\perp\!\!\!\perp B \mid \emptyset$, $A \perp\!\!\!\perp B \mid C \cup D$, $C \perp\!\!\!\perp D \mid A \cup B$ cannot be represented using a directed graph
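The directed case above (A and B marginally independent, but dependent once C is observed) can be checked numerically on a head-to-head graph A → C ← B. The probability tables are made-up values, and the helper `prob` is a hypothetical name introduced for this sketch.

```python
from itertools import product

# Hypothetical head-to-head network A -> C <- B with binary variables.
p_a = {0: 0.5, 1: 0.5}
p_b = {0: 0.5, 1: 0.5}
p_c1 = {(0, 0): 0.1, (0, 1): 0.9, (1, 0): 0.9, (1, 1): 0.9}  # p(C = 1 | A, B)

def joint(a, b, c):
    pc1 = p_c1[(a, b)]
    return p_a[a] * p_b[b] * (pc1 if c == 1 else 1.0 - pc1)

def prob(a=None, b=None, c=None):
    """Marginal probability of whichever arguments are not None."""
    total = 0.0
    for va, vb, vc in product((0, 1), repeat=3):
        if a in (None, va) and b in (None, vb) and c in (None, vc):
            total += joint(va, vb, vc)
    return total

# Marginally: p(A, B) versus p(A) p(B).
marginal_gap = abs(prob(a=1, b=1) - prob(a=1) * prob(b=1))

# Given C = 1: p(A, B | C) versus p(A | C) p(B | C).
pc = prob(c=1)
conditional_gap = abs(prob(a=1, b=1, c=1) / pc
                      - (prob(a=1, c=1) / pc) * (prob(b=1, c=1) / pc))

print(marginal_gap, conditional_gap)
```

The marginal gap is zero up to rounding while the conditional gap is clearly nonzero. An undirected graph over A, B, C would need an A-B link to allow the dependence given C, and that link would in turn rule out the marginal independence.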

