SLIDE 1

Undirected Graphical Models: Markov Random Fields

Probabilistic Graphical Models, Sharif University of Technology, Soleymani, Spring 2018

SLIDE 2

Markov Network

- Structure: undirected graph

- Undirected edges show correlations (non-causal relationships) between variables

- e.g., spatial image analysis: the intensities of neighboring pixels are correlated

[Figure: an example Markov network over four nodes A, B, C, D]

SLIDE 3

MRF: Joint distribution

- Factor φ(X₁, …, Xₖ)
  - φ: Val(X₁, …, Xₖ) → ℝ
  - Scope: {X₁, …, Xₖ}

- The joint distribution is parametrized by a set of factors Φ = {φ₁(D₁), …, φ_K(D_K)}:

    P(X₁, …, Xₙ) = (1/Z) ∏_{k=1}^{K} φₖ(Dₖ)

    Z = Σ_X ∏_{k=1}^{K} φₖ(Dₖ)

- Dₖ: the set of variables in the k-th factor (its scope)
- Z: normalization constant, called the partition function
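
To make the factor product and the partition function concrete, here is a minimal Python sketch, assuming three binary variables and two hand-picked factors (the variable names and factor values are purely illustrative, not from the slides):

```python
from itertools import product

# Illustrative factors over binary variables A, B, C (values are made up).
# Each factor maps an assignment of its scope to a non-negative number.
phi_AB = {(0, 0): 30.0, (0, 1): 5.0, (1, 0): 1.0, (1, 1): 10.0}    # scope {A, B}
phi_BC = {(0, 0): 100.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 100.0}  # scope {B, C}

def unnormalized(a, b, c):
    """Product of all factors evaluated at the assignment (A=a, B=b, C=c)."""
    return phi_AB[(a, b)] * phi_BC[(b, c)]

# Partition function Z: sum of the factor product over all joint assignments.
Z = sum(unnormalized(a, b, c) for a, b, c in product([0, 1], repeat=3))

def P(a, b, c):
    """Normalized joint distribution P(A, B, C) = (1/Z) * product of factors."""
    return unnormalized(a, b, c) / Z

# Sanity check: the normalized values sum to 1.
print(Z, sum(P(a, b, c) for a, b, c in product([0, 1], repeat=3)))
```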

SLIDE 4

Misconception example

- Factors show "compatibilities" between different values of the variables in their scope

- A factor is only one contribution to the overall joint distribution

[Factor tables of the misconception example omitted; see Koller & Friedman]

SLIDE 5

[Figure-only slide: no recoverable text]

SLIDE 6

Misconception example

- Some inferences: P(A, B) = …  [table of values omitted]

SLIDE 7

MRF: Gibbs distribution

- Gibbs distribution with factors Φ = {φ₁(D₁), …, φ_K(D_K)}:

    P_Φ(X₁, …, Xₙ) = (1/Z) ∏_{j=1}^{K} φⱼ(Dⱼ)

    Z = Σ_X ∏_{j=1}^{K} φⱼ(Dⱼ)

- φⱼ(Dⱼ): potential function on the clique Dⱼ
- φⱼ: local contingency function
- Dⱼ: the set of variables in the j-th clique

- The potential functions and the cliques in the graph completely determine the joint distribution.

SLIDE 8

MRF Factorization: cliques

- Clique: a subset of nodes in the graph that is fully connected (a complete subgraph)

- Maximal clique: a clique is maximal if no superset of its nodes also forms a clique

- Factors are functions of the variables in the cliques

- To reduce the number of factors, we can allow factors only for maximal cliques

[Figure: graph with edges A−B, A−C, B−C, B−D, C−D]
- Cliques: {A,B,C}, {B,C,D}, {A,B}, {A,C}, {B,C}, {B,D}, {C,D}, {A}, {B}, {C}, {D}
- Max-cliques: {A,B,C}, {B,C,D}
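
The clique structure of the example graph can be enumerated programmatically. A small sketch, assuming the networkx library and the 4-node graph shown on this slide:

```python
import networkx as nx

# The undirected graph from the slide: edges A-B, A-C, B-C, B-D, C-D.
H = nx.Graph([("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D")])

# networkx enumerates the *maximal* cliques; here {A,B,C} and {B,C,D}.
max_cliques = [set(c) for c in nx.find_cliques(H)]
print("maximal cliques:", max_cliques)

# Every clique (complete subgraph) is contained in some maximal clique,
# e.g. {B, C} is a subset of both maximal cliques above.
print(any({"B", "C"} <= c for c in max_cliques))
```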

SLIDE 9

Relation between factorization and independencies

- Theorem: Let X, Y, Z be three disjoint sets of variables:

    P ⊨ (X ⊥ Y | Z) iff P(X, Y, Z) = g(X, Z) h(Y, Z) for some functions g and h

SLIDE 10

MRF Factorization and pairwise independencies

- A distribution P_Φ with Φ = {φ₁(D₁), …, φ_K(D_K)} factorizes over an MRF H if each Dₖ is a complete subgraph of H

- For the conditional independence properties to hold, variables Xᵢ and Xⱼ that are not directly connected must not appear together in the scope of any factor of a distribution belonging to the graph

SLIDE 11

MRFs: Global independencies

- Global independencies, for any disjoint sets A, B, C:

    (A ⊥ B | C) holds if all paths that connect a node in A to a node in B pass through one or more nodes in C

- A path is active given Z if no node on it is in Z

- X and Y are separated given Z if there is no active path between X and Y given Z

- Separation in the undirected graph: sep_H(X; Y | Z)

[Figure: node sets X and Y separated by the set Z]
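
Separation can be checked mechanically: delete the conditioning nodes and test whether any path from X to Y survives. A minimal sketch, assuming networkx; `separated` is a hypothetical helper written only for this illustration:

```python
import networkx as nx

def separated(H, X, Y, Z):
    """sep_H(X; Y | Z): no path from X to Y remains once the nodes in Z are removed."""
    G = H.copy()
    G.remove_nodes_from(Z)
    return not any(nx.has_path(G, x, y) for x in X for y in Y if x in G and y in G)

# 4-cycle A-B-C-D-A: A and C are separated by {B, D}, but not by {B} alone.
H = nx.Graph([("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")])
print(separated(H, {"A"}, {"C"}, {"B", "D"}))  # True
print(separated(H, {"A"}, {"C"}, {"B"}))       # False (path A-D-C is still active)
```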

SLIDE 12

MRF: independencies

- Determining conditional independencies in undirected models is much easier than in directed ones

- Conditioning in undirected models can only eliminate dependencies, while in directed models observations can also create new dependencies (v-structures)

SLIDE 13

MRF: global independencies

- Independencies encoded by H (found using the graph separation discussed previously):

    I(H) = {(X ⊥ Y | Z) : sep_H(X; Y | Z)}

- If P satisfies I(H), we say that H is an I-map (independency map) of P

  - I(H) ⊆ I(P), where I(P) = {(X ⊥ Y | Z) : P ⊨ (X ⊥ Y | Z)}

SLIDE 14

Factorization & Independence

- Factorization ⇒ Independence (soundness of the separation criterion)

  - Theorem: If P factorizes over H and sep_H(X; Y | Z), then P satisfies (X ⊥ Y | Z) (i.e., H is an I-map of P)

- Independence ⇒ Factorization

  - Theorem (Hammersley–Clifford): For a positive distribution P, if P satisfies I(H) = {(X ⊥ Y | Z) : sep_H(X; Y | Z)}, then P factorizes over H

SLIDE 15

Factorization & Independence

- Theorem: Two equivalent views of graph structure for positive distributions:

  - If P satisfies all the independencies that hold in H, then it can be represented as a product of factors over the cliques of H

  - If P factorizes over a graph H, we can read from the graph structure independencies that must hold in P

SLIDE 16

Factorization on Markov networks

- It is not as intuitive as factorization in Bayesian networks

  - The correspondence between the factors in a Gibbs distribution and the distribution P is much more indirect

  - Factors do not necessarily correspond either to probabilities or to conditional probabilities

  - The parameters (of factors) may not be intuitively understandable, making them hard to elicit from people

  - There are no constraints on the parameters of a factor

    - while both CPDs and joint distributions must satisfy certain normalization constraints

SLIDE 17

Interpretation of clique potentials

- Potentials cannot all be marginal or conditional distributions

- A positive clique potential can be viewed as a general compatibility or "goodness" measure over the values of the variables in its scope

SLIDE 18

Different factorizations

- Maximal cliques:

    P_Φ(X₁, X₂, X₃, X₄) = (1/Z) φ₁₂₃(X₁, X₂, X₃) φ₂₃₄(X₂, X₃, X₄)
    Z = Σ_{X₁,X₂,X₃,X₄} φ₁₂₃(X₁, X₂, X₃) φ₂₃₄(X₂, X₃, X₄)

- Sub-cliques:

    P_Φ′(X₁, X₂, X₃, X₄) = (1/Z) φ₁₂(X₁, X₂) φ₂₃(X₂, X₃) φ₁₃(X₁, X₃) φ₂₄(X₂, X₄) φ₃₄(X₃, X₄)
    Z = Σ_{X₁,X₂,X₃,X₄} φ₁₂(X₁, X₂) φ₂₃(X₂, X₃) φ₁₃(X₁, X₃) φ₂₄(X₂, X₄) φ₃₄(X₃, X₄)

- Canonical representation:

    P_Φ″(X₁, X₂, X₃, X₄) = (1/Z) φ₁₂₃(X₁, X₂, X₃) φ₂₃₄(X₂, X₃, X₄) φ₁₂(X₁, X₂) φ₂₃(X₂, X₃) φ₁₃(X₁, X₃)
                           × φ₂₄(X₂, X₄) φ₃₄(X₃, X₄) φ₁(X₁) φ₂(X₂) φ₃(X₃) φ₄(X₄)
    Z = Σ_{X₁,X₂,X₃,X₄} φ₁₂₃(X₁, X₂, X₃) φ₂₃₄(X₂, X₃, X₄) φ₁₂(X₁, X₂) φ₂₃(X₂, X₃)
                           × φ₁₃(X₁, X₃) φ₂₄(X₂, X₄) φ₃₄(X₃, X₄) φ₁(X₁) φ₂(X₂) φ₃(X₃) φ₄(X₄)

[Figure: graph over X₁, X₂, X₃, X₄ with maximal cliques {X₁,X₂,X₃} and {X₂,X₃,X₄}]

SLIDE 19

Pairwise MRF

- All factors are over single variables or pairs of variables (Xᵢ, Xⱼ):

    P(X) = (1/Z) ∏_{(Xᵢ,Xⱼ)∈H} φᵢⱼ(Xᵢ, Xⱼ) ∏ᵢ φᵢ(Xᵢ)

- Pairwise MRFs are popular (a simple special case of general MRFs)

  - they consider only pairwise interactions, not interactions among larger subsets of variables

- Pairwise MRFs are attractive because of their simplicity, and because interactions on edges are an important special case that often arises in practice

- In general, they do not have enough parameters to encompass the whole space of joint distributions

SLIDE 20

Factor graph

- The Markov network structure does not itself fully specify the factorization of P

  - it does not generally reveal all the structure in a Gibbs parameterization

- Factor graph: two kinds of nodes

  - Variable nodes
  - Factor nodes

- The factor graph is a useful structure for inference and parametrization (as we will see)

[Figure: variable nodes X₁, X₂, X₃ connected to factor nodes f₁, f₂, f₃, f₄]

    P(X₁, X₂, X₃) ∝ f₁(X₁, X₂, X₃) f₂(X₁, X₂) f₃(X₂, X₃) f₄(X₃)

SLIDE 21

Energy function

- Constraining clique potentials to be positive could be inconvenient

- We represent a clique potential in an unconstrained form using a real-valued "energy" function

- If the potential functions are strictly positive, φ_D(X_D) > 0:

    φ_D(X_D) = exp{−E_D(X_D)}

    P(X) = (1/Z) exp{−Σ_D E_D(X_D)}

- E_D(X_D): energy function, E_D(X_D) = −ln φ_D(X_D)

SLIDE 22

Log-linear models

- Define the energy function as a linear combination of features

- A set of k features {f₁(D₁), …, f_k(D_k)} on complete subgraphs, where Dᵢ is the scope of the i-th feature:

  - The scope of a feature is a complete subgraph
  - We can have different features over the same subgraph

    P(X) = (1/Z) exp{ −Σ_{i=1}^{k} wᵢ fᵢ(Dᵢ) }
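
A log-linear model is straightforward to evaluate by brute force on a small example: compute the weighted feature sum (the energy), exponentiate its negative, and normalize. A minimal sketch with made-up features and weights over two binary variables:

```python
from itertools import product
import math

# Illustrative log-linear model over binary x1, x2 in {0, 1}.
# The features and weights below are made up for the example.
features = [
    (lambda x1, x2: float(x1 == x2), 1.5),   # agreement feature, weight w1
    (lambda x1, x2: float(x1), -0.5),        # bias feature on x1, weight w2
]

def energy(x1, x2):
    """E(x) = sum_i w_i * f_i(x); P(x) is proportional to exp(-E(x))."""
    return sum(w * f(x1, x2) for f, w in features)

Z = sum(math.exp(-energy(x1, x2)) for x1, x2 in product([0, 1], repeat=2))
P = {(x1, x2): math.exp(-energy(x1, x2)) / Z for x1, x2 in product([0, 1], repeat=2)}
print(P)   # assignments with x1 == x2 get higher probability (positive w1)
```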

SLIDE 23

Ising model

- The most likely joint configurations usually correspond to "low-energy" states

- Xᵢ ∈ {−1, 1}

- Grid model

  - Image processing, lattice physics, etc.
  - The states of adjacent nodes are related

    P(x) = (1/Z) exp{ Σᵢ uᵢxᵢ + Σ_{(i,j)∈E} wᵢⱼ xᵢxⱼ }

- The Ising model uses the features fᵢⱼ(xᵢ, xⱼ) = xᵢxⱼ
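
One way to get a feel for the grid Ising model is a tiny Gibbs sampler that repeatedly resamples each spin from its conditional given its neighbors. A rough sketch, assuming a small grid and made-up parameters u and w (not values from the slides):

```python
import math
import random

N, u, w = 5, 0.0, 0.8            # grid size and made-up node/edge parameters
x = [[random.choice([-1, 1]) for _ in range(N)] for _ in range(N)]

def neighbors(i, j):
    """4-neighborhood of grid cell (i, j), clipped at the borders."""
    return [(a, b) for a, b in [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
            if 0 <= a < N and 0 <= b < N]

def gibbs_sweep(x):
    """Resample every spin from P(x_ij | rest) under P(x) ~ exp(sum u*x_i + sum w*x_i*x_j)."""
    for i in range(N):
        for j in range(N):
            s = sum(x[a][b] for a, b in neighbors(i, j))
            p_plus = 1.0 / (1.0 + math.exp(-2.0 * (u + w * s)))   # P(x_ij = +1 | neighbors)
            x[i][j] = 1 if random.random() < p_plus else -1

for _ in range(100):
    gibbs_sweep(x)
print(x)   # with w > 0, neighboring spins tend to agree
```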

SLIDE 24

Shared features in log-linear models

- In most practical models, the same feature and weight are used over many scopes

    P(x) = (1/Z) exp{ Σᵢ uᵢxᵢ + Σ_{(i,j)∈E} wᵢⱼ xᵢxⱼ },   with fᵢⱼ(xᵢ, xⱼ) = f(xᵢ, xⱼ) = xᵢxⱼ

- With shared parameters uᵢ = u and wᵢⱼ = w:

    P(x) = (1/Z) exp{ Σᵢ u xᵢ + Σ_{(i,j)∈E} w xᵢxⱼ }

SLIDE 25

Image denoising

- yᵢ ∈ {−1, 1}, i = 1, …, D: array of observed noisy pixels

- xᵢ ∈ {−1, 1}, i = 1, …, D: noise-free image

[Figure: original binary image and its noisy version; Bishop]

SLIDE 26

Image denoising

- Energy terms (one per clique type):

    E₁(xᵢ, xⱼ) = −β xᵢxⱼ     (smoothness between neighboring pixels)
    E₂(xᵢ) = −h xᵢ           (bias toward one pixel value)
    E₃(xᵢ, yᵢ) = −η xᵢyᵢ     (agreement with the observed pixel)

    P(x, y) = (1/Z) ∏ᵢ exp{−E₃(xᵢ, yᵢ)} ∏ᵢ exp{−E₂(xᵢ)} ∏_{(i,j)∈E} exp{−E₁(xᵢ, xⱼ)}

            = (1/Z) exp{ −Σᵢ E₃(xᵢ, yᵢ) − Σᵢ E₂(xᵢ) − Σ_{(i,j)∈E} E₁(xᵢ, xⱼ) }

    x̂ = argmax_x P(x | y)

- MPA: most probable assignment of the x variables given the evidence y

SLIDE 27

Image denoising

    E(x, y) = −h Σᵢ xᵢ − β Σ_{(i,j)∈E} xᵢxⱼ − η Σᵢ xᵢyᵢ

    P(x, y) = (1/Z) exp{−E(x, y)}

    x̂ = argmax_x P(x | y)

- MPA: most probable assignment of the x variables given the evidence y
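
The argmax above is intractable to compute exactly on large images; a simple approximation, used for this example in Bishop's treatment, is iterated conditional modes (ICM): sweep over the pixels and set each xᵢ to the value that lowers the energy E(x, y) with everything else fixed. A rough sketch with made-up weights h, β, η and a random stand-in for the observed image:

```python
import random

h, beta, eta = 0.0, 1.0, 2.1   # made-up weights for the bias, smoothness, and data terms
N = 8
y = [[random.choice([-1, 1]) for _ in range(N)] for _ in range(N)]   # stand-in noisy image
x = [row[:] for row in y]                                            # initialize x with y

def local_energy(i, j, v):
    """The terms of E(x, y) that involve pixel (i, j) when x[i][j] = v."""
    nb = [(a, b) for a, b in [(i - 1, j), (i + 1, j), (i, j - 1), (i, j + 1)]
          if 0 <= a < N and 0 <= b < N]
    return -h * v - beta * sum(v * x[a][b] for a, b in nb) - eta * v * y[i][j]

for _ in range(20):              # ICM: a few greedy coordinate-wise sweeps
    for i in range(N):
        for j in range(N):
            x[i][j] = min([-1, 1], key=lambda v: local_energy(i, j, v))
print(x)
```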

SLIDE 28

Image denoising (gray-scale image)

- Metric MRF:

    E(x, y) = β Σ_{(i,j)∈E} min((xᵢ − xⱼ)², d) + η Σᵢ (xᵢ − yᵢ)²

    x̂ = argmax_x (1/Z) exp{−E(x, y)}

- MPE: most probable explanation of the x variables given the evidence y

- Shared feature: fᵢⱼ(xᵢ, xⱼ) = f(xᵢ, xⱼ) = min((xᵢ − xⱼ)², d)

SLIDE 29

MRF: Pairwise and local independencies

- Pairwise independencies: (Xᵢ ⊥ Xⱼ | X − {Xᵢ, Xⱼ})

  - whenever there is no edge between Xᵢ and Xⱼ

- Markov blanket (local independencies): a variable is conditionally independent of every other variable when conditioned only on its neighboring nodes:

    (Xᵢ ⊥ X − {Xᵢ} − MB(Xᵢ) | MB(Xᵢ))

    MB(Xᵢ) = {X′ ∈ X | (Xᵢ, X′) ∈ edges}
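
In an MRF the Markov blanket of a node is simply its set of neighbors, so it can be read directly off the graph. A tiny sketch, assuming networkx and the 4-node example graph used earlier in the deck:

```python
import networkx as nx

H = nx.Graph([("A", "B"), ("A", "C"), ("B", "C"), ("B", "D"), ("C", "D")])

def markov_blanket(H, node):
    """MB(X) in an MRF: the neighbors of X in the undirected graph."""
    return set(H.neighbors(node))

# A is independent of D given MB(A) = {B, C}; symmetrically, D ⊥ A given {B, C}.
print(markov_blanket(H, "A"))  # {'B', 'C'}
print(markov_blanket(H, "D"))  # {'B', 'C'}
```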

SLIDE 30

Relationship between local and global Markov properties

- If P ⊨ I_l(H) then P ⊨ I_p(H)

- If P ⊨ I(H) then P ⊨ I_l(H)

- Theorem: For a positive distribution P, the following three statements are equivalent:

  - P ⊨ I_p(H)  (pairwise independencies)
  - P ⊨ I_l(H)  (local independencies)
  - P ⊨ I(H)   (global independencies)

SLIDE 31

One way to construct an undirected minimal I-map of a distribution

- H is a minimal I-map for P if

  - I(H) ⊆ I(P)
  - removal of any single edge from H renders it no longer an I-map of P

- We can define H to include an edge X−Y for every pair X, Y for which:

    P ⊭ (X ⊥ Y | 𝒳 − {X, Y}),   where 𝒳 is the set of all variables

- Theorem: this H is the unique minimal I-map for the positive distribution P.

SLIDE 32

Perfect map of a distribution

- Not every distribution has an MN perfect map
- Not every distribution has a BN perfect map

[Venn diagram: probabilistic models, with directed and undirected graphical models as overlapping subsets]

SLIDE 33

A loop of at least 4 nodes without a chord has no equivalent in BNs

- Is there a BN that is a perfect map for this MN?

[Figure: the 4-cycle MN over A, B, C, D, which satisfies both (A ⊥ C | B, D) and (B ⊥ D | A, C), shown next to candidate BNs over the same nodes; no candidate captures exactly these two independencies]

SLIDE 34

A v-structure has no equivalent in MNs

- Is there an MN that is a perfect I-map of this BN?

[Figure: a v-structure BN, which satisfies (A ⊥ B) but not (A ⊥ B | C), shown next to candidate MNs over A, B, C; no candidate captures exactly this pattern]

SLIDE 35

Minimal I-map

- Since we may not find a Markov network (MN) that is a perfect map of a BN, and vice versa, we study the minimal I-map property

- H is a minimal I-map for G if

  - I(H) ⊆ I(G)
  - removal of any single edge from H renders it no longer an I-map of G

SLIDE 36

Minimal I-maps: from DAGs to MNs

- The moral graph M(G) of a DAG G is the undirected graph that contains an undirected edge between X and Y if:

  - there is a directed edge between them (in either direction), or
  - X and Y are parents of the same node

- Moralization turns a node and its parents into a fully connected subgraph

[Figure: the v-structure A → C ← B and its moral graph, in which A and B are also connected]
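
Moralization is mechanical: keep every edge of the DAG (dropping its direction) and "marry" the parents of each node. A minimal sketch, assuming networkx; `moralize` is a helper written here for illustration (networkx also provides a built-in moral_graph function in recent versions, which should give the same result):

```python
import networkx as nx
from itertools import combinations

def moralize(G):
    """Moral graph M(G) of a DAG G: undirected skeleton plus edges between co-parents."""
    M = nx.Graph()
    M.add_nodes_from(G.nodes())
    M.add_edges_from(G.edges())                      # drop edge directions
    for node in G.nodes():
        parents = list(G.predecessors(node))
        M.add_edges_from(combinations(parents, 2))   # "marry" the parents
    return M

# v-structure A -> C <- B: moralization adds the edge A - B.
G = nx.DiGraph([("A", "C"), ("B", "C")])
print(sorted(moralize(G).edges()))  # [('A', 'B'), ('A', 'C'), ('B', 'C')]
```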

SLIDE 37

Minimal I-maps: from DAGs to MNs

- Theorem: The moral graph M(G) of a DAG G is a minimal I-map for G

  - The moral graph loses some independence information
  - But we cannot remove any edge from it without introducing new independencies that do not hold in G
  - All independencies in the moral graph are also satisfied in G

- Theorem: If a DAG G is moral, then its moralized graph M(G) is a perfect I-map of G.

SLIDE 38

Minimal I-maps: from MNs to DAGs

- Theorem: If G is a BN that is a minimal I-map for an MN, then G can have no immoralities.

- Corollary: If G is a minimal I-map for an MN, then G is chordal

  - Any BN that is an I-map for an MN must add triangulating edges to the graph

- An undirected graph is chordal if any loop with more than three nodes has a chord

[Figure: the 4-cycle MN over A, B, C, D and a triangulated DAG G that is a minimal I-map of it]
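
Chordality is easy to test programmatically. A small sketch, assuming networkx: the bare 4-cycle is not chordal, and adding a single chord (a triangulating edge) makes it chordal:

```python
import networkx as nx

cycle = nx.Graph([("A", "B"), ("B", "C"), ("C", "D"), ("D", "A")])   # 4-cycle, no chord
print(nx.is_chordal(cycle))            # False -> no BN is a perfect I-map for it

triangulated = cycle.copy()
triangulated.add_edge("B", "D")        # add a chord (a triangulating edge)
print(nx.is_chordal(triangulated))     # True
```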

SLIDE 39

Perfect I-map

- Theorem: Let H be a non-chordal MN. Then there is no BN that is a perfect I-map for H.

- ⇒ If the independencies of an MN can be exactly represented by a BN, then the MN graph is chordal

[Figure: the 4-cycle MN over A, B, C, D as an example of a non-chordal graph]

SLIDE 40

Perfect I-map

- Theorem: Let H be a chordal MN. Then there exists a DAG G that is a perfect I-map for H

- ⇒ The independencies in a graph can be represented by both types of models if and only if the graph is chordal

[Figure: a chordal MN over A, B, C, D, E and a DAG over the same nodes that is a perfect I-map of it]

SLIDE 41

Relationship between BNs and MNs: summary

- Directed and undirected models represent different families of independence assumptions

- Chordal graphs can be represented by both BNs and MNs

- For inference, we can use a single representation for both types of models

  - simpler design and analysis of the inference algorithm

SLIDE 42

Conditional Random Field (CRF)

- Undirected graph H with nodes X ∪ Y

  - X: observed variables
  - Y: target variables

- Consider factors Φ = {φ₁(D₁), …, φ_K(D_K)} where no scope lies entirely within X (each Dᵢ ⊈ X):

    P(Y | X) = (1/Z(X)) P̃(Y, X)

    P̃(Y, X) = ∏_{i=1}^{K} φᵢ(Dᵢ)

    Z(X) = Σ_Y P̃(Y, X)

- Nodes are connected by an edge in H whenever they appear together in the scope of some factor

SLIDE 43

CRF: discriminative model

- Models the conditional probability P(Y | X) rather than the joint probability P(Y, X)

- A CRF is based on the conditional probability of the label sequence given the observation sequence

- The probability of a transition between labels may depend on past and future observations

- Allows arbitrary dependencies among the features of the observation sequence, and we need not model them

  - as opposed to the independence assumptions made in generative models
slide-44
SLIDE 44

Naïve Markov Model

44

𝜚𝑗 𝑌𝑗, 𝑍 = exp 𝑥𝑗𝐽 𝑌𝑗 = 1, 𝑍 = 1 𝜚0 𝑍 = exp 𝑥0𝐽 𝑍 = 1 𝑄 𝑍 = 1 𝑌1, 𝑌2, … , 𝑌𝑙 = 𝜏 𝑥0 +

𝑘=1 𝑙

𝑥

𝑘𝑌 𝑘

𝑌1 𝑌2 𝑌𝑙 𝑍 … 𝑌𝑗 is binary random variable 𝑍: binary random variable

SLIDE 45

CRF: logistic model (Naïve Markov model)

    P̃(Y, X) = exp{ w₀ 1{Y = 1} + Σ_{i=1}^{n} wᵢ 1{Xᵢ = 1, Y = 1} }

    P̃(Y = 1, X) = exp{ w₀ + Σ_{i=1}^{n} wᵢXᵢ }

    P̃(Y = 0, X) = exp{0} = 1

    P(Y = 1 | X) = 1 / (1 + exp{ −(w₀ + Σ_{i=1}^{n} wᵢXᵢ) }) = σ( w₀ + Σ_{i=1}^{n} wᵢXᵢ )

- The number of parameters is linear in n
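
The reduction to logistic regression can be verified numerically: normalizing the two unnormalized values P̃(Y = 1, x) and P̃(Y = 0, x) gives exactly the sigmoid of the linear score. A tiny sketch with made-up weights:

```python
import math

w0, w = -0.5, [1.2, -0.7, 0.3]           # made-up parameters
x = [1, 0, 1]                             # a binary observation vector

score = w0 + sum(wi * xi for wi, xi in zip(w, x))
p_tilde_y1 = math.exp(score)              # unnormalized measure for Y = 1
p_tilde_y0 = 1.0                          # unnormalized measure for Y = 0 (exp of 0)

p_y1 = p_tilde_y1 / (p_tilde_y0 + p_tilde_y1)     # normalize over the two values of Y
sigmoid = 1.0 / (1.0 + math.exp(-score))          # logistic function of the linear score
print(p_y1, sigmoid)                      # the two numbers coincide
```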

SLIDE 46

Linear-chain CRF

    P(Y | X) = (1/Z(X)) P̃(Y, X)

    P̃(Y, X) = ∏_{t=1}^{T−1} φ(Y_t, Y_{t+1}) ∏_{t=1}^{T} φ(Y_t, X_t)

    Z(X) = Σ_Y P̃(Y, X)

[Figure: linear-chain CRF with the target chain Y₁ − Y₂ − … − Y_T and one observation X_t attached to each Y_t]
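
For a short chain, Z(X) and P(Y | X) can be computed by brute force over all label sequences, which makes the definitions above concrete (in practice a forward, dynamic-programming pass computes Z(X) efficiently). A minimal sketch with made-up transition and emission potentials over two labels:

```python
from itertools import product

labels = [0, 1]
T = 4
X = [0, 1, 1, 0]                                  # a made-up observation sequence

# Made-up potentials: phi_trans(y_t, y_{t+1}) and phi_emit(y_t, x_t).
phi_trans = {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 2.0}
phi_emit  = {(0, 0): 3.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0}

def unnormalized(Y, X):
    """Product of transition potentials over consecutive labels and emission potentials."""
    score = 1.0
    for t in range(T - 1):
        score *= phi_trans[(Y[t], Y[t + 1])]
    for t in range(T):
        score *= phi_emit[(Y[t], X[t])]
    return score

Z_of_X = sum(unnormalized(Y, X) for Y in product(labels, repeat=T))

def P(Y, X):
    """Conditional P(Y | X) for this small chain, normalized by Z(X)."""
    return unnormalized(Y, X) / Z_of_X

best = max(product(labels, repeat=T), key=lambda Y: P(Y, X))
print(Z_of_X, best, P(best, X))
```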

SLIDE 47

CRF for sequence labeling

[Figure: two CRF structures for sequence labeling; in one, each label Y_t is connected to the entire observation sequence X_{1:T}; in the other, each label Y_t is connected only to its own observation X_t]

SLIDE 48

CRF as a discriminative model

- A discriminative approach to labeling

- A CRF does not model the distribution over the observations

- Dependencies between the observed variables may be quite complex or poorly understood, but we do not need to worry about modeling them

- When labeling a position, future observations are also taken into account

[Figure: label chains Y₁, Y₂, …, Y_T over per-position observations and over the whole observation sequence]

SLIDE 49

CRF: Named Entity Recognition

- Features: the word is capitalized, the word appears in an atlas of locations, the previous word is "Mrs", …

- Factors: φ(Y_t, Y_{t+1}) and φ(Y_t, X₁, …, X_T)

[Figure: linear-chain CRF for named entity recognition; Koller & Friedman]

SLIDE 50

CRF: Image segmentation example

- A node Yᵢ for the label of each superpixel

  - Val(Yᵢ) = {1, 2, …, K} (e.g., grass, sky, water, …)

- An edge between Yᵢ and Yⱼ whenever the corresponding superpixels share a boundary

- A node Xᵢ for the features (e.g., color, texture, location) of each superpixel

SLIDE 51

CRF: Image segmentation example

- Simple pairwise potential: φ(Yᵢ, Yⱼ) = exp{ −λ 1(Yᵢ ≠ Yⱼ) }

- More complex potentials:

  - e.g., a horse is more likely to be adjacent to vegetation than to water
  - may depend on the relative pixel locations, e.g., water below vegetation, sky above everything

SLIDE 52

CRF: Image segmentation example

[Figure: image segmentation results; Koller & Friedman]

SLIDE 53

Reference

- D. Koller and N. Friedman, "Probabilistic Graphical Models: Principles and Techniques", MIT Press, 2009 [Chapter 4].