

SLIDE 1

Undirected Graphical Models

Chris Williams, School of Informatics, University of Edinburgh

Overview

  • Undirected graphs
  • Conditional independence
  • Potential functions, energy functions
  • Examples: multivariate Gaussian, MRF
  • Boltzmann machines, learning rule
  • Reading: Jordan section 2.2. [chs 19, 20 for additional reading (not examinable)]

Undirected Graphs

  • graph G = (X, E)
  • X is a set of nodes, in one-to-one correspondence with a set of random variables
  • E is a set of undirected edges between the nodes

Global conditional independence

  • Consider arbitrary disjoint index subsets A, B and C
  • If every path from a node in X_A to a node in X_C includes at least one node in X_B, then I(X_A, X_C | X_B)

  • This is a naïve graph-theoretic separation condition (cf. d-separation)

[Figure: node sets X_A, X_B, X_C, with every path from X_A to X_C passing through X_B]
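As a concrete illustration of this separation test (my own sketch, not part of the slides), the check below deletes the nodes in B and asks whether any path from A to C survives; the three-node chain at the end is a made-up example.

    from collections import deque

    def separated(edges, A, B, C):
        """Return True if every path from A to C passes through B,
        i.e. A and C are disconnected once the nodes in B are removed."""
        # adjacency list over the graph with the nodes in B deleted
        adj = {}
        for u, v in edges:
            if u in B or v in B:
                continue
            adj.setdefault(u, set()).add(v)
            adj.setdefault(v, set()).add(u)
        # breadth-first search from A, never entering B
        frontier, seen = deque(A), set(A)
        while frontier:
            u = frontier.popleft()
            if u in C:
                return False          # found a path that avoids B
            for v in adj.get(u, ()):
                if v not in seen:
                    seen.add(v)
                    frontier.append(v)
        return True

    # chain X1 - X2 - X3: X1 and X3 are separated by {X2}
    edges = [(1, 2), (2, 3)]
    print(separated(edges, A={1}, B={2}, C={3}))    # True  -> I(X_A, X_C | X_B)
    print(separated(edges, A={1}, B=set(), C={3}))  # False -> path 1-2-3 avoids B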

SLIDE 2

Graphs and Cliques

  • For directed graphs use P(X) = ∏_i P(X_i | Pa_i), which gives a notion of locality

  • For undirected graphs, locality depends on the notion of cliques
  • A clique of a graph is a fully-connected set of nodes
  • A maximal clique is a clique which cannot be extended to include additional nodes without losing the property of being fully connected
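For illustration (my addition, not from the slides), the maximal cliques of the six-node example graph used on the following slides can be enumerated with networkx, assuming it is installed:

    import networkx as nx

    # the six-node example graph from these slides
    G = nx.Graph([(1, 2), (1, 3), (3, 5), (2, 5), (2, 6), (5, 6), (2, 4)])

    # enumerate maximal cliques: {1,2}, {1,3}, {3,5}, {2,4}, {2,5,6}
    for clique in nx.find_cliques(G):
        print(sorted(clique))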

Parameterization

  • Conditional independence properties of undirected graphs imply a representation of the joint probability as a product of local functions defined on the set C of maximal cliques of the graph (worked toy example below):

p(x) = (1/Z) ∏_{C∈C} ψ_{X_C}(x_C),   with   Z = ∑_x ∏_{C∈C} ψ_{X_C}(x_C)

  • Each ψ_{X_C}(x_C) is a strictly positive, real-valued function, otherwise arbitrary
  • Z is called the partition function
  • Equivalence of conditional independence and the clique factorization is the Hammersley-Clifford theorem

[Figure: six-node graph X_1, …, X_6 with P(x) = ψ(x_1, x_2) ψ(x_1, x_3) ψ(x_3, x_5) ψ(x_2, x_5, x_6) ψ(x_2, x_4) / Z]

  • Potential functions are in general neither conditional nor marginal probabilities
  • Natural interpretation as agreement, constraint, energy
  • Potential function favours certain local configurations by assigning them larger values
  • Global configurations that have high probability are, roughly speaking, those that satisfy as many of the favoured local configurations as possible
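A minimal sketch of the clique parameterization above (my own code, with made-up potential tables): a binary chain X_1 - X_2 - X_3 with potentials on the maximal cliques {X_1, X_2} and {X_2, X_3}, where brute-force enumeration gives the partition function Z and the joint probability.

    import itertools
    import numpy as np

    # made-up pairwise potentials favouring equal neighbouring values
    psi_12 = np.array([[2.0, 0.5],
                       [0.5, 2.0]])   # psi_{X1,X2}(x1, x2)
    psi_23 = np.array([[2.0, 0.5],
                       [0.5, 2.0]])   # psi_{X2,X3}(x2, x3)

    def unnormalised(x1, x2, x3):
        # product of clique potentials for one configuration
        return psi_12[x1, x2] * psi_23[x2, x3]

    # partition function: sum of the clique products over all configurations
    Z = sum(unnormalised(*x) for x in itertools.product([0, 1], repeat=3))

    # joint probability of one configuration
    p_000 = unnormalised(0, 0, 0) / Z
    print(Z, p_000)   # Z = 2*(2*2 + 2*0.5 + 0.5*0.5 + 0.5*2) = 12.5, p(0,0,0) = 0.32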

SLIDE 3

Energy functions

  • Enforce positivity by defining ψ_{X_C}(x_C) = exp{−H_{X_C}(x_C)}

  • Negative sign is conventional (high probability, low energy)

p(x) = (1/Z) ∏_{C∈C} ψ_{X_C}(x_C) = (1/Z) exp{−∑_{C∈C} H_{X_C}(x_C)}

  • Energy: H(x) = ∑_{C∈C} H_{X_C}(x_C)
  • Boltzmann distribution: p(x) = (1/Z) exp{−H(x)}
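Continuing the toy chain from the previous sketch (again my own illustration), the same distribution written in energy form, with H_C = −log ψ_C and p(x) = exp{−H(x)}/Z:

    import itertools
    import numpy as np

    # clique energies H_C = -log psi_C for the toy chain above
    H_12 = -np.log(np.array([[2.0, 0.5], [0.5, 2.0]]))
    H_23 = -np.log(np.array([[2.0, 0.5], [0.5, 2.0]]))

    def energy(x1, x2, x3):
        return H_12[x1, x2] + H_23[x2, x3]     # H(x) = sum of clique energies

    states = list(itertools.product([0, 1], repeat=3))
    Z = sum(np.exp(-energy(*x)) for x in states)
    p = {x: np.exp(-energy(*x)) / Z for x in states}
    print(Z, p[(0, 0, 0)])   # same Z = 12.5 and p(0,0,0) = 0.32 as with the potentials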

Local Markov Property

  • Denote all nodes by V
  • For a vertex a, let ∂a denote the boundary of a, i.e. the set of vertices in V\a that are neighbours of a
  • Local Markov property: for any vertex a, the conditional distribution of X_a given X_{V\a} depends only on X_{∂a}

[Figure: the same six-node graph and factorization P(x) = ψ(x_1, x_2) ψ(x_1, x_3) ψ(x_3, x_5) ψ(x_2, x_5, x_6) ψ(x_2, x_4) / Z as above]

Example I—Multivariate Gaussian

p(x) ∝ exp{−(1/2) xᵀ Σ⁻¹ x}

  • It is the zeros in Σ⁻¹ that define the missing edges in the graph and hence the conditional independence structure
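A quick numerical illustration (my addition, with a made-up precision matrix): the zero in entry (1, 3) of Σ⁻¹ removes the edge between X_1 and X_3 and hence encodes X_1 ⊥ X_3 | X_2, even though the marginal covariance between X_1 and X_3 is non-zero.

    import numpy as np

    # a hypothetical 3x3 precision matrix: the zero in position (0, 2)
    # means no edge between X1 and X3, i.e. X1 independent of X3 given X2
    K = np.array([[ 2.0, -1.0,  0.0],
                  [-1.0,  2.0, -1.0],
                  [ 0.0, -1.0,  2.0]])
    Sigma = np.linalg.inv(K)

    # partial correlation of X1 and X3 given X2 is -K[0,2]/sqrt(K[0,0]*K[2,2]) = 0
    print(Sigma)                                    # marginal cov of X1, X3 is non-zero
    print(-K[0, 2] / np.sqrt(K[0, 0] * K[2, 2]))    # 0.0 -> conditional independence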

SLIDE 4

Example II—Markov Random Field

  • Discrete random variables
  • Ising model in statistical physics (spins up/down)
  • MRF models used in image analysis, e.g. segmentation of regions. Define energies such that blocks of the same labels are preferred (Geman and Geman, 1984)
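A hedged sketch of such an energy (my own code, not the Geman and Geman model): an Ising-style energy over a grid of ±1 labels that is lowered when neighbouring pixels agree, so low-energy (high-probability) configurations are blocky label images.

    import numpy as np

    def ising_energy(labels, J=1.0):
        """Energy of a +/-1 label image under a simple Ising model:
        H(x) = -J * sum over neighbouring pixel pairs of x_i * x_j,
        so equal neighbouring labels lower the energy."""
        horiz = np.sum(labels[:, :-1] * labels[:, 1:])
        vert = np.sum(labels[:-1, :] * labels[1:, :])
        return -J * (horiz + vert)

    blocky = np.ones((4, 4), dtype=int)              # all labels equal
    noisy = np.random.choice([-1, 1], size=(4, 4))   # random labels
    # the all-equal image attains the minimum possible energy
    print(ising_energy(blocky), ising_energy(noisy))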

Boltzmann machines

  • Hinton and Sejnowski, 1983
  • Binary units ±1

p(x) = (1/Z) exp{(1/2) ∑_{ij} w_ij x_i x_j}

  • w_ij = w_ji and w_ii = 0
  • set x_0 = 1 (bias unit)
  • (1/2) ∑_{ij} w_ij x_i x_j = ∑_{i<j} w_ij x_i x_j

  • Can have hidden units
  • Potential function is not an arbitrary function of the cliques, but is based only on pairwise links (can generalize)

  • P(X_i = 1 | rest) = σ(2 h_i), where h_i = ∑_j w_ij x_j

[Figure: Boltzmann machine with hidden units and output (visible) units]
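To make the conditional P(X_i = 1 | rest) = σ(2 h_i) concrete (illustrative code with made-up weights, not from the slides):

    import numpy as np

    def sigmoid(a):
        return 1.0 / (1.0 + np.exp(-a))

    # made-up symmetric weights with zero diagonal; units take values +/-1
    W = np.array([[ 0.0, 0.5, -0.2],
                  [ 0.5, 0.0,  0.8],
                  [-0.2, 0.8,  0.0]])
    x = np.array([1, -1, 1])

    i = 0
    h_i = W[i] @ x                 # h_i = sum_j w_ij x_j (valid since w_ii = 0)
    p_i = sigmoid(2 * h_i)         # P(X_i = 1 | rest) = sigma(2 h_i)
    print(h_i, p_i)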

SLIDE 5

Boltzmann machine learning rule

Denote visible units by x, hidden units by y:

p(x, y) = (1/Z) exp{∑_k θ_k φ_k(x, y)}

This is the general form of a log-linear model.

  • Features φ_k(x, y) are the pairwise potentials for a Boltzmann machine
  • Parameters θ_k correspond to weights in the Boltzmann machine

p(x, y) = (1/Z) exp{∑_k θ_k φ_k(x, y)}

p(x) = (1/Z) ∑_y exp{∑_k θ_k φ_k(x, y)}

log p(x) = log ∑_y exp{∑_k θ_k φ_k(x, y)} − log Z

∂ log p(x) / ∂θ_l = ∑_y φ_l(x, y) p(y|x) − ∑_{x,y} φ_l(x, y) p(x, y)

                  ≝ ⟨φ_l(x, y)⟩_+ − ⟨φ_l(x, y)⟩_−

  • + denotes the clamped phase (with x clamped on the visible units), − denotes the free-running phase (all units unclamped); a numerical sketch of this difference follows at the end of this slide

  • Learning stops when statistics match in both phases
  • Statistics could be computed exactly (using the junction tree algorithm) but often this is intractable, so use stochastic sampling

  • The Boltzmann machine learning rule is gradient based; one can also use Iterative Scaling algorithms (see Jordan ch 20) to update the θ_k's
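A brute-force sketch of the gradient above (my own code, feasible only for tiny models): for a fully visible Boltzmann machine the clamped statistic is just the data average of x_i x_j, and the free-running statistic is the same average under the model; with hidden units the clamped phase would additionally average over p(y|x). The weights and data below are made up.

    import itertools
    import numpy as np

    def model_probs(W, states):
        """Exact Boltzmann machine distribution p(x) = exp(0.5 x^T W x)/Z (tiny models only)."""
        e = np.array([0.5 * x @ W @ x for x in states])
        p = np.exp(e)
        return p / p.sum()

    # fully visible machine with made-up weights and a made-up data set of +/-1 vectors
    W = np.array([[0.0, 0.3], [0.3, 0.0]])
    data = np.array([[1, 1], [1, 1], [-1, -1], [1, -1]])
    states = np.array(list(itertools.product([-1, 1], repeat=2)))

    clamped = np.mean(data[:, 0] * data[:, 1])          # <x_i x_j>_+ over the data
    p = model_probs(W, states)
    free = np.sum(p * states[:, 0] * states[:, 1])      # <x_i x_j>_- under the model
    grad_w01 = clamped - free                            # learning: w_01 <- w_01 + eta * grad_w01
    print(clamped, free, grad_w01)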

Gibbs sampler

Loop T times
    for each unit i to be sampled from
        compute h_i and sample from P(X_i | rest)
    end for
end loop
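A minimal Python sketch of this loop for a Boltzmann machine with ±1 units (illustrative only; the weights are made up):

    import numpy as np

    def gibbs_sample(W, x0, T, rng):
        """Gibbs sampler for a Boltzmann machine with +/-1 units:
        repeatedly resample each unit from P(X_i = 1 | rest) = sigma(2 h_i)."""
        x = x0.copy()
        for _ in range(T):
            for i in range(len(x)):
                h_i = W[i] @ x                              # valid since w_ii = 0
                p_i = 1.0 / (1.0 + np.exp(-2.0 * h_i))      # P(X_i = 1 | rest)
                x[i] = 1 if rng.random() < p_i else -1
        return x

    rng = np.random.default_rng(0)
    W = np.array([[0.0, 0.5], [0.5, 0.0]])                  # made-up symmetric weights
    print(gibbs_sample(W, np.array([1, -1]), T=100, rng=rng))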

  • This is a Markov Chain Monte Carlo (MCMC) method. Under general conditions this will converge to the correct distribution as T → ∞

  • Boltzmann machine learning can be slow due to the need to use MCMC techniques. The gradient is the difference of two noisy estimates

  • Hinton (1999, 2000) has introduced the Products of Experts (PoE) architecture, which may get round some of these difficulties