Undirected Graphical Models
Michael Gutmann
Probabilistic Modelling and Reasoning (INFR11134)
School of Informatics, University of Edinburgh
Spring semester 2018
Recap
◮ The number of free parameters in probabilistic models
increases with the number of random variables involved.
◮ Making statistical independence assumptions reduces the
number of free parameters that need to be specified.
◮ Starting with the chain rule and an ordering of the random
variables, we used statistical independencies to simplify the representation.
◮ We thus obtained a factorisation in terms of a product of
conditional pdfs that we visualised as a DAG.
◮ In turn, we used DAGs to define sets of distributions
(“directed graphical models”).
◮ We discussed independence properties satisfied by the
distributions, d-separation, and the equivalence to the factorisation.
The directionality in directed graphical models
◮ So far we mainly exploited the property
x ⊥⊥ y | z ⇔ p(y|x, z) = p(y|z)
◮ But when working with p(y|x, z) we impose an ordering or
directionality from x and z to y.
◮ Directionality matters in directed graphical models.
  [The slide compares two directed graphs over x, z, and y.]
◮ In some cases, directionality is natural, but in others we do not
want to choose one direction over another.
◮ We now discuss how to represent independencies in a
symmetric manner without assuming a directionality or
ordering of the variables.
Program
- 1. Representing probability distributions without imposing a
directionality between the random variables
- 2. Undirected graphs, separation, and statistical independencies
- 3. Definition of undirected graphical models
- 4. Further independencies in undirected graphical models
Program
- 1. Representing probability distributions without imposing a
directionality between the random variables
    Factorisation and statistical independence
    Gibbs distributions
    Visualising Gibbs distributions with undirected graphs
    Conditioning corresponds to removing nodes and edges from the graph
- 2. Undirected graphs, separation, and statistical independencies
- 3. Definition of undirected graphical models
- 4. Further independencies in undirected graphical models
Further characterisation of statistical independence
◮ From tutorials: For non-negative functions a(x, z), b(y, z):
x ⊥⊥ y | z ⇔ p(x, y, z) = a(x, z)b(y, z)
◮ More general version of p(x, y, z) = p(x|z)p(y|z)p(z)
◮ No directionality or ordering of the variables is imposed.
◮ Unconditional version: For non-negative functions a(x), b(y):

    x ⊥⊥ y ⇔ p(x, y) = a(x)b(y)
◮ The important point is the factorisation of p(x, y, z) into two
factors:
◮ if the factors share a variable z, then we have conditional
independence,
◮ if not, we have unconditional independence.
Further characterisation of statistical independence
◮ Since p(x, y, z) must sum (integrate) to one, we must have

    ∑_{x,y,z} a(x, z) b(y, z) = 1

◮ The normalisation condition is often ensured by re-defining a(x, z)b(y, z):

    p(x, y, z) = (1/Z) φA(x, z) φB(y, z),   Z = ∑_{x,y,z} φA(x, z) φB(y, z)
◮ Z: normalisation constant (related to the partition function, see later)
◮ φi: factors (also called potential functions). They generally do not correspond
  to (conditional) probabilities; they measure "compatibility", "agreement", or "affinity".
What does it mean?
x ⊥⊥ y | z ⇔ p(x, y, z) = (1/Z) φA(x, z) φB(y, z)

"⇒": If we want our model to satisfy x ⊥⊥ y | z, we should write the pdf (pmf) as p(x, y, z) ∝ φA(x, z)φB(y, z).
"⇐": If the pdf (pmf) can be written as p(x, y, z) ∝ φA(x, z)φB(y, z), then x ⊥⊥ y | z holds.
The equivalent statement holds for the unconditional version.
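As a concrete numerical illustration (not from the slides), the following sketch builds a joint pmf from two hypothetical non-negative tables φA(x, z) and φB(y, z) over small discrete domains and checks that x ⊥⊥ y | z holds; the table values and domain sizes are arbitrary assumptions.

import numpy as np

rng = np.random.default_rng(0)
n_x, n_y, n_z = 2, 3, 2                       # hypothetical domain sizes of x, y, z
phi_A = rng.random((n_x, n_z))                # non-negative factor phi_A(x, z)
phi_B = rng.random((n_y, n_z))                # non-negative factor phi_B(y, z)

# joint p(x, y, z) = phi_A(x, z) * phi_B(y, z) / Z
joint = np.einsum('xz,yz->xyz', phi_A, phi_B)
joint /= joint.sum()

# check x independent of y given z: p(x, y | z) == p(x | z) p(y | z) for every z
p_z = joint.sum(axis=(0, 1))                  # p(z)
p_xy_given_z = joint / p_z                    # p(x, y | z)
p_x_given_z = joint.sum(axis=1) / p_z         # p(x | z)
p_y_given_z = joint.sum(axis=0) / p_z         # p(y | z)
print(np.allclose(p_xy_given_z,
                  np.einsum('xz,yz->xyz', p_x_given_z, p_y_given_z)))   # True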
Example
Consider p(x1, x2, x3, x4) ∝ φ1(x1, x2)φ2(x2, x3)φ3(x4). What independencies does p satisfy?

◮ We can write

    p(x1, x2, x3, x4) ∝ [φ1(x1, x2)φ2(x2, x3)] [φ3(x4)] = φ̃1(x1, x2, x3) φ3(x4)

  with φ̃1(x1, x2, x3) ≡ φ1(x1, x2)φ2(x2, x3), so that x4 ⊥⊥ x1, x2, x3.

◮ Integrating out x4 gives

    p(x1, x2, x3) = ∫ p(x1, x2, x3, x4) dx4 ∝ φ1(x1, x2)φ2(x2, x3)

  so that x1 ⊥⊥ x3 | x2.
Gibbs distributions
◮ The example is a special case of a class of pdfs/pmfs that factorise as

    p(x1, . . . , xd) = (1/Z) ∏_c φc(Xc)

◮ Xc ⊆ {x1, . . . , xd}
◮ The φc are non-negative factors (potential functions). They generally do not
  correspond to (conditional) probabilities; they measure "compatibility",
  "agreement", or "affinity".
◮ Z is a normalising constant so that p(x1, . . . , xd) integrates (sums) to one.
◮ Known as Gibbs (or Boltzmann) distributions.
◮ p̃(x1, . . . , xd) = ∏_c φc(Xc) is an example of an unnormalised model:
  p̃ ≥ 0 but it does not necessarily integrate (sum) to one.
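A minimal sketch of a Gibbs distribution over binary variables, with hypothetical factor scopes Xc and tables; Z is obtained by brute-force enumeration, which is only feasible for very small d:

import itertools
import numpy as np

d = 4
# hypothetical factor scopes (0-based variable indices) and non-negative tables
factors = {
    (0, 1): np.array([[1.0, 0.2], [0.2, 1.0]]),   # phi_1(x1, x2)
    (1, 2): np.array([[1.0, 0.5], [0.5, 2.0]]),   # phi_2(x2, x3)
    (3,):   np.array([0.3, 0.7]),                 # phi_3(x4)
}

def p_tilde(x):
    # unnormalised model: product of the factors evaluated at configuration x
    val = 1.0
    for scope, table in factors.items():
        val *= table[tuple(x[i] for i in scope)]
    return val

# normalisation constant: sum of p_tilde over all 2^d configurations
Z = sum(p_tilde(x) for x in itertools.product([0, 1], repeat=d))

def p(x):
    return p_tilde(x) / Z

print(Z, p((0, 1, 1, 0)))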
Energy-based model
◮ With φc(Xc) = exp(−Ec(Xc)), we have equivalently

    p(x1, . . . , xd) = (1/Z) exp( −∑_c Ec(Xc) )

◮ ∑_c Ec(Xc) is the energy of the configuration (x1, . . . , xd).

    low energy ⇔ high probability
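The same toy model can be written in energy-based form. This sketch reuses the hypothetical factors and Z from the previous snippet and assumes the factor tables are strictly positive, so that Ec = −log φc is well defined:

import numpy as np

energies = {scope: -np.log(table) for scope, table in factors.items()}

def p_energy(x):
    # p(x) = exp(-sum_c E_c(X_c)) / Z; low energy corresponds to high probability
    total_energy = sum(E[tuple(x[i] for i in scope)] for scope, E in energies.items())
    return np.exp(-total_energy) / Z

print(np.isclose(p_energy((0, 1, 1, 0)), p((0, 1, 1, 0))))   # True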
Example
Other examples of Gibbs distributions:

    p(x1, . . . , x6) ∝ φ1(x1, x2, x4)φ2(x2, x3, x4)φ3(x3, x5)φ4(x3, x6)
    p(x1, . . . , x6) ∝ φ1(x1, x2)φ2(x2, x3)φ3(x2, x5)φ4(x1, x4)φ5(x4, x5)φ6(x5, x6)φ7(x3, x6)

Independencies?
◮ In principle, the independencies follow from
x ⊥⊥ y | z ⇔ p(x, y, z) ∝ φA(x, z)φB(y, z), with appropriately defined factors φA and φB.
◮ But the mathematical manipulations of grouping together
factors and integrating variables out become unwieldy. Let us use graphs to better see what’s going on.
Visualising Gibbs distributions with undirected graphs
p(x1, . . . , xd) ∝ ∏_c φc(Xc)

◮ Node for each xi
◮ For each factor φc: draw an undirected edge between all xi and xj that belong to Xc
◮ This results in a fully connected subgraph for all xi that are part of the same
  factor (such a subgraph is called a clique).

Example: graph for p(x1, . . . , x6) ∝ φ1(x1, x2, x4)φ2(x2, x3, x4)φ3(x3, x5)φ4(x3, x6)
  [undirected graph over x1, . . . , x6]
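A short sketch of this graph construction (assuming the networkx library): one node per variable and, for each factor, edges between every pair of variables in its scope, so each scope becomes a clique. The scopes are those of the example above:

import itertools
import networkx as nx

scopes = [("x1", "x2", "x4"), ("x2", "x3", "x4"), ("x3", "x5"), ("x3", "x6")]

G = nx.Graph()
G.add_nodes_from(f"x{i}" for i in range(1, 7))
for scope in scopes:
    # fully connect the variables that appear in the same factor
    G.add_edges_from(itertools.combinations(scope, 2))

print(sorted(G.edges()))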
Effect of conditioning
Let p(x1, . . . , x6) ∝ φ1(x1, x2, x4)φ2(x2, x3, x4)φ3(x3, x5)φ4(x3, x6).
◮ What is p(x1, x2, x4, x5, x6 | x3 = α)?
◮ By definition,

    p(x1, x2, x4, x5, x6 | x3 = α)
      = p(x1, x2, x3 = α, x4, x5, x6) / ∫ p(x1, x2, x3 = α, x4, x5, x6) dx1 dx2 dx4 dx5 dx6
      = φ1(x1, x2, x4)φ2(x2, α, x4)φ3(α, x5)φ4(α, x6) / ∫ φ1(x1, x2, x4)φ2(x2, α, x4)φ3(α, x5)φ4(α, x6) dx1 dx2 dx4 dx5 dx6
      = (1/Z(α)) φ1(x1, x2, x4) φ2^α(x2, x4) φ3^α(x5) φ4^α(x6)

◮ This is a Gibbs distribution with derived factors φi^α of reduced domain
  and a new normalisation "constant" Z(α).
◮ Note that Z(α) depends on the conditioning value α.
Effect of conditioning
Let p(x1, . . . , x6) ∝ φ1(x1, x2, x4)φ2(x2, x3, x4)φ3(x3, x5)φ4(x3, x6).
◮ The conditional p(x1, x2, x4, x5, x6 | x3 = α) is

    (1/Z(α)) φ1(x1, x2, x4) φ2^α(x2, x4) φ3^α(x5) φ4^α(x6)

◮ Conditioning on variables removes the corresponding nodes
  and their connecting edges from the undirected graph.
  [graph over x1, x2, x4, x5, x6 after removing x3 and its edges]
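A sketch of both effects of conditioning, assuming networkx, numpy, and the graph G from the previous snippet: in the graph, the conditioned node and its incident edges are removed; in the factorisation, every factor containing x3 is reduced to a smaller factor with x3 fixed to α. Names and tables are illustrative only.

import numpy as np

# graph view: conditioning on x3 removes the node and all its incident edges
H = G.copy()
H.remove_node("x3")
print(sorted(H.edges()))          # edges among x1, x2, x4, x5, x6 only

# factor view: fix one variable of a factor to the conditioning value alpha
def reduce_factor(scope, table, var, alpha):
    axis = scope.index(var)
    new_scope = tuple(v for v in scope if v != var)
    return new_scope, np.take(table, alpha, axis=axis)   # phi^alpha with smaller domain

# e.g. a hypothetical phi_3(x3, x5) reduced to phi_3^alpha(x5) for alpha = 1
phi3 = np.array([[1.0, 0.4], [0.4, 2.0]])
print(reduce_factor(("x3", "x5"), phi3, "x3", alpha=1))   # (('x5',), array([0.4, 2. ]))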
Program
- 1. Representing probability distributions without imposing a
directionality between the random variables
    Factorisation and statistical independence
    Gibbs distributions
    Visualising Gibbs distributions with undirected graphs
    Conditioning corresponds to removing nodes and edges from the graph
- 2. Undirected graphs, separation, and statistical independencies
- 3. Definition of undirected graphical models
- 4. Further independencies in undirected graphical models
Program
- 1. Representing probability distributions without imposing a
directionality between the random variables
- 2. Undirected graphs, separation, and statistical independencies
    Separation in undirected graphs
    Statistical independencies from graph separation
    Global Markov property
    I-map
- 3. Definition of undirected graphical models
- 4. Further independencies in undirected graphical models
Relating graph properties to independencies
◮ Consider p(x1, x2, x3, x4) ∝ φ1(x1, x2)φ2(x2, x3)φ3(x4) from
before
◮ We have seen:
◮ x4 ⊥⊥ x1, x2, x3
◮ x1 ⊥⊥ x3 | x2
◮ Graph:
  [graph with edges x1 – x2 and x2 – x3; x4 is an isolated node]
◮ In the graph, x4 is separated from x1, x2, x3.
Starting at x4, we cannot reach x1, x2, or x3 (and vice versa). In other words, all trails from x4 to x1, x2, x3 are “blocked”.
◮ In the graph, x1 and x3 are separated by x2. In other words,
all trails from x1 to x3 are blocked by x2
(when removing x2 from the graph, we cannot reach x3 from x1 and vice versa)
Relating graph properties to independencies
◮ Example:
p(x1, . . . , x6) ∝ φ1(x1, x2, x4)φ2(x2, x3, x4)φ3(x3, x5)φ4(x3, x6)
◮ Graph:
  [undirected graph for the factorisation above]
◮ x3 separates {x1, x2, x4} and {x5, x6}
In other words, x3 blocks all trails from {x1, x2, x4} to {x5, x6}
◮ Do we have x1, x2, x4 ⊥⊥ x5, x6 | x3?
Relating graph properties to independencies
p(x) ∝ φ1(x1, x2, x4)φ2(x2, x3, x4)φ3(x3, x5)φ4(x3, x6)
◮ Do we have x1, x2, x4 ⊥⊥ x5, x6 | x3?
◮ Group the factors:

    p(x) ∝ [φ1(x1, x2, x4)φ2(x2, x3, x4)] [φ3(x3, x5)φ4(x3, x6)]
         =  φA(x1, x2, x4, x3) φB(x5, x6, x3)

◮ This takes the form

    p(x) ∝ φA(x, z)φB(y, z)   with x ≡ (x1, x2, x4), y ≡ (x5, x6), z ≡ x3

◮ Hence x1, x2, x4 ⊥⊥ x5, x6 | x3 indeed holds.
Separation in undirected graphs
Let X, Y, Z be three disjoint sets of nodes in an undirected graph.
◮ X and Y are separated by Z if every trail from any node in X
to any node in Y passes through at least one node of Z.
◮ In other words:
  ◮ all trails from X to Y are blocked by Z
  ◮ removing Z from the graph leaves X and Y disconnected
  ◮ nodes are valves: open by default, but closed when part of Z
  [graph with node sets X, Z, Y in which Z separates X from Y]
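Graph separation is easy to check programmatically; a minimal sketch, assuming networkx: remove the nodes in Z and test whether any node in X can still reach a node in Y.

import networkx as nx

def separated(G, X, Y, Z):
    # X and Y are separated by Z iff removing Z blocks every trail from X to Y
    H = G.copy()
    H.remove_nodes_from(Z)
    return not any(nx.has_path(H, x, y) for x in X for y in Y)

# usage with the graph G built in the earlier sketch:
# separated(G, {"x1", "x2", "x4"}, {"x5", "x6"}, {"x3"})   -> True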
Statistical independencies from graph separation
Assume that p(x1, . . . , xd) ∝ ∏_c φc(Xc), with Xc ⊂ {x1, . . . , xd}, can be
visualised as the graph below. Do we have x1, x2 ⊥⊥ y1, y2 | z1, z2, z3?

  [graph over x1, x2, z1, z2, z3, y1, y2]
Statistical independencies from graph separation
Assume that p(x1, . . . , xd) ∝ ∏_c φc(Xc), with Xc ⊂ {x1, . . . , xd}, can be
visualised as the graph below. Do we have x ⊥⊥ y | z1, z2, z3?

  [graph as above with an additional node u, and with x ≡ (x1, x2), y ≡ (y1, y2)]
Statistical independencies from graph separation
◮ With z = (z1, z2, z3), every xi belongs to one of x, y, z, or u.
◮ We thus have p(x1, . . . , xd) = p(x, y, z, u), and we can group the factors φc
  together so that

    p(x, y, z, u) ∝ φ1(x, z) φ2(y, z) φ3(u, z)

  [graph as above]
Statistical independencies from graph separation
◮ Integrating (summing) out u gives

    p(x, y, z) = ∑_u p(x, y, z, u)                               (1)
               ∝ ∑_u φ1(x, z) φ2(y, z) φ3(u, z)                  (2)
               ∝ φ1(x, z) φ2(y, z) ∑_u φ3(u, z)                  (3)   (distributive law)
               ∝ φ1(x, z) φ2(y, z) φ̃(z)                          (4)
               ∝ φA(x, z) φB(y, z)                               (5)

◮ And p(x, y, z) ∝ φA(x, z)φB(y, z) means x ⊥⊥ y | z.
Statistical independencies from graph separation
Assume that p(x1, . . . , xd) ∝ ∏_c φc(Xc), with Xc ⊂ {x1, . . . , xd}, can be
visualised as the graph below. We have shown that if x and y are separated by z,
then x ⊥⊥ y | z.

  [graph as above]
Statistical independencies from graph separation
Assume that p(x1, . . . , xd) ∝ ∏_c φc(Xc), with Xc ⊂ {x1, . . . , xd}, can be
visualised as the graph below. So do we have x1, x2 ⊥⊥ y1, y2 | z1, z2, z3?

  [graph as above]
Statistical independencies from graph separation
◮ From tutorial: x ⊥⊥ {y, w} | z implies x ⊥⊥ y | z
◮ Hence x ⊥⊥ y | z1, z2, z3 implies x1, x2 ⊥⊥ y1, y2 | z1, z2, z3.

  [graph as above]
Summary
Theorem: Let G be the undirected graph for p(x1, . . . , xd) ∝ ∏_c φc(Xc), and let
X, Y, Z be three disjoint subsets of {x1, . . . , xd}. If, in the graph, X and Y are
separated by Z, then X ⊥⊥ Y | Z.
◮ Important because:
- 1. the theorem allows us to read out (conditional) independencies
from the undirected graph
- 2. the theorem shows that graph separation does not indicate
false independence relations. (“Soundness” of the independence assertions.)
◮ We say that p(x1, . . . , xd) satisfies the global Markov property
relative to G.
Converse
Theorem: If X and Y are not separated by Z in the graph, then X and Y are dependent given Z in some probability distributions that factorise according to the graph.
Optional, for those interested: A proof sketch can be found in Section 4.3.1.2
of Probabilistic Graphical Models by Koller and Friedman.
Remark: The theorem implies that for some specific factors, we may have
X ⊥⊥ Y | Z even though X and Y are not separated by Z. The separation criterion
only allows us to decide about independence and not about dependence. It is not
"complete".
I-map
(as before for directed graphical models)
◮ A graph is said to be an independency map (I-map) for a set
of independencies I if the independencies asserted by the
graph are part of I.
◮ For an undirected graph H, let I(H) be all the independencies
that we can derive via graph separation.
◮ Denote the independencies that a distribution p satisfies by
I(p).
◮ The previous results on graph separation can thus be written
as I(H) ⊆ I(p) for all p that factorise over H
◮ As before, we generally do not have I(H) = I(p). If we have
equality, the graph is said to be a perfect map (P-map) for I(p).
Example
◮ p(x1, . . . , x6) ∝ φ1(x1, x2, x4)φ2(x2, x3, x4)φ3(x3, x5)φ4(x3, x6)
◮ Graph:
  [undirected graph over x1, . . . , x6, as before]
◮ Example independencies:
    x1 ⊥⊥ {x3, x5, x6} | x2, x4
    x2 ⊥⊥ x6 | x3
    x5 ⊥⊥ x6 | x3
◮ But x3 ⊥⊥ x1 holds for some distributions that factorise over the graph.
Summary
- 1. Representing probability distributions without imposing a
directionality between the random variables
    Factorisation and statistical independence
    Gibbs distributions
    Visualising Gibbs distributions with undirected graphs
    Conditioning corresponds to removing nodes and edges from the graph
- 2. Undirected graphs, separation, and statistical independencies
    Separation in undirected graphs
    Statistical independencies from graph separation
    Global Markov property
    I-map
Program
- 1. Representing probability distributions without imposing a
directionality between the random variables
- 2. Undirected graphs, separation, and statistical independencies
- 3. Definition of undirected graphical models
    Via factorisation according to the graph
    Undirected graphical models satisfy the global Markov property
- 4. Further independencies in undirected graphical models
Undirected graphical models
◮ We started with a pdf/pmf in the form of a Gibbs distribution,
and associated an undirected graph with it.
◮ We now go the other way around and start with an undirected
graph.
◮ Definition: An undirected graphical model based on an undirected graph with d
  nodes and associated random variables xi is the set of pdfs/pmfs that factorise as

    p(x1, . . . , xd) = (1/Z) ∏_c φc(Xc),

  where Z is the normalisation constant, φc(Xc) ≥ 0, and the Xc correspond to the
  maximal cliques in the graph.
◮ p(x1, . . . , xd) as above are said to factorise according to the
graph.
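Going the other way, from graph to factor scopes, amounts to listing the maximal cliques; a small sketch assuming networkx, using the graph from the earlier example:

import networkx as nx

G = nx.Graph()
G.add_edges_from([("x1", "x2"), ("x1", "x4"), ("x2", "x4"),
                  ("x2", "x3"), ("x3", "x4"), ("x3", "x5"), ("x3", "x6")])

# the scopes X_c of the undirected graphical model are the maximal cliques of G
maximal_cliques = [tuple(sorted(c)) for c in nx.find_cliques(G)]
print(maximal_cliques)
# e.g. [('x1', 'x2', 'x4'), ('x2', 'x3', 'x4'), ('x3', 'x5'), ('x3', 'x6')]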
Remarks
◮ The undirected graphical model corresponds to a set of
probability distributions. This is because we left the actual definition of the factors φc(Xc) unspecified.
◮ Other names for an undirected graphical model: Markov
network (MN), Markov random field (MRF)
◮ By definition, all p(x1, . . . , xd) defined by the graph satisfy the
global Markov property relative to the graph.
◮ Since the graph is an I-map, we can use graph separation to
determine independencies that hold for all distributions that factorise according to the graph.
◮ The Xc correspond to maximal cliques in the graph.
Maximal clique: a set of fully connected nodes (clique) that is not contained in another clique.
Michael Gutmann Undirected Graphical Models 36 / 51
Why maximal cliques?
◮ The mapping from Gibbs distributions to graphs is many-to-one:
  we may obtain the same graph for different Gibbs distributions, e.g.

    p(x) ∝ φ1(x1, x2, x4)φ2(x2, x3, x4)φ3(x3, x5)φ4(x3, x6)
    p(x) ∝ φ̃1(x1, x2)φ̃2(x1, x4)φ̃3(x2, x4)φ̃4(x2, x3)φ̃5(x3, x4)φ̃6(x3, x5)φ̃7(x3, x6)

  [both give the same undirected graph over x1, . . . , x6]
◮ By using maximal cliques, we take a conservative approach
and do not make additional assumptions on the factorisation.
Example (pair-wise Markov network)
Graph:
  [2 × 3 grid graph: x1 – x2 – x3 in the top row, x4 – x5 – x6 in the bottom row,
   with vertical edges x1 – x4, x2 – x5, x3 – x6]

Random variables: x1, . . . , x6
Maximal cliques (all pairs of neighbours):
  {x1, x2}, {x2, x3}, {x4, x5}, {x5, x6}, {x1, x4}, {x2, x5}, {x3, x6}

All models defined by the graph factorise as:

  p(x) ∝ φ1(x1, x2)φ2(x2, x3)φ3(x4, x5)φ4(x5, x6)φ5(x1, x4)φ6(x2, x5)φ7(x3, x6)
Example of a pairwise Markov network.
Example (pair-wise Markov network)
Graph:
  [2 × 3 grid graph as before]

Some independencies from the global Markov property:

  x1, x4 ⊥⊥ x3, x6 | x2, x5

  x1 ⊥⊥ x5, x6, x3 | x4, x2          (left set: all variables except x1 ∪ ne(x1); conditioning set: ne(x1))

  x1 ⊥⊥ x6 | x2, x3, x4, x5          (conditioning set: all variables except x1 and x6)

The last two are examples of the "local Markov property" and the "pairwise Markov
property" relative to the undirected graph.
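As a numerical sanity check (not from the slides), the sketch below builds a pairwise Markov network on the 2 × 3 grid with hypothetical random positive edge potentials over binary variables and verifies by brute force that x1, x4 ⊥⊥ x3, x6 | x2, x5, as graph separation predicts:

import itertools
import numpy as np

rng = np.random.default_rng(1)
edges = [(0, 1), (1, 2), (3, 4), (4, 5), (0, 3), (1, 4), (2, 5)]   # x1..x6, 0-based
phis = {e: rng.random((2, 2)) + 0.1 for e in edges}                # positive 2x2 tables

# joint over all 2^6 binary configurations
joint = np.zeros((2,) * 6)
for x in itertools.product([0, 1], repeat=6):
    joint[x] = np.prod([phis[(i, j)][x[i], x[j]] for (i, j) in edges])
joint /= joint.sum()

# reorder axes to groups A = (x1, x4), B = (x3, x6), C = (x2, x5) and collapse each group
q = joint.transpose(0, 3, 2, 5, 1, 4).reshape(4, 4, 4)
p_c = q.sum(axis=(0, 1))                                 # p(C)
lhs = q / p_c                                            # p(A, B | C)
rhs = np.einsum('ac,bc->abc', q.sum(axis=1) / p_c,       # p(A | C) p(B | C)
                q.sum(axis=0) / p_c)
print(np.allclose(lhs, rhs))                             # True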
Program
- 1. Representing probability distributions without imposing a
directionality between the random variables
- 2. Undirected graphs, separation, and statistical independencies
- 3. Definition of undirected graphical models
    Via factorisation according to the graph
    Undirected graphical models satisfy the global Markov property
- 4. Further independencies in undirected graphical models
Program
- 1. Representing probability distributions without imposing a
directionality between the random variables
- 2. Undirected graphs, separation, and statistical independencies
- 3. Definition of undirected graphical models
- 4. Further independencies in undirected graphical models
    Local Markov property
    Pairwise Markov property
    Equivalence between factorisation and Markov properties for positive distributions
    Markov blanket
Local Markov property
Denote the set of all nodes by X and the neighbours of a node α by ne(α).
◮ A probability distribution is said to satisfy the local Markov
  property relative to an undirected graph if

    α ⊥⊥ X \ (α ∪ ne(α)) | ne(α)   for all nodes α ∈ X
◮ If p satisfies the global Markov property, then it satisfies the
local Markov property. This is because ne(α) blocks all trails to remaining nodes.
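A short usage sketch tying this back to graph separation: for every node α, ne(α) separates α from the remaining nodes. This check reuses the graph G and the separated() helper from the earlier sketches (both assumed to be in scope):

for alpha in G.nodes():
    ne = set(G.neighbors(alpha))
    rest = set(G.nodes()) - {alpha} - ne
    # alpha is independent of all remaining nodes given its neighbours
    assert separated(G, {alpha}, rest, ne)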
Pairwise Markov property
Denote the set of all nodes by X.
◮ A probability distribution is said to satisfy the pairwise Markov
  property relative to an undirected graph if

    α ⊥⊥ β | X \ {α, β}   for all non-neighbouring α, β ∈ X
◮ If p satisfies the local Markov property, then it satisfies the
pairwise Markov property.
Summary
Let p be a pdf/pmf defined by the undirected graph G.

  p factorises according to G
      ⇓
  p satisfies the global Markov property
      ⇓
  p satisfies the local Markov property
      ⇓
  p satisfies the pairwise Markov property
Do we have an equivalence?
◮ In directed graphical models, we had an equivalence of
  ◮ factorisation,
  ◮ ordered Markov property,
  ◮ local directed Markov property, and
  ◮ global directed Markov property.
◮ Do we have a similar equivalence for undirected graphical
models? Yes, under a very mild condition.
Intersection property
◮ The intersection property holds for all distributions with
p(x) > 0 for all values of x in its domain.
◮ Excludes deterministic relationships between the variables.
◮ Intersection property: Let A, B, C, D be sets of random variables.
  If A ⊥⊥ B | (C ∪ D) and A ⊥⊥ C | (B ∪ D), then A ⊥⊥ (B ∪ C) | D.
From pairwise to global Markov property and factorisation
◮ Let p(x1, . . . , xd) be a pdf/pmf that satisfies the intersection
property for all disjoint subsets A, B, C, D of {x1, . . . , xd}. This holds if p always takes positive values ("positive distributions").
◮ If p satisfies the pairwise Markov property with respect to an
undirected graph G then
  ◮ p satisfies the global Markov property with respect to G, and
  ◮ p factorises according to G.
◮ Hence: equivalence of factorisation and the global, local, and
pairwise Markov properties for positive distributions.
◮ This equivalence is known as the Hammersley–Clifford theorem.
◮ Important e.g. for learning, because prior knowledge may come in the form of
  conditional independencies (the graph), which we can incorporate by working with
  Gibbs distributions that factorise accordingly.
Summary of equivalences
Factorisation: p(x1, . . . , xd) = (1/Z) ∏_c φc(Xc) with φc(Xc) > 0
  ⇔ pairwise Markov property: α ⊥⊥ β | {x1, . . . , xd} \ {α, β}
  ⇔ local Markov property: α ⊥⊥ {x1, . . . , xd} \ (α ∪ ne(α)) | ne(α)
  ⇔ global Markov property: all independencies from graph separation

Broadly speaking, the graph serves two related purposes:
- 1. it tells us how distributions factorise
- 2. it represents the independence assumptions made
Markov blanket
What is the minimal set of variables such that knowing their values makes x
independent from the rest? From the local Markov property, MB(x) = ne(x):

  x ⊥⊥ {all variables} \ (x ∪ ne(x)) | ne(x)
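As a one-line illustration (assuming networkx and the graph G from the earlier sketches), the Markov blanket of a node is simply its set of neighbours:

def markov_blanket(G, node):
    # minimal set whose values make `node` independent of all remaining variables
    return set(G.neighbors(node))

print(markov_blanket(G, "x3"))   # e.g. {'x2', 'x4', 'x5', 'x6'}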
Program
- 1. Representing probability distributions without imposing a
directionality between the random variables
- 2. Undirected graphs, separation, and statistical independencies
- 3. Definition of undirected graphical models
- 4. Further independencies in undirected graphical models
    Local Markov property
    Pairwise Markov property
    Equivalence between factorisation and Markov properties for positive distributions
    Markov blanket
Program recap
- 1. Representing probability distributions without imposing a directionality
between the random variables
    Factorisation and statistical independence
    Gibbs distributions
    Visualising Gibbs distributions with undirected graphs
    Conditioning corresponds to removing nodes and edges from the graph
- 2. Undirected graphs, separation, and statistical independencies
    Separation in undirected graphs
    Statistical independencies from graph separation
    Global Markov property
    I-map
- 3. Definition of undirected graphical models
    Via factorisation according to the graph
    Undirected graphical models satisfy the global Markov property
- 4. Further independencies in undirected graphical models
    Local Markov property
    Pairwise Markov property
    Equivalence between factorisation and Markov properties for positive distributions
    Markov blanket