Probabilistic Graphical Models
Variable elimination
Siamak Ravanbakhsh Fall 2019
Learning objective
an intuition for inference in graphical models
why is it difficult?
exact inference by …
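inference tasks, given the joint distribution $P(X_1, X_2, \dots, X_n)$:
marginal: $P(X_1 = x_1) = \sum_{x_2, \dots, x_n} P(X_1 = x_1, X_2 = x_2, \dots, X_n = x_n)$
conditional: $P(X_1 = x_1 \mid X_m = x_m) = \frac{P(X_1 = x_1, X_m = x_m)}{P(X_m = x_m)}$
maximum a posteriori: $x_1^* = \arg\max_{x_1} P(X_1 = x_1 \mid X_m = x_m)$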
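As a rough illustration of these queries, the sketch below answers them by brute force on a full joint table. The table `joint`, its weights, and the function names are hypothetical choices made for this example only; the point is that every query touches the whole table.

```python
from itertools import product

# Hypothetical joint over three binary variables (X1, X2, X3):
# assignment tuple -> probability. The weights are arbitrary.
weights = {x: 1 + x[0] + 2 * x[1] * x[2] for x in product([0, 1], repeat=3)}
total = sum(weights.values())
joint = {x: w / total for x, w in weights.items()}


def marginal(joint, index, value):
    """P(X_index = value): sum the joint over all other variables."""
    return sum(p for x, p in joint.items() if x[index] == value)


def conditional(joint, index, value, ev_index, ev_value):
    """P(X_index = value | X_ev_index = ev_value) as a ratio of two sums."""
    num = sum(p for x, p in joint.items()
              if x[index] == value and x[ev_index] == ev_value)
    return num / marginal(joint, ev_index, ev_value)


def map_value(joint, index, ev_index, ev_value, domain=(0, 1)):
    """argmax over the query variable of the conditional probability."""
    return max(domain,
               key=lambda v: conditional(joint, index, v, ev_index, ev_value))


print(marginal(joint, 0, 1))           # P(X1 = 1)
print(conditional(joint, 0, 1, 2, 1))  # P(X1 = 1 | X3 = 1)
print(map_value(joint, 0, 2, 1))       # most probable X1 given X3 = 1
```

Indices here are zero-based positions in the assignment tuple, so `index=0` stands for $X_1$.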
cost of working with the full joint table:
two variables: $O(|Val(X_1) \times Val(X_2)|)$
three variables: $O(|Val(X_1) \times Val(X_2) \times Val(X_3)|)$
$n$ variables: $O(\prod_{i=1}^{n} |Val(X_i)|)$, exponential in $n$
example of a structured joint (chain): $p(x) = \frac{1}{Z} \prod_{i=1}^{n-1} \phi_i(x_i, x_{i+1})$
despite this, graphical models are used for combinatorial optimization (why?)
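deciding whether $P(X = x) > 0$ is NP-hard: reduction from SAT, using a network with one node per SAT variable, one node per SAT clause, and an output node with $X = 1$ iff the formula is satisfiable
computing $P(X = x)$ exactly is at least as hard, since in this network it amounts to counting the satisfying assignments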
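The link between SAT and inference can be sketched on a toy formula. The code below is an illustration under assumptions that are not in the slides: the SAT variables are independent fair coins, the CNF formula `clauses` is made up, and the probability is computed by enumeration rather than by any clever inference.

```python
from itertools import product

# Hypothetical CNF formula; each clause is a list of (variable index, is_positive).
# (q0 or q1) and (not q0 or q2) and (not q1 or not q2)
clauses = [[(0, True), (1, True)],
           [(0, False), (2, True)],
           [(1, False), (2, False)]]
n_vars = 3


def satisfied(assignment, clauses):
    """X = 1 iff every clause contains at least one true literal."""
    return all(any(assignment[v] == positive for v, positive in clause)
               for clause in clauses)


def prob_x_is_one(clauses, n_vars):
    """P(X = 1) when the SAT variables are independent fair coins."""
    count = sum(satisfied(a, clauses)
                for a in product([False, True], repeat=n_vars))
    return count / 2 ** n_vars


p = prob_x_is_one(clauses, n_vars)
print(p, "satisfiable" if p > 0 else "unsatisfiable")
```

Under these assumptions $P(X = 1) > 0$ exactly when the formula is satisfiable, and $P(X = 1) \cdot 2^n$ is the number of satisfying assignments.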
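approximation with relative error $\epsilon$: an estimate $\rho$ such that $\rho(1 - \epsilon) \le P(X = x) \le \rho(1 + \epsilon)$ (lower bound also written $\frac{\rho}{1+\epsilon}$)
conditional queries $P(X = x \mid E = e)$: $0 < \epsilon < \frac{1}{2}$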
$q_i^* = \arg\max_{q_i} P(Q_i = q_i \mid (Q_1, \dots, Q_{i-1}) = (q_1^*, \dots, q_{i-1}^*), X = 1)$
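chain: $p(x) = \frac{1}{Z} \prod_{i=1}^{n-1} \phi_i(x_i, x_{i+1})$
marginal of the last variable: $p(x_n) = \tilde{p}(x_n) / \left(\sum_{x_n} \tilde{p}(x_n)\right)$, where $\tilde{p}(x_n) = \sum_{x_1} \dots \sum_{x_{n-1}} \phi_1(x_1, x_2) \dots \phi_{n-1}(x_{n-1}, x_n)$
distributive law: $ab + ac$ (3 operations) $= a(b + c)$ (2 operations)
with $|Val(X)| = |Val(Y)| = |Val(Z)| = d$: $O(d^3)$ versus $O(d^2)$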
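The saving from the distributive law can be checked numerically. The sketch below assumes an expression of the form $\sum_{x,y} f(x,y)\,g(y,z)$; the factors `f` and `g`, their values, and the domain size are hypothetical, and only the multiplications are counted.

```python
from itertools import product

d = 4  # hypothetical common domain size |Val(X)| = |Val(Y)| = |Val(Z)|
f = {(x, y): 1 + x + 2 * y for x, y in product(range(d), repeat=2)}
g = {(y, z): 1 + y * z for y, z in product(range(d), repeat=2)}


def naive(f, g, d):
    """sum_{x,y} f(x,y) g(y,z) for each z: one product per (x, y, z) triple."""
    out, mults = [0] * d, 0
    for x, y, z in product(range(d), repeat=3):
        out[z] += f[x, y] * g[y, z]
        mults += 1
    return out, mults


def factored(f, g, d):
    """Push the sum over x inside: sum_y g(y,z) * (sum_x f(x,y))."""
    m = [sum(f[x, y] for x in range(d)) for y in range(d)]
    out, mults = [0] * d, 0
    for y, z in product(range(d), repeat=2):
        out[z] += g[y, z] * m[y]
        mults += 1
    return out, mults


print(naive(f, g, d)[1], factored(f, g, d)[1])  # d**3 versus d**2 multiplications
```

Both functions return the same table over $z$; only the order in which sums and products are carried out differs.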
back to example
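$p(x_n) \propto \sum_{x_{n-1}} \phi_{n-1}(x_{n-1}, x_n) \sum_{x_{n-2}} \phi_{n-2}(x_{n-2}, x_{n-1}) \dots \sum_{x_1} \phi_1(x_1, x_2)$
instead of $\sum_{x_1} \dots \sum_{x_{n-1}} \phi_1(x_1, x_2) \dots \phi_{n-1}(x_{n-1}, x_n)$ for $p(x) = \frac{1}{Z} \prod_{i=1}^{n-1} \phi_i(x_i, x_{i+1})$
complexity: $O(n d^2)$ instead of $O(d^n)$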
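A minimal sketch of this computation on a chain, compared against the brute-force sum. The nested-list representation `phis[i][a][b]` for $\phi_{i+1}(a, b)$ and the function names are assumptions made for the example.

```python
from itertools import product


def chain_marginal(phis, d):
    """p(x_n) for p(x) proportional to prod_i phi_i(x_i, x_{i+1}).

    phis[i][a][b] is the value of the i-th pairwise factor at (a, b).
    """
    m = [1.0] * d  # message over x_1 before any factor is absorbed
    for phi in phis:  # eliminate x_1, then x_2, ...; each step costs O(d^2)
        m = [sum(m[a] * phi[a][b] for a in range(d)) for b in range(d)]
    z = sum(m)
    return [v / z for v in m]


def brute_force_marginal(phis, d):
    """Same quantity by summing the full joint: O(d^n) terms."""
    n = len(phis) + 1
    p = [0.0] * d
    for x in product(range(d), repeat=n):
        w = 1.0
        for i, phi in enumerate(phis):
            w *= phi[x[i]][x[i + 1]]
        p[x[-1]] += w
    z = sum(p)
    return [v / z for v in p]


# Hypothetical factors for a chain of four binary variables.
example = [[[1.0, 2.0], [3.0, 4.0]],
           [[2.0, 1.0], [1.0, 3.0]],
           [[1.0, 1.0], [2.0, 5.0]]]
print(chain_marginal(example, 2))
print(brute_force_marginal(example, 2))
```

Each pass of the loop in `chain_marginal` is one inner sum of the nested expression, which is where the $O(n d^2)$ cost comes from.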
example: $p(x_1 \mid \bar{x}_6) = \frac{p(x_1, \bar{x}_6)}{p(\bar{x}_6)}$
source: Michael Jordan's book
another way to write (used in Jordan's textbook)
is constant
complexity: $O(d^3)$, compared with $O(d^5)$ if we had built the 5d array of $p(x_1, x_2, x_3, x_4, x_5 \mid \bar{x}_6)$ (in the general case $O(d^n)$)
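$p(x_1, \bar{x}_6) = \frac{1}{Z} \sum_{x_2, \dots, x_5} \phi(x_1, x_2) \phi(x_1, x_3) \phi(x_2, x_4) \phi(x_3, x_5) \phi(x_2, x_5, \bar{x}_6)$
$= \frac{1}{Z} \sum_{x_2, \dots, x_5} \phi(x_1, x_2) \phi(x_1, x_3) \phi(x_2, x_4) \phi(x_3, x_5) \, m_6(x_2, x_5)$
$= \frac{1}{Z} \sum_{x_2} \phi(x_1, x_2) \, m_4(x_2) \sum_{x_3} \phi(x_1, x_3) \, m_5(x_2, x_3)$
$= \frac{1}{Z} \sum_{x_2} \phi(x_1, x_2) \, m_4(x_2) \, m_3(x_1, x_2)$
$= \frac{1}{Z} m_2(x_1)$
where $m_6(x_2, x_5) = \phi(x_2, x_5, \bar{x}_6)$, $m_5(x_2, x_3) = \sum_{x_5} \phi(x_3, x_5) \, m_6(x_2, x_5)$, $m_4(x_2) = \sum_{x_4} \phi(x_2, x_4)$, and $m_3(x_1, x_2) = \sum_{x_3} \phi(x_1, x_3) \, m_5(x_2, x_3)$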
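The chain of intermediate terms $m_6, m_5, m_4, m_3, m_2$ can be checked against the direct sum. The sketch below assumes factor scopes $(x_1,x_2)$, $(x_1,x_3)$, $(x_2,x_4)$, $(x_3,x_5)$, $(x_2,x_5,x_6)$; the potential values and function names are hypothetical, and the result is left unnormalised (the $1/Z$ factor is omitted).

```python
from itertools import product

d = 2  # hypothetical domain size for every variable


# Hypothetical positive potentials; the values are arbitrary illustrations.
def phi12(x1, x2): return 1 + x1 + x2
def phi13(x1, x3): return 1 + 2 * x1 * x3
def phi24(x2, x4): return 1 + 2 * x2 * x4
def phi35(x3, x5): return 1 + x3 + 2 * x5
def phi256(x2, x5, x6): return 1 + x2 * x5 + x6


def brute_force(x1, x6_bar):
    """Direct sum over x2..x5 with x6 clamped to the evidence value."""
    return sum(phi12(x1, x2) * phi13(x1, x3) * phi24(x2, x4)
               * phi35(x3, x5) * phi256(x2, x5, x6_bar)
               for x2, x3, x4, x5 in product(range(d), repeat=4))


def eliminate(x1, x6_bar):
    """Same value via the intermediate terms m6, m5, m4, m3, m2."""
    r = range(d)
    m6 = {(x2, x5): phi256(x2, x5, x6_bar) for x2 in r for x5 in r}
    m5 = {(x2, x3): sum(phi35(x3, x5) * m6[x2, x5] for x5 in r)
          for x2 in r for x3 in r}
    m4 = {x2: sum(phi24(x2, x4) for x4 in r) for x2 in r}
    # m3 is m_3(x1, x2) with x1 held fixed at the query value
    m3 = {x2: sum(phi13(x1, x3) * m5[x2, x3] for x3 in r) for x2 in r}
    return sum(phi12(x1, x2) * m4[x2] * m3[x2] for x2 in r)  # m_2(x1)


for x1, x6_bar in product(range(d), repeat=2):
    print(x1, x6_bar, brute_force(x1, x6_bar), eliminate(x1, x6_bar))
```

The largest table built in `eliminate` is two-dimensional, while the intermediate product behind `m5` ranges over three variables, in line with the $O(d^3)$ cost of this ordering.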
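input: a set of factors $\Phi^{t=0} = \{\phi_1, \dots, \phi_K\}$ and an elimination order $x_{i_1}, \dots, x_{i_m}$
output: $\sum_{x_{i_1}, \dots, x_{i_m}} \prod_k \phi_k(D_k)$
for each step $t = 1, \dots, m$:
collect the relevant factors: $\Psi^t = \{\phi \in \Phi^{t-1} \mid x_{i_t} \in \text{Scope}[\phi]\}$
multiply them: $\psi_t = \prod_{\phi \in \Psi^t} \phi$
marginalise: $\psi'_t = \sum_{x_{i_t}} \psi_t$
update the set of factors: $\Phi^t = \Phi^{t-1} - \Psi^t + \{\psi'_t\}$
return the product of the factors in $\Phi^{t=m}$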
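The steps above can be sketched in a few functions. This is a minimal, unoptimised illustration, not a reference implementation: a factor is represented as a hypothetical pair `(scope, table)`, where `scope` is a tuple of variable names and `table` maps value tuples to numbers, and `domains` maps each variable to its list of values.

```python
from itertools import product


def multiply(factors, domains):
    """Product of factors: returns (scope, table) over the union of scopes."""
    scope = tuple(sorted({v for s, _ in factors for v in s}))
    table = {}
    for values in product(*(domains[v] for v in scope)):
        assignment = dict(zip(scope, values))
        value = 1.0
        for s, t in factors:
            value *= t[tuple(assignment[v] for v in s)]
        table[values] = value
    return scope, table


def sum_out(factor, var):
    """Marginalise var out of a factor."""
    scope, table = factor
    keep = tuple(v for v in scope if v != var)
    out = {}
    for values, value in table.items():
        key = tuple(x for v, x in zip(scope, values) if v != var)
        out[key] = out.get(key, 0.0) + value
    return keep, out


def variable_elimination(factors, domains, order):
    """Eliminate the variables in `order`; return the product of what remains."""
    factors = list(factors)
    for var in order:
        used = [f for f in factors if var in f[0]]   # Psi^t
        if not used:
            continue
        rest = [f for f in factors if var not in f[0]]
        psi = multiply(used, domains)                # psi_t
        factors = rest + [sum_out(psi, var)]         # Phi^t
    return multiply(factors, domains)


# Usage on a hypothetical three-variable chain with binary domains.
domains = {1: [0, 1], 2: [0, 1], 3: [0, 1]}
f12 = ((1, 2), {(a, b): 1.0 + a + 2 * b for a in (0, 1) for b in (0, 1)})
f23 = ((2, 3), {(b, c): 1.0 + b * c for b in (0, 1) for c in (0, 1)})
print(variable_elimination([f12, f23], domains, [1, 2]))
```

The returned factor is unnormalised; dividing its entries by their sum gives the marginal of the remaining variables.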
example of one step $t$: eliminating $x_5$
collect: $\Psi^t = \{\phi(x_2, x_5), \phi(x_5, x_3)\}$
multiply: $\psi_t(x_2, x_3, x_5) = \phi(x_2, x_5) \, \phi(x_5, x_3)$
marginalise: $\psi'_t(x_2, x_3) = \sum_{x_5} \psi_t(x_2, x_3, x_5)$
update: $\Phi^t = \Phi^{t-1} - \Psi^t + \{\psi'_t\}$
example: $p(x_1) = \frac{1}{Z} \sum_{x_2, \dots, x_6} \phi(x_1, x_2) \phi(x_1, x_3) \phi(x_2, x_4) \phi(x_3, x_5) \phi(x_2, x_5, x_6)$, with the elimination order $x_6, x_5, x_4, x_3, x_2$
step 1, eliminate $x_6$: $\psi_1(x_2, x_5, x_6)$, $\psi'_1(x_2, x_5)$
step 2, eliminate $x_5$: $\psi_2(x_2, x_3, x_5)$, $\psi'_2(x_2, x_3)$ (adds a fill edge between $x_2$ and $x_3$)
step 3, eliminate $x_4$: $\psi_3(x_2, x_4)$, $\psi'_3(x_2)$
step 4, eliminate $x_3$: $\psi_4(x_1, x_2, x_3)$, $\psi'_4(x_1, x_2)$
step 5, eliminate $x_2$: $\psi_5(x_1, x_2)$, $\psi'_5(x_1)$
result: $p(x_1) = \frac{1}{Z} \psi'_5(x_1)$
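The scopes produced at each step can be traced without any numbers, by running the elimination on scopes alone. The sketch below assumes the factor scopes $\{1,2\}, \{1,3\}, \{2,4\}, \{3,5\}, \{2,5,6\}$ and the order $x_6, x_5, x_4, x_3, x_2$; the function name `trace_scopes` is made up for this example.

```python
def trace_scopes(scopes, order):
    """Run variable elimination on scopes only.

    Returns one (variable, Scope[psi_t], Scope[psi'_t]) triple per step.
    """
    scopes = [frozenset(s) for s in scopes]
    steps = []
    for var in order:
        used = [s for s in scopes if var in s]
        psi = frozenset().union(*used)       # scope of the product psi_t
        psi_prime = psi - {var}              # scope after summing var out
        scopes = [s for s in scopes if var not in s] + [psi_prime]
        steps.append((var, sorted(psi), sorted(psi_prime)))
    return steps


scopes = [{1, 2}, {1, 3}, {2, 4}, {3, 5}, {2, 5, 6}]
for var, psi, psi_prime in trace_scopes(scopes, [6, 5, 4, 3, 2]):
    print(f"eliminate x{var}: psi over {psi}, psi' over {psi_prime}")
```

Tracing scopes this way is a cheap means of comparing orderings before doing any arithmetic on the factor tables.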
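at each step $t$ of the order $x_{i_1}, \dots, x_{i_m}$:
$\Psi^t = \{\phi \in \Phi^{t-1} \mid x_{i_t} \in \text{Scope}[\phi]\}$
$\psi_t = \prod_{\phi \in \Psi^t} \phi$
$\psi'_t = \sum_{x_{i_t}} \psi_t$
$\Phi^t = \Phi^{t-1} - \Psi^t + \{\psi'_t\}$
the cost of step $t$ is exponential in $|\text{Scope}[\psi_t]|$, so the overall complexity is dominated by $\max_t |\text{Scope}[\psi_t]|$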
take one such clique, e.g. $\{X_2, X_3, X_5\}$
take the first of its variables to be eliminated, e.g. $X_5$
all the edges to $X_5$ exist before its elimination
therefore, removing $X_5$ will create a factor with $\text{Scope}[\psi_t] = \{X_2, X_3, X_5\}$
by a similar argument, all loops of length > 3 have a chord
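the largest factor $\psi_t$ created, of size $\max_t |\text{Scope}[\psi_t]|$, corresponds to the largest clique in the induced graph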
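A sketch of how an ordering determines the fill edges and the largest scope, working on the undirected graph alone. The edge list below is an assumed Markov network for factor scopes $\{1,2\}, \{1,3\}, \{2,4\}, \{3,5\}, \{2,5,6\}$, and `induced_graph` is a hypothetical helper name.

```python
def induced_graph(edges, order):
    """Return (fill_edges, max_scope_size) for eliminating `order`."""
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    fill, max_scope = [], 0
    for var in order:
        ns = sorted(nbrs.pop(var, set()))
        # Scope[psi_t] is var together with its current neighbours
        max_scope = max(max_scope, len(ns) + 1)
        for n in ns:
            nbrs[n].discard(var)
        for i, u in enumerate(ns):  # connect the remaining neighbours pairwise
            for v in ns[i + 1:]:
                if v not in nbrs[u]:
                    nbrs[u].add(v)
                    nbrs[v].add(u)
                    fill.append((u, v))
    return fill, max_scope


edges = [(1, 2), (1, 3), (2, 4), (3, 5), (2, 5), (2, 6), (5, 6)]
print(induced_graph(edges, [6, 5, 4, 3, 2]))  # one fill edge, scopes of size <= 3
print(induced_graph(edges, [2, 3, 4, 5, 6]))  # a poor order: a scope of size 5
```

Comparing the two calls shows how much the ordering matters on the same graph.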
min-neighbours: number of neighbours in the current graph
min-weight: product of the cardinalities of the neighbours
min-fill: number of fill edges added by its elimination
weighted min-fill: fill edges are weighted by the product of the cardinalities of the two vertices
try different heuristics
calculate the max-clique size
pick the best ordering
apply variable elimination
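A rough sketch of this recipe for two of the heuristics, min-neighbours and min-fill. The greedy loop, the tie-breaking rule (smallest label first), the example graph, and all function names are assumptions made for illustration; the weighted variants would only change the score functions.

```python
def min_neighbours(nbrs, v):
    """Score: number of neighbours of v in the current graph."""
    return len(nbrs[v])


def min_fill(nbrs, v):
    """Score: number of fill edges that eliminating v would add."""
    ns = sorted(nbrs[v])
    return sum(1 for i, a in enumerate(ns) for b in ns[i + 1:]
               if b not in nbrs[a])


def greedy_order(edges, to_eliminate, score):
    """Repeatedly eliminate the lowest-scoring variable.

    Returns (order, max_clique), where max_clique is the largest scope created.
    """
    nbrs = {}
    for u, v in edges:
        nbrs.setdefault(u, set()).add(v)
        nbrs.setdefault(v, set()).add(u)
    remaining, order, max_clique = set(to_eliminate), [], 0
    while remaining:
        var = min(sorted(remaining), key=lambda v: score(nbrs, v))
        ns = nbrs.pop(var)
        max_clique = max(max_clique, len(ns) + 1)
        for a in ns:  # remove var and connect its neighbours to each other
            nbrs[a].discard(var)
            nbrs[a] |= ns - {a}
        remaining.remove(var)
        order.append(var)
    return order, max_clique


edges = [(1, 2), (1, 3), (2, 4), (3, 5), (2, 5), (2, 6), (5, 6)]
candidates = {h.__name__: greedy_order(edges, [2, 3, 4, 5, 6], h)
              for h in (min_neighbours, min_fill)}
best = min(candidates, key=lambda name: candidates[name][1])
print(candidates)
print("best heuristic:", best)
```

The ordering with the smallest `max_clique` is the one to hand to variable elimination.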
comparing the size of factors
$P(X_1 \mid X_m = x_m) = \frac{P(X_1, X_m = x_m)}{P(X_m = x_m)}$
$P(X_1) = \sum_{x_2, \dots, x_n} P(X_1, X_2 = x_2, \dots, X_n = x_n)$