Divergence measures and message passing
Tom Minka Microsoft Research Cambridge, UK
with thanks to the Machine Learning and Perception Group
Message-Passing Algorithms

    MF   mean-field                        [Peterson, Anderson 87]
    BP   loopy belief propagation          [Frey, MacKay 97]
    EP   expectation propagation           [Minka 01]
    FBP  fractional belief propagation     [Wiegerinck, Heskes 02]
    TRW  tree-reweighted message passing   [Wainwright, Jaakkola, Willsky 03]
    PEP  power EP                          [Minka 04]
[Figures: a factor graph on variables x, y, z with factors a–f, used across several slides to illustrate message passing.]
Results on the example network, exact vs. loopy BP:

    Marginals:            exact vs. BP estimates (values shown in figure)
    Normalizing constant: 0.45 (exact)     0.44 (BP)
    Argmax:               (0,0,0) (exact)  (0,0,0) (BP)
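A comparison of this kind can be reproduced on a toy model. The sketch below runs loopy sum-product BP on a hypothetical 3-variable binary cycle (the potentials are made up for illustration, not the model behind the slide's numbers) and checks its marginals and argmax against brute-force enumeration:

```python
import numpy as np
from itertools import product

# Hypothetical 3-variable binary model on a cycle x - y - z (illustrative potentials).
psi_pair = np.array([[1.2, 0.8],
                     [0.8, 1.2]])          # shared attractive edge potential
psi_node = [np.array([1.2, 0.8]),          # unary for x
            np.array([1.0, 1.0]),          # unary for y
            np.array([1.1, 0.9])]          # unary for z
edges = [(0, 1), (1, 2), (2, 0)]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

# Exact marginals by brute-force enumeration of the 2^3 joint states.
joint = np.zeros((2, 2, 2))
for s in product([0, 1], repeat=3):
    v = np.prod([psi_node[i][s[i]] for i in range(3)])
    for i, j in edges:
        v *= psi_pair[s[i], s[j]]
    joint[s] = v
joint /= joint.sum()
exact_marg = [joint.sum(axis=tuple(a for a in range(3) if a != i)) for i in range(3)]

# Loopy sum-product BP: m[(i, j)] is the message from variable i to variable j.
m = {(i, j): np.ones(2) for i in range(3) for j in neighbors[i]}
for _ in range(100):                       # parallel ("flood") message updates
    new = {}
    for i in range(3):
        for j in neighbors[i]:
            pre = psi_node[i] * np.prod([m[(k, i)] for k in neighbors[i] if k != j], axis=0)
            msg = pre @ psi_pair           # sum over x_i
            new[(i, j)] = msg / msg.sum()
    m = new

# Beliefs: unary potential times all incoming messages, normalized.
bp_marg = []
for i in range(3):
    b = psi_node[i] * np.prod([m[(k, i)] for k in neighbors[i]], axis=0)
    bp_marg.append(b / b.sum())

print("exact:", [np.round(e, 3) for e in exact_marg])
print("BP:   ", [np.round(b, 3) for b in bp_marg])
```

On this weakly coupled attractive cycle BP's marginals come out close to exact and agree on the argmax; as on the slide, the approximation is not exact on a loopy graph.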
Kullback-Leibler (KL) divergence. Let p, q be unnormalized distributions:

    KL(p || q) = ∫ p(x) log( p(x) / q(x) ) dx + ∫ ( q(x) − p(x) ) dx

Alpha-divergence (α is any real number):

    Dα(p || q) = ( ∫ [ α p(x) + (1−α) q(x) − p(x)^α q(x)^(1−α) ] dx ) / ( α(1−α) )

Both are asymmetric and convex. The limits α → 0 and α → 1 recover KL(q || p) and KL(p || q) respectively.
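A quick numerical check of these definitions (grid quadrature; the two Gaussians below are arbitrary illustrative choices): Dα approaches KL(q||p) as α → 0 and KL(p||q) as α → 1, and α = 0.5 is the symmetric (Hellinger) case:

```python
import numpy as np

# Quadrature grid wide enough that both Gaussians decay to negligible mass.
x = np.linspace(-12.0, 12.0, 4801)
dx = x[1] - x[0]

def gauss(x, mu, s):
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

p = gauss(x, 0.0, 1.0)
q = gauss(x, 1.0, 1.5)

def kl(p, q):
    # KL for unnormalized densities; the correction term vanishes here
    # because both p and q integrate to 1.
    return np.sum(p * np.log(p / q) + q - p) * dx

def alpha_div(p, q, a):
    # D_alpha(p||q) for real a other than 0 and 1 (those endpoints are the KL limits).
    integrand = a * p + (1.0 - a) * q - p**a * q**(1.0 - a)
    return np.sum(integrand) * dx / (a * (1.0 - a))

print("D_0.999(p||q) =", round(alpha_div(p, q, 0.999), 4),
      " KL(p||q) =", round(kl(p, q), 4))
```

The assertions in the limits confirm term-by-term agreement with the two KL formulas above.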
[Figures: the Gaussian q that minimizes Dα(p||q) to a fixed bimodal p, shown for α = −∞, 0, 0.5, 1, and ∞. Small α gives a mode-seeking fit; large α gives an inclusive fit that stretches to cover all of p.]
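The effect in these figures can be reproduced by brute force: grid-search the Gaussian (μ, σ) minimizing Dα(p||q) against a bimodal target (the mixture and the search grids below are illustrative choices). The α = 0 fit locks onto the dominant mode, while the α = 1 fit matches the overall mean and variance:

```python
import numpy as np

x = np.linspace(-8.0, 8.0, 1601)
dx = x[1] - x[0]

def gauss(x, mu, s):
    return np.exp(-0.5 * ((x - mu) / s) ** 2) / (s * np.sqrt(2.0 * np.pi))

# Bimodal target: an asymmetric two-component mixture (illustrative choice).
# Its mean is -0.8 and its standard deviation is about 1.90.
p = 0.7 * gauss(x, -2.0, 0.5) + 0.3 * gauss(x, 2.0, 0.5)

def div(q, a):
    # D_alpha(p||q); the endpoints a=0 and a=1 are handled via their KL limits.
    q = np.maximum(q, 1e-300)              # avoid log(0) in underflowed tails
    if a == 0:
        return np.sum(q * np.log(q / p) + p - q) * dx
    if a == 1:
        return np.sum(p * np.log(p / q) + q - p) * dx
    return np.sum(a * p + (1 - a) * q - p**a * q**(1 - a)) * dx / (a * (1 - a))

def best_gaussian(a):
    # Brute-force grid search over the Gaussian's mean and standard deviation.
    grid = [(mu, s) for mu in np.arange(-4.0, 4.05, 0.1)
                    for s in np.arange(0.2, 3.05, 0.05)]
    return min(grid, key=lambda ms: div(gauss(x, ms[0], ms[1]), a))

mu0, s0 = best_gaussian(0)   # exclusive KL(q||p): locks onto the dominant mode
mu1, s1 = best_gaussian(1)   # inclusive KL(p||q): matches overall mean/variance
print(f"alpha=0: mu={mu0:.2f}, sigma={s0:.2f}")
print(f"alpha=1: mu={mu1:.2f}, sigma={s1:.2f}")
```

The α = 1 result reflects a general fact used throughout the talk: minimizing KL(p||q) over an exponential family reduces to moment matching.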
[Frey,Patrascu,Jaakkola,Moran 00]
The α axis, with the algorithms placed on it: α ≤ 0 is zero-forcing, α ≥ 1 is inclusive (zero-avoiding). MF sits at α = 0, BP and EP at α = 1, TRW at α > 1, while FBP and PEP allow arbitrary α.
[Figures: factorized fits to a bimodal distribution over (x, y), with the divergence to p plotted against α. The α = 0 (MF) fit and fits with α ≤ 0.5 get the mode heights wrong (bad); the α = 1 (BP) fit gets them right (good); a further panel compares heights for the α = ∞ fit.]
    MF:         local divergence = global divergence — no loss from message passing
    general α:  local divergence ≠ global divergence
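One concrete facet of "local = global" for MF can be checked numerically: each coordinate update of a fully factorized q exactly minimizes the global KL(q||p) in that factor, so the global divergence never increases. A sketch on a hypothetical 3-variable binary cycle (illustrative potentials):

```python
import numpy as np
from itertools import product

# Hypothetical 3-variable binary cycle (illustrative potentials).
psi_pair = np.array([[1.2, 0.8],
                     [0.8, 1.2]])
psi_node = [np.array([1.2, 0.8]), np.array([1.0, 1.0]), np.array([1.1, 0.9])]
edges = [(0, 1), (1, 2), (2, 0)]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1]}

# Exact normalized joint, for measuring the *global* divergence KL(q||p).
joint = np.zeros((2, 2, 2))
for s in product([0, 1], repeat=3):
    v = np.prod([psi_node[i][s[i]] for i in range(3)])
    for i, j in edges:
        v *= psi_pair[s[i], s[j]]
    joint[s] = v
joint /= joint.sum()

def global_kl(q):
    kl = 0.0
    for s in product([0, 1], repeat=3):
        qs = q[0][s[0]] * q[1][s[1]] * q[2][s[2]]
        kl += qs * np.log(qs / joint[s])
    return kl

# Fully factorized q, updated by mean-field coordinate ascent:
#   q_i(x_i) ∝ psi_i(x_i) * exp( sum_j E_{q_j}[ log psi_ij(x_i, x_j) ] )
q = [np.array([0.6, 0.4]), np.array([0.5, 0.5]), np.array([0.4, 0.6])]
kls = [global_kl(q)]
for sweep in range(20):
    for i in range(3):
        logq = np.log(psi_node[i])
        for j in neighbors[i]:
            logq = logq + np.log(psi_pair) @ q[j]
        qi = np.exp(logq - logq.max())     # stabilized softmax
        q[i] = qi / qi.sum()
    kls.append(global_kl(q))

print("KL(q||p) per sweep:", [round(k, 5) for k in kls[:5]], "...")
```

The monotone decrease is specific to α = 0; for other α, the divergence minimized locally by each message update is not the global one.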
A map of the algorithms along two axes, divergence measure (vertical) and approximation family (horizontal):

    divergence measure
      ^
      |  TRW
      |  FBP       Power EP
      |  BP        EP
      |  MF        Structured MF
      +----------------------------->  approximation family

    Other divergences?  Other families? (e.g., mixtures)
[Yedidia,Freeman,Weiss 00]