SLIDE 1

Feedback Message Passing for Inference in Gaussian Graphical Models

Ying Liu, Venkat Chandrasekaran, Animashree Anandkumar, and Alan Willsky

Stochastic Systems Group, Laboratory for Information and Decision Systems, Massachusetts Institute of Technology

ISIT, Austin, Texas, June 18, 2010

SLIDE 2

Gaussian Graphical Models

The probability density of a Gaussian graphical model can be written as $p(x) \propto \exp\{-\frac{1}{2} x^T J x + h^T x\}$, where $J$ is called the information matrix and $h$ is called the potential vector.

For a valid model, $J$ is symmetric and positive definite.

An information matrix $J$ is sparse, or Markov, with respect to a graph $G = (V, E)$ if $J_{ij} = 0$ for all $(i, j) \notin E$.

[Figure: matrix structure and the corresponding graph structure]
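As a concrete illustration (not from the slides), here is a minimal numpy sketch of a model that is Markov with respect to a 4-node cycle: the edge set determines exactly which off-diagonal entries of $J$ may be nonzero. The edge weights and potential vector are assumed example values.

```python
import numpy as np

# Hypothetical 4-node cycle: edges (0,1), (1,2), (2,3), (3,0).
n = 4
edges = [(0, 1), (1, 2), (2, 3), (3, 0)]

# Information matrix J that is Markov w.r.t. this graph:
# J[i, j] = 0 whenever {i, j} is not an edge.
J = np.eye(n)                        # diagonal entries J_ii = 1
for i, j in edges:
    J[i, j] = J[j, i] = -0.3         # assumed edge weights

assert np.all(np.linalg.eigvalsh(J) > 0)   # valid model: J is positive definite

h = np.array([1.0, 0.0, -1.0, 0.5])        # assumed potential vector

print(J[0, 2], J[1, 3])              # non-edge pairs are exactly zero: 0.0 0.0
```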

SLIDE 3

Inference Problem and Applications

$p(x) \propto \exp\{-\frac{1}{2} x^T J x + h^T x\}$

Goal: compute the means $\mu = J^{-1} h$ and the variances $\mathrm{diag}\{\Sigma\} = \mathrm{diag}\{J^{-1}\}$.

Solving this problem in general has $O(n^3)$ time complexity (fastest known: $O(n^{2.376})$), which is intractable for very large-scale models.

Applications: gene regulatory networks, medical diagnostics, oceanography, and communication systems.
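For reference, the brute-force computation the talk aims to avoid is a single dense inversion; the sketch below (variable names and example values are mine) computes the means and marginal variances directly in $O(n^3)$.

```python
import numpy as np

def exact_inference(J, h):
    """Direct O(n^3) baseline: mu = J^{-1} h, variances = diag(J^{-1})."""
    Sigma = np.linalg.inv(J)            # covariance matrix Sigma = J^{-1}
    return Sigma @ h, np.diag(Sigma)    # (means, marginal variances)

# Small self-contained example with assumed values:
J = np.array([[ 1.0, -0.3,  0.0],
              [-0.3,  1.0, -0.3],
              [ 0.0, -0.3,  1.0]])
h = np.array([1.0, 0.0, -1.0])
mu, var = exact_inference(J, h)
```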

SLIDE 4

Related Work

Belief propagation on trees: linear time complexity, exactness.

Loopy belief propagation (LBP) for graphs with cycles:

◮ LBP performs reasonably well for certain loopy graphs (Murphy et al., Crick et al.).

◮ Convergence and accuracy are not guaranteed in general (Ihler et al., Weiss et al.).

◮ For Gaussian graphical models, if LBP converges, the means are correct but the variances are generally incorrect (Weiss et al.).

◮ Walk-sum analysis framework (Malioutov et al.).

Generalized BP (Yedidia et al.), embedded trees (Sudderth et al.), inference by tractable subgraphs (Chandrasekaran et al.).

SLIDE 5

Main Results

Exact feedback message passing: an exact solution in $O(k^2 n)$, where $k$ is the size of the set of “feedback nodes” and $n$ is the number of nodes.

Approximate feedback message passing: trade off complexity and accuracy by selecting a proper set of “feedback nodes” of bounded size; walk-sum interpretation.

High-level idea: run standard BP/LBP on the non-feedback nodes and a special message-passing scheme on the feedback nodes.

1. Obtain inference results for the feedback nodes first.

2. Make corrections for the non-feedback nodes afterward.

SLIDE 6

Gaussian Belief Propagation

1. Message passing: $\forall j \in N(i)$, compute the messages $\Delta J_{i \to j}$ and $\Delta h_{i \to j}$.

2. Marginal computation: $\forall i \in V$,
$\hat{J}_i = J_{ii} + \sum_{k \in N(i)} \Delta J_{k \to i}$, $\quad \hat{h}_i = h_i + \sum_{k \in N(i)} \Delta h_{k \to i}$,
$\mu_i = \hat{J}_i^{-1} \hat{h}_i$, $\quad \mathrm{Var}\{i\} = \hat{J}_i^{-1}$.
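The slide lists only the quantities involved; the message updates themselves are the standard scalar Gaussian BP equations. Below is a hedged sketch with a parallel message schedule: exact on trees, and the same updates give loopy BP when the graph has cycles. The fixed iteration count is my simplification.

```python
import numpy as np

def gaussian_bp(J, h, n_iter=100):
    """Scalar Gaussian BP / LBP with a parallel schedule (exact on trees)."""
    n = len(h)
    nbrs = [[j for j in range(n) if j != i and J[i, j] != 0] for i in range(n)]
    dJ = np.zeros((n, n))   # dJ[i, j] holds the message Delta J_{i -> j}
    dh = np.zeros((n, n))   # dh[i, j] holds the message Delta h_{i -> j}
    for _ in range(n_iter):
        new_dJ, new_dh = np.zeros((n, n)), np.zeros((n, n))
        for i in range(n):
            for j in nbrs[i]:
                # Aggregate incoming messages from all neighbors of i except j.
                J_hat_ij = J[i, i] + sum(dJ[k, i] for k in nbrs[i] if k != j)
                h_hat_ij = h[i] + sum(dh[k, i] for k in nbrs[i] if k != j)
                new_dJ[i, j] = -J[j, i] * J[i, j] / J_hat_ij
                new_dh[i, j] = -J[j, i] * h_hat_ij / J_hat_ij
        dJ, dh = new_dJ, new_dh
    # Marginal computation, as on the slide.
    J_hat = np.array([J[i, i] + sum(dJ[k, i] for k in nbrs[i]) for i in range(n)])
    h_hat = np.array([h[i] + sum(dh[k, i] for k in nbrs[i]) for i in range(n)])
    return h_hat / J_hat, 1.0 / J_hat       # (means mu_i, variances Var{i})
```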

SLIDE 7

Loopy Belief Propagation

Message update scheme: completely local, no header information, and it suffers from cyclic effects.

More memory and multiple messages?

Sacrifice some of the distributed nature for better convergence and accuracy?

Some special nodes?

SLIDE 8

Feedback Vertex Set

A feedback vertex set (FVS) is a set of nodes whose removal results in a cycle-free graph.

In practice, a pseudo-FVS (a small subset of an FVS) may be sufficient for convergence and accuracy.
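A small helper (mine, not from the talk) that checks the defining property: after removing the candidate set, a depth-first search finds no cycles in the remaining undirected graph.

```python
def is_fvs(n, edges, candidate):
    """Return True if removing `candidate` leaves the undirected graph acyclic."""
    keep = [v for v in range(n) if v not in set(candidate)]
    adj = {v: [] for v in keep}
    for u, v in edges:
        if u in adj and v in adj:
            adj[u].append(v)
            adj[v].append(u)
    seen = set()
    for root in keep:
        if root in seen:
            continue
        stack = [(root, None)]          # iterative DFS over (node, parent)
        seen.add(root)
        while stack:
            node, parent = stack.pop()
            for nb in adj[node]:
                if nb == parent:
                    continue
                if nb in seen:          # back edge found, so there is a cycle
                    return False
                seen.add(nb)
                stack.append((nb, node))
    return True

# 4-cycle example: any single node is an FVS.
print(is_fvs(4, [(0, 1), (1, 2), (2, 3), (3, 0)], [0]))   # True
```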


SLIDE 9

Exact Inference: a Single Feedback Node Case

Remove the feedback node (node 1); the remaining graph $T$ is cycle-free. Introduce an extra potential vector $h^1$ on $T$ with entries
$h^1_j = \begin{cases} 0, & j \notin N(1) \\ J_{1j}, & j \in N(1). \end{cases}$


SLIDE 10

Exact Inference: a Single Feedback Node Case (cont'd)

Run belief propagation on $T$. The messages are $\Delta J^{T}_{i \to j}$, $\Delta h^{T}_{i \to j}$, and $\Delta h^{1}_{i \to j}$.

We obtain partial variances, partial means, and feedback gains: $\mathrm{Var}^{T}\{i\}$, $\mu^{T}_{i}$, and $g^{1}_{i}$.

SLIDE 11

Exact Inference: a Single Feedback Node Case (cont'd)

$\mathrm{Var}\{1\} = \Big( J_{11} - \sum_{k \in N(1)} J_{1k}\, g^{1}_{k} \Big)^{-1}$

$\mu_1 = \mathrm{Var}\{1\} \Big( h_1 - \sum_{j \in N(1)} J_{1j}\, \mu^{T}_{j} \Big)$


SLIDE 12

Exact Inference: a Single Feedback Node Case (cont'd)

Node 1 tells its neighbors to revise their node potentials:

$\tilde{h}_j = h_j - J_{1j}\, \mu_1, \quad \forall j \in N(1)$

$\tilde{h}_j = h_j, \quad \forall j \notin N(1)$


SLIDE 13

Exact Inference: a Single Feedback Node Case (cont'd)

Run BP on $T$ with the revised node potentials $\tilde{h}$ to obtain the exact means. The exact variances are then obtained as
$\mathrm{Var}\{i\} = \mathrm{Var}^{T}\{i\} + \mathrm{Var}\{1\}\,(g^{1}_{i})^2, \quad \forall i \in T.$
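Putting Slides 9 through 13 together, here is a hedged numpy sketch of the single-feedback-node procedure on a small loopy model. For brevity, the BP runs on the cycle-free graph $T$ are replaced by direct solves, which is exactly what BP computes on $T$; all variable names and the example numbers are assumptions.

```python
import numpy as np

def fmp_single_feedback_node(J, h, f):
    """Exact feedback message passing with a single feedback node f.
    The solves on T stand in for BP on the cycle-free graph T."""
    n = len(h)
    T = [i for i in range(n) if i != f]           # non-feedback nodes
    J_T, h_T = J[np.ix_(T, T)], h[T]
    h1 = J[T, f]                                  # extra potential: h1_j = J_{1j} on N(1)

    # Step 1: BP on T with potentials h_T and h1.
    SigT = np.linalg.inv(J_T)
    varT = np.diag(SigT)                          # partial variances Var^T{i}
    muT = SigT @ h_T                              # partial means mu^T_i
    g1 = SigT @ h1                                # feedback gains g^1_i

    # Step 2: exact marginal at the feedback node.
    var_f = 1.0 / (J[f, f] - J[f, T] @ g1)
    mu_f = var_f * (h[f] - J[f, T] @ muT)

    # Step 3: revise potentials, rerun BP on T for exact means,
    # then correct the variances.
    h_rev = h_T - J[T, f] * mu_f                  # revised potentials
    mu_T = np.linalg.solve(J_T, h_rev)            # exact means on T
    var_T = varT + var_f * g1 ** 2                # exact variances on T

    mu, var = np.empty(n), np.empty(n)
    mu[f], var[f] = mu_f, var_f
    mu[T], var[T] = mu_T, var_T
    return mu, var

# Verify on a small loopy model (assumed values): a 4-cycle, feedback node 0.
J = np.eye(4)
for i, j in [(0, 1), (1, 2), (2, 3), (3, 0)]:
    J[i, j] = J[j, i] = -0.3
h = np.array([1.0, 0.0, -1.0, 0.5])
mu, var = fmp_single_feedback_node(J, h, f=0)
print(np.allclose(mu, np.linalg.solve(J, h)))           # True
print(np.allclose(var, np.diag(np.linalg.inv(J))))      # True
```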

SLIDE 14

Exact Inference: Multiple Feedback Nodes Case

With an FVS of size $k$, run BP with $k$ extra messages and add more correction terms: overall complexity $O(k^2 n)$.

Example: exact inference in $O((\log n)^2\, n)$ when the FVS has size $O(\log n)$.
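The same block-elimination structure extends to an FVS of size $k$; the sketch below is mine (direct solves again stand in for the BP runs on $T$). The variance correction sums $k^2$ terms per node, matching the $O(k^2 n)$ count.

```python
import numpy as np

def fmp_exact(J, h, F):
    """Exact FMP with feedback vertex set F; solves on the cycle-free part T
    stand in for BP, which computes these quantities exactly on T."""
    n = len(h)
    F = list(F)
    T = [i for i in range(n) if i not in set(F)]
    Sig_T = np.linalg.inv(J[np.ix_(T, T)])              # BP on T
    muT = Sig_T @ h[T]                                  # partial means on T
    G = Sig_T @ J[np.ix_(T, F)]                         # feedback gains, one column per feedback node
    Sig_F = np.linalg.inv(J[np.ix_(F, F)] - J[np.ix_(F, T)] @ G)   # exact covariance on F
    mu_F = Sig_F @ (h[F] - J[np.ix_(F, T)] @ muT)       # exact means on F
    mu_T = np.linalg.solve(J[np.ix_(T, T)], h[T] - J[np.ix_(T, F)] @ mu_F)   # corrected means
    var_T = np.diag(Sig_T) + np.einsum('ip,pq,iq->i', G, Sig_F, G)           # corrected variances
    mu, var = np.empty(n), np.empty(n)
    mu[F], var[F] = mu_F, np.diag(Sig_F)
    mu[T], var[T] = mu_T, var_T
    return mu, var

# Check against direct inversion on any small loopy J, e.g.:
# assert np.allclose(fmp_exact(J, h, [0])[0], np.linalg.solve(J, h))
```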

SLIDE 15

Approximate Feedback Message Passing

Full FVS ⇒ pseudo-FVS.

Approximate inference on the tree-like part.

Exact inference among the feedback nodes.

SLIDE 16

Approximate Inference: Theoretical Results

The spectral radius $\rho_T < 1$ for the remaining graph $T$ is a sufficient condition for convergence.

When it converges, the feedback nodes get exact means and variances.

When it converges, the non-feedback nodes get exact means but inexact variances (capturing a strictly larger set of walks).

For attractive models (where $J_{ij} \le 0$ for $i \ne j$), this yields better lower bounds on the variances.
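The slides do not spell out how $\rho_T$ is computed; under the walk-sum framework cited earlier, a natural reading is the spectral radius of the entry-wise absolute partial-correlation matrix restricted to $T$. A sketch under that assumption (my interpretation, not stated on the slide):

```python
import numpy as np

def remaining_spectral_radius(J, pseudo_fvs):
    """rho_T under the assumed reading: spectral radius of |R| restricted to T,
    where R = I - D^{-1/2} J D^{-1/2} is the partial-correlation (walk) matrix."""
    n = J.shape[0]
    d = np.sqrt(np.diag(J))
    R = np.eye(n) - J / np.outer(d, d)               # normalized walk matrix
    T = [i for i in range(n) if i not in set(pseudo_fvs)]
    R_T = np.abs(R[np.ix_(T, T)])
    return float(np.max(np.abs(np.linalg.eigvalsh(R_T))))
```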

SLIDE 17

Selecting a pseudo-FVS of Bounded Size

Two goals: better convergence and better accuracy.

Two node-selection scores:

1. $s(i) = \sum_{j \in N(i)} |J_{ij}|$

2. $s(i) = \sum_{l, k \in N(i),\, l < k} |J_{il} J_{ik}|$

At each step, pick the node with the largest score $s(i)$, remove it, and continue with the remaining graph.
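A sketch of the greedy selection rule using the first score (the second score would replace the line computing `scores`); variable names are mine.

```python
import numpy as np

def select_pseudo_fvs(J, k):
    """Greedily pick k feedback nodes, recomputing s(i) = sum_{j in N(i)} |J_ij|
    on the remaining graph after each removal."""
    A = np.abs(np.asarray(J, dtype=float).copy())
    np.fill_diagonal(A, 0.0)                      # only edge strengths count
    remaining = set(range(A.shape[0]))
    chosen = []
    for _ in range(min(k, len(remaining))):
        scores = {i: A[i, list(remaining)].sum() for i in remaining}
        best = max(scores, key=scores.get)        # node with the largest score
        chosen.append(best)
        remaining.discard(best)                   # continue with the remaining graph
    return chosen
```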


SLIDE 18

Numerical Results

[Figure: four panels labeled "Removing 0 node(s): 1.0477", "Removing 1 node(s): 1.0415", "Removing 2 node(s): 0.97249", and "Removing 3 node(s): 0.95638" (the spectral radius of the remaining graph after removing that many feedback nodes).]

SLIDE 19

Figure: Inference errors on an 80 × 80 grid graph. (o) Iterations versus variance errors; (p) iterations versus mean errors.

Empirically, k = O(log n) seems to be sufficient.

SLIDE 20

Conclusions and Future Research

Conclusions:
Exact feedback message passing: $O(k^2 n)$.
Approximate feedback message passing: trades off complexity and accuracy.

Future research:
Performance on random graphs.
Computing the partition function.
The corresponding structure learning problem.


SLIDE 21

Questions and Comments? Thank you!
