SLIDE 1

Sum-Product: Message Passing Belief Propagation

Probabilistic Graphical Models, Sharif University of Technology, Spring 2018, Soleymani

SLIDE 2

All single-node marginals

 If we need the full set of marginals, repeating the elimination algorithm for each individual variable is wasteful:
   it does not share intermediate terms.
 Message-passing algorithms on graphs (the messages are the shared intermediate terms):
   sum-product and junction tree
   Upon convergence of these algorithms, we obtain the marginal probabilities for all cliques of the original graph.

SLIDE 3

Tree

 Sum-product works only on trees (and we will see it also works on tree-like graphs)
 Directed tree: every node has exactly one parent, except the root
 Undirected tree: there is a unique path between any pair of nodes
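The unique-path property of undirected trees can be checked directly. A small sketch (the 5-node tree and its labels are illustrative, not from the slides): enumerate all simple paths between every pair of nodes and count them.

```python
import itertools
from collections import defaultdict

# Illustrative undirected tree on nodes 1..5
edges = [(1, 2), (1, 3), (3, 4), (3, 5)]
nbrs = defaultdict(set)
for a, b in edges:
    nbrs[a].add(b)
    nbrs[b].add(a)

def simple_paths(u, v, seen=()):
    """Enumerate all simple (non-revisiting) paths from u to v."""
    if u == v:
        return [[v]]
    paths = []
    for w in nbrs[u] - set(seen):
        paths += [[u] + p for p in simple_paths(w, v, seen + (u,))]
    return paths

counts = {(u, v): len(simple_paths(u, v))
          for u, v in itertools.combinations(range(1, 6), 2)}
print(counts)  # every pair of nodes is joined by exactly one simple path
```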

SLIDE 4

Parameterization

 Consider a tree $\mathcal{U}(\mathcal{W}, \mathcal{E})$
 Potential functions: $\varrho(y_j)$, $\varrho(y_j, y_k)$

$$Q(\boldsymbol{y}) = \frac{1}{a} \prod_{j \in \mathcal{W}} \varrho(y_j) \prod_{(j,k) \in \mathcal{E}} \varrho(y_j, y_k)$$

 In directed graphs:
   $\varrho(y_s) = Q(y_s)$ at the root $s$, and $\varrho(y_j) = 1$ for all $j \neq s$
   $\varrho(y_j, y_k) = Q(y_k \mid y_j)$ ($y_j$ is the parent of $y_k$)
   $a = 1$
 When we have evidence $y_j = \bar{y}_j$ on variable $y_j$, we replace $y_j$ by $\bar{y}_j$ in every factor in which it appears.

$$Q(\boldsymbol{y}) = Q(y_s) \prod_{(j,k) \in \mathcal{E}} Q(y_k \mid y_j)$$
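A minimal numerical sketch of the directed-tree parameterization (the potential tables are illustrative assumptions, not from the slides): for a two-node chain $y_0 \to y_1$ over binary variables, the root potential is its marginal, the pairwise potential is the CPD, and the product of potentials needs no normalization ($a = 1$).

```python
import itertools

root_marginal = [0.6, 0.4]   # rho(y0) = Q(y0), the root's marginal
cpd = [[0.7, 0.3],           # rho(y0, y1) = Q(y1 | y0), rows indexed by y0
       [0.2, 0.8]]

def joint(y0, y1):
    # Q(y) = rho(y0) * rho(y0, y1); no normalizer needed (a = 1)
    return root_marginal[y0] * cpd[y0][y1]

total = sum(joint(y0, y1) for y0, y1 in itertools.product((0, 1), repeat=2))
print(total)  # 1.0: the product of potentials is already normalized
```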

SLIDE 5

Sum-product: elimination view

 Query node $s$
 Elimination order: inverse of the topological order
   Starts from the leaves and generates elimination cliques of size at most two
 Elimination of each node can be considered as message passing (or belief propagation):
   Elimination on trees is equivalent to message passing along tree branches
   Instead of eliminating a node, we preserve the node and compute a message from it to its parent
   This message is equivalent to the factor resulting from the elimination of that node and all of the nodes in its subtree
SLIDE 6

Messages

 A node can send a message to its neighbors when (and only when) it has received messages from all of its other neighbors.

[Figure: the message that $k$ sends to $j$, directed toward the root]

SLIDE 7

Messages and marginal distribution

Message that $Y_k$ sends to $Y_j$ (a function of only $y_j$):

$$n_{kj}(y_j) = \sum_{y_k} \varrho(y_k)\, \varrho(y_j, y_k) \prod_{l \in \mathcal{O}(k) \setminus j} n_{lk}(y_k)$$

Marginal at the query node $s$:

$$q(y_s) \propto \varrho(y_s) \prod_{l \in \mathcal{O}(s)} n_{ls}(y_s)$$

SLIDE 8

Messages and marginal: Example

 Compute $q(y_1)$:

$$q(y_1) \propto \varrho(y_1)\, n_{21}(y_1)$$

$$n_{21}(y_1) = \sum_{y_2} \varrho(y_2)\, \varrho(y_1, y_2)\, n_{32}(y_2)\, n_{42}(y_2)$$

$$n_{32}(y_2) = \sum_{y_3} \varrho(y_3)\, \varrho(y_2, y_3) \qquad n_{42}(y_2) = \sum_{y_4} \varrho(y_4)\, \varrho(y_2, y_4)$$

$n_{21}(y_1)$ is the product of the remaining factors after eliminating all variables except $y_1$.
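The example above can be run numerically. A sketch with assumed potential tables (binary variables; the tree has edges (1,2), (2,3), (2,4) as in the slide): messages flow 3 → 2, 4 → 2, 2 → 1, and the result is checked against the brute-force marginal.

```python
import itertools

# Assumed (illustrative) potentials, not from the slides
rho = {1: [0.5, 0.5], 2: [0.3, 0.7], 3: [0.9, 0.1], 4: [0.4, 0.6]}
rho12 = [[1.0, 0.5], [0.5, 1.0]]   # rho(y1, y2)
rho23 = [[0.8, 0.2], [0.2, 0.8]]   # rho(y2, y3)
rho24 = [[0.6, 0.4], [0.4, 0.6]]   # rho(y2, y4)

# Leaf messages: n_kj(yj) = sum_yk rho(yk) * rho(yj, yk)
n32 = [sum(rho[3][y3] * rho23[y2][y3] for y3 in (0, 1)) for y2 in (0, 1)]
n42 = [sum(rho[4][y4] * rho24[y2][y4] for y4 in (0, 1)) for y2 in (0, 1)]
# Internal message folds in the messages from node 2's other neighbors
n21 = [sum(rho[2][y2] * rho12[y1][y2] * n32[y2] * n42[y2] for y2 in (0, 1))
       for y1 in (0, 1)]

q1 = [rho[1][y1] * n21[y1] for y1 in (0, 1)]
Z = sum(q1)
q1 = [v / Z for v in q1]

# Brute force over the full joint for comparison
def joint(y1, y2, y3, y4):
    return (rho[1][y1] * rho[2][y2] * rho[3][y3] * rho[4][y4]
            * rho12[y1][y2] * rho23[y2][y3] * rho24[y2][y4])

bf = [sum(joint(y1, y2, y3, y4)
          for y2, y3, y4 in itertools.product((0, 1), repeat=3))
      for y1 in (0, 1)]
Zb = sum(bf)
bf = [v / Zb for v in bf]
print(q1, bf)  # the message-passing marginal matches the brute-force one
```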

SLIDE 9

Messages and marginal: Example

 Compute $q(y_2)$:

$$q(y_2) \propto \varrho(y_2)\, n_{12}(y_2)\, n_{32}(y_2)\, n_{42}(y_2)$$

$$n_{12}(y_2) = \sum_{y_1} \varrho(y_1)\, \varrho(y_1, y_2) \qquad n_{32}(y_2) = \sum_{y_3} \varrho(y_3)\, \varrho(y_2, y_3) \qquad n_{42}(y_2) = \sum_{y_4} \varrho(y_4)\, \varrho(y_2, y_4)$$

SLIDE 10

Messages on a tree

 Messages can be reused to find the marginals of different query variables.
 Messages on the tree provide a data structure for caching computations.

[Figure: tree over $Y_1, \dots, Y_5$] We need $n_{32}(y_2)$ to find both $Q(Y_1)$ and $Q(Y_2)$.

SLIDE 11

From elimination to message passing

 Recall the ELIMINATION algorithm:
   Choose an ordering Z in which the query node f is the final node
   Place all potentials on an active list
   Eliminate node i by removing all potentials containing i and taking the sum over xi
   Place the resulting factor back on the list
 For a TREE graph:
   Choose the query node f as the root of the tree
   View the tree as a directed tree with edges pointing from f towards the leaves
   Elimination ordering based on the reverse topological order
   Elimination of each node can be considered as message passing directly along tree branches
 Thus, we can use the tree itself as a data structure to do general inference!

This slide has been adapted from Eric Xing, PGM 10-708, CMU.

SLIDE 12

Computing all node marginals

 We can cover every elimination ordering that generates only elimination cliques of size at most two by computing all possible messages: $2|\mathcal{E}|$ of them
   To allow every node to serve as the root, we only need these $2|\mathcal{E}|$ messages
 Messages can be reused:
   instead of running the elimination algorithm $N$ times (once per query node)
 Dynamic programming approach:
   a 2-pass algorithm that saves and reuses messages
   a pair of messages (one for each direction) is computed for each edge

SLIDE 13

Messages required to compute all node marginals

SLIDE 14

Computing node marginals

 Naïve approach:
   Complexity: $N \times C$
   $N$ is the number of nodes
   $C$ is the complexity of one complete message-passing sweep
 Alternative: dynamic programming approach
   A 2-pass algorithm
   Complexity: only $2C$!

SLIDE 15

A two-pass message-passing schedule

 Arbitrarily pick a node as the root
 First pass: starts at the leaves and proceeds inward
   Each node passes a message to its parent.
   Continues until the root has obtained messages from all of its adjoining nodes.
 Second pass: starts at the root and passes the messages back outward
   Messages are passed in the reverse direction.
   Continues until all leaves have received their messages.
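The two-pass schedule can be sketched end to end. In this sketch (the tree and potential tables are illustrative assumptions), pass 1 sends messages from the leaves toward the root, pass 2 sends them back out, and every node marginal then becomes a local computation: $q(y_j) \propto \varrho(y_j) \prod_{l} n_{lj}(y_j)$.

```python
import math
from collections import defaultdict

edges = [(1, 2), (2, 3), (2, 4)]
node_pot = {1: [0.5, 0.5], 2: [0.3, 0.7], 3: [0.9, 0.1], 4: [0.4, 0.6]}
edge_pot = {(1, 2): [[1.0, 0.5], [0.5, 1.0]],
            (2, 3): [[0.8, 0.2], [0.2, 0.8]],
            (2, 4): [[0.6, 0.4], [0.4, 0.6]]}

nbrs = defaultdict(set)
for a, b in edges:
    nbrs[a].add(b)
    nbrs[b].add(a)

def pair(j, k, yj, yk):
    # each undirected edge potential is stored once, under one orientation
    return (edge_pot[(j, k)][yj][yk] if (j, k) in edge_pot
            else edge_pot[(k, j)][yk][yj])

msg = {}  # msg[(k, j)][yj]: message from k to j

def send(k, j):
    msg[(k, j)] = [
        sum(node_pot[k][yk] * pair(j, k, yj, yk)
            * math.prod(msg[(l, k)][yk] for l in nbrs[k] - {j})
            for yk in (0, 1))
        for yj in (0, 1)]

# Pick a root and record parent pointers in DFS discovery order
root, parent, order, stack = 1, {1: None}, [1], [1]
while stack:
    u = stack.pop()
    for v in nbrs[u] - {parent[u]}:
        parent[v] = u
        order.append(v)
        stack.append(v)

for u in reversed(order[1:]):   # first pass: leaves -> root
    send(u, parent[u])
for u in order[1:]:             # second pass: root -> leaves
    send(parent[u], u)

marginals = {}
for j in nbrs:
    un = [node_pot[j][y] * math.prod(msg[(l, j)][y] for l in nbrs[j])
          for y in (0, 1)]
    marginals[j] = [v / sum(un) for v in un]
print(marginals)  # all four node marginals from a single up-down sweep
```

Note that exactly $2|\mathcal{E}| = 6$ messages are computed, matching the count on the earlier slide.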

SLIDE 16

Asynchronous two-pass message passing

First pass: upward. Second pass: downward.

SLIDE 17

Sum-product algorithm: example

[Figure: messages computed on the example tree, including $n_{21}(y_1)$]

SLIDE 18

Sum-product algorithm: example

[Figure: the message $n_{21}(y_1)$ on the example tree]

SLIDE 19

Parallel (synchronous) message passing

 For a node of degree $d$: whenever messages have arrived on any subset of $d-1$ edges, compute the message for the remaining edge and send it!
 A pair of messages is computed for each edge, one for each direction
 All incoming messages are eventually computed for each node
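The scheduling rule above can be sketched on its own, with the message values left as placeholders (the tree here is an illustrative assumption): repeatedly sweep the directed edges and send on $(k, j)$ as soon as $k$ has received messages from all of its neighbors except $j$. On a tree this never deadlocks and terminates with all $2|\mathcal{E}|$ messages sent.

```python
from collections import defaultdict

edges = [(1, 2), (2, 3), (2, 4)]
nbrs = defaultdict(set)
for a, b in edges:
    nbrs[a].add(b)
    nbrs[b].add(a)

directed = [(a, b) for a, b in edges] + [(b, a) for a, b in edges]
msg = {}
rounds = 0
while len(msg) < len(directed):
    progress = False
    for k, j in directed:
        # protocol: k may send to j once it has heard from all other neighbors
        if (k, j) not in msg and all((l, k) in msg for l in nbrs[k] - {j}):
            msg[(k, j)] = "sent"   # a real implementation computes n_kj here
            progress = True
    rounds += 1
    assert progress, "deadlock: the protocol only terminates on trees"
print(len(msg), rounds)  # all 2|E| = 6 messages get sent
```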

SLIDE 20

Parallel message passing

 Message-passing protocol: a node can send a message to a neighboring node when, and only when, it has received messages from all of its other neighbors
 Correctness of parallel message passing on trees:
   The synchronous implementation is "non-blocking"
   Theorem: the message-passing protocol guarantees obtaining all marginals in the tree

SLIDE 21

Parallel message passing: Example

SLIDE 22

Tree-like graphs

 The sum-product message-passing idea can be extended to work on tree-like graphs (e.g., polytrees) as well.
 Although the moralized undirected graph obtained from a polytree is not a tree, the corresponding factor graph is a tree.

[Figure: a polytree (nodes can have multiple parents), its moralized graph, and its factor graph]
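This contrast can be verified on a concrete example. A sketch (the polytree A → C ← B, C → D is an illustrative assumption): moralization marries C's parents by adding the edge A-B, which creates a cycle, while the factor graph (one factor node per CPD) remains a tree.

```python
from collections import defaultdict

def has_cycle(nodes, edges):
    """Cycle test for an undirected simple graph, via DFS with parent exclusion."""
    nbrs = defaultdict(set)
    for a, b in edges:
        nbrs[a].add(b)
        nbrs[b].add(a)
    seen = set()
    def dfs(u, parent):
        seen.add(u)
        return any(v in seen or dfs(v, u) for v in nbrs[u] - {parent})
    return any(u not in seen and dfs(u, None) for u in nodes)

moral_nodes = ["A", "B", "C", "D"]
moral_edges = [("A", "C"), ("B", "C"), ("C", "D"), ("A", "B")]  # A-B: moral edge

# Factor graph: a variable node per variable, a factor node per CPD
fg_nodes = moral_nodes + ["fA", "fB", "fC", "fD"]
fg_edges = [("fA", "A"), ("fB", "B"),
            ("fC", "A"), ("fC", "B"), ("fC", "C"),
            ("fD", "C"), ("fD", "D")]

print(has_cycle(moral_nodes, moral_edges))  # True: the moralized graph has a cycle
print(has_cycle(fg_nodes, fg_edges))        # False: the factor graph is a tree
```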

SLIDE 23

References

 D. Koller and N. Friedman, "Probabilistic Graphical Models: Principles and Techniques", MIT Press, 2009, Chapter 10.
 M. I. Jordan, "An Introduction to Probabilistic Graphical Models", Chapter 4.