CS 3750 Machine Learning
Lecture 4: Graphical models and inference III

Milos Hauskrecht
milos@pitt.edu
5329 Sennott Square, x4-8845
http://www.cs.pitt.edu/~milos/courses/cs3750-Spring2020/

Clique trees

[Figure: an MRF over variables A–H and the corresponding clique tree]

BBNs and MRFs can be converted to clique trees:

  • Optimal clique trees can support efficient inferences

Note: a clique tree = a tree decomposition of an MRF = a junction tree


Algorithms for clique trees

Properties

  • A tree with nodes corresponding to sets of variables
  • Satisfies the running intersection property:
    – For every variable v in G, the nodes of T that contain v form a connected subtree.

Inference algorithms for clique trees exist:
  • Inference complexity is determined by the width of the tree


VE on the Clique tree

  • Variable Elimination on the clique tree

– works on factors

  • Makes factor a data structure

– Sends and receives messages

  • Graph representing a set of factors; each node i is associated with a subset (cluster, clique) C_i
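To make the "factor as a data structure" idea concrete, here is a minimal sketch of a discrete factor supporting the two operations VE needs: multiplication and marginalization. The Factor class, its method names, and the numpy table layout are illustrative choices of this write-up, not part of the lecture.

    import numpy as np

    class Factor:
        """A discrete factor: a table with one array axis per named variable."""
        def __init__(self, variables, table):
            self.variables = tuple(variables)            # e.g. ('C', 'D')
            self.table = np.asarray(table, dtype=float)  # one axis per variable

        def multiply(self, other):
            """Pointwise product; the result ranges over the union of the variables."""
            joint = self.variables + tuple(v for v in other.variables
                                           if v not in self.variables)
            return Factor(joint, self._aligned(joint) * other._aligned(joint))

        def marginalize(self, var):
            """Sum out a single variable."""
            axis = self.variables.index(var)
            remaining = self.variables[:axis] + self.variables[axis + 1:]
            return Factor(remaining, self.table.sum(axis=axis))

        def _aligned(self, joint):
            """Broadcast this factor's table against the axis order given by joint."""
            own = [v for v in joint if v in self.variables]
            t = np.transpose(self.table, [self.variables.index(v) for v in own])
            shape = [t.shape[own.index(v)] if v in self.variables else 1 for v in joint]
            return t.reshape(shape)

    # tiny usage example (placeholder tables):
    phi = Factor(('C', 'D'), [[0.9, 0.1], [0.2, 0.8]])
    psi = Factor(('D', 'G'), [[0.3, 0.7], [0.5, 0.5]])
    print(phi.multiply(psi).marginalize('D').table)      # a table over (C, G)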


Clique trees

  • Example clique tree

[Figure: a Bayesian network over C, D, I, G, S, L, J, H, K and its clique tree with cliques (C,D), (G,I,D), (G,S,I), (G,J,S,L), (H,G,J), (S,K)]

Clique tree properties

  • Sepset
    – Separation set (sepset): variables X on one side of a sepset are separated from the variables Y on the other side in the factor graph, given the variables in S
  • Running intersection property
    – If C_i and C_j both contain a variable X, then all cliques on the unique path between them also contain X

Sepset on the edge between cliques C_i and C_j:

    S_ij = C_i ∩ C_j
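A small sketch of checking the running intersection property on a candidate clique tree; the dictionary encoding and the connectivity test are my own scaffolding, not from the lecture. It uses the fact that a subgraph of a tree on k nodes is connected exactly when it has k − 1 edges.

    def satisfies_rip(cliques, edges):
        """cliques: {clique id: set of variables}; edges: list of (i, j) tree edges."""
        variables = set().union(*cliques.values())
        for v in variables:
            nodes = [i for i, c in cliques.items() if v in c]
            sub_edges = [(i, j) for i, j in edges if v in cliques[i] and v in cliques[j]]
            # the cliques containing v form a subforest of the tree;
            # it is a single connected subtree iff it has len(nodes) - 1 edges
            if len(sub_edges) != len(nodes) - 1:
                return False
        return True

    # the lecture's example: [C,D]-[G,I,D]-[G,S,I]-[G,J,S,L]-[H,G,J] and [G,J,S,L]-[S,K]
    cliques = {1: {'C', 'D'}, 2: {'G', 'I', 'D'}, 3: {'G', 'S', 'I'},
               4: {'G', 'J', 'S', 'L'}, 5: {'H', 'G', 'J'}, 6: {'S', 'K'}}
    edges = [(1, 2), (2, 3), (3, 4), (4, 5), (4, 6)]
    print(satisfies_rip(cliques, edges))   # True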


Clique trees

  • Running intersection:

E.g., the cliques involving G form a connected subtree.

[Figure: the example clique tree; the cliques containing G — (G,I,D), (G,S,I), (G,J,S,L), (H,G,J) — form a connected subtree]

Clique trees

  • Sepsets: variables X on one side of a sepset are separated from the variables Y on the other side, given the variables in S
  • Sepset on the edge between C_i and C_j: S_ij = C_i ∩ C_j

[Figure: the example clique tree with sepsets D, (G,I), (G,S), (G,J), S labeling its edges]


Clique trees

Initial potentials: assign factors to cliques and multiply them.

    π_1(C,D), π_2(G,I,D), π_3(G,S,I), π_4(G,J,S,L), π_5(H,G,J), π_6(S,K)

    p(C,D,G,I,S,J,L,K,H) = π_1(C,D) · π_2(G,I,D) · π_3(G,S,I) · π_4(G,J,S,L) · π_5(H,G,J) · π_6(S,K)

[Figure: the example clique tree with one initial potential attached to each clique]
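A short sketch of this assign-and-multiply step, reusing the Factor class from the earlier sketch; the all-ones initialization, binary cardinalities, and the clique-selection rule are illustrative assumptions, not the lecture's prescription.

    import numpy as np

    def initial_potentials(cliques, factors):
        """cliques: {clique id: tuple of variables}; factors: list of Factor objects."""
        # start every clique from an all-ones table (all variables binary here)
        pi = {i: Factor(scope, np.ones([2] * len(scope)))
              for i, scope in cliques.items()}
        for f in factors:
            # pick some clique whose scope contains all of the factor's variables
            home = next(i for i, scope in cliques.items()
                        if set(f.variables) <= set(scope))
            pi[home] = pi[home].multiply(f)
        return pi

For the example tree above, each CPD of the Bayesian network would be assigned to one of the six cliques listed, yielding π_1 … π_6.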


Message Passing VE

  • Query for P(J)

– Eliminate C:

Message sent from [C,D] to [G,I,D]:

    δ_1(D) = Σ_C π_1[C,D]

Message received at [G,I,D]; [G,I,D] updates:

    π̄_2[G,I,D] = δ_1(D) · π_2[G,I,D]


Message Passing VE

  • Query for P(J)

– Eliminate D:

Message sent from [G,I,D] to [G,S,I]:

    δ_2(G,I) = Σ_D π̄_2[G,I,D]

Message received at [G,S,I]; [G,S,I] updates:

    π̄_3[G,S,I] = δ_2(G,I) · π_3[G,S,I]


Message Passing VE

  • Query for P(J)

– Eliminate I:

Message sent from [G,S,I] to [G,J,S,L]:

    δ_3(G,S) = Σ_I π̄_3[G,S,I]

Message received at [G,J,S,L]; [G,J,S,L] updates:

    π̄_4[G,J,S,L] = δ_3(G,S) · π_4[G,J,S,L]

But [G,J,S,L] is not ready!


Message Passing VE

  • Query for P(J)

– Eliminate H:

Message sent from [H,G,J] to [G,J,S,L]:

    δ_4(G,J) = Σ_H π_5[H,G,J]

[G,J,S,L] updates:

    π̄_4[G,J,S,L] = δ_3(G,S) · δ_4(G,J) · π_4[G,J,S,L]

And …


Message Passing VE

  • Query for P(J)

– Eliminate K:

Message sent from [S,K] to [G,J,S,L]:

    δ_6(S) = Σ_K π_6[S,K]

All messages received at [G,J,S,L]; [G,J,S,L] updates:

    π̄_4[G,J,S,L] = δ_3(G,S) · δ_4(G,J) · δ_6(S) · π_4[G,J,S,L]

And calculate P(J) from it by summing out G,S,L
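Putting the elimination steps above together, here is a small end-to-end sketch of the upward pass for the query P(J). It reuses the Factor class sketched earlier; the clique potentials are random placeholder tables (all variables binary), not the lecture's actual CPDs.

    import numpy as np
    rng = np.random.default_rng(0)
    card = {v: 2 for v in 'CDIGSLJHK'}                    # all variables binary here

    def rand_factor(vs):
        return Factor(vs, rng.random([card[v] for v in vs]))

    pi = {  # initial clique potentials (placeholder tables)
        'CD':   rand_factor(('C', 'D')),
        'GID':  rand_factor(('G', 'I', 'D')),
        'GSI':  rand_factor(('G', 'S', 'I')),
        'GJSL': rand_factor(('G', 'J', 'S', 'L')),
        'HGJ':  rand_factor(('H', 'G', 'J')),
        'SK':   rand_factor(('S', 'K')),
    }

    # collect all messages toward the clique [G,J,S,L], exactly as in the slides
    d1 = pi['CD'].marginalize('C')                        # delta_1(D):   [C,D]   -> [G,I,D]
    d2 = pi['GID'].multiply(d1).marginalize('D')          # delta_2(G,I): [G,I,D] -> [G,S,I]
    d3 = pi['GSI'].multiply(d2).marginalize('I')          # delta_3(G,S): [G,S,I] -> [G,J,S,L]
    d4 = pi['HGJ'].marginalize('H')                       # delta_4(G,J): [H,G,J] -> [G,J,S,L]
    d6 = pi['SK'].marginalize('K')                        # delta_6(S):   [S,K]   -> [G,J,S,L]

    beta = pi['GJSL'].multiply(d3).multiply(d4).multiply(d6)   # updated clique potential
    p_J = beta.marginalize('G').marginalize('S').marginalize('L')
    print(p_J.table / p_J.table.sum())                    # normalized P(J)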


Message Passing VE

  • The [G,J,S,L] clique potential … is used to finish the inference

[Figure: the clique tree with all messages δ_1(D), δ_2(G,I), δ_3(G,S), δ_4(G,J), δ_6(S) directed toward the clique [G,J,S,L]]

Message passing VE

  • Often, many marginals are desired

– Inefficient to re-run each inference from scratch
– One distinct message per edge & direction

  • Methods:
    – Compute (unnormalized) marginals for any vertex (clique) of the tree
    – Results in a calibrated clique tree

  • Recap: three kinds of factor objects

– Initial potentials, final potentials, and messages

One message per edge and direction; along the edge between C_i and C_j with sepset S_ij:

    δ_i→j = Σ_{C_i \ S_ij} π̄_i        δ_j→i = Σ_{C_j \ S_ij} π̄_j


Two-pass message passing VE

  • Choose the root clique, e.g. [S,K]
  • Propagate messages to the root

[Figure: the clique tree with all messages directed toward the root clique [S,K]]

Two-pass message passing VE

  • Send messages back from the root

[Figure: the clique tree with messages sent back from the root [S,K] toward the leaves]
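A sketch of the full two-pass (collect toward a root, then distribute back) scheme on an arbitrary clique tree, again reusing the Factor class from the earlier sketch. The function names, the dictionary encoding of the tree, and the post-order recursion are my own scaffolding; the message and final-potential formulas follow the slides.

    from collections import defaultdict

    def calibrate(cliques, edges, root):
        """cliques: {id: Factor}; edges: list of (i, j) tree edges; root: clique id."""
        nbrs = defaultdict(list)
        for i, j in edges:
            nbrs[i].append(j)
            nbrs[j].append(i)

        messages = {}                       # (sender, receiver) -> Factor over the sepset

        def send(i, j):
            """delta_{i->j}: pi_i times all incoming messages except j's,
            summed down to the sepset S_ij = C_i ∩ C_j."""
            f = cliques[i]
            for k in nbrs[i]:
                if k != j:
                    f = f.multiply(messages[(k, i)])
            sepset = set(cliques[i].variables) & set(cliques[j].variables)
            for v in set(f.variables) - sepset:
                f = f.marginalize(v)
            messages[(i, j)] = f

        def collect(i, parent):             # upward pass: leaves -> root
            for k in nbrs[i]:
                if k != parent:
                    collect(k, i)
            if parent is not None:
                send(i, parent)

        def distribute(i, parent):          # downward pass: root -> leaves
            for k in nbrs[i]:
                if k != parent:
                    send(i, k)
                    distribute(k, i)

        collect(root, None)
        distribute(root, None)

        # final (calibrated) potentials: pi_i times all incoming messages
        beta = {}
        for i in cliques:
            f = cliques[i]
            for k in nbrs[i]:
                f = f.multiply(messages[(k, i)])
            beta[i] = f
        return beta

With the placeholder potentials from the previous sketch, for example, calibrate(pi, [('CD','GID'), ('GID','GSI'), ('GSI','GJSL'), ('HGJ','GJSL'), ('SK','GJSL')], root='SK') returns calibrated potentials; beta['GJSL'] then gives P(J) after summing out G, S, L and normalizing.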


Message Passing: BP

  • Belief propagation

– A different algorithm, but equivalent to variable elimination in terms of the results
– Asynchronous implementation

Message Passing: BP

  • Each node: multiply all the messages and divide by the one that is coming from the node we are sending the message to
    – Clearly the same as VE
    – Initialize the messages on the edges to 1

    δ_i→j = Σ_{C_i \ S_ij} π_i · Π_{k∈N(i)\{j}} δ_k→i = Σ_{C_i \ S_ij} ( π_i · Π_{k∈N(i)} δ_k→i ) / δ_j→i

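A minimal sketch of this message rule, assuming the Factor class from earlier and messages stored as Factors over the sepsets, initialized to all-ones tables. The divide helper and its 0/0 handling are my own additions.

    import numpy as np

    def divide(f, g):
        """Pointwise f / g (0/0 treated as 0), broadcasting g over f's variables."""
        denom = g._aligned(f.variables)
        return Factor(f.variables, np.divide(f.table, denom,
                                             out=np.zeros_like(f.table),
                                             where=denom != 0))

    def bp_message(i, j, cliques, nbrs, messages):
        """delta_{i->j}: multiply pi_i by ALL incoming messages, divide out the one
        that came from j, and sum down to the sepset S_ij = C_i ∩ C_j."""
        f = cliques[i]
        for k in nbrs[i]:
            f = f.multiply(messages[(k, i)])
        f = divide(f, messages[(j, i)])
        sepset = set(cliques[i].variables) & set(cliques[j].variables)
        for v in set(f.variables) - sepset:
            f = f.marginalize(v)
        return f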

Message Passing: BP

Clique tree: [A,B] — B — [B,C] — C — [C,D], with initial potentials π_1(A,B), π_2(B,C), π_3(C,D).

Store the last message on the edge and divide each passing message by the last stored; initially μ_2,3 = 1.

Message from [B,C] to [C,D]:

    π̄_3(C,D) = π_3(C,D) · δ_2→3(C) / μ_2,3   where   δ_2→3(C) = Σ_B π_2(B,C)

New message stored on the edge:

    μ_2,3(C) = Σ_B π_2(B,C)


Message Passing: BP

Clique tree: [A,B] — B — [B,C] — C — [C,D], with π_1(A,B), π_2(B,C), and updated π̄_3(C,D) = π_3(C,D) · Σ_B π_2(B,C).

Message back from [C,D] to [B,C]:

    δ_3→2(C) = Σ_D π̄_3(C,D)

Store the last message on the edge and divide each passing message by the last stored:

    π̄_2(B,C) = π_2(B,C) · δ_3→2(C) / μ_2,3(C)
             = π_2(B,C) · Σ_D π_3(C,D) · Σ_B π_2(B,C) / Σ_B π_2(B,C)
             = π_2(B,C) · Σ_D π_3(C,D)

New message stored on the edge:

    μ_2,3(C) = δ_3→2(C) = Σ_D π_3(C,D) · Σ_B π_2(B,C)


Message Passing: BP

Clique tree: [A,B] — B — [B,C] — C — [C,D], with π̄_3(C,D) = π_3(C,D) · Σ_B π_2(B,C), π̄_2(B,C) = π_2(B,C) · Σ_D π_3(C,D), and stored message μ_2,3(C) = Σ_D π_3(C,D) · Σ_B π_2(B,C).

Store the last message on the edge and divide each passing message by the last stored. If the message from [C,D] to [B,C] is passed again:

    δ_3→2(C) = Σ_D π̄_3(C,D) = Σ_D π_3(C,D) · Σ_B π_2(B,C) = μ_2,3(C)

    π̄_2(B,C) = π_2(B,C) · Σ_D π_3(C,D) · δ_3→2(C) / μ_2,3(C) = π_2(B,C) · Σ_D π_3(C,D)

The same as before — dividing by the stored message prevents the repeated message from being double-counted.


Loopy belief propagation

  • The asynchronous BP algorithm works on clique trees
  • What if we run the belief propagation algorithm on a non-tree structure?

  • Sometimes converges
  • If it converges, it leads to an approximate solution
  • Advantage: tractable for large graphs

[Figure: a loop over variables A, B, C, D and the corresponding cluster graph with clusters (A,B), (B,C), (C,D), (A,D) — itself a loop]
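A sketch of loopy BP on exactly this 4-cycle cluster graph, reusing the Factor class from earlier. The random potentials, the synchronous update schedule, the message normalization, and the convergence threshold are illustrative choices; on a graph with loops the iteration may or may not converge, and the resulting beliefs are only approximate marginals.

    import numpy as np
    rng = np.random.default_rng(1)
    cliques = {0: Factor(('A', 'B'), rng.random((2, 2))),
               1: Factor(('B', 'C'), rng.random((2, 2))),
               2: Factor(('C', 'D'), rng.random((2, 2))),
               3: Factor(('A', 'D'), rng.random((2, 2)))}
    edges = [(0, 1), (1, 2), (2, 3), (3, 0)]
    nbrs = {i: [] for i in cliques}
    for i, j in edges:
        nbrs[i].append(j)
        nbrs[j].append(i)

    def sepset(i, j):
        return tuple(set(cliques[i].variables) & set(cliques[j].variables))

    # initialize every directed message to 1 over its (single-variable) sepset
    msgs = {(i, j): Factor(sepset(i, j), np.ones(2)) for i in cliques for j in nbrs[i]}

    for it in range(100):                                   # synchronous updates
        new = {}
        for (i, j) in msgs:
            f = cliques[i]
            for k in nbrs[i]:
                if k != j:
                    f = f.multiply(msgs[(k, i)])
            for v in set(f.variables) - set(sepset(i, j)):
                f = f.marginalize(v)
            new[(i, j)] = Factor(f.variables, f.table / f.table.sum())   # normalize
        change = max(abs(new[e].table - msgs[e].table).max() for e in msgs)
        msgs = new
        if change < 1e-9:
            break

    # approximate (pseudo-)marginal over (A,B): cluster potential times incoming messages
    belief = cliques[0].multiply(msgs[(1, 0)]).multiply(msgs[(3, 0)])
    print(belief.table / belief.table.sum())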


Loopy belief propagation

  • If the BP algorithm converges, it converges to the optimum of the Bethe free energy

See papers:
  • Yedidia J.S., Freeman W.T., and Weiss Y. Generalized Belief Propagation, 2000.
  • Yedidia J.S., Freeman W.T., and Weiss Y. Understanding Belief Propagation and Its Generalizations, 2001.


Factor graph representation

A graphical representation that lets us express a factorization of a function over a set of variables. A factor graph is a bipartite graph where:

  • One layer is formed by variables
  • Another layer is formed by factors (functions on subsets of variables)

Example: a function over variables x1, x2, …, x5

    g(x1, x2, …, x5) = fA(x1) fB(x2) fC(x1, x2, x3) fD(x3, x4) fE(x3, x5)

[Figure: the factor graph with variable nodes x1, …, x5 in one layer and factor nodes fA, fB, fC, fD, fE in the other]
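A minimal sketch of this bipartite representation: each factor node records the variables it touches, and g is evaluated as the product of factor entries. The dictionary names and the random tables are placeholders of this write-up, not part of the lecture.

    import numpy as np

    rng = np.random.default_rng(2)
    tables = {'fA': rng.random(2), 'fB': rng.random(2), 'fC': rng.random((2, 2, 2)),
              'fD': rng.random((2, 2)), 'fE': rng.random((2, 2))}
    scopes = {'fA': ('x1',), 'fB': ('x2',), 'fC': ('x1', 'x2', 'x3'),
              'fD': ('x3', 'x4'), 'fE': ('x3', 'x5')}      # edges from factors to variables

    def g(assignment):
        """Evaluate g(x1, ..., x5) as the product of all factor entries."""
        val = 1.0
        for name, scope in scopes.items():
            val *= tables[name][tuple(assignment[v] for v in scope)]
        return val

    print(g({'x1': 0, 'x2': 1, 'x3': 1, 'x4': 0, 'x5': 1}))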


Inferences on factor graphs

  • Efficient inference algorithms exist for factor graphs built for trees [Frey, 1998; Kschischang et al., 2001]:
    – Sum-product algorithm
    – Max-product algorithm
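As a concrete illustration of the sum-product algorithm on the tree-structured example above, here is a hand-rolled message-passing sketch computing p(x3); the tables are random placeholders and all variables are binary. A brute-force enumeration checks the result.

    import numpy as np
    rng = np.random.default_rng(2)
    fA = rng.random(2)            # fA(x1)
    fB = rng.random(2)            # fB(x2)
    fC = rng.random((2, 2, 2))    # fC(x1, x2, x3)
    fD = rng.random((2, 2))       # fD(x3, x4)
    fE = rng.random((2, 2))       # fE(x3, x5)

    # leaf-to-root messages, root = x3
    m_x1_to_fC = fA                                  # mu_{fA->x1} = fA, passed on by x1
    m_x2_to_fC = fB                                  # mu_{fB->x2} = fB, passed on by x2
    m_fC_to_x3 = np.einsum('abc,a,b->c', fC, m_x1_to_fC, m_x2_to_fC)
    m_fD_to_x3 = fD.sum(axis=1)                      # x4 is a leaf variable (message = 1)
    m_fE_to_x3 = fE.sum(axis=1)                      # x5 is a leaf variable (message = 1)

    p_x3 = m_fC_to_x3 * m_fD_to_x3 * m_fE_to_x3      # product of incoming messages
    p_x3 /= p_x3.sum()                               # normalize

    # brute-force check: enumerate g over all assignments and marginalize onto x3
    g = np.einsum('a,b,abc,cd,ce->abcde', fA, fB, fC, fD, fE)
    assert np.allclose(p_x3, g.sum(axis=(0, 1, 3, 4)) / g.sum())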
[The remaining slides walk through sum-product and max-product inference on factor graphs; slides by C. Bishop, figures not reproduced here.]