
Junction Tree Algorithm and a case study of the Hidden Markov Model



  1. School of Computer Science
     Junction Tree Algorithm and a case study of the Hidden Markov Model
     Probabilistic Graphical Models (10-708), Lecture 6, Oct 3, 2007
     Eric Xing
     Reading: J-Chap. 12, 17; KF-Chap. 10
     [Figure: running example Bayesian network over X1-X8: Receptor A, Receptor B, Kinase C, Kinase D, Kinase E, TF F, Gene G, Gene H]

     Outline
     - So far we have studied exact inference in:
       - trees: message passing on the original graph (which is itself a tree)
       - poly-trees and tree-like graphs: message passing on factor trees
     - Now we will look into exact inference in arbitrary graphs: the Junction Tree algorithm
     - Inference in the Hidden Markov Model

  2. Elimination Clique
     - Recall that the dependencies induced during marginalization are captured in elimination cliques:
       - summation <-> elimination
       - intermediate term <-> elimination clique
     [Figure: variable elimination on the 8-node example A-H, showing the elimination clique formed at each step]
     - Can this lead to a generic inference algorithm?

     A Clique Tree
     [Figure: clique tree for the example, with messages m_b, m_c, m_d, m_e, m_f, m_g, m_h passed between elimination cliques]
     - Example message:  m_e(a, c, d) = Σ_e p(e | c, d) m_g(e) m_f(a, e)
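To make the message above concrete, here is a minimal numerical sketch of computing m_e(a, c, d) with table-based factors over binary variables. The factor shapes follow the example, but the tables themselves are random placeholders, not values from the lecture.

```python
import numpy as np

# Minimal sketch: compute the elimination message
#   m_e(a, c, d) = sum_e p(e | c, d) * m_g(e) * m_f(a, e)
# Factors are stored as numpy arrays over binary variables; the entries below
# are random placeholders, purely for illustration.

rng = np.random.default_rng(0)
p_e_given_cd = rng.dirichlet(np.ones(2), size=(2, 2))  # indexed [c, d, e]; sums to 1 over e
m_g = rng.random(2)                                    # m_g(e)
m_f = rng.random((2, 2))                               # m_f(a, e)

# Multiply the three factors and sum out e, leaving a table over (a, c, d)
m_e = np.einsum('cde,e,ae->acd', p_e_given_cd, m_g, m_f)
print(m_e.shape)  # (2, 2, 2): one entry per configuration of (a, c, d)
```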

  3. From Elimination to Message Passing
     - Elimination ≡ message passing on a clique tree.
     [Figure: the sequence of graphs produced as variables are eliminated from the 8-node example, alongside the corresponding clique tree and messages]
       m_e(a, c, d) = Σ_e p(e | c, d) m_g(e) m_f(a, e)
     - Messages can be reused.

     From Elimination to Message Passing (cont.)
     - Elimination ≡ message passing on a clique tree.
     - Another query ...
     [Figure: the same clique tree with the messages re-oriented toward the new query node]
     - Messages m_f and m_h are reused; the others need to be recomputed.
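The reuse observation can be captured by caching messages per directed edge of the clique tree. The sketch below is structural only: the clique labels and tree shape are made up (they only loosely mirror the example), and the message content is a placeholder rather than a real factor; the point is simply that a second query toward a different root recomputes only the messages whose direction changes.

```python
from functools import lru_cache

# Hypothetical clique tree: adjacency between cliques (labels are illustrative only).
neighbors = {
    'ABC': ['ACD'],
    'ACD': ['ABC', 'ACE'],
    'ACE': ['ACD', 'AEF', 'EG'],
    'AEF': ['ACE', 'EFH'],
    'EFH': ['AEF'],
    'EG':  ['ACE'],
}

computed = []  # record which messages were actually (re)computed

@lru_cache(maxsize=None)
def message(src, dst):
    """Message src -> dst: requires the messages from src's other neighbors."""
    computed.append((src, dst))
    for nbr in neighbors[src]:
        if nbr != dst:
            message(nbr, src)
    return f"m_{src}->{dst}"   # placeholder for the real factor

def query(root):
    """Collect messages from all neighbors of the query clique."""
    for nbr in neighbors[root]:
        message(nbr, root)

query('ABC'); n1 = len(computed)
query('EFH'); n2 = len(computed) - n1
print(n1, "messages for the first query,", n2, "new ones for the second")  # 5 and 4 here
```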

  4. The Junction Tree Algorithm
     - Recall: elimination ≡ message passing on a clique tree.
     - Junction Tree Algorithm:
       - computing messages on a clique tree
       - a message-passing protocol on a clique tree
     - There are several inference algorithms; some of them operate directly on (special) directed graphs:
       - the forward-backward algorithm for HMMs (we will see it later)
       - the peeling algorithm for trees and phylogenies
     - The junction tree algorithm is the most popular and general inference algorithm; it operates on an undirected graph.
     - To understand the JT algorithm, we need to understand how to compile a directed graph into an undirected graph.

     Moral Graph
     - Note that for both directed GMs and undirected GMs the joint probability is in a product form:
         BN:   P(X) = Π_i P(X_i | X_{π_i})
         MRF:  P(X) = (1/Z) Π_{c ∈ C} ψ_c(X_c)
     - So let's convert the local conditional probabilities into potentials; the second expression then becomes generic. But how does this operation affect the directed graph?
     - We can think of a conditional probability, e.g. P(C | A, B), as a function of the three variables A, B, and C (we get a real number for each configuration): Ψ(A, B, C) = P(C | A, B).
     - Problem: a node and its parents are not generally in the same clique in a BN.
     - Solution: "marry" the parents to obtain the moral graph (a code sketch follows below).
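As a rough illustration of moralization, here is a minimal sketch. It assumes the DAG is given as a dictionary mapping each node to its list of parents (a representation chosen for illustration, not from the lecture); the example network is the six-node graph that appears on the next slide.

```python
from itertools import combinations

def moralize(parents):
    """Return the edge set of the moral graph of a DAG given as {node: [parents]}."""
    edges = set()
    for child, pa in parents.items():
        # undirected version of each directed edge
        for p in pa:
            edges.add(frozenset((p, child)))
        # marry the parents: connect every pair of parents of the same child
        for p, q in combinations(pa, 2):
            edges.add(frozenset((p, q)))
    return edges

# The six-node example from the slides: X3 <- X1, X2;  X4, X5 <- X3;  X6 <- X4, X5
parents = {'X1': [], 'X2': [], 'X3': ['X1', 'X2'],
           'X4': ['X3'], 'X5': ['X3'], 'X6': ['X4', 'X5']}
moral = moralize(parents)
print(sorted(tuple(sorted(e)) for e in moral))
# adds the moralizing edges X1-X2 and X4-X5 on top of the undirected skeleton
```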

  5. Moral Graph (cont.)
     - Define the potential on a clique as the product over all conditional probabilities contained within the clique.
     - Now the product of potentials gives the right answer:
       [Figure: the six-node directed network X1, ..., X6 and its moral graph]
         P(X1, X2, X3, X4, X5, X6)
           = P(X1) P(X2) P(X3 | X1, X2) P(X4 | X3) P(X5 | X3) P(X6 | X4, X5)
           = ψ(X1, X2, X3) ψ(X3, X4, X5) ψ(X4, X5, X6)
       where
         ψ(X1, X2, X3) = P(X1) P(X2) P(X3 | X1, X2)
         ψ(X3, X4, X5) = P(X4 | X3) P(X5 | X3)
         ψ(X4, X5, X6) = P(X6 | X4, X5)
     - Note that here the interpretation of a potential is ambiguous: it can be either a marginal or a conditional.

     Clique Trees
     - A clique tree is an (undirected) tree of cliques.
       [Figure: the moral graph and its clique tree  (X1, X2, X3) -- (X3, X4, X5) -- (X4, X5, X6)]
     - Consider cases in which two neighboring cliques V and W have an overlap S (e.g., (X1, X2, X3) overlaps with (X3, X4, X5)):  ψ(V) -- φ(S) -- ψ(W)
     - Now we have an alternative representation of the joint in terms of the potentials (next slide).
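As a quick sanity check of this clique-potential assignment, here is a minimal sketch on the six-node example. The CPDs are random placeholder tables over binary variables (not values from the lecture); the check just confirms that multiplying the three clique potentials reproduces the full joint.

```python
import numpy as np

# Clique potentials as products of the CPDs assigned to them:
#   psi(X1,X2,X3) = P(X1) P(X2) P(X3|X1,X2)
#   psi(X3,X4,X5) = P(X4|X3) P(X5|X3)
#   psi(X4,X5,X6) = P(X6|X4,X5)

rng = np.random.default_rng(0)
def cpd(*shape):            # random conditional table, normalized over the last axis
    t = rng.random(shape)
    return t / t.sum(axis=-1, keepdims=True)

p1, p2 = cpd(2), cpd(2)                 # P(X1), P(X2)
p3 = cpd(2, 2, 2)                       # P(X3 | X1, X2)
p4, p5 = cpd(2, 2), cpd(2, 2)           # P(X4 | X3), P(X5 | X3)
p6 = cpd(2, 2, 2)                       # P(X6 | X4, X5)

psi_123 = np.einsum('a,b,abc->abc', p1, p2, p3)   # over (X1, X2, X3)
psi_345 = np.einsum('cd,ce->cde', p4, p5)         # over (X3, X4, X5)
psi_456 = p6                                      # over (X4, X5, X6)

joint_from_cliques = np.einsum('abc,cde,def->abcdef', psi_123, psi_345, psi_456)
joint_direct = np.einsum('a,b,abc,cd,ce,def->abcdef', p1, p2, p3, p4, p5, p6)
print(np.allclose(joint_from_cliques, joint_direct))   # True
```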

  6. Clique Trees (cont.)
     - A clique tree is an (undirected) tree of cliques.
       [Figure: the moral graph and the clique tree (X1, X2, X3) -- (X3, X4, X5) -- (X4, X5, X6), with separators X3 and (X4, X5)]
     - The alternative representation of the joint in terms of the potentials:
         P(X1, X2, X3, X4, X5, X6)
           = P(X1) P(X2) P(X3 | X1, X2) P(X4 | X3) P(X5 | X3) P(X6 | X4, X5)
           = P(X1, X2, X3) · [ P(X3, X4, X5) / P(X3) ] · [ P(X4, X5, X6) / P(X4, X5) ]
           = ψ(X1, X2, X3) ψ(X3, X4, X5) ψ(X4, X5, X6) / ( φ(X3) φ(X4, X5) )
     - Now each potential is isomorphic to the cluster marginal of the attendant set of variables.
     - Generally:   P(X) = Π_C ψ_C(X_C) / Π_S φ_S(X_S)

     Why Is This Useful?
     - Propagation of probabilities: now suppose that some evidence has been "absorbed" (i.e., certain values of some nodes have been observed). How do we propagate this effect to the rest of the graph?
     - What do we mean by "propagate"? Can we adjust all the potentials {ψ}, {φ} so that they still represent the correct cluster marginals (or unnormalized equivalents) of their respective attendant variables?
     - Utility: queries become local operations, e.g.
         P(X1 | X6 = x6) = Σ_{X2, X3} ψ(X1, X2, X3)
         P(X3 | X6 = x6) = φ(X3)
         P(x6) = Σ_{X4, X5} ψ(X4, X5, x6)
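Here is a minimal numerical check of the general formula on the same six-node example, again with random placeholder CPDs over binary variables: taking each ψ_C to be the true cluster marginal and each φ_S the true separator marginal, the ratio of products reproduces the joint.

```python
import numpy as np

# Check  P(X) = prod_C psi_C(X_C) / prod_S phi_S(X_S)  when the clique
# potentials are the cluster marginals and the separator potentials are the
# separator marginals (random placeholder CPDs, purely for illustration).

rng = np.random.default_rng(1)
def cpd(*shape):
    t = rng.random(shape); return t / t.sum(axis=-1, keepdims=True)

p1, p2, p3 = cpd(2), cpd(2), cpd(2, 2, 2)
p4, p5, p6 = cpd(2, 2), cpd(2, 2), cpd(2, 2, 2)
joint = np.einsum('a,b,abc,cd,ce,def->abcdef', p1, p2, p3, p4, p5, p6)

# cluster marginals for the cliques and separator marginals
psi_123 = joint.sum(axis=(3, 4, 5))            # P(X1, X2, X3)
psi_345 = joint.sum(axis=(0, 1, 5))            # P(X3, X4, X5)
psi_456 = joint.sum(axis=(0, 1, 2))            # P(X4, X5, X6)
phi_3   = joint.sum(axis=(0, 1, 3, 4, 5))      # P(X3)
phi_45  = joint.sum(axis=(0, 1, 2, 5))         # P(X4, X5)

ratio = np.einsum('abc,cde,def->abcdef', psi_123, psi_345, psi_456) \
        / (phi_3[None, None, :, None, None, None] * phi_45[None, None, None, :, :, None])
print(np.allclose(ratio, joint))               # True: the clique-tree factorization holds
```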

  7. Local Consistency
     - We have two ways of obtaining P(S) from the chain ψ(V) -- φ(S) -- ψ(W):
         P(S) = Σ_{V \ S} ψ(V)     and     P(S) = Σ_{W \ S} ψ(W)
       and they must be the same.
     - The following update rule ensures this:
       - Forward update:    φ*_S = Σ_{V \ S} ψ_V ;      ψ*_W = (φ*_S / φ_S) ψ_W
       - Backward update:   φ**_S = Σ_{W \ S} ψ*_W ;    ψ*_V = (φ**_S / φ*_S) ψ_V
     - Two important identities can be proven:
         Σ_{V \ S} ψ*_V = φ**_S = Σ_{W \ S} ψ*_W        (local consistency)
         ψ*_V ψ*_W / φ**_S = ψ_V ψ_W / φ_S              (invariant joint)

     Message Passing Algorithm
     - This simple local message-passing algorithm on a clique tree defines the general probability propagation algorithm for directed graphs!
     - Many interesting algorithms are special cases:
       - the forward-backward algorithm for hidden Markov models
       - Kalman filter updates
       - peeling algorithms for probabilistic trees
     - The algorithm seems reasonable. Is it correct?
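To see the update rule and the two identities in action, here is a minimal sketch for a single pair of neighboring cliques V = (X1, X2, X3) and W = (X3, X4, X5) with separator S = {X3}, using random placeholder CPDs over binary variables. The initial potentials come from assigning each CPD to one clique, and the separator potential starts at 1.

```python
import numpy as np

rng = np.random.default_rng(2)
def cpd(*shape):
    t = rng.random(shape); return t / t.sum(axis=-1, keepdims=True)

p1, p2, p3 = cpd(2), cpd(2), cpd(2, 2, 2)      # P(X1), P(X2), P(X3|X1,X2)
p4, p5 = cpd(2, 2), cpd(2, 2)                  # P(X4|X3), P(X5|X3)

psi_V = np.einsum('a,b,abc->abc', p1, p2, p3)  # over (X1, X2, X3)
psi_W = np.einsum('cd,ce->cde', p4, p5)        # over (X3, X4, X5)
phi_S = np.ones(2)                             # over X3

# Forward update:  phi*_S = sum_{V\S} psi_V ;  psi*_W = (phi*_S / phi_S) psi_W
phi_S_star = psi_V.sum(axis=(0, 1))
psi_W_star = psi_W * (phi_S_star / phi_S)[:, None, None]

# Backward update: phi**_S = sum_{W\S} psi*_W ;  psi*_V = (phi**_S / phi*_S) psi_V
phi_S_star2 = psi_W_star.sum(axis=(1, 2))
psi_V_star = psi_V * (phi_S_star2 / phi_S_star)[None, None, :]

# Local consistency: both cliques now agree on the separator marginal
print(np.allclose(psi_V_star.sum(axis=(0, 1)), psi_W_star.sum(axis=(1, 2))))   # True

# Invariant joint: psi*_V psi*_W / phi**_S == psi_V psi_W / phi_S
before = np.einsum('abc,cde->abcde', psi_V, psi_W) / phi_S[None, None, :, None, None]
after  = np.einsum('abc,cde->abcde', psi_V_star, psi_W_star) / phi_S_star2[None, None, :, None, None]
print(np.allclose(before, after))                                              # True
```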

  8. A Problem
     - Consider the following graph and a corresponding clique tree:
       [Figure: the four-cycle A - B, A - C, B - D, C - D and the clique tree (A, B) -- (B, D) -- (C, D) -- (A, C)]
     - Note that C appears in two non-neighboring cliques.
     - Question: with the previous message passing, can we ensure that the probabilities associated with C in these two (non-neighboring) cliques are consistent?
     - Answer: No. It is not true that, in general, local consistency implies global consistency.
     - What else do we need to get such a guarantee?

     Triangulation
     - A triangulated graph is one that contains no chordless cycle of four or more nodes.
     - We triangulate a graph by adding chords (here, the edge B - C); now we no longer have the global inconsistency problem.
     - A clique tree for a triangulated graph has the running intersection property: if a node appears in two cliques, it appears everywhere on the path between the cliques.
       [Figure: clique tree (A, B, C) -- (B, C, D) for the triangulated graph]
     - Thus local consistency implies global consistency.
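One common way to triangulate is by node elimination with fill-in edges: before removing a node, connect all of its not-yet-eliminated neighbors. The sketch below applies this to the four-cycle above; the elimination ordering is an arbitrary choice made for illustration (different orderings can add different chords).

```python
from itertools import combinations

def triangulate(edges, order):
    """Return the fill-in edges (chords) induced by eliminating nodes in `order`."""
    adj = {v: set() for v in order}
    for u, v in edges:
        adj[u].add(v); adj[v].add(u)
    fill, eliminated = set(), set()
    for v in order:
        nbrs = [u for u in adj[v] if u not in eliminated]
        for a, b in combinations(nbrs, 2):
            if b not in adj[a]:                 # missing edge between two neighbors
                adj[a].add(b); adj[b].add(a)
                fill.add(frozenset((a, b)))
        eliminated.add(v)
    return fill

# The four-cycle from the slide: A-B, A-C, B-D, C-D
edges = [('A', 'B'), ('A', 'C'), ('B', 'D'), ('C', 'D')]
print(triangulate(edges, order=['A', 'B', 'C', 'D']))
# eliminating A first connects its neighbors B and C -> the chord B-C is added
```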
