
School of Computer Science

[Figure: an example graphical model of a signal-transduction pathway (Receptors A and B, Kinases C, D, E, TF F, Genes G and H) represented as variables X1 through X8]

Junction Tree Algorithm

and a case study of the

Hidden Markov Model

Probabilistic Graphical Models (10-708)

Lecture 6, Oct 3, 2007

Eric Xing

Reading: J-Chap 12, 17, KF-Chap. 10


Outline

So far we have studied exact inference in:

  • Trees: message passing on the original graph (which is a tree)
  • Poly-trees and tree-like graphs: message passing on factor trees

Now we will look into exact inference in arbitrary graphs

  • Junction-Tree algorithm

Inference in the Hidden Markov Model


Elimination Clique

Recall that the induced dependencies created during marginalization are captured in elimination cliques:

  • Summation <-> elimination
  • Intermediate term <-> elimination clique
  • Can this lead to a generic inference algorithm?

[Figure: the elimination cliques produced by running elimination on the example graph over nodes A through H]

A Clique Tree

[Figure: the elimination cliques of the example graph arranged as a clique tree, with messages m_b, m_c, m_d, m_e, m_f, m_g, m_h passed between neighboring cliques]

    m_e(a,c,d) = Σ_e p(e|c,d) m_g(e) m_f(a,e)


From Elimination to Message Passing

  • Elimination ≡ message passing on a clique tree
  • Messages can be reused

[Figure: graph elimination on the example graph over nodes A through H, and the corresponding messages m_b, ..., m_h on the clique tree]

    m_e(a,c,d) = Σ_e p(e|c,d) m_g(e) m_f(a,e)

From Elimination to Message Passing

  • Elimination ≡ message passing on a clique tree
  • Another query ...
  • Messages m_f and m_h are reused; the others need to be recomputed

[Figure: the same clique tree with the messages redirected toward a different query node]

The Junction Tree Algorithm

  • Recall: Elimination ≡ message passing on a clique tree
  • Junction Tree Algorithm:
  • computing messages on a clique tree
  • message passing protocol on a clique tree
  • There are several inference algorithms; some of them operate directly on (special) directed graphs
  • Forward-backward algorithm for HMMs (we will see it later)
  • Peeling algorithm for trees and phylogenies
  • The junction tree algorithm is the most popular and general inference algorithm; it operates on an undirected graph
  • To understand the JT algorithm, we need to understand how to compile a directed graph into an undirected graph


Moral Graph

  • Note that for both directed GMs and undirected GMs, the joint probability is in a product form:

        BN:   P(X) = Π_{i=1..d} P(X_i | X_{π_i})          MRF:   P(X) = (1/Z) Π_{c ∈ C} ψ_c(X_c)

  • So let's convert local conditional probabilities into potentials; then the second expression will be generic. But how does this operation affect the directed graph?
  • We can think of a conditional probability, e.g., P(C|A,B), as a function of the three variables A, B, and C (we get a real number for each configuration):

        ψ(A,B,C) = P(C|A,B)

  • Problem: a node and its parents are not generally in the same clique in a BN
  • Solution: Marry the parents to obtain the "moral graph"

[Figure: the BN fragment A → C ← B and its moralized counterpart, in which A and B are connected so that {A, B, C} forms a clique]
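As a concrete illustration of this compilation step, here is a minimal Python sketch of the "marry the parents" operation (the dictionary encoding and the function name moralize are illustrative choices, not from the lecture):

    from itertools import combinations

    def moralize(parents):
        """Moralize a DAG: connect ("marry") the parents of each node, drop directions."""
        # parents maps each node to the list of its parents
        edges = set()
        for child, pa in parents.items():
            for p in pa:                          # keep every parent-child edge, undirected
                edges.add(frozenset((p, child)))
            for u, v in combinations(pa, 2):      # marry co-parents: {node} plus parents becomes a clique
                edges.add(frozenset((u, v)))
        return edges

    # The small example from the slide: P(C|A,B) becomes a potential on the clique {A, B, C}
    bn = {"A": [], "B": [], "C": ["A", "B"]}
    print(sorted(tuple(sorted(e)) for e in moralize(bn)))
    # [('A', 'B'), ('A', 'C'), ('B', 'C')]  --  A and B are now married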


Moral Graph (cont.)

  • Define the potential on a clique as the product over all conditional

probabilities contained within the clique

  • Now the product of potentials gives the right answer:

    P(X1,X2,X3,X4,X5,X6) = P(X1) P(X2) P(X3|X1,X2) P(X4|X3) P(X5|X3) P(X6|X4,X5)
                         = ψ(X1,X2,X3) ψ(X3,X4,X5) ψ(X4,X5,X6)

    where
        ψ(X1,X2,X3) = P(X1) P(X2) P(X3|X1,X2)
        ψ(X3,X4,X5) = P(X4|X3) P(X5|X3)
        ψ(X4,X5,X6) = P(X6|X4,X5)

[Figure: the directed graph over X1 through X6 and its moral graph]

Note that here the interpretation of a potential is ambivalent: it can be either marginals or conditionals.


Clique trees

  • A clique tree is an (undirected) tree of cliques
  • Consider cases in which two neighboring cliques V and W have an overlap S (e.g., (X1, X2, X3) overlaps with (X3, X4, X5))
  • Now we have an alternative representation of the joint in terms of the potentials:

[Figure: the moral graph over X1 through X6 and its clique tree (X1,X2,X3) - (X3,X4,X5) - (X4,X5,X6) with separators X3 and (X4,X5); two neighboring cliques V and W with separator S carry potentials ψ(V), ψ(W), and φ(S)]


Clique trees

  • A clique tree is an (undirected) tree of cliques
  • The alternative representation of the joint in terms of the potentials:

        P(X1,X2,X3,X4,X5,X6) = P(X1) P(X2) P(X3|X1,X2) P(X4|X3) P(X5|X3) P(X6|X4,X5)
                             = P(X1,X2,X3) P(X3,X4,X5) P(X4,X5,X6) / [ P(X3) P(X4,X5) ]
                             = ψ(X1,X2,X3) ψ(X3,X4,X5) ψ(X4,X5,X6) / [ φ(X3) φ(X4,X5) ]

  • Generally:

        P(X) = Π_C ψ(X_C) / Π_S φ(X_S)

Now each potential is isomorphic to the cluster marginal of the attendant set of variables.


Why is this useful?

Propagation of probabilities

  • Now suppose that some evidence has been "absorbed" (i.e., certain values of

some nodes have been observed). How do we propagate this effect to the rest of the graph?

  • What do we mean by propagate?

Can we adjust all the potentials {ψ}, {φ} so that they still represent the correct cluster marginals (or unnormalized equivalents) of their respective attendant variables?

  • Utility?

[Figure: the graph over X1 through X6 and its clique tree (X1,X2,X3) - (X3,X4,X5) - (X4,X5,X6) with separators X3 and (X4,X5)]

Local operations! With X6 = x̄6 observed:

    P(X1 | X6 = x̄6) = Σ_{X2,X3} ψ(X1,X2,X3)
    P(X3 | X6 = x̄6) = φ(X3)
    P(x̄6) = Σ_{X4,X5} ψ(X4,X5,x̄6)


Local Consistency

  • We have two ways of obtaining p(S),

        P(S) = Σ_{V\S} ψ(V)    and    P(S) = Σ_{W\S} ψ(W),

    and they must be the same

[Figure: two neighboring cliques V and W with separator S and potentials ψ(V), ψ(W), φ(S)]

  • The following update rule ensures this:
  • Forward update:

        φ*(S) = Σ_{V\S} ψ(V)
        ψ*(W) = ( φ*(S) / φ(S) ) ψ(W)

  • Backward update:

        φ**(S) = Σ_{W\S} ψ*(W)
        ψ**(V) = ( φ**(S) / φ*(S) ) ψ(V)

  • Two important identities can be proven:

    Local consistency:

        Σ_{V\S} ψ**(V) = φ**(S) = Σ_{W\S} ψ*(W)

    Invariant joint:

        ψ(V) ψ(W) / φ(S) = ψ*(V) ψ*(W) / φ*(S) = ψ**(V) ψ**(W) / φ**(S)
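To make these updates concrete, here is a small numpy sketch of one forward/backward exchange between two table potentials ψ(A,B) and ψ(B,C) with separator S = B (the variable names, table sizes, and random initialization are illustrative assumptions, not the lecture's example):

    import numpy as np

    psi_V = np.random.rand(2, 3)       # ψ(V) = ψ(A, B)
    psi_W = np.random.rand(3, 4)       # ψ(W) = ψ(B, C)
    phi_S = np.ones(3)                 # φ(S) = φ(B), initialized to 1

    # Forward update: V -> W
    phi_S_star = psi_V.sum(axis=0)                             # φ*(S)  = Σ_{V\S} ψ(V)
    psi_W_star = (phi_S_star / phi_S)[:, None] * psi_W         # ψ*(W)  = (φ*/φ) ψ(W)

    # Backward update: W -> V
    phi_S_2star = psi_W_star.sum(axis=1)                       # φ**(S) = Σ_{W\S} ψ*(W)
    psi_V_star = psi_V * (phi_S_2star / phi_S_star)[None, :]   # ψ**(V) = (φ**/φ*) ψ(V)

    # Local consistency: both cliques now agree on the separator marginal
    assert np.allclose(psi_V_star.sum(axis=0), psi_W_star.sum(axis=1))

    # Invariant joint: ψ(V) ψ(W) / φ(S) is unchanged by the updates
    before = psi_V[:, :, None] * psi_W[None, :, :] / phi_S[None, :, None]
    after = psi_V_star[:, :, None] * psi_W_star[None, :, :] / phi_S_2star[None, :, None]
    assert np.allclose(before, after)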


Message Passing Algorithm

This simple local message-passing algorithm on a clique tree

defines the general probability propagation algorithm for directed graphs!

  • Many interesting algorithms are special cases:
  • Forward-backward algorithm for hidden Markov models,
  • Kalman filter updates
  • Peeling algorithms for probabilistic trees
  • The algorithm seems reasonable. Is it correct?

[Figure: two neighboring cliques V and W with separator S and potentials ψ(V), ψ(W), φ(S)]

    φ*(S) = Σ_{V\S} ψ(V),       ψ*(W) = ( φ*(S) / φ(S) ) ψ(W)
    φ**(S) = Σ_{W\S} ψ*(W),     ψ**(V) = ( φ**(S) / φ*(S) ) ψ(V)


A problem

Consider the following graph and a corresponding clique tree

  • Note that C appears in two non-neighboring cliques

Question: with the previous message passing, can we ensure that the probabilities associated with C in these two (non-neighboring) cliques are consistent?

Answer: No. It is not true that in general local consistency implies global consistency.

What else do we need to get such a guarantee?

[Figure: a graph with a 4-cycle over A, B, C, D and a corresponding clique tree with cliques (A,B), (B,D), (A,C), (C,D); C appears in two non-neighboring cliques]


Triangulation

  • A triangulated graph is one in which no cycle of four or more nodes exists without a chord

  • We triangulate a graph by adding chords:
  • Now we no longer have our global inconsistency

problem.

  • A clique tree for a triangulated graph has the running

intersection property: If a node appears in two cliques, it appears everywhere on the path between the cliques

  • Thus local consistency implies global consistency

[Figure: the 4-cycle over A, B, C, D; after adding the chord B-C the graph is triangulated, and its clique tree has cliques (A,B,C) and (B,C,D)]


Junction trees

  • A clique tree for a triangulated graph is referred to as a junction tree
  • In junction trees, local consistency implies global consistency. Thus the local message-passing algorithm is (provably) correct
  • It is also possible to show that only triangulated graphs have the property that their clique trees are junction trees. Thus if we want local algorithms, we must triangulate
  • Are we now all set?
  • How to triangulate?
  • The complexity of building a JT depends on how we triangulate!!
  • Consider this network: it turns out that we will need to pay an O(2^4) or O(2^6) cost depending on how we triangulate!

[Figure: the example network over nodes A through H]


How to triangulate

[Figure: moralization of the example graph over nodes A through H, followed by graph elimination, showing the sequence of remaining graphs]

  • A graph elimination algorithm (a small sketch follows below)
  • Intermediate terms correspond to the cliques resulting from elimination
  • "Good" elimination orderings lead to small cliques and hence reduce complexity (what will happen if we eliminate "e" first in the above graph?)
  • Finding the optimum ordering is NP-hard, but for many graphs an optimum or near-optimum ordering can often be found heuristically
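A minimal sketch of triangulation by graph elimination (plain Python; the adjacency below is an assumed moralized version of the example over nodes a through h, chosen so that it reproduces the elimination cliques shown earlier, and the function name is illustrative):

    def eliminate(adj, order):
        """Triangulate an undirected graph by eliminating nodes in the given order.
        adj maps node -> set of neighbours; returns (fill-in edges, elimination cliques)."""
        adj = {v: set(nbrs) for v, nbrs in adj.items()}   # work on a copy
        fill, cliques = [], []
        for v in order:
            nbrs = adj[v]
            cliques.append({v} | nbrs)                    # the elimination clique of v
            for u in nbrs:                                # connect all remaining neighbours of v
                for w in nbrs:
                    if u != w and w not in adj[u]:
                        adj[u].add(w); adj[w].add(u)
                        fill.append((u, w))
            for u in nbrs:                                # remove v from the graph
                adj[u].discard(v)
            del adj[v]
        return fill, cliques

    # An assumed moral graph for the running example (edges are illustrative).
    moral = {"a": {"b", "c", "d", "f"}, "b": {"a", "c"}, "c": {"a", "b", "d", "e"},
             "d": {"a", "c", "e"}, "e": {"c", "d", "f", "g", "h"},
             "f": {"a", "e", "h"}, "g": {"e"}, "h": {"e", "f"}}
    _, cliques = eliminate(moral, list("hgfedcba"))
    print([sorted(c) for c in cliques])        # largest clique has 4 nodes -> O(2^4) tables
    _, cliques_bad = eliminate(moral, list("eabcdfgh"))
    print(max(len(c) for c in cliques_bad))    # eliminating "e" first gives a 6-node clique -> O(2^6)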


From Elimination to Message Passing

  • Our algorithm so far answers only one query (e.g., on one node); do we need to do a complete elimination for every such query?
  • Elimination ≡ message passing on a clique tree
  • Messages can be reused

[Figure: graph elimination on the example graph over nodes A through H, and the corresponding messages m_b, ..., m_h on the clique tree]

Recall this:

    φ*(S) = Σ_{V\S} ψ(V),       ψ*(W) = ( φ*(S) / φ(S) ) ψ(W)

    m_e(a,c,d) = Σ_e p(e|c,d) m_g(e) m_f(a,e)

From Elimination to Message Passing

  • Our algorithm so far answers only one query (e.g., on one node); do we need to do a complete elimination for every such query?
  • Elimination ≡ message passing on a clique tree
  • Another query ...
  • Messages m_f and m_h are reused; the others need to be recomputed

[Figure: the same clique tree with the messages redirected toward a different query node]

Message-passing algorithms

Message update

  • The Hugin update:

        φ*(S) = Σ_{V\S} ψ(V),       ψ*(W) = ( φ*(S) / φ(S) ) ψ(W)

  • The Shafer-Shenoy update:

        m_{i→j}(S_{ij}) = Σ_{C_i \ S_{ij}} ψ(C_i) Π_{k≠j} m_{k→i}(S_{ki})

[Figure: message passing on a clique tree proceeds in two phases: collect and distribute]


A Sketch of the Junction Tree Algorithm

  • The algorithm

1. Moralize the graph (trivial)
2. Triangulate the graph (good heuristics exist, but the problem is actually NP-hard)
3. Build a clique tree (e.g., using a maximum spanning tree algorithm; a small sketch follows below)
4. Propagation of probabilities --- a local message-passing protocol

  • Results in marginal probabilities of all cliques --- solves all queries

in a single run

  • A generic exact inference algorithm for any GM
  • Complexity: exponential in the size of the maximal clique --- a good elimination order often leads to a small maximal clique, and hence a good (i.e., thin) JT
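Step 3 can be made concrete with a tiny sketch: weight each pair of cliques by the size of their intersection and connect them with a maximum-weight spanning tree (a greedy Kruskal-style sketch; the clique list and helper names are illustrative, reusing the cliques produced by the elimination sketch above):

    def junction_tree(cliques):
        """Connect cliques by a maximum-weight spanning tree, where the weight of an
        edge is the size of the separator (the intersection of the two cliques)."""
        edges = sorted(((len(cliques[i] & cliques[j]), i, j)
                        for i in range(len(cliques)) for j in range(i + 1, len(cliques))),
                       reverse=True)                      # heaviest separators first
        parent = list(range(len(cliques)))                # union-find forest
        def find(i):
            while parent[i] != i:
                parent[i] = parent[parent[i]]
                i = parent[i]
            return i
        tree = []
        for w, i, j in edges:
            ri, rj = find(i), find(j)
            if ri != rj and w > 0:                        # add the edge if it joins two components
                parent[ri] = rj
                tree.append((i, j, cliques[i] & cliques[j]))   # remember the separator
        return tree

    cliques = [{"e", "f", "h"}, {"a", "e", "f"}, {"a", "c", "d", "e"}, {"a", "b", "c"}]
    for i, j, sep in junction_tree(cliques):
        print(sorted(cliques[i]), "--", sorted(sep), "--", sorted(cliques[j]))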


Summary

Junction tree data-structure for exact inference on general

graphs

Two methods

  • Shafer-Shenoy
  • Belief update, or Lauritzen-Spiegelhalter

Constructing a junction tree from chordal graphs

  • Maximum spanning tree approach


Case study:

Hidden Markov Model


Recall definition of HMM

  • Transition probabilities between any two states:

        p(y_t^j = 1 | y_{t-1}^i = 1) = a_{i,j}

    or, in general:

        p(y_t | y_{t-1}^i = 1) ~ Multinomial(a_{i,1}, ..., a_{i,M}),   ∀ i ∈ I

  • Start probabilities:

        p(y_1) ~ Multinomial(π_1, ..., π_M)

  • Emission probabilities associated with each state:

        p(x_t | y_t^i = 1) ~ Multinomial(b_{i,1}, ..., b_{i,K}),   ∀ i ∈ I

    or, in general:

        p(x_t | y_t^i = 1) ~ f(· | θ_i),   ∀ i ∈ I

[Figure: the HMM graphical model, a chain y_1 → y_2 → y_3 → ... → y_T with emissions x_1, x_2, x_3, ..., x_T]


The Dishonest Casino Model

[Figure: a two-state HMM with states FAIR and LOADED; each state stays with probability 0.95 and switches with probability 0.05]

Emission probabilities:

    P(1|F) = P(2|F) = P(3|F) = P(4|F) = P(5|F) = P(6|F) = 1/6
    P(1|L) = P(2|L) = P(3|L) = P(4|L) = P(5|L) = 1/10,   P(6|L) = 1/2

Transition probabilities:

    p(y_t^j = 1 | y_{t-1}^i = 1) = a_{i,j}
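For reference, the casino parameters written as arrays (a small numpy sketch; the start distribution π is not given on the slide and is assumed uniform here):

    import numpy as np

    states = ["FAIR", "LOADED"]
    # transition matrix a[i, j] = p(y_t = j | y_{t-1} = i)
    a = np.array([[0.95, 0.05],
                  [0.05, 0.95]])
    # emission matrix b[i, k] = p(roll = k+1 | state i), for die faces 1..6
    b = np.array([[1/6] * 6,
                  [0.1, 0.1, 0.1, 0.1, 0.1, 0.5]])
    # start distribution (assumed uniform; not specified on the slide)
    pi = np.array([0.5, 0.5])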


Typical structure of a gene

[Figure: the typical structure of a gene: promoter, 5'UTR, exons E0, E1, E2, ..., introns I0, I1, I2, ..., 3'UTR, poly-A site, and intergenic regions, on both the forward (+) and reverse (-) strands, shown next to a stretch of raw genomic DNA sequence]

GENSCAN (Burge & Karlin)

Transition probabilities:

    p(y_t^j = 1 | y_{t-1}^i = 1) = a_{i,j}


Probability of a parse

  • Given a sequence x = x_1 ... x_T and a parse y = y_1, ..., y_T
  • To find how likely the parse is (given our HMM and the sequence):

        p(x, y) = p(x_1 ... x_T, y_1, ..., y_T)                                   (joint probability)
                = p(y_1) p(x_1|y_1) p(y_2|y_1) p(x_2|y_2) ... p(y_T|y_{T-1}) p(x_T|y_T)
                = p(y_1, ..., y_T) p(x_1 ... x_T | y_1, ..., y_T)
                = π_{y_1} a_{y_1,y_2} ... a_{y_{T-1},y_T} b_{y_1,x_1} ... b_{y_T,x_T}

    where, in indicator notation,

        a_{y_t,y_{t+1}} := Π_{i,j} (a_{ij})^{y_t^i y_{t+1}^j},    π_{y_1} := Π_i (π_i)^{y_1^i},    b_{y_t,x_t} := Π_{i,k} (b_{ik})^{y_t^i x_t^k}

  • Marginal probability:

        p(x) = Σ_y p(x, y) = Σ_{y_1} Σ_{y_2} ... Σ_{y_T} π_{y_1} Π_{t=2}^T a_{y_{t-1},y_t} Π_{t=1}^T p(x_t|y_t)

  • Posterior probability:

        p(y|x) = p(x, y) / p(x)

[Figure: the HMM chain y_1, y_2, y_3, ..., y_T with emissions x_1, x_2, x_3, ..., x_T]
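The joint probability of a fixed parse is just this product; a short sketch using the casino parameters (the example roll and state sequences are made up for illustration):

    import numpy as np

    # casino parameters as in the earlier sketch (start distribution assumed uniform)
    a = np.array([[0.95, 0.05], [0.05, 0.95]])       # transitions
    b = np.array([[1/6] * 6, [0.1] * 5 + [0.5]])     # emissions, die faces 1..6
    pi = np.array([0.5, 0.5])                        # start probabilities

    def joint_prob(x, y):
        """p(x, y) = pi_{y1} b_{y1,x1} * prod_{t>1} a_{y_{t-1},y_t} b_{y_t,x_t}."""
        p = pi[y[0]] * b[y[0], x[0]]
        for t in range(1, len(x)):
            p *= a[y[t - 1], y[t]] * b[y[t], x[t]]
        return p

    # e.g. rolls 6, 6, 6, 2 (0-indexed faces) under states LOADED, LOADED, LOADED, FAIR
    print(joint_prob([5, 5, 5, 1], [1, 1, 1, 0]))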


Applications of HMMs

Some early applications of HMMs

  • finance, but we never saw them
  • speech recognition
  • modelling ion channels

In the mid-late 1980s HMMs entered genetics and molecular

biology, and they are now firmly entrenched.

Some current applications of HMMs to biology

  • mapping chromosomes
  • aligning biological sequences
  • predicting sequence structure
  • inferring evolutionary relationships
  • finding genes in DNA sequence

Three main questions on HMMs

  • 1. Evaluation

GIVEN an HMM M, and a sequence x, FIND Prob (x | M) ALGO. Forward

  • 2. Decoding

GIVEN an HMM M, and a sequence x , FIND the sequence y of states that maximizes, e.g., P(y | x , M),

  • or the most probable subsequence of states

ALGO. Viterbi, Forward-backward

  • 3. Learning (next lecture)

GIVEN an HMM M, with unspecified transition/emission probs., and a sequence x, FIND parameters θ = (πi, aij, ηik) that maximize P(x | θ) ALGO. Baum-Welch (EM)


The Forward Algorithm

  • We want to calculate P(x), the likelihood of x, given the HMM
  • Sum over all possible ways of generating x:

        p(x) = Σ_y p(x, y) = Σ_{y_1} Σ_{y_2} ... Σ_{y_T} π_{y_1} Π_{t=2}^T a_{y_{t-1},y_t} Π_{t=1}^T p(x_t|y_t)

  • To avoid summing over an exponential number of paths y, define

        α_t^k = P(x_1, ..., x_t, y_t^k = 1)    (the forward probability)

  • The recursion:

        α_t^k = p(x_t | y_t^k = 1) Σ_i α_{t-1}^i a_{i,k}

        P(x) = Σ_k α_T^k


The Forward Algorithm – derivation

Compute the forward probability:

    α_t^k = P(x_1, ..., x_{t-1}, x_t, y_t^k = 1)

          = Σ_{y_{t-1}} P(x_1, ..., x_{t-1}, y_{t-1}, x_t, y_t^k = 1)

          = Σ_{y_{t-1}} P(x_1, ..., x_{t-1}, y_{t-1}) P(y_t^k = 1 | x_1, ..., x_{t-1}, y_{t-1}) P(x_t | x_1, ..., x_{t-1}, y_{t-1}, y_t^k = 1)

          = Σ_{y_{t-1}} P(x_1, ..., x_{t-1}, y_{t-1}) P(y_t^k = 1 | y_{t-1}) P(x_t | y_t^k = 1)

          = P(x_t | y_t^k = 1) Σ_i P(x_1, ..., x_{t-1}, y_{t-1}^i = 1) P(y_t^k = 1 | y_{t-1}^i = 1)

          = P(x_t | y_t^k = 1) Σ_i α_{t-1}^i a_{i,k}

    (Chain rule: P(A, B, C) = P(A) P(B|A) P(C|A, B))

[Figure: the HMM chain around positions 1, ..., t-1, t]


Recall the Elimination and Message Passing Algorithm

  • Elimination ≡ message passing on a clique tree

[Figure: the clique tree for the example graph over nodes A through H with messages m_b, ..., m_h, and the HMM chain y_1, ..., y_T with emissions x_1, ..., x_T]

    m_e(a,c,d) = Σ_e p(e|c,d) m_g(e) m_f(a,e)

    α_t^k = p(x_t | y_t^k = 1) Σ_i α_{t-1}^i a_{i,k},       P(x) = Σ_k α_T^k


The Forward Algorithm

We can compute α_t^k for all k and t using dynamic programming!

Initialization:

    α_1^k = P(x_1, y_1^k = 1) = P(x_1 | y_1^k = 1) P(y_1^k = 1) = P(x_1 | y_1^k = 1) π_k

Iteration:

    α_t^k = P(x_t | y_t^k = 1) Σ_i α_{t-1}^i a_{i,k}

Termination:

    P(x) = Σ_k α_T^k
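A direct numpy transcription of these three steps (a sketch; a, b, pi are the transition, emission, and start arrays in the convention of the earlier casino sketch, and x is a 0-indexed observation sequence):

    import numpy as np

    def forward(x, pi, a, b):
        """alpha[t, k] = P(x_1..x_t, y_t = k); returns (alpha, P(x))."""
        T, K = len(x), len(pi)
        alpha = np.zeros((T, K))
        alpha[0] = b[:, x[0]] * pi                      # initialization: P(x_1 | k) pi_k
        for t in range(1, T):
            alpha[t] = b[:, x[t]] * (alpha[t - 1] @ a)  # iteration: P(x_t | k) Σ_i alpha_{t-1}^i a_{i,k}
        return alpha, alpha[-1].sum()                   # termination: P(x) = Σ_k alpha_T^k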


The Backward Algorithm

We want to compute P(y_t^k = 1 | x), the posterior probability distribution of the t-th position, given x

  • We start by computing

        P(y_t^k = 1, x) = P(x_1, ..., x_t, y_t^k = 1, x_{t+1}, ..., x_T)
                        = P(x_1, ..., x_t, y_t^k = 1) P(x_{t+1}, ..., x_T | x_1, ..., x_t, y_t^k = 1)
                        = P(x_1, ..., x_t, y_t^k = 1) P(x_{t+1}, ..., x_T | y_t^k = 1)
                        = α_t^k β_t^k

    Forward: α_t^k          Backward: β_t^k := P(x_{t+1}, ..., x_T | y_t^k = 1)

  • The recursion:

        β_t^k = Σ_i a_{k,i} p(x_{t+1} | y_{t+1}^i = 1) β_{t+1}^i

[Figure: the HMM chain around positions t, t+1, ..., T]


The Backward Algorithm – derivation

Define the backward probability:

    β_t^k = P(x_{t+1}, ..., x_T | y_t^k = 1)

          = Σ_{y_{t+1}} P(x_{t+1}, x_{t+2}, ..., x_T, y_{t+1} | y_t^k = 1)

          = Σ_i P(y_{t+1}^i = 1 | y_t^k = 1) p(x_{t+1} | y_{t+1}^i = 1, y_t^k = 1) P(x_{t+2}, ..., x_T | x_{t+1}, y_{t+1}^i = 1, y_t^k = 1)

          = Σ_i P(y_{t+1}^i = 1 | y_t^k = 1) p(x_{t+1} | y_{t+1}^i = 1) P(x_{t+2}, ..., x_T | y_{t+1}^i = 1)

          = Σ_i a_{k,i} p(x_{t+1} | y_{t+1}^i = 1) β_{t+1}^i

    (Chain rule: P(A, B, C | α) = P(A | α) P(B | A, α) P(C | A, B, α))

[Figure: the HMM chain around positions t, t+1, ..., T]


The Backward Algorithm

We can compute β_t^k for all k and t using dynamic programming!

Initialization:

    β_T^k = 1,  ∀ k

Iteration:

    β_t^k = Σ_i a_{k,i} P(x_{t+1} | y_{t+1}^i = 1) β_{t+1}^i

Termination:

    P(x) = Σ_k α_1^k β_1^k
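The corresponding backward pass, under the same conventions as the forward sketch above:

    import numpy as np

    def backward(x, pi, a, b):
        """beta[t, k] = P(x_{t+1}..x_T | y_t = k); returns (beta, P(x))."""
        T, K = len(x), len(pi)
        beta = np.zeros((T, K))
        beta[-1] = 1.0                                    # initialization: beta_T^k = 1
        for t in range(T - 2, -1, -1):
            beta[t] = a @ (b[:, x[t + 1]] * beta[t + 1])  # iteration: Σ_i a_{k,i} P(x_{t+1} | i) beta_{t+1}^i
        return beta, np.sum(pi * b[:, x[0]] * beta[0])    # termination: P(x) = Σ_k alpha_1^k beta_1^k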


Shafer-Shenoy for HMMs

Recap: Shafer-Shenoy algorithm

  • Message from clique i to clique j:

        μ_{i→j}(S_{ij}) = Σ_{C_i \ S_{ij}} ψ(C_i) Π_{k≠j} μ_{k→i}(S_{ki})

  • Clique marginal:

        p(C_i) ∝ ψ(C_i) Π_k μ_{k→i}(S_{ki})


Message Passing for HMMs (cont.)

A junction tree for the HMM:

[Figure: the HMM junction tree: cliques (y_1,x_1), (y_1,y_2), (y_2,y_3), ..., (y_{T-1},y_T), with an emission clique (y_t,x_t) attached at each step; the clique potentials are ψ(y_1,x_1), ψ(y_t,y_{t+1}), ψ(y_t,x_t), and the separator potentials are φ(y_1), φ(y_2), ..., ζ(y_2), ζ(y_3), ..., ζ(y_T)]

Rightward pass:

    μ_{t→t+1}(y_{t+1}) = Σ_{y_t} ψ(y_t, y_{t+1}) μ_{t-1→t}(y_t) μ_↑(y_{t+1})
                       = Σ_{y_t} p(y_{t+1} | y_t) p(x_{t+1} | y_{t+1}) μ_{t-1→t}(y_t)
                       = p(x_{t+1} | y_{t+1}) Σ_{y_t} a_{y_t, y_{t+1}} μ_{t-1→t}(y_t)

  • This is exactly the forward algorithm!

Leftward pass ...

    μ_{t←t+1}(y_t) = Σ_{y_{t+1}} ψ(y_t, y_{t+1}) μ_{t+1←t+2}(y_{t+1}) μ_↑(y_{t+1})
                   = Σ_{y_{t+1}} p(y_{t+1} | y_t) μ_{t+1←t+2}(y_{t+1}) p(x_{t+1} | y_{t+1})

  • This is exactly the backward algorithm!


Posterior decoding

We can now calculate

    P(y_t^k = 1 | x) = P(y_t^k = 1, x) / P(x) = α_t^k β_t^k / P(x)

Then, we can ask:

  • What is the most likely state at position t of sequence x:

        k_t^* = argmax_k P(y_t^k = 1 | x)

  • Note that this is an MPA of a single hidden state; what if we want an MPA of the whole hidden state sequence?
  • Posterior Decoding:

        { y_t^{k_t^*} = 1 : t = 1, ..., T }

  • This is different from the MPA of a whole sequence of hidden states
  • This can be understood as bit error rate vs. word error rate

Example: MPA of X? MPA of (X, Y)?

    x   y   P(x,y)
    0   0   0.35
    0   1   0.05
    1   0   0.30
    1   1   0.30
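Combining the two passes gives posterior decoding in a few lines (a sketch that reuses the forward and backward functions from the earlier sketches):

    import numpy as np

    def posterior_decode(x, pi, a, b):
        """Return gamma[t, k] = P(y_t = k | x) and the per-position argmax states."""
        alpha, px = forward(x, pi, a, b)
        beta, _ = backward(x, pi, a, b)
        gamma = alpha * beta / px           # P(y_t = k | x) = alpha_t^k beta_t^k / P(x)
        return gamma, gamma.argmax(axis=1)  # most likely single state at each position t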


Viterbi decoding

  • GIVEN x = x_1, ..., x_T, we want to find y = y_1, ..., y_T such that P(y|x) is maximized:

        y* = argmax_y P(y|x) = argmax_y P(y, x)

  • Let

        V_t^k = max_{y_1,...,y_{t-1}} P(x_1, ..., x_{t-1}, y_1, ..., y_{t-1}, x_t, y_t^k = 1)
              = probability of the most likely sequence of states ending at state y_t = k

    where

        p(x_1, ..., x_T, y_1, ..., y_T) = π_{y_1} a_{y_1,y_2} ... a_{y_{T-1},y_T} b_{y_1,x_1} ... b_{y_T,x_T}

  • The recursion:

        V_t^k = p(x_t | y_t^k = 1) max_i a_{i,k} V_{t-1}^i

  • Underflows are a significant problem: these numbers become extremely small
  • Solution: take the logs of all values:

        V_t^k = log p(x_t | y_t^k = 1) + max_i ( log a_{i,k} + V_{t-1}^i )

[Figure: the Viterbi trellis of K states over positions x_1, x_2, x_3, ..., x_N, with entries V_t^k]


The Viterbi Algorithm – derivation

Define the Viterbi probability:

    V_{t+1}^k = max_{y_1,...,y_t} P(x_1, ..., x_t, y_1, ..., y_t, x_{t+1}, y_{t+1}^k = 1)

              = max_{y_1,...,y_t} P(x_{t+1}, y_{t+1}^k = 1 | x_1, ..., x_t, y_1, ..., y_t) P(x_1, ..., x_t, y_1, ..., y_t)

              = max_{y_1,...,y_t} P(x_{t+1} | y_{t+1}^k = 1) P(y_{t+1}^k = 1 | y_t) P(x_1, ..., x_{t-1}, y_1, ..., y_{t-1}, x_t, y_t)

              = P(x_{t+1} | y_{t+1}^k = 1) max_i a_{i,k} max_{y_1,...,y_{t-1}} P(x_1, ..., x_{t-1}, y_1, ..., y_{t-1}, x_t, y_t^i = 1)

              = P(x_{t+1} | y_{t+1}^k = 1) max_i a_{i,k} V_t^i


The Viterbi Algorithm

  • Input: x = x1, …, xT,

Initialization:

    V_1^k = P(x_1 | y_1^k = 1) π_k

Iteration:

    V_t^k = P(x_t | y_t^k = 1) max_i a_{i,k} V_{t-1}^i
    Ptr(k, t) = argmax_i a_{i,k} V_{t-1}^i

Termination:

    P(x, y*) = max_k V_T^k

TraceBack:

    y_T^* = argmax_k V_T^k,       y_{t-1}^* = Ptr(y_t^*, t)
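A log-space sketch of this recursion and traceback (same array conventions as the earlier sketches; logs are taken once up front to avoid underflow):

    import numpy as np

    def viterbi(x, pi, a, b):
        """Return the most probable state path y* and log P(x, y*)."""
        T, K = len(x), len(pi)
        la, lb, lpi = np.log(a), np.log(b), np.log(pi)
        V = np.zeros((T, K))
        ptr = np.zeros((T, K), dtype=int)
        V[0] = lb[:, x[0]] + lpi                        # initialization
        for t in range(1, T):
            scores = V[t - 1][:, None] + la             # scores[i, k] = V_{t-1}^i + log a_{i,k}
            ptr[t] = scores.argmax(axis=0)              # Ptr(k, t) = argmax_i
            V[t] = lb[:, x[t]] + scores.max(axis=0)     # iteration, in log space
        y = [int(V[-1].argmax())]                       # termination: argmax_k V_T^k
        for t in range(T - 1, 0, -1):                   # traceback
            y.append(int(ptr[t][y[-1]]))
        return y[::-1], float(V[-1].max())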


Computational Complexity and implementation details

What is the running time, and space required, for Forward and Backward?

    Time: O(K^2 N);   Space: O(KN).

Useful implementation technique to avoid underflows

  • Viterbi: sum of logs
  • Forward/Backward: rescaling at each position by multiplying by a constant

The three recursions, for reference:

    α_t^k = p(x_t | y_t^k = 1) Σ_i α_{t-1}^i a_{i,k}
    β_t^k = Σ_i a_{k,i} p(x_{t+1} | y_{t+1}^i = 1) β_{t+1}^i
    V_t^k = p(x_t | y_t^k = 1) max_i a_{i,k} V_{t-1}^i
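The rescaling idea as a sketch: normalize α_t at each position and keep the scaling constants, so that log P(x) is the sum of their logs (same conventions as the earlier sketches):

    import numpy as np

    def forward_scaled(x, pi, a, b):
        """Scaled forward pass: returns the normalized alphas and log P(x)."""
        T, K = len(x), len(pi)
        alpha = np.zeros((T, K))
        c = np.zeros(T)                                  # per-position scaling constants
        alpha[0] = b[:, x[0]] * pi
        c[0] = alpha[0].sum()
        alpha[0] /= c[0]
        for t in range(1, T):
            alpha[t] = b[:, x[t]] * (alpha[t - 1] @ a)
            c[t] = alpha[t].sum()                        # rescale by a constant at each position
            alpha[t] /= c[t]
        return alpha, np.log(c).sum()                    # log P(x) = Σ_t log c_t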