The Junction Tree Algorithm

Chris Williams
School of Informatics, University of Edinburgh
October 2009
Based on slides by David Barber

Why the Junction Tree Algorithm?

The JTA is a general-purpose algorithm for computing (conditional) marginals on graphs. It does this by creating a tree of cliques and carrying out a message-passing procedure on this tree.

The best thing about a general-purpose algorithm is that there is no longer any need to publish a separate paper explaining how to deal with each new model: the JTA generalises nearly all the popular previous special-case algorithms.

Reading: Jordan chapter 17

Overview

Clique Potential Representation
Constructing a Junction Tree
    Moralization
    Triangulation
    Assembling cliques into a junction tree
Message Passing
Introducing Evidence
Propagation on a Junction Tree

Clique Potential Representation

Observe that for both directed and undirected graphs, the joint probability is in a product form. We can interpret the CPTs in directed graphs as potential functions. The basic idea is to represent the probability distribution corresponding to any graph as a product of clique potentials:

$$p(x) = \frac{1}{Z} \prod_C \Psi_C(x_C)$$

where $x_C$ is the set of variables corresponding to clique $C$. A clique is a fully-connected subset of nodes in a graph.
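As a minimal sketch of the product form above, the following Python computes $Z$ and $p(x)$ by brute-force enumeration. The toy model (binary variables a, b, c with cliques {a, b} and {b, c}) and its potential values are hypothetical, chosen only for illustration; they do not come from the slides.

```python
from itertools import product

# Hypothetical toy model: binary variables a, b, c with cliques {a, b} and {b, c}.
variables = ["a", "b", "c"]
potentials = {
    ("a", "b"): {(0, 0): 2.0, (0, 1): 1.0, (1, 0): 1.0, (1, 1): 3.0},
    ("b", "c"): {(0, 0): 1.0, (0, 1): 4.0, (1, 0): 2.0, (1, 1): 1.0},
}


def unnormalised(assignment):
    """Product of clique potentials for one full assignment (dict: name -> value)."""
    value = 1.0
    for clique, table in potentials.items():
        value *= table[tuple(assignment[name] for name in clique)]
    return value


def all_assignments():
    """Yield every joint assignment of the binary variables."""
    for values in product([0, 1], repeat=len(variables)):
        yield dict(zip(variables, values))


# The normaliser Z sums the potential product over all joint states.
Z = sum(unnormalised(x) for x in all_assignments())


def p(assignment):
    """Normalised probability p(x) = (1/Z) * prod_C Psi_C(x_C)."""
    return unnormalised(assignment) / Z
```

Enumeration is exponential in the number of variables, which is exactly the cost the junction tree construction is meant to avoid; the sketch is only meant to make the definition concrete.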

An example

[Figure: a DAG on nodes a, b, c, d, e, f, shown before and after the Moralization and Triangulation steps]

$$p(a,b,c,d,e,f) = p(a)\,p(b \mid a)\,p(c \mid a)\,p(d \mid b)\,p(e \mid c)\,p(f \mid b,e)$$

[Figure: the cliques {a,b,c}, {b,d}, {b,c,e}, {b,e,f}]

The clique potential representation is

$$p(a,b,c,d,e,f) = \Psi(a,b,c)\,\Psi(b,d)\,\Psi(b,c,e)\,\Psi(b,e,f)$$

A valid assignment of cluster potentials is

$$\Psi(a,b,c) = p(a)\,p(b \mid a)\,p(c \mid a), \quad \Psi(b,d) = p(d \mid b), \quad \Psi(b,c,e) = p(e \mid c), \quad \Psi(b,e,f) = p(f \mid b,e)$$

and $Z = 1$.

Clique Trees and Separators

A clique tree is an (undirected) tree of cliques.

[Figure: clique tree (a,b,c) - [c] - (c,d,e) - [d,e] - (d,e,f)]

Variables shared by neighbouring cliques are drawn in the separator sets in blue.

The potential representation of a clique tree is the product of the clique potentials, divided by the product of the separator potentials:

$$p(x) = \frac{\prod_C \Psi_C(x_C)}{\prod_S \Phi_S(x_S)}$$
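The claim that the potential assignment for the six-node example gives $Z = 1$ can be checked numerically. The sketch below assumes binary variables and uses made-up CPT values (the slides give none); only the way CPTs are grouped into clique potentials follows the text.

```python
from itertools import product


def bern(p_one, value):
    """Probability of a binary `value` when P(value = 1) = p_one."""
    return p_one if value == 1 else 1.0 - p_one


# Hypothetical CPTs for p(a) p(b|a) p(c|a) p(d|b) p(e|c) p(f|b,e).
def p_a(a):
    return bern(0.3, a)

def p_b(b, a):
    return bern(0.2 + 0.5 * a, b)

def p_c(c, a):
    return bern(0.6 - 0.3 * a, c)

def p_d(d, b):
    return bern(0.1 + 0.7 * b, d)

def p_e(e, c):
    return bern(0.5 + 0.2 * c, e)

def p_f(f, b, e):
    return bern(0.1 + 0.3 * b + 0.4 * e, f)


# Clique potentials, each absorbing the CPTs assigned to it in the text.
def psi_abc(a, b, c):
    return p_a(a) * p_b(b, a) * p_c(c, a)

def psi_bd(b, d):
    return p_d(d, b)

def psi_bce(b, c, e):
    return p_e(e, c)

def psi_bef(b, e, f):
    return p_f(f, b, e)


def clique_product(a, b, c, d, e, f):
    """Product of the four clique potentials for one joint state."""
    return psi_abc(a, b, c) * psi_bd(b, d) * psi_bce(b, c, e) * psi_bef(b, e, f)


# Summing the product over all 2**6 joint states gives the normaliser Z.
Z = sum(clique_product(*state) for state in product([0, 1], repeat=6))
```

Because every CPT is already normalised and each one is assigned to exactly one clique, the sum comes out as 1 up to floating-point rounding.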

Initially, all separator potentials are set to 1. After running the JTA, we will have

$$\Psi(x_C) = p(x_{\tilde{C}}, \bar{x}_E), \qquad \Phi(x_S) = p(x_{\tilde{S}}, \bar{x}_E)$$

where $\tilde{C}$ denotes those variables in $C$ that are not in $E$, and similarly for $\tilde{S}$.

Constructing a Junction Tree from a DAG

1. Moralize the graph
2. Triangulate the graph
3. Construct a junction tree

Moral Graphs

Let's represent the following DAG as a product of clique potentials:

[Figure: DAG A -> C <- B with CPTs p(a), p(b), p(c|a,b)]

After moralisation, we get the following undirected graph:

[Figure: undirected graphs on A, B, C, labelled with the potentials Ψ(a,c) Ψ(b,c) and Ψ(a,b,c)]

To ensure that a node and its parents are in the same clique, we have to marry the parents (moralisation).

A Moral Example to us all

[Figure: DAG on nodes A, B, C, D, E, F]

The product of clique potentials is

$$p(a,b,c,d,e,f) = \Psi(a,b,c)\,\Psi(c,d,e)\,\Psi(d,e,f)$$

where

$$\Psi(a,b,c) = p(a)\,p(b)\,p(c \mid a,b), \quad \Psi(c,d,e) = p(d \mid c)\,p(e \mid c), \quad \Psi(d,e,f) = p(f \mid d,e)$$
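Moralisation as described above (connect every pair of parents of a common child, then drop edge directions) can be sketched in a few lines of Python. The function name `moralize` and the parent-dictionary encoding are illustrative assumptions; the DAG used is the one implied by the factorisation $p(a)\,p(b)\,p(c \mid a,b)\,p(d \mid c)\,p(e \mid c)\,p(f \mid d,e)$.

```python
from itertools import combinations


def moralize(parents):
    """Return the undirected edge set of the moral graph.

    `parents` maps each node to the list of its parents in the DAG.
    """
    edges = set()
    for child, pars in parents.items():
        # Keep every original edge, without its direction.
        for parent in pars:
            edges.add(frozenset((parent, child)))
        # "Marry" every pair of parents that share this child.
        for u, v in combinations(sorted(pars), 2):
            edges.add(frozenset((u, v)))
    return edges


# DAG read off from the factorisation of the moral example.
dag = {
    "a": [],
    "b": [],
    "c": ["a", "b"],
    "d": ["c"],
    "e": ["c"],
    "f": ["d", "e"],
}
moral_edges = moralize(dag)
```

For this DAG the moral graph gains the two edges a-b and d-e, which is what makes {a,b,c} and {d,e,f} cliques.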

The need for triangulation

Consider the following graph and a corresponding clique tree:

[Figure: a four-node graph on A, B, C, D and a clique tree over the cliques {A,B}, {B,D}, {A,C}, {C,D}]

C appears in two non-neighbouring cliques. There is no guarantee that the marginals on C in these two cliques are equal, i.e. that

$$\sum_A \Psi(A, C) = \sum_D \Psi(C, D)$$

That is, local consistency does not necessarily imply global consistency. Triangulation provides a solution.

Triangulation

In a triangulated graph, all loops containing 4 or more nodes contain a chord:

[Figure: graphs on A, B, C, D illustrating chords, with cliques {A,B,C} and {B,C,D}]

One way to create a triangulated graph is via the elimination algorithm (see Jordan §3.2).

Constructing a Junction Tree

A clique tree is a junction tree if it has the following junction tree property: if a node appears in two cliques, it appears everywhere on the path between the cliques.

Not all clique trees are junction trees. In a junction tree, local consistency implies global consistency.

[Figure: a graph on nodes a, b, c, d, e and clique trees over the cliques {a,b,d}, {b,c,d}, {c,d,e}]

For every triangulated graph there exists a clique tree which obeys the junction tree property.

Theorem: A clique tree is a junction tree iff it is a maximal spanning tree, where the weight is given by the sum of the cardinalities of the separator sets.
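The two construction steps above can be sketched together: triangulation by node elimination, followed by a maximum-weight spanning tree over the resulting cliques, as the theorem suggests. This is a minimal sketch, not the procedure from Jordan's text: the function names, the fixed elimination order and the use of Kruskal's algorithm are assumptions made here, and `junction_tree` presumes its input cliques come from a triangulated graph.

```python
from itertools import combinations


def eliminate(adjacency, order):
    """Triangulate by eliminating nodes in `order`.

    Returns (fill_edges, maximal_cliques) for an undirected graph given as
    a dict mapping each node to the set of its neighbours.
    """
    adj = {v: set(ns) for v, ns in adjacency.items()}
    fill, cliques = [], []
    for v in order:
        nbrs = adj[v]
        # Connect all remaining neighbours of v (these are the fill-in chords).
        for u, w in combinations(sorted(nbrs), 2):
            if w not in adj[u]:
                adj[u].add(w)
                adj[w].add(u)
                fill.append((u, w))
        clique = frozenset(nbrs | {v})
        # Keep only cliques not contained in one found earlier.
        if not any(clique <= c for c in cliques):
            cliques.append(clique)
        for u in nbrs:
            adj[u].discard(v)
        del adj[v]
    return fill, cliques


def junction_tree(cliques):
    """Kruskal maximum-weight spanning tree; edge weight = separator size."""
    candidates = sorted(
        ((len(a & b), i, j)
         for (i, a), (j, b) in combinations(enumerate(cliques), 2) if a & b),
        reverse=True,
    )
    parent = list(range(len(cliques)))

    def find(i):
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    tree = []
    for _, i, j in candidates:
        ri, rj = find(i), find(j)
        if ri != rj:  # adding this edge does not create a cycle
            parent[ri] = rj
            tree.append((cliques[i], cliques[j], cliques[i] & cliques[j]))
    return tree


# The four-cycle A-B-D-C-A from the triangulation discussion.
square = {"A": {"B", "C"}, "B": {"A", "D"}, "C": {"A", "D"}, "D": {"B", "C"}}
fill, cliques = eliminate(square, ["A", "B", "C", "D"])
tree = junction_tree(cliques)
```

With this elimination order the four-cycle gains the single chord B-C and yields the cliques {A,B,C} and {B,C,D}, joined through the separator {B,C}. Different orders can add different (and more) fill-in edges, so the order matters for clique size.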

Message Passing

In order that the cliques contain all information required for marginals of the variables in the clique, we need to enforce consistency. That is, if clique $V$ (containing a set of variables) and clique $W$ share variables $S$, the marginals on their separators must be equal.

[Diagram: Ψ(V) - Φ(S) - Ψ(W)]

We need

$$\sum_{V \setminus S} \Psi(V) = \Phi(S) = \sum_{W \setminus S} \Psi(W)$$

Absorption

Absorption passes a "message" from one node to another.

W absorbs from V:

[Diagram: Ψ(V) - Φ*(S) - Ψ*(W)]

$$\Psi^*(W) = \Psi(W)\,\frac{\Phi^*(S)}{\Phi(S)}, \quad \text{where } \Phi^*(S) = \sum_{V \setminus S} \Psi(V)$$

Similarly, after passing a message one way, we pass it the other.

V absorbs from W:

[Diagram: Ψ**(V) - Φ**(S) - Ψ*(W)]

$$\Psi^{**}(V) = \Psi^*(V)\,\frac{\Phi^{**}(S)}{\Phi^*(S)}, \quad \text{where } \Psi^*(V) = \Psi(V) \text{ and } \Phi^{**}(S) = \sum_{W \setminus S} \Psi^*(W)$$

This ensures consistency:

$$\sum_{V \setminus S} \Psi^{**}(V) = \Phi^{**}(S) = \sum_{W \setminus S} \Psi^*(W)$$

Also

$$\frac{\Psi(V)\,\Psi(W)}{\Phi(S)} = \frac{\Psi^*(V)\,\Psi^*(W)}{\Phi^*(S)} = \frac{\Psi^{**}(V)\,\Psi^{**}(W)}{\Phi^{**}(S)}$$

where $\Psi^{**}(W) = \Psi^*(W)$, thus maintaining the clique tree representation of the graph.

Show that $\Psi^{**}(V)$ and $\Psi^{**}(W)$ have the same marginals on $S$.

Introducing Evidence

$$p(x) = \prod_C \Psi_C(x_C)$$

Split nodes into $H$ (hidden) and $E$ (evidence):

$$p(x_H, \bar{x}_E) = \prod_C \Psi_C(x_{\tilde{C}}, \bar{x}_{C \cap E}) \equiv \prod_C \tilde{\Psi}_{\tilde{C}}(x_{\tilde{C}})$$

This is a product of "slices" of potential functions. Thus to introduce evidence, we modify the potentials in the original graph, setting any nodes to their evidential values.

One can also use the "evidence potential" approach by setting

$$\tilde{\Psi}_C(x_C) = \Psi_C(x_C)\,\delta(x_{C \cap E}, \bar{x}_{C \cap E})$$

but this fills the clique potentials with lots of zeros and thus wastes storage and computation.
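The absorption updates can be traced on a two-clique tree. The sketch below assumes a chain a -> b -> c with V = {a, b}, W = {b, c} and separator S = {b}; the CPT values and the helper names `marginal` and `absorb` are hypothetical, and the separator potential is assumed to have no zero entries so the division is safe.

```python
def marginal(table, index):
    """Sum a potential over everything except the variable at position `index`."""
    out = {}
    for key, value in table.items():
        out[key[index]] = out.get(key[index], 0.0) + value
    return out


def absorb(psi_w, s_in_w, psi_v, s_in_v, phi):
    """W absorbs from V; returns (Psi*(W), Phi*(S)).

    `s_in_w` and `s_in_v` give the position of the separator variable in
    the keys of each clique table. Assumes `phi` has no zero entries.
    """
    phi_star = marginal(psi_v, s_in_v)
    psi_w_star = {
        key: value * phi_star[key[s_in_w]] / phi[key[s_in_w]]
        for key, value in psi_w.items()
    }
    return psi_w_star, phi_star


# Hypothetical CPTs for the chain a -> b -> c (all variables binary).
p_a = {0: 0.6, 1: 0.4}
p_b_given_a = {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}  # key (a, b)
p_c_given_b = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5}  # key (b, c)

# Initial potentials: Psi(V) = p(a) p(b|a), Psi(W) = p(c|b), Phi(S) = 1.
psi_v = {(a, b): p_a[a] * p_b_given_a[(a, b)] for (a, b) in p_b_given_a}
psi_w = dict(p_c_given_b)
phi = {0: 1.0, 1: 1.0}

# W absorbs from V, then V absorbs from W.
psi_w_star, phi_star = absorb(psi_w, 0, psi_v, 1, phi)
psi_v_2star, phi_2star = absorb(psi_v, 1, psi_w_star, 0, phi_star)
```

After the two absorptions both cliques marginalise to the same separator table, which here equals p(b). With no evidence the second absorption leaves Ψ(V) unchanged, because Ψ(V) already held the joint p(a, b).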

Propagation on a Junction Tree

Node $V$ can send exactly one message to a neighbour $W$, and it may only be sent when $V$ has received a message from all of its other neighbours.

Choose one clique (arbitrarily) as a root of the tree; collect messages to this node and then distribute messages away from it.

After collection and distribution phases, we have in each clique that

$$\Psi(x_C) = p(x_{\tilde{C}}, \bar{x}_E)$$

[Figure: CollectEvidence and DistributeEvidence]

Summary of JTA

Convert belief network into JT
Initialize potentials and separators
Incorporate evidence (JT is inconsistent)
CollectEvidence and DistributeEvidence (to give a consistent JT)
Obtain clique marginals by marginalization/normalization

Proof of Correctness of JTA

Theorem: Let the probability $p(x_H, \bar{x}_E)$ be represented by the clique potentials of a junction tree. When the junction tree algorithm terminates, the clique potentials and separator potentials are proportional to the local marginal probabilities. In particular:

$$\Psi_C = p(x_{\tilde{C}}, \bar{x}_E), \qquad \Phi_S = p(x_{\tilde{S}}, \bar{x}_E)$$

Proof: Observe that the separators are subsets of the cliques which are consistent with the cliques. Thus we only need to prove the result for the cliques.
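The scheduling rule for propagation (a clique sends to a neighbour only after hearing from all its other neighbours) is satisfied by a post-order traversal towards the root followed by the reverse traversal. The sketch below only computes the message order, not the messages; the function name `message_schedule` and the string labels for cliques are illustrative assumptions.

```python
def message_schedule(neighbours, root):
    """Return (collect, distribute) lists of (sender, receiver) pairs.

    `neighbours` maps each clique label to the list of adjacent cliques in
    the junction tree. Collect messages flow from the leaves to `root`;
    distribute messages flow back out in the reverse order.
    """
    collect = []

    def visit(node, parent):
        for child in neighbours[node]:
            if child != parent:
                visit(child, node)
                # The child sends only after its own subtree has reported.
                collect.append((child, node))

    visit(root, None)
    distribute = [(receiver, sender) for sender, receiver in reversed(collect)]
    return collect, distribute


# Clique tree (a,b,c) - (c,d,e) - (d,e,f), rooted arbitrarily at (a,b,c).
tree = {"abc": ["cde"], "cde": ["abc", "def"], "def": ["cde"]}
collect, distribute = message_schedule(tree, "abc")
```

Each edge of the tree appears once in each phase, so every clique sends exactly one message to each neighbour over the whole run.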

Throughout the propagation process we have maintained the representation

$$p(x_H, \bar{x}_E) = \frac{\prod_C \Psi_C(x_C)}{\prod_S \Phi_S(x_S)}$$

After the collect- and distribute-evidence stages the junction tree is consistent (i.e. the marginalization of the potentials of the cliques at either end of a separator give the same separator potential).

We now show that marginalization of the joint $p(x_H, \bar{x}_E)$ gives the desired result.

[Figure: junction tree with regions labelled C, S, R and V]

Choose a clique $C$ that is a leaf of the JT with separator $S$. Let $\tilde{C} = C \setminus E$ and $\tilde{S} = S \setminus E$. Let $\tilde{R} = \tilde{C} \setminus \tilde{S}$, and the remaining non-evidence nodes be denoted $\tilde{T}$. We now remove clique $C$ by summing out $\tilde{R}$ from $p(x_H, \bar{x}_E) = p(x_{\tilde{R}}, x_{\tilde{S}}, x_{\tilde{T}}, \bar{x}_E)$:

$$
\begin{aligned}
p(x_{\tilde{S}}, x_{\tilde{T}}, \bar{x}_E)
&= \sum_{\tilde{R}} p(x_H, \bar{x}_E) \\
&= \sum_{\tilde{R}} \frac{\prod_{\tilde{C}} \Psi_{\tilde{C}}(x_{\tilde{C}})}{\prod_{\tilde{S}} \Phi_{\tilde{S}}(x_{\tilde{S}})} \\
&= \sum_{\tilde{R}} \frac{\prod_{C' \neq C} \Psi_{\tilde{C}'}(x_{\tilde{C}'})}{\prod_{S' \neq S} \Phi_{\tilde{S}'}(x_{\tilde{S}'})} \cdot \frac{\Psi_{\tilde{C}}(x_{\tilde{C}})}{\Phi_{\tilde{S}}(x_{\tilde{S}})} \\
&= \frac{\prod_{C' \neq C} \Psi_{\tilde{C}'}(x_{\tilde{C}'})}{\prod_{S' \neq S} \Phi_{\tilde{S}'}(x_{\tilde{S}'})} \cdot \frac{\sum_{\tilde{R}} \Psi_{\tilde{C}}(x_{\tilde{C}})}{\Phi_{\tilde{S}}(x_{\tilde{S}})} \\
&= \frac{\prod_{C' \neq C} \Psi_{\tilde{C}'}(x_{\tilde{C}'})}{\prod_{S' \neq S} \Phi_{\tilde{S}'}(x_{\tilde{S}'})}
\end{aligned}
$$

Applying this process repeatedly we obtain

$$p(x_{\tilde{C}}, \bar{x}_E) = \Psi_{\tilde{C}}(x_{\tilde{C}}, \bar{x}_E)$$

JTA example

[Figure: chain a - b - c with cliques (a,b) and (b,c)]

Compute

$p(b)$
$p(b \mid a = 0, c = 1)$
$p(c \mid b = 1)$
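The three queries in the example can be worked through on the two-clique tree (a,b) - [b] - (b,c). The slides give no numbers, so the CPTs below are hypothetical, and the chain is assumed to factorise as p(a) p(b|a) p(c|b) with binary variables. Evidence is introduced by slicing: table entries that disagree with the observed values are simply left out.

```python
def marginal(table, index):
    """Sum a potential over everything except the variable at position `index`."""
    out = {}
    for key, value in table.items():
        out[key[index]] = out.get(key[index], 0.0) + value
    return out


def absorb(psi_to, s_in_to, psi_from, s_in_from, phi):
    """Absorption update; returns the new receiving clique and separator."""
    phi_new = marginal(psi_from, s_in_from)
    psi_new = {k: v * phi_new[k[s_in_to]] / phi[k[s_in_to]] for k, v in psi_to.items()}
    return psi_new, phi_new


def normalise(table):
    total = sum(table.values())
    return {k: v / total for k, v in table.items()}


# Hypothetical CPTs (illustrative values only).
P_A = {0: 0.6, 1: 0.4}
P_B_GIVEN_A = {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}  # key (a, b)
P_C_GIVEN_B = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5}  # key (b, c)


def run_jta(evidence):
    """Run collect and distribute; `evidence` maps variable names to observed values."""
    def allowed(name, value):
        return evidence.get(name, value) == value

    # Sliced potentials: only entries consistent with the evidence are kept.
    psi_ab = {(a, b): P_A[a] * P_B_GIVEN_A[(a, b)]
              for (a, b) in P_B_GIVEN_A if allowed("a", a) and allowed("b", b)}
    psi_bc = {(b, c): P_C_GIVEN_B[(b, c)]
              for (b, c) in P_C_GIVEN_B if allowed("b", b) and allowed("c", c)}
    phi = {b: 1.0 for b in (0, 1) if allowed("b", b)}

    psi_bc, phi = absorb(psi_bc, 0, psi_ab, 1, phi)  # collect towards (b,c)
    psi_ab, phi = absorb(psi_ab, 1, psi_bc, 0, phi)  # distribute back to (a,b)
    return psi_ab, phi, psi_bc


_, phi_b, _ = run_jta({})
p_b = normalise(phi_b)

_, phi_b, _ = run_jta({"a": 0, "c": 1})
p_b_given_a0_c1 = normalise(phi_b)

_, _, psi_bc = run_jta({"b": 1})
p_c_given_b1 = normalise(marginal(psi_bc, 1))
```

After propagation the potentials hold $p(x_{\tilde{C}}, \bar{x}_E)$, so each conditional is obtained by normalising the relevant marginal, matching the last step of the JTA summary.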
