1. Directed Graphical Models
Michael Gutmann
Probabilistic Modelling and Reasoning (INFR11134)
School of Informatics, University of Edinburgh
Spring semester 2018

2. Recap
◮ We talked about reasonably weak assumptions that facilitate the efficient representation of a probabilistic model.
◮ Independence assumptions reduce the number of interacting variables.
◮ Parametric assumptions restrict the way the variables may interact.
◮ (Conditional) independence assumptions lead to a factorisation of the pdf/pmf, e.g.

p(x, y, z) = p(x) p(y) p(z)
p(x_1, ..., x_d) = p(x_d | x_{d-3}, x_{d-2}, x_{d-1}) p(x_1, ..., x_{d-1})

3. Program
1. Equivalence of factorisation and ordered Markov property
2. Understanding models from their factorisation
3. Definition of directed graphical models
4. Independencies in directed graphical models

4. Program
1. Equivalence of factorisation and ordered Markov property
   Chain rule
   Ordered Markov property implies factorisation
   Factorisation implies ordered Markov property
2. Understanding models from their factorisation
3. Definition of directed graphical models
4. Independencies in directed graphical models

5. Chain rule
Iteratively applying the product rule allows us to factorise any joint pdf (pmf) p(x) = p(x_1, x_2, ..., x_d) into a product of conditional pdfs:

p(x) = p(x_1) p(x_2, ..., x_d | x_1)
     = p(x_1) p(x_2 | x_1) p(x_3, ..., x_d | x_1, x_2)
     = p(x_1) p(x_2 | x_1) p(x_3 | x_1, x_2) p(x_4, ..., x_d | x_1, x_2, x_3)
     ...
     = p(x_1) ∏_{i=2}^d p(x_i | x_1, ..., x_{i-1})
     = ∏_{i=1}^d p(x_i | pre_i)

with pre_i = pre(x_i) = {x_1, ..., x_{i-1}}, pre_1 = ∅, and p(x_1 | ∅) = p(x_1).

The chain rule can be applied to any ordering x_{k_1}, ..., x_{k_d}. Different orderings give different factorisations.
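
As a minimal numeric sketch of the chain rule (assuming three binary variables and a made-up joint table; none of this is from the slides), one can check that the chain-rule factors, computed from the joint by marginalisation, multiply back to the joint:

    import itertools
    import numpy as np

    rng = np.random.default_rng(0)

    # A made-up joint pmf over three binary variables, indexed as p[x1, x2, x3].
    p = rng.random((2, 2, 2))
    p /= p.sum()

    for x1, x2, x3 in itertools.product([0, 1], repeat=3):
        p_x1 = p[x1].sum()                                # p(x1)
        p_x2_given_x1 = p[x1, x2].sum() / p_x1            # p(x2 | x1)
        p_x3_given_x12 = p[x1, x2, x3] / p[x1, x2].sum()  # p(x3 | x1, x2)
        # Chain rule: p(x1, x2, x3) = p(x1) p(x2 | x1) p(x3 | x1, x2)
        assert np.isclose(p[x1, x2, x3], p_x1 * p_x2_given_x1 * p_x3_given_x12)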

6. From (conditional) independence to factorisation
p(x) = ∏_{i=1}^d p(x_i | pre_i) for the ordering x_1, ..., x_d
◮ For each x_i, we condition on all previous variables in the ordering.
◮ Assume that, for each i, there is a minimal subset of variables π_i ⊆ pre_i such that p(x) satisfies x_i ⊥⊥ (pre_i \ π_i) | π_i for all i. The distribution is then said to satisfy the ordered Markov property.
◮ By the definition of conditional independence: p(x_i | x_1, ..., x_{i-1}) = p(x_i | pre_i) = p(x_i | π_i)
◮ With the convention π_1 = ∅, we obtain the factorisation

p(x_1, ..., x_d) = ∏_{i=1}^d p(x_i | π_i)

◮ See later: the π_i correspond to the parents of x_i in graphs.

7. From (conditional) independence to factorisation
◮ Assume the variables are ordered as x_1, ..., x_d; let pre_i = {x_1, ..., x_{i-1}} and π_i ⊆ pre_i.
◮ We have seen that

if x_i ⊥⊥ pre_i \ π_i | π_i for all i, then p(x_1, ..., x_d) = ∏_{i=1}^d p(x_i | π_i).

◮ The chain rule corresponds to the case where π_i = pre_i.
◮ Do we also have the reverse?

If p(x_1, ..., x_d) = ∏_{i=1}^d p(x_i | π_i) with π_i ⊆ pre_i, does x_i ⊥⊥ pre_i \ π_i | π_i then hold for all i?

8. From factorisation to (conditional) independence
◮ Let us first check whether x_d ⊥⊥ pre_d \ π_d | π_d holds.
◮ We do that by checking whether p(x_d | x_1, ..., x_{d-1}) = p(x_d | π_d) holds, where pre_d = {x_1, ..., x_{d-1}}.
◮ Since

p(x_d | x_1, ..., x_{d-1}) = p(x_1, ..., x_d) / p(x_1, ..., x_{d-1}),

we start by computing p(x_1, ..., x_{d-1}).

9. From factorisation to (conditional) independence
Assume that the x_i are ordered as x_1, ..., x_d and that p(x_1, ..., x_d) = ∏_{i=1}^d p(x_i | π_i) with π_i ⊆ pre_i. We compute p(x_1, ..., x_{d-1}) using the sum rule:

p(x_1, ..., x_{d-1}) = ∫ p(x_1, ..., x_d) dx_d
                     = ∫ ∏_{i=1}^d p(x_i | π_i) dx_d
                     = ∏_{i=1}^{d-1} p(x_i | π_i) ∫ p(x_d | π_d) dx_d     (since x_d ∉ π_i for i < d)
                     = ∏_{i=1}^{d-1} p(x_i | π_i)

where the last step uses the fact that p(x_d | π_d) is a pdf, so ∫ p(x_d | π_d) dx_d = 1.

10. From factorisation to (conditional) independence
Hence:

p(x_d | x_1, ..., x_{d-1}) = p(x_1, ..., x_d) / p(x_1, ..., x_{d-1})
                           = [∏_{i=1}^d p(x_i | π_i)] / [∏_{i=1}^{d-1} p(x_i | π_i)]
                           = p(x_d | π_d)

And p(x_d | x_1, ..., x_{d-1}) = p(x_d | π_d) means that x_d ⊥⊥ pre_d \ π_d | π_d, as desired.

Moreover, p(x_1, ..., x_{d-1}) has the same form as p(x_1, ..., x_d), so we can apply the same procedure to all p(x_1, ..., x_k) for smaller and smaller k ≤ d − 1. This proves that (1) p(x_1, ..., x_k) = ∏_{i=1}^k p(x_i | π_i) and (2) the factorisation implies x_i ⊥⊥ pre_i \ π_i | π_i for all i.
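
To see the result in action, here is a small numeric check (a sketch with made-up Bernoulli factors, not from the slides): build p(x_1, x_2, x_3) = p(x_1) p(x_2 | x_1) p(x_3 | x_1), i.e. π_2 = π_3 = {x_1}, and verify that p(x_3 | x_1, x_2) = p(x_3 | x_1), which is exactly x_3 ⊥⊥ x_2 | x_1.

    import itertools
    import numpy as np

    # Made-up factors; keys are (conditioning value, variable value).
    p1 = {0: 0.4, 1: 0.6}                                       # p(x1)
    p2 = {(0, 0): 0.7, (0, 1): 0.3, (1, 0): 0.2, (1, 1): 0.8}   # p(x2 | x1), key (x1, x2)
    p3 = {(0, 0): 0.9, (0, 1): 0.1, (1, 0): 0.5, (1, 1): 0.5}   # p(x3 | x1), key (x1, x3)

    # Joint built from the factorisation with pi_3 = {x1} (x2 is not a parent of x3).
    joint = {(a, b, c): p1[a] * p2[a, b] * p3[a, c]
             for a, b, c in itertools.product([0, 1], repeat=3)}

    for (a, b, c), pj in joint.items():
        # p(x3 | x1, x2) computed from the joint ...
        p_x3_given_x12 = pj / sum(joint[a, b, cc] for cc in [0, 1])
        # ... equals p(x3 | x1): conditioning on the extra variable x2 changes nothing.
        assert np.isclose(p_x3_given_x12, p3[a, c])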

11. Brief summary
◮ Let x = (x_1, ..., x_d) be a d-dimensional random vector with pdf/pmf p(x).
◮ Denote the predecessors of x_i in the ordering by pre(x_i) = pre_i = {x_1, ..., x_{i-1}}, and let π_i ⊆ pre_i. Then

p(x) = ∏_{i=1}^d p(x_i | π_i)  ⟺  x_i ⊥⊥ pre_i \ π_i | π_i for all i

◮ This is the equivalence of the factorisation and the ordered Markov property of the pdf/pmf.

12. Why does it matter?
◮ Denote the predecessors of x_i in the ordering by pre_i = {x_1, ..., x_{i-1}}, and let π_i ⊆ pre_i. Then

p(x) = ∏_{i=1}^d p(x_i | π_i)  ⟺  x_i ⊥⊥ pre_i \ π_i | π_i for all i

◮ Why does it matter?
◮ It is a relatively strong result: it holds for sets of pdfs/pmfs and not only single instances.
◮ For all members of the set, fewer numbers are needed for their representation (see the counting sketch below).
◮ Given the independencies, we know what form p(x) must have.
◮ It increases our understanding of the properties of the model (independencies and the data-generation mechanism).
◮ The model can be visualised as a graph.
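
To make the "fewer numbers" point concrete, here is a tiny counting sketch (assuming binary variables and the five-variable example factorisation used on the ancestral-sampling slide below):

    # Full joint over d binary variables: 2^d - 1 free numbers.
    # Factorised model: one free number per configuration of each variable's
    # parent set pi_i, i.e. sum_i 2^{|pi_i|}.
    parents = {1: [], 2: [], 3: [1, 2], 4: [3], 5: [2]}  # pi_i for the example

    d = len(parents)
    full_joint = 2**d - 1
    factorised = sum(2**len(pi) for pi in parents.values())
    print(full_joint, factorised)  # 31 versus 1 + 1 + 4 + 2 + 2 = 10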

13. Program
1. Equivalence of factorisation and ordered Markov property
   Chain rule
   Ordered Markov property implies factorisation
   Factorisation implies ordered Markov property
2. Understanding models from their factorisation
3. Definition of directed graphical models
4. Independencies in directed graphical models

14. Program
1. Equivalence of factorisation and ordered Markov property
2. Understanding models from their factorisation
   Ancestral sampling
   Visualisation as a directed graph
   Description of directed graphs and topological orderings
3. Definition of directed graphical models
4. Independencies in directed graphical models

15. Ancestral sampling
◮ The factorisation provides a recipe for data generation / sampling from p(x).
◮ Example: p(x_1, x_2, x_3, x_4, x_5) = p(x_1) p(x_2) p(x_3 | x_1, x_2) p(x_4 | x_3) p(x_5 | x_2)
◮ We can generate samples from the joint distribution p(x_1, x_2, x_3, x_4, x_5) by sampling (see the sketch below):
1. x_1 ∼ p(x_1)
2. x_2 ∼ p(x_2)
3. x_3 ∼ p(x_3 | x_1, x_2)
4. x_4 ∼ p(x_4 | x_3)
5. x_5 ∼ p(x_5 | x_2)
◮ Note: this helps in modelling and in understanding the properties of p(x), but it may not reflect causal relationships.
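
As a concrete sketch of the recipe, here is a minimal Python implementation of ancestral sampling for this factorisation; the particular Bernoulli conditionals are made up for illustration and are not part of the slides.

    import numpy as np

    rng = np.random.default_rng(0)

    def bernoulli(p):
        return int(rng.random() < p)

    def ancestral_sample():
        # Sample each variable given its already-sampled parents,
        # following the ordering x1, x2, x3, x4, x5.
        x1 = bernoulli(0.5)                         # x1 ~ p(x1)
        x2 = bernoulli(0.3)                         # x2 ~ p(x2)
        x3 = bernoulli(0.9 if x1 and x2 else 0.2)   # x3 ~ p(x3 | x1, x2)
        x4 = bernoulli(0.8 if x3 else 0.1)          # x4 ~ p(x4 | x3)
        x5 = bernoulli(0.7 if x2 else 0.4)          # x5 ~ p(x5 | x2)
        return x1, x2, x3, x4, x5

    samples = [ancestral_sample() for _ in range(5)]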

16. Visualisation as a directed graph
If p(x) = ∏_{i=1}^d p(x_i | π_i) with π_i ⊆ pre_i, we can visualise the model as a graph with the random variables x_i as nodes and directed edges that point from the x_j ∈ π_i to x_i. This results in a directed acyclic graph (DAG).

Example: p(x_1, x_2, x_3, x_4, x_5) = p(x_1) p(x_2) p(x_3 | x_1, x_2) p(x_4 | x_3) p(x_5 | x_2)

[Figure: DAG with edges x_1 → x_3, x_2 → x_3, x_2 → x_5, x_3 → x_4]
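
One way to build and inspect this DAG programmatically is with the networkx library; this is an illustrative sketch, not something the slides rely on.

    import networkx as nx

    # Edges point from each parent x_j in pi_i to x_i, read off the factorisation
    # p(x1) p(x2) p(x3 | x1, x2) p(x4 | x3) p(x5 | x2).
    G = nx.DiGraph()
    G.add_edges_from([("x1", "x3"), ("x2", "x3"), ("x3", "x4"), ("x2", "x5")])

    print(nx.is_directed_acyclic_graph(G))  # True
    print(list(nx.topological_sort(G)))     # a valid ordering, e.g. x1, x2, x3, x5, x4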

17. Visualisation as a directed graph
Example: p(x_1, x_2, x_3, x_4) = p(x_1) p(x_2 | x_1) p(x_3 | x_1, x_2) p(x_4 | x_1, x_2, x_3)

[Figure: fully connected DAG on x_1, ..., x_4, with an edge from each x_i to every later x_j]

A factorisation obtained by the chain rule corresponds to a fully connected directed acyclic graph.

18. Graph concepts
◮ Directed graph: a graph where all edges are directed.
◮ Directed acyclic graph (DAG): by following the direction of the arrows you will never visit a node more than once.
◮ x_i is a parent of x_j if there is a (directed) edge from x_i to x_j. The set of parents of x_i in the graph is denoted by pa(x_i) = pa_i, e.g. pa(x_3) = pa_3 = {x_1, x_2}.
◮ x_j is a child of x_i if x_i ∈ pa(x_j), e.g. x_3 and x_5 are children of x_2.

[Figure: DAG with edges x_1 → x_3, x_2 → x_3, x_2 → x_5, x_3 → x_4]
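
In the hypothetical networkx sketch from above, parents and children correspond to a node's predecessors and successors:

    print(set(G.predecessors("x3")))  # parents: pa(x3) = {'x1', 'x2'}
    print(set(G.successors("x2")))    # children of x2: {'x3', 'x5'}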

19. Graph concepts
◮ A path or trail from x_i to x_j is a sequence of distinct connected nodes starting at x_i and ending at x_j. The direction of the arrows does not matter. For example, x_5, x_2, x_3, x_1 is a trail.
◮ A directed path is a sequence of connected nodes where we follow the direction of the arrows. For example, x_1, x_3, x_4 is a directed path, but x_5, x_2, x_3, x_1 is not.

[Figure: DAG with edges x_1 → x_3, x_2 → x_3, x_2 → x_5, x_3 → x_4]
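
Continuing the same hypothetical sketch, a directed path must follow the arrows, while a trail can be checked on the undirected version of the graph:

    print(nx.has_path(G, "x1", "x4"))                  # True: the directed path x1, x3, x4
    print(nx.has_path(G, "x5", "x1"))                  # False: no directed path from x5 to x1
    print(nx.has_path(G.to_undirected(), "x5", "x1"))  # True: the trail x5, x2, x3, x1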
