
Directed Graphical Models: Bayesian Networks

Probabilistic Graphical Models, Sharif University of Technology, Soleymani, Spring 2018


  1. Directed Graphical Models: Bayesian Networks. Probabilistic Graphical Models, Sharif University of Technology, Soleymani, Spring 2018

  2. Basics
     • Multivariate distributions with a large number of variables
     • Independence assumptions are useful
     • Independence and conditional independence relationships simplify representation and reduce the complexity of inference
     • Bayesian networks enable us to incorporate domain knowledge and structure
     • Modular combination of heterogeneous parts
     • Combining data and knowledge (Bayesian philosophy)

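  3. Conditional and marginal independence
     • $X$ and $Y$ are conditionally independent given $Z$, written $X \perp Y \mid Z$, if (equivalently):
       $P(X \mid Y, Z) = P(X \mid Z)$
       $P(X, Y \mid Z) = P(X \mid Z)\, P(Y \mid Z)$
       $P(Y \mid X, Z) = P(Y \mid Z)$
       That is, $\forall x \in Val(X),\ y \in Val(Y),\ z \in Val(Z)$:
       $P(X = x, Y = y \mid Z = z) = P(X = x \mid Z = z)\, P(Y = y \mid Z = z)$
     • $X$ and $Y$ are marginally independent, written $X \perp Y \mid \emptyset$, if (equivalently):
       $P(X \mid Y) = P(X)$
       $P(X, Y) = P(X)\, P(Y)$
       $P(Y \mid X) = P(Y)$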
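The two definitions on this slide can be checked numerically on a small joint table. The sketch below is only an illustration: the binary variables and all probability values are hypothetical, chosen so that $X$ and $Y$ share the common cause $Z$. Under that assumption the conditional independence holds while the marginal one does not.

```python
from itertools import product

# Hypothetical CPDs for binary X, Y, Z (values chosen only for illustration).
p_z = {0: 0.4, 1: 0.6}
p_x_given_z = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.2, 1: 0.8}}  # p_x_given_z[z][x]
p_y_given_z = {0: {0: 0.7, 1: 0.3}, 1: {0: 0.5, 1: 0.5}}  # p_y_given_z[z][y]

# Joint built as P(Z) P(X|Z) P(Y|Z), keyed by (x, y, z).
joint = {(x, y, z): p_z[z] * p_x_given_z[z][x] * p_y_given_z[z][y]
         for x, y, z in product((0, 1), repeat=3)}

def prob(**fixed):
    """Sum the joint entries consistent with the fixed values, e.g. prob(x=1, z=0)."""
    return sum(p for (x, y, z), p in joint.items()
               if all({"x": x, "y": y, "z": z}[k] == v for k, v in fixed.items()))

def cond_indep_given_z(tol=1e-12):
    """Check P(x, y | z) = P(x | z) P(y | z) for every assignment."""
    return all(
        abs(prob(x=x, y=y, z=z) / prob(z=z)
            - (prob(x=x, z=z) / prob(z=z)) * (prob(y=y, z=z) / prob(z=z))) < tol
        for x, y, z in product((0, 1), repeat=3))

def marg_indep(tol=1e-12):
    """Check P(x, y) = P(x) P(y) for every assignment."""
    return all(abs(prob(x=x, y=y) - prob(x=x) * prob(y=y)) < tol
               for x, y in product((0, 1), repeat=2))

print(cond_indep_given_z())  # True
print(marg_indep())          # False
```

With these numbers $P(X=1, Y=1) = 0.252$ while $P(X=1)\,P(Y=1) = 0.52 \times 0.42 = 0.2184$, so the marginal check fails even though the conditional one passes.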

  4. Bayesian network definition
     • Qualitative specification by a Directed Acyclic Graph (DAG)
       • Each node denotes a random variable
       • Edges denote dependencies
       • $X \to Y$ shows a "direct influence" of $X$ on $Y$ ($X$ is a parent of $Y$)
     • Quantitative specification by conditional probability distributions (CPDs)
       • The CPD for each node $X_i$ defines $P(X_i \mid Pa(X_i))$
     • A Bayesian network represents a joint distribution over the variables (via the DAG and CPDs) compactly in a factorized way:
       $P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$

  5. Burglary example
     • John does not perceive minor earthquakes
     • John does not perceive burglaries directly

  6. Burglary example
     • Bayesian networks define the joint distribution (over the variables) in terms of the graph structure and the conditional probability distributions:
       $P(B, E, A, J, M) = P(B)\, P(E)\, P(A \mid B, E)\, P(J \mid A)\, P(M \mid A)$

  7. Burglary example: DAG + CPTs
     • CPDs as quantitative specification: $P(A = t \mid B, E)$, $P(J = t \mid A)$, $P(M = t \mid A)$
     [Figure: DAG with CPTs; table values not preserved]

  8. Burglary example: full joint probability
     • $P(J, M, A, B, E) = P(J \mid A)\, P(M \mid A)\, P(A \mid B, E)\, P(B)\, P(E)$
     • $P(J = t, M = t, A = t, B = f, E = f)$
       $= P(J = t \mid A = t)\, P(M = t \mid A = t)\, P(A = t \mid B = f, E = f)\, P(B = f)\, P(E = f)$
       $= 0.9 \times 0.7 \times 0.001 \times 0.999 \times 0.998 \approx 0.000628$
     • Short-hands: $J = t$ means JohnCalls = True; $B = f$ means Burglary = False; …
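The arithmetic on this slide can be reproduced directly. The sketch below uses only the five factors quoted on the slide; the variable names are illustrative.

```python
# Factors quoted on the slide for the assignment J=t, M=t, A=t, B=f, E=f.
p_j_given_a = 0.9        # P(J=t | A=t)
p_m_given_a = 0.7        # P(M=t | A=t)
p_a_given_b_e = 0.001    # P(A=t | B=f, E=f)
p_b_false = 0.999        # P(B=f)
p_e_false = 0.998        # P(E=f)

# The joint probability is the product of one CPD entry per node.
joint = p_j_given_a * p_m_given_a * p_a_given_b_e * p_b_false * p_e_false
print(f"{joint:.6f}")  # 0.000628
```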

  9. Burglary example: inference
     • Conditional probability distribution:
       $P(B = t \mid J = t, M = f) = \dfrac{P(J = t, M = f, B = t)}{P(J = t, M = f)} = \dfrac{\sum_{A} \sum_{E} P(J = t, M = f, A, B = t, E)}{\sum_{B} \sum_{A} \sum_{E} P(J = t, M = f, A, B, E)}$
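The ratio of sums above can be evaluated by brute-force enumeration. The following is a minimal sketch, not the course's implementation. The transcript preserves only a few CPT entries (0.9, 0.7, 0.001 and the priors implied by 0.999 and 0.998), so the remaining entries are an assumption here: they are taken from the standard textbook version of this alarm example (Russell and Norvig), which agrees with the values the slide does show.

```python
from itertools import product

# CPTs: entries not visible in the transcript are ASSUMED from the standard
# textbook version of the burglary/alarm network.
P_B = 0.001                                   # P(B=t)
P_E = 0.002                                   # P(E=t)
P_A = {(True, True): 0.95, (True, False): 0.94,
       (False, True): 0.29, (False, False): 0.001}   # P(A=t | B, E)
P_J = {True: 0.90, False: 0.05}               # P(J=t | A)
P_M = {True: 0.70, False: 0.01}               # P(M=t | A)

def bern(p_true, value):
    """Probability of a Boolean value given P(value = True)."""
    return p_true if value else 1.0 - p_true

def joint_prob(j, m, a, b, e):
    """P(J, M, A, B, E) as the product of the five CPD entries."""
    return (bern(P_J[a], j) * bern(P_M[a], m) * bern(P_A[(b, e)], a)
            * bern(P_B, b) * bern(P_E, e))

def posterior_burglary(j, m):
    """P(B=t | J=j, M=m): sum out A, E in the numerator and A, B, E in the denominator."""
    tf = (True, False)
    num = sum(joint_prob(j, m, a, True, e) for a, e in product(tf, repeat=2))
    den = sum(joint_prob(j, m, a, b, e) for a, b, e in product(tf, repeat=3))
    return num / den

print(posterior_burglary(j=True, m=False))
```

Enumeration costs time exponential in the number of summed-out variables, which is acceptable for five Boolean nodes but is the reason larger networks need smarter inference.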

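  10. Student example
     Nodes: Intelligence (I), Difficulty (D), Grade (G), SAT (S), Letter (L). [Figure not preserved.]

     Priors, in the order they appear on the slide: $P(D = t)$, $P(I = t)$: 0.65, 0.55

     $P(G \mid I, D)$:
       I   D   G=1    G=2    G=3
       f   f   0.3    0.4    0.3
       f   t   0.05   0.25   0.7
       t   f   0.9    0.08   0.02
       t   t   0.5    0.3    0.2

     $P(S = 1 \mid I)$:
       I   P(S=1|I)
       f   0.1
       t   0.7

     $P(L = t \mid G)$:
       G   P(L=t|G)
       1   0.9
       2   0.5
       3   0.05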

  11. Continuous variables example
     • Linear Gaussian: $X \sim N(0, 1)$, $Y \mid X \sim N(b + X, \sigma)$, with $b = 0.5$, $\sigma = 0.1$
     [Figure: plot of $p(y \mid x)$ over $x$ and $y$]
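A linear Gaussian network like this one can be sampled ancestrally: draw the parent first, then the child from its conditional. The sketch below is illustrative only. The slide does not say whether $\sigma$ is a standard deviation or a variance, so treating it as a standard deviation is an assumption, and the function name and seed are hypothetical.

```python
import random

def sample_linear_gaussian(n_samples, b=0.5, sigma=0.1, seed=0):
    """Ancestral sampling of X ~ N(0, 1), then Y | X ~ N(b + X, sigma).

    ASSUMPTION: sigma is interpreted as a standard deviation.
    """
    rng = random.Random(seed)
    samples = []
    for _ in range(n_samples):
        x = rng.gauss(0.0, 1.0)        # parent: standard normal
        y = rng.gauss(b + x, sigma)    # child: mean shifted by the parent's value
        samples.append((x, y))
    return samples

samples = sample_linear_gaussian(20000)
mean_offset = sum(y - x for x, y in samples) / len(samples)
print(mean_offset)  # close to b = 0.5
```

Because $Y - X$ has mean $b$ and a small spread under this reading, the average offset is a quick sanity check that the conditional was sampled as intended.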

  12. Missing edges
     • In general, the joint distribution can be written by the chain rule:
       $P(X_1, \dots, X_n) = P(X_1) \prod_{i=2}^{n} P(X_i \mid X_1, \dots, X_{i-1})$
     • This is equivalent to a graph in which all of $X_1, \dots, X_{i-1}$ are parents of $X_i$
     • Missing edges imply conditional independencies
     • If we use a DAG that is not complete:
       • removing links drops some of the conditioning variables from the corresponding factors

  13. Compact representation
     • A CPT for a Boolean variable with $k$ Boolean parents requires:
       • $2^k$ rows: one for each combination of parent values
       • $k = 0$: one row giving the prior probability
     • If each of the $n$ variables has no more than $k$ parents:
       • The full joint distribution requires $2^n - 1$ numbers
       • The Bayesian network requires at most $n \times 2^k$ numbers (linear in $n$)
     • ⇒ Exponential reduction in the number of parameters
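To make the gap concrete, the two counts can be compared for a specific size. The sketch below simply evaluates the formulas from the slide; the choice of $n = 30$ and $k = 5$ is an arbitrary illustration.

```python
def full_joint_params(n):
    """Free parameters of a full joint table over n Boolean variables."""
    return 2 ** n - 1

def bn_params_upper_bound(n, k):
    """Upper bound for a Boolean Bayesian network with at most k parents per node."""
    return n * 2 ** k

n, k = 30, 5  # illustrative sizes, not from the slides
print(full_joint_params(n))          # 1073741823
print(bn_params_upper_bound(n, k))   # 960
```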

  14. Bayesian network semantics
     • Local independencies: each node is conditionally independent of its non-descendants given its parents:
       $X_i \perp NonDescendants(X_i) \mid Pa(X_i)$
     • Are the local independencies all of the conditional independencies implied by a BN?

  15. Factorization & independence
     • Let $G$ be a graph over $X_1, \dots, X_n$. A distribution $P$ factorizes over $G$ if:
       $P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid Pa(X_i))$
     • Factorization ⇒ Independence
       • If $P$ factorizes over $G$, then every variable in $P$ is independent of its non-descendants given its parents (in the graph $G$)
       • Factorization according to $G$ implies the associated conditional independencies
     • Independence ⇒ Factorization
       • If every variable in the distribution $P$ is independent of its non-descendants given its parents (in the graph $G$), then $P$ factorizes over $G$
       • Conditional independencies imply factorization of the joint distribution (into a product of simpler terms)

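  16. Independence ⇒ factorization
     • Consider the chain rule, with the variables indexed in a topological order of $G$ (parents before children):
       $P(X_1, \dots, X_n) = \prod_{i=1}^{n} P(X_i \mid X_1, \dots, X_{i-1})$
     • We can simplify it using the conditional independence assumptions
     • Using $X_i \perp NonDescendants(X_i) \mid Pa(X_i)$, and noting that $X_1, \dots, X_{i-1}$ are all non-descendants of $X_i$ under this ordering, we can show:
       $P(X_i \mid X_1, X_2, \dots, X_{i-1}) = P(X_i \mid Pa(X_i))$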
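As a worked instance, the argument can be run on the burglary network from the earlier slides. The ordering $B, E, A, J, M$ is chosen here for illustration; it is one topological order of that graph, and any other would work the same way.

```latex
\begin{align*}
P(B, E, A, J, M)
  &= P(B)\, P(E \mid B)\, P(A \mid B, E)\, P(J \mid B, E, A)\, P(M \mid B, E, A, J)
     && \text{(chain rule)} \\
  &= P(B)\, P(E)\, P(A \mid B, E)\, P(J \mid A)\, P(M \mid A)
     && \text{(local independencies)}
\end{align*}
```

The second line uses $E \perp B$ (since $E$ has no parents and $B$ is a non-descendant), $J \perp \{B, E\} \mid A$, and $M \perp \{B, E, J\} \mid A$. The result matches the factorization stated for the burglary example.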

  17. Equivalence Theorem
     For a graph $G$:
     • Let $D_1$ denote the family of all distributions that satisfy the conditional independencies of $G$
     • Let $D_2$ denote the family of all distributions that factorize according to $G$
     • ⇒ $D_1 \equiv D_2$

  18. Other independencies
     • Are there other independencies that hold for every distribution $P$ that factorizes over $G$?
     • Using the graphical criterion called d-separation, we can find independencies from the graph
     • If $P$ factorizes over $G$, can we read these independencies from the structure of $G$?

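  19. Basic structures
     • Chain through $Z$ (e.g., $X \to Z \to Y$): $X \perp Y \mid Z$
     • Common parent $Z$ ($X \leftarrow Z \to Y$): $X \perp Y \mid Z$
     • Common child $Z$ ($X \to Z \leftarrow Y$): $X \perp Y$ (explaining away)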

  20. Explaining away
     • When we condition on $Z$, are $X$ and $Y$ independent? For the structure $X \to Z \leftarrow Y$:
       $P(X, Y, Z) = P(X)\, P(Y)\, P(Z \mid X, Y)$
     • $X$ and $Y$ are marginally independent, but given $Z$ they are conditionally dependent
     • This is called explaining away
     • Two coins example
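The slide names a two coins example without spelling it out, so the following is one possible version, stated as an assumption: $X$ and $Y$ are independent fair coins and $Z = 1$ exactly when they show the same face. The sketch checks that $Y$ tells us nothing about $X$ on its own, but determines $X$ once $Z$ is known.

```python
from fractions import Fraction
from itertools import product

# ASSUMED setup: X, Y fair coins; Z = 1 iff X == Y (a deterministic common child).
joint = {}
for x, y in product((0, 1), repeat=2):
    z = int(x == y)
    joint[(x, y, z)] = Fraction(1, 4)

def prob(pred):
    """Total probability of the outcomes (x, y, z) that satisfy pred."""
    return sum(p for outcome, p in joint.items() if pred(*outcome))

p_x1 = prob(lambda x, y, z: x == 1)
p_x1_given_y1 = (prob(lambda x, y, z: x == 1 and y == 1)
                 / prob(lambda x, y, z: y == 1))
p_x1_given_z1 = (prob(lambda x, y, z: x == 1 and z == 1)
                 / prob(lambda x, y, z: z == 1))
p_x1_given_y1_z1 = (prob(lambda x, y, z: x == 1 and y == 1 and z == 1)
                    / prob(lambda x, y, z: y == 1 and z == 1))

print(p_x1, p_x1_given_y1)               # 1/2 1/2  -> marginally independent
print(p_x1_given_z1, p_x1_given_y1_z1)   # 1/2 1    -> dependent given Z
```

Exact fractions are used so the comparison does not depend on floating-point tolerance.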

  21. D-separation
     • Let $A$, $B$, $C$ denote three disjoint sets of nodes; $A$ is d-separated from $B$ by $C$ iff $\mathbf{A} \perp \mathbf{B} \mid \mathbf{C}$
     • $A$ is d-separated from $B$ by $C$ if all undirected paths between $A$ and $B$ are blocked by $C$

  22. Undirected path blocking
     A path between $X \in A$ and $Y \in B$ is blocked at a node $Z$ in each of these cases:
     • Head-to-tail at a node $Z \in C$: $X \to Z \to Y$
     • Tail-to-tail at a node $Z \in C$: $X \leftarrow Z \to Y$
     • Head-to-head (i.e., v-structure) at a node $Z$, where $Z \notin C$ and none of its descendants are in $C$: $X \to Z \leftarrow Y$

  23. Undirected path blocking
     $A \perp B \mid C$ holds if, in every trail (undirected path) between $A$ and $B$:
     • a node in the path is in $C$ and the path does not meet head-to-head at that node, or
     • there is a head-to-head node in the path, and neither the node nor any of its descendants is in $C$
     [Figure: node sets $A$, $C$, $B$ with connecting trails]

  24. D-separation: active trail view
     • Definition: $X$ and $Y$ are d-separated in $G$ given $Z$ if there is no active trail in $G$ between $X$ and $Y$ given $Z$
     • A trail between $X$ and $Y$ is active if:
       • for any v-structure node $U$ in the trail $X \dots \to U \leftarrow \dots Y$, either $U$ or one of its descendants is in $Z$
       • the other nodes in the trail are not in $Z$

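  25. D-separation: example
     Nodes: Intelligence (I), Difficulty (D), Grade (G), Rank (R), Letter (L). [Figure not preserved.]
     Reading the graph as $I \to G \leftarrow D$, $I \to R$, $G \to L$ (Rank taking the place of SAT in the student example), the blocking rules give:
     • $R \perp G \mid I$: holds (the trail $R \leftarrow I \to G$ is blocked at $I$)
     • $R \perp D \mid I$: holds
     • $R \perp D \mid G$: does not hold (conditioning on the head-to-head node $G$ activates $R \leftarrow I \to G \leftarrow D$)
     • $R \perp D \mid L$: does not hold ($L$ is a descendant of the head-to-head node $G$)
     • $R \perp L \mid G$: holds
     • $D \perp L \mid G$: holds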
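The six statements in this example can be checked mechanically. The sketch below is one way to test d-separation, not necessarily the one used in the course: it applies the standard equivalent criterion of restricting the graph to the ancestors of the queried nodes, moralizing it (linking co-parents and dropping directions), deleting the conditioning set, and testing reachability. The edge set is an assumption, since the figure is not preserved: $I \to G \leftarrow D$, $I \to R$, $G \to L$.

```python
from itertools import combinations

def d_separated(parents, xs, ys, zs):
    """True if node sets xs and ys are d-separated given zs.

    parents maps each node to the set of its parents; xs, ys, zs are disjoint sets.
    """
    # 1. Keep only the queried nodes and their ancestors.
    relevant, stack = set(), list(xs | ys | zs)
    while stack:
        node = stack.pop()
        if node not in relevant:
            relevant.add(node)
            stack.extend(parents.get(node, ()))
    # 2. Moralize: undirected parent-child links plus links between co-parents.
    nbrs = {v: set() for v in relevant}
    for child in relevant:
        ps = list(parents.get(child, ()))
        for p in ps:
            nbrs[child].add(p)
            nbrs[p].add(child)
        for p, q in combinations(ps, 2):
            nbrs[p].add(q)
            nbrs[q].add(p)
    # 3. Search from xs without passing through zs; reaching ys means not separated.
    seen, stack = set(xs), list(xs)
    while stack:
        node = stack.pop()
        if node in ys:
            return False
        for nb in nbrs[node]:
            if nb not in zs and nb not in seen:
                seen.add(nb)
                stack.append(nb)
    return True

# ASSUMED structure of the example graph (figure not preserved in the transcript).
student = {"I": set(), "D": set(), "G": {"I", "D"}, "R": {"I"}, "L": {"G"}}

queries = [("R", "G", "I"), ("R", "D", "I"), ("R", "D", "G"),
           ("R", "D", "L"), ("R", "L", "G"), ("D", "L", "G")]
results = {q: d_separated(student, {q[0]}, {q[1]}, {q[2]}) for q in queries}
for (x, y, z), sep in results.items():
    print(f"{x} d-sep {y} | {z}: {sep}")
```

Under the assumed edges, the two queries that condition on $G$ or $L$ while asking about $R$ and $D$ come out as not separated, which is the explaining-away pattern at the head-to-head node $G$.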

  26. Markov blanket in a Bayesian network
     • A variable is conditionally independent of all other variables given its Markov blanket
     • Markov blanket of a node:
       • All parents
       • Children
       • Co-parents of children
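The three-part definition translates directly into a set computation. The sketch below is illustrative; the `parents` dictionary format and the use of the student network from the earlier slide ($I \to G \leftarrow D$, $I \to S$, $G \to L$) are choices made here for the example.

```python
def markov_blanket(parents, node):
    """Parents, children, and co-parents of children of `node`.

    parents maps each node to the set of its parents.
    """
    children = {c for c, ps in parents.items() if node in ps}
    co_parents = {p for c in children for p in parents[c]}
    return (set(parents[node]) | children | co_parents) - {node}

# Student network structure as a parent map (assumed edge directions).
student = {"I": set(), "D": set(), "G": {"I", "D"}, "S": {"I"}, "L": {"G"}}

print(sorted(markov_blanket(student, "I")))  # ['D', 'G', 'S']
print(sorted(markov_blanket(student, "G")))  # ['D', 'I', 'L']
```

Note that Difficulty enters the blanket of Intelligence only as a co-parent of Grade, even though there is no edge between the two.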

  27. D-separation: soundness & completeness
     • Soundness: any conditional independence property that we can derive from $G$ should hold for every probability distribution that factorizes over $G$
       • Theorem: if $P$ factorizes over $G$ and d-sep$_G(\mathbf{X}; \mathbf{Y} \mid \mathbf{Z})$, then $P$ satisfies $\mathbf{X} \perp \mathbf{Y} \mid \mathbf{Z}$
     • Weak completeness:
       • For almost all distributions $P$ that factorize over $G$, if $\mathbf{X} \perp \mathbf{Y} \mid \mathbf{Z}$ holds in $P$, then $\mathbf{X}$ and $\mathbf{Y}$ are d-separated given $\mathbf{Z}$ in the graph $G$
       • There can be independencies in $P$ that are not found by the conditional independence properties of $G$
