Reasoning with Graphical Models Slides Set 2: Rina Dechter

SLIDE 1

Reasoning with Graphical Models

Slides Set 2:

Rina Dechter

Reading: Darwiche, chapter 4; Pearl, chapter 3

SLIDE 2

Outline

- Basics of probability theory
- DAGs, Markov(G), Bayesian networks
- Graphoids: axioms for inferring conditional independence (CI)
- D-separation: Inferring CIs in graphs

SLIDE 3

Outline

- Basics of probability theory
- DAGs, Markov(G), Bayesian networks
- Graphoids: axioms for inferring conditional independence (CI)
- Capturing CIs by graphs
- D-separation: Inferring CIs in graphs

SLIDE 4

Outline

- Basics of probability theory
- DAGs, Markov(G), Bayesian networks
- Graphoids: axioms for inferring conditional independence (CI)
- D-separation: Inferring CIs in graphs

(Darwiche, chapter 4)

SLIDE 5

Bayesian Networks: Representation

Variables: Smoking (S), lung Cancer (C), Bronchitis (B), X-ray (X), Dyspnoea (D)

P(S, C, B, X, D) = P(S) P(C|S) P(B|S) P(X|C,S) P(D|C,B)

Conditional independencies yield an efficient representation: BN = (G, Θ)

CPD for P(D|C,B):

C  B | D=0  D=1
0  0 | 0.1  0.9
0  1 | 0.7  0.3
1  0 | 0.8  0.2
1  1 | 0.9  0.1
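To make the factorization concrete, here is a minimal Python sketch that assembles the joint from the five CPTs. Only the P(D|C,B) table above comes from the slide; the other CPT values are hypothetical placeholders.

from itertools import product

# Hypothetical CPTs for the network S -> C, S -> B, (C,S) -> X, (C,B) -> D.
# Only p_D_given_CB matches the slide; the rest are made-up placeholders.
p_S = {0: 0.7, 1: 0.3}
p_C_given_S = {0: {0: 0.95, 1: 0.05}, 1: {0: 0.8, 1: 0.2}}
p_B_given_S = {0: {0: 0.9, 1: 0.1}, 1: {0: 0.6, 1: 0.4}}
p_X_given_CS = {(c, s): {0: 0.9 - 0.7 * c, 1: 0.1 + 0.7 * c}
                for c, s in product((0, 1), repeat=2)}
p_D_given_CB = {(0, 0): {0: 0.1, 1: 0.9}, (0, 1): {0: 0.7, 1: 0.3},
                (1, 0): {0: 0.8, 1: 0.2}, (1, 1): {0: 0.9, 1: 0.1}}

def joint(s, c, b, x, d):
    """P(S,C,B,X,D) as the product of the five local CPTs."""
    return (p_S[s] * p_C_given_S[s][c] * p_B_given_S[s][b]
            * p_X_given_CS[(c, s)][x] * p_D_given_CB[(c, b)][d])

# Sanity check: the 32 joint entries sum to 1.
assert abs(sum(joint(*v) for v in product((0, 1), repeat=5)) - 1.0) < 1e-9

The full joint table has 2^5 = 32 entries, but the factored form is specified by only 1 + 2 + 2 + 4 + 4 = 13 independent parameters, which is the efficiency the slide points to.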

SLIDE 6

The causal interpretation

SLIDE 7

SLIDE 8

SLIDE 9

SLIDE 10

Graphs convey a set of independence statements

- Undirected graphs: by graph separation
- Directed graphs: by the graph's d-separation
- Goal: capture probabilistic conditional independence by graphs

SLIDE 11

SLIDE 12

SLIDE 13

SLIDE 14

SLIDE 15

SLIDE 16

SLIDE 17

SLIDE 18

Use GeNIe/SMILE to create this network.

SLIDE 19

Outline

- Basics of probability theory
- DAGs, Markov(G), Bayesian networks
- Graphoids: axioms for inferring conditional independence (CI)
- D-separation: Inferring CIs in graphs

(Darwiche, chapter 4; Pearl, chapter 3)

SLIDE 20

R and C are independent given A

This independence follows from the Markov assumption

SLIDE 21

Properties of Probabilistic Independence

- Symmetry: I(X,Z,Y) ⇒ I(Y,Z,X)
- Decomposition: I(X,Z,YW) ⇒ I(X,Z,Y) and I(X,Z,W)
- Weak union: I(X,Z,YW) ⇒ I(X,ZW,Y)
- Contraction: I(X,Z,Y) and I(X,ZY,W) ⇒ I(X,Z,YW)
- Intersection: I(X,ZY,W) and I(X,ZW,Y) ⇒ I(X,Z,YW)
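To connect the axioms to actual distributions, here is a small brute-force Python sketch (an illustration, not part of the slides): it builds a joint over binary X, Y, W, Z with made-up numbers so that I(X,Z,YW) holds by construction, tests conditional independence by checking P(x,y,z)P(z) = P(x,z)P(y,z) everywhere, and confirms that decomposition and weak union then hold, as the axioms promise.

from collections import defaultdict
from itertools import product
import random

# Joint over binary (X, Y, W, Z), indices (0, 1, 2, 3), built so that
# I(X,Z,YW) holds: P(x,y,w,z) = P(z) P(x|z) P(y,w|z).
random.seed(0)
norm = lambda d: {k: v / sum(d.values()) for k, v in d.items()}
p_z = {0: 0.4, 1: 0.6}
p_x_z = {z: norm({0: random.random(), 1: random.random()}) for z in (0, 1)}
p_yw_z = {z: norm({yw: random.random() for yw in product((0, 1), repeat=2)})
          for z in (0, 1)}
P = {(x, y, w, z): p_z[z] * p_x_z[z][x] * p_yw_z[z][(y, w)]
     for x, y, w, z in product((0, 1), repeat=4)}

def ci(P, X, Z, Y):
    """Brute-force test of I(X,Z,Y): P(x,y,z) P(z) == P(x,z) P(y,z) everywhere."""
    def marg(idxs):
        m = defaultdict(float)
        for a, p in P.items():
            m[tuple(a[i] for i in idxs)] += p
        return m
    pxyz, pxz, pyz, pz = marg(X + Y + Z), marg(X + Z), marg(Y + Z), marg(Z)
    proj = lambda a, idxs: tuple(a[i] for i in idxs)
    return all(abs(pxyz[proj(a, X + Y + Z)] * pz[proj(a, Z)]
                   - pxz[proj(a, X + Z)] * pyz[proj(a, Y + Z)]) < 1e-12
               for a in P)

X, Y, W, Z = (0,), (1,), (2,), (3,)
assert ci(P, X, Z, Y + W)                  # I(X,Z,YW): holds by construction
assert ci(P, X, Z, Y) and ci(P, X, Z, W)   # decomposition
assert ci(P, X, Z + W, Y)                  # weak union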

SLIDE 22

SLIDE 23

Pearl's language: If two combined pieces of information are irrelevant to X, then each one separately is irrelevant to X.

SLIDE 24

Example: Two coins (C1, C2) and a bell (B)
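Spelled out in code, as a hedged sketch (the ringing rule below is the standard version of this example: two fair, independently tossed coins, and the bell rings exactly when they agree), the point is that C1 and C2 are marginally independent but become dependent once B is observed.

from itertools import product

# Two fair coins tossed independently; the bell rings (B=1) iff they agree.
P = {(c1, c2, 1 if c1 == c2 else 0): 0.25 for c1, c2 in product((0, 1), repeat=2)}

def p(pred):
    """Probability of the event described by pred(c1, c2, b)."""
    return sum(pr for v, pr in P.items() if pred(*v))

# Marginally, C1 and C2 are independent: 0.25 == 0.5 * 0.5.
assert p(lambda c1, c2, b: c1 == 1 and c2 == 1) == \
       p(lambda c1, c2, b: c1 == 1) * p(lambda c1, c2, b: c2 == 1)

# Given B=1 they are fully dependent: P(C1=1, C2=1 | B=1) = 0.5,
# while P(C1=1 | B=1) * P(C2=1 | B=1) = 0.5 * 0.5 = 0.25.
p_b = p(lambda c1, c2, b: b == 1)
assert p(lambda c1, c2, b: c1 == 1 and c2 == 1 and b == 1) / p_b == 0.5
assert (p(lambda c1, c2, b: c1 == 1 and b == 1) / p_b) * \
       (p(lambda c1, c2, b: c2 == 1 and b == 1) / p_b) == 0.25

The same distribution also shows composition failing: B is independent of C1 alone and of C2 alone, yet not of the pair (C1, C2), which jointly determines it.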

SLIDE 25

SLIDE 26

SLIDE 27

SLIDE 28

SLIDE 29

When there are no constraints

SLIDE 30

SLIDE 31

SLIDE 32

Properties of Probabilistic Independence

- Symmetry: I(X,Z,Y) ⇒ I(Y,Z,X)
- Decomposition: I(X,Z,YW) ⇒ I(X,Z,Y) and I(X,Z,W)
- Weak union: I(X,Z,YW) ⇒ I(X,ZW,Y)
- Contraction: I(X,Z,Y) and I(X,ZY,W) ⇒ I(X,Z,YW)
- Intersection: I(X,ZY,W) and I(X,ZW,Y) ⇒ I(X,Z,YW)

Graphoid axioms: symmetry, decomposition, weak union, and contraction. Positive graphoid: + intersection. In Pearl: the 5 axioms are called graphoids; the 4, semi-graphoids.

SLIDE 33

Outline

- Basics of probability theory
- DAGs, Markov(G), Bayesian networks
- Graphoids: axioms for inferring conditional independence (CI)
- D-separation: Inferring CIs in graphs
  - I-maps, D-maps, perfect maps
  - Markov boundary and blanket
  - Markov networks

SLIDE 34

What do we know so far about BNs?

- The probability distribution of a Bayesian network with directed graph G satisfies all the Markov independence assumptions.
- The 5 graphoid (or positive graphoid) axioms allow inferring more conditional independence relationships for the BN.
- D-separation in G allows deducing easily many of the inferred independencies.
- G with d-separation yields an I-map of the probability distribution.

SLIDE 35

SLIDE 36

d-Separation

To test whether X and Y are d-separated by Z in DAG G, we consider every path between a node in X and a node in Y, and check that each path is blocked by Z.

A path is blocked by Z if at least one valve (node) on the path is 'closed' given Z.

- A divergent valve or a sequential valve is closed if it is in Z.
- A convergent valve is closed if neither it nor any of its descendants is in Z.
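The definition translates almost line-by-line into code. Below is a hedged Python sketch, not the course's code: a DAG is a dict mapping each node to its list of parents, and the example network (E -> R, E -> A, B -> A, A -> C) is an assumed five-node structure in the spirit of the earlier slides.

def descendants(dag, v):
    """All nodes reachable from v along directed edges."""
    children = {u: [w for w in dag if u in dag[w]] for u in dag}
    out, stack = set(), [v]
    while stack:
        for w in children[stack.pop()]:
            if w not in out:
                out.add(w)
                stack.append(w)
    return out

def blocked(dag, path, Z):
    """Is this undirected path blocked by Z? Inspect every interior valve."""
    for i in range(1, len(path) - 1):
        a, v, b = path[i - 1], path[i], path[i + 1]
        if a in dag[v] and b in dag[v]:              # convergent: a -> v <- b
            if v not in Z and not (descendants(dag, v) & Z):
                return True                          # closed convergent valve
        elif v in Z:                                 # sequential or divergent
            return True                              # closed: valve is in Z
    return False

def undirected_paths(dag, x, y):
    """All simple paths from x to y, ignoring edge directions."""
    nbrs = {u: set(dag[u]) | {w for w in dag if u in dag[w]} for u in dag}
    def walk(u, seen):
        if u == y:
            yield list(seen)
            return
        for w in nbrs[u] - set(seen):
            seen.append(w)
            yield from walk(w, seen)
            seen.pop()
    yield from walk(x, [x])

def d_separated(dag, X, Y, Z):
    """True iff every path between X and Y is blocked by Z."""
    return all(blocked(dag, p, Z)
               for x in X for y in Y for p in undirected_paths(dag, x, y))

# Hypothetical network: E -> R, E -> A <- B, A -> C.
dag = {"E": [], "B": [], "R": ["E"], "A": ["E", "B"], "C": ["A"]}
assert d_separated(dag, {"R"}, {"C"}, {"A"})      # sequential valve A is closed
assert not d_separated(dag, {"R"}, {"C"}, set())  # path R-E-A-C is active

Enumerating all paths is exponential in general; the ancestral-graph method on a later slide gives a polynomial alternative.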

SLIDE 37

SLIDE 38

SLIDE 39

SLIDE 40

No path is active = every path is blocked.

SLIDE 41

Bayesian Networks as I-maps

- E: Employment
- V: Investment
- H: Health
- W: Wealth
- C: Charitable contributions
- P: Happiness

Are C and V d-separated given E and P? Are C and H d-separated?

SLIDE 42

d-Separation Using the Ancestral Graph

X is d-separated from Y given Z (<X,Z,Y>_d) iff:

1. Take the ancestral graph: the subgraph containing X, Y, Z and all their ancestors.
2. Moralize the obtained subgraph: connect every pair of parents with a common child, then drop edge directions.
3. Apply regular undirected graph separation.

Check: (E,{},V), (E,P,H), (C,EW,P), (C,E,HP)?
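Here is a hedged sketch of the three steps, reusing the parents-dict convention and the hypothetical five-node network from the earlier d-separation sketch; both tests agree on it.

def ancestral(dag, nodes):
    """nodes plus all their ancestors."""
    out, stack = set(nodes), list(nodes)
    while stack:
        for p in dag[stack.pop()]:
            if p not in out:
                out.add(p)
                stack.append(p)
    return out

def d_separated_moral(dag, X, Y, Z):
    # Step 1: restrict to X, Y, Z and their ancestors.
    keep = ancestral(dag, set(X) | set(Y) | set(Z))
    # Step 2: moralize (marry co-parents), then drop edge directions.
    nbrs = {v: set() for v in keep}
    for v in keep:
        ps = [p for p in dag[v] if p in keep]
        for p in ps:
            nbrs[v].add(p)
            nbrs[p].add(v)
        for p in ps:
            for q in ps:
                if p != q:
                    nbrs[p].add(q)
    # Step 3: plain undirected separation (reachability avoiding Z).
    seen, stack = set(Z), [x for x in X if x not in Z]
    while stack:
        v = stack.pop()
        if v in Y:
            return False
        if v in seen:
            continue
        seen.add(v)
        stack.extend(nbrs[v] - seen)
    return True

dag = {"E": [], "B": [], "R": ["E"], "A": ["E", "B"], "C": ["A"]}
assert d_separated_moral(dag, {"R"}, {"C"}, {"A"})
assert not d_separated_moral(dag, {"R"}, {"B"}, {"A"})  # conditioning on the collider A opens the path

Running the four queries from the slide would require the E, V, W, C, P, H structure, which is not legible from the extracted text, so the assertions above use the hypothetical network instead.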

SLIDE 43

Idsep(R,EC,B)?

SLIDE 44

SLIDE 45

SLIDE 46

SLIDE 47

Idsep(C,S,B)=?

SLIDE 48

SLIDE 49

Is S1, conditioned on S2, independent of S3 and S4 in the following Bayesian network?

SLIDE 50

SLIDE 51

Outline

- Basics of probability theory
- DAGs, Markov(G), Bayesian networks
- Graphoids: axioms for inferring conditional independence (CI)
- D-separation: Inferring CIs in graphs
  - Soundness and completeness of d-separation
  - I-maps, D-maps, perfect maps
  - Constructing a minimal I-map of a distribution
  - Markov boundary and blanket

SLIDE 52

SLIDE 53

It is not a D-map.

SLIDE 54

Outline

- Basics of probability theory
- DAGs, Markov(G), Bayesian networks
- Graphoids: axioms for inferring conditional independence (CI)
- D-separation: Inferring CIs in graphs
  - Soundness and completeness of d-separation
  - I-maps, D-maps, perfect maps
  - Constructing a minimal I-map of a distribution
  - Markov boundary and blanket

SLIDE 55

SLIDE 56

SLIDE 57

Outline

- Basics of probability theory
- DAGs, Markov(G), Bayesian networks
- Graphoids: axioms for inferring conditional independence (CI)
- D-separation: Inferring CIs in graphs
  - Soundness and completeness of d-separation
  - I-maps, D-maps, perfect maps
  - Constructing a minimal I-map of a distribution
  - Markov boundary and blanket

SLIDE 58

So how can we construct an I-map of a probability distribution? And a minimal I-map?

SLIDE 59

SLIDE 60

SLIDE 61

SLIDE 62

SLIDE 63

SLIDE 64

Perfect Maps for DAGs

- Theorem 10 [Geiger and Pearl 1988]: For any dag D there exists a P such that D is a perfect map of P relative to d-separation.
- Corollary 7: d-separation identifies any implied independency that follows logically from the set of independencies characterized by its dag.

SLIDE 65

Outline

- Basics of probability theory
- DAGs, Markov(G), Bayesian networks
- Graphoids: axioms for inferring conditional independence (CI)
- D-separation: Inferring CIs in graphs
  - Soundness and completeness of d-separation
  - I-maps, D-maps, perfect maps
  - Constructing a minimal I-map of a distribution
  - Markov boundary and blanket

SLIDE 66

SLIDE 67

Blanket Examples

What is a Markov blanket of C?

SLIDE 68

Blanket Examples

SLIDE 69

Markov Blanket
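The slides show blankets graphically; as a reminder, in a Bayesian network a Markov blanket of a node consists of its parents, its children, and its children's other parents (its spouses). A small sketch in the same parents-dict convention and hypothetical network as the earlier examples:

def markov_blanket(dag, v):
    """Parents, children, and co-parents of v; dag maps node -> parents."""
    parents = set(dag[v])
    children = {u for u in dag if v in dag[u]}
    spouses = {p for u in children for p in dag[u]} - {v}
    return parents | children | spouses

dag = {"E": [], "B": [], "R": ["E"], "A": ["E", "B"], "C": ["A"]}
assert markov_blanket(dag, "E") == {"R", "A", "B"}  # children R, A; co-parent B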

SLIDE 70

Bayesian Networks as Knowledge-Bases

- Given any distribution P and an ordering, we can construct a minimal I-map, as sketched below.
- The conditional probability of each variable x given its parents is all we need.
- In practice we go in the opposite direction: the parents must be identified by a human expert… they can be viewed as direct causes, or direct influences.
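The construction referred to in the first bullet: for each variable in the ordering, choose a minimal set of predecessors that screens it off from the remaining predecessors. A hedged brute-force sketch (minimality here is by subset size, ties broken arbitrarily; the chain and its CPT values are made up):

from collections import defaultdict
from itertools import combinations, product

def ci(P, X, Z, Y):
    """Brute-force I(X,Z,Y) on a joint P keyed by full assignments."""
    def marg(idxs):
        m = defaultdict(float)
        for a, p in P.items():
            m[tuple(a[i] for i in idxs)] += p
        return m
    pxyz, pxz, pyz, pz = marg(X + Y + Z), marg(X + Z), marg(Y + Z), marg(Z)
    proj = lambda a, idxs: tuple(a[i] for i in idxs)
    return all(abs(pxyz[proj(a, X + Y + Z)] * pz[proj(a, Z)]
                   - pxz[proj(a, X + Z)] * pyz[proj(a, Y + Z)]) < 1e-12
               for a in P)

def minimal_imap(P, order):
    """For each v, parents = a smallest predecessor set S with I(v, S, rest)."""
    parents = {}
    for i, v in enumerate(order):
        preds = order[:i]
        for k in range(len(preds) + 1):
            S = next((S for S in combinations(preds, k)
                      if ci(P, (v,), S, tuple(u for u in preds if u not in S))),
                     None)
            if S is not None:
                parents[v] = set(S)
                break
    return parents

# Markov chain X0 -> X1 -> X2 with made-up CPTs; run with the natural
# ordering, the construction recovers exactly the chain's edges.
P = {}
for x0, x1, x2 in product((0, 1), repeat=3):
    P[(x0, x1, x2)] = ((0.6 if x0 == 0 else 0.4)
                       * (0.9 if x1 == x0 else 0.1)
                       * (0.8 if x2 == x1 else 0.2))
assert minimal_imap(P, (0, 1, 2)) == {0: set(), 1: {0}, 2: {1}}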

SLIDE 71

SLIDE 72

Pearl, Corollary 4

Corollary 4: Given a dag G and a probability distribution P, a necessary and sufficient condition for G to be a Bayesian network of P is that all the Markovian assumptions are satisfied.

SLIDE 73

SLIDE 74

Markov Networks and Markov Random Fields (MRF)

Can we also capture conditional independence with undirected graphs? Yes: using simple graph separation.
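Graph separation is just reachability after removing Z. A minimal sketch (the 4-cycle example is made up):

def separated(nbrs, X, Y, Z):
    """True iff every path from X to Y passes through Z (undirected graph)."""
    seen, stack = set(Z), [x for x in X if x not in Z]
    while stack:
        v = stack.pop()
        if v in Y:
            return False
        if v in seen:
            continue
        seen.add(v)
        stack.extend(nbrs[v] - seen)
    return True

# A 4-cycle 1-2-3-4: opposite corners are separated only by both neighbors.
nbrs = {1: {2, 4}, 2: {1, 3}, 3: {2, 4}, 4: {1, 3}}
assert separated(nbrs, {1}, {3}, {2, 4})
assert not separated(nbrs, {1}, {3}, {2})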

SLIDE 75

Undirected Graphs as I-maps of Distributions

SLIDE 76

Graphoids vs. Undirected Graphs

Graphoids (conditional independence):

- Symmetry: I(X,Z,Y) ⇒ I(Y,Z,X)
- Decomposition: I(X,Z,YW) ⇒ I(X,Z,Y) and I(X,Z,W)
- Weak union: I(X,Z,YW) ⇒ I(X,ZW,Y)
- Contraction: I(X,Z,Y) and I(X,ZY,W) ⇒ I(X,Z,YW)
- Intersection: I(X,ZY,W) and I(X,ZW,Y) ⇒ I(X,Z,YW)

Separation in graphs:

- Symmetry: I(X,Z,Y) ⇒ I(Y,Z,X)
- Decomposition: I(X,Z,YW) ⇒ I(X,Z,Y) and I(X,Z,W)
- Intersection: I(X,ZW,Y) and I(X,ZY,W) ⇒ I(X,Z,YW)
- Strong union: I(X,Z,Y) ⇒ I(X,ZW,Y)
- Transitivity: I(X,Z,Y) ⇒ there exists t s.t. I(X,Z,t) or I(t,Z,Y)

See Pearl's book.

SLIDE 77

Markov Networks

An undirected graph G that is a minimal I-map of a probability distribution P, namely one where deleting any edge destroys its I-mapness relative to (undirected) separation, is called a Markov network of P.

SLIDE 78

SLIDE 79

The unusual edge (3,4) reflects the reasoning that if we fix the arrival time (5), the travel time (4) must depend on the current time (3).

SLIDE 80

How can we construct a probability distribution that will have all these independencies?
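One standard answer (the Hammersley-Clifford / Gibbs construction, sketched here with made-up potential values): take one positive potential per clique of the graph, multiply, and normalize. For the 4-cycle, conditioning on {X2, X4} then separates X1 from X3, exactly as graph separation demands.

from itertools import product

# Gibbs distribution on the cycle X1-X2-X3-X4-X1: one hypothetical positive
# potential per edge (the cliques), normalized into a probability distribution.
phi = lambda a, b: 2.0 if a == b else 0.5
score = lambda x1, x2, x3, x4: (phi(x1, x2) * phi(x2, x3)
                                * phi(x3, x4) * phi(x4, x1))
Znorm = sum(score(*v) for v in product((0, 1), repeat=4))
P = {v: score(*v) / Znorm for v in product((0, 1), repeat=4)}

# Check I(X1, {X2,X4}, X3): P(x1,x3|x2,x4) = P(x1|x2,x4) P(x3|x2,x4).
def cond(x1, x3, x2, x4):
    return P[(x1, x2, x3, x4)] / sum(P[(a, x2, b, x4)]
                                     for a in (0, 1) for b in (0, 1))
for x1, x2, x3, x4 in product((0, 1), repeat=4):
    m1 = sum(cond(x1, b, x2, x4) for b in (0, 1))
    m3 = sum(cond(a, x3, x2, x4) for a in (0, 1))
    assert abs(cond(x1, x3, x2, x4) - m1 * m3) < 1e-12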

SLIDE 81

So, how do we learn Markov networks from data?

Markov Random Field (MRF)

SLIDE 82

Markov Networks

SLIDE 83

Sample Applications for Graphical Models
