
Causal Semantics of Bayesian Networks

Jirka Vomlel

Institute of Information Theory and Automation Academy of Sciences of the Czech Republic http://www.utia.cz/vomlel

Salzburg, 26 February 2010

J. Vomlel (ÚTIA AV ČR), Causal Semantics of Bayesian Networks, 26/Feb/2010


Preface (J. Pearl, Causality, 2000)

"Development of Western science is based on two great achievements: the invention of the formal logical system by the Greek philosophers, and the discovery of the possibility to find out causal relationship by systematic experiment during Renaissance."

Albert Einstein (1953)


Outline

Bayesian networks
Observations versus interventions
Causal semantics of Bayesian networks
Latent variables in causal models
Tractable causal models with latent variables


Bayesian network (BN)

See file fever1.net in Hugin.

Acyclic directed graph $G = (V, E)$, where $V \subset \{1, 2, \ldots, n\}$ is a set of nodes and $E$ is a set of directed edges; formally, $E$ is a subset of $V \times V$.

$X_i$, $i \in V$, are discrete random variables.

Let $\mathrm{pa}(i)$ denote the set of nodes that are parents of $i$ in $G$; formally, $\mathrm{pa}(i) = \{j \in V : (j \to i) \in E\}$.

Further, for $A \subseteq V$, let $X_A$ denote the set of variables $\{X_j\}_{j \in A}$.

For each random variable $X_i$, $i \in V$, a conditional probability distribution $P(X_i \mid X_{\mathrm{pa}(i)})$ is defined.

Then the joint probability distribution defined by a Bayesian network is

$$P(X_V) = \prod_{i \in V} P(X_i \mid X_{\mathrm{pa}(i)}).$$
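The factorization can be sketched in a few lines of Python. The three-node network, its binary states, and all numbers below are illustrative assumptions (they are not the contents of fever1.net); the point is only that the joint probability is the product of one conditional table entry per node.

```python
from itertools import product

# Hypothetical network 1 -> 2, 1 -> 3, 2 -> 3 over binary variables.
parents = {1: (), 2: (1,), 3: (1, 2)}

# cpt[i][parent_values][value] = P(X_i = value | X_pa(i) = parent_values).
# All numbers are made up for illustration.
cpt = {
    1: {(): {0: 0.6, 1: 0.4}},
    2: {(0,): {0: 0.7, 1: 0.3}, (1,): {0: 0.2, 1: 0.8}},
    3: {
        (0, 0): {0: 0.9, 1: 0.1},
        (0, 1): {0: 0.5, 1: 0.5},
        (1, 0): {0: 0.4, 1: 0.6},
        (1, 1): {0: 0.1, 1: 0.9},
    },
}


def joint(assignment):
    """P(X_V = assignment) as the product of P(X_i | X_pa(i)) over all i in V."""
    p = 1.0
    for i, pa in parents.items():
        pa_values = tuple(assignment[j] for j in pa)
        p *= cpt[i][pa_values][assignment[i]]
    return p


# Summing the joint over all configurations should give 1 (up to rounding).
total = sum(
    joint(dict(zip(parents, values)))
    for values in product((0, 1), repeat=len(parents))
)
print(total)
```

Storing one table per node rather than the full joint is the design choice that makes the representation compact: the tables grow with the number of parents, not with the number of variables.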


Conditional independence

Let $C \subset V$ denote the set of indices of the variables whose state is known.

Definition (Path blocked by evidence)

A path in an acyclic directed graph is blocked by a set $C$ if there is a node $n \in V$ on the path such that
the arrows do not meet head-to-head at $n$ and $n \in C$, or
the arrows meet head-to-head at $n$ and neither $n$ nor any of its descendants belongs to $C$.

Definition (Conditional independence)

Let $A, B, C$ be pairwise disjoint subsets of $V$. $X_A$ is independent of $X_B$ given $X_C$, written $X_A \perp\!\!\!\perp X_B \mid X_C$, iff all paths between $A$ and $B$ are blocked by the set $C$.

See file fever1.net in Hugin again.
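The two definitions translate almost literally into code. The sketch below assumes the graph is given as a set of (parent, child) pairs and enumerates every path explicitly, so it is only suitable for small graphs; the function names and the two toy graphs are illustrative and not part of the talk.

```python
def descendants(children, n):
    """All nodes reachable from n along directed edges."""
    seen, stack = set(), [n]
    while stack:
        for c in children[stack.pop()]:
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen


def simple_paths(neighbours, current, goal, path):
    """Yield every path from current to goal that visits no node twice (directions ignored)."""
    if current == goal:
        yield path
        return
    for n in neighbours[current]:
        if n not in path:
            yield from simple_paths(neighbours, n, goal, path + [n])


def is_blocked(path, edges, children, C):
    """Check the two blocking conditions at every inner node of the path."""
    for prev, n, nxt in zip(path, path[1:], path[2:]):
        head_to_head = (prev, n) in edges and (nxt, n) in edges
        if not head_to_head and n in C:
            return True
        if head_to_head and n not in C and not (descendants(children, n) & C):
            return True
    return False


def independent(edges, A, B, C):
    """True iff every path between A and B is blocked by C."""
    nodes = {v for e in edges for v in e} | A | B | C
    children = {v: set() for v in nodes}
    neighbours = {v: set() for v in nodes}
    for u, v in edges:
        children[u].add(v)
        neighbours[u].add(v)
        neighbours[v].add(u)
    return all(
        is_blocked(p, edges, children, C)
        for a in A
        for b in B
        for p in simple_paths(neighbours, a, b, [a])
    )


fork = {(1, 2), (1, 3)}              # 2 <- 1 -> 3
collider = {(1, 3), (2, 3), (3, 4)}  # 1 -> 3 <- 2, and 3 -> 4

print(independent(fork, {2}, {3}, set()))      # False: the fork is open
print(independent(fork, {2}, {3}, {1}))        # True: observing 1 blocks it
print(independent(collider, {1}, {2}, set()))  # True: the collider blocks
print(independent(collider, {1}, {2}, {4}))    # False: a descendant of 3 is observed
```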


BN example: Removing Kidney stones (Charig et al., 1986)

Let $V = \{X_1, X_2, X_3\}$ where
$X_1$ is Stones' size, taking values s (small) and l (large),
$X_2$ is Treatment, taking values A (all open procedures) and B (percutaneous nephrolithotomy),
$X_3$ is Success, taking values 0 (False) and 1 (True).

$$P(X_1, X_2, X_3) = P(X_3 \mid X_1, X_2) \cdot P(X_2 \mid X_1) \cdot P(X_1)$$


Conditioning by observation

Assume that we have observed that a person with kidney stones is going to undergo treatment A. What is the probability of success?

$$P(X_3 = 1 \mid X_2 = A) = \frac{P(X_3 = 1, X_2 = A)}{P(X_2 = A)}$$

where

$$P(X_2 = A) = \sum_{x_3 \in \{0,1\}} P(X_3 = x_3, X_2 = A)$$

$$P(X_3 = x_3, X_2 = A) = \sum_{x_1 \in \{s,l\}} P(X_3 = x_3, X_2 = A, X_1 = x_1)$$

and, using the definition of the Bayesian network,

$$P(X_3 = x_3, X_2 = A, X_1 = x_1) = P(X_3 = x_3 \mid X_2 = A, X_1 = x_1) \cdot P(X_2 = A \mid X_1 = x_1) \cdot P(X_1 = x_1)$$

See file kidney stones.net in Hugin.
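The computation can be followed numerically. The slides do not list the data, so the sketch below assumes the success counts commonly quoted for the Charig et al. (1986) study; treat them as an assumption rather than as the contents of the Hugin file. All function names are illustrative.

```python
# counts[(x1, x2)] = (successes, patients); figures commonly quoted for
# Charig et al. (1986). They are an assumption here, not given in the slides.
counts = {
    ("s", "A"): (81, 87),
    ("l", "A"): (192, 263),
    ("s", "B"): (234, 270),
    ("l", "B"): (55, 80),
}
total = sum(n for _, n in counts.values())


def patients_with_size(x1):
    return sum(n for (size, _), (_, n) in counts.items() if size == x1)


def p_x1(x1):
    """P(X1 = x1)."""
    return patients_with_size(x1) / total


def p_x2_given_x1(x2, x1):
    """P(X2 = x2 | X1 = x1)."""
    return counts[(x1, x2)][1] / patients_with_size(x1)


def p_x3_given_x1_x2(x3, x1, x2):
    """P(X3 = x3 | X1 = x1, X2 = x2)."""
    successes, n = counts[(x1, x2)]
    return successes / n if x3 == 1 else 1 - successes / n


def joint(x1, x2, x3):
    """Factorization P(X3 | X1, X2) * P(X2 | X1) * P(X1)."""
    return p_x3_given_x1_x2(x3, x1, x2) * p_x2_given_x1(x2, x1) * p_x1(x1)


def p_success_given_treatment(x2):
    """P(X3 = 1 | X2 = x2), marginalizing X1 out of numerator and denominator."""
    numerator = sum(joint(x1, x2, 1) for x1 in ("s", "l"))
    denominator = sum(joint(x1, x2, x3) for x1 in ("s", "l") for x3 in (0, 1))
    return numerator / denominator


print(p_success_given_treatment("A"))  # about 0.78
print(p_success_given_treatment("B"))  # about 0.83
```

With these counts, observing treatment B goes together with a higher success rate than observing treatment A.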


Conditioning by intervention

Now we want to evaluate which treatment, A or B, is generally better. Shall we check whether $P(X_3 = 1 \mid X_2 = A) > P(X_3 = 1 \mid X_2 = B)$? No! The model contains a covariate $X_1$ that influences $X_2$ and also has an impact on $X_3$, i.e., there is an open path $X_2 \leftarrow X_1 \to X_3$.


When we use a treatment, we intervene in the world: we manipulate the value of the corresponding variable. This breaks the causal mechanism of that variable, so the manipulated variable should be disconnected from all variables that causally influence it.

See file simpsons paradox kidney stones.net in Hugin.
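Disconnecting the manipulated variable can be pictured as an operation on the edge set. A minimal sketch, assuming the graph is stored as a set of (parent, child) pairs; the function name `intervene` and the node labels are illustrative and are not taken from the Hugin file:

```python
def intervene(edges, target):
    """Return the edges that remain after an intervention on `target`.

    Every edge pointing into the manipulated node is removed; edges leaving
    it are kept, so its effects on other variables are preserved.
    """
    return {(u, v) for (u, v) in edges if v != target}


# Kidney stones graph: X1 -> X2, X1 -> X3, X2 -> X3.
edges = {("X1", "X2"), ("X1", "X3"), ("X2", "X3")}
print(sorted(intervene(edges, "X2")))  # [('X1', 'X3'), ('X2', 'X3')]
```

After the intervention on X2 the path X2 ← X1 → X3 no longer exists, which is why the comparison of treatments is no longer confounded by stone size.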


Causal Bayesian networks

Each causal variable $X_i$ is governed by a causal mechanism that stochastically determines its value based on the values of its parents.

The causal mechanism takes the same form as the conditional probability distribution $P(X_i \mid X_{\mathrm{pa}(i)})$.

Causality flows in the direction of the edges.

An intervention on a variable, $X_i \leftarrow x_i^*$, replaces its causal mechanism with one that dictates that $X_i$ takes the value $x_i^*$.

Definition

The probability after the intervention $X_A \leftarrow x_A^*$ is defined as

$$P(X_B = x_B \mid X_A \leftarrow x_A^*) = \prod_{i \in B} P(X_i = x_i \mid X_{\mathrm{pa}(i)} = \bar{x}_{\mathrm{pa}(i)})$$

where $B = V \setminus A$ and, for $C \subseteq V$, the symbol $\bar{x}_C$ denotes the vector $x_C = (x_i)_{i \in C}$ with the values $x_i$ for $i \in A \cap C$ replaced by $x_i^*$.


Probability after intervention - example

$$P(X_1 = x_1, X_3 = x_3 \mid X_2 \leftarrow x_2^*) = P(X_3 = x_3 \mid X_1 = x_1, X_2 = x_2^*) \cdot P(X_1 = x_1)$$
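Summing this expression over $x_1$ gives the probability of success under an intervention on the treatment. The sketch below again assumes the counts commonly quoted for the Charig et al. (1986) study, which the slides themselves do not list; the function names are illustrative.

```python
# counts[(x1, x2)] = (successes, patients); commonly quoted figures for
# Charig et al. (1986), assumed here because the slides do not give them.
counts = {
    ("s", "A"): (81, 87),
    ("l", "A"): (192, 263),
    ("s", "B"): (234, 270),
    ("l", "B"): (55, 80),
}
total = sum(n for _, n in counts.values())


def p_x1(x1):
    """P(X1 = x1): the stone-size mechanism is untouched by the intervention."""
    return sum(n for (size, _), (_, n) in counts.items() if size == x1) / total


def p_success_given_x1_x2(x1, x2):
    """P(X3 = 1 | X1 = x1, X2 = x2)."""
    successes, n = counts[(x1, x2)]
    return successes / n


def p_success_after_intervention(x2_star):
    """P(X3 = 1 | X2 <- x2_star): the factor P(X2 | X1) is dropped, X1 is summed out."""
    return sum(p_success_given_x1_x2(x1, x2_star) * p_x1(x1) for x1 in ("s", "l"))


print(p_success_after_intervention("A"))  # about 0.83
print(p_success_after_intervention("B"))  # about 0.78
```

With these counts the ordering of the treatments under intervention is the reverse of the ordering obtained by conditioning on the observed treatment (about 0.78 for A and 0.83 for B), which is the Simpson's paradox that the Hugin file name refers to.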


Observed covariates and latent variables in causal models

Definition (General Problem)

Assume the variables are partitioned as $\{t\} \cup R \cup C \cup U = V$ into
a treatment variable $X_t$,
a set of response variables $X_R$,
a set of observed covariates $X_C$, and
a set of unobserved (latent) variables $X_U$.

The probability distribution $P(X_t, X_R, X_C)$ is known.

$P(X_t, X_R, X_C, X_U)$ is unknown, but its underlying structure is given by an acyclic directed graph $G = (V, E)$.

Can $P(X_R = x_R \mid X_t \leftarrow x_t^*)$ be determined from this information?


Identifiability of causal effects

Definition

$X_C$ identifies the causal effect of $X_t$ on $X_R$ if for any pair of causal Bayesian networks $P_1, P_2$ with graph $G$ it holds that

$$P_1(X_t = x_t, X_R = x_R, X_C = x_C) \equiv P_2(X_t = x_t, X_R = x_R, X_C = x_C) \implies P_1(X_R = x_R \mid X_t \leftarrow x_t^*) \equiv P_2(X_R = x_R \mid X_t \leftarrow x_t^*).$$


Again, assume $X_t$ = Treatment and $X_R$ = Success. If Stones' size is observable, then it identifies the causal effect of $X_t$ on $X_R$; if it is latent, then the causal effect of $X_t$ on $X_R$ is not identifiable.
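Non-identifiability with a latent $X_1$ can be made concrete by exhibiting two causal Bayesian networks on the graph $X_1 \to X_2$, $X_1 \to X_3$, $X_2 \to X_3$ that agree on the observed pair $(X_2, X_3)$ but disagree after an intervention. The two models below are hypothetical toy examples with binary variables, chosen for simplicity; they are not derived from the kidney stones data.

```python
def observed(model):
    """P(X2, X3) with the latent X1 summed out."""
    p1, p2, p3 = model
    return {
        (x2, x3): sum(p1(x1) * p2(x2, x1) * p3(x3, x1, x2) for x1 in (0, 1))
        for x2 in (0, 1)
        for x3 in (0, 1)
    }


def after_intervention(model, x2_star):
    """P(X3 | X2 <- x2_star): the mechanism of X2 is dropped, X1 is summed out."""
    p1, _, p3 = model
    return {
        x3: sum(p1(x1) * p3(x3, x1, x2_star) for x1 in (0, 1))
        for x3 in (0, 1)
    }


# Each model is (P(X1), P(X2 | X1), P(X3 | X1, X2)) given as functions.
# Model 1: the treatment alone determines the outcome; X1 plays no role.
model_1 = (
    lambda x1: 0.5,
    lambda x2, x1: 0.5,
    lambda x3, x1, x2: float(x3 == x2),
)
# Model 2: the latent X1 determines both treatment and outcome.
model_2 = (
    lambda x1: 0.5,
    lambda x2, x1: float(x2 == x1),
    lambda x3, x1, x2: float(x3 == x1),
)

print(observed(model_1) == observed(model_2))  # True
print(after_intervention(model_1, 1))          # {0: 0.0, 1: 1.0}
print(after_intervention(model_2, 1))          # {0: 0.5, 1: 0.5}
```

Both models produce the same observational distribution over $(X_2, X_3)$, so no amount of observational data on these two variables alone can tell which interventional distribution is the right one.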


Intervention graph

We can decide identifiability from the graph alone; we do not need to resort to probability distributions.

Definition

The intervention graph is obtained from the graph of the considered causal Bayesian network by augmenting each node $t$ corresponding to an intervention with an additional parent $t'$.

See file kidney stones augmented.net in Hugin.
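The augmentation is a one-line operation on an edge set. A minimal sketch, assuming edges are stored as (parent, child) pairs and that the new parent is named by appending a prime to the node label; both conventions are assumptions of this example, not of the Hugin file.

```python
def intervention_graph(edges, intervened):
    """Add a new parent t' for every node t on which we may intervene."""
    return set(edges) | {(t + "'", t) for t in intervened}


# Kidney stones graph with an intervention node for the treatment X2.
edges = {("X1", "X2"), ("X1", "X3"), ("X2", "X3")}
augmented = intervention_graph(edges, {"X2"})
print(sorted(augmented))
```

Conditional independence statements that involve the added node can then be read off the augmented graph with the path-blocking criterion defined earlier.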


Transformation of interventional to observational probabilities

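To decide whether a set $C$ identifies the causal effect, and to compute $P(X_R = x_R \mid X_t \leftarrow x_t^*)$, three inference rules (based on testing conditional independence in the intervention graph) can be used (for $A \subseteq C$):

1. Neutral observation

$$X_R \perp\!\!\!\perp X_t \mid X_A \implies P(X_R = x_R \mid X_A = x_A, X_t = x_t) = P(X_R = x_R \mid X_A = x_A)$$

[Figure: intervention graph over $X_R$, $X_A$, $X_t$, $X_{t'}$ illustrating rule 1]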


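2. Neutral intervention

$$X_R \perp\!\!\!\perp X_{t'} \mid X_A \implies P(X_R = x_R \mid X_A = x_A, X_t \leftarrow x_t) = P(X_R = x_R \mid X_A = x_A)$$

[Figure: intervention graph over $X_R$, $X_A$, $X_t$, $X_{t'}$ illustrating rule 2]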


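3. Equivalence of observation and intervention

$$X_R \perp\!\!\!\perp X_{t'} \mid X_A, X_t \implies P(X_R = x_R \mid X_A = x_A, X_t \leftarrow x_t) = P(X_R = x_R \mid X_A = x_A, X_t = x_t)$$

[Figure: intervention graph over $X_R$, $X_{t'}$, $X_A$, $X_t$ illustrating rule 3]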


Back-door theorem

Theorem

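Assume $A \subseteq C$ such that $A$ satisfies
$X_A \perp\!\!\!\perp X_{t'}$, and
$X_R \perp\!\!\!\perp X_{t'} \mid X_A, X_t$ (all back-door trails are blocked by $X_A$).

Then

$$P(X_R = x_R \mid X_t \leftarrow x_t) = \sum_{x_A} P(X_R = x_R \mid X_A = x_A, X_t = x_t) \cdot P(X_A = x_A)$$

[Figure: intervention graph over $X_R$, $X_{t'}$, $X_A$, $X_t$]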
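The formula can be derived from the neutral-intervention rule (rule 2) and the equivalence of observation and intervention (rule 3). The steps below are a sketch of that argument, not a reproduction of a proof given in the talk; rule 2 is applied with $X_A$ in the role of the response and an empty conditioning set.

$$
\begin{aligned}
P(X_R = x_R \mid X_t \leftarrow x_t)
  &= \sum_{x_A} P(X_R = x_R \mid X_A = x_A, X_t \leftarrow x_t) \cdot P(X_A = x_A \mid X_t \leftarrow x_t) \\
  &= \sum_{x_A} P(X_R = x_R \mid X_A = x_A, X_t = x_t) \cdot P(X_A = x_A \mid X_t \leftarrow x_t) \\
  &= \sum_{x_A} P(X_R = x_R \mid X_A = x_A, X_t = x_t) \cdot P(X_A = x_A)
\end{aligned}
$$

The first line is the law of total probability within the post-intervention distribution, the second uses $X_R \perp\!\!\!\perp X_{t'} \mid X_A, X_t$ with rule 3, and the third uses $X_A \perp\!\!\!\perp X_{t'}$ with rule 2.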


Randomization

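[Figure: intervention graph over $X_R$, $X_{t'}$, $X_t$, $X_U$: non-identifiable]

[Figure: intervention graph over $X_U$, $X_t$, $X_{t'}$, $X_R$: identifiable]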


[Figure: intervention graph over $X_R$, $X_{t'}$, $X_t$, $X_C$, $X_U$: identifiable]


Sufficient covariate

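[Figure: intervention graph over $X_R$, $X_{t'}$, $X_t$, $X_U$: non-identifiable]

[Figure: intervention graph over $X_R$, $X_{t'}$, $X_t$, $X_U$, $X_C$: identifiable]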


Front door

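[Figure: intervention graph over $X_R$, $X_{t'}$, $X_t$, $X_U$: non-identifiable]

[Figure: intervention graph over $X_R$, $X_U$, $X_C$, $X_{t'}$, $X_t$: identifiable]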
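The slide shows only the graphs. Assuming the identifiable graph has the usual front-door shape (the treatment acts on the response only through the observed $X_C$, while the latent $X_U$ influences both $X_t$ and $X_R$ but not $X_C$), the causal effect is given by the front-door adjustment formula from Pearl's Causality, stated here in the notation of the talk, with $x'$ ranging over the treatment values:

$$P(X_R = x_R \mid X_t \leftarrow x_t) = \sum_{x_C} P(X_C = x_C \mid X_t = x_t) \sum_{x'} P(X_R = x_R \mid X_C = x_C, X_t = x') \cdot P(X_t = x')$$

Only observational quantities over $X_t$, $X_C$, $X_R$ appear on the right-hand side, which is why the effect is identifiable even though $X_U$ is latent.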


Discussion

We have described one way of assigning a causal interpretation to Bayesian networks.

However, there are many other ways to do this (e.g. Shafer, 1996).

Also, we need to check carefully that a causal Bayesian network is a proper model for the given problem.


Literature

Steffen Lauritzen. Causal Inference from Graphical Models. Research Report R-99-2021, Department of Mathematics, Aalborg University. Available at:

http://www.math.aau.dk/fileadmin/user_upload/www.math.aau.dk/Forskning/Rapportserien/R-99-2021.ps

Daphne Koller and Nir Friedman. Probabilistic Graphical Models: Principles and Techniques. MIT Press, 2009. Chapter 21.

Judea Pearl. Causality. Cambridge University Press, 2000.
