A Calculus for Stochastic Interventions: Causal Effect Identification and Surrogate Experiments
Juan D. Correa and Elias Bareinboim
{jdcorrea, eb}@cs.columbia.edu
February, 2020, New York
[…] motivation (low, high), whether they got tutoring or not, and their GPA at the end.
[…] factors) on the previous GPA.
[…] motivation.
[…] previous GPA, student’s motivation and getting tutoring or not.
[Figure: causal diagram G, natural (current) regime, over X (tutoring), Y (GPA), W (previous GPA), Z (motivation)]
[…] students’ GPA can be predicted with small error given the other features, i.e., P(y | w, z, x).
[…] natural regime, but we are interested in making decisions to improve the students’ GPA.
[…] student’s GPA receiving tutoring in a hypothetical (unrealized) reality.
(Pearl’s original treatment considered mostly this intervention.)
[…] conditional on a set of variables W.
[…] enter for the remaining 20%.
Intervention do(X = 1): make tutoring mandatory for all students.
[Figure: two causal diagrams over X (tutoring), Y (GPA), W (previous GPA), Z (motivation). Left: natural (current) regime. Right: intervened (hypothesized) regime, with X fixed to 1.]
Instead of P(y | X = 1), we are reasoning about P(y | do(X = 1)) or, more generally, P(y; σX = do(X = 1)).
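The gap between P(y | X = 1) and P(y | do(X = 1)) can be made concrete with a toy model. The sketch below is only an illustration: every probability in it is invented, all variables are assumed binary, and the dependence of Z on W is an assumption rather than something read off the slides. It enumerates the joint distribution exactly under the natural tutoring policy and under the mandatory policy.

```python
from itertools import product

# Hypothetical parameters, invented for illustration; all variables are binary.
P_W1 = 0.4                       # P(W = 1), with W = 1 meaning low previous GPA
P_Z1_GIVEN_W = {0: 0.7, 1: 0.3}  # P(Z = 1 | w), with Z = 1 meaning high motivation


def natural_policy(w, z):
    """P(X = 1 | w, z) in the natural regime (assumed numbers)."""
    return 0.1 + 0.2 * w + 0.5 * z


def p_y1(x, w, z):
    """P(Y = 1 | x, w, z), with Y = 1 meaning a high final GPA (assumed numbers)."""
    return 0.3 + 0.3 * x - 0.2 * w + 0.3 * z


def bern(p_one, value):
    """Probability that a binary variable with P(1) = p_one takes `value`."""
    return p_one if value == 1 else 1.0 - p_one


def joint(policy):
    """Exact joint over (w, z, x, y) when X is drawn from policy(w, z) = P(X = 1 | w, z)."""
    table = {}
    for w, z, x, y in product((0, 1), repeat=4):
        table[(w, z, x, y)] = (bern(P_W1, w) * bern(P_Z1_GIVEN_W[w], z)
                               * bern(policy(w, z), x) * bern(p_y1(x, w, z), y))
    return table


natural = joint(natural_policy)
mandatory = joint(lambda w, z: 1.0)  # do(X = 1): everyone is tutored

# Conditioning: restrict the natural regime to students who happened to get tutoring.
p_obs = (sum(p for (w, z, x, y), p in natural.items() if x == 1 and y == 1)
         / sum(p for (w, z, x, y), p in natural.items() if x == 1))
# Intervening: replace the mechanism for X and read off the marginal of Y.
p_do = sum(p for (w, z, x, y), p in mandatory.items() if y == 1)

print(f"P(Y=1 | X=1)     = {p_obs:.3f}")
print(f"P(Y=1 | do(X=1)) = {p_do:.3f}")
```

With these made-up numbers the two quantities come out as 0.752 and 0.682. The conditional probability is higher because, in this toy model, motivated students are both more likely to seek tutoring and more likely to do well.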
[…] effect of making tutoring mandatory for students with historically low GPA and […]
Intervention σX = 1[W = 1]: assign tutoring only to students with low GPA.
[Figure: two causal diagrams over X (tutoring), Y (GPA), W (previous GPA), Z (motivation). Left: natural (current) regime. Right: intervened (hypothesized) regime, annotated with σX.]
P(y; σX)
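Rule 1: $P(y \mid w, t; \sigma_X) = P(y \mid w; \sigma_X)$ if $(Y \perp T \mid W)$ in $G_{\sigma_X}$.
Rule 2: $P(y \mid x, w; \sigma_X) = P(y \mid x, w)$ if $(Y \perp X \mid W)$ in $G_{\sigma_X \underline{X}}$ and $G_{\underline{X}}$.
Rule 3: $P(y \mid w; \sigma_X) = P(y \mid w)$ if $(Y \perp X \mid W)$ in $G_{\sigma_X \overline{X(W)}}$ and $G_{\overline{X(W)}}$.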
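Each rule is licensed by a d-separation statement in a suitably modified diagram, so the graphical side conditions can be checked mechanically. The following is a minimal sketch of such a check using the standard ancestral moral graph criterion. It handles only DAGs with directed edges (no bidirected edges for unobserved confounders), and the edge sets `G_NATURAL` and `G_SIGMA` are assumptions made for illustration, not the exact diagrams from the slides.

```python
def ancestors(parents, nodes):
    """Return `nodes` together with all of their ancestors."""
    seen, stack = set(nodes), list(nodes)
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen


def d_separated(parents, xs, ys, zs):
    """True if xs and ys are d-separated given zs in the DAG `parents` (child -> set of parents)."""
    keep = ancestors(parents, set(xs) | set(ys) | set(zs))
    # Moralize the ancestral subgraph: link each child to its parents and marry co-parents.
    adj = {n: set() for n in keep}
    for child in keep:
        ps = list(parents.get(child, ()))
        for i, p in enumerate(ps):
            adj[child].add(p)
            adj[p].add(child)
            for q in ps[i + 1:]:
                adj[p].add(q)
                adj[q].add(p)
    # Delete the conditioning set and test undirected reachability.
    blocked = set(zs)
    frontier = [x for x in xs if x not in blocked]
    seen = set(frontier)
    while frontier:
        n = frontier.pop()
        if n in ys:
            return False
        for m in adj[n] - blocked - seen:
            seen.add(m)
            frontier.append(m)
    return True


def remove_outgoing(parents, node):
    """Diagram with the edges leaving `node` removed (the underline operation)."""
    return {child: ps - {node} for child, ps in parents.items()}


# Assumed edges for the tutoring example (hypothetical, for illustration only).
G_NATURAL = {"Z": {"W"}, "X": {"W", "Z"}, "Y": {"X", "W", "Z"}}
# Under sigma_X = 1[W = 1], X listens only to W and to the regime node sigma_X.
G_SIGMA = {"Z": {"W"}, "X": {"W", "sigma_X"}, "Y": {"X", "W", "Z"}}

rule1_ok = d_separated(G_SIGMA, {"X"}, {"Z"}, {"W"})
rule2_ok = (d_separated(remove_outgoing(G_SIGMA, "X"), {"Y"}, {"X"}, {"W", "Z"})
            and d_separated(remove_outgoing(G_NATURAL, "X"), {"Y"}, {"X"}, {"W", "Z"}))
print(rule1_ok, rule2_ok)
```

Under these assumed edges, X and Z are separated by W only in the intervened diagram, which is why the corresponding Rule 1 step would not be available in the natural regime.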
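[Figure: causal diagrams for the natural regime and the regime under σX, over X (tutoring), Y (GPA), W (previous GPA), Z (motivation)]
$$
\begin{aligned}
P(y; \sigma_X) &= \sum_{x,w,z} P(y \mid x, w, z; \sigma_X)\, P(x \mid w, z; \sigma_X)\, P(w, z; \sigma_X) \\
&= \sum_{x,w,z} P(y \mid x, w, z; \sigma_X)\, P(x \mid w; \sigma_X)\, P(w, z; \sigma_X) && \text{Rule 1: } (X \perp Z \mid W) \text{ in } G_{\sigma_X} \\
&= \sum_{x,w,z} P(y \mid x, w, z)\, P(x \mid w; \sigma_X)\, P(w, z; \sigma_X) && \text{Rule 2: } (Y \perp X \mid W, Z) \text{ in } G_{\sigma_X \underline{X}} \text{ and } G_{\underline{X}} \\
&= \sum_{x,w,z} P(y \mid x, w, z)\, P(x \mid w; \sigma_X)\, P(w, z) && \text{Rule 3: } (W, Z \perp X) \text{ in } G_{\sigma_X \overline{X}} \text{ and } G_{\overline{X}}
\end{aligned}
$$
In the final expression, $P(x \mid w; \sigma_X)$ is defined by $\sigma_X$, while $P(y \mid x, w, z)$ and $P(w, z)$ are estimable from the current regime.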
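The final expression of this derivation can be evaluated directly once the observational joint is available. The sketch below assumes binary variables and uses a randomly generated table as a stand-in for the observational distribution P(w, z, x, y); the table, the helper names, and the seed are all hypothetical. Only the policy P(x | w; σX) = 1[x = w], which encodes σX = 1[W = 1], comes from the intervention itself.

```python
import random
from itertools import product

rng = random.Random(0)

# Hypothetical observational joint P(w, z, x, y); keys are (w, z, x, y).
weights = {k: rng.random() + 0.1 for k in product((0, 1), repeat=4)}
total = sum(weights.values())
P = {k: v / total for k, v in weights.items()}


def p_wz(w, z):
    """P(w, z), estimable from the current regime."""
    return sum(P[(w, z, x, y)] for x in (0, 1) for y in (0, 1))


def p_y_given_xwz(y, x, w, z):
    """P(y | x, w, z), estimable from the current regime."""
    return P[(w, z, x, y)] / (P[(w, z, x, 0)] + P[(w, z, x, 1)])


def sigma_x(x, w):
    """P(x | w; sigma_X) for sigma_X = 1[W = 1]: tutor exactly the students with W = 1."""
    return 1.0 if x == w else 0.0


def effect(y, policy):
    """Sum over x, w, z of P(y | x, w, z) * P(x | w; sigma_X) * P(w, z)."""
    return sum(p_y_given_xwz(y, x, w, z) * policy(x, w) * p_wz(w, z)
               for w, z, x in product((0, 1), repeat=3))


p_y1_sigma = effect(1, sigma_x)
print(f"P(Y=1; sigma_X) = {p_y1_sigma:.3f}")
```

Passing a different `policy`, for example `lambda x, w: 1.0 if x == 1 else 0.0` for do(X = 1), reuses the same two observational ingredients, which mirrors the split between what σX defines and what the current regime supplies.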
[…] intervention is not identifiable (not uniquely computable) from observational data alone whenever unobserved confounders are present.
[…] may be more accessible to manipulation than the target effect σX, e.g., randomizing diet vs. randomizing cholesterol.
[…] identify the effect of the interventions of interest.
Input: { P(v) }. Query: { P(y; σX) }. Identifiable? No.
Input: { P(v), P(v; σZ1), P(v; σZ2), … }. Query: { P(y; σX) }. Identifiable? Yes.
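[Figure: causal diagrams over W, Y, Z, R, X for the natural regime, the regime under σX, and the regime under σZ]
$$
P(y; \sigma_X) = \sum_{r,w,x,z} P(r)\, P(x \mid r; \sigma_X)\, P(z \mid r, x, w)\, P(w \mid r) \sum_{x'} P(y \mid r, x', z; \sigma_Z)\, P(x' \mid r)
$$
Here $P(x \mid r; \sigma_X)$ is defined by the intervention, $P(y \mid r, x', z; \sigma_Z)$ comes from the surrogate experiment, and the remaining factors come from the natural regime.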
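To see how the three sources combine in this expression, it can be evaluated term by term. The sketch below assumes binary variables and fills every table with random placeholder values; the helper `random_cpt` and all table names are hypothetical. In practice the natural-regime tables would be estimated from observational data, the σZ table from the surrogate experiment, and the policy would be chosen by the analyst. The code only evaluates the formula; it does not establish identifiability for any particular diagram.

```python
import random
from itertools import product

rng = random.Random(1)
B = (0, 1)


def random_cpt(n_parents):
    """Hypothetical table: each tuple of binary parent values maps to (P(child=0), P(child=1))."""
    table = {}
    for parent_values in product(B, repeat=n_parents):
        p1 = 0.1 + 0.8 * rng.random()
        table[parent_values] = (1.0 - p1, p1)
    return table


# Natural regime (placeholders for quantities estimated from observational data).
p_r = random_cpt(0)[()]      # P(r)
p_w_r = random_cpt(1)        # P(w | r), keyed by (r,)
p_x_r = random_cpt(1)        # P(x' | r), keyed by (r,)
p_z_rxw = random_cpt(3)      # P(z | r, x, w), keyed by (r, x, w)
# Surrogate experiment on Z (placeholder).
p_y_rxz_exp = random_cpt(3)  # P(y | r, x', z; sigma_Z), keyed by (r, x', z)
# Target policy, defined by the intervention (placeholder).
policy = random_cpt(1)       # P(x | r; sigma_X), keyed by (r,)


def effect(y):
    """Evaluate the surrogate-experiment expression for P(y; sigma_X)."""
    result = 0.0
    for r, w, x, z in product(B, repeat=4):
        inner = sum(p_y_rxz_exp[(r, xp, z)][y] * p_x_r[(r,)][xp] for xp in B)
        result += (p_r[r] * policy[(r,)][x] * p_z_rxw[(r, x, w)][z]
                   * p_w_r[(r,)][w] * inner)
    return result


p_y1 = effect(1)
print(f"P(Y=1; sigma_X) = {p_y1:.3f}")
```

Because every factor is a proper (conditional) distribution, the values returned for y = 0 and y = 1 sum to one, which is a quick sanity check on any implementation of the estimand.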
1. We introduce a set of inference rules called σ-calculus, which generalizes Pearl’s do-calculus, to reason about the effect of general types of […] verifying claims about such interventions given a causal graph.
2. We develop an efficient procedure to determine the identifiability of the (conditional) effect of non-atomic interventions from a combination of […]
1. Encode qualitative assumptions about the natural and intervened domains graphically.
2. Find the mechanisms composing the effect of the intervention.
3. Derive the needed mechanisms from the given distributions.
4. Construct an estimator from the available data.
Diagrams annotated with σX nodes.
Use σ-calculus or an equivalent algorithmic procedure.
[…] statements about general interventions suitable to capture real-world situations.
[…] corresponding mapping expression.