

SLIDE 1

Causal Programming

Joshua Brulé

5/2/2019 talk slides

SLIDE 2

Smoking/cancer structural causal model

smoking → tar → cancer

smoking = f₁(ε₁)
tar = f₂(smoking, ε₂)
cancer = f₃(tar, ε₃)
ε₁ ⊥̸⊥ ε₃
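The model above can be run forward as a sampling procedure. A minimal Python sketch: the slides do not specify f₁, f₂, f₃ or the noise distributions, so the concrete choices below are assumptions for illustration, including the hidden factor u that makes ε₁ and ε₃ dependent (as ε₁ ⊥̸⊥ ε₃ requires).

```python
import random

# Assumed functional forms; only the graph structure comes from the slide.
def sample(rng):
    u = rng.random() < 0.5                 # hidden common cause of eps1, eps3
    eps1 = u or (rng.random() < 0.2)
    eps2 = rng.random() < 0.1
    eps3 = u or (rng.random() < 0.2)
    smoking = eps1                         # smoking = f1(eps1)
    tar = smoking or eps2                  # tar     = f2(smoking, eps2)
    cancer = tar and eps3                  # cancer  = f3(tar, eps3)
    return smoking, tar, cancer

rng = random.Random(0)
draws = [sample(rng) for _ in range(10_000)]
p_cancer = sum(c for _, _, c in draws) / len(draws)
print(p_cancer)
```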

SLIDE 3

Causal calculus (Pearl 1995)

  • W, X, Y, Z: (disjoint sets of) nodes in a causal DAG G
  • G_X̄: delete edges pointing into X; G_X̲: delete edges emanating from X
  • Z(W): the nodes of Z that are not ancestors of any node of W in G_X̄

Note: P(y ∣ x̂) abbreviates P(y ∣ do(x))

Rule 1 (insertion/deletion of observations):
P(y ∣ x̂, z, w) = P(y ∣ x̂, w) if (Y ⊥⊥ Z ∣ X, W) in G_X̄

Rule 2 (action/observation exchange):
P(y ∣ x̂, ẑ, w) = P(y ∣ x̂, z, w) if (Y ⊥⊥ Z ∣ X, W) in G_X̄Z̲

Rule 3 (insertion/deletion of actions):
P(y ∣ x̂, ẑ, w) = P(y ∣ x̂, w) if (Y ⊥⊥ Z ∣ X, W) in G_X̄Z̄(W)

SLIDE 4

Example proof

SLIDE 5

Causation coefficient

SLIDE 6

Correlation is not causation

"Correlation is not causation but it sure is a hint." "Empirically observed covariation is a necessary but not sufficient condition for causality." —Edward Tufte

SLIDE 7

Correlation coefficient

$$\rho = \frac{\operatorname{cov}(X, Y)}{\sqrt{\operatorname{Var}[X]\operatorname{Var}[Y]}}$$

$$\rho = \frac{\sum_x \sum_y xy P(x, y) - \sum_x x P(x) \sum_y y P(y)}{\sqrt{\left(\sum_x x^2 P(x) - \left(\sum_x x P(x)\right)^2\right)\left(\sum_y y^2 P(y) - \left(\sum_y y P(y)\right)^2\right)}}$$
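The second form of ρ above is directly computable from a finite joint pmf. A small Python sketch (the function name and the example pmf are illustrative, not from the slides):

```python
from math import sqrt

def correlation(pxy):
    """Pearson correlation rho from a joint pmf {(x, y): probability}."""
    ex = sum(x * p for (x, y), p in pxy.items())
    ey = sum(y * p for (x, y), p in pxy.items())
    exy = sum(x * y * p for (x, y), p in pxy.items())
    ex2 = sum(x * x * p for (x, y), p in pxy.items())
    ey2 = sum(y * y * p for (x, y), p in pxy.items())
    return (exy - ex * ey) / sqrt((ex2 - ex ** 2) * (ey2 - ey ** 2))

# Perfectly correlated binary variables: rho = 1
pxy = {(0, 0): 0.5, (1, 1): 0.5}
print(correlation(pxy))  # 1.0
```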

SLIDE 8

Correlation coefficient (rewritten)

$$\operatorname{Var}[X] = \sum_x x^2 P(x) - \left(\sum_x x P(x)\right)^2$$

$$\operatorname{Var}[Y] = \sum_x \sum_y y^2 P(y \mid x) P(x) - \left(\sum_x \sum_y y P(y \mid x) P(x)\right)^2$$

$$\rho = \frac{\sum_x \sum_y xy P(y \mid x) P(x) - \sum_x x P(x) \sum_x \sum_y y P(y \mid x) P(x)}{\sqrt{\operatorname{Var}[X]\operatorname{Var}[Y]}}$$

SLIDE 9

Defining the causation coefficient

  • Substitute P(y ∣ do(x)), abbreviated P(y ∣ x̂), for P(y ∣ x), i.e. replace the observational distribution with the interventional distribution
  • Substitute P̂(x), a "distribution of interventions", for P(x)
  • Interpret P̂(x) as the relative cohort sizes in an experimental study
  • Natural causation coefficient: P̂(x) = P(x)

SLIDE 10

Causation coefficient

$$\operatorname{Var}[\hat{X}] = \sum_x x^2 \hat{P}(x) - \left(\sum_x x \hat{P}(x)\right)^2$$

$$\operatorname{Var}_{\hat{X}}[Y] = \sum_x \sum_y y^2 P(y \mid \hat{x}) \hat{P}(x) - \left(\sum_x \sum_y y P(y \mid \hat{x}) \hat{P}(x)\right)^2$$

$$\gamma_{X \to Y} = \frac{\sum_x \sum_y xy P(y \mid \hat{x}) \hat{P}(x) - \sum_x x \hat{P}(x) \sum_x \sum_y y P(y \mid \hat{x}) \hat{P}(x)}{\sqrt{\operatorname{Var}[\hat{X}]\operatorname{Var}_{\hat{X}}[Y]}}$$
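The definition above is the correlation formula with the interventional distribution P(y ∣ x̂) and intervention distribution P̂(x) swapped in. A Python sketch (function name and example values are illustrative):

```python
from math import sqrt

def causation(py_do_x, p_hat):
    """Causation coefficient gamma_{X->Y} from the interventional
    distributions py_do_x[x][y] = P(y | do(x)) and p_hat[x] = P^(x)."""
    ex = sum(x * p for x, p in p_hat.items())
    ex2 = sum(x * x * p for x, p in p_hat.items())
    ey = sum(y * q * p_hat[x] for x, d in py_do_x.items() for y, q in d.items())
    ey2 = sum(y * y * q * p_hat[x] for x, d in py_do_x.items() for y, q in d.items())
    exy = sum(x * y * q * p_hat[x] for x, d in py_do_x.items() for y, q in d.items())
    return (exy - ex * ey) / sqrt((ex2 - ex ** 2) * (ey2 - ey ** 2))

# Y := X exactly: gamma = 1 under the natural choice P^(x) = (1/2, 1/2)
py_do_x = {0: {0: 1.0, 1: 0.0}, 1: {0: 0.0, 1: 1.0}}
print(causation(py_do_x, {0: 0.5, 1: 0.5}))  # 1.0
```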

SLIDE 11

Interpretation of γ

  • ρ = ±1: perfect positive/negative linear correlation
  • γ = ±1: perfect positive/negative linear causation
  • ρ = 0: "linearly uncorrelated"
  • γ = 0: "linearly acausal"

SLIDE 12

No-confounding

P(y ∣ x) = P(y ∣ x̂) for all x, y implies γ_{X→Y} = ρ. The converse holds for Bernoulli (binary) random variables.

SLIDE 13

Independence and Invariance

Definitions:
  • X and Y are independent iff P(y ∣ x) = P(y), ∀x, y
  • Y is invariant to X iff P(y ∣ x̂) = P(y), ∀x, y

Lemmas:
  • For Bernoulli X, Y: ρ = 0 iff X and Y are independent
  • For Bernoulli X, Y: γ_{X→Y} = 0 iff Y is invariant to X

SLIDE 14

Average treatment effect

$$\operatorname{ATE}(X \to Y) \equiv P(Y = 1 \mid do(X = 1)) - P(Y = 1 \mid do(X = 0))$$

For Bernoulli random variables:

$$\gamma_{X \to Y} = \operatorname{ATE}(X \to Y) \sqrt{\frac{\operatorname{Var}[\hat{X}]}{\operatorname{Var}_{\hat{X}}[Y]}}$$

γ has the same sign as ATE(X → Y):
  • ATE(X → Y) > 0: treatment is more effective
  • ATE(X → Y) < 0: treatment is less effective
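The identity can be checked numerically for one Bernoulli example. The interventional probabilities and cohort split below are arbitrary illustrative choices, not from the slides:

```python
from math import sqrt

# Assumed example: P(Y=1|do(X=1)) = 0.8, P(Y=1|do(X=0)) = 0.3, P^(X=1) = 0.4
p1, p0, px1 = 0.8, 0.3, 0.4

ate = p1 - p0
ex = ex2 = px1                        # X is Bernoulli(px1), X^2 = X
ey = px1 * p1 + (1 - px1) * p0
ey2 = ey                              # Y is 0/1, so E[Y^2] = E[Y]
exy = px1 * p1                        # XY = 1 only when X = 1 and Y = 1
var_x = ex2 - ex ** 2
var_y = ey2 - ey ** 2
gamma = (exy - ex * ey) / sqrt(var_x * var_y)

# gamma equals ATE * sqrt(Var[X^] / Var_X^[Y]), as on the slide
print(gamma, ate * sqrt(var_x / var_y))
```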

SLIDE 15

Plot causation vs correlation

Every point on a (ρ, γ) plot is a structural causal model.

SLIDE 16

SLIDE 17

Invariant and independent

Neither manipulation nor observation of X changes, or provides information about, Y. e.g. two events outside each other's past and future light cones.

SLIDE 18

SLIDE 19

Causation vs. correlation: common causation

"If an improbable coincidence has occurred, there must exist a common cause" (Reichenbach 1956) e.g. Myopia and ambient lighting at night (Quinn et al. 1999)

SLIDE 20

Inverse causation

ρ and γ have the opposite sign. e.g. tuberculosis in Arizona (Gardner 1982).

SLIDE 21

Example model: inverse causation

Let ε_Z ∼ Bernoulli(1/2) and ε_Y ∼ Bernoulli(3/4). The following model exhibits inverse causation:

Z = ε_Z
X = Z
Y = ¬Z if ε_Y = 1, X if ε_Y = 0
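The model is small enough to enumerate exactly, which makes the sign flip visible. A Python sketch of that enumeration (observationally X and Y are negatively associated, yet intervening on X raises Y):

```python
from itertools import product

# Exact enumeration of the slide's model: Z = eps_Z, X = Z,
# Y = (not Z) when eps_Y = 1, else X.
def joint():
    for ez, ey in product((0, 1), repeat=2):
        p = 0.5 * (0.75 if ey else 0.25)   # eps_Z ~ Bern(1/2), eps_Y ~ Bern(3/4)
        z = ez
        x = z
        y = (1 - z) if ey else x
        yield p, x, y

# Observational conditionals
p_y1_x1 = sum(p for p, x, y in joint() if x == 1 and y == 1) \
        / sum(p for p, x, y in joint() if x == 1)
p_y1_x0 = sum(p for p, x, y in joint() if x == 0 and y == 1) \
        / sum(p for p, x, y in joint() if x == 0)

# Interventional: do(X = xv) severs the Z -> X edge
def p_y1_do(xv):
    return sum(0.5 * (0.75 if ey else 0.25) * ((1 - ez) if ey else xv)
               for ez, ey in product((0, 1), repeat=2))

print(p_y1_x1, p_y1_x0)        # 0.25 0.75   -> negative correlation
print(p_y1_do(1), p_y1_do(0))  # 0.625 0.375 -> positive causation
```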

SLIDE 22

Inverse causation probability distributions

SLIDE 23

Causation vs. correlation: unfaithfulness

X and Y are unfaithful if they are independent but not invariant. I define this as a "local" version of an unfaithful distribution (Spirtes et al. 1993).

SLIDE 24

"Friedman's thermostat"

  • Observe correlation between furnace and outside temperature
  • Observe no correlation between furnace and inside temperature
  • Observe no correlation between inside and outside temperature

SLIDE 25

"Traitorous lieutenant"

The general wishes to send one bit; the recipient XORs the bits received.
  • For 1, send (0, 1) or (1, 0) with equal probability
  • For 0, send (1, 1) or (0, 0) with equal probability

General → {Lieutenant (loyal), Lieutenant (traitor)} → Recipient
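This scheme can be enumerated to show the unfaithfulness: each lieutenant's bit, taken alone, is statistically independent of the message, yet both bits are causally necessary to it. A Python sketch (variable names are mine):

```python
# Exact joint over (message m, bit b1, bit b2) with P(m=1) = 1/2:
# for m=1 send (0,1) or (1,0); for m=0 send (1,1) or (0,0), each w.p. 1/2.
joint = {}
for m in (0, 1):
    pairs = [(0, 1), (1, 0)] if m == 1 else [(1, 1), (0, 0)]
    for b1, b2 in pairs:
        joint[(m, b1, b2)] = 0.5 * 0.5

# Each bit alone carries no information about the message...
p_b1_given_m1 = sum(p for (m, b1, b2), p in joint.items()
                    if m == 1 and b1 == 1) / 0.5
p_b1_given_m0 = sum(p for (m, b1, b2), p in joint.items()
                    if m == 0 and b1 == 1) / 0.5
print(p_b1_given_m1, p_b1_given_m0)   # 0.5 0.5

# ...yet the XOR of the two bits recovers the message exactly.
assert all(m == (b1 ^ b2) for (m, b1, b2) in joint)
```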

SLIDE 26

Genuine causation and confounding bias

ρ and γ have the same sign. May be biased by confounders.

SLIDE 27

Recovering intuition: Why do we think correlation ≈ causation?

  • Need a way to analyze the behavior of "typical" models
  • Don't draw samples from a model; draw models from a space of models
  • How to parameterize that space?

SLIDE 28

Parameterization

z → x → y, z → y

Z = ε_Z
X = α_Z Z + ε_X
Y = β_X X + β_Z Z + ε_Y

  • Draw a sample model M from the maximum entropy distribution over the parameters
  • Compute (ρ, γ) for M
  • Plot a kernel density estimate
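The steps above can be sketched in Python. The noise terms and parameters are assumed standard normal here (the slides' exact maximum entropy parameterization may differ), and ρ, γ are computed in closed form for the linear model rather than estimated:

```python
import random
from math import sqrt

# Model: Z = eZ, X = aZ*Z + eX, Y = bX*X + bZ*Z + eY, all noise ~ N(0, 1).
def rho_gamma(az, bx, bz):
    var_x = az * az + 1.0
    cov_xy = bx * var_x + bz * az
    var_y = bx * bx * var_x + bz * bz + 2.0 * bx * bz * az + 1.0
    rho = cov_xy / sqrt(var_x * var_y)
    # Under do(X=x) with the natural choice P^(x) = P(x):
    # Y = bX*x + bZ*Z + eY, and x is drawn independently of Z.
    var_y_do = bx * bx * var_x + bz * bz + 1.0
    gamma = bx * var_x / sqrt(var_x * var_y_do)
    return rho, gamma

rng = random.Random(0)
points = [rho_gamma(rng.gauss(0, 1), rng.gauss(0, 1), rng.gauss(0, 1))
          for _ in range(10_000)]
inverse = sum(1 for r, g in points if r * g < 0) / len(points)
print(f"fraction with opposite-sign (rho, gamma): {inverse:.3f}")
```

Feeding the resulting (ρ, γ) points to a kernel density estimator reproduces the kind of plot shown on the next slides.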

SLIDE 29

Causation vs correlation (≈12% inverse causation)

SLIDE 30

SLIDE 31

Correlation/causation relationships

  • Most of these effects were known, but not all were named
  • γ, ρ provides a unified framework (population, acyclic)
  • Intuition for why correlation ≈ causation
  • Other relationships:
      Spurious correlation (population vs sample distribution)
      Mutual causation (not in acyclic models)
      Reverse causation (confusing X → Y for Y → X)
  • No substitute for proper causal analysis

SLIDE 32

Causal programming

SLIDE 33

Declarative programming ("what" instead of "how")

  • (Purely) functional programming: functions, algebraic data types; function application
  • Logic programming: first-order Horn clauses; resolution
  • Linear programming: linear objective function, linear constraints; optimize
  • Probabilistic programming: various; conditional sampling

SLIDE 34

Causal inference relation

⟨M, D, Q, F⟩ over V:

  • M: set of structural causal models
  • D: set of distributions; known probability functions
  • Q: query from the causal hierarchy (Shpitser 2008), e.g. P(y ∣ x), P(y ∣ do(x))
  • F: formula that computes Q as a function of D for every model in M
  • V: set of endogenous variables (usually implicit)

SLIDE 35

Identification (find F)

Model, M = [x → z → y, with x and y sharing a hidden confounder]

Distribution, D = P(x, y, z)
Query, Q = P(y ∣ do(x))

Formula:

$$\sum_z P(z \mid x) \sum_{x'} P(y \mid x', z) P(x')$$
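This is Pearl's front-door adjustment, and it can be verified numerically: build a fully specified model with a hidden confounder, marginalize it to the observational joint, and check that the formula recovers the true interventional distribution. All parameter values below are arbitrary illustrative choices:

```python
from itertools import product

# Assumed binary model: hidden U confounds X and Y; X -> Z -> Y.
p_u1 = 0.5
p_x1_u = {0: 0.2, 1: 0.9}                     # P(X=1 | u)
p_z1_x = {0: 0.1, 1: 0.8}                     # P(Z=1 | x)
p_y1_zu = {(0, 0): 0.1, (0, 1): 0.5,          # P(Y=1 | z, u)
           (1, 0): 0.4, (1, 1): 0.9}

def bern(p1, v):
    return p1 if v == 1 else 1.0 - p1

# Observational joint P(x, y, z): U is marginalized out, i.e. "hidden".
pxyz = {}
for u, x, z, y in product((0, 1), repeat=4):
    p = (bern(p_u1, u) * bern(p_x1_u[u], x)
         * bern(p_z1_x[x], z) * bern(p_y1_zu[(z, u)], y))
    pxyz[(x, y, z)] = pxyz.get((x, y, z), 0.0) + p

def marg(pred):
    return sum(p for k, p in pxyz.items() if pred(*k))

# Front-door formula, using only the observational joint:
# P(y=1 | do(x)) = sum_z P(z | x) sum_x' P(y=1 | x', z) P(x')
def front_door(xv):
    total = 0.0
    for z in (0, 1):
        pz_x = (marg(lambda x, y, z2: x == xv and z2 == z)
                / marg(lambda x, y, z2: x == xv))
        inner = sum(marg(lambda x, y, z2: x == xp and z2 == z and y == 1)
                    / marg(lambda x, y, z2: x == xp and z2 == z)
                    * marg(lambda x, y, z2: x == xp)
                    for xp in (0, 1))
        total += pz_x * inner
    return total

# Ground truth from the full model: sum_{u,z} P(u) P(z|x) P(y=1|z,u)
def truth(xv):
    return sum(bern(p_u1, u) * bern(p_z1_x[xv], z) * p_y1_zu[(z, u)]
               for u, z in product((0, 1), repeat=2))

print(front_door(1), truth(1))
print(front_door(0), truth(0))
```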

SLIDE 36

Causal discovery (find M)

Distribution, D = P(x, y), with X ⊥̸⊥ Y
Query, Q = P(y ∣ do(x))

Solutions: ⟨M₁, D, Q, F₁⟩, ⟨M₂, D, Q, F₂⟩, where:

M₁ = (a), F₁ = P(y ∣ x)
M₂ = (b), F₂ = P(y)

SLIDE 37

Context matters

There always exist compatible models where identification is impossible.

SLIDE 38

Research design (find D)

Model M, Query Q = P(y ∣ do(x))

Solutions: ⟨M, D₁, Q, F₁⟩, ⟨M, D₂, Q, F₂⟩, where:

D₁ = P(x, y, w₃, w₄), F₁ = Σ_{w₃,w₄} P(y ∣ w₃, w₄, x) P(w₃, w₄)
D₂ = P(x, y, w₄, w₅), F₂ = Σ_{w₄,w₅} P(y ∣ w₄, w₅, x) P(w₄, w₅)

SLIDE 39

Query generation (find Q)

"Testable implications", e.g. can identify P(y ∣ do(x)) and P(z ∣ do(x)), but not P(y ∣ do(z)).

SLIDE 40

Optimization problems

Cost function over M, D, Q:
  • M: favor simple models (Occam's razor)
  • D: optimal research design
  • Q: (inverse) value of information

SLIDE 41

"Meta-theory" / "Framework"

  • Sensitive to the domains of M, D, Q
  • Specify domains to get a usable/implementable theory
  • Framework to classify existing methods/problems

SLIDE 42

(Some) Prior work / existing algorithms

Identification

  • ID (Shpitser 2006): M = causal diagrams, D = P(v), Q = P(y ∣ do(x))
  • IDC* (Shpitser & Pearl 2007): M = "", D = P(v ∣ do(z)) ∀Z ⊆ V, Q = P(α ∣ β)
  • zID (Bareinboim 2012): M = "", D = P(v ∣ do(z)), Q = P(y ∣ do(x))
  • Selection bias (Bareinboim 2014): M = "", D = P(v ∣ S = 1), Q = P(y ∣ do(x))

Causal discovery

  • Inductive causation based algorithms, e.g. PC, FCI

Research design / query generation (research opportunity?)

  • Informally studied; no formal algorithms?

SLIDE 43

Causal programming language

SLIDE 44

Learn Lisp in < 1 minute

  • Everything is a function call
  • Move the left parenthesis one word to the left: load_image("xkcd-297.png") becomes (load-image "xkcd-297.png")

In [2]: (load-image "xkcd-297.png")
Out[2]: [image]

SLIDE 45

"Core" Whittemore

  • (model {:x [], :y [:x]}) - a (set of) structural causal model(s)
  • (data [:x :y]) - the "signature" of a distribution, e.g. P(x, y)
  • (q [:y] :do [:x]) - a query, e.g. P(y ∣ do(x))
  • (identify m d q) - returns a formula
  • (estimate distribution formula) - applies a formula to a distribution

SLIDE 46

Example: Treatment of renal calculi (Charig et al. 1986)

SLIDE 47

Load data

In [3]: (def kidney-dataset (read-csv "data/renal-calculi.csv"))
        (count kidney-dataset)
Out[3]: 700

In [4]: (head kidney-dataset)
Out[4]:

:size    :success  :treatment
"small"  "yes"     "surgery"
"large"  "yes"     "nephrolithotomy"
"small"  "yes"     "surgery"
"small"  "yes"     "surgery"
"large"  "yes"     "nephrolithotomy"
"large"  "yes"     "surgery"
"small"  "yes"     "nephrolithotomy"
"small"  "yes"     "surgery"
"large"  "no"      "nephrolithotomy"
"large"  "yes"     "nephrolithotomy"

SLIDE 48

Categorical distribution

In [5]: (def kidney-distribution (categorical kidney-dataset))
        (plot-univariate kidney-distribution :size)
Out[5]: [plot]

SLIDE 49

Simpson's paradox

P(success=yes | treatment=surgery) < P(success=yes | treatment=nephrolithotomy)

In [7]: (estimate kidney-distribution
                  (q {:success "yes"} :given {:treatment "surgery"}))
Out[7]: 0.78

In [8]: (estimate kidney-distribution
                  (q {:success "yes"} :given {:treatment "nephrolithotomy"}))
Out[8]: 0.8257142857142857

SLIDE 50

P(success=yes | treatment, size=small)

In [9]: (estimate kidney-distribution
                  (q {:success "yes"} :given {:treatment "surgery" :size "small"}))
Out[9]: 0.9310344827586207

In [10]: (estimate kidney-distribution
                   (q {:success "yes"} :given {:treatment "nephrolithotomy" :size "small"}))
Out[10]: 0.8666666666666667

P(success=yes | treatment, size=large)

In [11]: (estimate kidney-distribution
                   (q {:success "yes"} :given {:treatment "surgery" :size "large"}))
Out[11]: 0.7300380228136882

In [12]: (estimate kidney-distribution
                   (q {:success "yes"} :given {:treatment "nephrolithotomy" :size "large"}))
Out[12]: 0.6875

SLIDE 51

Model assumptions

size = f_size(ε_size)
treatment = f_treatment(size, ε_treatment)
success = f_success(treatment, size, ε_success)

In [13]: (define charig1986
           (model {:size []
                   :treatment [:size]
                   :success [:treatment :size]}))
Out[13]: [diagram: size → treatment, size → success, treatment → success]

SLIDE 52

Identify

In [14]: (define f
           (identify charig1986
                     (data [:treatment :success :size])
                     (q [:success] :do {:treatment "surgery"})))
Out[14]:

$$\sum_{size} P(size) P(success \mid size, treatment)$$

where: treatment = "surgery"

In [15]: (identify charig1986
                   (data [:treatment :success])
                   (q [:success] :do {:treatment "surgery"}))
Out[15]: #whittemore.core.Fail{:cause #{{:hedge #whittemore.core.Model{:pa {:treatment #{}, :success #{:treatment}}, :bi #{#{:treatment :success}}}, :s #{:success}}}}
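The identified formula (the backdoor adjustment over stone size) can also be evaluated by hand. A Python sketch, assuming the renal calculi counts as commonly reported for Charig et al. (1986); these total the notebook's 700 patients and reproduce its interventional estimates:

```python
# (treatment, size): (successes, total) -- assumed Charig et al. counts
counts = {
    ("surgery", "small"): (81, 87),
    ("surgery", "large"): (192, 263),
    ("nephrolithotomy", "small"): (234, 270),
    ("nephrolithotomy", "large"): (55, 80),
}
n = sum(t for _, t in counts.values())                  # 700 patients
p_size = {s: sum(t for (tr, s2), (_, t) in counts.items() if s2 == s) / n
          for s in ("small", "large")}

# sum_size P(size) P(success=yes | size, treatment)
def p_do(treatment):
    return sum(p_size[s] * counts[(treatment, s)][0] / counts[(treatment, s)][1]
               for s in ("small", "large"))

print(p_do("surgery"))           # ~0.8325 (cf. Out[16] below)
print(p_do("nephrolithotomy"))   # ~0.7789 (cf. Out[19] below)
```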

SLIDE 53

Estimate

In [16]: (estimate kidney-distribution f)
Out[16]: #whittemore.core.Categorical{:pmf {{:success "yes"} 0.8325462173856037, {:success "no"} 0.16745378261439622}}

In [17]: (plot-univariate (estimate kidney-distribution f))
Out[17]: [plot]

SLIDE 54

Problem: P() notation is overloaded

  • P(Y = y ∣ X = x): a real number in the range [0, 1]
  • P(y ∣ X = x): the conditional distribution of Y
  • P(y ∣ x): a function from the domain of X to conditional distributions of Y

SLIDE 55

Solution: syntactic sugar

In [18]: (infer charig1986 kidney-distribution
                (q {:success "yes"} :do {:treatment "surgery"}))
Out[18]: 0.8325462173856037

In [19]: (infer charig1986 kidney-distribution
                (q {:success "yes"} :do {:treatment "nephrolithotomy"}))
Out[19]: 0.778875

SLIDE 56

Infer and plot

In [20]: (def associational-plot
           (plot-p-map
             {"P(success | nephro...)"
              (estimate kidney-distribution
                        (q {:success "yes"} :given {:treatment "nephrolithotomy"}))
              "P(success | surgery)"
              (estimate kidney-distribution
                        (q {:success "yes"} :given {:treatment "surgery"}))}))
         (def interventional-plot
           (plot-p-map
             {"P(success | do(nephro...))"
              (infer charig1986 kidney-distribution
                     (q {:success "yes"} :do {:treatment "nephrolithotomy"}))
              "P(success | do(surgery))"
              (infer charig1986 kidney-distribution
                     (q {:success "yes"} :do {:treatment "surgery"}))}))
Out[20]: #'user/interventional-plot

SLIDE 57

In [21]: associational-plot
Out[21]: [plot]

SLIDE 58

In [22]: interventional-plot
Out[22]: [plot]

SLIDE 59

Nonstandard adjustments

"How Conditioning on Posttreatment Variables Can Ruin Your Experiment and What to Do about It" (Montgomery et al. 2018):

"This article provides the most systematic account to date of the problems with and solutions to a recurring problem in experimental political science: conditioning on posttreatment variables. ... we recommend avoiding selecting on or controlling for posttreatment covariates."

SLIDE 60

In [23]: (define wainer1989
           (model {:pests_0 []
                   :birds [:pests_0]
                   :pests_1 [:pests_0]
                   :fumigants [:pests_0]
                   :pests_2 [:pests_1 :fumigants]
                   :pests_3 [:pests_2 :birds]
                   :crops [:fumigants :pests_2 :pests_3]}))
Out[23]: [diagram: pests_0, birds, pests_1, fumigants, pests_2, pests_3, crops]

SLIDE 61

In [25]: (define wainer-short
           (model {:z_0 []
                   :b [:z_0]
                   :z_1 [:z_0]
                   :x [:z_0]
                   :z_2 [:z_1 :x]
                   :z_3 [:z_2 :b]
                   :y [:x :z_2 :z_3]}))
Out[25]: [diagram: z_0, b, z_1, x, z_2, z_3, y]

SLIDE 62

In [27]: (define concomitant-example
           "Figure 3.8 (f) from (Shpitser 2008)"
           (model {:y [:x :z_1 :z_2]
                   :z_2 [:z_1]
                   :z_1 [:x]
                   :x []}
                  #{:y :z_1}
                  #{:x :z_2}))
Out[27]: [diagram: x, z_1, z_2, y]

In [28]: (identify concomitant-example
                   (data [:x :y :z_1 :z_2])
                   (q [:y] :do [:x]))
Out[28]:

$$\sum_{z_1, z_2} \left[\sum_x P(x) P(z_2 \mid x, z_1)\right] P(z_1 \mid x) P(y \mid x, z_1, z_2)$$

where: (unbound)

SLIDE 63

Distribution protocol

  • (estimate this formula)
  • (measure this event)
  • (signature this)
  • User extensible; potential for integration with probabilistic programming

SLIDE 64

"Nanopass" simplification

  • Tikka and Karvanen modify the ID algorithm to simplify formulas
  • Whittemore separates the identification and simplification steps
  • "Pattern matching" rules to simplify formulas:
      Marginalize rule: Σ_x P(x, y) → P(y)
      Conditional rule: P(x, y) / P(y) → P(x ∣ y)
  • Not currently user extensible
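The marginalize rule can be sketched as a pattern-matching pass over a small formula AST. This is a toy illustration in Python, not Whittemore's implementation; the AST shape ("sum", var, expr) / ("p", vars, given) is an assumption:

```python
def marginalize(expr):
    """Apply the rule sum_x P(x, y) -> P(y) wherever it matches."""
    if expr[0] == "sum":
        var, body = expr[1], marginalize(expr[2])
        # Match: summing out a variable of an unconditional P(...)
        if body[0] == "p" and var in body[1] and not body[2]:
            rest = tuple(v for v in body[1] if v != var)
            return ("p", rest, ())
        return ("sum", var, body)
    return expr

formula = ("sum", "x", ("p", ("x", "y"), ()))
print(marginalize(formula))   # ('p', ('y',), ())
```

Each rule is its own tiny pass over the formula, which is what makes the "nanopass" organization easy to extend.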

SLIDE 65

Install (Ubuntu)

$ sudo apt install leiningen
$ pip3 install jupyter
$ lein new whittemore demo
$ cd demo
$ lein jupyter notebook

Source

github.com/jtcbrule/whittemore

SLIDE 66

Questions?

SLIDE 67

In [29]: (define butterfly
           (model {:x_1 []
                   :z_1 [:x_1]
                   :y [:z_1 :z_2]
                   :x_2 []
                   :z_2 [:x_2]}
                  #{:x_1 :z_2}
                  #{:z_1 :x_2}))
Out[29]: [diagram: x_1, z_1, z_2, y, x_2]

In [30]: (identify butterfly (q [:y] :do [:x_1 :x_2]))
Out[30]:

$$\sum_{z_1, z_2} \left[P(y \mid x_1, x_2, z_1, z_2) P(z_2 \mid x_2) P(z_1 \mid x_1)\right]$$

where: (unbound)