Inferring Heterogeneous Causal Effects in Presence of Spatial Confounding. ICML, 2019. Muhammad Osama, Dave Zachariah, Thomas B. Schön. Division of System and Control, Department of Information Technology, Uppsala University.


slide-1
SLIDE 1

Inferring Heterogeneous Causal Effects in Presence of Spatial Confounding

ICML, 2019
Muhammad Osama, Dave Zachariah, Thomas B. Schön

Division of System and Control, Department of Information Technology, Uppsala University

muhammad.osama@it.uu.se

slide-9
SLIDE 9

Causal inference problem

◮ $y \in \mathbb{R}$: Outcome of interest
◮ $z \in \mathbb{R}$: Exposure variable
◮ $D_n = \{y_i, z_i, s_i\}_{i=1}^{n}$, where $s \in \mathbb{R}^d$ is spatial location
◮ Target quantity: Average effect of assigning $z = \tilde{z}$ on $y$ at location $s$

    $\tau(s) = \frac{d}{d\tilde{z}} E[\, y(\tilde{z}) \mid s \,]$    (1)

◮ $c$: Unobserved confounding variables

Example: $y$ = income, $z$ = age, $c$ = unemployment

slide-10
SLIDE 10

Causal Inference Problem

[Figure: graph over the variables $c$, $y$, $z$ and $s$, shown with a scatter plot]

Example: Here $\tau = 0$, yet $\mathrm{Cov}(z, y) \neq 0$.
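The situation in this example can be reproduced with a small simulation. The sketch below is only an illustration under assumed settings: the one-dimensional location `s`, the sinusoidal confounder `c`, and all coefficients and noise levels are hypothetical choices, not taken from the slides. The exposure has no effect on the outcome, but both depend on the same spatially varying confounder.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5000

# Hypothetical 1-D spatial location and an unobserved confounder that varies with it
s = rng.uniform(0.0, 10.0, size=n)
c = np.sin(s)

tau = 0.0                                            # true causal effect of z on y
z = 2.0 * c + rng.normal(0.0, 0.5, size=n)           # exposure driven by the confounder
y = tau * z + 3.0 * c + rng.normal(0.0, 0.5, size=n)  # outcome driven by the confounder only

cov_zy = np.cov(z, y)[0, 1]
naive_slope = cov_zy / np.var(z, ddof=1)             # slope of a plain regression of y on z

print(f"Cov(z, y)   = {cov_zy:.3f}")
print(f"naive slope = {naive_slope:.3f} (true tau = {tau})")
```

A plain regression of `y` on `z` would report a clearly nonzero slope here, even though the causal effect is zero by construction.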

slide-18
SLIDE 18

Approach

◮ Assumptions:
    ◮ $E[\, y(\tilde{z}) \mid s \,] = E[\, y \mid z = \tilde{z}, s \,]$
    ◮ $E[\, y \mid z = \tilde{z}, s \,]$ is affine in $z$

    $y = \tau(s) z + \beta(s) + \epsilon$    (2)

◮ $\beta(s)$ is a nuisance function correlated with the spatially varying exposure $z$.

[Figure: three spatial maps: true $\tau(s)$; $\hat{\tau}(s)$ via (2); $\hat{\tau}(s)$ proposed]
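As a short check of how the two assumptions connect model (2) to the target quantity (1), the steps can be written out as follows. The zero-mean condition $E[\epsilon \mid z, s] = 0$ is an assumption added here to make the affine form explicit; it is not stated on the slides.

```latex
\begin{align*}
E[\, y(\tilde{z}) \mid s \,]
  &= E[\, y \mid z = \tilde{z}, s \,]
  && \text{(first assumption)} \\
  &= \tau(s)\,\tilde{z} + \beta(s)
  && \text{(affine form (2), assuming } E[\epsilon \mid z, s] = 0\text{)} \\
\frac{d}{d\tilde{z}}\, E[\, y(\tilde{z}) \mid s \,]
  &= \tau(s)
  && \text{(matches the target quantity (1))}
\end{align*}
```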

slide-22
SLIDE 22

Error-in-variables model

◮ Let $w = y - E[y \mid s]$ and $v = z - E[z \mid s]$ [1]
◮ (2) becomes

    $w = \tau(s) v + \epsilon$    (3)

◮ The effect $\tau(s)$ is directly identifiable from (3), which we parameterize as $\tau_\theta(s) \in \{ f(s) : f = \phi(s)^\top \theta \}$
◮ Residuals $w$ and $v$ are not observed but estimated, so that

    $w = \underbrace{y - \hat{E}[y \mid s]}_{\hat{w}} + \underbrace{\hat{E}[y \mid s] - E[y \mid s]}_{\tilde{w}}, \qquad v = \underbrace{z - \hat{E}[z \mid s]}_{\hat{v}} + \underbrace{\hat{E}[z \mid s] - E[z \mid s]}_{\tilde{v}}$,

  where $\tilde{w}$ and $\tilde{v}$ denote errors
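The residualization step can be sketched in a few lines. This is a minimal illustration under assumed settings, not the authors' implementation: the data-generating process, the basis $\phi(s) = [1, s]$, and the use of polynomial fits as stand-ins for the estimates of $E[y \mid s]$ and $E[z \mid s]$ are all hypothetical. It also uses ordinary least squares on the estimated residuals rather than a robust estimator.

```python
import numpy as np
from numpy.polynomial import Polynomial

rng = np.random.default_rng(1)
n = 4000

# Hypothetical data: tau(s) = phi(s)^T theta with phi(s) = [1, s]
s = rng.uniform(0.0, 1.0, size=n)
theta_true = np.array([1.0, 0.5])
phi = np.column_stack([np.ones(n), s])
tau = phi @ theta_true
beta = 3.0 * s**2                                  # nuisance function beta(s)
z = 2.0 * s + rng.normal(0.0, 1.0, size=n)         # exposure varies with s
y = tau * z + beta + rng.normal(0.0, 0.5, size=n)  # model (2)

# Step 1: estimate E[y|s] and E[z|s] (polynomial fits are an assumed stand-in)
w_hat = y - Polynomial.fit(s, y, deg=4)(s)
v_hat = z - Polynomial.fit(s, z, deg=4)(s)

# Step 2: least squares on the residual model (3), w = (v * phi(s))^T theta + noise
X = v_hat[:, None] * phi
theta_orth, *_ = np.linalg.lstsq(X, w_hat, rcond=None)

# For comparison: regress y on z * phi(s) directly, ignoring beta(s)
theta_naive, *_ = np.linalg.lstsq(z[:, None] * phi, y, rcond=None)

print("true theta:          ", theta_true)
print("orthogonalized theta:", np.round(theta_orth, 3))
print("naive theta:         ", np.round(theta_naive, 3))
```

In this setup the regression that ignores $\beta(s)$ is noticeably biased, while the fit on residuals recovers the assumed coefficients closely, because subtracting the conditional means removes the part of $y$ and $z$ explained by location alone.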

slide-26
SLIDE 26

Proposed robust method

◮ Then (3) becomes

    $\hat{w} = \big( \hat{v}\phi(s) + \delta(s) \big)^\top \theta + \epsilon$

  where $\delta(s) = \tilde{v}\phi(s)$ is an unobserved random deviation
◮ Robust estimator with tolerance against worst-case deviation $\delta(s)$

    $\hat{\theta} = \arg\min_{\theta} \Big\{ \max_{\delta \in \Delta} E_n\big[\, |\hat{w} - (\hat{v}\phi(s) + \delta)^\top \theta|^2 \,\big] \Big\}$    (4)

  where $\Delta = \big\{ \delta : E_n[\, |\delta_k|^2 \,] \le n^{-1} E_n[\, |\hat{v}\phi_k(s)|^2 \,], \ \forall k \big\}$
◮ (4) is a convex problem and can be solved using coordinate descent.
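To make the inner maximization in (4) concrete, the sketch below evaluates it for a fixed $\theta$. It relies on a triangle-inequality argument that is not stated on the slides: for a fixed $\theta$, the worst-case deviation points every coordinate against the residual direction, so the maximum equals $\big(\sqrt{E_n[r^2]} + \sum_k \sqrt{\varepsilon_k}\,|\theta_k|\big)^2$ with $r = \hat{w} - \hat{v}\phi(s)^\top\theta$ and $\varepsilon_k = n^{-1} E_n[|\hat{v}\phi_k(s)|^2]$. The random matrix `X`, the vector `w`, and the value of `theta` are hypothetical placeholders for $\hat{v}\phi(s)$, $\hat{w}$ and a candidate parameter.

```python
import numpy as np

rng = np.random.default_rng(2)
n, d = 200, 3

X = rng.normal(size=(n, d))          # placeholder for the rows v_hat_i * phi(s_i)
w = rng.normal(size=n)               # placeholder for w_hat
theta = np.array([0.8, -0.3, 0.0])   # a fixed candidate parameter

def objective(delta):
    """Empirical mean of |w - (x + delta)^T theta|^2 for an (n x d) deviation."""
    return np.mean((w - (X + delta) @ theta) ** 2)

# Radius of the set Delta per coordinate k: E_n|delta_k|^2 <= eps[k]
eps = np.mean(X**2, axis=0) / n

# Closed-form value of the inner maximum (triangle-inequality argument)
r = w - X @ theta
rms_r = np.sqrt(np.mean(r**2))
closed_form = (rms_r + np.sqrt(eps) @ np.abs(theta)) ** 2

# Deviation attaining it: each column is aligned against the residual direction
u = r / rms_r
delta_star = -np.sqrt(eps) * np.sign(theta) * u[:, None]

def random_feasible():
    """Random deviation scaled onto the boundary of Delta."""
    g = rng.normal(size=(n, d))
    return g * np.sqrt(eps) / np.sqrt(np.mean(g**2, axis=0))

best_random = max(objective(random_feasible()) for _ in range(1000))

print(f"closed form:        {closed_form:.6f}")
print(f"at delta_star:      {objective(delta_star):.6f}")
print(f"best random search: {best_random:.6f}")
```

Under this reading, minimizing the worst case over $\theta$ amounts to a least-squares fit with a weighted $\ell_1$-type penalty on $\theta$, which is consistent with the slide's remark that the problem is convex and suited to coordinate descent.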

slide-28
SLIDE 28

Real data

◮ $y$: Number of crimes, $z$: number of poor families across states $s = \{1, \dots, 50\}$

[Figure: (a) Estimate $\tau(s)$ per state, color scale "Effect estimate of poverty on crime" from 0.20 to 0.40; (b) Significance at 5% level, states marked significant or insignificant]

◮ Results consistent with previous findings [2]

slide-29
SLIDE 29

Conclusion

◮ We propose an orthogonalization-based strategy for estimating heterogeneous effects from spatial data in the presence of spatially varying confounding variables
◮ Our proposed method is robust to errors-in-variables
◮ Visit poster #80 at Pacific Ballroom, 6.30pm to 9pm

slide-30
SLIDE 30

References

[1] Chernozhukov et al., Double machine learning for treatment and causal parameters, cemmap working paper, Centre for Microdata Methods and Practice, 2016.
[2] Ellis et al., Crime, delinquency, and social status: A reconsideration, Journal of Offender Rehabilitation, 2001.