Matching & Regression: Accounting for Rival Explanations


SLIDE 1

Regression Matching and Conditioning Multiple Regression

Matching & Regression: Accounting for Rival Explanations

Department of Government London School of Economics and Political Science

SLIDE 2

1 Regression, Briefly
2 Matching and Conditioning
3 Multiple Regression


SLIDE 4

Uses of Regression

1 Description
2 Prediction
3 Causal Inference

SLIDE 5

Mathematically, regression. . .

. . . describes multivariate relationships in a sample of data points

SLIDE 6

Mathematically, regression. . .

. . . describes multivariate relationships in a sample of data points
. . . depending on sampling procedure, estimates those relationships in the population

SLIDE 7

Mathematically, regression. . .

. . . describes multivariate relationships in a sample of data points
. . . depending on sampling procedure, estimates those relationships in the population
. . . depending on model fit, provides a way to predict outcome values for new cases

SLIDE 8

Mathematically, regression. . .

. . . describes multivariate relationships in a sample of data points
. . . depending on sampling procedure, estimates those relationships in the population
. . . depending on model fit, provides a way to predict outcome values for new cases
. . . depending on model completeness, provides inferences about the effect of X on Y

SLIDE 9

1 Regression, Briefly
2 Matching and Conditioning
3 Multiple Regression

SLIDE 10

Causal inference is about comparing an observed outcome to a counterfactual "potential outcome" for the same cases.

Regression provides a "statistical solution" to the fundamental problem of causal inference (Holland).

SLIDE 11

An Example

For example, if we think smoking might cause lung cancer, how would we know?
How would we know if smoking caused lung cancer for an individual who smoked? What's the relevant counterfactual?
How would we know if smoking causes lung cancer on average across many individuals? What's the relevant counterfactual?

SLIDE 12

Confounding

A source of "endogeneity". Synonyms: selection bias, omitted variable bias.

In lay terms: the (non)correlation between X and Y does not reflect a causal relationship; X and Y are related for other reasons.

Most commonly: some Z causes both X and Y.

SLIDE 13

Addressing Confounding

SLIDE 14

Addressing Confounding

1 Correlate a "putative" cause (X) and an outcome (Y)
SLIDE 15

Addressing Confounding

1 Correlate a "putative" cause (X) and an outcome (Y)
2 Identify all possible confounds (Z)

SLIDE 16

Addressing Confounding

1 Correlate a "putative" cause (X) and an outcome (Y)
2 Identify all possible confounds (Z)
3 "Condition" on all confounds: calculate the correlation between X and Y at each combination of levels of Z

SLIDE 17

Mill’s Method of Difference

If an instance in which the phenomenon under investigation occurs, and an instance in which it does not occur, have every circumstance in common save one, that one occurring only in the former; the circumstance in which alone the two instances differ, is the effect, or the cause, or an indispensable part of the cause, of the phenomenon.

SLIDE 18

Smoking Example

SLIDE 19

Smoking Example

1 Partition sample into "smokers" (X = 1) and "non-smokers" (X = 0)

SLIDE 20

Smoking Example

1 Partition sample into "smokers" (X = 1) and "non-smokers" (X = 0)
2 Identify possible confounds: Sex, Parental smoking, etc.

SLIDE 21

[Causal diagram: Smoking → Cancer, with Sex, Environment, Parental Smoking, and other factors as possible confounds]



SLIDE 24

Smoking Example

1 Partition sample into "smokers" (X = 1) and "non-smokers" (X = 0)
2 Identify possible confounds: Sex, Parental smoking, etc.
3 Estimate the difference in cancer rates between smokers and non-smokers within each group of covariates

SLIDE 25

Example I

X            Y (Cancer)
Smokers      0.15
Non-smokers  0.05

ATE = Ȳ_{X=1} − Ȳ_{X=0} = 0.15 − 0.05 = 0.10

SLIDE 26

Example II

Z1 (Sex)  X            Y (Cancer)
0         Smokers      . . .
0         Non-smokers  . . .
1         Smokers      . . .
1         Non-smokers  . . .

ATE = p_Male · (Ȳ_{X=1,Z1=1} − Ȳ_{X=0,Z1=1}) + p_Female · (Ȳ_{X=1,Z1=0} − Ȳ_{X=0,Z1=0})

SLIDE 27

Example III

Z2 (Parent)  Z1 (Sex)  X            Y (Cancer)
0            0         Smokers      . . .
0            0         Non-smokers  . . .
0            1         Smokers      . . .
0            1         Non-smokers  . . .
1            0         Smokers      . . .
1            0         Non-smokers  . . .
1            1         Smokers      . . .
1            1         Non-smokers  . . .

ATE = p_{Male, Parent non-smoker} · (Ȳ_{X=1,Z1=1,Z2=0} − Ȳ_{X=0,Z1=1,Z2=0})
    + p_{Female, Parent non-smoker} · (Ȳ_{X=1,Z1=0,Z2=0} − Ȳ_{X=0,Z1=0,Z2=0})
    + p_{Male, Parent smoker} · (Ȳ_{X=1,Z1=1,Z2=1} − Ȳ_{X=0,Z1=1,Z2=1})
    + p_{Female, Parent smoker} · (Ȳ_{X=1,Z1=0,Z2=1} − Ȳ_{X=0,Z1=0,Z2=1})

SLIDE 28

Exact Matching

Repeat this partitioning of the space into "strata" (or "subclasses"). Requires at least one "treated" and one "untreated" case at every combination of every covariate.

More convenient notation:
Naive Effect = Ȳ_{X=1} − Ȳ_{X=0}
ATE = Ȳ_{X=1,Z} − Ȳ_{X=0,Z}
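As a sketch, the exact-matching (subclassification) estimator described here fits in a few lines of code. The data and the helper name `exact_matching_ate` are invented for illustration, not course code:

```python
from collections import defaultdict

def exact_matching_ate(data):
    """Exact matching: stratify on the covariate profile Z, take the
    treated-minus-untreated difference in mean outcomes within each
    stratum, and weight the differences by stratum size."""
    strata = defaultdict(lambda: {0: [], 1: []})
    for x, y, z in data:  # z is a tuple of covariate values
        strata[z][x].append(y)
    n = len(data)
    ate = 0.0
    for cells in strata.values():
        # exact matching requires a treated AND an untreated case per stratum
        if not cells[0] or not cells[1]:
            raise ValueError("stratum lacks treated or untreated cases")
        diff = sum(cells[1]) / len(cells[1]) - sum(cells[0]) / len(cells[0])
        ate += ((len(cells[0]) + len(cells[1])) / n) * diff
    return ate

# Hypothetical records: (X smoker, Y cancer, Z = (sex, parental smoking))
data = [
    (1, 1, (0, 0)), (1, 0, (0, 0)), (0, 0, (0, 0)), (0, 0, (0, 0)),
    (1, 1, (1, 1)), (1, 1, (1, 1)), (0, 1, (1, 1)), (0, 0, (1, 1)),
]
```

With this toy data the within-stratum differences are 0.5 in each stratum, so the weighted ATE is 0.5.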

SLIDE 29

Note that matching is just a version of Mill's method of difference used for a large number of cases.
SLIDE 30

Omitted Variables

In the language of potential outcomes:

E[Yi | Xi = 1] − E[Yi | Xi = 0]            (Naive Effect)
  = E[Y1i | Xi = 1] − E[Y0i | Xi = 1]      (Average Treatment Effect on the Treated, ATT)
  + E[Y0i | Xi = 1] − E[Y0i | Xi = 0]      (Selection Bias)

By conditioning, we assert that the potential (control) outcomes are equivalent between treated and non-treated cases, so the difference we observe between treatment and control outcomes is only the average causal effect of the "treatment".
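This decomposition is easy to verify numerically in a simulation where, unlike in real data, both potential outcomes of every unit are known. The `decompose` function and the toy units are illustrative assumptions:

```python
def decompose(units):
    """units: list of (x, y0, y1) with BOTH potential outcomes known
    (possible only in a simulation). Returns the naive effect, the
    ATT, and the selection bias; naive = ATT + selection bias."""
    mean = lambda v: sum(v) / len(v)
    treated = [u for u in units if u[0] == 1]
    control = [u for u in units if u[0] == 0]
    naive = mean([y1 for _, _, y1 in treated]) - mean([y0 for _, y0, _ in control])
    att = mean([y1 - y0 for _, y0, y1 in treated])
    selection = mean([y0 for _, y0, _ in treated]) - mean([y0 for _, y0, _ in control])
    return naive, att, selection

# Made-up units where the treated would have worse outcomes anyway,
# so the naive effect overstates the ATT
units = [(1, 1, 1), (1, 0, 1), (0, 0, 0), (0, 0, 0)]
```

Here the naive effect is 1.0, but it splits into an ATT of 0.5 plus selection bias of 0.5.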

SLIDE 31

Common Conditioning Strategies

SLIDE 32

Common Conditioning Strategies

1 Condition on nothing (“naive effect”)

SLIDE 33

Common Conditioning Strategies

1 Condition on nothing (“naive effect”) 2 Condition on some variables

SLIDE 34

Common Conditioning Strategies

1 Condition on nothing (“naive effect”) 2 Condition on some variables 3 Condition on all observables

SLIDE 35

Common Conditioning Strategies

1 Condition on nothing (“naive effect”) 2 Condition on some variables 3 Condition on all observables

Which of these are good strategies?

SLIDE 36

Caveat!

We can only condition on observed confounding variables. If we think other confounds might exist but are unobservable, no form of conditioning can help us.

Example: Tobacco companies argued that an unknown genetic factor was a common cause of both smoking addiction and lung cancer

SLIDE 37

Post-treatment Bias

We usually want to know the total effect of a cause. If we include a mediator, D, of the X → Y relationship, the coefficient on X only reflects the direct effect and excludes the indirect effect of X through D.

So don't control for mediators!

SLIDE 38

Post-Treatment Bias

[Causal diagram: Smoking → Tar → Cancer, with Sex, Environment, Parental Smoking, and other factors as possible confounds]

SLIDE 39

Post-Treatment Bias

D (Tar)  X            Y (Cancer)
0        Smokers      . . .
0        Non-smokers  . . .
1        Smokers      . . .
1        Non-smokers  . . .

SLIDE 40

Post-Treatment Bias

D (Tar)  X            Y (Cancer)
0        Smokers      . . .
0        Non-smokers  . . .
1        Smokers      . . .
1        Non-smokers  . . .

Imagine:
ATE_Tar = (D̄_{X=1} − D̄_{X=0}) = 1
ATE_{Cancer of Tar} = (Ȳ_{D=1} − Ȳ_{D=0}) = 1


SLIDE 42

Post-Treatment Bias

D (Tar)  X            Y (Cancer)
0        Smokers      . . .
0        Non-smokers  . . .
1        Smokers      . . .
1        Non-smokers  . . .

Imagine:
ATE_Tar = (D̄_{X=1} − D̄_{X=0}) = 1
ATE_{Cancer of Tar} = (Ȳ_{D=1} − Ȳ_{D=0}) = 1
ATE_{Cancer of Smoking} = p_{D=1}(Ȳ_{X=1,D=1} − Ȳ_{X=0,D=1}) + p_{D=0}(Ȳ_{X=1,D=0} − Ȳ_{X=0,D=0})
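The trap can be checked with a toy dataset in which smoking affects cancer only through tar. Both helper functions and the counts below are hypothetical sketches, not course code:

```python
def naive_effect(data):
    """Difference in mean Y between X = 1 and X = 0."""
    mean = lambda v: sum(v) / len(v)
    return (mean([y for x, d, y in data if x == 1])
            - mean([y for x, d, y in data if x == 0]))

def mediator_conditioned_effect(data):
    """Stratify on the mediator D, then take the weighted
    within-stratum X effect on Y (the slide's ATE formula)."""
    mean = lambda v: sum(v) / len(v)
    n = len(data)
    total = 0.0
    for d in (0, 1):
        stratum = [(x, y) for x, dd, y in data if dd == d]
        t = [y for x, y in stratum if x == 1]
        c = [y for x, y in stratum if x == 0]
        total += (len(stratum) / n) * (mean(t) - mean(c))
    return total

# Made-up world: smokers mostly get tar deposits, and cancer tracks
# tar exactly (Y = D), so the entire effect runs through the mediator
data = []
for x, d, count in [(1, 1, 8), (1, 0, 2), (0, 1, 2), (0, 0, 8)]:
    data += [(x, d, d)] * count
```

The naive (total) effect of smoking is 0.6, but conditioning on the mediator drives the estimated effect to zero — exactly the post-treatment bias the slide warns about.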

SLIDE 44

1 Regression, Briefly
2 Matching and Conditioning
3 Multiple Regression

SLIDE 45

Multiple Regression

Regression achieves the same objectives as matching

Estimate the average causal effect of a variable conditional on other variables

SLIDE 46

Multiple Regression

Regression achieves the same objectives as matching

Estimate the average causal effect of a variable conditional on other variables

Requires a linear relationship between all RHS (X variables) and Y

Can be a set of binary indicator variables

SLIDE 47

Multiple Regression

Regression achieves the same objectives as matching

Estimate the average causal effect of a variable conditional on other variables

Requires a linear relationship between all RHS (X variables) and Y

Can be a set of binary indicator variables

We interpret coefficient estimates as marginal average treatment effects

SLIDE 48

From Line to Surface I

In simple regression, we estimate a line. In multiple regression, we estimate a surface. Each coefficient is the marginal effect, all else constant (at mean). This can be hard to picture in your mind.

SLIDE 49

From Line to Surface II

[Plot of y against x with fitted line] ŷ = β̂0 + β̂1X

SLIDE 50

From Line to Surface II

[Plot of y against x and z with fitted plane] ŷ = β̂0 + β̂2Z

SLIDE 51

From Line to Surface II

[Plot of y against x and z with fitted plane] ŷ = β̂0 + β̂1X + β̂2Z

SLIDE 52

Cusack, Iversen, and Soskice

[Causal diagram: Strength/Threat of Left → Proportional Representation, with Ethno-Linguistic Division and other factors]
SLIDE 53

Testing Rival Hypotheses

Rival hypotheses can be derived from two (or more) different theories. We can conduct independent tests of each:

Is there evidence consistent with Hyp 1? Is there evidence consistent with Hyp 2?

Regression allows us to test both simultaneously on the same data:

Is the data more consistent with Hyp 1 or Hyp 2?

Draw inferences about causality and about the validity of theories based on the data.

SLIDE 55

Cusack, Iversen, and Soskice

[Causal diagram: Business–Labour Coordination → Proportional Representation, with Ethno-Linguistic Division and other factors]

SLIDE 56

Cusack, Iversen, and Soskice

[Causal diagram: Strength/Threat of Left and Business–Labour Coordination (Z) → Proportional Representation, with Ethno-Linguistic Division and other factors]

SLIDE 57

Rival Theories

Rokkan–Boix: PR = β0 + β1·Threat + ε   (1)

SLIDE 59

Aside: Interpretation

All our interpretation rules from earlier still apply in a multivariate regression. Now we interpret a coefficient as an effect "all else constant". Generally, it is not good to give all coefficients a causal interpretation.

Think "forward causal inference": we're interested in the X → Y effect; all other coefficients are there as "controls".


SLIDE 61

Rival Theories

Rokkan–Boix: PR = β0 + β1·Threat + ε   (1)
Cusack, Iversen, and Soskice: PR = β0 + β2·Coordination + ε   (2)


SLIDE 64

Rival Theories

Rokkan–Boix: PR = β0 + β1·Threat + ε   (1)
Cusack, Iversen, and Soskice: PR = β0 + β2·Coordination + ε   (2)
Combined test: PR = β0 + β1·Threat + β2·Coordination + ε   (3)
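A simulated sketch of why the combined test matters: if coordination drives PR and is correlated with threat, the threat-only regression (1) finds a large spurious threat effect that the combined regression (3) removes. All names and numbers below are invented, not the Cusack–Iversen–Soskice data:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 500
threat = rng.normal(size=n)
# coordination is correlated with threat, so omitting it confounds
coordination = 0.8 * threat + rng.normal(size=n)
# assumed true model: PR depends only on coordination (beta1 = 0)
pr = 10.0 - 5.0 * coordination + rng.normal(size=n)

def ols(y, *regressors):
    """OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones_like(y)] + list(regressors))
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_threat_only = ols(pr, threat)            # eq. (1): biased threat effect
b_combined = ols(pr, threat, coordination)  # eq. (3)
```

In the threat-only fit, the threat coefficient is around −4 even though its true effect is zero; in the combined fit, it collapses toward zero and the coordination coefficient recovers −5.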

SLIDE 66

                      (1)                     (2)
stthroct2             0.047                   0.008
                     (0.035)                 (0.052)
coordds              −6.019***               −5.284***
                     (0.706)                 (1.008)
dispro2               0.042                   0.083
                     (0.052)                 (0.066)
fragdum               3.624                   0.123
                     (8.239)                 (8.911)
Constant             28.239***               25.211***
                     (5.866)                 (6.565)
Observations         13                      12
R2                    0.947                   0.948
Adjusted R2           0.920                   0.919
Residual Std. Error   4.217 (df = 8)          4.207 (df = 7)
F Statistic          35.673*** (df = 4; 8)   32.084*** (df = 4; 7)

Note: *p<0.1; **p<0.05; ***p<0.01

SLIDE 68

So the effect found by Rokkan and Boix was confounded by business–labour coordination. What was happening when they omitted the coordination variable?

SLIDE 69

Omitted Variable Bias

We want to estimate:
Y = β0 + β1X + β2Z + ε

We actually estimate:
ỹ = β̃0 + β̃1x + (0·z) + ε = β̃0 + β̃1x + ν

Bias: β̃1 = β̂1 + β̂2·δ̃1, where z̃ = δ̃0 + δ̃1x is the fitted regression of z on x
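This bias formula is an exact algebraic identity in OLS, which a quick simulation can confirm. All variable names and coefficients here are invented for illustration:

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200
x = rng.normal(size=n)
z = 0.5 * x + rng.normal(size=n)              # z correlated with x
y = 1.0 + 2.0 * x + 3.0 * z + rng.normal(size=n)

def ols(y, *regressors):
    """OLS coefficients (intercept first) via least squares."""
    X = np.column_stack([np.ones_like(y)] + list(regressors))
    return np.linalg.lstsq(X, y, rcond=None)[0]

b_long = ols(y, x, z)   # beta0_hat, beta1_hat, beta2_hat
b_short = ols(y, x)     # beta0_tilde, beta1_tilde (z omitted)
delta = ols(z, x)       # delta0_tilde, delta1_tilde (z on x)

# omitted variable bias: beta1_tilde = beta1_hat + beta2_hat * delta1_tilde
bias = b_long[2] * delta[1]
```

The identity holds to machine precision in any sample, not just on average; here β̂2 and δ̃1 are both positive, so the short-regression coefficient on x is biased upward.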

SLIDE 70

But have Cusack, Iversen, and Soskice considered all possible confounds?

SLIDE 73

                      (1)                     (2)
stthroct2             0.058                   0.006
                     (0.048)                 (0.043)
coordds              −5.556***               −0.398
                     (1.578)                 (2.467)
dispro2               0.013                  −0.049
                     (0.102)                 (0.083)
fragdum               4.983                   3.366
                     (9.642)                 (7.465)
brit                  4.088                  30.412*
                     (12.258)                (14.469)
Constant             26.911***                9.390
                     (7.388)                 (9.253)
Observations         13                      12
R2                    0.948                   0.970
Adjusted R2           0.910                   0.945
Residual Std. Error   4.472 (df = 7)          3.449 (df = 6)
F Statistic          25.390*** (df = 5; 7)   39.083*** (df = 5; 6)

Note: *p<0.1; **p<0.05; ***p<0.01

SLIDE 75

Aside: Interpolation/Extrapolation

In prediction, we may want to use our estimated coefficients to predict outcome values for new cases. Interpolation is prediction within the interval covered by our observed data. Extrapolation is prediction outside the interval covered by our observed data.

SLIDE 77

Lingering Issues

SLIDE 78

Lingering Issues

1 Inference to a population

Inferences from data to population depend on generalizability

SLIDE 79

Lingering Issues

1 Inference to a population

Inferences from data to population depend on generalizability

2 Interaction terms

Allow us to test whether an effect varies across values of other variables

PR = β0 + β1·Threat + β2·Coord + ε
PR = β0 + β1·Threat + β2·Coord + β3(Threat × Coord) + ε

SLIDE 80

Lingering Issues

1 Inference to a population

Inferences from data to population depend on generalizability

2 Interaction terms

Allow us to test whether an effect varies across values of other variables

PR = β0 + β1·Threat + β2·Coord + ε
PR = β0 + β1·Threat + β2·Coord + β3(Threat × Coord) + ε

3 RHS variables must not be perfectly collinear
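An interaction model of this kind can be sketched on simulated data (all names and coefficients here are invented; the coordination variable is made binary purely for readability):

```python
import numpy as np

rng = np.random.default_rng(2)
n = 400
threat = rng.normal(size=n)
coord = rng.integers(0, 2, size=n).astype(float)  # binary for clarity
# assumed truth: the threat effect is 1.0 when coord = 0 and 3.0 when coord = 1
y = 1.0 * threat + 2.0 * threat * coord + 0.5 * coord + rng.normal(size=n)

# regressors: intercept, threat, coord, and the interaction term
X = np.column_stack([np.ones(n), threat, coord, threat * coord])
b = np.linalg.lstsq(X, y, rcond=None)[0]

# b[1] estimates the threat effect at coord = 0;
# b[1] + b[3] estimates the threat effect at coord = 1
```

The interaction coefficient b[3] is the difference in the threat effect across levels of coord, which is exactly what the model tests.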
