Causal inference
Part I.b: randomized experiments, matching and regression (this lecture starts with other slides on randomized experiments) Frank Venmans
Example of a randomized experiment: Job Training Partnership Act (JTPA)
Largest
sites
earnings, race, etc.
drastically cut.
experiment
cigarette smokers)
distribution as non-smokers?
Age group    Death rate pipe smokers    # Pipe smokers    # Non-smokers
Age 20-50    15                         11,000            29,000
Age 50-70    35                         13,000            9,000
Age +70      50                         16,000            2,000
Total                                   40,000            40,000
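The question of what the pipe smokers' death rate would be with the non-smokers' age distribution can be worked out directly from the table. The following is a minimal sketch, not part of the original slides; the variable names and the helper `weighted_mean` are illustrative assumptions.

```python
# Death rates and group sizes copied from the table, by age group
# (20-50, 50-70, +70).
death_rate_pipe = [15, 35, 50]          # death rate of pipe smokers
n_pipe = [11_000, 13_000, 16_000]       # number of pipe smokers
n_non = [29_000, 9_000, 2_000]          # number of non-smokers


def weighted_mean(values, weights):
    """Average of `values`, using `weights` as cell sizes."""
    return sum(v * w for v, w in zip(values, weights)) / sum(weights)


# Observed average: weights are the pipe smokers' own age distribution.
observed = weighted_mean(death_rate_pipe, n_pipe)
# Adjusted average: same death rates, but the non-smokers' age distribution.
adjusted = weighted_mean(death_rate_pipe, n_non)

print(observed, adjusted)  # 35.5 21.25
```

The gap between the two numbers comes only from age: pipe smokers are concentrated in the older groups, where death rates are higher.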
than average
Predetermined Covariates:
(smoking) if for each individual i, X0i=X1i
Outcomes
(possibly) not predetermined are called outcomes if for some individual i, $Y_{0i} \neq Y_{1i}$
post-treatment bias
$(Y_1, Y_0) \perp D \mid X$ (selection on observables)
must be included in the model (X is a vector of covariates)!
ATET
$Y_0 \perp D \mid X$ (selection on observables)
same for smokers and non-smokers
values of X, there are no treated units, this is not a problem.
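$$\hat{\alpha}_{ATE} = \sum_{k=1}^{K} \left(\bar{Y}_1^k - \bar{Y}_0^k\right) \frac{N^k}{N}; \qquad \hat{\alpha}_{ATET} = \sum_{k=1}^{K} \left(\bar{Y}_1^k - \bar{Y}_0^k\right) \frac{N_1^k}{N_1}$$
$N^k$ is # of obs in cell k, $N_1^k$ is # of treated obs in cell k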
Xk       Death rate smokers    Death rate non-smokers    Diff.    # smokers    # Obs.
Old      28                    24                        4        3            10
Young    22                    16                        6        7            10
Total    23.8                  21.6                      2.2      10           20
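The subclassification estimators can be checked by hand on this table. The sketch below is illustrative only (the dictionary layout and variable names are assumptions, not from the slides): it weights the within-cell differences by the share of all observations for the ATE and by the share of treated observations for the ATET.

```python
# Cells copied from the table: death rate of smokers (y1) and of
# non-smokers (y0), number of smokers (n1) and number of observations (n).
cells = {
    "Old":   {"y1": 28, "y0": 24, "n1": 3, "n": 10},
    "Young": {"y1": 22, "y0": 16, "n1": 7, "n": 10},
}

n_total = sum(c["n"] for c in cells.values())       # N  = 20
n_treated = sum(c["n1"] for c in cells.values())    # N1 = 10

# ATE: weight each within-cell difference by the cell's share of all obs.
ate = sum((c["y1"] - c["y0"]) * c["n"] / n_total for c in cells.values())
# ATET: weight each within-cell difference by the cell's share of treated obs.
atet = sum((c["y1"] - c["y0"]) * c["n1"] / n_treated for c in cells.values())

# Naive comparison of the two "Total" averages, for contrast.
naive = (sum(c["y1"] * c["n1"] for c in cells.values()) / n_treated
         - sum(c["y0"] * (c["n"] - c["n1"]) for c in cells.values())
         / (n_total - n_treated))

print(round(ate, 1), round(atet, 1), round(naive, 1))  # 5.0 5.4 2.2
```

The naive difference (2.2) is smaller than both adjusted estimates because smokers are mostly young here, and the young have lower death rates.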
ATET by "imputing" the missing potential outcome of each treated unit using the observed outcome from the "closest" control unit:
$$\hat{\alpha}_{ATET} = \frac{1}{N_1} \sum_{D_i=1} \left(Y_i - Y_{j(i)}\right)$$
where $Y_{j(i)}$ is the outcome of an untreated observation such that $X_{j(i)}$ is the closest value to $X_i$ among the untreated observations.
$$\hat{\alpha}_{ATET} = \frac{1}{N_1} \sum_{D_i=1} \left(Y_i - \frac{1}{M} \sum_{m=1}^{M} Y_{j_m(i)}\right)$$
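Both matching estimators (one match, or the average of M matches) can be sketched in a few lines. This is a minimal illustration on a scalar covariate, assuming matching with replacement and ties broken by list order; the function name `matching_atet` and the toy data are hypothetical.

```python
def matching_atet(y, d, x, m=1):
    """ATET by nearest-neighbour matching on a scalar covariate.

    Each treated unit is compared with the mean outcome of its `m` closest
    untreated units (matching with replacement; ties broken by list order).
    """
    treated = [i for i, di in enumerate(d) if di == 1]
    controls = [j for j, dj in enumerate(d) if dj == 0]
    total = 0.0
    for i in treated:
        # Sort controls by distance to the treated unit, keep the m closest.
        closest = sorted(controls, key=lambda j: abs(x[j] - x[i]))[:m]
        imputed_y0 = sum(y[j] for j in closest) / len(closest)
        total += y[i] - imputed_y0
    return total / len(treated)


# Hypothetical toy data: outcome y, treatment d, covariate x (e.g. age).
y = [10, 12, 20, 7, 9, 15, 18]
d = [1, 1, 1, 0, 0, 0, 0]
x = [30, 40, 60, 28, 41, 55, 70]

single = matching_atet(y, d, x, m=1)   # one match per treated unit
double = matching_atet(y, d, x, m=2)   # average of the two closest controls
```

With m=2 the second-closest control is sometimes far away (age 28 for the treated unit aged 40), which illustrates how using more matches trades extra information against match quality.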
Single matching vs multiple matching
efficient (lower standard errors of estimate)
Matching with replacement vs without replacement
times => lower bias
served already as a match. Therefore the second best match is used. This increases the set of information that is used. More efficient.
specified.
variance) has the same weight. E.g., if there are 3 variables, you would match the points that are closest in a standardized 3D plot.
country, or sector), combined with another distance metric for other variables.
badly.
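A variance-normalized distance combined with exact matching on a categorical variable can be sketched as follows. This is an illustration under assumptions, not the slides' implementation: the firm data, the `sector` field, and the helper names are hypothetical, and each squared difference is divided by the variance of that variable.

```python
from statistics import pvariance


def normalized_distance(a, b, variances):
    """Euclidean distance with each squared difference divided by that variable's variance."""
    return sum((ai - bi) ** 2 / v for ai, bi, v in zip(a, b, variances)) ** 0.5


def closest_control(unit, controls, variances, exact=True):
    """Index of the nearest control; with exact=True only controls in the same sector qualify."""
    candidates = [j for j, c in enumerate(controls)
                  if not exact or c["sector"] == unit["sector"]]
    if not candidates:
        return None  # no control satisfies the exact-matching restriction
    return min(candidates,
               key=lambda j: normalized_distance(unit["x"], controls[j]["x"], variances))


# Hypothetical firms: x = (size, age of the firm).
treated = {"sector": "A", "x": (110.0, 8.0)}
controls = [
    {"sector": "A", "x": (100.0, 5.0)},
    {"sector": "A", "x": (120.0, 30.0)},
    {"sector": "B", "x": (101.0, 6.0)},
]

# Variance of each variable over all firms, so both variables get the same weight.
all_x = [treated["x"]] + [c["x"] for c in controls]
variances = [pvariance(column) for column in zip(*all_x)]

with_exact = closest_control(treated, controls, variances)               # sector must match
without_exact = closest_control(treated, controls, variances, exact=False)
```

In this toy data the overall nearest firm is in another sector, so the exact-matching restriction changes which control is selected.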
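$$\hat{\alpha}_{ATET} = \frac{1}{N_1} \sum_{D_i=1} \left[\left(Y_i - Y_{j(i)}\right) - \left(\hat{\mu}_0(X_i) - \hat{\mu}_0(X_{j(i)})\right)\right]$$
with $\hat{\mu}_0 = \hat{\beta}_0 + \hat{\beta}_1 X_1 + \hat{\beta}_2 X_2 + \dots$ estimated by OLS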
companies, even if the matching algorithm searches among the smallest of the control companies, the mean size of the matched controls may still be greater than the mean size of the treated companies. Bias correction will calculate a size effect and deduct this from the estimated treatment effect (Abadie & Imbens, 2006).
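The size example can be made concrete with a small sketch. It assumes a single covariate (firm size), nearest-neighbour matching with replacement, and a regression of the outcome on size fitted by OLS on the untreated units only; these choices, the helper names, and the data are illustrative assumptions. The toy outcomes are built so that the true effect is 3 and size adds 0.5 per unit.

```python
def ols_line(x, y):
    """Intercept and slope of a simple OLS regression of y on x."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    slope = (sum((a - mx) * (b - my) for a, b in zip(x, y))
             / sum((a - mx) ** 2 for a in x))
    return my - slope * mx, slope


def matching_with_bias_correction(y, d, x):
    """Return (simple matching ATET, bias-corrected matching ATET)."""
    treated = [i for i, di in enumerate(d) if di == 1]
    controls = [j for j, dj in enumerate(d) if dj == 0]
    # Assumption: mu_0(x) is fitted on the untreated units only.
    b0, b1 = ols_line([x[j] for j in controls], [y[j] for j in controls])
    simple = corrected = 0.0
    for i in treated:
        j = min(controls, key=lambda c: abs(x[c] - x[i]))   # closest control
        size_effect = (b0 + b1 * x[i]) - (b0 + b1 * x[j])   # mu_0(X_i) - mu_0(X_j(i))
        simple += y[i] - y[j]
        corrected += (y[i] - y[j]) - size_effect
    return simple / len(treated), corrected / len(treated)


# Hypothetical data: treated firms (size 1 and 2) are smaller than all controls.
x = [1, 2, 4, 6, 8]
d = [1, 1, 0, 0, 0]
y = [5.5, 6.0, 4.0, 5.0, 6.0]   # controls: 2 + 0.5*x; treated: 2 + 0.5*x + 3

simple, corrected = matching_with_bias_correction(y, d, x)
```

Simple matching gives 1.75 because even the smallest control is larger than both treated firms; removing the estimated size effect recovers 3.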
estimation of the form $$\widehat{Var}\left(\hat{\alpha}_{ATET}\right) = \frac{1}{N_1^2} \sum_{D_i=1} \left(Y_i - Y_{j(i)} - \hat{\alpha}_{ATET}\right)^2$$
(analytical solution not given)
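As a small numerical illustration of the variance formula of this form, the sketch below takes the matched differences as given. The function name and the three differences are hypothetical.

```python
def matching_variance(diffs):
    """Variance estimate (1 / N1^2) * sum((diff_i - ATET)^2) from matched differences."""
    n1 = len(diffs)
    atet = sum(diffs) / n1
    return sum((dv - atet) ** 2 for dv in diffs) / n1 ** 2


# Hypothetical matched differences Y_i - Y_j(i) for three treated units.
diffs = [3.0, 3.0, 5.0]
variance = matching_variance(diffs)
std_error = variance ** 0.5
```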
confounding variables: $p(X) = P(D = 1 \mid X)$
$(Y_1, Y_0) \perp D \mid X \Rightarrow (Y_1, Y_0) \perp D \mid p(X)$
combination of their confounders X, then they are a good (unbiased) match.
would be a woman who is a little bit older.
and common support
$$\hat{\alpha}_{ATE} = \frac{1}{N} \sum_{i=1}^{N} Y_i \, \frac{D_i - p(X_i)}{p(X_i)\left(1 - p(X_i)\right)}$$
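$$\hat{\alpha}_{ATET} = \frac{1}{N_1} \sum_{i=1}^{N} Y_i \, \frac{D_i - p(X_i)}{1 - p(X_i)}$$
where $N_1$ is the number of treated observations.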
matching.
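Propensity score weighting can be illustrated with a two-cell example (old/young smokers and non-smokers). In this sketch the propensity score is simply the share of treated units in each cell, individual records are rebuilt from hypothetical cell means, and the ATET is normalized by the number of treated units; the names `ipw_ate` and `ipw_atet` are illustrative assumptions.

```python
# Hypothetical cell summaries: mean outcome of treated (y1) and untreated (y0),
# number of treated units (n1) and number of observations (n).
cells = {
    "Old":   {"y1": 28, "y0": 24, "n1": 3, "n": 10},
    "Young": {"y1": 22, "y0": 16, "n1": 7, "n": 10},
}

# One record (y, d, p) per unit; each unit is given its cell's mean outcome,
# and p(X) is estimated as the share of treated units in the cell.
records = []
for c in cells.values():
    p = c["n1"] / c["n"]
    records += [(c["y1"], 1, p)] * c["n1"]
    records += [(c["y0"], 0, p)] * (c["n"] - c["n1"])

# Common support: every unit needs a propensity score strictly between 0 and 1.
common_support = all(0 < p < 1 for _, _, p in records)


def ipw_ate(records):
    """(1/N) * sum of Y * (D - p) / (p * (1 - p))."""
    return sum(y * (d - p) / (p * (1 - p)) for y, d, p in records) / len(records)


def ipw_atet(records):
    """(1/N1) * sum of Y * (D - p) / (1 - p), with N1 the number of treated units."""
    n1 = sum(d for _, d, _ in records)
    return sum(y * (d - p) / (1 - p) for y, d, p in records) / n1


ate, atet = ipw_ate(records), ipw_atet(records)
print(round(ate, 6), round(atet, 6))  # 5.0 5.4
```

With a discrete covariate and cell-frequency propensity scores, weighting reproduces the subclassification results for the same cells (4 and 6 within-cell differences, weighted by cell shares).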
2 = $\sigma^2$
being equal, an increase of X1 by one unit will increase Y by $\beta_1$.
inverse
(simultaneity).
underestimates the causal effect of parents' education on their children's results, because the indirect effect is not included.
matrix notation: $E[uu'] = \sigma^2 I$
[Figure: causal diagram. Labels: death rate; smoking; encouragement by peers in the past; age; gender; alcohol; error = all other factors; other health conditions; car accidents; genetic]
$E[u \mid X] \neq 0 \Rightarrow cov(u, X) \neq 0 \Rightarrow$ u and X are driven by common factors.
[Figure: causal diagram. Labels: death rate; smoking; encouragement by peers in the past; age; gender; alcohol; error; other health conditions; car accident; genetic]
$= E[X_1 X_2]\,\beta_2 + E[X_1 u] \neq 0 \Rightarrow E[u \mid X_1] \neq 0$
($E[X_1 X_2]\,\beta_2 \neq 0$ if $X_1$ and $X_2$ are related and $Y$ and $X_2$ are related)
problem)
aviation sector; supply and demand function)
number of cigarettes per day, but we omit age => the error is correlated with X
[Figure: scatter plot of Y = death rate against X = # cigarettes per day. Labels: "Young person, negative error from age"; "Unbiased relationship for Y conditional on confounders"; "Observed relationship"]
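The omitted-age example can be simulated to see how the observed slope differs from the slope conditional on the confounder. Everything in this sketch is an assumption made for illustration: the data-generating process (younger people smoke more, a true cigarette effect of 1.0 and an age effect of 0.5), the seed, and the helper names. The long regression is obtained by partialling out the other regressor.

```python
import random


def slope(x, y):
    """Slope of a simple OLS regression of y on x (with intercept)."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    return (sum((a - mx) * (b - my) for a, b in zip(x, y))
            / sum((a - mx) ** 2 for a in x))


def residuals(x, y):
    """Residuals of a simple OLS regression of y on x (with intercept)."""
    b = slope(x, y)
    a = sum(y) / len(y) - b * sum(x) / len(x)
    return [yi - a - b * xi for xi, yi in zip(x, y)]


rng = random.Random(0)
n = 5000
age = [rng.uniform(20, 80) for _ in range(n)]
# Assumption for illustration only: younger people smoke more.
cigs = [40 - 0.2 * a + rng.gauss(0, 6) for a in age]
# Assumed true model: death rate = 1.0 * cigs + 0.5 * age + noise.
death = [1.0 * c + 0.5 * a + rng.gauss(0, 5) for c, a in zip(cigs, age)]

short = slope(cigs, death)   # observed relationship: age is omitted
# Long regression coefficients via partialling out the other regressor.
b_cigs = slope(residuals(age, cigs), residuals(age, death))
b_age = slope(residuals(cigs, age), residuals(cigs, death))
delta = slope(cigs, age)     # regression of the omitted variable on cigs

# In-sample identity: short = b_cigs + b_age * delta
print(short, b_cigs + b_age * delta)
```

Because age raises the death rate and is negatively related to smoking in this simulated sample, the observed slope is flatter than the slope conditional on age.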
$(Y_0, Y_1) \perp D \mid X$
continuum of counterfactual scenarios.
$\beta_1$ is a variance-weighted average treatment effect (~ a variance-weighted multiple matching without replacement).
an advantage over OLS.
model is not necessary.