SLIDE 1

Heterogeneity, Endogeneity and Causal Effect Estimation

Kevin Sheppard

https://www.kevinsheppard.com

Oxford MFE

This version: March 11, 2020

March 2020

SLIDE 2

Causal Effect Estimation

Potential Outcomes
Challenges in Effect Estimation
Experimental and Quasi-Experimental Data
◮ Randomized Controlled Experiments and ATE
◮ Imperfect Compliance and LATE
Observational Data
◮ Regression Discontinuity
◮ Difference-in-Difference
◮ Panel Models

SLIDE 3

Potential Outcomes Framework

Observed outcome for individual or firm $i$: $Y_i$
$D_i$ is the treatment status variable for individual $i$

$$D_i = \begin{cases} 0 & \text{if untreated} \\ 1 & \text{if treated} \end{cases}$$

Outcome variable is determined by

$$Y_i = \beta_{0i} + \beta_{1i} D_i$$

$\beta_{1i}$ is a heterogeneous treatment effect for individual $i$
Also known as the potential outcomes model
Two outcomes

$$Y_i(0) = \beta_{0i} \quad \text{and} \quad Y_i(1) = \beta_{0i} + \beta_{1i}$$

SLIDE 4

Key Measures

Definition (Average Treatment Effect (ATE))

The Average Treatment Effect measures the average effect of treatment across the entire population

$$\mathrm{ATE} = E[\beta_{1i}] = E[Y_i(1)] - E[Y_i(0)]$$

Definition (Average Treatment Effect on the Treated (TOT))

The Average Treatment Effect on the Treated measures the effect of treatment on the treated

$$\mathrm{TOT} = E[\beta_{1i} \mid D = 1] = E[Y_i(1) \mid D = 1] - E[Y_i(0) \mid D = 1]$$
SLIDE 5

ATE and TOT

ATE is a weighted average

$$\mathrm{ATE} = \omega \, \mathrm{TOT} + (1 - \omega) \, \mathrm{TUT}$$

Average Treatment Effect on the Untreated (TUT)

$$\mathrm{TUT} = E[\beta_{1i} \mid D = 0] = E[Y_i(1) \mid D = 0] - E[Y_i(0) \mid D = 0]$$

$\omega = \Pr[D = 1]$ is the probability of being treated

Should we measure ATE or TOT?
◮ TOT makes sense when treatment is non-compulsory
  Individuals who do not undertake treatment are not relevant for the cost-benefit calculation
◮ ATE is more sensible for mandatory programs
  Measures the effect on both those who would like to participate and those who would not

SLIDE 6

Naïve estimation

Estimate the regression on observed data

$$Y_i = b_0 + b_1 D_i + \varepsilon_i$$

◮ $\hat{b}_0 \xrightarrow{p} E[Y_i \mid D = 0]$
◮ $\hat{b}_1 \xrightarrow{p} E[Y_i \mid D = 1] - E[Y_i \mid D = 0]$

Leads to selection bias

$$\underbrace{E[Y_i \mid D = 1] - E[Y_i \mid D = 0]}_{\text{Observed Effect}} = \underbrace{E[Y_i(1) \mid D = 1] - E[Y_i(0) \mid D = 1]}_{\text{Avg. Treatment Effect on the Treated (TOT)}} + \underbrace{E[Y_i(0) \mid D = 1] - E[Y_i(0) \mid D = 0]}_{\text{Selection Bias (SB)}}$$

In terms of the regression

$$\underbrace{E[\hat{b}_1]}_{\text{Observed Effect}} = \underbrace{E[\beta_{1i} \mid D = 1]}_{\text{TOT}} + \underbrace{E[\beta_{0i} \mid D = 1] - E[\beta_{0i} \mid D = 0]}_{\text{Selection Bias (SB)}}$$

SB is the difference in the no-treatment outcomes for the treated and untreated
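The decomposition of the observed effect into TOT plus selection bias, and of ATE into a weighted average of TOT and TUT, can be checked on simulated data. The sketch below is only an illustration: the data-generating process (normal draws, a selection rule that depends on both potential-outcome components, the 1.5 cutoff) is an assumption made for the example and does not come from the slides.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical population: beta0 is the no-treatment outcome and beta1 the
# individual treatment effect. Both influence who selects into treatment.
beta0 = rng.normal(1.0, 1.0, n)
beta1 = rng.normal(0.5, 1.0, n)
d = (beta0 + beta1 + rng.normal(0.0, 1.0, n) > 1.5).astype(int)

# Observed outcome: Y_i = beta0_i + beta1_i * D_i
y = beta0 + beta1 * d

observed = y[d == 1].mean() - y[d == 0].mean()          # naive difference in means
tot = beta1[d == 1].mean()                              # effect on the treated
tut = beta1[d == 0].mean()                              # effect on the untreated
sb = beta0[d == 1].mean() - beta0[d == 0].mean()        # selection bias
omega = d.mean()                                        # share treated
ate = beta1.mean()

print(f"observed={observed:.3f}  TOT={tot:.3f}  SB={sb:.3f}  TOT+SB={tot + sb:.3f}")
print(f"ATE={ate:.3f}  omega*TOT+(1-omega)*TUT={omega * tot + (1 - omega) * tut:.3f}")
```

Both identities hold exactly in the sample, not just in expectation, because each side is built from the same group means. In real data `beta0` and `beta1` are never observed separately, which is the missing-counterfactual problem.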

SLIDE 7

(Missing) Counterfactuals

Fundamental problem: Cannot see counterfactual

Treatment ($D_i$)    0                                    1
Observe              $Y_i(0) = \beta_{0i}$                $Y_i(1) = \beta_{0i} + \beta_{1i}$
Counterfactual       $Y_i(1) = \beta_{0i} + \beta_{1i}$   $Y_i(0) = \beta_{0i}$

No data on $Y_i(1)$ when $D_i = 0$ and $Y_i(0)$ when $D_i = 1$
TOT measures the effect conditional on receiving treatment
◮ Missing counterfactual: $E[Y_i(0) \mid D = 1]$
Observed effect is contaminated with selection bias

SLIDE 8

Example: Financial Stress and Payday Loans

Outcome is a measure of financial distress: 90-days delinquent on a debt
Treatment is taking out a payday loan
TOT: Difference in delinquency if loan taken or not, given loan wanted ($D = 1$)
SB: Difference in outcome if loan not taken between those who want a loan and those who do not want a loan
◮ Plausible that TOT is negative but SB is positive
◮ Positive SB if $E[\beta_{0i} \mid D = 1] > E[\beta_{0i} \mid D = 0]$
  Default rates absent a loan are higher for loan takers than for non-takers
◮ Observed effect could have either sign

SLIDE 9

Randomization

Randomization removes selection bias
Well-executed Randomized Controlled Trials are the gold standard for causal effect estimation
An RCT ensures that

$$\{\beta_{0i}, \beta_{1i}\} \perp\!\!\!\perp D_i \quad \text{and} \quad \{Y_i(0), Y_i(1)\} \perp\!\!\!\perp D_i$$

Randomly give loans only to those seeking them
◮ Creates a group with $Y_i(0)$ as if $D = 1$

Independence and Conditioning

If $Z$ and $W$ are independent random variables, then

$$E[Z \mid W = w_1] = E[Z \mid W = w_2] = E[Z]$$

Knowledge of $W$ provides no information about $Z$.

SLIDE 10

Randomization

Gains to Randomization

$E[Y_i(0) \mid D = 0] = E[Y_i(0) \mid D = 1]$ and $E[\beta_{0i} \mid D = 1] = E[\beta_{0i} \mid D = 0]$ since treatment is independent of the desire to be treated

Track outcomes of both groups

$$\underbrace{E[Y_i \mid D = 1] - E[Y_i \mid D = 0]}_{\text{Observed Effect with Randomization}} = E[Y_i(1) \mid D = 1] - E[Y_i(0) \mid D = 0] = E[Y_i(1) \mid D = 1] - E[Y_i(0) \mid D = 1]$$

In the notation of a regression model

$$\begin{aligned}
E[\hat{b}_1] &= E[\beta_{1i} \mid D = 1] + \left( E[\beta_{0i} \mid D = 1] - E[\beta_{0i} \mid D = 0] \right) \\
&= E[\beta_{1i} \mid D = 1] + \left( E[\beta_{0i} \mid D = 1] - E[\beta_{0i} \mid D = 1] \right) \\
&= E[\beta_{1i} \mid D = 1]
\end{aligned}$$
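A small simulation can illustrate why randomization removes selection bias. The sketch assumes a hypothetical population in which the no-treatment outcome and the treatment effect are correlated, and then assigns treatment by a coin flip; the distributions and sample size are illustrative choices only.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical population with correlated baseline outcome and treatment effect.
beta0 = rng.normal(1.0, 1.0, n)
beta1 = 0.5 + 0.5 * beta0 + rng.normal(0.0, 1.0, n)

# Randomized assignment: D is independent of (beta0, beta1) by construction.
d = rng.integers(0, 2, n)
y = beta0 + beta1 * d

observed = y[d == 1].mean() - y[d == 0].mean()
selection_bias = beta0[d == 1].mean() - beta0[d == 0].mean()
tot = beta1[d == 1].mean()
ate = beta1.mean()

print(f"observed={observed:.3f}  TOT={tot:.3f}  ATE={ate:.3f}  SB={selection_bias:.3f}")
```

With random assignment the selection-bias term is close to zero and TOT is close to ATE, so the simple difference in means estimates both, up to sampling error.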
SLIDE 11

Issues Affecting RCT Validity

Internal Validity: are the results valid for the sample used?
◮ Is the assignment actually random?

$$X_{it} = \alpha + \beta D_{it} + \varepsilon_{it}, \quad H_0: \beta = 0, \quad H_1: \beta \neq 0$$

◮ Are participants complying?
◮ Are there spill-overs of non-rival treatments to non-treated?
◮ Hawthorne Effect: studying a subject changes their behavior
External Validity: do the results generalize to a broader sample?
◮ Is the RCT sample representative of the target population?
◮ Are there other key personnel that are essential for success?

SLIDE 12

LATE: Local Average Treatment Effects

Previous result requires perfect compliance
◮ Treated if offered, not treated if not offered
When treatment is not random, or compliance is not perfect, simple estimators are not consistent
Possible to use an instrument to recover a meaningful measure of treatment effect
Measure is local in the sense that it measures the effect for a particular subgroup of the treated
Notation
◮ $D_i$ is treatment status
◮ $Z_i$ is treatment assignment (offer to treat)
Compliance
◮ Perfect if $D_i = Z_i$
◮ Imperfect if $D_i \neq Z_i$ for some $i$
$Z_i$ may be random even if $D_i$ is not
◮ Treatment assignment is made by lottery due to limited capacity ($Z_i$)
◮ Treatment status conditional on offer depends on expected benefits ($D_i$)

SLIDE 13

System of Equations

Leads to two-equation system

Structural Equation: $Y_i = \beta_{0i} + \beta_{1i} D_i$
Treatment Equation: $D_i = \pi_{0i} + \pi_{1i} Z_i$

Causal chain $Z_i \to D_i \to Y_i$
Treatment equation measures potential treatment status

$$D_i(z) = \pi_{0i} + \pi_{1i} z$$

◮ $D_i(0) = \pi_{0i}$ is status when not assigned
◮ $D_i(1) = \pi_{0i} + \pi_{1i}$ is status when assigned
◮ Both $D_i(0)$ and $D_i(1)$ may be 0 or 1
Treatment responsiveness $\pi_{1i}$ is heterogeneous like treatment effect $\beta_{1i}$

SLIDE 14

Independence

Assumption (Independence)

The potential outcomes and potential treatment assignments are independent of $Z_i$

$$\{\beta_{0i}, \beta_{1i}, \pi_{0i}, \pi_{1i}\} \perp\!\!\!\perp Z_i$$

Often described as "as if randomly assigned"
Note that the instrument is independent of the potential treatment status
$Z_i$ does not affect the probability that either potential treatment status occurs ($\pi_{\bullet i}$)
$Z_i$ does not affect the outcomes if treatment is taken or not ($\beta_{\bullet i}$)
Is this a reasonable assumption?
◮ Often plausible when $Z_i$ is assigned using randomization (lottery)
◮ Sometimes plausible for $Z_i$ taken from observational data

SLIDE 15

Exclusion

Assumption (Exclusion Restriction)

The instrument does not appear in the structural equation, so that it affects the outcome only through treatment status.

Violations of the exclusion restriction mean that $Z_i$ affects $Y_i$ through more than just $D_i$
Classic example is when $Z_i$ directly affects both $Y_i$ and $D_i$
In many cases, $Z_i$ affects $D_i$ and another variable $X_i$ which in turn affects $Y_i$

$$Z_i \to D_i \to Y_i, \qquad Z_i \to X_i \to Y_i$$

Suppose selection for a randomly assigned government funding program increases the probability of program participation ($Z_i \to D_i$)
If selection also increases the probability that a firm receives series B funding, then the effect is confounded with fund raising ($Z_i \to X_i$)
Exclusion restriction ensures that $Z$ does not affect the potential outcomes

$$Y_i(0) = \beta_{0i} \quad \text{and} \quad Y_i(1) = \beta_{0i} + \beta_{1i} \quad \text{for } Z \in \{0, 1\}$$

SLIDE 16

Instrumental Variable Estimation

The 2SLS estimator is obtained by
1. Regress $D_i = p_0 + p_1 Z_i + \eta_i$ and retain $\hat{D}_i = \hat{p}_0 + \hat{p}_1 Z_i$
2. Regress $Y_i = b_0 + b_1 \hat{D}_i + \varepsilon_i$

In large samples

$$\hat{b}_1^{2SLS} \xrightarrow{p} \frac{E[\beta_{1i} \pi_{1i}]}{E[\pi_{1i}]} = E\left[ \beta_{1i} \frac{\pi_{1i}}{E[\pi_{1i}]} \right] = \mathrm{LATE}$$

LATE is a weighted average of treatment effects
Weights are determined by responsiveness to treatment assignment
◮ Holds even if either $D_i$ or $Z_i$ is not binary
If effects are not heterogeneous ($\beta_{1i} = \beta_1$ or $\pi_{1i} = \pi_1$) then LATE = ATE
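The two-step recipe can be sketched on simulated data with imperfect compliance. Everything about the population below is a hypothetical choice for illustration (the shares of compliers, never-takers and always-takers, and their average effects); there are no defiers. The helper `ols` is a name introduced here, built on `numpy.linalg.lstsq`.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Hypothetical population: 50% compliers, 30% never-takers, 20% always-takers.
kind = rng.choice(["complier", "never", "always"], size=n, p=[0.5, 0.3, 0.2])
pi0 = (kind == "always").astype(float)     # D_i(0)
pi1 = (kind == "complier").astype(float)   # D_i(1) - D_i(0)

beta0 = rng.normal(0.0, 1.0, n)
mean_effect = np.where(kind == "complier", 2.0, np.where(kind == "always", 4.0, -1.0))
beta1 = mean_effect + rng.normal(0.0, 1.0, n)

z = rng.integers(0, 2, n).astype(float)    # randomized offer
d = pi0 + pi1 * z                          # treatment equation
y = beta0 + beta1 * d                      # structural equation

def ols(x, y):
    """Intercept and slope from a regression of y on a constant and x."""
    X = np.column_stack([np.ones_like(x), x])
    coef, *_ = np.linalg.lstsq(X, y, rcond=None)
    return coef

p0, p1 = ols(z, d)                         # step 1: first stage
d_hat = p0 + p1 * z
b0, b1_2sls = ols(d_hat, y)                # step 2: regress Y on fitted D

itt = y[z == 1].mean() - y[z == 0].mean()
first_stage = d[z == 1].mean() - d[z == 0].mean()
wald = itt / first_stage

late_compliers = beta1[kind == "complier"].mean()
ate = beta1.mean()
print(f"2SLS={b1_2sls:.3f}  Wald={wald:.3f}  complier mean={late_compliers:.3f}  ATE={ate:.3f}")
```

With a single binary instrument the 2SLS slope equals the ratio of the two differences in means, and both are close to the average effect among compliers and not to the ATE. Standard errors from the second-step regression would not be valid, so a dedicated IV routine is preferable in practice.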

SLIDE 17

Types of Participants

Useful to describe structure implied by $D_i$ and $Z_i$
Four types of program participants
◮ Compliers: $D_i = Z_i$ ($\pi_{0i} = 0$, $\pi_{1i} = 1$)
◮ Always-takers: $D_i = 1$ for any $Z_i$ ($\pi_{0i} = 1$, $\pi_{1i} = 0$)
◮ Never-takers: $D_i = 0$ for any $Z_i$ ($\pi_{0i} = \pi_{1i} = 0$)
◮ Defiers: $D_i = 1 - Z_i$ ($\pi_{0i} = 1$, $\pi_{1i} = -1$)
Compliers are the ideal candidates and ultimately what we can measure
Defiers invalidate measurement using the instrument
LATE is determined only by compliers and defiers

SLIDE 18

No Defiers

Assumption (No Defiers)

There are no defiers, so that $\pi_{1i} \geq 0$.

With this additional assumption

$$\mathrm{LATE} = E[\beta_{1i} \mid \pi_{1i} = 1]$$

so that LATE only measures the treatment response of the compliers.

SLIDE 19

Intention-to-Treat (ITT)

Common to report the effect on the intention-to-treat

$$\mathrm{ITT} = E[Y_i \mid Z = 1] - E[Y_i \mid Z = 0] = E[\beta_{1i} \pi_{1i}] = \mathrm{LATE} \times E[\pi_{1i}]$$

Difference in outcomes conditional on only the instrument, which measures the intention to treat
With perfect compliance ITT = LATE

SLIDE 20

Example

The Oregon Health Insurance Experiment

2008 Medicaid expansion in the US state of Oregon
Used a lottery to choose participants from a waiting list
Constructed a control group from non-winners
Not everyone selected participated in the program (imperfect compliance)
Non-selected prohibited from participation

Finkelstein, A., Taubman, S., Wright, B., Bernstein, M., Gruber, J., Newhouse, J.P., Allen, H., Baicker, K. and Oregon Health Study Group (2012). The Oregon Health Insurance Experiment: Evidence from the first year. The Quarterly Journal of Economics, 127(3), 1057-1106.

SLIDE 21

Estimation

The Oregon Health Insurance Experiment

Estimating Intent-to-Treat (ITT)

$$Y_{ihj} = \beta_0 + \beta_1 \mathrm{LOTTERY}_h + X_{ih} \beta_2 + V_{ih} \beta_3 + \varepsilon_{ihj}$$

◮ $i$: individual, $h$: household, $j$: domain of variable
◮ $\mathrm{LOTTERY}_h$ indicates household was selected by the lottery ($Z_i = 1$)
◮ $X_{ih}$ are required controls and $V_{ih}$ are optional controls

Estimating LATE

$$Y_{ihj} = \pi_0 + \pi_1 \mathrm{INSURANCE}_{ih} + X_{ih} \pi_2 + V_{ih} \pi_3 + \nu_{ihj}$$

◮ $\mathrm{INSURANCE}_{ih}$ is a measure of insurance coverage ($D_i$)
◮ Use 2SLS

$$\mathrm{INSURANCE}_{ih} = \delta_0 + \delta_1 \mathrm{LOTTERY}_h + X_{ih} \delta_2 + V_{ih} \delta_3 + \xi_{ihj}$$

◮ Insurance is “Ever on Medicaid”

SLIDE 22

Results

The Oregon Health Insurance Experiment

TABLE VIII: FINANCIAL STRAIN (SURVEY DATA)

                                                        Control mean   ITT        LATE      p-values
                                                        (1)            (2)        (3)       (4)
Any out of pocket medical expenses, last six months     0.555          0.058      0.200     [<0.0001]
                                                        (0.497)        (0.0077)   (0.026)   {<0.0001}
Owe money for medical expenses currently                0.597          0.052      0.180     [<0.0001]
                                                        (0.491)        (0.0076)   (0.026)   {<0.0001}
Borrowed money or skipped other bills to pay            0.364          0.045      0.154     [<0.0001]
medical bills, last six months                          (0.481)        (0.0073)   (0.025)   {<0.0001}
Refused treatment because of medical debt,              0.081          0.011      0.036     [0.01]
last six months                                         (0.273)        (0.0041)   (0.014)   {0.01}
Standardized treatment effect                                          0.089      0.305     [<0.0001]
                                                                       (0.010)    (0.035)
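Because ITT = LATE × E[π1i], dividing each ITT estimate by the matching LATE estimate gives a rough implied first-stage effect of winning the lottery on coverage. The sketch below does this arithmetic using the magnitudes as printed in the table above; the row labels are shortened, and the result is only approximate since the reported estimates include controls and are rounded.

```python
# (ITT, LATE) pairs copied from the table, columns (2) and (3).
rows = {
    "Any out of pocket medical expenses": (0.058, 0.200),
    "Owe money for medical expenses": (0.052, 0.180),
    "Borrowed money or skipped bills": (0.045, 0.154),
    "Refused treatment because of debt": (0.011, 0.036),
    "Standardized treatment effect": (0.089, 0.305),
}

# Implied first stage E[pi_1i] = ITT / LATE for each outcome.
implied_first_stage = {name: itt / late for name, (itt, late) in rows.items()}

for name, value in implied_first_stage.items():
    print(f"{name:<40s} {value:.3f}")
```

Every row implies roughly the same ratio, a little under 0.3, which is what one would expect when the same first-stage regression underlies all outcomes.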

SLIDE 23

Difference-in-Difference

Repeated cross-sections across multiple periods
Examine the difference between the treated group and the control group
Control chosen to be similar except for treatment
Basic model

$$Y_{it} = \alpha_i + \beta_t + \varepsilon_{it}$$

Assume two periods, treatment only in second
Treated individuals or firms have

$$Y_{it} = \alpha_i + \beta_t + \delta D_{it} + \varepsilon_{it}$$

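SLIDE 24

Difference-in-Difference

Two groups, A (treated) and B (untreated)
Construct averages

$$E[\bar{Y}_{At}] = \alpha_A + \beta_t + \delta I_{[t=2]}, \quad t = 1, 2$$

$$E[\bar{Y}_{Bt}] = \alpha_B + \beta_t$$

Difference across time within a group removes own (group) effects

$$E[\bar{Y}_{A2} - \bar{Y}_{A1}] = \beta_2 - \beta_1 + \delta$$

$$E[\bar{Y}_{B2} - \bar{Y}_{B1}] = \beta_2 - \beta_1$$

Difference across groups within a period removes time effects

$$E[\bar{Y}_{A2} - \bar{Y}_{B2}] = \alpha_A - \alpha_B + \delta$$

$$E[\bar{Y}_{A1} - \bar{Y}_{B1}] = \alpha_A - \alpha_B$$

Solution is to difference twice

$$E\left[ (\bar{Y}_{A2} - \bar{Y}_{A1}) - (\bar{Y}_{B2} - \bar{Y}_{B1}) \right] = \delta$$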
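The double difference can be illustrated with simulated group averages. The group effects, time effects, treatment effect and sample sizes below are hypothetical values chosen for the example.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 20_000                      # observations per group and period
alpha = {"A": 2.0, "B": 0.5}    # group effects (hypothetical)
beta = {1: 0.0, 2: 1.0}         # time effects (hypothetical)
delta = 0.7                     # treatment effect, group A in period 2 only

def draw(group, period):
    """Simulate outcomes for one group-period cell."""
    treated = group == "A" and period == 2
    return alpha[group] + beta[period] + delta * treated + rng.normal(0.0, 1.0, n)

ybar = {(g, t): draw(g, t).mean() for g in "AB" for t in (1, 2)}

change_a = ybar["A", 2] - ybar["A", 1]   # approx. beta2 - beta1 + delta
change_b = ybar["B", 2] - ybar["B", 1]   # approx. beta2 - beta1
gap_post = ybar["A", 2] - ybar["B", 2]   # approx. alphaA - alphaB + delta
did = change_a - change_b                # approx. delta

print(f"change A={change_a:.3f}  change B={change_b:.3f}  post gap={gap_post:.3f}  DiD={did:.3f}")
```

Neither single difference recovers the effect: the change within A mixes in the common time effect and the post-period gap mixes in the group effects. Only the double difference is close to 0.7.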

SLIDE 25

Difference-in-Difference

Key assumption

Assumption (Counterfactual (Parallel Trends))

$$E[Y_{i2} - Y_{i1} \mid D = 1] = E[Y_{i2} - Y_{i1} \mid D = 0]$$

The expected change between periods does not depend on whether the unit is treated ($D = 1$) or untreated ($D = 0$).

Uses lagged average of A to remove group-specific effects
Generates counterfactual using group B

SLIDE 26

Difference-in-Difference

Counterfactual can be interpreted as a parallel line

[Figure: group means at t = 1 and t = 2, showing E[Y_A1], E[Y_A2 | D = 0], E[Y_A2 | D = 1], E[Y_B1] and E[Y_B2 | D = 0], with the gap labelled "Effect".]

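SLIDE 27

Difference-in-Difference

Estimated as a regression

$$Y_{it} = \alpha_i + \beta_t + \delta I_{[i \in A, t = 2]} + \varepsilon_{it}$$

◮ Extends trivially to include other controls

$$Y_{it} = \alpha_i + \beta_t + \delta I_{[i \in A, t = 2]} + X_{it} \gamma + \varepsilon_{it}$$

Uses dummy variables

$$Y_{it} = \alpha_A I_{[i \in A]} + \alpha_B I_{[i \in B]} + \beta_1 I_{[t=1]} + \beta_2 I_{[t=2]} + \delta I_{[i \in A, t = 2]} + \varepsilon_{it}$$

◮ Extends to multiple groups and time periods

$$Y_{it} = \sum_{g=1}^{G} \alpha_g I_{[i \in G_g]} + \sum_{j=1}^{T} \beta_j I_{[t=j]} + \delta I_{[t \geq \tau_i]} + X_{it} \gamma + \varepsilon_{it}$$

where $\tau_i$ is the treatment time for individual $i$

Main issue: Violation of parallel trends assumption

$$\beta_{At} \neq \beta_{Bt}$$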
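As a sketch of the regression form, the coefficient on the interaction dummy in a two-group, two-period regression reproduces the double difference of cell means. The simulated coefficients are hypothetical. The example also checks the rank of the design that includes all four group and period dummies: the group dummies and the period dummies each sum to one, so one of them has to be dropped (or an equivalent normalization imposed) before estimation.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4_000

in_a = rng.integers(0, 2, n).astype(float)   # 1 if unit is in group A
post = rng.integers(0, 2, n).astype(float)   # 1 if observation is from t = 2
treat = in_a * post                          # I[i in A, t = 2]
y = 0.5 + 1.5 * in_a + 1.0 * post + 0.7 * treat + rng.normal(0.0, 1.0, n)

# Regression with intercept, group dummy, period dummy and interaction.
X = np.column_stack([np.ones(n), in_a, post, treat])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
delta_hat = coef[3]

def cell_mean(a, p):
    """Average outcome in one group-period cell."""
    return y[(in_a == a) & (post == p)].mean()

did = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

# All four dummies plus the interaction: 5 columns but only rank 4.
X_full = np.column_stack([in_a, 1 - in_a, 1 - post, post, treat])
rank_full = np.linalg.matrix_rank(X_full)

print(f"delta_hat={delta_hat:.4f}  DiD of means={did:.4f}  rank of full dummy design={rank_full}")
```

The equality between `delta_hat` and `did` is exact here because the regression is saturated in group and period; once controls are added the two generally differ.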

SLIDE 28

Non-parallel Trends

[Figure: treated and untreated group paths over t = 1, ..., 5, marking the intervention, the false counterfactual and the resulting false effect.]

SLIDE 29

Example

The Distortive Effects of Too Big To Fail: Evidence from the Danish Market for Retail Deposits

Examine how uninsured deposits are affected by implicit Too Big To Fail (TBTF) guarantees
Variation introduced by changes in the limits of the amounts insured
Main model

$$\ln(\mathrm{Deposits})_{btk} = \alpha + \beta_1 \times \mathrm{Above}_k + \beta_2 \times \mathrm{After}_t + \beta_3 \times \mathrm{Above}_k \times \mathrm{After}_t + X_i \gamma + \varepsilon_{btk}$$

◮ Bank $b$, Year $t$
◮ $k$ indicates the range of deposits in DKK 50,000 bins (e.g., 0–50K, 50K–100K, ...)
◮ Interest in the difference between accounts that remain insured and those that become uninsured
Key variable is $\mathrm{Above}_k \times \mathrm{After}_t$, which would have a 0 coefficient if no effect

Iyer, R., Lærkholm Jensen, T., Johannesen, N., & Sheridan, A. (2019). The Distortive Effects of Too Big To Fail: Evidence from the Danish Market for Retail Deposits. The Review of Financial Studies, 32(12), 4653-4695.

SLIDE 30

Results

The Distortive Effects of Too Big To Fail: Evidence from the Danish Market for Retail Deposits

[Figure: change in deposits relative to 2009 (log-points) by year, 2006:12 to 2011:12, for accounts below 750k and above 750k.]

SLIDE 31

Results

The Distortive Effects of Too Big To Fail: Evidence from the Danish Market for Retail Deposits

Table 2: Baseline results. Dependent variable: Deposits (in logs)

                             (1)         (2)         (3)         (4)
After reform × Above limit   −0.363***   −0.373***   −0.378***   −0.366***
                             (0.0462)    (0.0430)    (0.0501)    (0.0500)
After reform                 0.292***    0.304***    0.305***
                             (0.0403)    (0.0394)    (0.0456)
Above limit                  −0.578***   −0.593***
                             (0.0214)    (0.0187)
Equity / debt (in 2007)      1.941
                             (1.452)
Loans / assets (in 2007)     1.302
                             (0.834)
Log of assets (in 2007)      0.802***
                             (0.0398)
Constant                     1.154
                             (0.773)
Observations                 3,507       3,507       3,507       3,507
R-squared                    .869        .951        .970        .990
Bank FEs                     No          Yes         Yes         Yes
Bank-range FEs               No          No          Yes         Yes
Bank-time FEs                No          No          No          Yes

SLIDE 32

Results

The Distortive Effects of Too Big To Fail: Evidence from the Danish Market for Retail Deposits

[Figure: change in deposits relative to 2009 (log-points) by year, 2006:12 to 2011:12, for four series: systemic below 750k, systemic above 750k, non-systemic below 750k and non-systemic above 750k.]

SLIDE 33

Regression Discontinuity

Regression discontinuity exploits discrete jumps in treatment as a function of an observable $X_i$
Individuals with $X_i < X$ are untreated
Individuals with $X_i \geq X$ are treated
Theoretical framework examines the difference local to the threshold $X$

$$E[Y_i \mid 0 < X_i - X < h] - E[Y_i \mid 0 < X - X_i < h]$$

$h$ is a bandwidth parameter that shrinks to 0 as the sample size increases
Intuition is that individuals are homogeneous local to $X$

SLIDE 34

RDD Practice

In practice often included as part of a model

$$Y_i = \beta_0 + \beta_1 X_i + \delta I_{[X_i \geq X]} + \varepsilon_i$$

Can use more sophisticated models
Causality from model requires treatment indicator $I_{[X_i \geq X]}$ to be uncorrelated with omitted variables
Also requires functional form to be correct so that $E[\varepsilon_i \mid X] = 0$
◮ Rules out neglected nonlinearities
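The two ways of measuring the discontinuity, the regression with a threshold dummy and the local difference in means, can be compared in a simulation. The threshold, slope, jump size and bandwidth below are hypothetical, and the true conditional mean is linear by construction so the model-based estimate is correctly specified.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 50_000
threshold = 0.0   # hypothetical cutoff
delta = 1.0       # hypothetical jump at the cutoff

x = rng.uniform(-3.0, 3.0, n)
treated = (x >= threshold).astype(float)
y = 0.5 + 0.8 * x + delta * treated + rng.normal(0.0, 1.0, n)

# Model-based estimate: regress Y on a constant, X and the threshold dummy.
X = np.column_stack([np.ones(n), x, treated])
coef, *_ = np.linalg.lstsq(X, y, rcond=None)
delta_ols = coef[2]

# Local estimate: difference in means within a bandwidth h of the cutoff.
h = 0.1
above = (x - threshold > 0) & (x - threshold < h)
below = (threshold - x > 0) & (threshold - x < h)
delta_local = y[above].mean() - y[below].mean()

print(f"regression estimate={delta_ols:.3f}  local difference={delta_local:.3f}")
```

The local estimate uses far fewer observations, so it is noisier, and for any fixed `h` it also picks up a small contribution from the slope in `x`. That contribution shrinks as the bandwidth goes to zero, which is why `h` is tied to the sample size.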

SLIDE 35

Regression Discontinuity

[Figure, two panels. "Linear Model with a Discontinuity": treatment threshold, fitted regression and estimated effect (discontinuity). "Neglected Nonlinearity": E[Y | X], fitted regression and estimated effect.]

SLIDE 36

Local Estimation

[Figure, three panels for N = 600, N = 2400 and N = 9600, each showing E[Y | X], the fitted regression and the estimated effect over a window that narrows as N grows.]

SLIDE 37

Example

Does Corporate Social Responsibility Lead to Superior Financial Performance?

Investigate the effect of CSR proposals that just pass or just fail board votes
Key assumption: the difference between just passing and failing is as if random
◮ Close failures and close passes are identical aside from the vote
Regress abnormal returns on a dummy for passing
◮ Returns computed using Carhart 4 factor model (FF3 + Momentum)
RDD estimate is based on a small window near a tied vote
Also considers a full model which is piece-wise polynomial

$$y_{it} = \beta \times D_{it} + P_l(v_{it}, \gamma_l) + P_r(v_{it}, \gamma_r)$$

◮ $P_\bullet(v_{it}, \gamma_\bullet)$ is a polynomial in the vote share for left and right of passing

Flammer, C. (2015). Does corporate social responsibility lead to superior financial performance? A regression discontinuity approach. Management Science, 61(11), 2549-2568.

SLIDE 38

Results

Does Corporate Social Responsibility Lead to Superior Financial Performance? A RD Approach


SLIDE 39

Results

Does Corporate Social Responsibility Lead to Superior Financial Performance? A RD Approach

[Figure: abnormal return on the day of the vote (vertical axis, −0.010 to 0.020) against victory margin in 2% bins (horizontal axis, −50 to 50).]

The vertical axis indicates abnormal returns on the day of the vote. Abnormal returns are computed using the four-factor model of Carhart (1997).

SLIDE 40

Panel Data and Fixed Effects

Panel data is double indexed: $Y_{it}$
◮ $i$ is the entity (or unit): Traders, Firms, Borrowers, ...
◮ $t$ is the time period
Panels track entities over time
$N$ entities, $T$ time periods
◮ $N$ is assumed to be large, $T$ is usually small
◮ Asymptotics assume $N \to \infty$

SLIDE 41

Panel Data and Fixed Effects

Panel data can be used to estimate pooled OLS models

$$Y_{it} = X_{it} \beta + \varepsilon_{it}$$

◮ Ignores the panel structure
Panel structure allows us to model unobserved heterogeneity

$$Y_{it} = X_{it} \beta + W_i \gamma + \varepsilon_{it}$$

$W_i$ is a vector of entity-specific characteristics
Key: Assumed to be time invariant
Estimating pooled OLS results in biased coefficients if $W_i$ is correlated with $X_{it}$
In large samples,

$$\hat{\beta} \xrightarrow{p} \beta + \Lambda \gamma, \qquad W_i = X_{it} \Lambda + \eta_{it}$$

◮ This is omitted variable bias
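The omitted variable bias formula can be checked numerically. The sketch assumes a scalar regressor and a scalar entity characteristic, with hypothetical values for β, γ and the dependence of X on W; `slope` is a helper name introduced here for a regression through the origin.

```python
import numpy as np

rng = np.random.default_rng(6)
N, T = 2_000, 5
beta, gamma = 1.0, 2.0

w = rng.normal(0.0, 1.0, (N, 1))                 # time-invariant, unobserved W_i
x = 0.8 * w + rng.normal(0.0, 1.0, (N, T))       # X_it is correlated with W_i
y = beta * x + gamma * w + rng.normal(0.0, 1.0, (N, T))

# Stack the panel entity by entity.
xf, yf = x.ravel(), y.ravel()
wf = np.repeat(w.ravel(), T)

def slope(a, b):
    """Slope from regressing b on a without an intercept."""
    return (a @ b) / (a @ a)

b_pooled = slope(xf, yf)     # pooled OLS that omits W
lam = slope(xf, wf)          # Lambda: regression of W on X
predicted = beta + lam * gamma

print(f"pooled OLS={b_pooled:.3f}  beta + Lambda*gamma={predicted:.3f}  true beta={beta}")
```

Pooled OLS lands near β + Λγ, well away from the true β, and the gap between the two printed values is only the sampling noise from the error term.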

SLIDE 42

Pooled OLS

[Figure: scatter of observations for Entity 1, Entity 2 and Entity 3 with the pooled OLS fit.]

SLIDE 43

Panel Data and Fixed Effects

Could collect data on $W_i$ if available, and include in the model
In many plausible scenarios it is not observable
◮ Individual ability or intrinsic motivation
◮ Firm management culture
Fixed Effect estimator allows $\beta$ to be estimated when $W_i$ is not known
Note that $W_i \gamma$ is a constant for entity $i$

$$Y_{it} = X_{it} \beta + \omega_i + \varepsilon_{it}$$

Demean entity-by-entity

$$Y_{it} - \bar{Y}_i = (X_{it} - \bar{X}_i) \beta + (\omega_i - \bar{\omega}_i) + (\varepsilon_{it} - \bar{\varepsilon}_i)$$

$\omega_i$ is time-invariant so $\omega_i - \bar{\omega}_i = 0$

$$Y_{it} - \bar{Y}_i = (X_{it} - \bar{X}_i) \beta + (\varepsilon_{it} - \bar{\varepsilon}_i)$$

Note that FE models cannot estimate time-invariant effects

SLIDE 44

Panel Data and Fixed Effects

Model is equivalent to including entity dummies

$$Y_{it} = X_{it} \beta + D_i \lambda_i + \varepsilon_{it}$$

When $T = 2$ identical to first-difference estimator

$$Y_{i2} - Y_{i1} = (X_{i2} - X_{i1}) \beta + (\varepsilon_{i2} - \varepsilon_{i1})$$

Inefficient to use first difference estimator for $T \geq 3$
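The equivalences stated above can be verified on a small simulated panel with T = 2: the within (demeaned) estimator, the regression with entity dummies and the first-difference estimator all give the same slope. The panel dimensions and coefficients are hypothetical, and the regressor is a scalar to keep the algebra visible.

```python
import numpy as np

rng = np.random.default_rng(8)
N, T, beta = 500, 2, 1.5

w = rng.normal(0.0, 1.0, (N, 1))                     # unobserved entity characteristic
x = w + rng.normal(0.0, 1.0, (N, T))                 # regressor correlated with it
y = beta * x + 2.0 * w + rng.normal(0.0, 1.0, (N, T))  # fixed effect omega_i = 2 * w_i

# Pooled OLS (through the origin) ignores the fixed effect and is biased.
b_pooled = (x * y).sum() / (x ** 2).sum()

# Within estimator: demean entity by entity, then OLS.
x_t = x - x.mean(axis=1, keepdims=True)
y_t = y - y.mean(axis=1, keepdims=True)
b_within = (x_t * y_t).sum() / (x_t ** 2).sum()

# Dummy-variable form: regress on x plus one dummy per entity.
dummies = np.repeat(np.eye(N), T, axis=0)
Z = np.column_stack([x.ravel(), dummies])
coef, *_ = np.linalg.lstsq(Z, y.ravel(), rcond=None)
b_lsdv = coef[0]

# First differences (no intercept), valid comparison because T = 2.
dx, dy = x[:, 1] - x[:, 0], y[:, 1] - y[:, 0]
b_fd = (dx @ dy) / (dx @ dx)

print(f"pooled={b_pooled:.3f}  within={b_within:.3f}  LSDV={b_lsdv:.3f}  FD={b_fd:.3f}")
```

Demeaning is usually preferred in practice since the dummy design has one column per entity, which becomes costly when N is large.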

SLIDE 45

Fixed Effects Regression

[Figure: scatter of demeaned observations for Entity 1, Entity 2 and Entity 3 with the fixed-effect regression fit.]

SLIDE 46

Panel Data and Fixed Effects

Estimates of $\hat{\omega}_i$ are not consistent when $T$ is finite
Estimated using OLS on entity-wise demeaned data

$$\tilde{Y}_{it} = \tilde{X}_{it} \beta + \tilde{\varepsilon}_{it}$$

◮ Known as the Least Squares Dummy Variable (LSDV) estimator
◮ Intercept is not meaningful in FE models when reported

SLIDE 47

Inference in Fixed Effects Models

Robust inference requires a clustered variance covariance estimator
◮ White Covariance

$$\Sigma_{XX}^{-1} S \Sigma_{XX}^{-1}$$

$$S = E\left[ \tilde{\varepsilon}_{it}^2 \tilde{X}_{it}' \tilde{X}_{it} \right], \qquad \hat{S} = (NT)^{-1} \sum_{i=1}^{N} \sum_{t=1}^{T} \hat{\tilde{\varepsilon}}_{it}^2 \tilde{x}_{it}' \tilde{x}_{it}$$

◮ Clustered (Rogers) covariance

$$S_C = E\left[ \xi_i' \xi_i \right], \qquad \hat{S}_C = N^{-1} \sum_{i=1}^{N} \hat{\xi}_i' \hat{\xi}_i, \qquad \xi_i = \sum_{t=1}^{T} \tilde{\varepsilon}_{it} \tilde{X}_{it}$$

Replace $S$ with an estimator that allows dependence within entity
$\xi_i' \xi_i$ contains all squares and cross-products
Imposes no restrictions on the dependence within an entity
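The difference between the two covariance estimators can be seen in a sketch with a single regressor, where the sandwich collapses to scalars. The simulated panel uses hypothetical AR(1) dynamics within each entity for both the regressor and the error, and the formulas are applied without any finite-sample corrections, so the numbers will not match packaged routines exactly.

```python
import numpy as np

rng = np.random.default_rng(9)
N, T, beta = 300, 10, 1.0

def ar1(rho, n_entities, n_periods):
    """Independent AR(1) paths, one row per entity."""
    out = np.zeros((n_entities, n_periods))
    out[:, 0] = rng.normal(size=n_entities)
    for t in range(1, n_periods):
        out[:, t] = rho * out[:, t - 1] + rng.normal(size=n_entities)
    return out

x = ar1(0.8, N, T)
eps = ar1(0.8, N, T)                         # errors are dependent within entity
y = beta * x + rng.normal(size=(N, 1)) + eps

# Fixed effects estimate from entity-demeaned data.
x_t = x - x.mean(axis=1, keepdims=True)
y_t = y - y.mean(axis=1, keepdims=True)
sxx = (x_t ** 2).sum()
b_fe = (x_t * y_t).sum() / sxx
resid = y_t - b_fe * x_t

# White: sum of squared scores, treating every observation as independent.
meat_white = ((x_t * resid) ** 2).sum()
# Clustered: sum scores within entity first, then square (keeps cross-products).
xi = (x_t * resid).sum(axis=1)
meat_cluster = (xi ** 2).sum()

se_white = np.sqrt(meat_white) / sxx
se_cluster = np.sqrt(meat_cluster) / sxx
print(f"b_fe={b_fe:.3f}  White SE={se_white:.4f}  clustered SE={se_cluster:.4f}")
```

With persistent regressors and errors the within-entity cross-products are positive on average, so the White standard error understates the uncertainty relative to the clustered one.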

SLIDE 48

Time Effects

Panel models often include time effects

$$Y_{it} = X_{it} \beta + \omega_i + \gamma_t + \varepsilon_{it}$$

$\gamma_t$ is a shock that affects all observations in period $t$
◮ Commonly used to model common aggregate movements
Estimated regression uses data demeaned by entity and time period
When $N$ is large and $T$ is small, time effects are consistently estimated
◮ Does not need special treatment for inference
◮ Identical to including dummies for each time period
In general, fixed effects can be used to remove constant effects in any dimension with repeated observations
◮ Industry
◮ City, County, State, or Regional
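For a balanced panel, demeaning by entity and by time period gives the same slope as including both sets of dummies. The sketch below checks this on simulated data; the panel size and coefficients are hypothetical, and the equivalence with this simple demeaning formula should not be assumed for unbalanced panels.

```python
import numpy as np

rng = np.random.default_rng(7)
N, T, beta = 100, 6, 1.0

omega = rng.normal(0.0, 1.0, (N, 1))      # entity effects
gamma_t = rng.normal(0.0, 1.0, (1, T))    # time effects
x = omega + gamma_t + rng.normal(0.0, 1.0, (N, T))
y = beta * x + omega + gamma_t + rng.normal(0.0, 1.0, (N, T))

def two_way_demean(a):
    """Remove entity means and period means, adding back the grand mean."""
    return a - a.mean(axis=1, keepdims=True) - a.mean(axis=0, keepdims=True) + a.mean()

x_t, y_t = two_way_demean(x), two_way_demean(y)
b_demeaned = (x_t * y_t).sum() / (x_t ** 2).sum()

# Dummy-variable version: all entity dummies plus T - 1 period dummies.
entity_d = np.repeat(np.eye(N), T, axis=0)
time_d = np.tile(np.eye(T), (N, 1))[:, 1:]
Z = np.column_stack([x.ravel(), entity_d, time_d])
coef, *_ = np.linalg.lstsq(Z, y.ravel(), rcond=None)
b_dummies = coef[0]

print(f"two-way demeaned={b_demeaned:.4f}  dummies={b_dummies:.4f}")
```

One period dummy is dropped because the entity dummies already span the constant; keeping all of them would make the design rank deficient.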

SLIDE 49

Example

Real effects of workers’ financial distress: Evidence from teacher spillovers

Examine how test pass rate is affected by teacher financial distress

$$y_{cgt} = \beta \times \mathrm{Bankruptcy}_{cgt} + X_{cgt} \gamma + Z_{cgt} \lambda + \underbrace{\delta_{dt} + \eta_{gt} + \phi_{cg}}_{\text{Fixed Effects}} + \varepsilon_{cgt}$$

◮ Subscripts: $c$: campus, $g$: grade of student, $t$: year
◮ $\mathrm{Bankruptcy}_{cgt}$ is teacher bankruptcy
◮ $X_{cgt}$ are average teacher characteristics
◮ $Z_{cgt}$ are student demographic characteristics

Maturana, G., & Nickerson, J. (2019). Real effects of workers’ financial distress: Evidence from teacher spillovers. Journal of Financial Economics.

SLIDE 50

Example

Real effects of workers’ financial distress: Evidence from teacher spillovers

$$y_{cgt} = \beta \times \mathrm{Bankruptcy}_{cgt} + X_{cgt} \gamma + Z_{cgt} \lambda + \underbrace{\delta_{dt} + \eta_{gt} + \phi_{cg}}_{\text{Fixed Effects}} + \varepsilon_{cgt}$$

Fixed Effects
◮ $\delta_{dt}$ - District-Year: Control for local economic conditions
◮ $\eta_{gt}$ - Grade-Year: Control for changes to the test across grade and time
◮ $\phi_{cg}$ - Campus-Grade: Control for heterogeneity across different campuses and grades
Estimates are computed as deviations from all three FE
Standard errors are clustered by campus-grade and campus-year
◮ CG allows arbitrary correlation within all students in a single grade and campus in any year
◮ CY allows arbitrary correlation within all students in a single campus and year across all grades