Heterogeneity, Endogeneity and Causal Effect Estimation



  1. Heterogeneity, Endogeneity and Causal Effect Estimation
     Kevin Sheppard
     https://www.kevinsheppard.com
     Oxford MFE
     This version: March 11, 2020

  2. Causal Effect Estimation
     - Potential Outcomes
     - Challenges in Effect Estimation
     - Experimental and Quasi-Experimental Data
       - Randomized Controlled Experiments and ATE
       - Imperfect Compliance and LATE
     - Observational Data
       - Regression Discontinuity
       - Difference-in-Difference
       - Panel Models

  3. Potential Outcomes Framework
     - Observed outcome for individual or firm $i$: $Y_i$
     - $D_i$ is the treatment status variable for individual $i$: $D_i = 0$ if untreated, $D_i = 1$ if treated
     - Outcome variable is determined by $Y_i = \beta_{0i} + \beta_{1i} D_i$
     - $\beta_{1i}$ is a heterogeneous treatment effect for individual $i$
     - Also known as the potential outcomes model
     - Two outcomes: $Y_i(0) = \beta_{0i}$ and $Y_i(1) = \beta_{0i} + \beta_{1i}$

  4. Key Measures
     - Definition (Average Treatment Effect, ATE): The Average Treatment Effect measures the average effect of treatment across the entire population
       $$\mathrm{ATE} = E[\beta_{1i}] = E[Y_i(1)] - E[Y_i(0)]$$
     - Definition (Average Treatment Effect on the Treated, TOT): The Average Treatment Effect on the Treated measures the effect of treatment on the treated
       $$\mathrm{TOT} = E[\beta_{1i} \mid D = 1] = E[Y_i(1) \mid D = 1] - E[Y_i(0) \mid D = 1]$$

  5. ATE and TOT
     - ATE is a weighted average: $\mathrm{ATE} = \omega\,\mathrm{TOT} + (1 - \omega)\,\mathrm{TUT}$
     - Average Treatment Effect on the Untreated (TUT): $\mathrm{TUT} = E[\beta_{1i} \mid D = 0] = E[Y_i(1) \mid D = 0] - E[Y_i(0) \mid D = 0]$
     - $\omega = \Pr[D = 1]$ is the probability of being treated
     - Should we measure ATE or TOT?
       - TOT makes sense when treatment is non-compulsory
         - Individuals who do not undertake treatment are not relevant for the cost-benefit calculation
       - ATE is more sensible for mandatory programs
         - Measures the effect on both those who would like to participate and those who would not
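The weighted-average identity can be checked numerically. The following is a minimal sketch on a simulated population; the distribution of $\beta_{1i}$ and the selection rule (units with larger effects are more likely to take treatment) are assumptions made purely for illustration and do not come from the slides.

```python
import random
from statistics import mean

rng = random.Random(0)
n = 10_000

# Hypothetical population: heterogeneous effects beta1, and units with
# larger effects are more likely to select into treatment (an assumption).
beta1 = [rng.gauss(1.0, 2.0) for _ in range(n)]
d = [1 if b + rng.gauss(0.0, 1.0) > 1.0 else 0 for b in beta1]

ate = mean(beta1)                                      # E[beta1]
tot = mean(b for b, di in zip(beta1, d) if di == 1)    # E[beta1 | D = 1]
tut = mean(b for b, di in zip(beta1, d) if di == 0)    # E[beta1 | D = 0]
omega = sum(d) / n                                     # Pr[D = 1]

weighted = omega * tot + (1 - omega) * tut
print(f"ATE={ate:.3f} TOT={tot:.3f} TUT={tut:.3f} omega={omega:.3f}")
print(f"omega*TOT + (1-omega)*TUT = {weighted:.3f}")
```

Because selection here favors units with large effects, TOT exceeds TUT, while the weighted combination reproduces ATE up to floating-point error.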

  6. Naïve estimation
     - Estimate the regression on observed data: $Y_i = b_0 + b_1 D_i + \varepsilon_i$
       - $\hat{b}_0 \xrightarrow{p} E[Y_i \mid D = 0]$
       - $\hat{b}_1 \xrightarrow{p} E[Y_i \mid D = 1] - E[Y_i \mid D = 0]$
     - Leads to selection bias
       $$\underbrace{E[Y_i \mid D = 1] - E[Y_i \mid D = 0]}_{\text{Observed Effect}} = \underbrace{E[Y_i(1) \mid D = 1] - E[Y_i(0) \mid D = 1]}_{\text{Avg. Treatment Effect on the Treated (TOT)}} + \underbrace{E[Y_i(0) \mid D = 1] - E[Y_i(0) \mid D = 0]}_{\text{Selection Bias (SB)}}$$
     - In terms of the regression
       $$\underbrace{E[\hat{b}_1]}_{\text{Observed Effect}} = \underbrace{E[\beta_{1i} \mid D = 1]}_{\text{TOT}} + \underbrace{E[\beta_{0i} \mid D = 1] - E[\beta_{0i} \mid D = 0]}_{\text{Selection Bias (SB)}}$$
     - SB is the difference in the no-treatment outcomes for the treated and untreated
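The decomposition of the naïve estimate into TOT plus selection bias can be illustrated with a small simulation. This is a sketch under assumed distributions: the parameter values and the selection rule (units with a high no-treatment outcome opt in) are hypothetical, chosen so that TOT is negative while SB is positive.

```python
import random
from statistics import mean

rng = random.Random(1)
n = 20_000

beta0 = [rng.gauss(0.0, 1.0) for _ in range(n)]    # no-treatment outcome Y_i(0)
beta1 = [rng.gauss(-0.5, 0.5) for _ in range(n)]   # heterogeneous treatment effect
# Assumed selection rule: units with a high no-treatment outcome opt in.
d = [1 if b0 + rng.gauss(0.0, 1.0) > 0.0 else 0 for b0 in beta0]
y = [b0 + b1 * di for b0, b1, di in zip(beta0, beta1, d)]

def group_mean(values, group):
    """Mean of values over units with D == group."""
    return mean(v for v, di in zip(values, d) if di == group)

# OLS slope of Y on a constant and D: sample cov(D, Y) / var(D).
d_bar, y_bar = sum(d) / n, sum(y) / n
b1_hat = (sum((di - d_bar) * (yi - y_bar) for di, yi in zip(d, y))
          / sum((di - d_bar) ** 2 for di in d))

tot = group_mean(beta1, 1)                         # E[beta1 | D = 1]
sb = group_mean(beta0, 1) - group_mean(beta0, 0)   # selection bias

print(f"b1_hat={b1_hat:.3f}  TOT={tot:.3f}  SB={sb:.3f}  TOT+SB={tot + sb:.3f}")
```

In this setup the naïve slope is positive even though the effect on the treated is negative, because the selection bias term dominates.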

  7. (Missing) Counterfactuals
     - Fundamental problem: Cannot see the counterfactual

       | Treatment ($D_i$) | 0                                  | 1                                  |
       |-------------------|------------------------------------|------------------------------------|
       | Observe           | $Y_i(0) = \beta_{0i}$              | $Y_i(1) = \beta_{0i} + \beta_{1i}$ |
       | Counterfactual    | $Y_i(1) = \beta_{0i} + \beta_{1i}$ | $Y_i(0) = \beta_{0i}$              |

     - No data on $Y_i(1)$ when $D_i = 0$ and $Y_i(0)$ when $D_i = 1$
     - TOT measures the effect conditional on receiving treatment
       - Missing counterfactual: $E[Y_i(0) \mid D = 1]$
     - Observed effect is contaminated with selection bias

  8. Example: Financial Stress and Payday Loans
     - Outcome is a measure of financial distress: 90-days delinquent on a debt
     - Treatment is taking out a payday loan
     - TOT: Difference in delinquency between taking and not taking the loan, among those who want a loan ($D = 1$)
     - SB: Difference in the no-loan outcome between those who want a loan and those who do not
       - Plausible TOT is negative but SB is positive
       - Positive SB if $E[\beta_{0i} \mid D = 1] > E[\beta_{0i} \mid D = 0]$
         - Default rates absent a loan are higher for loan takers than for non-takers
       - Observed effect could have either sign

  9. Randomization
     - Randomization removes selection bias
     - Well-executed Randomized Controlled Trials (RCTs) are the gold standard for causal effect estimation
     - An RCT ensures that $\{\beta_{0i}, \beta_{1i}\} \perp\!\!\!\perp D_i$ and $\{Y_i(0), Y_i(1)\} \perp\!\!\!\perp D_i$
     - Randomly give loans only to those seeking them
       - Creates a group in which $Y_i(0)$ is observed for those who would otherwise have $D = 1$
     - Independence and Conditioning: If $Z$ and $W$ are independent random variables, then $E[Z \mid W = w_1] = E[Z \mid W = w_2] = E[Z]$.
       - Knowledge of $W$ provides no information about $Z$.

  10. Randomization
      - Gains to Randomization: $E[Y_i(0) \mid D = 0] = E[Y_i(0) \mid D = 1]$ and $E[\beta_{0i} \mid D = 1] = E[\beta_{0i} \mid D = 0]$, since treatment is independent of the desire to be treated
      - Track outcomes of both groups
        $$\underbrace{E[Y_i \mid D = 1] - E[Y_i \mid D = 0]}_{\text{Observed Effect with Randomization}} = E[Y_i(1) \mid D = 1] - E[Y_i(0) \mid D = 0] = E[Y_i(1) \mid D = 1] - E[Y_i(0) \mid D = 1]$$
      - In the notation of a regression model
        $$\begin{aligned} E[\hat{b}_1] &= E[\beta_{1i} \mid D = 1] + \left(E[\beta_{0i} \mid D = 1] - E[\beta_{0i} \mid D = 0]\right) \\ &= E[\beta_{1i} \mid D = 1] + \left(E[\beta_{0i} \mid D = 1] - E[\beta_{0i} \mid D = 1]\right) \\ &= E[\beta_{1i} \mid D = 1] \end{aligned}$$
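A small simulation shows the selection-bias term shrinking toward zero once treatment is assigned by coin flip. This is a sketch with assumed parameter values; here the whole simulated population is randomized, so the effect on the treated and the population average effect coincide up to sampling noise.

```python
import random
from statistics import mean

rng = random.Random(4)
n = 50_000

beta0 = [rng.gauss(0.0, 1.0) for _ in range(n)]    # no-treatment outcome Y_i(0)
beta1 = [rng.gauss(-0.5, 0.5) for _ in range(n)]   # heterogeneous treatment effect
d = [rng.randint(0, 1) for _ in range(n)]          # coin flip, independent of (beta0, beta1)
y = [b0 + b1 * di for b0, b1, di in zip(beta0, beta1, d)]

treated = [i for i in range(n) if d[i] == 1]
control = [i for i in range(n) if d[i] == 0]

observed = mean(y[i] for i in treated) - mean(y[i] for i in control)
tot = mean(beta1[i] for i in treated)
sb = mean(beta0[i] for i in treated) - mean(beta0[i] for i in control)
ate = mean(beta1)

print(f"observed={observed:.3f}  TOT={tot:.3f}  SB={sb:.3f}  ATE={ate:.3f}")
```

The identity observed = TOT + SB still holds exactly in the sample; what randomization changes is that SB is now only sampling noise.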

  11. Issues Affecting RCT Validity
      - Internal Validity: are the results valid for the sample used?
        - Is the assignment actually random? Test using $X_{it} = \alpha + \beta D_{it} + \varepsilon_{it}$, $H_0: \beta = 0$, $H_1: \beta \neq 0$
        - Are participants complying?
        - Are there spill-overs of non-rival treatments to the non-treated?
        - Hawthorne Effect: studying a subject changes their behavior
      - External Validity: do the results generalize to a broader sample?
        - Is the RCT sample representative of the target population?
        - Are there other key personnel that are essential for success?
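The randomization check above amounts to regressing a characteristic on treatment status and testing whether the slope is zero. The sketch below assumes a single cross-section (dropping the time index), homoskedastic errors, and simulated data; `balance_t_stat` is a hypothetical helper name, not something from the slides.

```python
import math
import random

def balance_t_stat(x, d):
    """t-statistic for beta in x_i = alpha + beta * d_i + e_i (homoskedastic OLS)."""
    n = len(x)
    d_bar = sum(d) / n
    x_bar = sum(x) / n
    sdd = sum((di - d_bar) ** 2 for di in d)
    beta = sum((di - d_bar) * (xi - x_bar) for di, xi in zip(d, x)) / sdd
    alpha = x_bar - beta * d_bar
    ssr = sum((xi - alpha - beta * di) ** 2 for xi, di in zip(x, d))
    se = math.sqrt(ssr / (n - 2) / sdd)   # standard error of the slope
    return beta / se

rng = random.Random(2)
n = 5_000
x = [rng.gauss(0.0, 1.0) for _ in range(n)]         # simulated characteristic
d_random = [rng.randint(0, 1) for _ in range(n)]    # coin-flip assignment
# Assignment that depends on x, so balance should fail.
d_selected = [1 if xi + rng.gauss(0.0, 1.0) > 0 else 0 for xi in x]

t_random = balance_t_stat(x, d_random)
t_selected = balance_t_stat(x, d_selected)
print(f"t (random assignment)   = {t_random:.2f}")
print(f"t (selected assignment) = {t_selected:.2f}")
```

A large absolute t-statistic, compared with the usual normal critical values, is evidence against random assignment; in practice one would run this for several characteristics.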

  12. LATE: Local Average Treatment Effects
      - Previous result requires perfect compliance
        - Treated if offered, not treated if not offered
      - When treatment is not random, or compliance is not perfect, simple estimators are not consistent
      - Possible to use an instrument to recover a meaningful measure of treatment effect
      - Measure is local in the sense that it measures the effect on a particular subgroup of the treated
      - Notation
        - $D_i$ is treatment status
        - $Z_i$ is treatment assignment (offer to treat)
      - Compliance
        - Perfect if $D_i = Z_i$
        - Imperfect if $D_i \neq Z_i$ for some $i$
      - $Z_i$ may be random even if $D_i$ is not
        - Treatment assignment is made by lottery due to limited capacity ($Z_i$)
        - Treatment status conditional on offer depends on expected benefits ($D_i$)

  13. System of Equations
      - Leads to a two-equation system
        - Structural Equation: $Y_i = \beta_{0i} + \beta_{1i} D_i$
        - Treatment Equation: $D_i = \pi_{0i} + \pi_{1i} Z_i$
      - Causal chain: $Z_i \rightarrow D_i \rightarrow Y_i$
      - Treatment equation measures potential treatment status $D_i(z) = \pi_{0i} + \pi_{1i} z$
        - $D_i(0) = \pi_{0i}$ is status when not assigned
        - $D_i(1) = \pi_{0i} + \pi_{1i}$ is status when assigned
        - Both $D_i(0)$ and $D_i(1)$ may be 0 or 1
      - Treatment responsiveness $\pi_{1i}$ is heterogeneous like treatment effect $\beta_{1i}$
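Since both $D_i(0)$ and $D_i(1)$ are binary, each unit's pair $(\pi_{0i}, \pi_{1i})$ falls into one of four cases. The sketch below enumerates them; the labels (never-taker, complier, always-taker, defier) are the terminology commonly used in the LATE literature rather than names that appear in these slides, and `compliance_type` is a hypothetical helper.

```python
def compliance_type(pi0: int, pi1: int) -> str:
    """Classify a unit by its potential treatment statuses D(0) and D(1)."""
    d0 = pi0            # D_i(0): status when not assigned
    d1 = pi0 + pi1      # D_i(1): status when assigned
    if d0 not in (0, 1) or d1 not in (0, 1):
        raise ValueError("potential treatment status must be 0 or 1")
    labels = {
        (0, 0): "never-taker",    # pi1 = 0, untreated regardless of assignment
        (0, 1): "complier",       # pi1 = 1, treated only when assigned
        (1, 1): "always-taker",   # pi1 = 0, treated regardless of assignment
        (1, 0): "defier",         # pi1 = -1, treated only when not assigned
    }
    return labels[(d0, d1)]

for pi0, pi1 in [(0, 0), (0, 1), (1, 0), (1, -1)]:
    print(pi0, pi1, compliance_type(pi0, pi1))
```

Only units with $\pi_{1i} \neq 0$ change treatment status in response to assignment.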

  14. Independence
      - Assumption (Independence): The potential outcomes and potential treatment statuses are independent of $Z_i$
        $$\{\beta_{0i}, \beta_{1i}, \pi_{0i}, \pi_{1i}\} \perp\!\!\!\perp Z_i$$
      - Often described as "as if randomly assigned"
      - Note that the instrument is independent of the potential treatment status
      - $Z_i$ does not affect the probability that either occurs ($\pi_{\bullet i}$)
      - $Z_i$ does not affect the outcomes if treatment is taken or not ($\beta_{\bullet i}$)
      - Is this a reasonable assumption?
        - Often plausible when $Z_i$ is assigned using randomization (lottery)
        - Sometimes plausible for $Z_i$ taken from observational data

  15. Exclusion
      - Assumption (Exclusion Restriction): The instrument does not appear in the structural equation, so treatment assignment affects the outcome only through treatment status.
      - Violations of the exclusion restriction mean that $Z_i$ affects $Y_i$ through more than just $D_i$
      - Classic example is when $Z_i$ directly affects both $Y_i$ and $D_i$
      - In many cases, $Z_i$ affects $D_i$ and another variable $X_i$ which in turn affects $Y_i$: $Z_i \rightarrow D_i \rightarrow Y_i$, $Z_i \rightarrow X_i \rightarrow Y_i$
      - Suppose selection for a randomly assigned government funding program increases the probability of program participation ($Z_i \rightarrow D_i$)
      - If selection also increases the probability that a firm receives series B funding, then the effect is confounded with fund raising ($Z_i \rightarrow X_i$)
      - Exclusion restriction ensures that $Z$ does not affect the potential outcomes $Y_i(0) = \beta_{0i}$ and $Y_i(1) = \beta_{0i} + \beta_{1i}$ for $Z \in \{0, 1\}$
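A short worked derivation may help show what a violation costs. It assumes, purely for illustration, that $Z_i$ enters the structural equation additively with a constant coefficient $\gamma$ (a symbol not in the slides), that the treatment equation is $D_i = \pi_{0i} + \pi_{1i} Z_i$, and that the independence assumption holds.

```latex
% Assumed (hypothetical) structural equation with a direct effect of Z:
%   Y_i = \beta_{0i} + \beta_{1i} D_i + \gamma Z_i,  D_i = \pi_{0i} + \pi_{1i} Z_i
\begin{aligned}
E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0] &= E[\beta_{1i}\pi_{1i}] + \gamma \\
E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0] &= E[\pi_{1i}] \\
\frac{E[Y_i \mid Z_i = 1] - E[Y_i \mid Z_i = 0]}{E[D_i \mid Z_i = 1] - E[D_i \mid Z_i = 0]}
  &= \frac{E[\beta_{1i}\pi_{1i}]}{E[\pi_{1i}]} + \frac{\gamma}{E[\pi_{1i}]}
\end{aligned}
```

Under this assumed form, the ratio of the effect of assignment on the outcome to its effect on treatment picks up an extra term $\gamma / E[\pi_{1i}]$, which grows as the instrument's effect on treatment gets weaker.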

  16. Instrumental Variable Estimation
      - The 2SLS estimator is obtained by
        1. Regress $D_i = p_0 + p_1 Z_i + \eta_i$ and retain $\hat{D}_i = \hat{p}_0 + \hat{p}_1 Z_i$
        2. Regress $Y_i = b_0 + b_1 \hat{D}_i + \varepsilon_i$
      - In large samples
        $$\hat{b}_1^{2SLS} \xrightarrow{p} \frac{E[\beta_{1i}\pi_{1i}]}{E[\pi_{1i}]} = E\left[\beta_{1i}\,\frac{\pi_{1i}}{E[\pi_{1i}]}\right] = \mathrm{LATE}$$
      - LATE is a weighted average of treatment effects
      - Weights are determined by responsiveness to treatment assignment
        - Also holds if either $D_i$ or $Z_i$ is not binary
      - If effects are not heterogeneous ($\beta_{1i} = \beta_1$ or $\pi_{1i} = \pi_1$) then LATE = ATE
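The two-step procedure and its large-sample limit can be sketched on simulated data. Everything about the population below is an assumption for illustration: three groups of units with shares 0.3, 0.5 and 0.2, no units with negative $\pi_{1i}$, and treatment effects that differ by group. `ols` is a hypothetical helper for a one-regressor regression.

```python
import random

def ols(x, y):
    """Intercept and slope from regressing y on a constant and x."""
    n = len(x)
    x_bar, y_bar = sum(x) / n, sum(y) / n
    slope = (sum((xi - x_bar) * (yi - y_bar) for xi, yi in zip(x, y))
             / sum((xi - x_bar) ** 2 for xi in x))
    return y_bar - slope * x_bar, slope

rng = random.Random(3)
n = 100_000
# Assumed unit types: (pi0, pi1, mean of beta1, mean of beta0).
types = [(0, 0, 0.0, 0.0), (0, 1, 2.0, 0.0), (1, 0, 3.0, 1.0)]
shares = [0.3, 0.5, 0.2]

pi1, beta1, z, d, y = [], [], [], [], []
for _ in range(n):
    p0, p1, mu1, mu0 = rng.choices(types, weights=shares)[0]
    b0 = rng.gauss(mu0, 1.0)
    b1 = rng.gauss(mu1, 1.0)
    zi = rng.randint(0, 1)       # randomly assigned offer
    di = p0 + p1 * zi            # treatment equation
    pi1.append(p1); beta1.append(b1); z.append(zi); d.append(di)
    y.append(b0 + b1 * di)       # structural equation

# Step 1: regress D on Z and retain fitted values.
p0_hat, p1_hat = ols(z, d)
d_hat = [p0_hat + p1_hat * zi for zi in z]
# Step 2: regress Y on the fitted values.
_, b_2sls = ols(d_hat, y)

late = sum(b * p for b, p in zip(beta1, pi1)) / sum(pi1)   # sample E[beta1*pi1] / E[pi1]
ate = sum(beta1) / n
print(f"2SLS={b_2sls:.3f}  LATE={late:.3f}  ATE={ate:.3f}")
```

In this setup the 2SLS slope tracks the responsiveness-weighted average rather than ATE, because the units whose treatment status never responds to the offer receive zero weight.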
