
Heterogeneity, Endogeneity and Causal Effect Estimation
Kevin Sheppard
https://www.kevinsheppard.com
Oxford MFE
This version: March 11, 2020
March 2020

Causal Effect Estimation

- Potential Outcomes
- Challenges in Effect Estimation
- Experimental and Quasi-Experimental Data
  - Randomized Controlled Experiments and ATE
  - Imperfect Compliance and LATE
- Observational Data
  - Regression Discontinuity
  - Difference-in-Difference
  - Panel Models

Potential Outcomes Framework

- Observed outcome for individual or firm $i$: $Y_i$
- $D_i$ is the treatment status variable for individual $i$

$$D_i = \begin{cases} 0 & \text{if untreated} \\ 1 & \text{if treated} \end{cases}$$

- Outcome variable is determined by

$$Y_i = \beta_{0i} + \beta_{1i} D_i$$

- $\beta_{1i}$ is a heterogeneous treatment effect for individual $i$
- Also known as the potential outcomes model
- Two outcomes: $Y_i(0) = \beta_{0i}$ and $Y_i(1) = \beta_{0i} + \beta_{1i}$

Key Measures

Definition (Average Treatment Effect (ATE))
The Average Treatment Effect measures the average effect of treatment across the entire population

$$\text{ATE} = E[\beta_{1i}] = E[Y_i(1)] - E[Y_i(0)]$$

Definition (Average Treatment Effect on the Treated (TOT))
The Average Treatment Effect on the Treated measures the effect of treatment on the treated

$$\text{TOT} = E[\beta_{1i} \mid D = 1] = E[Y_i(1) \mid D = 1] - E[Y_i(0) \mid D = 1]$$

ATE and TOT

- ATE is a weighted average

$$\text{ATE} = \omega \, \text{TOT} + (1 - \omega) \, \text{TUT}$$

- Average Treatment Effect on the Untreated (TUT)

$$\text{TUT} = E[\beta_{1i} \mid D = 0] = E[Y_i(1) \mid D = 0] - E[Y_i(0) \mid D = 0]$$

- $\omega = \Pr[D = 1]$ is the probability of being treated
- Should we measure ATE or TOT?
  - TOT makes sense when treatment is non-compulsory
    - Individuals who do not undertake treatment are not relevant for a cost-benefit calculation
  - ATE is more sensible for mandatory programs
    - Measures the effect on both those who would like to participate and those who would not
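
A small simulation can make the weighted-average identity concrete. The sketch below is illustrative only: the distributions of $\beta_{0i}$ and $\beta_{1i}$ and the selection rule are assumptions chosen for the example, not taken from the slides. Individuals with larger gains are made more likely to take treatment, so TOT exceeds TUT.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000

# Hypothetical data-generating process (an assumption for illustration)
beta0 = rng.normal(1.0, 1.0, n)   # untreated outcome Y_i(0)
beta1 = rng.normal(2.0, 1.0, n)   # heterogeneous treatment effect
# Selection on gains: larger beta1 makes treatment more likely
d = (beta1 + rng.normal(0.0, 1.0, n) > 2.0).astype(int)

omega = d.mean()                  # sample analogue of Pr[D = 1]
ate = beta1.mean()                # average effect over everyone
tot = beta1[d == 1].mean()        # average effect among the treated
tut = beta1[d == 0].mean()        # average effect among the untreated

print(f"ATE={ate:.3f}  TOT={tot:.3f}  TUT={tut:.3f}  omega={omega:.3f}")
print(f"omega*TOT + (1-omega)*TUT = {omega * tot + (1 - omega) * tut:.3f}")
```

In the sample the identity holds exactly, because the overall mean of $\beta_{1i}$ is the share-weighted mean of the two group means.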

Naïve estimation

- Estimate the regression on observed data

$$Y_i = b_0 + b_1 D_i + \varepsilon_i$$

  - $\hat{b}_0 \xrightarrow{p} E[Y_i \mid D = 0]$
  - $\hat{b}_1 \xrightarrow{p} E[Y_i \mid D = 1] - E[Y_i \mid D = 0]$
- Leads to selection bias

$$\underbrace{E[Y_i \mid D = 1] - E[Y_i \mid D = 0]}_{\text{Observed Effect}} = \underbrace{E[Y_i(1) \mid D = 1] - E[Y_i(0) \mid D = 1]}_{\text{Avg. Treatment Effect on the Treated (TOT)}} + \underbrace{E[Y_i(0) \mid D = 1] - E[Y_i(0) \mid D = 0]}_{\text{Selection Bias (SB)}}$$

- In terms of the regression

$$\underbrace{E[\hat{b}_1]}_{\text{Observed Effect}} = \underbrace{E[\beta_{1i} \mid D = 1]}_{\text{TOT}} + \underbrace{E[\beta_{0i} \mid D = 1] - E[\beta_{0i} \mid D = 0]}_{\text{Selection Bias (SB)}}$$

- SB is the difference in the no-treatment outcomes for the treated and untreated
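
The decomposition can be checked numerically. The following is a minimal sketch under an assumed data-generating process (the parameter values and the selection rule are hypothetical): individuals with a high no-treatment outcome $\beta_{0i}$ select into treatment, so the naïve OLS slope equals TOT plus a positive selection bias.

```python
import numpy as np

rng = np.random.default_rng(1)
n = 200_000

# Hypothetical potential outcomes (assumed values, for illustration only)
beta0 = rng.normal(0.0, 1.0, n)    # Y_i(0)
beta1 = rng.normal(-0.5, 0.5, n)   # heterogeneous effect, negative on average
# Selection: a higher no-treatment outcome makes treatment more likely
d = (beta0 + rng.normal(0.0, 1.0, n) > 0.5).astype(int)
y = beta0 + beta1 * d              # observed outcome

# Naive regression of Y on a constant and D
X = np.column_stack([np.ones(n), d])
b0_hat, b1_hat = np.linalg.lstsq(X, y, rcond=None)[0]

tot = beta1[d == 1].mean()
sb = beta0[d == 1].mean() - beta0[d == 0].mean()

print(f"naive b1 = {b1_hat:.3f}")
print(f"TOT = {tot:.3f}, SB = {sb:.3f}, TOT + SB = {tot + sb:.3f}")
```

With a binary regressor the OLS slope is the difference in group means, so `b1_hat` matches `tot + sb` up to floating-point error even though it is far from the true effect.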

(Missing) Counterfactuals

- Fundamental problem: Cannot see counterfactual

| Treatment ($D_i$) | 0 | 1 |
|---|---|---|
| Observe | $Y_i(0) = \beta_{0i}$ | $Y_i(1) = \beta_{0i} + \beta_{1i}$ |
| Counterfactual | $Y_i(1) = \beta_{0i} + \beta_{1i}$ | $Y_i(0) = \beta_{0i}$ |

- No data on $Y_i(1)$ when $D_i = 0$ and $Y_i(0)$ when $D_i = 1$
- TOT measures the effect conditional on receiving treatment
  - Missing counterfactual: $E[Y_i(0) \mid D = 1]$
- Observed effect is contaminated with selection bias

Example: Financial Stress and Payday Loans

- Outcome is a measure of financial distress: 90-days delinquent on a debt
- Treatment is taking out a payday loan
- TOT: Difference in delinquency between taking and not taking the loan, given that a loan is wanted ($D = 1$)
- SB: Difference in outcome when no loan is taken between those who want a loan and those who do not
  - Plausible TOT is negative but SB is positive
  - Positive SB if $E[\beta_{0i} \mid D = 1] > E[\beta_{0i} \mid D = 0]$
    - Default rates absent a loan are higher for loan takers than for non-takers
  - Observed effect could have either sign

Randomization

- Randomization removes selection bias
- Well executed Randomized Controlled Trials are the gold standard for causal effect estimation
- An RCT ensures that $\{\beta_{0i}, \beta_{1i}\} \perp\!\!\!\perp D_i$ and $\{Y_i(0), Y_i(1)\} \perp\!\!\!\perp D_i$
- Randomly give loans only to those seeking them
  - Creates a group with $Y_i(0)$ as if $D = 1$

Independence and Conditioning
If $Z$ and $W$ are independent random variables, then

$$E[Z \mid W = w_1] = E[Z \mid W = w_2] = E[Z].$$

- Knowledge of $W$ provides no information about $Z$.
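
To see why independence removes selection bias, the sketch below assigns treatment by a coin flip in an assumed population (all parameter values are hypothetical). Because $D_i$ is drawn independently of $\{\beta_{0i}, \beta_{1i}\}$, the selection-bias term is close to zero and the difference in means is close to the ATE.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Hypothetical potential outcomes (assumed values, for illustration only)
beta0 = rng.normal(0.0, 1.0, n)    # Y_i(0)
beta1 = rng.normal(-0.5, 0.5, n)   # heterogeneous treatment effect
# Randomized assignment: a fair coin flip, independent of (beta0, beta1)
d = rng.integers(0, 2, n)
y = beta0 + beta1 * d

diff_means = y[d == 1].mean() - y[d == 0].mean()
sb = beta0[d == 1].mean() - beta0[d == 0].mean()
ate = beta1.mean()

print(f"difference in means = {diff_means:.3f}")
print(f"ATE = {ate:.3f}, selection bias = {sb:.3f}")
```

Any remaining gap between the difference in means and the ATE is sampling noise, which shrinks as the sample grows.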

Issues Affecting RCT Validity

- Internal Validity: are the results valid for the sample used?
  - Is the assignment actually random?

$$X_{it} = \alpha + \beta D_{it} + \varepsilon_{it}, \quad H_0: \beta = 0, \quad H_1: \beta \neq 0$$

  - Are participants complying?
  - Are there spill-overs of non-rival treatments to non-treated?
  - Hawthorne Effect: studying a subject changes their behavior
- External Validity: do the results generalize to a broader sample?
  - Is the RCT sample representative of the target population?
  - Are there other key personnel that are essential for success?
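
The randomization check above amounts to a balance test: regress a pre-treatment characteristic on the treatment indicator and test whether the slope is zero. The sketch below is a simplified cross-sectional version with homoskedastic standard errors. The covariate `x`, its distribution, and the sample size are assumptions made for illustration; the slides' regression also carries a time index, which is omitted here.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 5_000

# Hypothetical pre-treatment covariate and randomized assignment
x = rng.normal(50.0, 10.0, n)
d = rng.integers(0, 2, n)

# OLS of the covariate on a constant and the treatment indicator
X = np.column_stack([np.ones(n), d])
coef = np.linalg.lstsq(X, x, rcond=None)[0]
resid = x - X @ coef

# Classical (homoskedastic) covariance matrix and t-statistic for beta
sigma2 = resid @ resid / (n - 2)
cov = sigma2 * np.linalg.inv(X.T @ X)
t_stat = coef[1] / np.sqrt(cov[1, 1])

print(f"beta_hat = {coef[1]:.3f}, t-stat = {t_stat:.2f}")
```

A t-statistic that is large in absolute value would suggest that treated and untreated units differ systematically before treatment, which casts doubt on the randomization.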

LATE: Local Average Treatment Effects

- Previous result requires perfect compliance
  - Treated if offered, not treated if not offered
- When treatment is not random, or compliance is not perfect, simple estimators are not consistent
- Possible to use an instrument to recover a meaningful measure of treatment effect
- Measure is local in the sense that it measures the effect for a particular subgroup of the treated
- Notation
  - $D_i$ is treatment status
  - $Z_i$ is treatment assignment (offer to treat)
- Compliance
  - Perfect if $D_i = Z_i$
  - Imperfect if $D_i \neq Z_i$ for some $i$
- $Z_i$ may be random even if $D_i$ is not
  - Treatment assignment ($Z_i$) is made by lottery due to limited capacity
  - Treatment status ($D_i$) conditional on offer depends on expected benefits

System of Equations

- Leads to two-equation system

$$\text{Structural Equation:} \quad Y_i = \beta_{0i} + \beta_{1i} D_i$$

$$\text{Treatment Equation:} \quad D_i = \pi_{0i} + \pi_{1i} Z_i$$

- Causal chain: $Z_i \rightarrow D_i \rightarrow Y_i$
- Treatment equation measures potential treatment status

$$D_i(z) = \pi_{0i} + \pi_{1i} z$$

  - $D_i(0) = \pi_{0i}$ is status when not assigned
  - $D_i(1) = \pi_{0i} + \pi_{1i}$ is status when assigned
  - Both $D_i(0)$ and $D_i(1)$ may be 0 or 1
- Treatment responsiveness $\pi_{1i}$ is heterogeneous like treatment effect $\beta_{1i}$

Independence

Assumption (Independence)
The potential outcomes and potential treatment assignments are independent of $Z_i$

$$\{\beta_{0i}, \beta_{1i}, \pi_{0i}, \pi_{1i}\} \perp\!\!\!\perp Z_i$$

- Often described as "as if randomly assigned"
- Note that the instrument is independent of the potential treatment status
- $Z_i$ does not affect the probability that either occur ($\pi_{\bullet i}$)
- $Z_i$ does not affect the outcomes if treatment is taken or not ($\beta_{\bullet i}$)
- Is this a reasonable assumption?
  - Often plausible when $Z_i$ is assigned using randomization (lottery)
  - Sometimes plausible for $Z_i$ taken from observational data

Exclusion

Assumption (Exclusion Restriction)
The instrument does not appear in the structural equation, so that treatment assignment affects the outcome only through treatment status.

- Violations of the exclusion restriction mean that $Z_i$ affects $Y_i$ through more than just $D_i$
- Classic example is when $Z_i$ directly affects both $Y_i$ and $D_i$
- In many cases, $Z_i$ affects $D_i$ and another variable $X_i$ which in turn affects $Y_i$

$$Z_i \rightarrow D_i \rightarrow Y_i, \qquad Z_i \rightarrow X_i \rightarrow Y_i$$

- Suppose selection for a randomly assigned government funding program increases the probability of program participation ($Z_i \rightarrow D_i$)
- If selection also increases the probability that a firm receives series B funding, then the effect is confounded with fundraising ($Z_i \rightarrow X_i$)
- Exclusion restriction ensures that $Z$ does not affect the potential outcomes $Y_i(0) = \beta_{0i}$ and $Y_i(1) = \beta_{0i} + \beta_{1i}$ for $Z \in \{0, 1\}$

Instrumental Variable Estimation

- The 2SLS estimator is obtained by
  1. Regress $D_i = p_0 + p_1 Z_i + \eta_i$ and retain $\hat{D}_i = \hat{p}_0 + \hat{p}_1 Z_i$
  2. Regress $Y_i = b_0 + b_1 \hat{D}_i + \varepsilon_i$
- In large samples

$$\hat{b}_1^{2SLS} \xrightarrow{p} \frac{E[\beta_{1i} \pi_{1i}]}{E[\pi_{1i}]} = E\left[\beta_{1i} \frac{\pi_{1i}}{E[\pi_{1i}]}\right] = \text{LATE}$$

- LATE is a weighted average of treatment effects
- Weights are determined by responsiveness to treatment assignment
  - Also holds if either $D_i$ or $Z_i$ is not binary
- If effects are not heterogeneous ($\beta_{1i} = \beta_1$ or $\pi_{1i} = \pi_1$) then LATE = ATE
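
The two-step procedure and its large-sample limit can be illustrated with a simulation. This is a sketch under assumed values, not the author's implementation: the offer $Z_i$ is randomized, 10% of individuals take treatment regardless of the offer, 60% of the remainder respond to the offer, and responders are given larger treatment effects so that LATE and ATE differ. The helper `ols` is a hypothetical convenience function.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 400_000

def ols(X, y):
    """Least-squares coefficients of y on the columns of X."""
    return np.linalg.lstsq(X, y, rcond=None)[0]

# Randomized offer, independent of all individual coefficients
z = rng.integers(0, 2, n)

# Potential treatment status: D_i(0) = pi0, D_i(1) = pi0 + pi1 (assumed shares)
pi0 = (rng.random(n) < 0.1).astype(int)
pi1 = np.where(pi0 == 1, 0, (rng.random(n) < 0.6).astype(int))

# Potential outcomes: responders (pi1 = 1) have larger effects by assumption
beta0 = rng.normal(0.0, 1.0, n)
beta1 = 1.0 + 1.0 * pi1 + rng.normal(0.0, 0.5, n)

d = (pi0 + pi1 * z).astype(float)   # treatment equation
y = beta0 + beta1 * d               # structural equation (Z excluded)

# Step 1: regress D on Z and keep the fitted values
Z = np.column_stack([np.ones(n), z])
d_hat = Z @ ols(Z, d)

# Step 2: regress Y on the fitted values
b_2sls = ols(np.column_stack([np.ones(n), d_hat]), y)[1]

late = (beta1 * pi1).mean() / pi1.mean()
ate = beta1.mean()

print(f"2SLS = {b_2sls:.3f}, LATE = {late:.3f}, ATE = {ate:.3f}")
```

The 2SLS estimate tracks the responsiveness-weighted average rather than the ATE, because individuals with $\pi_{1i} = 0$ receive zero weight. Standard errors from the second-step regression are not valid as computed this way, so the sketch reports point estimates only.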
