
“A Course in Applied Econometrics” Lecture 1

Estimation of Average Treatment Effects Under Unconfoundedness, Part I

Guido Imbens, IRP Lectures, UW Madison, August 2008

Outline

  • 1. Introduction
  • 2. Potential Outcomes
  • 3. Estimands and Identification
  • 4. Estimation and Inference


1. Introduction

We are interested in estimating the average effect of a program or treatment, allowing for heterogeneous effects, assuming that selection can be taken care of by adjusting for differences in observed covariates.

This setting is of great applied interest, with a long literature in both statistics and economics. Influential economics/econometrics papers include Ashenfelter and Card (1985), Barnow, Cain and Goldberger (1980), Card and Sullivan (1988), Dehejia and Wahba (1999), Hahn (1998), Heckman and Hotz (1989), Heckman and Robb (1985), and Lalonde (1986). In the statistics literature, see the work by Rubin (1974, 1978) and Rosenbaum and Rubin (1983).


This is an unusual case in which many (semi-parametric) estimators have been proposed (matching, regression, propensity score, or combinations), many of which are actually used in practice. We discuss implementation, and assessment of the critical assumptions (even if they are not testable).

In practice, concern with overlap in covariate distributions tends to be important. Once overlap issues are addressed, the choice of estimator is less important. Estimators combining matching and regression, or weighting and regression, are recommended for robustness reasons.

There is a key role for analysis of the joint distribution of the treatment indicator and covariates prior to using outcome data.


2. Potential Outcomes (Rubin, 1974)

We observe N units, indexed by i = 1, . . . , N, viewed as drawn randomly from a large population. We postulate the existence for each unit of a pair of potential outcomes:

  • Yi(0), the outcome under the control treatment
  • Yi(1), the outcome under the active treatment

Yi(1) − Yi(0) is the unit-level causal effect. Covariates Xi are not affected by the treatment.

Each unit is exposed to a single treatment: Wi = 0 if unit i receives the control treatment, Wi = 1 if unit i receives the active treatment. We observe for each unit the triple (Wi, Yi, Xi), where Yi is the realized outcome:

Yi ≡ Yi(Wi) = Yi(0) if Wi = 0, and Yi(1) if Wi = 1.
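To make the observation rule concrete, here is a minimal simulation sketch in Python (the data-generating process and all parameter values are illustrative assumptions, not part of the lecture):

    import numpy as np

    rng = np.random.default_rng(0)
    N = 1000

    # Illustrative covariate and potential outcomes (hypothetical DGP)
    X = rng.normal(size=N)
    Y0 = 1.0 + 0.5 * X + rng.normal(size=N)   # Y_i(0): outcome under control
    Y1 = Y0 + 2.0 + 0.3 * X                   # Y_i(1): heterogeneous effect

    # Each unit receives a single treatment; only one potential outcome is seen
    W = rng.binomial(1, 0.5, size=N)
    Y = np.where(W == 1, Y1, Y0)              # realized outcome Y_i = Y_i(W_i)

The pair (Yi(0), Yi(1)) is never observed jointly for any unit; the estimators below recover only averages of the difference.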

Several additional pieces of notation. First, the propensity score (Rosenbaum and Rubin, 1983) is defined as the conditional probability of receiving the treatment:

e(x) = Pr(Wi = 1 | Xi = x) = E[Wi | Xi = x].

Also the two conditional regression and variance functions:

µw(x) = E[Yi(w) | Xi = x],   σ²_w(x) = V(Yi(w) | Xi = x).
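In practice e(x) is unknown and must be estimated. A minimal sketch, continuing the simulated data above (scikit-learn and the logistic specification are illustrative choices, not part of the lecture):

    from sklearn.linear_model import LogisticRegression

    # Estimate the propensity score e(x) = Pr(W_i = 1 | X_i = x)
    logit = LogisticRegression().fit(X.reshape(-1, 1), W)
    e_hat = logit.predict_proba(X.reshape(-1, 1))[:, 1]   # e_hat[i] estimates e(X_i)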

3. Estimands and Identification

Population average treatment effects:

τP = E[Yi(1) − Yi(0)],   τP,T = E[Yi(1) − Yi(0) | Wi = 1].

Most of the discussion in these notes will focus on τP, with extensions to τP,T available in the references. We will also look at the sample average treatment effect (SATE):

τS = (1/N) Σ_{i=1}^N (Yi(1) − Yi(0)).

The distinction between τP and τS does not matter for estimation, but it matters for the variance.

4. Estimation and Inference

Assumption 1 (Unconfoundedness, Rosenbaum and Rubin, 1983a):

(Yi(0), Yi(1)) ⊥⊥ Wi | Xi.

Also called the “conditional independence assumption” or “selection on observables”; in the missing data literature, “missing at random.”

To see the link with standard exogeneity assumptions, assume a constant effect and a linear regression function:

Yi(0) = α + X′i β + εi   =⇒   Yi = α + τ · Wi + X′i β + εi,

with εi ⊥⊥ Xi. Given the constant treatment effect assumption, unconfoundedness is equivalent to independence of Wi and εi conditional on Xi, which would also capture the idea that Wi is exogenous.


Motivation for Unconfoundedness Assumption (I)

The first motivation is statistical and data-descriptive. A natural starting point in the evaluation of any program is a comparison of average outcomes for treated and control units. A logical next step is to adjust any difference in average outcomes for differences in exogenous background characteristics (exogenous in the sense of not being affected by the treatment).

Such an analysis may not lead to the final word on the efficacy of the treatment, but the absence of such an analysis would seem difficult to rationalize in a serious attempt to understand the evidence regarding the effect of the treatment.

Motivation for Unconfoundedness Assumption (II)

A second argument is that almost any evaluation of a treatment involves comparisons of units who received the treatment with units who did not. The question is typically not whether such a comparison should be made, but rather which units should be compared; that is, which units best represent the treated units had they not been treated.

It is clear that settings where some of the necessary covariates are not observed will require strong assumptions to allow for identification (e.g., instrumental variables settings). Absent those assumptions, typically only bounds can be identified (e.g., Manski, 1990, 1995).

Motivation for Unconfoundedness Assumption (III)

An example of a model that is consistent with unconfoundedness: suppose we are interested in estimating the average effect of a binary input on a firm’s output, Yi = g(Wi, εi). Suppose that profits are output minus costs, so that

Wi = argmax_w E[πi(w) | ci] = argmax_w E[g(w, εi) − ci · w | ci],

implying

Wi = 1{E[g(1, εi) − g(0, εi) ≥ ci | ci]} = h(ci).

If unobserved marginal costs ci differ between firms, and these marginal costs are independent of the errors εi in the firms’ forecast of output given inputs, then unconfoundedness will hold, as (g(0, εi), g(1, εi)) ⊥⊥ ci.

Overlap

The second assumption concerns the joint distribution of treatments and covariates:

Assumption 2 (Overlap): 0 < Pr(Wi = 1 | Xi) < 1.

Rosenbaum and Rubin (1983a) refer to the combination of the two assumptions as “strongly ignorable treatment assignment.”


Identification Given Assumptions 1 and 2

τ(x) ≡ E[Yi(1) − Yi(0) | Xi = x]
     = E[Yi(1) | Xi = x] − E[Yi(0) | Xi = x]
     = E[Yi(1) | Xi = x, Wi = 1] − E[Yi(0) | Xi = x, Wi = 0]   (by unconfoundedness)
     = E[Yi | Xi = x, Wi = 1] − E[Yi | Xi = x, Wi = 0]   (since Yi = Yi(Wi)).

To make this feasible, one needs to be able to estimate the expectations E[Yi | Xi = x, Wi = w] for all values of w and x in the support of these variables; this is where overlap is important. Given identification of τ(x),

τP = E[τ(Xi)].
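When Xi is discrete, this identification argument maps directly into a plug-in estimator: difference the treated and control means within each covariate cell, then average the cell-level effects over the empirical distribution of Xi. A minimal sketch (a hypothetical helper; it assumes overlap, i.e., that every cell contains both treated and control units):

    import numpy as np

    def tau_hat_stratified(Y, W, X_cells):
        # tau(x): within-cell difference of treated and control means,
        # averaged over cells with weights equal to the cell frequencies.
        cells, counts = np.unique(X_cells, return_counts=True)
        tau_x = np.array([Y[(X_cells == c) & (W == 1)].mean()
                          - Y[(X_cells == c) & (W == 0)].mean()
                          for c in cells])
        return np.average(tau_x, weights=counts)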

Alternative Assumptions

A weaker, mean-independence version of Assumption 1:

E[Yi(w) | Wi, Xi] = E[Yi(w) | Xi], for w = 0, 1.

Although this assumption is unquestionably weaker, in practice it is rare that a convincing case can be made for the weaker assumption without the case being equally strong for the stronger assumption. The reason is that the weaker assumption is intrinsically tied to functional-form assumptions, and as a result one cannot identify average effects on transformations of the original outcome (e.g., logarithms) without the stronger assumption.

If we are interested in τP,T it is sufficient to assume Yi(0) ⊥⊥ Wi | Xi.

Propensity Score

Result 1: Suppose that Assumption 1 holds. Then:

(Yi(0), Yi(1)) ⊥⊥ Wi | e(Xi).

One then only needs to condition on a scalar function of the covariates, which is much easier in practice if Xi is high-dimensional. (The problem is that the propensity score e(x) is almost never known.)

Efficiency Bound

Hahn (1998): for any regular estimator ˆτ of τP with

√N · (ˆτ − τP) →_d N(0, V),

the variance must satisfy

V ≥ E[ σ²_1(Xi)/e(Xi) + σ²_0(Xi)/(1 − e(Xi)) + (τ(Xi) − τP)² ].   (1)

Estimators exist that achieve this bound.


Estimators

  • A. Regression Estimators
  • B. Matching
  • C. Propensity Score Estimators
  • D. Mixed Estimators (recommended)


A. Regression Estimators

Estimate µw(x) consistently and estimate τP or τS as

ˆτreg = (1/N) Σ_{i=1}^N ( ˆµ1(Xi) − ˆµ0(Xi) ).

Simple implementations include µw(x) = β′x + τ · w, in which case the average treatment effect is equal to τ. In this case one can estimate τ simply by least squares estimation using the regression function

Yi = α + β′Xi + τ · Wi + εi.

More generally, one can specify separate regression functions for the two regimes, µw(x) = β′_w x.
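A minimal sketch of ˆτreg with separate linear regressions fitted in each treatment arm (plain numpy least squares; any consistent estimator of µw could be substituted):

    import numpy as np

    def tau_hat_reg(Y, W, X):
        # Fit mu_0 and mu_1 by OLS on the control and treated subsamples,
        # then average the predicted contrast mu1_hat(X_i) - mu0_hat(X_i).
        Z = np.column_stack([np.ones(len(Y)), X])   # intercept plus covariates
        b0, *_ = np.linalg.lstsq(Z[W == 0], Y[W == 0], rcond=None)
        b1, *_ = np.linalg.lstsq(Z[W == 1], Y[W == 1], rcond=None)
        return np.mean(Z @ b1 - Z @ b0)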

These simple regression estimators can be sensitive to differences in the covariate distributions for treated and control units, because in that case they rely heavily on extrapolation.

Note that µ0(x) is used to predict the missing outcomes for the treated. Hence on average one wishes to predict the control outcome at X̄T = Σ_i Wi · Xi / NT, the average covariate value for the treated. With a linear regression function, the average prediction can be written as ȲC + ˆβ′(X̄T − X̄C).

If X̄T and X̄C are close, the precise specification of the regression function will not matter much for the average prediction. With the two averages very different, the prediction based on a linear regression function can be sensitive to changes in the specification.


B. Matching

Let ℓm(i) be the index of the mth closest match to unit i among the units with the opposite treatment, that is, the index l that satisfies Wl ≠ Wi and

Σ_{j : Wj ≠ Wi} 1{ ‖Xj − Xi‖ ≤ ‖Xl − Xi‖ } = m,

and let JM(i) = {ℓ1(i), . . . , ℓM(i)} denote the set of the first M matches. Then

ˆYi(0) = Yi if Wi = 0, and (1/M) Σ_{j ∈ JM(i)} Yj if Wi = 1;
ˆYi(1) = (1/M) Σ_{j ∈ JM(i)} Yj if Wi = 0, and Yi if Wi = 1.

The simple matching estimator is

ˆτ^sm_M = (1/N) Σ_{i=1}^N ( ˆYi(1) − ˆYi(0) ).   (2)
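A minimal sketch of the simple matching estimator, with M matches per unit drawn with replacement from the opposite treatment group (brute-force Euclidean distances; ties broken arbitrarily):

    import numpy as np

    def tau_hat_match(Y, W, X, M=1):
        # Impute each unit's missing potential outcome by the mean outcome
        # of its M nearest neighbours (in X) with the opposite treatment.
        X2 = np.atleast_2d(X.T).T                 # ensure shape (N, K)
        Y_imp = np.empty(len(Y))
        for i in range(len(Y)):
            opp = np.flatnonzero(W != W[i])       # candidate matches J_M(i)
            d = np.linalg.norm(X2[opp] - X2[i], axis=1)
            Y_imp[i] = Y[opp[np.argsort(d)[:M]]].mean()
        Y1_hat = np.where(W == 1, Y, Y_imp)       # hat Y_i(1)
        Y0_hat = np.where(W == 0, Y, Y_imp)       # hat Y_i(0)
        return np.mean(Y1_hat - Y0_hat)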

Issues with Matching

The bias is of order O(N^{−1/K}), where K is the dimension of the covariates. This matters in large samples if K ≥ 2 (and dominates the variance asymptotically if K ≥ 3). Matching is not efficient (though the efficiency loss is small), and it is easy to implement and robust.

C.1 Propensity Score Estimators: Weighting

E[ W·Y / e(X) ] = E[ E[ W·Yi(1) / e(X) | X ] ] = E[ e(X)·E[Yi(1) | X] / e(X) ] = E[Yi(1)],

where the second equality uses unconfoundedness (W ⊥⊥ Yi(1) | X). Similarly,

E[ (1 − W)·Y / (1 − e(X)) ] = E[Yi(0)],

implying

τP = E[ W·Y / e(X) − (1 − W)·Y / (1 − e(X)) ].

With the propensity score known, one can directly implement this estimator as

˜τ = (1/N) Σ_{i=1}^N ( Wi·Yi / e(Xi) − (1 − Wi)·Yi / (1 − e(Xi)) ).   (3)

Implementation of the Horvitz-Thompson Estimator

Estimate e(x) flexibly (Hirano, Imbens and Ridder, 2003) and normalize the weights to sum to one within each treatment group:

ˆτweight = [ Σ_{i=1}^N Wi·Yi / ˆe(Xi) ] / [ Σ_{i=1}^N Wi / ˆe(Xi) ]
         − [ Σ_{i=1}^N (1 − Wi)·Yi / (1 − ˆe(Xi)) ] / [ Σ_{i=1}^N (1 − Wi) / (1 − ˆe(Xi)) ].

This estimator is efficient given a nonparametric estimator for e(x), but it is potentially sensitive to the estimator for the propensity score.
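A minimal sketch of the normalized weighting estimator above, taking the estimated propensity scores ˆe(Xi) as given (e.g., from the logistic fit sketched earlier):

    import numpy as np

    def tau_hat_weight(Y, W, e_hat):
        # Normalized inverse-probability weighting: weighted treated mean
        # minus weighted control mean, with weights that sum to one.
        w1 = W / e_hat
        w0 = (1 - W) / (1 - e_hat)
        return (w1 @ Y) / w1.sum() - (w0 @ Y) / w0.sum()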


Matching or Regression on the Propensity Score

It is not clear what the advantages are: the large sample properties are not known, and simulation results are not encouraging.

D.1 Mixed Estimators: Weighting and Regression

Interpret the Horvitz-Thompson estimator as a weighted regression estimator:

Yi = α + τ · Wi + εi,  with weights λi = Wi / e(Xi) + (1 − Wi) / (1 − e(Xi)).

This weighted-least-squares representation suggests that one may add covariates to the regression function to improve precision, for example as

Yi = α + β′Xi + τ · Wi + εi,

with the same weights λi. Such an estimator is consistent as long as either the regression model or the propensity score (and thus the weights) is specified correctly. That is, in the Robins-Ritov terminology, the estimator is doubly robust.
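A minimal sketch of the doubly robust estimator: weighted least squares of Yi on (1, Xi, Wi) with the weights λi above, where the coefficient on Wi is the estimate of τ (numpy only; the square-root-weight transformation is the standard WLS device):

    import numpy as np

    def tau_hat_dr(Y, W, X, e_hat):
        # lambda_i = W_i/e_hat_i + (1 - W_i)/(1 - e_hat_i)
        lam = W / e_hat + (1 - W) / (1 - e_hat)
        Z = np.column_stack([np.ones(len(Y)), X, W])     # regressors, W last
        sw = np.sqrt(lam)
        coef, *_ = np.linalg.lstsq(Z * sw[:, None], Y * sw, rcond=None)
        return coef[-1]                                  # coefficient on W_i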

D.2 Matching and Regression

First match observations, with ℓ(i) the single closest match to unit i. Define

ˆXi(0) = Xi if Wi = 0, and Xℓ(i) if Wi = 1;
ˆXi(1) = Xℓ(i) if Wi = 0, and Xi if Wi = 1.

Then adjust the within-pair difference for the within-pair difference in covariates, ˆXi(1) − ˆXi(0):

ˆτ^adj_M = (1/N) Σ_{i=1}^N ( ˆYi(1) − ˆYi(0) − ˆβ′( ˆXi(1) − ˆXi(0) ) ),

using a regression estimate for β. This can eliminate the bias of the matching estimator, given a flexible specification of the regression function.
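A minimal sketch of the bias-adjusted matching estimator with a single match; taking ˆβ from a pooled OLS of Yi on Xi is one illustrative choice among several reasonable ones:

    import numpy as np

    def tau_hat_match_adj(Y, W, X):
        X2 = np.atleast_2d(X.T).T
        N = len(Y)
        Y_m = np.empty(N)                  # matched outcome Y_l(i)
        X_m = np.empty_like(X2)            # matched covariates X_l(i)
        for i in range(N):
            opp = np.flatnonzero(W != W[i])
            j = opp[np.argmin(np.linalg.norm(X2[opp] - X2[i], axis=1))]
            Y_m[i], X_m[i] = Y[j], X2[j]
        # Pooled OLS slope (with intercept) as the adjustment coefficient beta_hat
        Z = np.column_stack([np.ones(N), X2])
        beta = np.linalg.lstsq(Z, Y, rcond=None)[0][1:]
        Y1 = np.where(W == 1, Y, Y_m)
        Y0 = np.where(W == 0, Y, Y_m)
        X1 = np.where((W == 1)[:, None], X2, X_m)
        X0 = np.where((W == 0)[:, None], X2, X_m)
        return np.mean(Y1 - Y0 - (X1 - X0) @ beta)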

Estimation of the Variance

For the efficient estimator of τP:

VP = E[ σ²_1(Xi)/e(Xi) + σ²_0(Xi)/(1 − e(Xi)) + (µ1(Xi) − µ0(Xi) − τ)² ].

Estimate all components nonparametrically and plug in. Alternatively, use the bootstrap (which does not work for the matching estimator).


Estimation of the Variance

All estimators of τS can be written as, for some known weights λi(X, W),

ˆτ = Σ_{i=1}^N λi(X, W) · Yi,  with  V(ˆτ | X, W) = Σ_{i=1}^N λi(X, W)² · σ²_{Wi}(Xi).

To estimate σ²_{Wi}(Xi), one uses the closest match within the set of units with the same treatment indicator. Let v(i) be the closest unit to i with the same treatment indicator. The sample variance of the outcome for these two units can then be used to estimate σ²_{Wi}(Xi):

ˆσ²_{Wi}(Xi) = ( Yi − Yv(i) )² / 2.
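A minimal sketch of this matched-pair variance estimate, followed by the plug-in conditional variance of ˆτ for a given weight vector (both hypothetical helpers):

    import numpy as np

    def sigma2_hat(Y, W, X):
        # For each i, find the closest unit v(i) with the same treatment
        # indicator and set sigma2_hat_i = (Y_i - Y_v(i))^2 / 2.
        X2 = np.atleast_2d(X.T).T
        s2 = np.empty(len(Y))
        idx = np.arange(len(Y))
        for i in idx:
            same = np.flatnonzero((W == W[i]) & (idx != i))
            v = same[np.argmin(np.linalg.norm(X2[same] - X2[i], axis=1))]
            s2[i] = (Y[i] - Y[v]) ** 2 / 2.0
        return s2

    # Plug-in conditional variance for a given weight vector lam (lambda_i(X, W)):
    # V_hat = np.sum(lam**2 * sigma2_hat(Y, W, X))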