Even Simpler Standard Errors for Two-Stage Optimization Estimators: Mata Implementation via the DERIV Command

by Joseph V. Terza, Department of Economics, Indiana University Purdue University Indianapolis, Indianapolis, IN 46202 (July, 2018)

Two-Stage Estimation: Example -- Smoking and Infant Birth Weight
-- Consider the regression model of Mullahy (1997), in which
   Y = infant birth weight in lbs.
   $X_p$ = number of cigarettes smoked per day during pregnancy.
-- The objective is to regress Y on $X_p$ with a view toward estimating (and drawing inferences regarding) the causal effect of the latter on the former.
Mullahy, J. (1997): "Instrumental-Variable Estimation of Count Data Models: Applications to Models of Cigarette Smoking Behavior," Review of Economics and Statistics, 79, 586-593.

Smoking and Infant Birth Weight (cont’d)
-- Two complicating factors:
   -- The regression specification is nonlinear because Y is non-negative.
   -- $X_p$ is likely to be endogenous, i.e., correlated with unobservable variates that are also correlated with Y.
-- For example, unobserved unhealthy behaviors may be correlated with both smoking and infant birth weight.
-- If the endogeneity of $X_p$ is not explicitly accounted for in estimation, effects on Y due to the unobservables will be attributed to $X_p$, and the regression results will not be causally interpretable (CI).

Remedy: Two-Stage Residual Inclusion (2SRI) Estimation
-- One can use a 2SRI estimator (Terza et al., 2008; Terza, 2017a and 2018) to account for endogeneity and avoid bias.
-- The two stages are:
   -- Estimate the “auxiliary” regression of $X_p$ on some controls [including instrumental variables (IV)].
   -- Estimate the “outcome” regression of Y on $X_p$, the controls (not including the IV), and the residuals from the auxiliary regression.
Terza, J., Basu, A. and Rathouz, P. (2008): “Two-Stage Residual Inclusion Estimation: Addressing Endogeneity in Health Econometric Modeling,” Journal of Health Economics, 27, 531-543.
Terza, J.V. (2017a): “Two-Stage Residual Inclusion Estimation: A Practitioner’s Guide to Stata Implementation,” the Stata Journal, 17, 916-938.
Terza, J.V. (2018): “Two-Stage Residual Inclusion Estimation in Health Services Research and Health Economics,” Health Services Research, 53, 1890-1899.
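The two 2SRI stages can be sketched in a few lines of Python. This is only an illustration under strong simplifying assumptions that are not in the slides: a single instrument `z`, no other controls, and a linear outcome regression in place of the nonlinear one the smoking example calls for. The data and all function names (`solve`, `ols`, `two_stage_residual_inclusion`) are hypothetical.

```python
def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        piv = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[piv] = m[piv], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        s = m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = s / m[i][i]
    return x


def ols(rows, y):
    """OLS coefficients from the normal equations (X'X) b = X'y."""
    k = len(rows[0])
    xtx = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * yi for r, yi in zip(rows, y)) for i in range(k)]
    return solve(xtx, xty)


def two_stage_residual_inclusion(y, x_p, z):
    # Stage 1: auxiliary regression of X_p on a constant and the instrument.
    a0, a1 = ols([[1.0, zi] for zi in z], x_p)
    resid = [xi - (a0 + a1 * zi) for xi, zi in zip(x_p, z)]
    # Stage 2: outcome regression of Y on a constant, X_p, and the
    # first-stage residuals (linear here purely for illustration).
    coef = ols([[1.0, xi, ri] for xi, ri in zip(x_p, resid)], y)
    return coef, resid


# Made-up illustrative data (not from Mullahy, 1997).
z = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
x_p = [2.0, 1.0, 4.0, 3.0, 6.0, 4.0, 7.0, 9.0]
y = [8.1, 8.4, 7.2, 7.9, 6.5, 7.4, 6.1, 5.6]

coef, resid = two_stage_residual_inclusion(y, x_p, z)
print("coefficient on X_p:", coef[1])
```

In this all-linear special case the coefficient on $X_p$ coincides with the simple IV estimate cov(z, Y)/cov(z, $X_p$), which gives a quick sanity check on the two stages.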

Two-Stage Estimation: Example -- Education and Family Size
-- As another example, we revisit the regression model of Wang and Famoye (1997).
-- We diverge a bit from the authors and begin the analysis by specifying the potential outcome (PO) version of the model, in which
   $X_p^*$ ≡ exogenously imposed (EI) version of the relevant causal variable
           ≡ EI wife’s years of education
   $Y_{X_p^*}$ ≡ relevant PO for the EI version of the relevant causal variable
           ≡ potential number of children in the family if EI wife’s education is $X_p^*$.
Wang, W. and Famoye, F. (1997): “Modeling Household Fertility Decisions with Generalized Poisson Regression,” Journal of Population Economics, 10, 273-283.

Education and Family Size (cont’d)
-- For the sake of argument we assume the following PO specification
$$\mathrm{pdf}(Y_{X_p^*} \mid X_o) = f_{(Y_{X_p^*} \mid X_o)}(X_p^*, X_o; \pi) = \mathrm{POI}(Y_{X_p^*}, \lambda_{X_p^*}), \qquad Y_{X_p^*} = 0, 1, \ldots \tag{1}$$
where
   POI(A, b) ≡ the pdf of the Poisson random variable A with parameter b, i.e., $b^{A}\exp(-b)/A!$
$$\lambda_{X_p^*} \equiv E[Y_{X_p^*} \mid X_o] = \exp(X_p^* \beta_p + X_o \beta_o) \tag{2}$$
and $X_o$ is a vector of regression controls (no endogeneity here).
-- Here $\pi = \beta = [\beta_p \;\; \beta_o']'$.
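As a minimal sketch of (1) and (2), the Poisson pdf and the exponential conditional mean can be written directly in Python. The parameter values below are invented for illustration, and the function names `poisson_pdf` and `conditional_mean` are hypothetical.

```python
import math


def poisson_pdf(a, b):
    """POI(a, b) = b**a * exp(-b) / a!  for a = 0, 1, ..."""
    return b ** a * math.exp(-b) / math.factorial(a)


def conditional_mean(x_p_star, x_o, beta_p, beta_o):
    """lambda = exp(x_p_star * beta_p + x_o . beta_o), as in (2)."""
    index = x_p_star * beta_p + sum(x * b for x, b in zip(x_o, beta_o))
    return math.exp(index)


# Hypothetical values: 12 years of education, controls = [constant].
lam = conditional_mean(12.0, [1.0], -0.05, [1.3])

# The pdf should sum to (nearly) one and have mean lambda.
total = sum(poisson_pdf(a, lam) for a in range(60))
mean = sum(a * poisson_pdf(a, lam) for a in range(60))
print(lam, total, mean)
```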

Two-Stage Marginal Effect (2SME) Estimation: Education and Family Size
-- Suppose that our estimation objective is the average incremental effect (AIE) of an additional year of education on the number of children in the family, i.e.,
$$AIE(1) = E[Y_{X_p^{pre}+1}] - E[Y_{X_p^{pre}}] \tag{3}$$
where $X_p^{pre}$ is the pre-increment EI wife’s education.
-- Given (2) we can rewrite (3) as
$$AIE(1) = E\left[\exp([X_p^{pre}+1]\beta_p + X_o\beta_o)\right] - E\left[\exp(X_p^{pre}\beta_p + X_o\beta_o)\right] \tag{4}$$

2SME Estimation: Education and Family Size (cont’d)
-- Assuming we have consistent estimates of $\beta_p$ and $\beta_o$ (say $\hat\beta_p$ and $\hat\beta_o$) and taking $X_p^{pre}$ to be the EI version of observable wife’s education ($X_{pi}$), (4) can be consistently estimated using*
$$\widehat{AIE}(1) = \frac{1}{n}\sum_{i=1}^{n}\left\{\exp([X_{pi}+1]\hat\beta_p + X_{oi}\hat\beta_o) - \exp(X_{pi}\hat\beta_p + X_{oi}\hat\beta_o)\right\} \tag{5}$$
where $X_{oi}$ represents the observed vector of controls.
*Note that substituting the observed values ($Y_i$, $X_{pi}$, and $X_{oi}$) for $Y_{X_p^*}$, $X_p^*$, and $X_o$ in (1) will not necessarily yield consistent maximum likelihood estimates (MLE) of $\beta_p$ and $\beta_o$. The specific conditions under which such MLE are consistent are detailed in Terza (2018).
Terza, J.V. (2018): “Regression-Based Causal Analysis from the Potential Outcomes Perspective,” Unpublished Manuscript, Department of Economics, Indiana University Purdue University Indianapolis.
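Estimator (5) is a sample average of differences between two exponential means, which the following Python sketch computes. The data and the "estimates" `beta_p_hat` and `beta_o_hat` are made up for illustration (they are not the Wang and Famoye results), and the function name `aie_hat` is hypothetical.

```python
import math


def aie_hat(x_p, x_o, beta_p_hat, beta_o_hat, delta=1.0):
    """Sample analog of (5): mean over i of
    exp((x_pi + delta) * b_p + x_oi . b_o) - exp(x_pi * b_p + x_oi . b_o)."""
    diffs = []
    for xp, xo in zip(x_p, x_o):
        index_o = sum(x * b for x, b in zip(xo, beta_o_hat))
        after = math.exp((xp + delta) * beta_p_hat + index_o)
        before = math.exp(xp * beta_p_hat + index_o)
        diffs.append(after - before)
    return sum(diffs) / len(diffs)


# Hypothetical data: wife's education, and controls [wife's age, constant].
x_p = [8.0, 10.0, 12.0, 12.0, 14.0, 16.0]
x_o = [[30.0, 1.0], [25.0, 1.0], [41.0, 1.0],
       [35.0, 1.0], [28.0, 1.0], [33.0, 1.0]]
beta_p_hat = -0.05          # assumed first-stage estimate
beta_o_hat = [0.02, 0.5]    # assumed first-stage estimates

aie = aie_hat(x_p, x_o, beta_p_hat, beta_o_hat)
print("estimated AIE of one more year of education:", aie)
```

Because the mean is exponential, each difference factors as $(e^{\hat\beta_p} - 1)$ times the fitted mean, so with a negative $\hat\beta_p$ the estimated AIE is negative.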

2SME Estimation: Education and Family Size (cont’d)
-- The two stages are:
   -- Estimate $\beta = [\beta_p \;\; \beta_o']'$ by Poisson regressing Y on $X_p$ and $X_o$.
   -- Estimate the AIE of an additional year of wife’s education using (5).

Asymptotically Correct Standard Errors (ACSE) for Two-Stage Estimators: Using the Mata DERIV Command
-- The objective here is to show how the Mata DERIV command can be used to simplify the otherwise daunting coding and calculation of ACSE for the class of two-stage estimators of which 2SRI and 2SME are members.
-- For brevity and ease of exposition, I focus here on 2SME estimators.

A Somewhat General Form of the 2SME Estimator
-- Let’s first consider a more general form of the 2SME estimator
$$\widehat{ME} = \frac{\sum_{i=1}^{n}\widehat{me}_i}{n} \tag{6}$$
where $\widehat{me}_i$ is shorthand notation for $me(X_{pi}^{pre}, \Delta, X_{oi}, \hat\pi)$, $\hat\pi$ is the first-stage estimator of $\pi$, and $me(X_p^{pre}, \Delta, X_o, \pi)$ takes one of the following forms:
$$m(1, X_o; \pi) - m(0, X_o; \pi) \tag{6-a}$$
$$m(X_p^{pre}+\Delta, X_o; \pi) - m(X_p^{pre}, X_o; \pi) \tag{6-b}$$
$$\left.\frac{\partial m(a, b; \pi)}{\partial a}\right|_{a = X_p^{pre},\; b = X_o} \tag{6-c}$$
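The three forms (6-a) to (6-c) differ only in how the conditional mean function m is evaluated, which a short Python sketch makes concrete. It assumes the exponential mean from the education example, stacks the parameters as `pi = [beta_p, beta_o...]`, and approximates the derivative in (6-c) by a central difference. All names and numbers are hypothetical.

```python
import math


def m(a, x_o, pi):
    """Assumed conditional mean: exp(a * beta_p + x_o . beta_o)."""
    return math.exp(a * pi[0] + sum(x * b for x, b in zip(x_o, pi[1:])))


def me_ate(x_o, pi):
    """(6-a): effect of moving a binary causal variable from 0 to 1."""
    return m(1.0, x_o, pi) - m(0.0, x_o, pi)


def me_aie(x_pre, delta, x_o, pi):
    """(6-b): effect of an increment delta from the pre-increment value."""
    return m(x_pre + delta, x_o, pi) - m(x_pre, x_o, pi)


def me_ame(x_pre, x_o, pi, h=1e-5):
    """(6-c): derivative of m in its first argument (central difference)."""
    return (m(x_pre + h, x_o, pi) - m(x_pre - h, x_o, pi)) / (2.0 * h)


def two_stage_me(me_values):
    """(6): the sample mean of the individual effects."""
    return sum(me_values) / len(me_values)


# Hypothetical data and first-stage estimates.
x_p = [8.0, 10.0, 12.0, 12.0, 14.0, 16.0]
x_o = [[30.0, 1.0], [25.0, 1.0], [41.0, 1.0],
       [35.0, 1.0], [28.0, 1.0], [33.0, 1.0]]
pi_hat = [-0.05, 0.02, 0.5]

ate = two_stage_me([me_ate(xo, pi_hat) for xo in x_o])
aie = two_stage_me([me_aie(xp, 1.0, xo, pi_hat) for xp, xo in zip(x_p, x_o)])
ame = two_stage_me([me_ame(xp, xo, pi_hat) for xp, xo in zip(x_p, x_o)])
print(ate, aie, ame)
```

Keeping m as a single function and building each effect from it means that changing the model (say, to a probit mean) only requires replacing `m`.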

The 2SME Estimator (cont’d)
-- (6-a) defines the general form of the average treatment effect (ATE)
-- (6-b) defines the general form of the average incremental effect (AIE)
-- (6-c) defines the general form of the average marginal effect (AME)

ACSE for 2SME Estimators
-- In this case, we seek the estimated asymptotically correct variance of $\widehat{ME}$ [i.e., EACV($\widehat{ME}$)], the square root of which is the correct asymptotic standard error.
-- Based on general results for two-stage optimization estimators (2SOE) and the fact that 2SME estimators are 2SOE, Terza (2016a and b) shows that the formulation of EACV($\widehat{ME}$) is given by (7).
Terza, J.V. (2016a): “Simpler Standard Errors for Two-Stage Optimization Estimators,” the Stata Journal, 16, 368-385.
Terza, J.V. (2016b): “Inference Using Sample Means of Parametric Nonlinear Data Transformations,” Health Services Research, 51, 1109-1113.

ACSE for 2SME Estimators (cont’d)
$$\widehat{EACV}(\widehat{ME}) = \left(\frac{\sum_{i=1}^{n}\nabla_\pi \widehat{me}_i}{n}\right)\widehat{AVAR}(\hat\pi)\left(\frac{\sum_{i=1}^{n}\nabla_\pi \widehat{me}_i}{n}\right)' + \frac{\sum_{i=1}^{n}\left(\widehat{me}_i - \widehat{ME}\right)^2}{n} \tag{7}$$
where $\widehat{AVAR}(\hat\pi)$ is the estimated asymptotic covariance matrix of $\hat\pi$, $\nabla_\pi me$ denotes the gradient of me with respect to $\pi$, and $\nabla_\pi \widehat{me}_i$ represents $\nabla_\pi me$ with $X_{pi}$, $X_{oi}$, and $\hat\pi$ substituted for $X_p^{pre}$, $X_o$, and $\pi$, respectively.

ACSE for 2SME Estimators (cont’d)
-- $\widehat{AVAR}(\hat\pi)$ can be obtained directly from the Stata output for the relevant Stata regression command.
-- $\sum_{i=1}^{n}(\widehat{me}_i - \widehat{ME})^2 / n$ is easily calculated using Mata, given that $\widehat{ME} = \sum_{i=1}^{n}\widehat{me}_i / n$ has already been calculated (i.e., $\widehat{me}_i$ and $\widehat{ME}$ are already in hand).
-- Direct calculation of the remaining component of (7), viz. $\sum_{i=1}^{n}\nabla_\pi \widehat{me}_i / n$, requires analytic derivation of $\nabla_\pi me$ and Mata coding of $\nabla_\pi \widehat{me}_i$.
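The three components of (7) can be assembled as in the Python sketch below, with the gradient obtained by central finite differences rather than derived analytically. That substitution is presumably the role the deck's title assigns to Mata's DERIV; here it is an assumption. The effect function, the data, and the matrix `avar_hat` are all made up. How `avar_hat` is scaled, and therefore whether any further division by n is needed before taking the square root, follows Terza (2016a, b) and is not settled by this sketch.

```python
import math


def me_i(pi, xp, xo, delta=1.0):
    """Individual AIE under an exponential mean; pi = [beta_p, beta_o...]."""
    index_o = sum(x * b for x, b in zip(xo, pi[1:]))
    return math.exp((xp + delta) * pi[0] + index_o) - math.exp(xp * pi[0] + index_o)


def num_grad(f, pi, h=1e-6):
    """Central-difference gradient of the scalar function f at pi."""
    grad = []
    for j in range(len(pi)):
        up, dn = pi[:], pi[:]
        up[j] += h
        dn[j] -= h
        grad.append((f(up) - f(dn)) / (2.0 * h))
    return grad


def eacv(pi_hat, avar, x_p, x_o, delta=1.0):
    """Return (ME_hat, value of formula (7)) for the supplied avar matrix."""
    n, k = len(x_p), len(pi_hat)
    me = [me_i(pi_hat, xp, xo, delta) for xp, xo in zip(x_p, x_o)]
    me_bar = sum(me) / n
    # Average gradient: sum_i grad(me_i) / n.
    g = [0.0] * k
    for xp, xo in zip(x_p, x_o):
        gi = num_grad(lambda p: me_i(p, xp, xo, delta), pi_hat)
        g = [a + b / n for a, b in zip(g, gi)]
    # Quadratic form g * avar * g' plus the sample spread of me_i.
    quad = sum(g[r] * avar[r][c] * g[c] for r in range(k) for c in range(k))
    spread = sum((v - me_bar) ** 2 for v in me) / n
    return me_bar, quad + spread


# Hypothetical inputs.
x_p = [8.0, 10.0, 12.0, 12.0, 14.0, 16.0]
x_o = [[30.0, 1.0], [25.0, 1.0], [41.0, 1.0],
       [35.0, 1.0], [28.0, 1.0], [33.0, 1.0]]
pi_hat = [-0.05, 0.02, 0.5]
avar_hat = [[1e-4, 0.0, 0.0],
            [0.0, 1e-6, 0.0],
            [0.0, 0.0, 1e-2]]   # assumed, not from any Stata output

me_bar, variance = eacv(pi_hat, avar_hat, x_p, x_o)
print(me_bar, variance)
```

The only model-specific piece is `me_i`; the gradient never has to be written out by hand.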

ACSE for 2SME Estimators: Education and Family Size
To the above education and family size model we add:
$$X_o = [\text{employed} \;\; \text{eduwe} \;\; \text{agewife} \;\; \text{faminc} \;\; \text{race} \;\; \text{city} \;\; 1]$$
where
   employed = 1 if employed, 0 if not
   agewife = wife’s age in years
   faminc = family income
   race = 1 if wife is white, 0 if not
   city = 1 if the family is situated in a county whose largest city has more than 50K people, 0 if not.

ACSE for 2SME Estimators: Education and Family Size (cont’d)
-- Recall that in this case we seek to estimate the AIE of an additional year of wife’s education using
$$\widehat{AIE}(1) = \frac{1}{n}\sum_{i=1}^{n}\left\{\exp([X_{pi}+1]\hat\beta_p + X_{oi}\hat\beta_o) - \exp(X_{pi}\hat\beta_p + X_{oi}\hat\beta_o)\right\} \tag{8}$$
where $\hat\beta = [\hat\beta_p \;\; \hat\beta_o']'$ is the vector of Poisson parameter estimates.
-- Following Terza (2016b, 2017b), in this example we have
$$\nabla_\beta \widehat{me}_i = \exp([X_{pi}+1]\hat\beta_p + X_{oi}\hat\beta_o)\,[X_{pi}+1 \;\; X_{oi}] - \exp(X_{pi}\hat\beta_p + X_{oi}\hat\beta_o)\,[X_{pi} \;\; X_{oi}] \tag{9}$$
Terza, J.V. (2017b): “Causal Effect Estimation and Inference Using Stata,” the Stata Journal, 17, 939-961.
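A quick way to check a hand-derived gradient such as (9) is to compare it with a finite-difference approximation at a few points. The Python sketch below does this for one observation; the parameter values and regressors are invented, and the function names are hypothetical.

```python
import math


def me_i(beta, xp, xo):
    """Individual AIE(1): exp([xp+1] b_p + xo.b_o) - exp(xp b_p + xo.b_o)."""
    index_o = sum(x * b for x, b in zip(xo, beta[1:]))
    return math.exp((xp + 1.0) * beta[0] + index_o) - math.exp(xp * beta[0] + index_o)


def analytic_grad(beta, xp, xo):
    """Gradient from (9): e1 * [xp + 1, xo] - e0 * [xp, xo]."""
    index_o = sum(x * b for x, b in zip(xo, beta[1:]))
    e1 = math.exp((xp + 1.0) * beta[0] + index_o)
    e0 = math.exp(xp * beta[0] + index_o)
    after = [xp + 1.0] + list(xo)
    before = [xp] + list(xo)
    return [e1 * a - e0 * b for a, b in zip(after, before)]


def numeric_grad(beta, xp, xo, h=1e-6):
    """Central-difference approximation to the same gradient."""
    grad = []
    for j in range(len(beta)):
        up, dn = beta[:], beta[:]
        up[j] += h
        dn[j] -= h
        grad.append((me_i(up, xp, xo) - me_i(dn, xp, xo)) / (2.0 * h))
    return grad


# Hypothetical estimates and one hypothetical observation.
beta_hat = [-0.05, 0.02, 0.5]
xp_i, xo_i = 12.0, [30.0, 1.0]

ga = analytic_grad(beta_hat, xp_i, xo_i)
gn = numeric_grad(beta_hat, xp_i, xo_i)
max_gap = max(abs(a - b) for a, b in zip(ga, gn))
print(ga, gn, max_gap)
```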
