PS 406 Week 7 Section: Instrumental Variables/2SLS and RDD

PS 406 – Week 7 Section: Instrumental Variables/2SLS and RDD
D.J. Flynn
May 14, 2014

Note: Updated 5/16/14 with sample bootstrap code, and 5/20/14 with example of subsetting on multiple criteria.

D.J. Flynn, PS406 – Week 7 Section, Spring 2014

The IV approach and its assumptions: Endogeneity as a threat to causal inference

- Exogeneity assumption: $E(u \mid X) = 0$
- When $X$ is endogenous, $\hat{\beta}$ will be a mixture of the relationship between $X$ and $Y$ and the relationship between $X$ and $u$. (See Jay's slides 1-15 for more.)
- Common violations:
  - omitted variables
  - measurement error (on $X$)
  - simultaneity/reciprocal causation
  - others...
- Instrumental variables is a method for uncovering the relationship $X \to Y$ in the presence of endogeneity.
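To make the omitted-variable case concrete, here is a minimal simulated sketch. Everything in it (the confounder `w`, the coefficients, the sample size) is made up for illustration and is not from the course data. The true effect of `x` on `y` is set to 2, and leaving `w` out pushes it into the error term.

```r
# Simulated illustration; all names and numbers are hypothetical.
set.seed(406)
n <- 10000
w <- rnorm(n)                          # confounder that will be omitted
x <- 0.8 * w + rnorm(n)                # X depends on w
y <- 1 + 2 * x + 1.5 * w + rnorm(n)    # true effect of X on Y is 2

b_short <- coef(lm(y ~ x))["x"]        # w omitted, so it sits in u and E(u | X) != 0
b_long  <- coef(lm(y ~ x + w))["x"]    # w included, exogeneity restored

b_short   # noticeably above 2: mixes X -> Y with the X-u relationship
b_long    # close to 2
```

In this setup the short regression overstates the effect because `x` and the omitted `w` move together.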

The IV approach and its assumptions: Key challenge: finding credible instruments

- When we want to use IV regression, we first need to identify a credible instrumental variable (or instrument).
- The most common IV set-up is as follows: $Z \to X \to Y$
- An instrument ($Z$) MUST meet three requirements:
  1. doesn't belong in the model (theoretic)
  2. "relevance criterion": $\mathrm{cov}(z, x) \neq 0$
  3. "exclusion restriction": $E(z^{T} u) = 0$
- If these assumptions hold, then $E(\hat{\beta}_{IV}) = \beta$
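Of the three requirements, only relevance can be checked directly in the data. A rough sketch of that check on simulated data follows; the variables and the first-stage coefficient of 0.5 are assumptions for illustration. The exclusion restriction involves the unobserved `u`, so it has to be argued rather than computed.

```r
# Simulated illustration; z, x and the coefficient 0.5 are hypothetical.
set.seed(1)
n <- 2000
z <- rnorm(n)                  # candidate instrument
x <- 0.5 * z + rnorm(n)        # endogenous regressor, partly driven by z

cov_zx <- cov(z, x)            # relevance criterion: should be clearly nonzero

# First-stage regression of X on Z; the F statistic summarizes instrument strength
first_stage <- lm(x ~ z)
f_stat <- summary(first_stage)$fstatistic["value"]

cov_zx
f_stat   # a commonly cited rule of thumb treats F below about 10 as a warning sign
```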

The IV approach and its assumptions: The IV estimator

- Recall the OLS estimate of $\beta$ is: $(X^{T}X)^{-1}X^{T}y = \dfrac{\mathrm{cov}(x, y)}{\mathrm{var}(x)}$
- The IV estimate of $\beta$ is: $(z^{T}x)^{-1}z^{T}y = \dfrac{\mathrm{cov}(z, y)}{\mathrm{cov}(z, x)}$
- Important implications:
  - weak instrument = $\mathrm{cov}(z, x)$ is small, so any covariance between $Z$ and $u$ produces significant bias
  - strong instrument = if $X$ and $u$ are strongly related (which they are; that's why we're doing IV), then $Z$ and $u$ are surely related too = violation of exclusion restriction
- Ideally, you want a "moderately strong" (?) instrument. This is hard to find in practice, so you need to discuss the strengths/limitations of any instrument you use (Jay).
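The weak-instrument point follows from the ratio form: in large samples the IV estimate is roughly $\beta + \mathrm{cov}(z, u)/\mathrm{cov}(z, x)$, so a small denominator magnifies even a small violation of the exclusion restriction. The sketch below illustrates this on simulated data. The helper `sim()`, its `strength` and `leak` arguments, and all coefficients are hypothetical.

```r
# Simulated illustration; the function and its arguments are hypothetical.
set.seed(7)
n <- 100000

iv_ratio <- function(z, x, y) cov(z, y) / cov(z, x)   # IV estimate as a covariance ratio

sim <- function(strength, leak) {
  w <- rnorm(n)                       # unobserved confounder (part of u)
  z <- rnorm(n) + leak * w            # leak != 0 violates the exclusion restriction
  x <- strength * z + w + rnorm(n)    # strength controls cov(z, x)
  y <- 2 * x + 1.5 * w + rnorm(n)     # true beta is 2
  iv_ratio(z, x, y)
}

valid_strong <- sim(strength = 1,    leak = 0)      # valid instrument
leaky_strong <- sim(strength = 1,    leak = 0.05)   # small violation, strong first stage
leaky_weak   <- sim(strength = 0.05, leak = 0.05)   # same violation, weak first stage

c(valid_strong, leaky_strong, leaky_weak)
```

The same small `leak` does far more damage when `strength` is low, which is the sense in which weak instruments are fragile.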

The IV approach and its assumptions: Cool (?) IV examples

1. Ladd (2012): X = talk radio exposure, Y = media trust, Z = miles of daily commute to work
2. Gerber (1998): X = campaign spending, Y = election outcomes, Z = challenger wealth
3. Acemoglu et al. (2001): X = political institutions, Y = economic development, Z = settler mortality rates
4. Card (1995): X = education, Y = earnings, Z = geographic proximity to college/university
...

IV Regression in R: IV regression with the Card data

    install.packages("AER")
    library(AER)
    data(CollegeDistance)
    names(CollegeDistance)
    #keep only the three variables we need and drop incomplete cases:
    clean <- na.omit(data.frame(wage = CollegeDistance$wage,
                                education = CollegeDistance$education,
                                distance = CollegeDistance$distance))

IV Regression in R

    #cov(z,x):
    cov(clean$distance, clean$education)

    #model w/o IV:
    reg <- lm(wage ~ education, data = clean)
    summary(reg)

IV Regression in R

    #incorporating IV:
    install.packages("sem")
    library(sem)
    #format: DV ~ exogenous and endogenous Xs,
    #        ~ exogenous Xs and instruments
    iv.model <- tsls(wage ~ education, ~ distance, data = clean)
    summary(iv.model)

Note: you could, of course, look at subgroups of the population (e.g., those who might be on the margin of going to college). This is just an average effect across all cases in the dataset.

IV Regression in R: 2SLS vs. IV

When you have one instrument per endogenous X, 2SLS collapses to instrumental variables. Here we have one instrument, so we can solve for the coefficient using the IV formula:

IV Regression in R

    ymat <- clean$wage
    #column of 1s (intercept) plus the instrument:
    zmat <- cbind(1, clean$distance)
    #column of 1s (intercept) plus the endogenous X:
    xmat <- cbind(1, clean$education)
    #IV formula: (Z'X)^{-1} Z'y
    solve(t(zmat) %*% xmat) %*% t(zmat) %*% ymat
    #notice coefficients are the same!

When we have 2+ instruments (2SLS), the formula for $\beta$ becomes more complicated:

$\hat{\beta}_{2SLS} = \left[ X^{T} Z (Z^{T} Z)^{-1} Z^{T} X \right]^{-1} \left[ X^{T} Z (Z^{T} Z)^{-1} Z^{T} y \right]$
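As a sketch of the 2+ instrument case, the block below evaluates the 2SLS matrix formula on simulated data with two instruments and compares it with running the two stages by hand with `lm()`. The data, the instruments `z1` and `z2`, and all coefficients are hypothetical and are not part of the CollegeDistance example.

```r
# Simulated illustration; z1, z2 and all coefficients are hypothetical.
set.seed(21)
n  <- 5000
w  <- rnorm(n)                                # unobserved confounder
z1 <- rnorm(n)
z2 <- rnorm(n)
x  <- 0.5 * z1 + 0.5 * z2 + w + rnorm(n)      # endogenous regressor
y  <- 1 + 2 * x + 1.5 * w + rnorm(n)          # true beta is 2

X <- cbind(1, x)          # intercept + endogenous X
Z <- cbind(1, z1, z2)     # intercept + both instruments

# Z (Z'Z)^{-1} Z'X: the projection of X onto the instruments
PZ_X <- Z %*% solve(t(Z) %*% Z, t(Z) %*% X)

# [X'Z (Z'Z)^{-1} Z'X]^{-1} [X'Z (Z'Z)^{-1} Z'y]
b_matrix <- solve(t(X) %*% PZ_X, t(PZ_X) %*% y)

# Same point estimates from two explicit stages
x_hat <- fitted(lm(x ~ z1 + z2))
b_two_stage <- coef(lm(y ~ x_hat))

all.equal(as.numeric(b_matrix), as.numeric(b_two_stage))
```

The two routes agree on the coefficients, but the standard errors reported by the hand-rolled second-stage `lm()` are not the correct 2SLS standard errors, which is one reason to use a dedicated routine such as `tsls()`.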

RDD in R: RDD using the Israeli class size data (data file available on BB)

Back to the Maimonides' Rule data: classes with 40-49 students get 1 teacher and an aide, classes with 50+ students get an additional teacher.

    library(foreign)  #provides read.dta()
    setwd()           #fill in the folder that contains final5.dta
    final5 <- read.dta("final5.dta")
    #here we're using a bandwidth of +/-5 students:
    tempdata <- subset(final5, abs(final5$c_size - 50) <= 5)
    summary(tempdata$c_size)

RDD in R

    #subsetting on multiple criteria:
    #note: classes with exactly 50 students fall in neither subset
    tempabove <- subset(tempdata, c_size > 50 & c_size <= 55)
    tempbelow <- subset(tempdata, c_size < 50 & c_size >= 45)

    #regression for cases above threshold:
    above.lm <- lm(avgmath ~ c_size, data = tempabove)
    summary(above.lm)

    #regression for cases below threshold:
    below.lm <- lm(avgmath ~ c_size, data = tempbelow)
    summary(below.lm)

Note: This slide was updated 5/20/14 with a more helpful example of subsetting on multiple criteria.

RDD in R

    #fitted value for tempabove at c_size=50:
    above.predict <- above.lm$coef[1] + above.lm$coef[2]*50

    #fitted value for tempbelow at c_size=50:
    below.predict <- below.lm$coef[1] + below.lm$coef[2]*50

    #causal effect:
    above.predict - below.predict
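The same fitted-value-at-the-threshold logic can be tried on simulated data where the true jump is known. This is a sketch only: the variables `size` and `score`, the jump of 4, and the choice to count classes of exactly 50 as above the threshold are all assumptions made for the illustration, not features of final5.dta. It uses `predict()` in place of adding the coefficients by hand.

```r
# Simulated illustration; variable names, the jump of 4, and the >= 50 rule are hypothetical.
set.seed(27)
n <- 2000
size  <- sample(45:55, n, replace = TRUE)      # running variable within +/-5 of the cutoff
score <- 60 - 0.3 * (size - 50) + 4 * (size >= 50) + rnorm(n, sd = 2)
sim   <- data.frame(size, score)

# Separate linear fits on each side of the threshold
above <- lm(score ~ size, data = subset(sim, size >= 50))
below <- lm(score ~ size, data = subset(sim, size < 50))

# Fitted values from each model at the threshold itself
at_cut <- data.frame(size = 50)
jump <- unname(predict(above, at_cut) - predict(below, at_cut))

jump   # estimate of the discontinuity; the simulated truth is 4
```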

RDD in R: Bootstrapping the RDD effect estimate

    #make clean dataset:
    clean <- na.omit(data.frame(avgmath = tempdata$avgmath,
                                c_size = tempdata$c_size))

    #make results vector:
    results <- vector(mode = "numeric", length = 1000)

    for (i in 1:1000) {
      permdata <- clean
      #sample from c_size and store values:
      permdata$c_size.temp <- sample(permdata$c_size, nrow(permdata), replace = TRUE)
      #create subsets around threshold and run models:
      permabove <- subset(permdata, c_size.temp > 50 & c_size.temp <= 55)
      permbelow <- subset(permdata, c_size.temp < 50 & c_size.temp >= 45)
      #(loop continues on the next slide)

RDD in R

      above.lm.perm <- lm(avgmath ~ c_size.temp, data = permabove)
      below.lm.perm <- lm(avgmath ~ c_size.temp, data = permbelow)
      #store fitted values for each model:
      fitted.above <- above.lm.perm$coef[1] + above.lm.perm$coef[2]*50
      fitted.below <- below.lm.perm$coef[1] + below.lm.perm$coef[2]*50
      #store difference:
      results[i] <- fitted.above - fitted.below
    }

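RDD in R

    #look at distribution of effect estimates:
    summary(results)
    hist(results)

    #central 95% of the resampled estimates:
    quantile(results, c(0.025, 0.975))

    #p-value: share of resampled estimates at least as large (in absolute value)
    #as the observed effect from the original data. Note that fitted.above and
    #fitted.below only hold the values from the last loop iteration, so compare
    #against above.predict - below.predict instead:
    sum(abs(results) > abs(above.predict - below.predict)) / length(results)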
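The loop above redraws `c_size` on its own, which breaks its link with `avgmath`, so `results` describes what the estimate looks like when class size is unrelated to scores. If the goal is instead an interval around the estimate itself, one alternative (a different scheme from the one on these slides) is to resample whole rows. Below is a minimal sketch on simulated data. The helper `rd_jump()`, the variable names, the jump of 4, and the `>= 50` rule are all hypothetical.

```r
# Simulated illustration of a case (row) bootstrap; all names and numbers are hypothetical.
set.seed(33)
n <- 2000
size  <- sample(45:55, n, replace = TRUE)
score <- 60 - 0.3 * (size - 50) + 4 * (size >= 50) + rnorm(n, sd = 2)
sim   <- data.frame(size, score)

# Difference in fitted values at the cutoff from separate fits on each side
rd_jump <- function(d, cutoff = 50) {
  above <- lm(score ~ size, data = d[d$size >= cutoff, ])
  below <- lm(score ~ size, data = d[d$size <  cutoff, ])
  at_cut <- data.frame(size = cutoff)
  unname(predict(above, at_cut) - predict(below, at_cut))
}

estimate <- rd_jump(sim)

# Resample rows with replacement, keeping each (size, score) pair together
boot <- replicate(500, rd_jump(sim[sample(nrow(sim), replace = TRUE), ]))

# Percentile interval around the estimate
ci <- quantile(boot, c(0.025, 0.975))
estimate
ci
```

Keeping each row intact preserves the relationship between the running variable and the outcome, so the spread of `boot` reflects sampling variability in the estimated jump.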
