PS 406 Week 8 Section: Panel Methods and Missing Data (D.J. Flynn)

PS 406 – Week 8 Section: Panel Methods and Missing Data
D.J. Flynn
May 21, 2014
PS406 – Week 8 Section, Spring 2014

Today’s plan

1. Panel Methods
   - Review
   - Panel models in R
2. Missing Data
   - Review
   - Multiple imputation in R

Panel Methods: Review
Recap of panel data

- N individuals, T time periods
- We assume there is correlation within each i over time, but independence across i
- Types of regressors:
  - Varying regressors: Xs that vary across i and t (e.g., income)
  - Time-invariant regressors: Xs that vary across i but not t (e.g., race, gender): $X_{it} = X_i \ \forall t$
  - Individual-invariant regressors: Xs that vary across t but not i (e.g., unemployment rate, time trends): $X_{it} = X_t \ \forall i$
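To make the three regressor types concrete, here is a minimal sketch in base R that builds a small made-up panel and checks, for each variable, whether it varies within individuals and within time periods. The data frame `toy`, its columns, and the helper `classify_regressor()` are hypothetical and exist only for this illustration.

```r
# Toy balanced panel: 3 individuals observed in 4 years (all values invented).
toy <- expand.grid(id = 1:3, year = 2001:2004)
toy <- toy[order(toy$id, toy$year), ]
set.seed(1)
toy$income <- round(rnorm(nrow(toy), mean = 50, sd = 10))  # varies over i and t
toy$female <- c(0, 1, 1)[toy$id]                           # fixed within each i
toy$unemp  <- c(5.1, 5.8, 6.0, 5.5)[toy$year - 2000]       # fixed within each t

# A variable is time-invariant if it takes one value per individual,
# and individual-invariant if it takes one value per time period.
classify_regressor <- function(x, id, time) {
  varies_within_id   <- any(tapply(x, id,   function(v) length(unique(v)) > 1))
  varies_within_time <- any(tapply(x, time, function(v) length(unique(v)) > 1))
  if (!varies_within_id && varies_within_time) return("time-invariant")
  if (varies_within_id && !varies_within_time) return("individual-invariant")
  if (varies_within_id && varies_within_time)  return("varying")
  "constant"
}

types <- sapply(toy[c("income", "female", "unemp")],
                classify_regressor, id = toy$id, time = toy$year)
print(types)
```

A check like this is useful before fitting a fixed effects model, since time-invariant regressors are absorbed by the unit effects and cannot be estimated separately.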

Panel Methods: Panel models in R
Panel data models

- Pooled OLS model: OLS applied to panel data (so you’ll end up with N × T observations):
  $y_{it} = \alpha + X_{it}\beta + \epsilon_{it}$
- Individual-specific effects models: we assume heterogeneity across i, which we capture with $\alpha_i$
  - Fixed effects model: individual-specific effects correlated with regressors:
    $y_{it} = \alpha_i + X_{it}\beta + \epsilon_{it}$
  - Random effects model: individual-specific effects uncorrelated with regressors:
    $y_{it} = X_{it}\beta + (\alpha_i + \epsilon_{it})$

Deciding between these requires tests. Let’s look at how to estimate panel models and test assumptions...
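The difference between pooled OLS and the fixed effects model can be seen in a small simulation. The sketch below uses base R only and invented data in which the unit effect is correlated with the regressor; all object names are assumptions made for the example. It estimates the slope three ways: pooled OLS, the within transformation (demeaning by unit), and least-squares dummy variables.

```r
# Simulated panel where the unit effect alpha_i is correlated with x,
# so pooled OLS is biased but the within (fixed effects) estimator is not.
set.seed(406)
n_units   <- 50
n_periods <- 6
id      <- rep(1:n_units, each = n_periods)
alpha_i <- rnorm(n_units)                          # unit-specific effects
x       <- 0.8 * alpha_i[id] + rnorm(n_units * n_periods)
y       <- alpha_i[id] + 2 * x + rnorm(n_units * n_periods)  # true slope is 2

# Pooled OLS ignores alpha_i.
b_pooled <- coef(lm(y ~ x))[["x"]]

# Within transformation: subtract each unit's mean from y and x.
y_dm <- y - ave(y, id)
x_dm <- x - ave(x, id)
b_within <- coef(lm(y_dm ~ x_dm - 1))[["x_dm"]]

# Least-squares dummy variables (one intercept per unit) gives the same slope.
b_lsdv <- coef(lm(y ~ x + factor(id)))[["x"]]

print(c(pooled = b_pooled, within = b_within, lsdv = b_lsdv))
```

The within and dummy-variable slopes agree because demeaning by unit removes exactly what the unit intercepts would absorb. Note that standard errors from the demeaned `lm()` fit are not degrees-of-freedom corrected, which is one reason to use a dedicated panel routine in practice.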

Panel Methods: Panel models in R
More on fixed vs. random effects [1]

Such models assist in controlling for unobserved heterogeneity when this heterogeneity is constant over time and correlated with independent variables... There are two common assumptions made about the individual-specific effect: the random effects assumption and the fixed effects assumption. The random effects assumption (made in a random effects model) is that the individual-specific effects are uncorrelated with the independent variables. The fixed effects assumption is that the individual-specific effect is correlated with the independent variables. If the random effects assumption holds, the random effects model is more efficient than the fixed effects model. However, if this assumption does not hold (i.e., if the Durbin-Wu-Hausman test rejects it), the random effects model is not consistent.

[1] Thanks Wikipedia.
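For reference, the comparison between the two estimators is usually formalized with the Hausman statistic. The following is a sketch in standard textbook notation; the symbols $\hat\beta_{FE}$, $\hat\beta_{RE}$, and $k$ (the number of coefficients being compared) do not appear in the slides.

```latex
H = \left(\hat\beta_{FE} - \hat\beta_{RE}\right)'
    \left[\widehat{\mathrm{Var}}(\hat\beta_{FE}) - \widehat{\mathrm{Var}}(\hat\beta_{RE})\right]^{-1}
    \left(\hat\beta_{FE} - \hat\beta_{RE}\right)
    \;\sim\; \chi^2_k \quad \text{under } H_0
```

Under the null hypothesis that the individual-specific effects are uncorrelated with the regressors, both estimators are consistent and should be close, so a large $H$ (low p-value) is evidence against the random effects assumption.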

Panel Methods: Panel models in R
Panel models in R [2]

    #get lab data:
    library(foreign)
    data <- as.data.frame(dget(file="http://sekhon.berkeley.edu/gov2000/R/agl1.dpt"))
    names(data)

    #Pooled OLS model:
    pooled <- lm(y~left*imports, data=data)
    summary(pooled)

    #PCSEs to correct for contemporaneous correlation across units
    #(e.g., exogenous change affects X_i and X_j at same time):
    library(pcse)
    pcses <- pcse(pooled, groupN=data$country, groupT=data$year)
    summary(pcses)

[2] For more on PCSEs, see http://cran.r-project.org/web/packages/pcse/pcse.pdf .

Panel Methods: Panel models in R
The plm package

    library(plm)
    panel <- pdata.frame(data, index=c("country", "year"))

    #Pooled OLS model (same as above but with plm):
    pooled2 <- plm(y~left*imports, data=panel, model="pooling")
    summary(pooled2)

Panel Methods: Panel models in R
Fixed effects

    #Fixed effects for country:
    within.model <- plm(y~left*imports, data=panel, model="within")
    summary(within.model)

    #effects for each country:
    summary(fixef(within.model))

    #deviation from overall mean:
    summary(fixef(within.model, type="dmean"))

    #Fixed effects for time:
    fe.time <- plm(y~left*imports, data=panel, model="within", effect="time")
    summary(fe.time)

    #Fixed effects for time AND country:
    fe.time.country <- plm(y~left*imports, data=panel, model="within", effect="twoways")

Panel Methods: Panel models in R
Fixed effects, cont’d

    summary(fe.time.country)

    #if you include FE for time and country, you can look at the
    #coefficients on each:
    summary(fixef(fe.time.country, type="dfirst", effect="individual"))
    summary(fixef(fe.time.country, type="dfirst", effect="time"))

    #Testing whether pooling is OK: estimate a variable-coefficients
    #panel model (each unit has its own intercept/slope)
    #and compare it to pooled model:
    summary(pooled2)
    #store the fit under a new name so it does not mask the pvcm() function:
    pvcm.model <- pvcm(y~left+imports+I(left*imports), data=panel, model="within")
    pooltest(pooled2, pvcm.model)
    #low p-value=reject null of poolability (i.e., SHOULDN’T pool)

Panel Methods: Panel models in R
Random effects

Recall: random effects assumes unit/time effects are random variables drawn from a normal distribution and uncorrelated with the regressors. This is in contrast to fixed effects, which allows the effects to be correlated with regressors (characteristics of each i).

    random <- plm(y~left+imports+I(left*imports), data=panel, model="random")
    summary(random)

    #RE for time:
    random2 <- plm(y~left+imports+I(left*imports), data=panel, model="random", effect="time")
    summary(random2)

    #RE for units/time:
    random3 <- plm(y~left+imports+I(left*imports), data=panel, model="random", effect="twoways")
    summary(random3)

Panel Methods: Panel models in R
Random effects, cont’d

    #Comparing fixed/random effects (Hausman test):
    phtest(within.model, random)
    #low p-value=one model is inconsistent, in which case we
    #should go with fixed effects b/c it makes fewer assumptions
    #(e.g., it does not require characteristics of countries to be
    #uncorrelated with X)

    #Testing for unit-specific omitted variables:
    plmtest(pooled2, effect="individual")
    #low p-value=omitted variables

    #Testing for time-specific omitted variables:
    plmtest(pooled2, effect="time")
    #low p-value=omitted variables

    #Checking for autocorrelation (have we minimized it?):
    pdwtest(random)
    pbgtest(random)
    #low p-value=autocorrelation

Panel Methods: Panel models in R
SEs and CIs for the interaction effects

You’ll need to bootstrap the SEs and CIs. I’ll send some sample code out by Friday.
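Until that sample code is available, the general idea can be sketched as follows. This is not the instructor's code: it is a minimal base-R illustration on simulated data, assuming a cluster (block) bootstrap that resamples whole countries and a percentile interval. The data frame `sim`, the helper functions, and the choice of 500 replications are all assumptions made for the example.

```r
# Simulated country-year data with an interaction (all values invented).
set.seed(2014)
n_country <- 20
n_year    <- 15
sim <- data.frame(country = rep(1:n_country, each = n_year))
sim$left    <- rnorm(nrow(sim))
sim$imports <- rnorm(nrow(sim))
u <- rnorm(n_country)[sim$country]               # country-level shock
sim$y <- 1 + 0.5 * sim$left - 0.3 * sim$imports +
  0.4 * sim$left * sim$imports + u + rnorm(nrow(sim))

# Marginal effect of `left` when imports = m: b_left + b_interaction * m.
marginal_effect <- function(d, m) {
  b <- coef(lm(y ~ left * imports, data = d))
  unname(b["left"] + b["left:imports"] * m)
}

# Resample whole countries with replacement to respect within-country dependence.
rows_by_country <- split(seq_len(nrow(sim)), sim$country)
boot_once <- function(m) {
  drawn <- sample(names(rows_by_country), replace = TRUE)
  marginal_effect(sim[unlist(rows_by_country[drawn]), ], m)
}

m_value <- 1                                     # evaluate the effect at imports = 1
boot_me <- replicate(500, boot_once(m_value))

est <- marginal_effect(sim, m_value)
se  <- sd(boot_me)                               # bootstrap standard error
ci  <- quantile(boot_me, c(0.025, 0.975))        # percentile interval
print(c(estimate = est, se = se, ci))
```

Repeating the last block over a grid of `m_value` gives the marginal effect of `left` and its interval across the range of `imports`, which is usually what one wants to plot for an interaction.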

Missing Data: Review
Types of missing data

1. MCAR [3]: missing and non-missing cases are representative subsets of a larger population; can use listwise deletion:
   $\Pr(R \mid Y, X) = \Pr(R)$
2. MAR [4]: missing and non-missing cases differ on some X, but not on the variable that has missing data. Whether missingness depends on observables is testable (regress R on $Y_O$ and X), although MAR itself cannot be verified from the observed data alone:
   $\Pr(R \mid Y, X) = \Pr(R \mid Y_O, X)$
3. Non-Ignorable Missingness [5]: probability of missingness depends on the unobserved value:
   $\Pr(R \mid Y, X) \neq \Pr(R \mid Y_O, X)$

[3] Example: People flip a coin to decide whether to participate in study.
[4] Example: Republicans less willing to fill out gov’t housing survey, but missingness is independent of housing status.
[5] Example: Racial conservatives refuse to answer questions about racial attitudes.
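The practical consequences of the three mechanisms can be illustrated with a small simulation. The sketch below uses base R and invented data: it deletes about half of the outcome values under each mechanism and compares the listwise-deletion mean of y with a mean that adjusts for the observed covariate x. All names and the specific missingness functions are assumptions chosen for the example.

```r
set.seed(8)
n <- 20000
x <- rnorm(n)                  # fully observed covariate
y <- 1 + x + rnorm(n)          # outcome; its true mean is 1

# miss = 1 means y is missing. Each mechanism removes about half the cases.
mechanisms <- list(
  MCAR = rbinom(n, 1, 0.5),                  # Pr(R) is constant
  MAR  = rbinom(n, 1, plogis(2 * x)),        # Pr(R) depends on observed x
  NI   = rbinom(n, 1, plogis(2 * (y - 1)))   # Pr(R) depends on y itself
)

# Listwise deletion: mean of y among the observed cases.
mean_observed <- function(miss) mean(y[miss == 0])

# Adjusting for x: fit y ~ x on observed cases, then average the
# predictions over everyone. This is valid under MAR given x.
mean_adjusted <- function(miss) {
  observed <- data.frame(x = x, y = y)[miss == 0, ]
  fit <- lm(y ~ x, data = observed)
  mean(predict(fit, newdata = data.frame(x = x)))
}

results <- rbind(
  listwise   = sapply(mechanisms, mean_observed),
  adjusted_x = sapply(mechanisms, mean_adjusted)
)
print(round(results, 2))
```

In this setup listwise deletion is fine only under MCAR, conditioning on x repairs the MAR case, and neither approach recovers the true mean when missingness depends on the unobserved value itself.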

Missing Data: Multiple imputation in R
The mi package [6]

    #We’ll use the nes02.csv data (on site);
    #this assumes the file is in your working directory:
    nes02 <- read.csv("nes02.csv")
    names(nes02)

    #Linear probability model of vote choice in 2002
    #(lm() fits OLS, not a probit):
    model <- lm(vote.gop~as.factor(pid)+interest+age+white+female, data=nes02)
    summary(model)

    #Let’s see how many NAs we have:
    summary(as.factor(nes02$vote.gop))
    summary(as.factor(nes02$pid))
    summary(as.factor(nes02$interest))
    summary(as.factor(nes02$age))
    summary(as.factor(nes02$white))
    summary(as.factor(nes02$female))

[6] For more on imputation, see http://cran.r-project.org/web/packages/mi/mi.pdf .

Missing Data: Multiple imputation in R

    library(mi)
    #note: mi.info() and lm.mi() belong to the mi interface as of 2014;
    #mi 1.0 and later use missing_data.frame() and pool() instead

    #tell R which variables to impute:
    nes02.temp <- with(nes02, data.frame(vote.gop=vote.gop, pid=pid,
                                         interest=interest, age=age,
                                         white=white, female=female))

    #info about variables and any NAs:
    nes02.temp.info <- mi.info(nes02.temp)
    nes02.temp.info

    #formulas R will use to impute missings:
    nes02.temp.info$imp.formula

    #run the imputation (I do just n.imp=3 here to save time)
    #(this took me ~4 minutes):
    nes02mi <- mi(nes02.temp, nes02.temp.info, n.imp=3, n.iter=5000)

    #re-estimate model on imputed dataset (notice syntax):
    model.imp <- lm.mi(vote.gop~as.factor(pid)+interest+age+white+female, nes02mi)

Missing Data: Multiple imputation in R

    #compare results across original model (with NAs) and
    #imputed model:
    summary(model)
    display(model.imp)
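Multiple-imputation software typically combines the estimates from the separate imputed datasets using Rubin's rules: the pooled coefficient is the average across imputations, and its variance is the average within-imputation variance plus an inflated between-imputation variance. The sketch below shows that arithmetic in base R on simulated data. The helper `pool_rubin()` and the deliberately crude imputation step are assumptions for illustration and are not what the mi package does internally.

```r
# Rubin's rules for one coefficient estimated on m imputed datasets.
pool_rubin <- function(estimates, variances) {
  m     <- length(estimates)
  q_bar <- mean(estimates)           # pooled point estimate
  w     <- mean(variances)           # within-imputation variance
  b     <- var(estimates)            # between-imputation variance
  total <- w + (1 + 1 / m) * b       # total variance
  c(estimate = q_bar, se = sqrt(total))
}

# Simulated data (all values invented): about 30% of x is missing at random.
set.seed(17)
n <- 500
x <- rnorm(n)
y <- 1 + 0.5 * x + rnorm(n)          # true slope is 0.5
x_obs <- ifelse(rbinom(n, 1, 0.3) == 1, NA, x)

# Crude imputation for illustration only: predict x from y using the
# complete cases and add residual noise. Real MI software also draws
# the imputation-model parameters.
impute_once <- function() {
  is_missing <- is.na(x_obs)
  fit   <- lm(x_obs ~ y)             # lm() drops the NA rows by default
  pred  <- coef(fit)[[1]] + coef(fit)[[2]] * y[is_missing]
  x_imp <- x_obs
  x_imp[is_missing] <- pred + rnorm(sum(is_missing), sd = summary(fit)$sigma)
  fit_imp <- lm(y ~ x_imp)
  c(est = coef(fit_imp)[["x_imp"]], var = vcov(fit_imp)["x_imp", "x_imp"])
}

m      <- 5
draws  <- replicate(m, impute_once())          # 2 x m matrix: rows est, var
pooled <- pool_rubin(draws["est", ], draws["var", ])
print(pooled)
```

The pooled standard error is larger than the typical single-dataset standard error because it also reflects the uncertainty about the imputed values, which is the main reason to use several imputations rather than one.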

Summing up everything ....
