

SLIDE 1

PS 406 – Week 4 Section: Matching and GLMs for Binary Outcomes

D.J. Flynn April 23, 2014

D.J. Flynn PS406 – Week 4 Section Spring 2014 1 / 21

SLIDE 2

Matching

Intuitive solution to the problem of confounding? Kind of. Relative to experiments...

  • We lose efficiency because we need to estimate more parameters (e.g., when calculating p-scores)
  • SEs aren't straightforward (bootstrapping)
  • Regression often does a better job of replicating experimental findings than matching estimators (Peikes et al. 2008)
  • (My two cents:) A lot of decisions to make. Which covariates to match on? Which matching estimator? etc.

Really good overview: sekhon.polisci.berkeley.edu/papers/annualreview.pdf

SLIDE 3

Setting up the data

library(car)  # for recode()

framing <- read.csv("~/Downloads/framing-exp-data.csv")

# Running example: effect of PID on support for renewables
framing$support.renew <- recode(framing$renewables, "1:4=0; 5:7=1",
                                as.factor.result=FALSE)
framing$dem <- recode(framing$party, "1=1; else=0", as.factor.result=FALSE)

framing.new <- na.omit(data.frame(
  TA=framing$TA, condition=framing$condition, renewables=framing$renewables,
  gmf=framing$gmf, sex=framing$sex, year=framing$year, party=framing$party,
  understand=framing$understand, interest=framing$interest,
  dem=framing$dem, support.renew=framing$support.renew))

SLIDE 4

Estimating GLMs in R

# Let's use logit:
logit <- glm(support.renew ~ as.factor(TA) + understand + interest,
             family=binomial(link="logit"), data=framing.new)
summary(logit)

Coefficients:
                Estimate Std. Error z value Pr(>|z|)
(Intercept)       1.4905     2.1122   0.706   0.4804
as.factor(TA)2   -0.5258     0.9769  -0.538   0.5905
as.factor(TA)3   -0.6315     0.9904  -0.638   0.5237
understand        0.9575     0.4500   2.128   0.0334 *
interest         -0.7423     0.7192  -1.032   0.3020
---
Signif. codes:  0 '***' 0.001 '**' 0.01 '*' 0.05 '.' 0.1 ' ' 1

SLIDE 5

Propensity scores

Recall that a propensity score is the probability that a given unit is assigned to treatment, conditional on covariates: Pr(Di = T | Xi)

framing.new$pscore <- logit$fitted.values
summary(framing.new$pscore)

   Min. 1st Qu.  Median    Mean 3rd Qu.    Max.
 0.6335  0.9214  0.9288  0.9200  0.9619  0.9829

hist(framing.new$pscore)
plot(framing.new$pscore)
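A side note: the fitted values above come from the outcome model on the previous slide, but the conventional propensity-score setup models the treatment indicator (here, dem) as a function of covariates. A minimal base-R sketch of that convention, using simulated data as a stand-in for framing.new (the column names are assumed from the data frame built earlier):

```r
# Simulated stand-in for framing.new (assumed columns: dem, understand, interest).
set.seed(406)
d <- data.frame(dem        = rbinom(100, 1, 0.5),
                understand = rnorm(100),
                interest   = rnorm(100))

# Conventional propensity-score model: regress the treatment indicator
# (not the outcome) on pre-treatment covariates.
ps.model <- glm(dem ~ understand + interest,
                family = binomial(link = "logit"), data = d)

d$pscore <- ps.model$fitted.values   # estimated Pr(dem = 1 | X)
summary(d$pscore)                    # all scores lie strictly in (0, 1)
```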

SLIDE 6

Now we can proceed with matching...

install.packages("Matching")
library(Matching)

SLIDE 7

Pairwise matching

Pairwise matching matches controls to treatment cases with the closest p-score.

match <- with(framing.new, Match(Y=support.renew, Tr=dem,
                                 X=pscore, est="ATT"))
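The nearest-neighbor idea behind pairwise matching can be sketched in a few lines of base R (toy p-scores, not the Match() internals):

```r
# Toy illustration of pairwise (nearest-neighbor) matching on the p-score,
# with replacement -- just the core idea, not what Match() actually runs.
ps.treated <- c(0.80, 0.92, 0.95)         # p-scores of treated units
ps.control <- c(0.60, 0.79, 0.90, 0.96)   # p-scores of control units

# For each treated unit, find the control unit with the closest p-score:
nearest <- sapply(ps.treated, function(p) which.min(abs(ps.control - p)))
nearest   # -> 2 3 4: treated units match the controls at 0.79, 0.90, 0.96
```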

SLIDE 8

summary(match)

Estimate...  0.20402
AI SE......  0.056228
T-stat.....  3.6285
p.val......  0.00028506

Original number of observations..............  100
Original number of treated obs...............   58
Matched number of observations...............   58
Matched number of observations  (unweighted).  221

SLIDE 9

Let’s check the quality of our matches...

# Balance is assessed on the treatment indicator (dem), not the outcome,
# and match.out takes the object returned by Match():
with(framing.new, MatchBalance(dem ~ as.factor(TA) + understand + interest,
                               match.out=match))

[Long List of Info]

  • Successful matching = similar means for treated and control cases
  • Use p-values and the KS bootstrap to gauge balance
  • The Kolmogorov-Smirnov stat is a non-parametric test for the equality of sample distributions (cf., t-test)
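As a reminder of what the KS statistic measures, a standalone toy example with base R's ks.test (two samples from the same distribution versus a shifted one):

```r
# Kolmogorov-Smirnov test: compares two empirical distributions.
set.seed(1)
treated <- rnorm(200, mean = 0)
control <- rnorm(200, mean = 0)   # same distribution as 'treated'
shifted <- rnorm(200, mean = 1)   # distribution shifted by one SD

ks.test(treated, control)   # p-value typically large: no evidence of difference
ks.test(treated, shifted)   # p-value tiny: distributions clearly differ
```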

SLIDE 10

Caliper matching

Caliper matching specifies a maximum acceptable distance between propensity scores, measured in standard deviations (e.g., we don't want to match two cases that are very different, even if the match is the "best" possible in the data).

caliper <- with(framing.new, Match(Y=support.renew, Tr=dem,
                                   X=pscore, est="ATT", caliper=0.10))

SLIDE 11

summary(caliper)

Estimate...  0.1891
AI SE......  0.049145
T-stat.....  3.8479
p.val......  0.00011916

Original number of observations..............  100
Original number of treated obs...............   58
Matched number of observations...............   52
Matched number of observations  (unweighted).  215

Caliper (SDs)................................  0.1
Number of obs dropped by 'exact' or 'caliper'  6

SLIDE 12

Common support matching

Common support matching creates a range where propensity scores for treated and control cases overlap; cases with p-scores outside this range are dropped. Generally, caliper matching is better at dealing with outliers (because common support throws away inliers too). The Matching documentation on CRAN is blunt: "Seriously, don't use it [common support matching]."

common <- with(framing.new, Match(Y=support.renew, Tr=dem,
                                  X=pscore, est="ATT", CommonSupport=TRUE))

SLIDE 13

summary(common)

Estimate...  0.17879
AI SE......  0.052167
T-stat.....  3.4272
p.val......  0.00060973

Original number of observations..............   96
Original number of treated obs...............   55
Matched number of observations...............   55
Matched number of observations  (unweighted).  217

SLIDE 14

Bias Adjustment

Bias-adjusted matching uses regression adjustment to correct the bias that comes from inexact matches on the p-score. (Unlike OLS under its assumptions, matching estimators are not generally unbiased in expectation.)

bias.adj <- with(framing.new, Match(Y=support.renew, Tr=dem,
                                    X=pscore, est="ATT", BiasAdjust=TRUE))

SLIDE 15

summary(bias.adj)

Estimate...  0.21874
AI SE......  0.05897
T-stat.....  3.7094
p.val......  0.00020776

Original number of observations..............  100
Original number of treated obs...............   58
Matched number of observations...............   58
Matched number of observations  (unweighted).  221

SLIDE 16

Exact matching

Under exact matching, only cases with the same p-scores will be matched; others will be discarded. You can specify which covariates to use for exact matches (e.g., no continuous covariates).

exact <- with(framing.new, Match(Y=support.renew, Tr=dem,
                                 X=pscore, est="ATT", exact=TRUE))

SLIDE 17

summary(exact)

Estimate...  0.16667
AI SE......  0.043309
T-stat.....  3.8483
p.val......  0.00011893

Original number of observations..............  100
Original number of treated obs...............   58
Matched number of observations...............   47
Matched number of observations  (unweighted).  204

Number of obs dropped by 'exact' or 'caliper'  11

SLIDE 18

Rosenbaum sensitivity analysis

Matching relies on the propensity score, which we estimated using a vector of covariates X specified a priori. Thus, we want to check how sensitive our effect estimates are to potential unobserved confounders, that is, variables that affect assignment to treatment. Rosenbaum sensitivity analysis helps us do this: it shows how our results might change given different values of a sensitivity parameter.

A short, readable paper on RSA: www.personal.psu.edu/ljk20/rbounds%20vignette.pdf

SLIDE 19

Sensitivity analysis

For binary outcomes:

library(rbounds)
binarysens()

For continuous outcomes:

library(rbounds)
psens()

SLIDE 20

Back to our data: What do we see?

binarysens(bias.adj)

Rosenbaum Sensitivity Test

Unconfounded estimate ....

Gamma   Lower bound   Upper bound
1                     0.00000
2                     0.00000
3                     0.00001
4                     0.00011
5                     0.00057
6                     0.00180

Note: Gamma is Odds of Differential Assignment To
Treatment Due to Unobserved Factors

SLIDE 21

Closing thoughts: GLMs for binary outcomes

We used logit to estimate propensity scores. We also could've used a linear probability model (OLS), probit, cloglog, others...

  • Key takeaway: always use one of these models (not OLS) when your DV is binary, to prevent impossible predicted probabilities (e.g., Pr(turnout) = 1.23???)
  • When you estimate one of these, you're no longer modeling E[Y]. Instead you're modeling Pr(Yi = 1 | Xi)
  • I stick with one (usually logit); that way I can use the "divide by 4" rule to get a rough idea of effect sizes. But always present substantive effects for readers. Otherwise, who cares about the effect of a given X on the log odds of Y??
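A quick base-R sketch of the "divide by 4" rule and the inverse-logit mapping it approximates (the coefficient is the 'understand' estimate from the logit output earlier, reused purely for illustration):

```r
# The "divide by 4" rule: a logit coefficient divided by 4 is an upper
# bound on the change in Pr(Y = 1) per one-unit change in X, because the
# logistic density peaks at 1/4 (where Pr(Y = 1) = 0.5).
b <- 0.9575                          # illustrative coefficient (cf. 'understand')
b / 4                                # rough maximum effect: about 0.24

# The exact inverse-logit change across one unit of X centered where
# Pr(Y = 1) = 0.5; plogis() is R's inverse logit, exp(z) / (1 + exp(z)):
plogis(0.5 * b) - plogis(-0.5 * b)   # close to, and never larger than, b/4
```

This is why sticking with logit is convenient for quick reads of a regression table; the rule has no analogue as clean for probit or cloglog coefficients.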
