 
rdlocrand: Inference in RD Designs Under Local Randomization

Matias Cattaneo, University of Michigan
Rocío Titiunik, University of Michigan
Gonzalo Vazquez-Bare, University of Michigan

https://sites.google.com/site/rdpackages

Stata Conference - Chicago, July 28, 2016

Introduction
- Regression discontinuity designs (RDDs) are one of the most popular methods for causal inference.
- RDDs can be interpreted as a local experiment in a window around the cutoff.
- rdlocrand analyzes RDDs using tools from the literature on classical randomized experiments:
  - rdwinselect: window selection
  - rdrandinf: randomization inference
  - rdsensitivity: sensitivity analysis
  - rdrbounds: Rosenbaum bounds

Regression Discontinuity Designs: motivation
- Many programs or policies are assigned based on whether a score (running variable) X exceeds a threshold c:
  - Scholarship to students above a certain test score.
  - Subsidy to households above a poverty threshold.
- RDDs exploit the discontinuity in the probability of treatment assignment at the cutoff.
- Sharp design: $D_i = \mathbb{1}(X_i \ge c)$.
- Intuition: in a "small" window around the cutoff, units above and below are comparable.

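
The sharp assignment rule and the idea of a window around the cutoff can be sketched in a few lines of Python. This is only an illustration: the function names, the scores, and the half-width 0.75 used here are made up for the example and are not part of rdlocrand.

```python
def sharp_treatment(scores, cutoff):
    """Sharp design: unit i is treated exactly when its score reaches the cutoff."""
    return [1 if x >= cutoff else 0 for x in scores]


def window_sample(scores, cutoff, half_width):
    """Indices of units whose score lies within half_width of the cutoff."""
    return [i for i, x in enumerate(scores) if abs(x - cutoff) <= half_width]


# Hypothetical running variable, for illustration only.
scores = [-2.0, -0.6, -0.1, 0.0, 0.4, 3.0]
d = sharp_treatment(scores, cutoff=0.0)
inside = window_sample(scores, cutoff=0.0, half_width=0.75)
```

Only the units indexed by `inside` would enter a local-randomization analysis; the units far from the cutoff are set aside.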
RDD: intuition
[Figure, shown over four slides: scatter of the outcome (vertical axis, 0 to 4) against the score (horizontal axis, 0.0 to 1.0), with the conditional expectations E[Y_0|X] and E[Y_1|X]; τ marks the jump between them at the cutoff.]

Inference in classical randomized experiments
- RDDs as randomized experiments around the cutoff.
- Key assumption: there exists a window around the cutoff in which treatment is as good as randomly assigned.
- Inference in classical experiments:
  - Fixed (nonrandom) potential outcomes.
  - Known assignment mechanism.
- Randomization (finite-sample) p-value:
  1. Choose a statistic T (e.g. difference in means).
  2. Calculate T for all permutations of the treatment assignment.
  3. Find $P(T \ge T_{\text{obs}})$.
- In Stata: permute d stat = (r(mu_1)-r(mu_2)): ttest y, by(d)

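
The three steps of the randomization p-value can be sketched in Python as follows. This is a minimal illustration, not the rdrandinf implementation: it assumes a fixed-margins mechanism (the number of treated units is held fixed by shuffling the labels), approximates the full set of permutations by random draws, and uses the two-sided version |T| >= |T_obs|. The function names and the toy data are hypothetical.

```python
import random


def diff_in_means(y, d):
    """Difference in mean outcomes, treated minus control."""
    treated = [yi for yi, di in zip(y, d) if di == 1]
    control = [yi for yi, di in zip(y, d) if di == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def permutation_pvalue(y, d, reps=1000, seed=0):
    """Monte Carlo randomization p-value for the sharp null of no effect."""
    rng = random.Random(seed)
    t_obs = abs(diff_in_means(y, d))
    labels = list(d)
    hits = 0
    for _ in range(reps):
        rng.shuffle(labels)  # re-draw the assignment, keeping the group sizes fixed
        # Small tolerance guards against floating-point ties.
        if abs(diff_in_means(y, labels)) >= t_obs - 1e-12:
            hits += 1
    return hits / reps


# Toy data: five control units followed by five treated units.
y = [1.0, 2.0, 3.0, 4.0, 5.0, 11.0, 12.0, 13.0, 14.0, 15.0]
d = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
p_effect = permutation_pvalue(y, d, reps=2000)
p_null = permutation_pvalue([1.0] * 10, d, reps=200)
```

With outcomes this well separated, almost no shuffled assignment produces a difference as large as the observed one, so `p_effect` is small; when every outcome is identical, every assignment ties with the observed statistic and `p_null` equals 1.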
Randomization inference with rdrandinf

    . rdrandinf demvoteshfor2 demmv, wl(-.75) wr(.75)

    Selected window = [-.75 ; .75]

    Running permutation test...
    Permutation test complete.

    Inference for sharp design

    Cutoff c = 0.00      Left of c   Right of c     Number of obs  = 1390
                                                    Order of poly  = 0
    Number of obs              595          702     Kernel type    = uniform
    Eff. Number of obs          15           22     Reps           = 1000
    Mean of outcome         42.808       52.497     Window         = set by user
    S.D. of outcome          7.042        7.742     H0: tau        = 0.000
    Window                  -0.750        0.750     Randomization  = fixed margins

    Outcome: demvoteshfor2. Running variable: demmv.

                                Finite sample   Large sample
    Statistic             T         P>|T|          P>|T|       Power vs d = 3.52
    Diff. in means      9.689       0.001          0.000             0.300

Randomization inference with rdrandinf
[Figure: outcome (vertical axis, 30 to 70) plotted against the Democratic margin of victory (horizontal axis, -1 to 1).]

Choosing the window with rdwinselect

    . rdwinselect demmv $covariates, wmin(.5) wstep(.125) reps(10000)

    Window selection for RD under local randomization

    Cutoff c = 0.00      Left of c   Right of c     Number of obs  = 1390
                                                    Order of poly  = 0
    Number of obs              640          750     Kernel type    = uniform
    1th percentile               6            8     Reps           = 10000
    5th percentile              32           38     Testing method = rdrandinf
    10th percentile             64           75     Balance test   = ttest
    20th percentile            128          150

                        Bal. test    Var. name        Bin. test
    Window length /2    p-value      (min p-value)    p-value     Obs<c   Obs>=c
         0.500           0.268       demvoteshlag2     0.230         9      16
         0.625           0.435       dopen             0.377        13      19
         0.750           0.268       dopen             0.200        15      24
         0.875           0.150       dopen             0.211        16      25
         1.000           0.069       dopen             0.135        17      28
         1.125           0.037       dopen             0.119        19      31
         1.250           0.062       dopen             0.105        21      34
         1.375           0.141       dmidterm          0.539        30      36
         1.500           0.092       dmidterm          0.640        34      39
         1.625           0.113       dmidterm          0.734        37      41

    Variable used in binomial test (running variable): demmv
    Covariates used in balance test: presdemvoteshlag1 population demvoteshlag1 demvoteshlag2
    > demwinprv1 demwinprv2 dopen dmidterm

    Largest recommended window is [-.75; .75] with 39 observations (15 below, 24 above).

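
The logic behind the table (nested windows, one balance test per covariate, minimum p-value per window) can be sketched in Python. This is a rough illustration under stated assumptions, not rdwinselect's internals: it assumes the per-covariate p-values have already been computed, that the level is 0.15, and that the search stops at the first window whose minimum p-value falls below that level. The covariate names and p-values below are hypothetical.

```python
def select_window(pvalues_by_window, level=0.15):
    """Return the largest half-width reached before any covariate looks unbalanced.

    pvalues_by_window: list of (half_width, {covariate: p_value}) pairs,
    sorted by increasing half_width (nested windows).
    Returns None if even the smallest window fails the balance check.
    """
    chosen = None
    for half_width, pvals in pvalues_by_window:
        min_p = min(pvals.values())  # the most unbalanced covariate drives the decision
        if min_p < level:
            break  # assumed rule: stop at the first window that fails
        chosen = half_width
    return chosen


# Hypothetical balance-test p-values for two covariates, "a" and "b".
table = [
    (0.50, {"a": 0.40, "b": 0.30}),
    (0.75, {"a": 0.25, "b": 0.50}),
    (1.00, {"a": 0.05, "b": 0.60}),
    (1.25, {"a": 0.30, "b": 0.70}),
]
recommended = select_window(table)
```

In this toy table the window of half-width 1.25 passes on its own, but it is not selected because the smaller window of half-width 1.00 already failed.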
Choosing the window with rdwinselect
[Figure: minimum p-value from the covariate test (vertical axis, 0 to .3) plotted against window length / 2 (horizontal axis, .5 to 10). The dotted line corresponds to p-value = .15.]

Sensitivity analysis with rdsensitivity
[Figure: p-value (shaded scale from 0 to 1) as a function of w (horizontal axis, .75 to 9.5) and t (vertical axis, 0 to 20).]

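
A grid of p-values like the one in the figure can be sketched in Python. The sketch assumes that w is the window half-width and t is the hypothesized treatment effect, that the null is a constant effect Y_i(1) = Y_i(0) + t (so subtracting t from treated outcomes reduces it to the no-effect null), and that the statistic is the two-sided difference in means. The function names and data are hypothetical, and this is not the rdsensitivity implementation.

```python
import random


def _diff_in_means(y, d):
    treated = [yi for yi, di in zip(y, d) if di == 1]
    control = [yi for yi, di in zip(y, d) if di == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)


def _perm_pvalue(y, d, reps, rng):
    """Two-sided Monte Carlo randomization p-value with fixed group sizes."""
    t_obs = abs(_diff_in_means(y, d))
    labels = list(d)
    hits = 0
    for _ in range(reps):
        rng.shuffle(labels)
        if abs(_diff_in_means(y, labels)) >= t_obs - 1e-12:
            hits += 1
    return hits / reps


def sensitivity_grid(y, x, cutoff, half_widths, nulls, reps=500, seed=0):
    """p-value for every (half-width w, hypothesized effect t) pair."""
    rng = random.Random(seed)
    grid = {}
    for w in half_widths:
        keep = [i for i, xi in enumerate(x) if abs(xi - cutoff) <= w]
        d = [1 if x[i] >= cutoff else 0 for i in keep]
        for t in nulls:
            # Remove the hypothesized effect from the treated units.
            y_adj = [y[i] - t * di for i, di in zip(keep, d)]
            grid[(w, t)] = _perm_pvalue(y_adj, d, reps, rng)
    return grid


# Toy data with a built-in jump of 10 at the cutoff.
x = [-0.5, -0.4, -0.3, -0.2, -0.1, -0.05, 0.05, 0.1, 0.2, 0.3, 0.4, 0.5]
y = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0, 10.0, 11.0, 12.0, 13.0, 14.0, 15.0]
grid = sensitivity_grid(y, x, cutoff=0.0, half_widths=[0.5], nulls=[0, 10])
```

In the toy data the null t = 10 is not rejected while t = 0 is, which is the kind of pattern a sensitivity plot makes visible across many windows at once.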
Rosenbaum bounds with rdrbounds

    . rdrbounds demvoteshfor2 demmv, gammalist(.8 1 1.2) wlist(.5 .75 1) reps(1000)

    Calculating randomization p-values...
                                   w =    0.500    0.750    1.000
    Bernoulli p-value                     0.012    0.001    0.000

    Running sensitivity analysis...
    gamma   exp(gamma)             w =    0.500    0.750    1.000
     0.80      2.23     lower bound       0.006    0.001    0.000
                        upper bound       0.068    0.015    0.002
     1.00      2.72     lower bound       0.004    0.001    0.000
                        upper bound       0.106    0.034    0.006
     1.20      3.32     lower bound       0.003    0.001    0.000
                        upper bound       0.168    0.060    0.017

Other features of rdlocrand
- Alternative statistics: Kolmogorov-Smirnov, rank sum.
- Polynomial adjustment of potential outcomes.
- Randomization-based confidence intervals for the treatment effect.
- Companion R functions with the same capabilities.
- See Cattaneo, Titiunik and Vazquez-Bare (2016): Inference in Regression Discontinuity Designs under Local Randomization. Stata Journal 16(2): 331-367.

Thank you!

Additional material

Other issues: multiple testing
- rdwinselect performs hypothesis tests for a large set of covariates.
- Multiple testing leads to overrejection, so the procedure "errs on the safe side" by selecting smaller windows.
- This is desirable, because the local randomization assumption is only credible in a small window.
- rdwinselect can also test all covariates jointly using Hotelling's $T^2$ test.
  - This typically leads to much larger windows.

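
A back-of-the-envelope calculation shows how strong the overrejection can be. The sketch below assumes, purely for illustration, that the covariate tests are independent (in practice covariates are usually correlated, so the true rate differs); the level 0.15 and the count of 8 covariates are taken from the rdwinselect example in these slides, and the function name is hypothetical.

```python
def prob_any_rejection(alpha, n_tests):
    """Chance that at least one of n_tests independent level-alpha tests rejects
    when every null hypothesis is true."""
    return 1.0 - (1.0 - alpha) ** n_tests


# Eight covariates, each tested at level 0.15.
rate = prob_any_rejection(0.15, 8)
```

Under these assumptions `rate` is roughly 0.73, far above the nominal 0.15, which is why a rule based on the minimum p-value tends to pick small windows.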
Other issues: outcome model adjustment
- The strongest version of the local randomization assumption states that potential outcomes do not depend on the score inside the window:
  - Exclusion restriction: $Y_i(d, x) = Y_i(d)$.
- This assumption may be too strong in some scenarios.
- rdlocrand allows the user to state a polynomial model for the potential outcomes to eliminate the dependence on X.
  - E.g. use a linear model to remove the slope.

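
One way to picture the linear adjustment is to fit a separate line on each side of the cutoff inside the window and compare the two fitted values at the cutoff, instead of comparing raw means. The Python sketch below takes that approach as an assumption; it is not a description of how rdlocrand implements the adjustment, and the function names and data are hypothetical.

```python
def ols_line(x, y):
    """Intercept and slope of the least-squares line of y on x."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    slope = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y)) / sxx
    return mean_y - slope * mean_x, slope


def adjusted_difference(y, x, cutoff):
    """Difference between the two fitted lines evaluated at the cutoff."""
    left = [(xi - cutoff, yi) for xi, yi in zip(x, y) if xi < cutoff]
    right = [(xi - cutoff, yi) for xi, yi in zip(x, y) if xi >= cutoff]
    intercept_left, _ = ols_line(*zip(*left))
    intercept_right, _ = ols_line(*zip(*right))
    return intercept_right - intercept_left


# Toy data: the outcome has slope 2 in the score and a jump of 3 at the cutoff.
x = [-0.4, -0.3, -0.2, -0.1, 0.1, 0.2, 0.3, 0.4]
y = [2 * xi + (3 if xi >= 0 else 0) for xi in x]
raw = sum(y[4:]) / 4 - sum(y[:4]) / 4
adjusted = adjusted_difference(y, x, cutoff=0.0)
```

Here the raw difference in means is about 4 because it absorbs the slope, while the adjusted difference recovers the jump of 3.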