SLIDE 1 On Binscatter
Matias D. Cattaneo (Princeton University), Richard K. Crump (Federal Reserve Bank of New York), Max H. Farrell (University of Chicago), and Yingjie Feng (Princeton University)
November 2019
The views expressed here are those of the authors and do not necessarily reflect the position of the Federal Reserve Bank of New York or the Federal Reserve System.
SLIDE 2 Outline
1. Introduction
2. Overview
3. Methodological Contributions
4. Theoretical Contributions
5. Practical Contributions
6. Final Remarks
SLIDE 3 Introduction
Binscatter is widely used in applied microeconomics.
◮ Popularized by Chetty, Friedman, Hilger, Saez, Schanzenbach, and Yagan (2011).
◮ Previous incarnations:
    ◮ Regressogram (Tukey, 1961).
    ◮ Subclassification (Cochran, 1968).
    ◮ Portfolio Sorting (Fama, 1976).
    ◮ Regression Trees (Friedman, 1977).
    ◮ you tell me...
◮ Today: first foundational, thorough study of Binscatter.
    ◮ Methodology: guidance on valid and invalid current practices, and more.
    ◮ Theory: novel strong approximation approach, and more.
    ◮ Practice: new R and Stata software (Binsreg package).
SLIDE 4 What is a binned scatter plot?
Step 1: Start with a familiar scatter plot
[Figure: Y against X]
SLIDE 5 What is a binned scatter plot?
Step 2: Partition the support of X into bins
[Figure: Y against X]
SLIDE 6 What is a binned scatter plot?
Step 3: Find the average Y in each bin
[Figure: Y against X]
SLIDE 7 What is a binned scatter plot?
Step 4: Plot only bin means
[Figure: Y against X]
SLIDE 8 What is a binned scatter plot?
Step 5: Add a polynomial fit to raw data
[Figure: Y against X]
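The five steps above can be sketched in a few lines. This is a minimal illustration on simulated data, not the authors' software: the function name `binned_means`, the simulated design, and the cubic used for the global fit are all assumptions made for the example, and `np.quantile` is used as a convenient stand-in for quantile-spaced bin edges.

```python
import numpy as np

def binned_means(x, y, n_bins=20):
    """Quantile-spaced bins of x; return the mean of x and of y in each bin."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    # Step 2: interior bin edges at empirical quantiles of x.
    edges = np.quantile(x, np.linspace(0, 1, n_bins + 1)[1:-1])
    # Bin index in 0..n_bins-1 for every observation.
    idx = np.searchsorted(edges, x, side="right")
    counts = np.bincount(idx, minlength=n_bins)
    # Step 3: average x and y within each bin.
    x_bar = np.bincount(idx, weights=x, minlength=n_bins) / counts
    y_bar = np.bincount(idx, weights=y, minlength=n_bins) / counts
    return x_bar, y_bar

# Step 1: raw data (simulated here purely for illustration).
rng = np.random.default_rng(0)
x = rng.uniform(0, 1, 5000)
y = np.sin(2 * np.pi * x) + rng.normal(0, 1, 5000)

# Step 4: the points that would be plotted are (x_bar, y_bar).
x_bar, y_bar = binned_means(x, y, n_bins=20)

# Step 5: global polynomial fit to the raw data (a cubic, chosen arbitrarily).
coef = np.polyfit(x, y, deg=3)
```

Plotting `y_bar` against `x_bar` and overlaying `np.polyval(coef, grid)` gives the familiar picture.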
SLIDE 9 Typical Example: Chetty, Friedman and Rockoff (2014, AER)
Note: n = 4,170,905 with number of bins J = 20
SLIDE 11 Overview: Contributions
1. Set up formal, general framework for studying Binscatter.
    ◮ Respects practice: quantile-spaced binning, covariate adjustment.
    ◮ Generalizations: higher-order polynomial, smoothness-restricted approximations.
2. IMSE-optimal choice of binning structure.
3. Valid point estimators, confidence intervals, and confidence bands.
4. Valid hypothesis testing of parametric specification and shape restrictions.
5. New theoretical results specifically developed for binscatter.
6. New R and Stata software resolving valid and invalid current practices.
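SLIDE 33 Framework: Canonical Binscatter
\[ y_i = \mu(x_i) + \varepsilon_i, \qquad \mathbb{E}[\varepsilon_i \mid x_i] = 0. \]
Binscatter:
\[ \hat\mu(x) = \hat b(x)'\hat\beta, \qquad \hat\beta = \arg\min_{\beta} \sum_{i=1}^{n} \big(y_i - \hat b(x_i)'\beta\big)^2. \]
◮ Partitioning/Binning: $\hat\Delta = \{\hat B_1, \dots, \hat B_J\}$, with
\[ \hat B_j = \begin{cases}
\big[x_{(1)},\, x_{(\lfloor n/J \rfloor)}\big) & \text{if } j = 1, \\
\big[x_{(\lfloor n(j-1)/J \rfloor)},\, x_{(\lfloor nj/J \rfloor)}\big) & \text{if } j = 2, \dots, J-1, \\
\big[x_{(\lfloor n(J-1)/J \rfloor)},\, x_{(n)}\big] & \text{if } j = J,
\end{cases} \]
where $x_{(i)}$ denotes the $i$-th order statistic of $x_1, \dots, x_n$.
◮ Within-Bin Constant Approximation:
\[ \hat b(x) = \big[\, \mathbb{1}_{\hat B_1}(x) \;\; \mathbb{1}_{\hat B_2}(x) \;\; \cdots \;\; \mathbb{1}_{\hat B_J}(x) \,\big]'. \]
◮ Dimension: $J$.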
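A quick numerical check may help connect this least-squares formulation to the bin means of the earlier slides: regressing y on the J bin indicators returns exactly the within-bin sample averages. The sketch below uses simulated data, and `np.quantile` edges are an assumed approximation to the order-statistic breakpoints.

```python
import numpy as np

rng = np.random.default_rng(1)
n, J = 1000, 10
x = rng.uniform(0, 1, n)
y = x**2 + rng.normal(0, 0.5, n)

# Quantile-spaced bin index for each observation (0..J-1).
edges = np.quantile(x, np.arange(1, J) / J)
idx = np.searchsorted(edges, x, side="right")

# Indicator basis: row i has a single 1 in the column of its bin.
B = np.zeros((n, J))
B[np.arange(n), idx] = 1.0

# Least-squares coefficients on the indicator basis.
beta_hat, *_ = np.linalg.lstsq(B, y, rcond=None)

# Direct within-bin sample means for comparison.
bin_means = np.array([y[idx == j].mean() for j in range(J)])
```

Because the indicator columns are mutually orthogonal, the normal equations decouple bin by bin, which is why `beta_hat` and `bin_means` agree.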
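SLIDE 35 Framework: Within-Bin Polynomial Approximation
\[ y_i = \mu(x_i) + \varepsilon_i, \qquad \mathbb{E}[\varepsilon_i \mid x_i] = 0. \]
Binscatter:
\[ \hat\mu^{(v)}(x) = \hat b^{(v)}(x)'\hat\beta, \qquad \hat\beta = \arg\min_{\beta} \sum_{i=1}^{n} \big(y_i - \hat b(x_i)'\beta\big)^2. \]
◮ Partitioning/Binning: $\hat\Delta = \{\hat B_1, \dots, \hat B_J\}$.
◮ Within-Bin Polynomial Approximation:
\[ \hat b(x) = \big[\, \mathbb{1}_{\hat B_1}(x) \;\; \mathbb{1}_{\hat B_2}(x) \;\; \cdots \;\; \mathbb{1}_{\hat B_J}(x) \,\big]' \otimes \big[\, 1 \;\; x \;\; \cdots \;\; x^p \,\big]'. \]
◮ Dimension: $(p+1) \cdot J$.
◮ Restrictions: $0 \le v \le p$.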
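The Kronecker structure of this basis is easy to build row by row. The sketch below is an illustration under stated assumptions: the helper `within_bin_poly_basis` is hypothetical, the data are simulated, and `np.quantile` edges stand in for the quantile-spaced partition. Fitting least squares on this basis amounts to a separate degree-p polynomial fit inside each bin.

```python
import numpy as np

def within_bin_poly_basis(x, edges, p):
    """Rows are kron(bin indicators at x_i, [1, x_i, ..., x_i^p])."""
    x = np.asarray(x, dtype=float)
    J = len(edges) + 1
    idx = np.searchsorted(edges, x, side="right")
    ind = np.zeros((x.size, J))
    ind[np.arange(x.size), idx] = 1.0
    powers = np.vander(x, p + 1, increasing=True)   # columns 1, x, ..., x^p
    # Row-wise Kronecker product: column j*(p+1)+k holds ind[:, j] * powers[:, k].
    return (ind[:, :, None] * powers[:, None, :]).reshape(x.size, J * (p + 1))

rng = np.random.default_rng(2)
n, J, p = 1000, 5, 2
x = rng.uniform(0, 1, n)
y = np.exp(x) + rng.normal(0, 0.2, n)

edges = np.quantile(x, np.arange(1, J) / J)
basis = within_bin_poly_basis(x, edges, p)          # shape (n, (p+1)*J)
beta_hat, *_ = np.linalg.lstsq(basis, y, rcond=None)
fitted = basis @ beta_hat
```

With no restrictions across bins, the fitted curve is generally discontinuous at the bin edges, which motivates the smoothness restrictions on the next slide.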
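SLIDE 39 Framework: Across-Bins Smoothness Restriction
\[ y_i = \mu(x_i) + \varepsilon_i, \qquad \mathbb{E}[\varepsilon_i \mid x_i] = 0. \]
Binscatter:
\[ \hat\mu^{(v)}(x) = \hat b_s^{(v)}(x)'\hat\beta, \qquad \hat\beta = \arg\min_{\beta} \sum_{i=1}^{n} \big(y_i - \hat b_s(x_i)'\beta\big)^2. \]
◮ Partitioning/Binning: $\hat\Delta = \{\hat B_1, \dots, \hat B_J\}$.
◮ Across-Bins Smoothness Restriction:
\[ \hat b_s(x) = \hat T_s\, \hat b(x), \qquad \hat b(x) = \big[\, \mathbb{1}_{\hat B_1}(x) \;\; \cdots \;\; \mathbb{1}_{\hat B_J}(x) \,\big]' \otimes \big[\, 1 \;\; x \;\; \cdots \;\; x^p \,\big]'. \]
◮ Dimension of $\hat T_s$: $[(p+1)J - (J-1)s] \times (p+1)J$.
◮ Restrictions: $0 \le s, v \le p$.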
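To make the dimension count concrete, consider the special case p = 1 and s = 1: within-bin linear fits constrained to be continuous at the J − 1 interior bin edges. That function space is the space of linear splines, which has dimension (p + 1)J − (J − 1)s = J + 1. The sketch below does not construct the matrix T_s from the slide; it swaps in the standard truncated-power parameterization of the same space, and the helper name `linear_spline_basis` is hypothetical.

```python
import numpy as np

def linear_spline_basis(x, knots):
    """Truncated-power basis [1, x, (x-k_1)_+, ..., (x-k_{J-1})_+]."""
    x = np.asarray(x, dtype=float)
    cols = [np.ones_like(x), x] + [np.clip(x - k, 0.0, None) for k in knots]
    return np.column_stack(cols)

p, s, J = 1, 1, 5
rng = np.random.default_rng(5)
x = rng.uniform(0, 1, 2000)

# Interior bin edges (quantile-spaced) act as the spline knots.
knots = np.quantile(x, np.arange(1, J) / J)
Bs = linear_spline_basis(x, knots)

# Dimension implied by the slide's formula for the restricted basis.
dim_formula = (p + 1) * J - (J - 1) * s
rank = np.linalg.matrix_rank(Bs)
```

Each continuity constraint removes one free coefficient, so the unrestricted 2J coefficients drop to J + 1.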
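SLIDE 46 Framework: Covariate Adjustment
\[ y_i = \mu(x_i) + w_i'\gamma + \epsilon_i, \qquad \mathbb{E}[\epsilon_i \mid x_i, w_i] = 0. \]
Covariate-Adjusted Binscatter:
\[ \hat\mu^{(v)}(x) = \hat b_s^{(v)}(x)'\hat\beta, \qquad \big[\hat\beta', \hat\gamma'\big]' = \arg\min_{\beta,\gamma} \sum_{i=1}^{n} \big(y_i - \hat b_s(x_i)'\beta - w_i'\gamma\big)^2. \]
◮ Partitioning/Binning: $\hat\Delta = \{\hat B_1, \dots, \hat B_J\}$.
◮ Binscatter Basis: $\hat b_s(x)$.
◮ Dimension: $[(p+1)J - (J-1)s] + d$.
◮ Restrictions: $0 \le s, v \le p$.
Residualized Binscatter (a No, No!):
\[ \tilde\mu(x) = \tilde b(x)'\tilde\beta, \qquad \tilde\beta = \arg\min_{\beta} \sum_{i=1}^{n} \big(\tilde y_i - \tilde b(\tilde x_i)'\beta\big)^2, \]
where $\tilde y_i = y_i - \tilde w_i'\hat\delta_{y.w}$ and $\tilde x_i = x_i - \tilde w_i'\hat\delta_{x.w}$, with $\tilde w_i = (1, w_i')'$ and $\hat\delta_{y.w}$, $\hat\delta_{x.w}$ the least-squares coefficients from regressing $y_i$ and $x_i$ on $\tilde w_i$.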
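A small simulation can illustrate why the slide flags residualization. The design below is entirely hypothetical (a quadratic µ, a scalar covariate w correlated with x, γ = 1), and within-bin constants are used for both estimators. Under this design the covariate-adjusted coefficients should track the bin averages of µ(x), while the residualized bin means are computed against the residualized regressor rather than x itself and should come out visibly flatter.

```python
import numpy as np

def quantile_bin_index(x, J):
    """Bin index 0..J-1 from quantile-spaced edges (np.quantile as a stand-in)."""
    edges = np.quantile(x, np.arange(1, J) / J)
    return np.searchsorted(edges, x, side="right")

rng = np.random.default_rng(3)
n, J = 20000, 20
x = rng.uniform(-1, 1, n)
w = x + rng.normal(0, 0.5, n)            # covariate correlated with x
mu = x**2                                # true nonlinear mu(x)
y = mu + 1.0 * w + rng.normal(0, 0.5, n)

# Covariate-adjusted binscatter: regress y on bin indicators and w jointly.
idx = quantile_bin_index(x, J)
B = np.zeros((n, J))
B[np.arange(n), idx] = 1.0
coef, *_ = np.linalg.lstsq(np.column_stack([B, w]), y, rcond=None)
beta_adj, gamma_hat = coef[:J], coef[J]
mu_bin = np.array([mu[idx == j].mean() for j in range(J)])

# Residualized binscatter: partial w (and a constant) out of both y and x,
# then bin the residualized x and average the residualized y.
W = np.column_stack([np.ones(n), w])
y_res = y - W @ np.linalg.lstsq(W, y, rcond=None)[0]
x_res = x - W @ np.linalg.lstsq(W, x, rcond=None)[0]
idx_res = quantile_bin_index(x_res, J)
y_res_bar = np.array([y_res[idx_res == j].mean() for j in range(J)])

# Compare how much of the shape of mu each approach retains.
adj_range = beta_adj.max() - beta_adj.min()
res_range = y_res_bar.max() - y_res_bar.min()
```

Comparing `beta_adj` with `mu_bin`, and `adj_range` with `res_range`, shows the difference in this one design; it is an illustration, not a general result.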
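SLIDE 50 IMSE-Optimal Partitioning/Binning
\[ \hat\mu^{(v)}(x) = \hat b_s^{(v)}(x)'\hat\beta, \qquad \big[\hat\beta', \hat\gamma'\big]' = \arg\min_{\beta,\gamma} \sum_{i=1}^{n} \big(y_i - \hat b_s(x_i)'\beta - w_i'\gamma\big)^2. \]
◮ Partitioning/Binning: $\hat\Delta = \{\hat B_1, \dots, \hat B_J\}$, with $\hat B_j = \big[x_{(\lfloor n(j-1)/J \rfloor)},\, x_{(\lfloor nj/J \rfloor)}\big)$.
◮ IMSE Expansion:
\[ \int \big(\hat\mu^{(v)}(x) - \mu^{(v)}(x)\big)^2 f(x)\,dx \;\approx_P\; \frac{J^{1+2v}}{n}\, \mathscr{V}_n(p,s,v) + J^{-2(p+1-v)}\, \mathscr{B}_n(p,s,v). \]
◮ IMSE-optimal choice:
\[ J_{\mathrm{IMSE}} = \left( \frac{2(p-v+1)\, \mathscr{B}_n(p,s,v)}{(1+2v)\, \mathscr{V}_n(p,s,v)} \right)^{\frac{1}{2p+3}} n^{\frac{1}{2p+3}}. \]
◮ Result handles estimated quantiles. Evenly-spaced binning also studied.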
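The closed form follows from minimizing the two-term expansion over J: setting the derivative to zero gives J^(2p+3) = 2(p − v + 1)·B_n·n / ((1 + 2v)·V_n). The sketch below evaluates that formula for given constants; the function names and the numerical constants are hypothetical, and rounding up to an integer is an assumption added here because a bin count must be whole.

```python
import math

def imse(J, n, p, v, bias_const, var_const):
    """Two-term IMSE approximation: variance term plus squared-bias term."""
    return J ** (1 + 2 * v) / n * var_const + J ** (-2 * (p + 1 - v)) * bias_const

def j_imse(n, p, v, bias_const, var_const):
    """Closed-form minimizer of imse(J, ...) over J > 0."""
    ratio = 2 * (p - v + 1) * bias_const / ((1 + 2 * v) * var_const)
    return (ratio * n) ** (1.0 / (2 * p + 3))

# Hypothetical constants B_n = V_n = 1 with n = 1000, p = 0, v = 0,
# so J_star is the cube root of 2000 (about 12.6).
J_star = j_imse(n=1000, p=0, v=0, bias_const=1.0, var_const=1.0)
J_used = math.ceil(J_star)   # assumption: round up to an integer bin count
```

In practice the constants are unknown and must be estimated from the data, which is what a data-driven selector has to do; this sketch only covers the last arithmetic step.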
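SLIDE 51 Pointwise Inference: Confidence Intervals
\[ \hat T_p(x) = \frac{\hat\mu^{(v)}(x) - \mu^{(v)}(x)}{\sqrt{\hat\Omega(x)/n}}, \qquad 0 \le v, s \le p, \]
\[ \hat\Omega(x) = \hat b_s^{(v)}(x)'\, \hat Q^{-1} \hat\Sigma\, \hat Q^{-1}\, \hat b_s^{(v)}(x), \qquad \hat\Sigma = \frac{1}{n} \sum_{i=1}^{n} \hat b_s(x_i) \hat b_s(x_i)' \big(y_i - \hat b_s(x_i)'\hat\beta - w_i'\hat\gamma\big)^2. \]
◮ Distributional Approximation:
\[ \sup_{u \in \mathbb{R}} \Big| \mathbb{P}\big[\hat T_p(x) \le u\big] - \Phi(u) \Big| \to 0, \qquad \text{for each } x \in \mathcal{X}. \]
◮ Valid Confidence Intervals: $J = J_{\mathrm{IMSE}}$ for $p$, then for $q \ge 1$,
\[ \mathbb{P}\big[\mu^{(v)}(x) \in \hat I_{p+q}(x)\big] \to 1 - \alpha, \qquad \text{for all } x \in \mathcal{X}, \]
where
\[ \hat I_p(x) = \Big[\, \hat\mu^{(v)}(x) \pm c \cdot \sqrt{\hat\Omega(x)/n} \,\Big], \qquad c = \Phi^{-1}(1 - \alpha/2). \]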
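The interval formula can be sketched for a generic basis matrix. The code below is an illustration under assumptions, not the package implementation: it has no covariates, it takes Q̂ to be the sample Gram matrix of the basis (the usual sandwich form, assumed here because the slide does not spell it out), and the usage example evaluates the formula for within-bin constants at a fixed J. It does not implement the slide's validity recipe of choosing J for degree p and then fitting degree p + q.

```python
import numpy as np
from statistics import NormalDist

def pointwise_ci(basis, y, basis_at_x, alpha=0.05):
    """mu_hat(x) +/- c * sqrt(Omega_hat(x)/n) with a heteroskedasticity-robust sandwich."""
    n = len(y)
    beta = np.linalg.solve(basis.T @ basis, basis.T @ y)
    resid = y - basis @ beta
    Q = basis.T @ basis / n                              # assumed Gram-matrix form of Q_hat
    Sigma = (basis * resid[:, None] ** 2).T @ basis / n  # (1/n) sum b_i b_i' e_i^2
    Qinv_b = np.linalg.solve(Q, basis_at_x)              # Q^{-1} b(x)
    omega = Qinv_b @ Sigma @ Qinv_b                      # b(x)' Q^{-1} Sigma Q^{-1} b(x)
    c = NormalDist().inv_cdf(1 - alpha / 2)
    est = basis_at_x @ beta
    half = c * np.sqrt(omega / n)
    return est - half, est + half

# Usage with within-bin constants on simulated data.
rng = np.random.default_rng(4)
n, J = 2000, 10
x = rng.uniform(0, 1, n)
y = np.sin(2 * np.pi * x) + rng.normal(0, 0.5, n)
edges = np.quantile(x, np.arange(1, J) / J)
idx = np.searchsorted(edges, x, side="right")
basis = np.zeros((n, J))
basis[np.arange(n), idx] = 1.0

x0 = 0.55                                                # evaluation point
j0 = int(np.searchsorted(edges, x0, side="right"))
b_x0 = np.zeros(J)
b_x0[j0] = 1.0
lo, hi = pointwise_ci(basis, y, b_x0, alpha=0.05)
```

For the indicator basis the interval is centered at the bin mean containing x0, which gives a simple sanity check on the matrix algebra.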
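SLIDE 53 Uniform Inference
Main Goal: Approximate the “distribution” of the stochastic process
\[ \left\{ \hat T_p(x) = \frac{\hat\mu^{(v)}(x) - \mu^{(v)}(x)}{\sqrt{\hat\Omega(x)/n}} : x \in \mathcal{X} \right\}, \qquad 0 \le v, s \le p. \]
◮ Useful to approximate the distribution of statistics such as
\[ \sup_{x \in \mathcal{X}} |\hat T_p(x)|, \qquad \sup_{x \in \mathcal{X}} \hat T_p(x), \qquad \inf_{x \in \mathcal{X}} \hat T_p(x), \qquad \text{etc.} \]
◮ New strong approximation approach (based on Hungarian construction):
\[ \sup_{x \in \mathcal{X}} \big| \hat T_p(x) - Z_p(x) \big| = o_P(r_n), \qquad Z_p(x) = \frac{\hat b_0^{(v)}(x)'\, T_s'\, Q^{-1} \Sigma^{1/2}}{\sqrt{\Omega(x)}}\, N_K, \]
where $N_K \sim N(0, I_K)$,
    ◮ $\hat Q \approx_P Q$,
    ◮ $\hat T_s \approx_P T_s$,
    ◮ $\hat\Omega(x) \approx_P \Omega(x)$,
etc.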
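SLIDE 57 Uniform Inference: Heuristics of Technical Idea (4 Steps)
1. Hats off, except non-uniform-controlled partitioning scheme:
\[ \sup_{x \in \mathcal{X}} \big| \hat T_p(x) - t_p(x) \big| = o_P(r_n), \qquad t_p(x) = \frac{\hat b_0^{(v)}(x)'\, T_s'\, Q^{-1}}{\sqrt{\Omega(x)}}\, \mathbb{G}_n\big[b_s(x_i)\epsilon_i\big]. \]
2. Coupling to conditional Gaussian Process (Hungarian construction):
\[ \sup_{x \in \mathcal{X}} \big| t_p(x) - z_p(x) \big| = o_P(r_n), \qquad z_p(x) = \frac{\hat b_0^{(v)}(x)'\, T_s'\, Q^{-1}}{\sqrt{\Omega(x)}}\, \mathbb{G}_n\big[b_s(x_i)\sigma(x_i)\eta_i\big]. \]
3. Coupling to unconditional (up to non-uniform partitioning) Gaussian Process:
\[ \sup_{x \in \mathcal{X}} \big| z_p(x) - Z_p(x) \big| = o_P(r_n), \qquad Z_p(x) = \frac{\hat b_0^{(v)}(x)'\, T_s'\, Q^{-1} \Sigma^{1/2}}{\sqrt{\Omega(x)}}\, \eta, \qquad \eta \sim N(0, I_K). \]
4. For example, supremum approximation (with hats back on):
\[ \sup_{u \in \mathbb{R}} \left| \mathbb{P}\Big[ \sup_{x \in \mathcal{X}} |\hat T_p(x)| \le u \Big] - \mathbb{P}\Big[ \sup_{x \in \mathcal{X}} |\hat Z_p(x)| \le u \,\Big|\, \mathbf{D} \Big] \right| \to_P 0, \]
where $\mathbf{D}$ denotes the data.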
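SLIDE 58 Uniform Inference: Confidence Bands
\[ \sup_{u \in \mathbb{R}} \left| \mathbb{P}\Big[ \sup_{x \in \mathcal{X}} |\hat T_p(x)| \le u \Big] - \mathbb{P}\Big[ \sup_{x \in \mathcal{X}} |\hat Z_p(x)| \le u \,\Big|\, \mathbf{D} \Big] \right| \to_P 0, \]
\[ \hat Z_p(x) = \frac{\hat b_s^{(v)}(x)'\, \hat Q^{-1} \hat\Sigma^{1/2}}{\sqrt{\hat\Omega(x)}}\, N_K, \qquad N_K \sim N(0, I_K), \]
where $\mathbf{D}$ denotes the data.
◮ Valid Confidence Band: $J = J_{\mathrm{IMSE}}$ for $p$, then for $q \ge 1$,
\[ \mathbb{P}\big[\mu^{(v)}(x) \in \hat I_{p+q}(x), \text{ for all } x \in \mathcal{X}\big] \to 1 - \alpha, \]
where
\[ \hat I_p(x) = \Big[\, \hat\mu^{(v)}(x) \pm c \cdot \sqrt{\hat\Omega(x)/n} \,\Big], \qquad c = \inf\Big\{ c \in \mathbb{R}_+ : \mathbb{P}\Big[ \sup_{x \in \mathcal{X}} |\hat Z_p(x)| \le c \,\Big|\, \mathbf{D} \Big] \ge 1 - \alpha \Big\}. \]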
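The band's critical value has no closed form, but it can be approximated by simulating the Gaussian vector N_K and taking the supremum over a grid of evaluation points. The sketch below is a minimal version under assumptions: the function name is hypothetical, the supremum over X is replaced by a maximum over the supplied grid, and the Q and Σ matrices in the usage example are made-up diagonal inputs that mimic within-bin constants.

```python
import numpy as np

def sup_t_critical_value(basis_grid, Q, Sigma, alpha=0.05, n_draws=5000, seed=0):
    """(1 - alpha) quantile of max over grid rows of |b(x)' Q^{-1} Sigma^{1/2} N_K| / sqrt(Omega(x))."""
    rng = np.random.default_rng(seed)
    # Symmetric square root of Sigma via its eigendecomposition.
    evals, evecs = np.linalg.eigh(Sigma)
    Sigma_half = (evecs * np.sqrt(np.clip(evals, 0.0, None))) @ evecs.T
    # Each row is b(x)' Q^{-1} Sigma^{1/2}; its squared norm equals Omega(x).
    A = np.linalg.solve(Q, basis_grid.T).T @ Sigma_half
    A = A / np.linalg.norm(A, axis=1, keepdims=True)
    draws = rng.standard_normal((A.shape[1], n_draws))   # columns are draws of N_K
    sup_abs = np.abs(A @ draws).max(axis=0)
    return float(np.quantile(sup_abs, 1 - alpha))

# Hypothetical inputs: J = 10 bins, one grid point per bin.
J = 10
rng = np.random.default_rng(6)
Q = np.eye(J) / J
Sigma = np.diag(rng.uniform(0.5, 1.5, J)) / J
basis_grid = np.eye(J)
c_band = sup_t_critical_value(basis_grid, Q, Sigma, alpha=0.05)
```

The resulting `c_band` replaces the normal quantile in the interval formula, so the band is wider than the pointwise intervals at the same level.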
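SLIDE 62 Uniform Inference: Parametric Specification Testing
\[ \ddot{H}_0 : \sup_{x \in \mathcal{X}} \big| \mu^{(v)}(x) - m^{(v)}(x, \theta) \big| = 0 \ \text{ for some } \theta \in \Theta \qquad \text{vs.} \qquad \ddot{H}_A : \sup_{x \in \mathcal{X}} \big| \mu^{(v)}(x) - m^{(v)}(x, \theta) \big| > 0 \ \text{ for all } \theta \in \Theta. \]
◮ Test statistic: for $\hat\theta$ and $m(\cdot)$ “well-behaved” under $\ddot{H}_0$ and $\ddot{H}_A$,
\[ \ddot{T}_p(x) = \frac{\hat\mu^{(v)}(x) - m^{(v)}(x, \hat\theta)}{\sqrt{\hat\Omega(x)/n}}, \qquad 0 \le v, s \le p. \]
◮ For given $p$ set $J = J_{\mathrm{IMSE}}$, and for $q \ge 1$ set
\[ c = \inf\Big\{ c \in \mathbb{R}_+ : \mathbb{P}\Big[ \sup_{x \in \mathcal{X}} |\hat Z_{p+q}(x)| \le c \,\Big|\, \mathbf{D} \Big] \ge 1 - \alpha \Big\}, \]
where $\mathbf{D}$ denotes the data.
◮ Under $\ddot{H}_0$, then
\[ \lim_{n \to \infty} \mathbb{P}\Big[ \sup_{x \in \mathcal{X}} |\ddot{T}_{p+q}(x)| > c \Big] = \alpha. \]
◮ Under $\ddot{H}_A$, then
\[ \lim_{n \to \infty} \mathbb{P}\Big[ \sup_{x \in \mathcal{X}} |\ddot{T}_{p+q}(x)| > c \Big] = 1. \]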
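SLIDE 63 Uniform Inference: Shape Restriction Testing
\[ \dot{H}_0 : \sup_{x \in \mathcal{X}} \mu^{(v)}(x) \le 0 \qquad \text{vs.} \qquad \dot{H}_A : \sup_{x \in \mathcal{X}} \mu^{(v)}(x) > 0. \]
◮ Test statistic:
\[ \dot{T}_p(x) = \frac{\hat\mu^{(v)}(x)}{\sqrt{\hat\Omega(x)/n}}, \qquad 0 \le v, s \le p. \]
◮ For given $p$ set $J = J_{\mathrm{IMSE}}$, and for $q \ge 1$ set
\[ c = \inf\Big\{ c \in \mathbb{R}_+ : \mathbb{P}\Big[ \sup_{x \in \mathcal{X}} \hat Z_{p+q}(x) \le c \,\Big|\, \mathbf{D} \Big] \ge 1 - \alpha \Big\}, \]
where $\mathbf{D}$ denotes the data.
◮ Under $\dot{H}_0$, then
\[ \lim_{n \to \infty} \mathbb{P}\Big[ \sup_{x \in \mathcal{X}} \dot{T}_{p+q}(x) > c \Big] \le \alpha. \]
◮ Under $\dot{H}_A$, then
\[ \lim_{n \to \infty} \mathbb{P}\Big[ \sup_{x \in \mathcal{X}} \dot{T}_{p+q}(x) > c \Big] = 1. \]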
SLIDE 64
[Figure: plot of Y; legend: constant, linear, quadratic]
SLIDE 65
[Figure: plot of Y; legend: constant, linear, quadratic]
SLIDE 66
                            Half Support (n = 482)             Full Support (n = 1000)
                            Test Statistic   P-value   J       Test Statistic   P-value   J
Parametric Specification
  Constant                      11.716        0.000    12          11.607        0.000    24
  Linear                         2.994        0.092    12           4.968        0.000    24
  Quadratic                      2.392        0.384    12           4.300        0.002    24
Shape Restrictions
  Negativity                     4.069        0.000    12          12.226        0.000    24
  Increasing                    −1.964        0.536    13          −2.168        0.394    13
  Concavity                      2.269        0.316    14           2.544        0.180    14
SLIDE 68 Software Implementation: the binsreg Package
https://sites.google.com/site/nppackages/binscatter/
◮ Implements all estimation, inference, and graphical presentation methods developed in our paper for binscatter and generalizations thereof.
◮ Available in Stata and R.
◮ Companion software article: CCFF (2019, “Binscatter Regressions”).
◮ Three commands/functions:
    ◮ binsreg: point estimation, confidence intervals, confidence band, global polynomial approximations, and more. Main purpose is to generate Binned Scatter Plots.
    ◮ binsregtest: parametric specification and nonparametric shape hypothesis testing.
    ◮ binsregselect: data-driven, IMSE-optimal binning/partitioning selection.
SLIDE 69 Upcoming Upgrades and Extensions
◮ L2 and other metrics for hypothesis testing.
◮ New command/function binsreglincom for testing of linear combinations across subgroups (e.g., H0 : µ1(x) = µ2(x) for all x). For now, see option by() for joint plotting of marginal confidence bands.
◮ New command/function binsxtreg for panel data estimation, inference and binned scatter plots. For now, in Stata use the factor-variable operators i. or ib(). to incorporate fixed effects (as with the regress command).
◮ Handling of formulas in R package.
◮ Recentering of binscatter estimate of µ(x) to account for additional covariates. For now, the package sets additional covariates at zero.
◮ Backwards compatibility with Stata 13. For now, Stata 14 or later is needed.
SLIDE 71 Overview
◮ Binscatter is widely used in applied microeconomics.
◮ Methodological and formal results lagging behind its popularity.
◮ We offer a thorough treatment of canonical binscatter and its generalizations.
    ◮ Formal framework: covariate adjustment, smoothness restrictions, and more.
    ◮ Optimal choice of partitioning/binning.
    ◮ Confidence intervals and confidence bands.
    ◮ Hypothesis testing for shape restrictions and for parametric specifications.
◮ New theoretical results for partitioning-based estimators with random partitions.
◮ Binsreg Package for Stata and R.