SLIDE 1

Synthetic Difference in Differences

Dmitry Arkhangelsky Susan Athey David Hirshberg Guido Imbens Stefan Wager

  • JSM, August 3rd, 2020.

SLIDE 2

When Berkeley implemented the first soda tax, we compared to San Francisco.

While Berkeley, the first U.S. city to pass a “soda tax,” saw a substantial decline of 0.13 times/day in soda consumption in the months following the tax’s implementation in March 2015, neighboring San Francisco, where a soda-tax measure was defeated, and Oakland saw a 0.03 times/day increase, according to a study published in the American Journal of Public Health.

SLIDE 3

This is how we did it.

[Figure: soda consumption over time for Berkeley, San Francisco, and a “hallucinated” parallel-trends Berkeley counterfactual.]
SLIDE 4

This is a “Difference-in-Differences” estimate

  • We compare Berkeley’s change in consumption to San Francisco’s.

  τ = Y(1)_BK,post − Y(0)_BK,post.

  τ̂ = [Y(1)_BK,post − Y(0)_BK,pre] − [Y(0)_SF,post − Y(0)_SF,pre]

  • Subtracting SF’s change adjusts for a trend in absence of intervention.
  • It works if the cities follow parallel trends in absence of intervention.

  Y(0)_city,time ≈ α_city + β · 1{time = post}.

  • This assumption is strong, but we need it (or more data).
  • We can’t distinguish a treatment effect from a difference in trends.
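As a sketch, here is the 2 × 2 arithmetic with hypothetical pre/post consumption levels; only the two changes (−0.13 for Berkeley, +0.03 for San Francisco) come from the study above.

```python
# Hypothetical consumption levels (times/day); only the changes
# (-0.13 for Berkeley, +0.03 for San Francisco) come from the study.
bk_pre, bk_post = 1.00, 0.87   # Berkeley, before and after the tax
sf_pre, sf_post = 1.00, 1.03   # San Francisco, same periods

# Diff-in-diff: Berkeley's change minus San Francisco's change.
tau_hat = (bk_post - bk_pre) - (sf_post - sf_pre)
print(round(tau_hat, 2))  # -0.16
```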

SLIDE 5

Difference in Differences

Things get interesting when we observe many units over many time periods. We focus on simultaneous adoption:

                                  1, …, T0        T0 + 1, …, T = T0 + T1
  units 1, …, N0                  no treatment    no treatment
  units N0 + 1, …, N = N0 + N1    no treatment    treatment

  • We could still use a parallel trends model: Y(w)_it ∼ α_i + β_t + wτ
  • Least squares in this model is equivalent to a 2 × 2 diff-in-diff applied to the averages of our 4 ‘blocks’
  • But we can see that trends in absence of treatment aren’t parallel
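That equivalence is easy to check numerically; a sketch on a simulated balanced panel (all names and parameters here are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
N, T, N0, T0, tau = 6, 8, 4, 5, 2.0      # units, periods, controls, pre-periods
alpha = rng.normal(size=N)                # unit fixed effects
beta = rng.normal(size=T)                 # time fixed effects
W = np.zeros((N, T)); W[N0:, T0:] = 1.0   # treated block adopts after T0
Y = alpha[:, None] + beta[None, :] + tau * W + rng.normal(scale=0.1, size=(N, T))

# Least squares in the two-way fixed effects model Y_it = a_i + b_t + tau * W_it.
i_idx, t_idx = np.meshgrid(np.arange(N), np.arange(T), indexing="ij")
X = np.column_stack([np.eye(N)[i_idx.ravel()],   # unit dummies
                     np.eye(T)[t_idx.ravel()],   # time dummies
                     W.ravel()])
tau_twfe = np.linalg.lstsq(X, Y.ravel(), rcond=None)[0][-1]

# 2 x 2 diff-in-diff applied to the four block averages.
tau_did = ((Y[N0:, T0:].mean() - Y[N0:, :T0].mean())
           - (Y[:N0, T0:].mean() - Y[:N0, :T0].mean()))
print(np.isclose(tau_twfe, tau_did))  # True
```

The dummies are collinear, but the coefficient on W is still uniquely determined, so the minimum-norm `lstsq` solution recovers it.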

SLIDE 6

California’s anti-smoking legislation (Proposition 99)

[Figure: per-capita cigarette consumption, 1970–2000, for California vs. the average of control states.]

A 25 cents/pack excise tax increase took effect in 1989.

California ≈ (1/49) Alaska + (1/49) Alabama + …

SLIDE 7

California’s anti-smoking legislation: Difference-in-Differences

[Figure: the same series, with a diff-in-diff counterfactual line for California.]

If we average and hallucinate a line, it obviously doesn’t fit.

SLIDE 8

Synthetic Controls

  • If California’s pre-treatment trend doesn’t match the average state’s,

compare it to something else.

  • For example, a weighted average of states with a trend that does match.
  • This weighted average of units is called a synthetic control

[Abadie, Diamond, and Hainmueller, 2010]

  • Construction: weight the control units to match pre-treatment outcomes,

      Σ_{n≤N0} ω̂_n Y_nt ≈ Ȳ_treated,t   for all t ≤ T0,

    i.e., the weighted control-unit average at time t matches the treated-unit average at time t.

  • Treatment effects are typically estimated by cross-sectional comparison: the mean post-treatment difference between treated and synthetic control,

      τ̂ = (1/T1) Σ_{t>T0} [ Ȳ_treated,t − Σ_{n≤N0} ω̂_n Y_nt ].
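The construction can be sketched with simplex-constrained least squares on hypothetical data (the solver choice and all numbers here are illustrative assumptions, not the method of Abadie et al.):

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(1)
N0, T0, T1 = 10, 12, 4
Y_control = rng.normal(size=(N0, T0 + T1))                     # toy control outcomes
Y_treated = 0.5 * Y_control[0] + 0.5 * Y_control[1]            # treated = mixture of controls
Y_treated = Y_treated + np.r_[np.zeros(T0), np.full(T1, 3.0)]  # true effect: +3 post-treatment

# Weight the controls to match the treated unit's pre-treatment outcomes.
def pre_mse(w):
    return np.sum((w @ Y_control[:, :T0] - Y_treated[:T0]) ** 2)

res = minimize(pre_mse, np.full(N0, 1.0 / N0), method="SLSQP",
               bounds=[(0.0, 1.0)] * N0,
               constraints={"type": "eq", "fun": lambda w: w.sum() - 1.0})
w_hat = res.x

# Cross-sectional comparison over the post-treatment periods.
tau_hat = np.mean(Y_treated[T0:] - w_hat @ Y_control[:, T0:])
```

Because the treated unit is an exact mixture of two controls here, the recovered effect lands close to the true +3.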

SLIDE 9

California’s anti-smoking legislation: Synthetic Control

[Figure: per-capita cigarette consumption, 1970–2000, for California vs. its synthetic control.]

When comparing to a synthetic control, trends line up better. California ≈ .3 Utah + .2 Nevada + .15 Montana + . . .

SLIDE 10

Improving on Synthetic Control

Instead of constructing a unit for a cross-sectional comparison, construct a unit and time period for a diff-in-diff comparison.

This is a doubly robust version of synthetic control. If the before/after comparison is good, the unit comparison doesn’t have to be. And it’s easier to make them good: constant shifts get differenced out, so constructed parallel trends are as good as overlaid ones.
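A toy illustration of the point about constant shifts, with made-up numbers:

```python
import numpy as np

# Toy series: the control differs from the treated counterfactual by a constant shift.
pre, post = np.array([10.0, 11.0, 12.0]), np.array([13.0, 14.0])
shift = 5.0
treated = np.r_[pre, post + 2.0]    # true effect: +2 after period 3
control = np.r_[pre, post] + shift  # parallel to the counterfactual, but shifted

# Cross-sectional (SC-style) comparison is off by the shift...
sc_gap = treated[3:].mean() - control[3:].mean()
# ...but differencing out the pre-period gap removes it.
did_gap = sc_gap - (treated[:3].mean() - control[:3].mean())
print(sc_gap, did_gap)  # -3.0 2.0
```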

SLIDE 11

California’s anti-smoking legislation: Constructed Parallel Trends

[Figure: cigarette consumption, 1970–2000; series: treated, sdid, sc.]
SLIDE 16

California’s anti-smoking legislation: Double Robustness

[Figure: per-state comparison of unit weights for the sc and sdid estimators.]

SLIDE 17

Implementation

  1. Estimate synthetic control weights ω̂ by simplex-constrained least squares on the pre-treatment data:

       ω̂ = argmin_{ω0, ω} Σ_{t≤T0} (ω0 + ω^T Y_control,t − Ȳ_treated,t)² + ζ² T0 ‖ω‖²
            subject to ω1, …, ω_{N0} ≥ 0 and Σ_{n=1}^{N0} ω_n = 1.

     Use an intercept: we want parallel lines, not overlaid ones. Use a ridge penalty: multicollinearity is typical, and shrinkage helps control variance and own-observation bias.

  2. Estimate time series regression weights λ̂ on the control units.
  3. Estimate τ by (2 × 2) diff-in-diff on weighted block averages.
  4. Form confidence intervals using the jackknife estimate of standard error.
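Steps 1–3 can be sketched in code. This is an illustrative implementation under simplifying assumptions (a generic SLSQP solver, no ridge on the time weights, a toy data-generating process), not the authors’ package:

```python
import numpy as np
from scipy.optimize import minimize

def simplex_lsq(A, b, ridge=0.0):
    """argmin over (c0, c) of ||c0 + A @ c - b||^2 + ridge * ||c||^2,
    with c >= 0 summing to 1 (simplex) and c0 a free intercept."""
    m = A.shape[1]
    def obj(x):
        c0, c = x[0], x[1:]
        return np.sum((c0 + A @ c - b) ** 2) + ridge * np.sum(c ** 2)
    res = minimize(obj, np.r_[0.0, np.full(m, 1.0 / m)], method="SLSQP",
                   bounds=[(None, None)] + [(0.0, 1.0)] * m,
                   constraints={"type": "eq", "fun": lambda x: x[1:].sum() - 1.0})
    return res.x[0], res.x[1:]

def sdid(Y, N0, T0, zeta2=1e-6):
    """Steps 1-3 on an N x T outcome matrix Y whose last N - N0 units
    are treated in the last T - T0 periods."""
    Yc, Yt = Y[:N0], Y[N0:]
    # 1. unit weights: match the treated pre-treatment trajectory up to a shift
    _, omega = simplex_lsq(Yc[:, :T0].T, Yt[:, :T0].mean(axis=0), ridge=zeta2 * T0)
    # 2. time weights: predict control post-treatment means from pre periods
    _, lam = simplex_lsq(Yc[:, :T0], Yc[:, T0:].mean(axis=1))
    # 3. (2 x 2) diff-in-diff on the weighted block averages
    treated = Yt[:, T0:].mean() - (Yt[:, :T0] @ lam).mean()
    control = omega @ Yc[:, T0:].mean(axis=1) - omega @ (Yc[:, :T0] @ lam)
    return treated - control

# Toy panel: additive unit/time effects plus a treatment effect of 1.5.
rng = np.random.default_rng(0)
N, T, N0, T0, tau = 10, 9, 8, 6, 1.5
alpha, beta = rng.normal(size=N), rng.normal(size=T)
Y = alpha[:, None] + beta[None, :] + rng.normal(scale=0.05, size=(N, T))
Y[N0:, T0:] += tau
tau_hat = sdid(Y, N0, T0)
print(round(tau_hat, 2))
```

On a panel with additive unit and time effects, the weighted (2 × 2) diff-in-diff recovers the effect for any simplex weights, which is the robustness the earlier slides describe.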

SLIDE 18

Synthetic Difference-in-Differences

The estimator is a (2 × 2) diff-in-diff on four weighted block averages:

                      synthetic pre-treatment                  post-treatment
  synthetic control   Σ_{n≤N0} Σ_{t≤T0} ω̂_n λ̂_t Y_nt           Σ_{n≤N0} Σ_{t>T0} ω̂_n T1⁻¹ Y_nt
  treated average     Σ_{n>N0} Σ_{t≤T0} N1⁻¹ λ̂_t Y_nt           Σ_{n>N0} Σ_{t>T0} N1⁻¹ T1⁻¹ Y_nt

DID uses equal weights ω_n = 1/N0, λ_t = 1/T0. SC takes only one difference (it uses zero time weights, λ_t = 0).

SLIDE 19

Theory

SLIDE 20

A General Setting

Y_nt = L_nt + W_nt τ_nt + ε_nt,   E[ε | W] = 0

  • L: Matrix of noiseless control potential outcomes
  • τ: Matrix of treatment effects
  • ε: Noise matrix with iid subgaussian rows
  • We have autocorrelation over time
  • But no correlation between units.
  • W indicates the treated block

We estimate the ATT:  τ̄ = (1/(N1 T1)) Σ_{nt} W_nt τ_nt.

  • Typical sample sizes are small, but the setting is ‘high dimensional’.
  • We see various dimension ratios T/N, T1/T0, N1/N0.
  • We lose the essence in asymptotics with too many fixed dimensions.
  • The signal L tends to be multicollinear: no restricted eigenvalue condition!
  • For simplicity, we’ll assume rank(L) ≪ min(N0, T0).

SLIDE 21

What can go wrong?

  • Underfitting: we don’t create parallel trends in pre-treatment outcomes.
  • Overfitting: we do, but by predicting signal from noise.
  • Failed identification: we adjust as intended, but we’re still confounded.

SLIDE 22

Underfitting

It happens, but it tends to be something we can see: e.g., California cigarette consumption with southeastern states as controls.

[Figure: cigarette consumption, 1970–2000, for California and a synthetic California built from southeastern states.]

California ≈ .82 Louisiana + .10 Mississippi + . . .

SLIDE 23

Overfitting

We prove concentration around an oracle estimator to rule out overfitting.

  1. Consider the limits of our unit and time weights: the minimizers of expected (as opposed to empirical) mean squared error,

       ω̃ = argmin_{(ω0, ω) ∈ R×S} Σ_{t≤T0} (ω0 + ω^T L_control,t − L̄_treated,t)² + [trace(Σ) + ζ² T0] ‖ω‖²,

       λ̃ = argmin_{(λ0, λ) ∈ R×S} Σ_{n≤N0} (λ0 + L_n,pre λ − L̄_n,post)² + N0 ‖Σ^{1/2} (λ − ψ)‖².

     We’re in an errors-in-variables model, so implicit ridge penalty terms arise as the expectation of quadratics in the noise matrix ε:

       Σ = E[ε_{n,pre}^T ε_{n,pre}]   (pre-treatment autocovariance matrix),
       ψ = argmin_{v ∈ R^{T0}} E[(ε_{n,pre} v − ε̄_{n,post})²]   (post-on-pre autoregression vector).

  2. The oracle estimator τ̃ uses these in place of the empirical minimizers.
  3. Its error is easy to characterize because these weights are non-random.
  • 3. Its error is easy to characterize because these weights are non-random.

SLIDE 24

Concentration around the oracle

Deviation from the oracle is essentially bilinear in the weight differences:

  τ̂ − τ̃ ≈ (ω̂ − ω̃)^T L_control,pre (λ̂ − λ̃) ≤ ‖ω̂ − ω̃‖ · ‖L_control,pre (λ̂ − λ̃)‖.

Cauchy-Schwarz bounds depend on prediction error and coefficient error. We characterize these using a version of the ‘slow rate’ analysis for the lasso.

  • The simplex is small, so predictions converge at a rate that depends logarithmically on its dimension.
  • With a multicollinear signal L, convergence of coefficients is driven by ridge regularization and improves with its strength ζ.
  • Because we have error in variables, rates can be worse than without, depending on the fit and dispersion (2-norm) of the limiting weights. Ridge regularization helps, as long as the limiting weights still fit the data.

  ‖L_c,p (λ̂ − λ̃)‖ ≲ log(N0) [(T0/Nef)^{1/4} + MSE^{1/4}(λ̃)],   Nef = min{N1, ‖ω̃‖⁻¹},
  ‖ω̂ − ω̃‖ ≲ (√(log T0) / (ζ T^{1/2})) [(N0/Tef)^{1/4} + MSE^{1/4}(ω̃)],   Tef = min{T1, ‖λ̃‖⁻¹}.

SLIDE 25

Oracle bias

The oracle estimator’s bias is caused by changes in the predictive bias of the limiting weights from training to generalization:

  [a_{N1}^T L_tre,post − ω̃^T L_con,post − ω̃0] a_{T1}   (counterfactual post-treatment bias of ω̃)
  − [a_{N1}^T L_tre,pre − ω̃^T L_con,pre − ω̃0] λ̃   (bias of ω̃ over the synthetic pre-treatment period)

This change is small if either:

  • the regressions fit well during training and generalize; or
  • they don’t, but the errors they make are predictable.

I’ve written this in terms of the bias of the unit weights ω̃ above, but there is an analogous decomposition swapping the roles of ω̃ and λ̃. Here a_n = n⁻¹ 1 ∈ R^n.

SLIDE 26

Oracle normality

Our oracle estimator’s error is approximately normal around the oracle bias:

  τ̃ − τ̄ − bias ≈ a_{N1}^T (ε_tre,post a_{T1} − ε_tre,pre λ̃) − ω̃^T (ε_con,post a_{T1} − ε_con,pre λ̃).

The same goes for the real estimator if its deviation from the oracle is negligible, in which case we can estimate variance by resampling units. With autocorrelated noise, variance is reduced by the inclusion of time weights, as they are predictive of the post-treatment noise.
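One common way to resample units is the jackknife from the implementation slide; a generic sketch of the jackknife standard error, given the leave-one-unit-out estimates:

```python
import numpy as np

def jackknife_se(estimates):
    """Jackknife standard error from the n leave-one-out estimates."""
    est = np.asarray(estimates)
    n = est.size
    return np.sqrt((n - 1) / n * np.sum((est - est.mean()) ** 2))

# Sanity check: for the sample mean, the jackknife SE equals the usual s/sqrt(n).
x = np.array([1.0, 2.0, 4.0, 7.0])
n = x.size
loo_means = np.array([np.delete(x, i).mean() for i in range(n)])
se_jack = jackknife_se(loo_means)
se_classic = x.std(ddof=1) / np.sqrt(n)
print(np.isclose(se_jack, se_classic))  # True
```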

SLIDE 27

Minimum wage assignment

[Figure: density of the estimation error for the SDID, SC, and DID estimators.]

Error in simulation based on Bertrand, Duflo, and Mullainathan [2004].

SLIDE 28

Is our identification strategy a problem?

  • Maybe, but it’s unclear what the alternatives are.
  • No consensus on the underlying conceptual assumptions about panel data.
  • Estimators rely on some latent structure that relates units.
  • To compare fundamentally different estimation strategies, we need more ‘minimal’ assumptions.

SLIDE 29

Thank you!

arxiv.org/abs/1812.09970 github.com/davidahirshberg/synthdid

SLIDE 30

References

Alberto Abadie, Alexis Diamond, and Jens Hainmueller. Synthetic control methods for comparative case studies: Estimating the effect of California's tobacco control program. Journal of the American Statistical Association, 105(490), 2010.

Marianne Bertrand, Esther Duflo, and Sendhil Mullainathan. How much should we trust differences-in-differences estimates? The Quarterly Journal of Economics, 119(1):249–275, 2004.
