

SLIDE 1

Estimating average causal effects under general interference between units

Peter M. Aronow and Cyrus Samii

Yale University and New York University

May 23, 2012

1 / 44

SLIDE 2

Randomized experiments often involve treatments that may induce “interference between units.”

Interference: the outcome for unit i depends on the treatment assigned to unit j.

If we administer a treatment to unit j, what are the effects on unit i?

Recent work in non-parametric inference focuses on hypothesis testing or estimation in hierarchical (i.e., multilevel) interference settings.

We develop a theory of estimation under general forms of interference.

2 / 44

SLIDE 3

We provide a nonparametric design-based (cf. Neyman 1923) method for estimating average causal effects, including, but not limited to:

Direct effects of assigning a unit to treatment. Indirect effects of, e.g., a unit’s peer being assigned to treatment. More complex effects (e.g., the effect of having a majority of proximal peers treated).

In so doing, we highlight that equal probability of treatment assignment does not imply equal probability of indirect exposure to treatment (e.g., proximity to treated units). We develop our main results drawing on classical sampling theory, though model-assisted refinements are possible.

3 / 44

SLIDE 4

Method summary:

Design information gives the probability distribution for treatment Z, s.t. supp(Z) = Ω.

Specify an exposure model that converts assigned treatment vectors z ∈ Ω to exposures based on unit attributes (e.g., network degree): f(Z, θi) ≡ Di.

This implies the exact probabilities of exposure:

πi(dk) = Σ_{z∈Ω} p_z I(f(z, θi) = dk)

Average causal effects are the average difference between the potential outcomes under exposure dk vs. those under dl.

Estimate average causal effects accounting for the varying probability of exposures (via some variant of inverse probability weighting).

4 / 44
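The probability computation above can be sketched in a few lines of code. This is only an illustration of the formula πi(dk) = Σ_{z∈Ω} p_z I(f(z, θi) = dk): the two-unit design, the `f` mapping, and all names here are our own toy assumptions, not the authors' notation.

```python
def exposure_probs(omega, p, f, thetas, exposures):
    """pi_i(d_k) = sum over z in Omega of p_z * I(f(z, theta_i) = d_k)."""
    return [{d: sum(pz for z, pz in zip(omega, p) if f(z, theta) == d)
             for d in exposures} for theta in thetas]

# Toy two-unit design: exactly one unit treated, each assignment equally likely.
omega = [(1, 0), (0, 1)]   # supp(Z)
p = [0.5, 0.5]             # design probabilities p_z
thetas = [0, 1]            # here theta_i is just the unit's own index
f = lambda z, i: "Direct" if z[i] == 1 else "Indirect"   # assumed exposure model
pi = exposure_probs(omega, p, f, thetas, ["Direct", "Indirect"])
# Each unit: Direct with probability 0.5, Indirect with probability 0.5.
```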

SLIDE 5

Roadmap: Simple running example. Some technical details. Application. Anticipating some concerns.

5 / 44

SLIDE 6

Simple running example. Consider a randomized experiment performed on a finite population of four units in a simple, fixed network:

6 / 44

SLIDE 7

[Network diagram: four units in a line, 1-2-3-4]

7 / 44

SLIDE 8

One of these units is assigned to receive an advertisement and the other three are assigned to control, each with equal probability.

We want to estimate the effects of advertising on opinion.

There are four possible randomizations z:

8 / 44

SLIDE 9

[Diagram: randomization 1: unit 1 treated, units 2-4 in control]

9 / 44

SLIDE 10

[Diagram: randomization 2: unit 2 treated, the others in control]

10 / 44

SLIDE 11

[Diagram: randomization 3: unit 3 treated, the others in control]

11 / 44

SLIDE 12

[Diagram: randomization 4: unit 4 treated, units 1-3 in control]

12 / 44

SLIDE 13

So we have exact knowledge of the randomization scheme. But what of the exposure model? This requires researcher discretion. How do we model exposure to a treatment? One example.

13 / 44

SLIDE 14

Direct exposure means that you have been treated. Indirect exposure means that a peer has been treated.

Di = { Di(rect) if zi = 1;  In(direct) if zi±1 = 1;  Co(ntrol) if zi = zi±1 = 0 }

There is nothing particularly special about this model, except for its parsimony. Arbitrarily complex exposure models are possible.

Let’s visualize this.

14 / 44

SLIDE 15

[Diagram: exposures under randomization 1: unit 1 Direct, unit 2 Indirect, units 3-4 Control]

15 / 44

SLIDE 16

[Diagram: exposures under randomization 2: unit 2 Direct, units 1 and 3 Indirect, unit 4 Control]

16 / 44

SLIDE 17

[Diagram: exposures under randomization 3: unit 3 Direct, units 2 and 4 Indirect, unit 1 Control]

17 / 44

SLIDE 18

[Diagram: exposures under randomization 4: unit 4 Direct, unit 3 Indirect, units 1-2 Control]

18 / 44

SLIDE 19

Summarizing:

Design Zi (randomization # by unit #):

Rand. #   Unit 1   Unit 2   Unit 3   Unit 4
1         1        0        0        0
2         0        1        0        0
3         0        0        1        0
4         0        0        0        1

Exposure Di (randomization # by unit #):

Rand. #   Unit 1   Unit 2   Unit 3   Unit 4
1         Di       In       Co       Co
2         In       Di       In       Co
3         Co       In       Di       In
4         Co       Co       In       Di
19 / 44

SLIDE 20

We can figure out the exact probabilities that each of the four units would be in each of the exposure conditions:

Exposure Di (randomization # by unit #):

Rand. #   Unit 1   Unit 2   Unit 3   Unit 4
1         Di       In       Co       Co
2         In       Di       In       Co
3         Co       In       Di       In
4         Co       Co       In       Di

Probabilities πi(Di):

           Unit 1   Unit 2   Unit 3   Unit 4
Direct     0.25     0.25     0.25     0.25
Indirect   0.25     0.50     0.50     0.25
Control    0.50     0.25     0.25     0.50

20 / 44
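These probabilities can be verified by enumerating the design. A minimal sketch of the running example (variable and function names are ours, not the deck's); exact arithmetic via `Fraction` keeps the 0.25/0.50 values exact:

```python
# Reproduce slides 19-20: exposures and their exact probabilities for the
# four-unit line network 1-2-3-4 under one-of-four treatment assignment.
from fractions import Fraction

N = 4
# The four equiprobable randomizations: exactly one unit treated.
omega = [tuple(1 if i == t else 0 for i in range(N)) for t in range(N)]

def exposure(z, i):
    """The deck's three-level exposure model for the line network."""
    if z[i] == 1:
        return "Di"
    neighbors = [j for j in (i - 1, i + 1) if 0 <= j < N]
    if any(z[j] for j in neighbors):
        return "In"
    return "Co"

D = [[exposure(z, i) for i in range(N)] for z in omega]   # randomization x unit
pi = [{d: Fraction(sum(row[i] == d for row in D), len(omega))
       for d in ("Di", "In", "Co")} for i in range(N)]
# pi reproduces the probability table: e.g., unit 2 is Indirect w.p. 1/2.
```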

SLIDE 21

Let’s make up some potential outcomes associated with each exposure:

Potential outcomes Yi(Di):

           Unit 1   Unit 2   Unit 3   Unit 4   Mean
Direct     5        10       10       3        7
Indirect   0        3        3        2        2
Control    1        3        6        2        3

Average causal effect: τ(dk, dl) = (1/N) Σ_{i=1}^{N} [Yi(dk) − Yi(dl)].

E.g., τ(Direct, Control) = (1/N) Σ_{i=1}^{N} [Yi(Direct) − Yi(Control)] = 4.

21 / 44

SLIDE 22

Unequal probability design provides a natural and design-unbiased estimator. Assuming πi(dk) > 0 and πi(dl) > 0, the Horvitz-Thompson (HT) estimator:

τ̂HT(dk, dl) = (1/N) Σ_{i=1}^{N} [ I(Di = dk) Yi(dk)/πi(dk) − I(Di = dl) Yi(dl)/πi(dl) ]

Unbiasedness follows from E[I(Di = dk)] = πi(dk).

Note: when, for some i, πi(dk) = 0 or πi(dl) = 0, τ(dk, dl) can be estimated only for the units with some probability of receiving both exposures.

22 / 44
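The HT estimator for the running example can be sketched directly. One caveat: the indirect potential outcomes below, (0, 3, 3, 2), are an assumed completion of the deck's partially garbled table, chosen because it makes every checkable entry of the estimator-comparison slide internally consistent.

```python
from fractions import Fraction as F

N = 4
# Exact exposure probabilities (slide 20) and exposure matrix (slide 19).
pi = {"Di": [F(1, 4)] * 4,
      "In": [F(1, 4), F(1, 2), F(1, 2), F(1, 4)],
      "Co": [F(1, 2), F(1, 4), F(1, 4), F(1, 2)]}
D = [["Di", "In", "Co", "Co"], ["In", "Di", "In", "Co"],
     ["Co", "In", "Di", "In"], ["Co", "Co", "In", "Di"]]
# Potential outcomes (slide 21); the "In" row is our assumed completion.
Y = {"Di": [5, 10, 10, 3], "In": [0, 3, 3, 2], "Co": [1, 3, 6, 2]}

def tau_ht(d, dk, dl):
    """(1/N) sum_i [I(Di=dk) Yi(dk)/pi_i(dk) - I(Di=dl) Yi(dl)/pi_i(dl)]."""
    total = F(0)
    for i in range(N):
        if d[i] == dk:
            total += F(Y[dk][i]) / pi[dk][i]
        elif d[i] == dl:
            total -= F(Y[dl][i]) / pi[dl][i]
    return total / N

ests = [tau_ht(d, "Di", "Co") for d in D]          # one estimate per randomization
e_di = sum(ests) / len(ests)                       # design expectation = tau(Di, Co) = 4
e_in = sum(tau_ht(d, "In", "Co") for d in D) / 4   # design expectation = tau(In, Co) = -1
```

Averaging the estimator over all four equiprobable randomizations recovers the true effects exactly, which is the design-unbiasedness claim above.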

SLIDE 23

Applying estimators to this setup:

            Diff. in Means         OLS w/ cov. adj.       τ̂HT(dk, dl)
Rand. #   τ(Di,Co)  τ(In,Co)    τ(Di,Co)  τ(In,Co)    τ(Di,Co)  τ(In,Co)
1         1.00      −1.00       3.00      −3.00       −2.00     −5.50
2         8.00      −0.50       5.00      −2.00       9.00      0.50
3         9.00      1.50        8.00      1.00        9.50      3.00
4         1.00      1.00        2.00      −5.44       −0.50     −2.00
E[.]      4.75      0.25        4.50      −1.00       4.00      −1.00
Bias      0.75      1.25        0.50      0.00        0.00      0.00

Other approaches are biased and inconsistent (i.e., this is not just a small-sample problem). Bias can go any number of ways depending on the nature of the confounding and effect heterogeneity.

Another crucial point is that the variance of the HT estimator is not straightforward. We cannot rely on standard methods to compute standard errors or confidence intervals:
23 / 44

SLIDE 24

Exact variance:

Var(τ̂HT(dk, dl)) = (1/N²) { Var[Ŷ^T_HT(dk)] + Var[Ŷ^T_HT(dl)] − 2 Cov[Ŷ^T_HT(dk), Ŷ^T_HT(dl)] },

where

Var[Ŷ^T_HT(dk)] = Σ_{i=1}^{N} Σ_{j=1}^{N} Cov[I(Di = dk), I(Dj = dk)] · [Yi(dk)/πi(dk)] · [Yj(dk)/πj(dk)]

Cov[Ŷ^T_HT(dk), Ŷ^T_HT(dl)] = Σ_{i=1}^{N} Σ_{j=1}^{N} Cov[I(Di = dk), I(Dj = dl)] · [Yi(dk)/πi(dk)] · [Yj(dl)/πj(dl)]

24 / 44
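For the four-unit example, the exact-variance formula can be checked against brute-force enumeration over the design. This sketch uses our names throughout and again assumes the indirect potential outcomes (0, 3, 3, 2); `Fraction` arithmetic makes the equality exact rather than approximate.

```python
# Check the exact-variance formula against direct enumeration of the design,
# for tau_HT(Di, Co) in the four-unit running example.
from fractions import Fraction as F
from itertools import product

N = 4
D = [["Di", "In", "Co", "Co"], ["In", "Di", "In", "Co"],
     ["Co", "In", "Di", "In"], ["Co", "Co", "In", "Di"]]
Y = {"Di": [5, 10, 10, 3], "In": [0, 3, 3, 2], "Co": [1, 3, 6, 2]}

def pi1(i, dk):                       # pi_i(d_k) under the uniform design
    return F(sum(d[i] == dk for d in D), len(D))

def pi12(i, dk, j, dl):               # joint probability pi_ij(d_k, d_l)
    return F(sum(d[i] == dk and d[j] == dl for d in D), len(D))

def cov_total(dk, dl):
    # sum_i sum_j Cov[I(Di=dk), I(Dj=dl)] * Yi(dk)/pi_i(dk) * Yj(dl)/pi_j(dl)
    return sum((pi12(i, dk, j, dl) - pi1(i, dk) * pi1(j, dl))
               * F(Y[dk][i]) / pi1(i, dk) * F(Y[dl][j]) / pi1(j, dl)
               for i, j in product(range(N), repeat=2))

formula = (cov_total("Di", "Di") + cov_total("Co", "Co")
           - 2 * cov_total("Di", "Co")) / N**2

def tau_ht(d):                        # HT estimate under one randomization
    return sum(F(Y["Di"][i]) / pi1(i, "Di") if d[i] == "Di"
               else -F(Y["Co"][i]) / pi1(i, "Co") if d[i] == "Co" else 0
               for i in range(N)) / N

ests = [tau_ht(d) for d in D]
mean = sum(ests) / len(ests)
direct = sum((e - mean) ** 2 for e in ests) / len(ests)  # Var over the design
# formula and direct agree exactly.
```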

SLIDE 25

Conservative variance estimator: Via Young’s inequality (cf. Aronow and Samii 2012), given πij(dk, dl) > 0 ∀ i ≠ j,

V̂ar[τ̂HT(dk, dl)] = (1/N²) { V̂ar[μ̂HT(dk)] + V̂ar[μ̂HT(dl)] − 2 Ĉov_C[μ̂HT(dk), μ̂HT(dl)] },

where

V̂ar[μ̂HT(dk)] = Σ_{i∈U} I(Di = dk)[1 − πi(dk)] [Yi(dk)/πi(dk)]²
  + Σ_{i∈U} Σ_{j∈U∖i} I(Di = dk) I(Dj = dk) · [(πij(dk) − πi(dk)πj(dk))/πij(dk)] · [Yi(dk)/πi(dk)] [Yj(dk)/πj(dk)]

(and analogously for V̂ar[μ̂HT(dl)]), and

Ĉov_C[μ̂HT(dk), μ̂HT(dl)] = Σ_{i∈U} Σ_{j∈U∖i} [I(Di = dk) I(Dj = dl)/πij(dk, dl)] · [πij(dk, dl) − πi(dk)πj(dl)] · [Yi(dk)/πi(dk)] [Yj(dl)/πj(dl)]
  − Σ_{i∈U} [ I(Di = dk) Yi(dk)²/(2πi(dk)) + I(Di = dl) Yi(dl)²/(2πi(dl)) ]

Unbiased under the sharp null hypothesis of no effect, given πij(dk, dl) > 0. A (more) conservative variance estimator applies when ∃ i, j, k, l s.t. πij(dk, dl) = 0.

25 / 44

SLIDE 26

Asymptotics and intervals:

We adopt Brewer (1979)’s large-sample scaling, analogous to obtaining estimates by aggregating results from repeated experimentation on a fixed finite population.

Consistency and asymptotic normality of τ̂HT(dk, dl) follow from the WLLN and the classical CLT, respectively.

By the WLLN, N V̂ar[τ̂HT(dk, dl)] →p N Var[τ̂HT(dk, dl)] + c1, where c1 ≥ 0. Then

(τ̂HT(dk, dl) − τ(dk, dl)) / √(V̂ar[τ̂HT(dk, dl)]) →d N(0, 1 − c2), where 0 ≤ c2 < 1.

Intervals constructed as τ̂HT(dk, dl) ± z_{1−α/2} √(V̂ar[τ̂HT(dk, dl)]) will asymptotically cover τ(dk, dl) at least 100(1 − α)% of the time.

We’ve also proven consistency of the estimators and variance under a generalized m-dependence set-up. Restrictions on clustering are key.

26 / 44
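The interval construction is a standard Wald form and can be sketched as follows. `ht_interval` is our name, and the inputs are illustrative placeholders for an HT point estimate and variance estimate.

```python
# Wald-type interval: tau_hat +/- z_{1 - alpha/2} * sqrt(var_hat).
from math import sqrt
from statistics import NormalDist

def ht_interval(tau_hat, var_hat, alpha=0.05):
    """Normal-approximation interval; conservative when var_hat over-estimates."""
    z = NormalDist().inv_cdf(1 - alpha / 2)
    half = z * sqrt(var_hat)
    return (tau_hat - half, tau_hat + half)

lo, hi = ht_interval(4.0, 27.875)   # illustrative point estimate and variance
```

Because the variance estimator is conservative, these intervals cover the estimand at least at the nominal rate in large samples.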

SLIDE 27

The paper proposes refinements for covariate adjustment, weight stabilization, and variance approximation under a constant-effect assumption.

Further refinements include modeling outcomes based on the determinants of exposure probabilities, using the HT results to determine an appropriate variance approximation.

Regardless of the method used, the implied inverse probability weights are fundamental to the consistency of any estimator of average causal effects.

Under proper specification, this weighting can be reproduced in the limit by regression estimators (in particular, interaction with centered fixed effects for all unique values of the probability of exposure).

27 / 44

SLIDE 28

Let’s consider a richer example. The goal is to estimate the direct and indirect effects of a treatment offered to a randomly selected set of individuals on a complex, undirected network (e.g., an anti-prejudice curriculum in schools; Paluck and Shepherd 2012).

28 / 44

SLIDE 29

Network

29 / 44

SLIDE 30

Suppose complete random assignment of M = 0.2N units to treatment.

The design implies that Z has uniform probability over Ω, an N × (N choose M) indicator matrix, where z is a realization of Z, e.g.,

z = (z1, z2, z3, ..., zN−1, zN)′ = (0, 1, 0, ..., 1, 0)′.

30 / 44

SLIDE 31

Network

31 / 44

SLIDE 32

Treatment Assignment

32 / 44

SLIDE 33

Let θi be i’s row in the adjacency matrix (with diagonal zeroed out):

33 / 44

SLIDE 34

Define an exposure model corresponding to our substantive interests:

f(z, θi) =
  Isolated Direct:    zi I(z′θi = 0)
  Indirect:           (1 − zi) I(z′θi > 0)
  Direct & Indirect:  zi I(z′θi > 0)
  Control:            (1 − zi) I(z′θi = 0)

34 / 44
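This four-level exposure model can be written as a small function of z and θi (the unit's adjacency row). The adjacency matrix and treatment vector below are our own toy inputs, chosen only to exercise all the cases.

```python
# Four-level exposure model: z_i flags own treatment, z' theta_i > 0 flags
# having at least one treated network peer.
def exposure(z, theta_i, i):
    treated_peer = sum(zj * tij for zj, tij in zip(z, theta_i)) > 0  # z' theta_i > 0
    if z[i] == 1:
        return "Direct & Indirect" if treated_peer else "Isolated Direct"
    return "Indirect" if treated_peer else "Control"

# Toy network: units 0-1-2 form a triangle, unit 3 is an isolate.
adj = [[0, 1, 1, 0],
       [1, 0, 1, 0],
       [1, 1, 0, 0],
       [0, 0, 0, 0]]
z = [1, 0, 0, 1]   # units 0 and 3 treated
exposures = [exposure(z, adj[i], i) for i in range(4)]
```

Unit 0 is treated with no treated peers (Isolated Direct), units 1 and 2 have a treated peer (Indirect), and the treated isolate 3 is also Isolated Direct.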

SLIDE 35

Treatment Assignment

35 / 44

SLIDE 36

Exposure Conditions

36 / 44

SLIDE 37

And all possible randomizations...

37 / 44

SLIDE 38

This yields a matrix of indicators for exposure k associated with each randomization z:

Ik = [I(f(z, θi) = dk)]_{z∈Ω; i=1,...,N} =

  [ I(f(z1, θ1) = dk)   I(f(z2, θ1) = dk)   ...   I(f(z|Ω|, θ1) = dk) ]
  [ I(f(z1, θ2) = dk)   I(f(z2, θ2) = dk)   ...   I(f(z|Ω|, θ2) = dk) ]
  [        ...                  ...         ...           ...          ]
  [ I(f(z1, θN) = dk)   I(f(z2, θN) = dk)   ...   I(f(z|Ω|, θN) = dk) ]

Then, for exposure k, the first- and second-order exposure probabilities are

Ik Ik′ / |Ω| =
  [ π1(dk)    π12(dk)   ...   π1N(dk) ]
  [ π12(dk)   π2(dk)    ...   π2N(dk) ]
  [   ...       ...     ...     ...   ]
  [ π1N(dk)   π2N(dk)   ...   πN(dk)  ]

Cross-exposure probabilities are computed analogously.

38 / 44
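The Ik construction can be sketched by enumeration, here under assumed inputs (a 3-unit line with M = 1 treated) and our own helper names; the diagonal of the resulting matrix holds πi(dk) and the off-diagonal entries hold the joint probabilities πij(dk).

```python
# Build I_k over all randomizations and form (I_k I_k') / |Omega|.
from itertools import combinations
from fractions import Fraction

adj = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]   # 3-unit line network (assumed)
N, M = 3, 1
omega = [tuple(1 if i in c else 0 for i in range(N))
         for c in combinations(range(N), M)]   # all |Omega| = C(N, M) assignments

def is_exposed(z, i, dk):
    peer = sum(z[j] * adj[i][j] for j in range(N)) > 0   # z' theta_i > 0
    if z[i] == 1:
        label = "Direct & Indirect" if peer else "Isolated Direct"
    else:
        label = "Indirect" if peer else "Control"
    return label == dk

# N x |Omega| indicator matrix for the "Indirect" exposure.
Ik = [[is_exposed(z, i, "Indirect") for z in omega] for i in range(N)]
# (I_k I_k') / |Omega|: diagonal pi_i(d_k), off-diagonal joint pi_ij(d_k).
probs = [[Fraction(sum(a and b for a, b in zip(Ik[i], Ik[j])), len(omega))
          for j in range(N)] for i in range(N)]
```

Cross-exposure probabilities come from the same product with two different indicator matrices, Ik Il′ / |Ω|.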

SLIDE 39

A real application along these lines: data snippet courtesy of Paluck and Shepherd (2012).

Exposure    Naive (Diff-in-Means)   Cov. Adj. (Fixed Effects)   HT (Proposed)
Direct      −0.775                  −0.752                      −1.400
(SE)        (0.793)                 (0.927)                     (1.133)
Indirect    −0.382                  −0.648                      −0.607
(SE)        (0.434)                 (0.596)                     (1.106)
Combined    −1.331                  −1.663                      −1.792
(SE)        (0.956)                 (1.220)                     (1.617)

39 / 44

SLIDE 40

Anticipating some concerns.

40 / 44

SLIDE 41

f (Z, θi)

Concern: “What if you don’t believe the exposure model?!” We always specify an exposure model to define causal effects. But! The framework permits exposure models of arbitrary generality. By definition, there is a finite (but potentially large) set of exposure models that may be associated with any randomization scheme. These models can be nested.

41 / 44

SLIDE 42

Concern: “What if you don’t really know θ?!”

We can model θ and then use available data to estimate a probability distribution over the θ’s. Then we can marginalize the conditional estimates:

τ = ∫_Φ τ( f(Z, θ1(φ)), ..., f(Z, θN(φ)) ) dF(φ)

E.g., graph models can use covariate data to predict possible adjacency matrices. Impute 1,000 possible adjacency matrices (φ) based on F(φ), estimate causal effects on each (τ), and then average.

42 / 44
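This imputation-and-averaging scheme can be sketched as follows. Both `draw_network` and `estimate_effect` are hypothetical stand-ins, not the authors' code: the sampler below is just a placeholder for a fitted network model F(φ), and the toy effect function merely counts edges.

```python
# Marginalize over network uncertainty: impute adjacency matrices from an
# assumed model, estimate the effect under each, and average.
import random

def draw_network(n, rng):
    """Placeholder for sampling one adjacency matrix from F(phi)."""
    adj = [[0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            adj[i][j] = adj[j][i] = int(rng.random() < 0.3)  # assumed edge prob.
    return adj

def marginal_effect(estimate_effect, n, draws=1000, seed=0):
    rng = random.Random(seed)
    ests = [estimate_effect(draw_network(n, rng)) for _ in range(draws)]
    return sum(ests) / len(ests)   # average over imputed networks

# Toy "effect" that just counts adjacency entries, to show the plumbing:
avg = marginal_effect(lambda adj: sum(map(sum, adj)), n=4)
```

In practice `estimate_effect` would run the full HT pipeline (exposure mapping, probabilities, estimator) conditional on each imputed network.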

SLIDE 43

Some other thoughts / extensions:

Design implications? Basic results from survey sampling suggest minimizing variation in the exposure-probability vectors. The variance expression suggests limiting clustering in exposures. It is possible to construct maximum-entropy designs or minimum-risk designs given bounded potential outcomes; we are currently at work on this (“solved” via brute-force optimization, but...).

Observational studies? If we can estimate the treatment assignment mechanism, then it is simple enough to specify an exposure model again.
43 / 44

SLIDE 44

Thank you!

You can find our paper on my website:

http://j.mp/paronow

44 / 44