Detecting Network Effects:
Randomizing Over Randomized Experiments

Martin Saveski (@msaveski), MIT
Guillaume Saint-Jacques, MIT
Jean Pouget-Abadie, Harvard
Weitao Duan, LinkedIn
Souvik Ghosh, LinkedIn
Ya Xu, LinkedIn
Edo Airoldi, Harvard
Treatment: New Feed Ranking Algorithm (Zi = 1) → Engagement Yi
Control: Old Feed Ranking Algorithm (Zj = 0) → Engagement Yj
Completely-randomized Experiment

Control (A) | Treatment (B)

Assumption (SUTVA): every user's behavior is affected only by their own treatment and NOT by the treatment of any other user.
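A completely randomized assignment ignores the network entirely: a fixed share of users is drawn at random into treatment. A minimal sketch (user IDs, the 50/50 split, and the seed are illustrative):

```python
import random

def completely_randomized_assignment(user_ids, p_treatment=0.5, seed=0):
    """Assign treatment independently of network structure: exactly a
    p_treatment share of users gets the new feed ranking algorithm."""
    rng = random.Random(seed)
    shuffled = user_ids[:]
    rng.shuffle(shuffled)
    n_treated = int(len(shuffled) * p_treatment)
    treated = set(shuffled[:n_treated])
    # Z_i = 1 for treatment (new ranking), Z_i = 0 for control (old ranking)
    return {u: int(u in treated) for u in user_ids}

assignment = completely_randomized_assignment(list(range(10)))
print(sum(assignment.values()))  # 5 users treated
```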
Cluster-based Randomized Experiment

Control (A) | Treatment (B)

Completely-randomized Experiment: lower variance, more spillovers
OR
Cluster-based Randomized Experiment: higher variance, fewer spillovers
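A cluster-based design flips one coin per cluster rather than per user, so users in the same cluster always share a treatment status and most friendships stay within one arm. A minimal sketch (the toy clusters are illustrative):

```python
import random

def cluster_randomized_assignment(cluster_of, p_treatment=0.5, seed=0):
    """Randomize at the cluster level: every user inherits the
    treatment status of their cluster."""
    rng = random.Random(seed)
    clusters = sorted(set(cluster_of.values()))
    rng.shuffle(clusters)
    n_treated = int(len(clusters) * p_treatment)
    treated_clusters = set(clusters[:n_treated])
    return {u: int(c in treated_clusters) for u, c in cluster_of.items()}

# Toy example: users 0-3 in cluster "a", users 4-7 in cluster "b"
cluster_of = {u: ("a" if u < 4 else "b") for u in range(8)}
z = cluster_randomized_assignment(cluster_of)
# Users in the same cluster always share a treatment status
assert len({z[u] for u in range(4)}) == 1
```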
Completely Randomized Experiment vs. Cluster-based Randomized Experiment

Null hypothesis: μcompletely-randomized = μcluster-based?

Reject the null when the difference between the two estimates exceeds a threshold calibrated so that the Type I error is no greater than α.
– Constants vs. random variables
– Variance reduction
– Balance on pre-treatment covariates
Most clustering methods find skewed distributions of cluster sizes (Leskovec, 2009; Fortunato, 2010); homophily => large homogeneous clusters.
=> Algorithms that enforce equal cluster sizes

Restreaming Linear Deterministic Greedy (Nishimura & Ugander, 2013)
– Streaming
– Parallelizable
– Stable
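A simplified, single-machine sketch of restreaming Linear Deterministic Greedy (the capacity rule, scoring, tie-breaking, and number of passes are illustrative simplifications of Nishimura & Ugander's method):

```python
from collections import defaultdict

def restreaming_ldg(nodes, neighbors, k, n_passes=3, leniency=0.01):
    """Each pass streams over the nodes and greedily places each node in
    the cluster holding most of its neighbors, discounted by how full the
    cluster is; later passes refine the previous pass's assignment."""
    capacity = (1 + leniency) * len(nodes) / k  # near-equal sizes
    cluster_of = {}
    for _ in range(n_passes):
        sizes = defaultdict(int)
        new_assignment = {}
        for v in nodes:
            best, best_score = 0, float("-inf")
            for c in range(k):
                if sizes[c] >= capacity:
                    continue  # cluster full: enforce balance
                # neighbors already placed in c on this pass, or placed
                # there by the previous pass (restreaming)
                n_in_c = sum(
                    1 for u in neighbors[v]
                    if new_assignment.get(u, cluster_of.get(u)) == c
                )
                score = n_in_c * (1 - sizes[c] / capacity)  # LDG penalty
                if score > best_score:
                    best, best_score = c, score
            new_assignment[v] = best
            sizes[best] += 1
        cluster_of = new_assignment
    return cluster_of

# Toy graph: two disjoint triangles, partitioned into k = 2 clusters
nodes = [0, 1, 2, 3, 4, 5]
neighbors = {0: [1, 2], 1: [0, 2], 2: [0, 1],
             3: [4, 5], 4: [3, 5], 5: [3, 4]}
clusters = restreaming_ldg(nodes, neighbors, k=2)
```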
– Graph: >100M nodes, >10B edges
– 350 Hadoop nodes
– 1% leniency

% edges within clusters:
k = 1,000: 35.6%
k = 3,000: 28.5%
k = 5,000: 26.2%
k = 7,000: 22.8%
k = 10,000: 21.1%

small k => large clusters => large network effect, large variance
large k => small clusters => small network effect, small variance
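The "% edges within clusters" metric reported above can be computed directly from a partition; a minimal sketch (the toy edge list and labels are illustrative):

```python
def frac_edges_within(edges, cluster_of):
    """Fraction of edges whose endpoints fall in the same cluster --
    the quantity reported for the LinkedIn graph at each k."""
    within = sum(1 for u, v in edges if cluster_of[u] == cluster_of[v])
    return within / len(edges)

edges = [(0, 1), (1, 2), (2, 3), (3, 4)]
cluster_of = {0: "a", 1: "a", 2: "a", 3: "b", 4: "b"}
print(frac_edges_within(edges, cluster_of))  # 0.75: only edge (2, 3) is cut
```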
Understanding the Type II error

Assuming an interference model:
– ρi: fraction of unit i's treated friends
– ρ̄i: average fraction of a unit's neighbors contained in the unit's own cluster

Choose the number of clusters M and the clustering C that maximize the power of the test (i.e., minimize the Type II error).
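The averaged quantity above, the mean fraction of a unit's neighbors that land in the unit's own cluster, can be computed directly from the clustering; a minimal sketch (the path graph is illustrative):

```python
def avg_frac_neighbors_in_cluster(neighbors, cluster_of):
    """Average over units of the fraction of each unit's neighbors
    that fall in the unit's own cluster."""
    fracs = []
    for v, nbrs in neighbors.items():
        if not nbrs:
            continue  # isolated nodes have no defined fraction
        same = sum(1 for u in nbrs if cluster_of[u] == cluster_of[v])
        fracs.append(same / len(nbrs))
    return sum(fracs) / len(fracs)

# Path 0-1-2-3 split into clusters {0,1} and {2,3}
neighbors = {0: [1], 1: [0, 2], 2: [1, 3], 3: [2]}
cluster_of = {0: "a", 1: "a", 2: "b", 3: "b"}
print(avg_frac_neighbors_in_cluster(neighbors, cluster_of))  # (1 + 0.5 + 0.5 + 1) / 4 = 0.75
```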
Bernoulli Randomized Experiment vs. Cluster-based Randomized Experiment

Is μbernoulli = μcluster-based?
Experiment 1
– Population: 20% of all LinkedIn users [Bernoulli: 10%, Cluster-based: 10%]
– Time period: 2 weeks
– Number of clusters: k = 3,000
– Outcome: feed engagement

                                     Treatment effect   Standard deviation
Bernoulli Randomization (BR)         0.0559             0.0050
Cluster-based Randomization (CBR)    0.0771             0.0260
Delta (CBR − BR)                     0.0212             0.0265

p-value: 0.4246 (fail to reject the null)
Experiment 2
– Population: 36% of all LinkedIn users [Bernoulli: 20%, Cluster-based: 16%]
– Time period: 4 weeks
– Number of clusters: k = 10,000
– Outcome: feed engagement

                                     Treatment effect   Standard deviation
Bernoulli Randomization (BR)         0.2108             0.2911
Cluster-based Randomization (CBR)    0.5390             0.5613
Delta (CBR − BR)                     0.3282             0.5712

p-value: 0.0483 (reject the null)
Test the SUTVA null:
– reject => use the cluster-based experiment to estimate treatment effects (higher variance)
– fail to reject => use the Bernoulli experiment to estimate treatment effects (lower variance)
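The decision flow above amounts to a one-line rule (a sketch; α = 0.05 is the conventional choice, not stated on the slide):

```python
def choose_estimator(p_value, alpha=0.05):
    """If the SUTVA null is rejected, trust the cluster-based
    (higher-variance, interference-robust) estimate; otherwise fall
    back to the Bernoulli (lower-variance) estimate."""
    return "cluster-based" if p_value < alpha else "bernoulli"

print(choose_estimator(0.4246))  # experiment 1 -> 'bernoulli'
print(choose_estimator(0.0483))  # experiment 2 -> 'cluster-based'
```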
Paper: KDD'17, preprint on arXiv
Martin Saveski (@msaveski)