Detecting Network Effects: Randomizing Over Randomized Experiments - PowerPoint PPT Presentation



SLIDE 1

Detecting Network Effects

Randomizing Over Randomized Experiments

Martin Saveski

(@msaveski)

MIT

SLIDE 2

Martin Saveski

MIT

Guillaume Saint-Jacques

MIT

Jean Pouget-Abadie

Harvard

Weitao Duan

LinkedIn

Souvik Ghosh

LinkedIn

Ya Xu

LinkedIn

Edo Airoldi

Harvard

Detecting Network Effects

Randomizing Over Randomized Experiments

SLIDE 3

Treatment

New Feed Ranking Algorithm

Zi = 1

SLIDE 4

Treatment

New Feed Ranking Algorithm

Zi = 1

Old Feed Ranking Algorithm

Control

Zj = 0


SLIDE 6

Treatment

New Feed Ranking Algorithm

Zi = 1

Engagement

Yi

Old Feed Ranking Algorithm

Control

Zj = 0


SLIDE 8

Treatment

New Feed Ranking Algorithm

Zi = 1

Engagement

Yi

Old Feed Ranking Algorithm

Control

Zj = 0

Engagement

Yj

SLIDE 9

Completely-randomized Experiment

SLIDE 10

Treatment (B)

Completely-randomized Experiment

SLIDE 11

Control (A) Treatment (B)

Completely-randomized Experiment


SLIDE 13

Completely-randomized Experiment

Control (A) vs. Treatment (B)

μ̂completely-randomized = (1/|B|) Σ_{i ∈ B} Yi − (1/|A|) Σ_{j ∈ A} Yj

SUTVA: Stable Unit Treatment Value Assumption

Every user’s behavior is affected only by their treatment and NOT by the treatment of any other user
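Under SUTVA, the difference-in-means estimator above is straightforward to compute. A minimal sketch (the outcomes and assignment below are made-up numbers for illustration, not data from the talk):

```python
def difference_in_means(outcomes, assignment):
    """Estimate the average treatment effect as the mean outcome of
    treated users (Z=1) minus the mean outcome of control users (Z=0)."""
    treated = [y for y, z in zip(outcomes, assignment) if z == 1]
    control = [y for y, z in zip(outcomes, assignment) if z == 0]
    return sum(treated) / len(treated) - sum(control) / len(control)

# Hypothetical engagement outcomes Yi and a completely randomized
# assignment (exactly half the users treated).
outcomes = [3.0, 2.5, 4.0, 1.5, 3.5, 2.0]
assignment = [1, 0, 1, 0, 1, 0]
print(difference_in_means(outcomes, assignment))  # 3.5 - 2.0 = 1.5
```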

SLIDE 14

Cluster-based Randomized Experiment


SLIDE 18

Cluster-based Randomized Experiment

Control (A) Treatment (B)


SLIDE 20

Completely-randomized Experiment OR Cluster-based Randomized Experiment?

– Completely-randomized: lower variance, more spillovers
– Cluster-based: higher variance, fewer spillovers

SLIDE 21

Design for Detecting Network Effects

SLIDE 25

Completely Randomized Experiment

SLIDE 26

Completely Randomized Experiment vs. Cluster-based Randomized Experiment

SLIDE 27

Completely Randomized Experiment vs. Cluster-based Randomized Experiment

μ̂completely-randomized = μ̂cluster-based ?

SLIDE 28

Hypothesis Test

SLIDE 29

Hypothesis Test

H0: SUTVA Holds


SLIDE 34

Hypothesis Test

H0: SUTVA holds, i.e.

E_{W,Z}[μ̂cbr − μ̂cr] = 0,  with  var_{W,Z}[μ̂cr − μ̂cbr] ≤ E_{W,Z}[σ̂²]

Reject the null when:

|μ̂cr − μ̂cbr| / √σ̂² ≥ 1/√α

The Type I error is then no greater than α (by Chebyshev's inequality).
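The rejection rule is easy to encode. A sketch (the function name and inputs are mine, not from the paper's code): given the two point estimates and a variance estimate σ̂², reject when the standardized gap exceeds 1/√α.

```python
import math

def sutva_test(mu_cr, mu_cbr, sigma2_hat, alpha=0.05):
    """Reject H0 (SUTVA holds) when |mu_cr - mu_cbr| / sqrt(sigma2_hat)
    >= 1 / sqrt(alpha); by Chebyshev's inequality this keeps the
    Type I error at or below alpha."""
    statistic = abs(mu_cr - mu_cbr) / math.sqrt(sigma2_hat)
    return statistic >= 1.0 / math.sqrt(alpha)

# Hypothetical numbers: only a gap that is large relative to its noise
# counts as evidence against SUTVA.
print(sutva_test(0.056, 0.077, 0.0007))  # small standardized gap -> False
print(sutva_test(0.056, 0.657, 0.0007))  # large standardized gap -> True
```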

SLIDE 35

Nuts and Bolts of Running Cluster-based Randomized Experiments

SLIDE 36

Why Balanced Clustering?


SLIDE 40

Why Balanced Clustering?

  • Theoretical motivation
    – Constants vs. random variables

  • Practical motivations
    – Variance reduction
    – Balance on pre-treatment covariates (homophily ⇒ large homogeneous clusters)

SLIDE 41

Algorithms for Balanced Clustering

SLIDE 42

Most clustering methods find skewed distributions of cluster sizes

(Leskovec, 2009; Fortunato, 2010)

Algorithms for Balanced Clustering


SLIDE 45

Algorithms for Balanced Clustering

Most clustering methods find skewed distributions of cluster sizes (Leskovec, 2009; Fortunato, 2010)
⇒ algorithms that enforce equal cluster sizes

Restreaming Linear Deterministic Greedy (Nishimura & Ugander, 2013)
– Streaming
– Parallelizable
– Stable
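A simplified, single-machine sketch of the restreaming LDG idea (a toy illustration, not LinkedIn's Hadoop implementation): each pass streams the nodes and greedily places each node in the cluster holding the most of its neighbors, discounted by how full that cluster already is; later passes reuse the previous pass's assignment for neighbors not yet seen in the current pass.

```python
def restreaming_ldg(adj, k, n_passes=3, leniency=0.01):
    """Toy restreaming Linear Deterministic Greedy partitioner.
    adj: dict mapping node -> set of neighbors; k: number of clusters.
    Cluster sizes are kept within a (1 + leniency) factor of n/k."""
    nodes = list(adj)
    capacity = (1.0 + leniency) * len(nodes) / k
    assignment = {}
    for _ in range(n_passes):
        sizes = [0] * k
        new_assignment = {}
        for v in nodes:
            best, best_score = None, float("-inf")
            for c in range(k):
                if sizes[c] + 1 > capacity:  # cluster full, skip it
                    continue
                # Count neighbors already in cluster c: placed there in this
                # pass, or (the "restreaming" part) in the previous pass.
                neigh = sum(1 for u in adj[v]
                            if new_assignment.get(u, assignment.get(u)) == c)
                score = neigh * (1.0 - sizes[c] / capacity)
                if score > best_score:
                    best, best_score = c, score
            new_assignment[v] = best
            sizes[best] += 1
        assignment = new_assignment
    return assignment

# Two triangles joined by a single bridge edge; a balanced 2-way
# partition should keep each triangle (mostly) together.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
       3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
print(restreaming_ldg(adj, k=2))
```

The capacity penalty `1 - sizes[c]/capacity` is what distinguishes LDG from plain greedy modularity-style assignment: it trades a little edge locality for near-equal cluster sizes, which is exactly the balance the design needs.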

SLIDE 46

Clustering the LinkedIn Graph

– Graph: >100M nodes, >10B edges
– 350 Hadoop nodes
– 1% leniency

SLIDE 47

Clustering the LinkedIn Graph: % of edges within clusters, by number of clusters k

k = 1,000: 35.6%
k = 3,000: 28.5%
k = 5,000: 26.2%
k = 7,000: 22.8%
k = 10,000: 21.1%


SLIDE 50

Choosing the Number of Clusters


SLIDE 53

Choosing the Number of Clusters

– small k → large clusters → large network effect, large variance
– large k → small clusters → small network effect, small variance

SLIDE 54

Choosing the Number of Clusters

Understanding the Type II error


SLIDE 58

Choosing the Number of Clusters: Understanding the Type II Error

Assuming an interference model:

Yi = β0 + β1·Zi + β2·ρi + εi

ρi: fraction of user i's friends who are treated
ρ̄: average fraction of a unit's neighbors contained in the unit's own cluster

E[μ̂cbr − μ̂cr] ≈ ρ̄ · β2

Choose the number of clusters M and the clustering C such that:

max over M, C of  ρ̄ / √σ̂²
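The quantity ρ̄ is directly computable from the graph and a candidate clustering, which makes the criterion above easy to evaluate per clustering. A sketch (an illustrative helper of my own, using a node-to-neighbors dict as the graph):

```python
def average_neighbor_fraction(adj, clustering):
    """rho-bar: the average, over units with at least one neighbor, of
    the fraction of the unit's neighbors in the unit's own cluster."""
    fractions = []
    for v, neighbors in adj.items():
        if not neighbors:
            continue  # isolated units contribute no spillover term
        same = sum(1 for u in neighbors if clustering[u] == clustering[v])
        fractions.append(same / len(neighbors))
    return sum(fractions) / len(fractions)

# Two triangles joined by one bridge edge, clustered triangle-by-triangle:
# four nodes keep all neighbors in-cluster, the two bridge endpoints keep 2/3.
adj = {0: {1, 2}, 1: {0, 2}, 2: {0, 1, 3},
       3: {2, 4, 5}, 4: {3, 5}, 5: {3, 4}}
clustering = {0: 0, 1: 0, 2: 0, 3: 1, 4: 1, 5: 1}
print(average_neighbor_fraction(adj, clustering))  # (4*1 + 2*(2/3)) / 6 = 8/9
```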

SLIDE 59

Experiments on LinkedIn

SLIDE 60

Bernoulli Randomized Experiment vs. Cluster-based Randomized Experiment
(the completely randomized arm is run as a Bernoulli randomized experiment)

μ̂bernoulli = μ̂cluster-based ?


SLIDE 65

Experiment 1

– Population: 20% of all LinkedIn users [Bernoulli: 10%, Cluster-based: 10%]
– Time period: 2 weeks
– Number of clusters: k = 3,000
– Outcome: feed engagement

SLIDE 66

Experiment 1 results:

                                     Treatment effect   Standard deviation
Bernoulli Randomization (BR)         0.0559             0.0050
Cluster-based Randomization (CBR)    0.0771             0.0260
Delta (CBR – BR)                     0.0211             0.0265

p-value: 0.4246 → fail to reject the SUTVA null



SLIDE 75

Experiment 2

– Population: 36% of all LinkedIn users [Bernoulli: 20%, Cluster-based: 16%]
– Time period: 4 weeks
– Number of clusters: k = 10,000
– Outcome: feed engagement

SLIDE 76

Experiment 2 results:

                                     Treatment effect   Standard deviation
Bernoulli Randomization (BR)         0.2108             0.2911
Cluster-based Randomization (CBR)    0.5390             0.5613
Delta (CBR – BR)                     0.3281             0.5712

p-value: 0.0483 → reject the SUTVA null at α = 0.05



SLIDE 87

Test the SUTVA null:
– Reject → use the cluster-based experiment to estimate treatment effects (higher variance)
– Fail to reject → use the Bernoulli experiment to estimate treatment effects (lower variance)
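The decision flow above can be summarized as a small procedure (the names are mine; the threshold is the Chebyshev rule from the hypothesis-test slides):

```python
import math

def choose_estimator(mu_bernoulli, mu_cluster, sigma2_hat, alpha=0.05):
    """Run the SUTVA test; on rejection report the cluster-based estimate
    (robust to interference, higher variance), otherwise the Bernoulli
    estimate (lower variance)."""
    reject = (abs(mu_bernoulli - mu_cluster) / math.sqrt(sigma2_hat)
              >= 1.0 / math.sqrt(alpha))
    if reject:
        return "cluster-based", mu_cluster
    return "bernoulli", mu_bernoulli

# Hypothetical inputs: a gap far larger than its noise triggers the
# cluster-based choice.
print(choose_estimator(0.21, 0.54, 0.0002))  # -> ('cluster-based', 0.54)
```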

SLIDE 88

Papers available online: KDD'17, arXiv

SLIDE 89

Detecting Network Effects

Randomizing Over Randomized Experiments

Martin Saveski

(@msaveski)

MIT