SLIDE 1

Reproducibility & Generalizability @ Twitter

Strengthening Reproducibility in Network Science workshop NetSci 2017

Brandon Roy @bcroygbiv June 19, 2017

SLIDE 2

Twitter is a real-time information network – it’s what’s happening right now

What is Twitter?

SLIDE 3

Twitter is a real-time information network – it’s what’s happening right now
I choose other users to follow
All tweets by those users render into my timeline
A tweet can be retweeted
If some users I follow in turn follow me, it’s a mutual follow

What is Twitter?

SLIDE 4

Health, Usage and Behavior

HUB team

Define and model user “health” at individual and population level
Identify causal factors for health and usage
Characterize user interests
Translate insights into experiments and build prototype systems
...

SLIDE 5

Analytics & Machine Learning
Machine learning infrastructure / platforms
User metrics and revenue modeling
Content understanding (text, images, video)
Data services and integration
User modeling
…

(and friends)

Twitter Science

SLIDE 6

The systematic study of the structure and behavior of the physical and natural world through observation and experiment
Newton, observing an apple falling from a tree, develops a theory:

Apples are attracted toward the Earth?
Fruit is attracted toward the Earth?
Unobserved force attracts all masses to one another

Develops Law of Universal Gravitation
Depends on a minimal set of conditions
Scientific findings are reproducible under appropriate conditions
Assumption: laws of physics are stable

Science

SLIDE 7

Developmental psychology – how do children learn words?
Study through observation and experiment
Observational study preserves natural system. Can correlate features of objects & environment (e.g. shape, color, salience) with words learned
Experimental study can isolate and test factors, may be more easily repeatable. But may also lose important aspects of system under analysis
Assumption: human nature / behavior is relatively stable

Science

Medina et al., 2011

SLIDE 8

Studying Twitter

Twitter is both a social and a technical system. Parts are simple, but system is complex
Consists of millions of users producing, sharing, and consuming content

SLIDE 9

Awareness → Trial → Regular use

Studying Twitter

Twitter is both a social and a technical system. Parts are simple, but system is complex
Consists of millions of users producing, sharing, and consuming content
How can we make Twitter better?
How can we grow the platform?

SLIDE 12

Randomly assign users into control / treatment groups
Record key metrics and look for statistically significant differences between groups

Product experimentation

A/B testing
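
The two steps on this slide can be sketched in a few lines of Python. This is only an illustration, not Twitter's experimentation pipeline: the helper names `assign_groups` and `two_proportion_z_test`, the simulated click rates, and the choice of a two-proportion z-test as the significance check are all assumptions made for the example.

```python
import math
import random


def assign_groups(user_ids, seed=0):
    """Randomly split users into control and treatment (hypothetical helper)."""
    rng = random.Random(seed)
    groups = {"control": [], "treatment": []}
    for uid in user_ids:
        groups[rng.choice(["control", "treatment"])].append(uid)
    return groups


def two_proportion_z_test(successes_a, n_a, successes_b, n_b):
    """Return (z, two-sided p-value) for the difference between two rates."""
    p_a = successes_a / n_a
    p_b = successes_b / n_b
    pooled = (successes_a + successes_b) / (n_a + n_b)
    se = math.sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_b - p_a) / se
    # Two-sided tail probability under the standard normal distribution.
    p_value = math.erfc(abs(z) / math.sqrt(2))
    return z, p_value


# Simulated experiment: the "true" rates below are made up for illustration.
groups = assign_groups(range(20_000), seed=42)
rng = random.Random(7)
true_rate = {"control": 0.10, "treatment": 0.12}
clicks = {
    name: sum(rng.random() < true_rate[name] for _ in members)
    for name, members in groups.items()
}

z, p = two_proportion_z_test(
    clicks["control"], len(groups["control"]),
    clicks["treatment"], len(groups["treatment"]),
)
print(f"z = {z:.2f}, p = {p:.4f}")
```

Fixing the seeds makes the assignment and the simulated outcome repeatable, which is what allows the same comparison to be rerun later.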

SLIDE 13

Randomly assign users into control / treatment groups
Record key metrics and look for statistically significant differences between groups

In this example, we learn green button is “better” than blue…
But we don’t necessarily have a “theory” of button color

Product experimentation

A/B testing

If we are able to replicate on other websites, with other text, with potentially other background colors, we’ll start to feel more confident about green buttons

SLIDE 14

We’ve been experimenting with account recommendations to new users
Change recommendation algorithm for subset of new users and compare to control group
Feel confident finding would be valid (on avg) for all users due to random sampling strategy
If things look good we expect reproducibility and will “ship it” to all users
Caveat: other parts of system may change, could affect these findings!

Product experimentation

DDG = “Duck Duck Goose”

SLIDE 15

Graph state, Consumption, Passive engagements, Rich media, Graph actions, Production, Active engagements, Social interaction
Many questions we would like to answer but cannot (easily) manipulate through experiment
But we can try to study these questions using other methods
Example: what makes a user “healthy”?

Observational data analysis

SLIDE 17

Characterizing graph state

Link type
B’s # followers: 0 - 60; 61 - 500; 501 - 3,000; 3,001 - 25,000; 25,001 - 200,000; 200,001 - 2,000,000; 2,000,000+
B’s usage state: Near zero; Very light; Light; Medium Non-Tweeter; Medium Tweeter; Heavy Non-Tweeter; Heavy Tweeter
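
One way to turn these categories into per-user graph summary counts is sketched below. The bucket boundaries come from the slide; everything else is an assumption for illustration, including the function names, the `(follower_count, usage_state)` input format, and the decision to place an account with exactly 2,000,000 followers in the "200,001 - 2,000,000" bucket. Link type is left out because the slide does not list its values.

```python
from bisect import bisect_left
from collections import Counter

# Inclusive upper bounds of the follower buckets listed on the slide.
UPPER_BOUNDS = [60, 500, 3_000, 25_000, 200_000, 2_000_000]
BUCKET_LABELS = [
    "0 - 60",
    "61 - 500",
    "501 - 3,000",
    "3,001 - 25,000",
    "25,001 - 200,000",
    "200,001 - 2,000,000",
    "2,000,000+",
]


def follower_bucket(num_followers):
    """Map a follower count to its bucket label (hypothetical helper)."""
    if num_followers < 0:
        raise ValueError("follower count cannot be negative")
    # bisect_left returns the index of the first bound >= num_followers,
    # so a count equal to a bound falls in the bucket that bound closes.
    return BUCKET_LABELS[bisect_left(UPPER_BOUNDS, num_followers)]


def graph_summary(followed_accounts):
    """Count followed accounts by (follower bucket, usage state).

    `followed_accounts` is an iterable of (follower_count, usage_state)
    pairs; this input format is an assumption made for the sketch.
    """
    return Counter(
        (follower_bucket(count), state) for count, state in followed_accounts
    )


# Made-up example: three accounts followed by one user.
summary = graph_summary([
    (45, "Heavy Tweeter"),
    (52, "Heavy Tweeter"),
    (1_500_000, "Medium Tweeter"),
])
print(summary)
```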

SLIDE 18

Characterizing graph state

SLIDE 19

Hypothesis
User’s graph supports their activity, and only certain types of links are important for driving heavy usage
Analysis
Match users with same covariates except variable in question
Compare matched users who differ on variable in question
For example, find a pair of users who have the same graph summary counts except for # of small, heavy tweeter accounts followed and look for different health outcomes

Analysis
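
The matching step can be illustrated with a small exact-matching sketch. It is not the analysis the team ran: the field names, the toy records, and the simplification of the variable in question to "follows at least one such account vs. none" are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean


def matched_difference(users, variable, outcome):
    """Average within-stratum outcome gap between users with and without `variable`.

    Users are grouped into strata that agree on every covariate except
    `variable`. Only strata containing both kinds of user contribute.
    Returns None if no stratum has a matched comparison.
    """
    strata = defaultdict(lambda: {True: [], False: []})
    for user in users:
        key = tuple(sorted(
            (name, value) for name, value in user.items()
            if name not in (variable, outcome)
        ))
        # Simplification: treat the variable as present/absent.
        strata[key][user[variable] > 0].append(user[outcome])
    diffs = [
        mean(group[True]) - mean(group[False])
        for group in strata.values()
        if group[True] and group[False]
    ]
    return mean(diffs) if diffs else None


# Toy records (all values invented): graph summary counts plus a 0/1 health flag.
users = [
    {"mutual_links": 5, "large_accounts": 2, "small_heavy_tweeters": 3, "healthy": 1},
    {"mutual_links": 5, "large_accounts": 2, "small_heavy_tweeters": 0, "healthy": 0},
    {"mutual_links": 1, "large_accounts": 9, "small_heavy_tweeters": 2, "healthy": 1},
    {"mutual_links": 1, "large_accounts": 9, "small_heavy_tweeters": 0, "healthy": 1},
    # No counterpart with the same covariates, so this user is dropped.
    {"mutual_links": 7, "large_accounts": 0, "small_heavy_tweeters": 4, "healthy": 1},
]

print(matched_difference(users, "small_heavy_tweeters", "healthy"))
```

Exact matching discards users with no counterpart, so with many covariates most real users would go unmatched. The sketch therefore shows the comparison logic, not a practical matching scheme.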

SLIDE 23

Very excited when we first got this result
Was intuitive, suggests ingredients for a great Twitter experience
But would be more convincing if we could reproduce analysis with different data.
Better yet, reproduce effect with controlled experiment.
But how to implement this change?

Observational data analysis

SLIDE 24

1. For every result, keep track of how it was produced
2. Avoid manual data manipulation steps
3. Archive the exact versions of all external programs used
4. Version control all custom scripts
5. Record all intermediate results, when possible in standardized formats
6. For analyses that include randomness, note underlying random seeds
7. Always store raw data behind plots
8. Generate hierarchical analysis output, allowing layers of increasing detail to be inspected
9. Connect textual statements to underlying results
10. Provide public access to scripts, runs, and results

Reproducibility recommendations

from Sandve et al., 2013
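
A few of these recommendations (1, 3, 6 and 7) can be illustrated with a short sketch that records how a result was produced. The stand-in analysis, the file names, and the manifest fields are all hypothetical choices for the example, not part of the cited rules.

```python
import hashlib
import json
import platform
import random
import sys
from pathlib import Path


def run_analysis(seed):
    """Stand-in analysis: the mean of seeded random draws."""
    rng = random.Random(seed)  # Rule 6: the seed is explicit and recorded.
    samples = [rng.gauss(0.0, 1.0) for _ in range(1_000)]
    return {"samples": samples, "mean": sum(samples) / len(samples)}


def write_manifest(result, seed, out_dir):
    """Store the raw data and a record of how the result was produced."""
    out_dir = Path(out_dir)
    out_dir.mkdir(parents=True, exist_ok=True)

    # Rule 7: keep the raw numbers that any plot would be drawn from.
    raw_path = out_dir / "samples.json"
    raw_path.write_text(json.dumps(result["samples"]))

    # Rules 1 and 3: record the command, environment, and data fingerprint.
    manifest = {
        "seed": seed,
        "argv": sys.argv,
        "python_version": sys.version,
        "platform": platform.platform(),
        "raw_data_file": raw_path.name,
        "raw_data_sha256": hashlib.sha256(raw_path.read_bytes()).hexdigest(),
        "summary": {"mean": result["mean"]},
    }
    (out_dir / "manifest.json").write_text(json.dumps(manifest, indent=2))
    return manifest


if __name__ == "__main__":
    result = run_analysis(seed=2017)
    manifest = write_manifest(result, seed=2017, out_dir="run_output")
    print(manifest["summary"])
```

Rerunning with the same seed should give the same summary, and the stored hash makes it possible to check that the archived raw data has not changed.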

SLIDE 25

Reproducibility recommendations

from Sandve et al., 2013

Great to reproduce analysis… even better to reproduce the effect!

SLIDE 26

Sandve GK, Nekrutenko A, Taylor J, Hovig E (2013) Ten Simple Rules for Reproducible Computational Research. PLoS Comput Biol 9(10): e1003285. https://doi.org/10.1371/journal.pcbi.1003285
Medina, T., Snedeker, J., Trueswell, J., & Gleitman, L. (2011). How words can and cannot be learned by observation. Proceedings of the National Academy of Sciences, 108(22), 9014.

References
