Influence and Homophily in Networked User Behavior
Eytan Bakshy Facebook mining social network dynamics workshop @ www2012 April 16, 2012
▪ To what extent do social networks shape our behaviors online?
▪ Homophily and heterogeneity confound social influence effects.
▪ Online behavior resembles well-studied forms of contagion.
▪ Statistical controls are not enough (Shalizi & Thomas, 2011).
▪ How do we measure influence?
  ▪ Experiments.
▪ What is a reasonable model of social contagion on the Web?
▪ The homophily confound
▪ Study 1: Influence in information diffusion
▪ Study 2: Influence in sharing decisions
▪ Implications
▪ Standard models assume constant probability of infection
▪ Interesting things happen when reproduction rates are high
▪ On the web, most information doesn’t appear to spread
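The contrast between low and high reproduction rates can be illustrated with a simple branching process. This is a minimal sketch, not a model taken from the talk: the function name `cascade_size`, the fixed number of contacts per adopter (`max_children`), and the size cap (`max_size`) are all assumptions made for illustration.

```python
import random

def cascade_size(r, max_children=10, max_size=10_000, rng=None):
    """Simulate one cascade of a simple branching process.

    Assumption: every adopter exposes `max_children` contacts, and each
    contact adopts independently with the same constant probability
    r / max_children, so each adopter produces r new adopters on average.
    """
    if rng is None:
        rng = random.Random()
    p = r / max_children
    size, frontier = 1, 1  # start from a single seed adopter
    while frontier > 0 and size < max_size:
        exposed = frontier * max_children
        new = sum(1 for _ in range(exposed) if rng.random() < p)
        size += new
        frontier = new
    return size
```

When r is below 1, cascades die out quickly and the expected size is 1 / (1 - r), which matches the observation that most content does not spread. When r is above 1, some cascades keep growing until they hit the cap.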
[Figure: plot from Bakshy, Hofman, Mason, Watts 2011; axis labels not recoverable from the extracted text]
▪ Threshold models: become activated after k contacts are activated
▪ Not clear that local consensus factors into individual decisions in sharing content
▪ Positive externalities: e.g. adoption of a technology
▪ Utility of visiting a page is often unrelated to the number of visiting friends
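The threshold rule mentioned above (a node activates once k of its contacts are active) can be sketched as follows. The function name `run_threshold_model` and the adjacency-dict representation are assumptions chosen for illustration, and the sketch uses one global k for every node.

```python
def run_threshold_model(neighbors, seeds, k):
    """Run a fixed-threshold contagion until no more nodes activate.

    neighbors: dict mapping each node to a set of its contacts.
    seeds: iterable of initially active nodes.
    k: number of active contacts needed to activate a node (assumed
       identical for all nodes in this sketch).
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in neighbors.items():
            if node not in active and len(nbrs & active) >= k:
                active.add(node)
                changed = True
    return active
```

For example, on a small graph where node "e" has only one contact, a threshold of k = 2 leaves "e" inactive no matter how many other nodes activate.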
▪ Focuses on the spread of ideas and technologies
▪ Entail costly decisions
▪ Embeddedness, authority, and interpersonal trust play an important role
▪ Much of online activity is cheap and informal
Watts & Dodds 2007
[Figure: probability of joining a community when k friends are already members (x: k, 5 to 50; y: probability, 0.005 to 0.025). Backstrom et al 2006]
[Figure: probability of buying (y, 0.02 to 0.08) vs. incoming recommendations (x, 10 to 60)]
[Figure: rate of adoption (y, 0.00 to 0.20) vs. number of neighbors k (x, 5 to 15), for small assets and large assets]
Bakshy et al 2009
Leskovec et al 2007
Watts & Dodds 2007
[Figure: causal diagram. Labels: ego’s sharing behavior; alter’s sharing behavior; mechanism (e.g. News Feed, social cues); other forms of influence; known characteristics; unknown characteristics (e.g. Web browsing behavior, interests); unknown correlation between friends’ characteristics (expected to be stronger for closer friends)]
Figure from Bakshy, Eckles, Yan & Rosenn, 2012
With Itamar Rosenn, Cameron Marlow, and Lada Adamic. Published as The Role of Social Networks in Information Diffusion, WWW 2012.
▪ Field experiment tests how much sharing would occur in the absence of exposure via the Facebook feed
▪ Answers causal questions about influence & diffusion:
  ▪ To what extent does feed increase sharing?
  ▪ Are weak ties responsible for disseminating information?*
  ▪ How is tie strength predictive of user activity?*
*to be continued on April 19th, The Role of Social Networks in Information Diffusion
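[Figure: diagram of channels through which friends may come to share the same content. Labels: regularly visit same site; RSS; visit sites that link to the same content (Adar et al, 2009); Web revisitation; blogs; news aggregators; mass + interpersonal media; interpersonal communication; face-to-face; telephone; IM; email]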
[Figure: causal diagram. Labels: ego’s sharing behavior; alter’s sharing behavior; Facebook news feed; other forms of influence; known characteristics; unknown characteristics (e.g. Web browsing behavior, interests); unknown correlation between friends’ characteristics (expected to be stronger for closer friends)]
Figure from Bakshy, Eckles, Yan & Rosenn, 2012
▪ Assignment procedure:
  ▪ (viewer, URL) pairs are deterministically assigned to the feed or no feed condition
  ▪ Directed shares (via messages, wall posts) are not subject to treatment and are removed from the experiment
▪ Evaluating outcomes:
  ▪ Compare the likelihood of sharing in the feed (treatment) condition with the no feed (control) condition
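One common way to make assignment deterministic per (viewer, URL) pair is to hash the pair into a bucket. The slides do not say how the assignment was implemented, so everything below is an assumption for illustration: the function name `assign_condition`, the use of SHA-256, the `salt` string, and the `p_no_feed` fraction are all hypothetical.

```python
import hashlib

def assign_condition(viewer_id, url, p_no_feed=0.5, salt="study1"):
    """Map a (viewer, URL) pair to "feed" or "no feed".

    The same pair always gets the same condition because the decision
    depends only on a hash of the pair. `p_no_feed` and `salt` are
    hypothetical parameters, not values from the study.
    """
    key = f"{salt}|{viewer_id}|{url}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    bucket = int.from_bytes(digest[:8], "big")  # integer in [0, 2**64)
    return "no feed" if bucket < p_no_feed * 2**64 else "feed"
```

Because the condition is a pure function of the pair, a viewer who would see the same URL several times stays in the same condition on every impression.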
▪ Random sample of all (user, URL) pairs eligible to be shown in the Facebook news feed during a 7-week period in 2010
  ▪ 253,238,367 subjects
  ▪ 75,888,466 URLs
  ▪ 1,168,633,941 distinct subject-URL pairs (random trials)
[Figure: cumulative density (y, 0.0 to 1.0) of share-time differences in days (x, 5 to 30), by condition (feed, no feed). Panels: "Absolute time" (share time − alter's share time) and "Relative to first exposure" (share time − exposure time). Annotations: users shared at exact same time; within one day; within one week; shared within the first hour of exposure; shared before seeing story on feed]
▪ Two methods for comparing probabilities:
  ▪ Average treatment effect on the treated: p_feed − p_no_feed
  ▪ Relative risk ratio: p_feed / p_no_feed
▪ Average effect: +0.2047% increase in sharing (absolute difference in probability)
▪ Risk ratio: 7.3x more likely to share
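Both comparisons can be computed directly from share counts in each condition. The sketch below is illustrative only: the function name `compare_conditions` and its argument names are assumptions, and the counts in the usage note are made up, not data from the study.

```python
def compare_conditions(shares_feed, trials_feed, shares_no_feed, trials_no_feed):
    """Return the sharing probabilities, their difference, and their ratio."""
    p_feed = shares_feed / trials_feed
    p_no_feed = shares_no_feed / trials_no_feed
    if p_no_feed == 0:
        raise ValueError("risk ratio is undefined when p_no_feed is zero")
    return {
        "p_feed": p_feed,
        "p_no_feed": p_no_feed,
        "risk_difference": p_feed - p_no_feed,  # p_feed - p_no_feed
        "risk_ratio": p_feed / p_no_feed,       # p_feed / p_no_feed
    }
```

With hypothetical counts of 200 shares in 100,000 feed trials and 25 shares in 100,000 no feed trials, the difference is 0.00175 (0.175 percentage points) and the ratio is 8. A small absolute difference can coexist with a large ratio when the baseline probability is tiny.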
[Figure: probability of sharing (y, 0.000 to 0.025) vs. number of sharing friends (x, 1 to 6), by condition. Feed: influence on feed + external correlation. No feed: external correlation]
[Figure: p_feed − p_no_feed (y, 0.000 to 0.030) vs. number of sharing friends (x, 1 to 6)]
▪ Experiments are necessary to disentangle influence from other factors
▪ Significant temporal clustering exists even for unexposed users
▪ Probability of sharing increases with number of friends
  ▪ Even if you don’t see those friends!
▪ Influence appears stronger when more friends are shown
Chapter IV, Information Diffusion and Social Influence in Online Networks (dissertation chapter)
▪ Social influence in information diffusion occurs via two stages
  ▪ 1. Exposure (study 1)
  ▪ 2. Decision to share
▪ Trend in previous experiment is not causal
▪ Need a way to experimentally manipulate the number of social signals received by the user
[Figure: p_feed − p_no_feed (y, 0.000 to 0.030) vs. number of sharing friends (x, 1 to 6)]
▪ Field experiment tests how the number of friends shown (social cues) increases sharing via randomization of cues
▪ Answers causal questions about influence:
  ▪ How does seeing a certain number of peers affect information diffusion?
  ▪ How is tie strength predictive of user activity?
  ▪ Are strong ties more influential?
▪ Subjects: Users that arrive at pages independent of Facebook
  ▪ Are or would have been assigned to the no feed condition in Study 1
  ▪ Not arriving via Facebook
▪ Assignment procedure: randomly assign (viewer, URL) to a number of cues
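Random assignment of a number of cues can be sketched as below. The slides do not give the assignment distribution, so this sketch assumes the number shown is drawn uniformly between zero and the number of friends who actually liked the page, capped at a maximum. The function name `assign_cues` and the cap `max_cues=3` are hypothetical.

```python
import random

def assign_cues(k_liking_friends, max_cues=3, rng=None):
    """Pick how many friends to show for one (viewer, URL) trial.

    Assumption: uniform over 0..min(k_liking_friends, max_cues).
    A viewer with no liking friends can only be shown zero cues.
    """
    if rng is None:
        rng = random.Random()
    upper = min(k_liking_friends, max_cues)
    return rng.randint(0, upper)  # both endpoints are included
```

Because the number shown is randomized independently of the viewer, differences in sharing across numbers of cues, within a fixed number of liking friends, can be read as an effect of the cues rather than of homophily.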
▪ Same 7-week period as Study 1
▪ 1,891,768 randomized trials (unique subject-page pairs) consisting of:
  ▪ 1,156,608 unique subjects
  ▪ 470,089 distinct web pages
▪ Record demographic features, tie strength measures between subjects and their alters for each impression and click event
[Figure: probability of sharing (y, 0.00 to 0.08) vs. number of sharing friends k (x, 1 to 4)]
▪ Not causal!
▪ Probability when number of friends liking = number shown
[Figure: probability of sharing (y, 0.00 to 0.08) vs. number of friends shown o (x, 0 to 3), one panel per number of actual liking friends (1, 2, 3). Baseline (zero friends shown): homophily + heterogeneity]
▪ Consider cases where 1 friend liked a page
[Figure: probability of sharing (y, 0.020 to 0.028) vs. tie strength (x, 5 to 15), cue shown vs. cue not shown]
[Figure: relative risk ratio (y, 1.14 to 1.18) vs. tie strength (x, 5 to 15)]
▪ Experiments are necessary to understand the effect of cues on user
behavior
▪ We introduce the cue-response function which shows how the number
▪ Tie strength is predictive of sharing decisions
▪ Strong ties appear more influential
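One reading of the cue-response idea, based on the Study 2 figure axes (probability of sharing by number of friends shown, for each number of actual liking friends), is a table of sharing rates per (k, o) cell. This is a hedged sketch of that reading, not the authors' definition: the function name `cue_response` and the `(k, o, shared)` record layout are assumptions.

```python
from collections import defaultdict

def cue_response(trials):
    """Estimate sharing rates per (liking friends k, friends shown o).

    trials: iterable of (k, o, shared) tuples, where `shared` is truthy
    if the subject shared the page. Returns {k: {o: share_rate}}.
    """
    counts = defaultdict(lambda: defaultdict(lambda: [0, 0]))
    for k, o, shared in trials:
        cell = counts[k][o]
        cell[0] += int(bool(shared))  # shares in this cell
        cell[1] += 1                  # trials in this cell
    return {
        k: {o: s / n for o, (s, n) in sorted(by_o.items())}
        for k, by_o in sorted(counts.items())
    }
```

Within one value of k, the o = 0 cell gives the baseline attributed to homophily and heterogeneity, and the other cells can be compared against it.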
▪ Correlated activity cannot be attributed to influence alone!
▪ Viral marketing and identifying “influencers”
  ▪ If it weren’t for the influential, would users still acquire the information?
  ▪ Do probability increases justify targeting clusters of individuals?
▪ Relevance
  ▪ Social data (tie strength, number of friends) allows us to identify relevant content
▪ Integrating social into Web products
  ▪ Social cues increase engagement rates
▪ Similar to Study 2, but with ads:
Social Influence in Social Advertising: Evidence from Field Experiments
Eytan Bakshy, Dean Eckles, Rong Yan, Itamar Rosenn. EC 2012.
[Figure: normalized click rate (y, 1.00 to 1.30) vs. number of associated friends (x, 1 to 6), with 1 friend shown]
▪ Information diffusion:
  ▪ Constructed observational study
  ▪ Effect of cues during exposure via feed
▪ Social cues: Individual differences in persuasion
  ▪ Do certain users respond differently to social cues?
  ▪ Simple or complex contagion?
  ▪ Are strong ties actually more influential?
▪ Collaborators: Lada Adamic, Dean Eckles, Cameron Marlow, Itamar Rosenn
▪ More to come on Thursday!
▪ Find out more about Data Science at Facebook here:
  ▪ http://www.facebook.com/data
▪ Questions?