Influence and Homophily in Networked User Behavior (Eytan Bakshy)


slide-1
SLIDE 1

Influence and Homophily in Networked User Behavior

Eytan Bakshy Facebook mining social network dynamics workshop @ www2012 April 16, 2012

slide-2
SLIDE 2

Motivation

▪ To what extent do social networks shape our behaviors online? ▪ Homophily and heterogeneity confound social influence effects. ▪ Online behavior resembles well-studied forms of contagion ▪ Statistical controls are not enough (Shalizi & Thomas, 2011) ▪ How do we measure influence? ▪ Experiments.

slide-3
SLIDE 3

Outline

▪ What is a reasonable model of social contagion on the Web? ▪ The homophily confound ▪ Study 1: Influence in information diffusion ▪ Study 2: Influence in sharing decisions ▪ Implications

slide-6
SLIDE 6

Information as biological contagion

▪ Standard models assume constant probability of infection
▪ Interesting things happen when reproduction rates are high
▪ On the web, most information doesn’t appear to spread

R ≥ β/γ

[Figure: plot from Bakshy, Hofman, Mason, Watts 2011; axis labels not recoverable]
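As a minimal sketch of the constant-probability assumption above, the following Python simulates a simple branching process in which every infected node exposes a fixed number of contacts and infects each one independently with the same probability. The names `n_contacts` and `p_infect`, and the branching-process simplification itself (no network structure, no repeated exposure), are assumptions made for illustration and are not taken from the talk. In this sketch the reproduction rate is `n_contacts * p_infect`; when it is below 1, cascades die out quickly and the expected cascade size is 1 / (1 - R).

```python
import random


def simulate_cascade(n_contacts, p_infect, rng, max_size=10_000):
    """Return the total number of infected nodes in one cascade.

    Hypothetical model: each infected node exposes `n_contacts` fresh
    contacts and infects each with constant probability `p_infect`.
    """
    active, total = 1, 1  # start from a single seed
    while active and total < max_size:
        new = sum(1 for _ in range(active * n_contacts)
                  if rng.random() < p_infect)
        total += new
        active = new
    return total


def mean_cascade_size(n_contacts, p_infect, trials=20_000, seed=0):
    """Average cascade size over many independent simulated cascades."""
    rng = random.Random(seed)
    sizes = (simulate_cascade(n_contacts, p_infect, rng) for _ in range(trials))
    return sum(sizes) / trials


if __name__ == "__main__":
    # R = 10 * 0.05 = 0.5, so the mean size should be close to 1 / (1 - 0.5) = 2.
    print(mean_cascade_size(n_contacts=10, p_infect=0.05))
```

With a reproduction rate well below 1, most simulated cascades contain only the seed or a handful of nodes, which is the qualitative pattern the slide describes for information on the web.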

slide-8
SLIDE 8

Threshold models of social contagion

▪ Threshold models: become activated after k contacts are activated
▪ Not clear that local consensus factors into individual decisions in sharing content
▪ Positive externalities: e.g. adoption of a technology
▪ Utility of visiting a page is often unrelated to number of visiting friends
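The threshold rule described above can be sketched in a few lines of Python. This is an illustrative toy, not code from the talk: the graph representation (`neighbors` as a dict of adjacency lists), the function name, and the example graph are all assumptions. A node becomes active once at least `k` of its contacts are active, and the loop repeats until no further node changes state.

```python
def threshold_cascade(neighbors, seeds, k):
    """Return the set of active nodes after a k-threshold cascade.

    neighbors: dict mapping each node to a list of its contacts
    seeds:     iterable of initially active nodes
    k:         number of active contacts required for activation
    """
    active = set(seeds)
    changed = True
    while changed:
        changed = False
        for node, nbrs in neighbors.items():
            if node in active:
                continue
            # Count how many of this node's contacts are already active.
            if sum(1 for n in nbrs if n in active) >= k:
                active.add(node)
                changed = True
    return active


if __name__ == "__main__":
    graph = {
        "a": ["b", "c"],
        "b": ["a", "c"],
        "c": ["a", "b", "d"],
        "d": ["c"],
    }
    # "c" has two active contacts and activates; "d" has only one contact.
    print(sorted(threshold_cascade(graph, seeds={"a", "b"}, k=2)))  # ['a', 'b', 'c']
```

With `k=1` the same rule reduces to simple contagion, where a single active contact is enough.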

slide-11
SLIDE 11

Diffusion of innovations

▪ Focuses on the spread of ideas and technologies
▪ Entail costly decisions
▪ Embeddedness, authority, and interpersonal trust play an important role
▪ Much of online activity is cheap and informal

slide-12
SLIDE 12

Some Similarities

[Figure: panels labeled "Theory" and "Data" showing how adoption probability varies with the number of adopting contacts. Recoverable panel captions:
▪ Probability of joining a community when k friends are already members (probability vs. k)
▪ Probability of buying vs. incoming recommendations
▪ Rate of adoption vs. number of neighbors (k), for small assets and large assets
Sources cited on the slide: Watts & Dodds 2007; Backstrom et al 2006; Leskovec et al 2007; Wei et al 2010; Bakshy et al 2009]

slide-13
SLIDE 13

The Homophily Confound

[Diagram: causal graph with nodes Xi, Xj (known characteristics), Ui, Uj (unknown characteristics, e.g. Web browsing behavior, interests), Yja(t0) (alter’s sharing behavior), Yia(t1) (ego’s sharing behavior), and Dija. An unknown correlation between friends’ characteristics is expected to be stronger for closer friends.]

Figure stolen from Bakshy, Eckles, Yan & Rosenn, 2012

slide-14
SLIDE 14

Influence (and homophily)

[Diagram: the same causal graph, with nodes Xi, Xj (known characteristics), Ui, Uj (unknown characteristics, e.g. Web browsing behavior, interests), Yja(t0) (alter’s sharing behavior), Yia(t1) (ego’s sharing behavior), and Dija. Additional labels: other forms of influence; mechanism (e.g. News Feed, social cues); unknown correlation between friends’ characteristics (expected to be stronger for closer friends).]

Figure stolen from Bakshy, Eckles, Yan & Rosenn, 2012

slide-15
SLIDE 15

Study 1: Effect of Feed on Information Diffusion

with Itamar Rosenn, Cameron Marlow, and Lada Adamic

Published as The Role of Social Networks in Information Diffusion. WWW 2012.

slide-16
SLIDE 16

Study 1: Outline

▪ Field experiment tests how much sharing would occur in the absence of exposure via the Facebook feed
▪ Answers causal questions about influence & diffusion:
  ▪ To what extent does feed increase sharing?
  ▪ Are weak ties responsible for disseminating information?*
  ▪ How is tie strength predictive of user activity?*

*to be continued on April 19th, The Role of Social Networks in Information Diffusion

slide-17
SLIDE 17

[Diagram with two headings, "Correlated Information Sources" and "External Influence". Recoverable labels, in slide order: regularly visit same site; RSS; visit sites that link to the same content; Web revisitation, Blogs, News Aggregators; mass + interpersonal media; interpersonal communication; Face-to-face, Telephone, IM, Email. Citation: Adar et al, 2009]

slide-18
SLIDE 18

Influence on Feed

[Diagram: the same causal graph, with nodes Xi, Xj (known characteristics), Ui, Uj (unknown characteristics, e.g. Web browsing behavior, interests), Yja(t0) (alter’s sharing behavior), Yia(t1) (ego’s sharing behavior), and Dija. Additional labels: other forms of influence; Facebook news feed; unknown correlation between friends’ characteristics (expected to be stronger for closer friends).]

Figure stolen from Bakshy, Eckles, Yan & Rosenn, 2012

slide-19
SLIDE 19

Details

▪ Assignment procedure:
  ▪ (viewer, URL) pairs are deterministically assigned to either the feed or the no feed condition
  ▪ Directed shares (via messages, wall posts) are not subject to treatment and are removed from the experiment
▪ Evaluating outcomes:
  ▪ Compare the likelihood of sharing in the feed (treatment) condition with the no feed (control) condition
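The slides do not say how the deterministic assignment was implemented. One common approach, sketched here purely as an assumption, is to hash the (viewer, URL) pair so that the same pair always lands in the same condition without storing any state. The function name, the `salt` string, and the `p_no_feed` allocation fraction are all hypothetical.

```python
import hashlib


def assign_condition(viewer_id, url, salt="feed-experiment", p_no_feed=0.5):
    """Deterministically map a (viewer, URL) pair to 'feed' or 'no feed'.

    Assumption: assignment is derived from a salted hash of the pair.
    `p_no_feed` is an illustrative allocation, not a value from the talk.
    """
    key = f"{salt}:{viewer_id}:{url}".encode("utf-8")
    digest = hashlib.sha256(key).digest()
    # Use the top 53 bits so the value is an exact float in [0, 1).
    u = (int.from_bytes(digest[:8], "big") >> 11) / 2**53
    return "no feed" if u < p_no_feed else "feed"


if __name__ == "__main__":
    first = assign_condition(12345, "http://example.com/story")
    second = assign_condition(12345, "http://example.com/story")
    print(first, first == second)  # the same pair always gets the same condition
```

Because the assignment depends only on the pair, a viewer can be in the feed condition for one URL and the no feed condition for another, which matches the pair-level randomization described above.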

slide-20
SLIDE 20

Data

▪ Random sample of all (user, URL) pairs eligible to be shown in the Facebook news feed during a 7-week period in 2010
  ▪ 253,238,367 subjects
  ▪ 75,888,466 URLs
  ▪ 1,168,633,941 distinct subject-URL pairs (random trials)

slide-21
SLIDE 21

Temporal Clustering

[Figure: cumulative density (y-axis, 0.0 to 1.0) of share times for the feed and no feed conditions, in two panels.
▪ Absolute time: x-axis is share time - alter's share time (days, up to 30). Annotations: users shared at exact same time; within one day; within one week.
▪ Relative to first exposure: x-axis is share time - exposure time (days, up to 30). Annotations: shared within the first hour of exposure; shared before seeing story on feed.]

slide-22
SLIDE 22

What is the overall effect of feed on sharing?

▪ Two methods for comparing probabilities:
  ▪ Average treatment effect on the treated: p_{feed} - p_{no feed}
  ▪ Relative risk ratio: p_{feed} / p_{no feed}
▪ Average effect: +0.2047% increase in sharing
▪ Risk Ratio: 7.3x more likely to share
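The two comparisons above are simple functions of the share counts in each condition. The sketch below shows the arithmetic; the function name and the example counts are hypothetical and chosen only to make the calculation easy to follow, not taken from the study data.

```python
def treatment_effects(shares_feed, n_feed, shares_no_feed, n_no_feed):
    """Compare sharing probabilities between the feed and no feed conditions.

    Returns the difference p_feed - p_no_feed and the risk ratio
    p_feed / p_no_feed, computed from raw counts.
    """
    p_feed = shares_feed / n_feed
    p_no_feed = shares_no_feed / n_no_feed
    return {
        "p_feed": p_feed,
        "p_no_feed": p_no_feed,
        "difference": p_feed - p_no_feed,
        "risk_ratio": p_feed / p_no_feed,
    }


if __name__ == "__main__":
    # Hypothetical counts: 730 shares in 100,000 feed trials,
    # 100 shares in 100,000 no feed trials.
    print(treatment_effects(730, 100_000, 100, 100_000))
```

The difference is an absolute change in probability, while the risk ratio is multiplicative, so a very small absolute effect can coexist with a large ratio when the baseline probability is tiny.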

slide-23
SLIDE 23

How does sharing increase with exposure?

[Figure: probability of sharing (y-axis, 0.000 to 0.025) versus number of sharing friends (x-axis, 1 to 6). Series: feed (influence on feed + external correlation) and no feed (external correlation)]

slide-24
SLIDE 24

How does sharing increase with exposure?

[Figure: p_{feed} − p_{no feed} (y-axis, 0.000 to 0.030) versus number of sharing friends (x-axis, 1 to 6)]

slide-25
SLIDE 25

Study 1: Recap

▪ Experiments are necessary to disentangle influence from other factors
▪ Significant temporal clustering exists even for unexposed users
▪ Probability of sharing increases with number of friends
  ▪ Even if you don’t see those friends!
▪ Influence appears stronger when more friends are shown

slide-26
SLIDE 26

Study 2: Effect of Social Cues on Sharing Decisions

Chapter IV, Information Diffusion and Social Influence in Online Networks (dissertation chapter)

slide-27
SLIDE 27

Motivation

▪ Social influence in information diffusion occurs via two stages
  ▪ 1. Exposure (study 1)
  ▪ 2. Decision to share
▪ Trend in previous experiment is not causal
▪ Need a way to experimentally manipulate the number of social signals received by the user

[Figure: p_{feed} − p_{no feed} (y-axis, 0.000 to 0.030) versus number of sharing friends (x-axis, 1 to 6), repeated from Study 1]

slide-28
SLIDE 28

Study 2: Outline

▪ Field experiment tests how the number of friends shown (social cues) increases sharing via randomization of cues
▪ Answers causal questions about influence:
  ▪ How does seeing a certain number of peers affect information diffusion?
  ▪ How is tie strength predictive of user activity?
  ▪ Are strong ties more influential?

slide-35
SLIDE 35

Experimental Design

▪ Subjects: Users that arrive at pages independent of Facebook
  ▪ Are or would have been assigned to the no feed condition in Study 1
  ▪ Not arriving via Facebook
▪ Assignment procedure: randomly assign (viewer, URL) to a number of cues
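The assignment step above can be illustrated with a short Python sketch. The slides do not specify the randomization distribution or the maximum number of cues, so both the uniform draw and the `max_cues=3` cap are assumptions for illustration, as is the function name. The only constraint taken as given is that a viewer cannot be shown more friends than actually liked the page.

```python
import random


def assign_cues(n_liking_friends, rng, max_cues=3):
    """Randomly choose how many friends (cues) to show for one (viewer, URL) trial.

    Assumption: the number shown is drawn uniformly from
    0 .. min(n_liking_friends, max_cues), inclusive.
    """
    upper = min(n_liking_friends, max_cues)
    return rng.randint(0, upper)  # randint includes both endpoints


if __name__ == "__main__":
    rng = random.Random(42)
    for k in (1, 2, 3, 5):
        shown = [assign_cues(k, rng) for _ in range(10)]
        print(f"{k} liking friends -> cues shown in 10 trials: {shown}")
```

Because the number of cues shown is randomized independently of who the viewer is, differences in sharing across cue counts (for a fixed number of liking friends) can be read causally, unlike the observational trend from Study 1.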

slide-36
SLIDE 36

Data

▪ Same 7 week period as Study 1
▪ 1,891,768 randomized trials (unique subject-page pairs) consisting of:
  ▪ 1,156,608 unique subjects
  ▪ 470,089 distinct web pages
▪ Record demographic features, tie strength measures between subjects and their alters for each impression and click event

slide-37
SLIDE 37

Social Correlation

[Figure: probability of sharing (y-axis, 0.00 to 0.08) versus number of sharing friends k (x-axis, 1 to 4). Annotation: Not causal!]

▪ Probability when number of friends liking = number shown

slide-38
SLIDE 38

[Figure: probability of sharing (y-axis, 0.00 to 0.08) versus number of friends shown, o (x-axis, 0 to 3), in three panels by number of actual liking friends (1, 2, 3). Annotations: observed effect; baseline: homophily + heterogeneity (zero friends shown)]

slide-39
SLIDE 39

Predictive Power of Strong Ties

▪ Consider cases where 1 friend liked a page

[Figure: probability of sharing (y-axis, 0.020 to 0.028) versus tie strength (x-axis, 5 to 15). Series: cue not shown, cue shown]

slide-40
SLIDE 40

Cues Matter More for Strong Ties

[Figure: relative risk ratio (y-axis, 1.14 to 1.18) versus tie strength (x-axis, 5 to 15)]

slide-41
SLIDE 41

Study 2: Recap

▪ Experiments are necessary to understand the effect of cues on user behavior
▪ We introduce the cue-response function, which shows how the number of social signals received influences user behavior
▪ Tie strength is predictive of sharing decisions
▪ Strong ties appear more influential
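As a rough sketch of what estimating a cue-response function could look like, the Python below groups trials by the number of friends who actually liked a page and the number of friends shown, then computes the share rate in each cell. The record layout `(n_liking, n_shown, shared)` and the function name are hypothetical, and this is not the estimator used in the study; it only illustrates the idea of comparing share rates across randomized cue counts while holding the number of liking friends fixed.

```python
from collections import defaultdict


def cue_response(trials):
    """Estimate P(share | n_liking, n_shown) from randomized trials.

    trials: iterable of (n_liking, n_shown, shared) tuples, where `shared`
            is True if the viewer shared the page.
    Returns a dict mapping (n_liking, n_shown) to the observed share rate.
    """
    counts = defaultdict(lambda: [0, 0])  # cell -> [shares, trials]
    for n_liking, n_shown, shared in trials:
        cell = counts[(n_liking, n_shown)]
        cell[0] += int(shared)
        cell[1] += 1
    return {key: shares / n for key, (shares, n) in counts.items()}


if __name__ == "__main__":
    # Tiny made-up data set for illustration only.
    toy_trials = [
        (1, 0, False), (1, 0, True),
        (1, 1, True), (1, 1, True),
    ]
    rates = cue_response(toy_trials)
    print(rates[(1, 0)], rates[(1, 1)])  # 0.5 1.0
```

Within a fixed number of liking friends, the cell with zero friends shown serves as the baseline (homophily plus heterogeneity), and the change relative to that cell is the part attributable to the cues.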

slide-42
SLIDE 42

Implications

▪ Correlated activity cannot be attributed to influence alone!
▪ Viral marketing and identifying “influencers”
  ▪ If it weren’t for the influential, would users still acquire the information?
  ▪ Do probability increases justify targeting clusters of individuals?
▪ Relevance
  ▪ Social data (tie strength, number of friends) allows us to identify relevant content
▪ Integrating social into Web products
  ▪ Social cues increase engagement rates

slide-43
SLIDE 43

Upcoming Work

▪ Similar to Study 2, but with ads: Social Influence in Social Advertising: Evidence from Field Experiments. Eytan Bakshy, Dean Eckles, Rong Yan, Itamar Rosenn. EC 2012.

[Figure: normalized click rate (y-axis, 1.00 to 1.30) versus number of associated friends (x-axis, 1 to 6); 1 friend shown]

slide-44
SLIDE 44

Future Work

▪ Information diffusion:
  ▪ Constructed observational study
  ▪ Effect of cues during exposure via feed
▪ Social cues: Individual differences in persuasion
  ▪ Do certain users respond differently to social cues?
  ▪ Simple or complex contagion?
  ▪ Are strong ties actually more influential?

slide-45
SLIDE 45

Thanks!

▪ Collaborators: Lada Adamic, Dean Eckles, Cameron Marlow, Itamar Rosenn
▪ More to come on Thursday!
▪ Find out more about Data Science at Facebook here:
  ▪ http://www.facebook.com/data
▪ Questions?