SLIDE 1

Global Diffusion via Cascading Invitations: Structure, Growth, and Homophily

Ashton Anderson (Stanford), Daniel Huttenlocher (Cornell), Jon Kleinberg (Cornell), Jure Leskovec (Stanford), Mitul Tiwari (LinkedIn)

Friday, May 1, 15

SLIDE 2

growth via cascading signups

  • many successful websites grow by their members inviting non-members to join
  • e.g., Gmail, Facebook, LinkedIn, etc.
  • billions of accounts, huge fraction of all web traffic

SLIDE 3

questions

  • what types of people transmit to what types of people?
  • how do cascades grow over time?
  • what’s the structure of this growth? (is it “viral”?)

SLIDE 4

guest invitations

  • LinkedIn: 332M members
  • significant fraction are warm signups
  • largest product diffusion event ever analyzed

SLIDE 5

guest invitations

we construct a graph as follows: there is an edge from u to v when u invites v and v accepts u’s invitation

SLIDE 6

guest invitations

these invitations link together and form cascades

SLIDE 7

guest invitations

  • every cold signup is the root of a signup cascade
  • all non-root nodes are warm signups
  • cascades are trees
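
The construction on slides 5 to 7 can be sketched in a few lines. This is only an illustration under stated assumptions: the input format (a list of accepted `(u, v)` invitation pairs plus a list of member ids) and the function names `build_cascades` and `cascade_members` are hypothetical, not taken from the talk.

```python
from collections import defaultdict


def build_cascades(accepted_invites, members):
    """Link accepted invitations into cascade trees.

    accepted_invites: iterable of (u, v) pairs, meaning u invited v and v accepted.
    members: iterable of all member ids.
    Returns (parent, roots, children).
    """
    parent = {}
    children = defaultdict(list)
    for u, v in accepted_invites:
        # Assumption: each warm signup has exactly one accepted inviter,
        # which is what makes every cascade a tree.
        if v in parent:
            raise ValueError(f"{v!r} has more than one accepted inviter")
        parent[v] = u
        children[u].append(v)
    # Cold signups are members with no inviter; each one roots a cascade.
    roots = [m for m in members if m not in parent]
    return parent, roots, children


def cascade_members(root, children):
    """Return all members of the cascade rooted at `root` (iterative DFS)."""
    stack, seen = [root], []
    while stack:
        node = stack.pop()
        seen.append(node)
        stack.extend(children.get(node, []))
    return seen
```

A member who neither invited anyone nor was invited becomes a cascade of size one, consistent with "every cold signup is the root of a signup cascade".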

SLIDE 8

guest invitations

[figure: cascade laid out along a time axis]

SLIDE 9

global diffusion via cascading invitations

  • 1. structure
  • 2. growth
  • 3. homophily

SLIDE 10

cascade structure

  • prior work found little evidence of real multi-step, person-to-person diffusion
  • vast majority of “diffusion” cascades:

SLIDE 11

global diffusion via cascading invitations

  • 1. structure
  • 2. growth
  • 3. homophily

SLIDE 12

cascade structure

  • is there evidence of “viral transmission” on LI?
  • one way to quantify: how many of the adopters are far from the root?
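
One way to make "how many of the adopters are far from the root" concrete is to compute each node's depth and report the share of adopters at or beyond a threshold. The sketch below assumes a hypothetical `parent` dictionary (invitee to inviter) and counts only non-root nodes as adopters; the talk may define the denominator differently.

```python
def node_depths(parent):
    """Map each node to its distance from the root of its cascade.

    parent: dict mapping invitee -> inviter; roots never appear as keys.
    """
    depth = {}
    for start in parent:
        path = []
        node = start
        # Walk up until we hit a root or a node whose depth is already known.
        while node in parent and node not in depth:
            path.append(node)
            node = parent[node]
        if node not in depth:  # reached a root
            depth[node] = 0
        d = depth[node]
        # Fill in depths on the way back down.
        for n in reversed(path):
            d += 1
            depth[n] = d
    return depth


def fraction_at_least(parent, min_depth):
    """Fraction of adopters (non-root nodes) at depth >= min_depth."""
    adopters = list(parent)
    if not adopters:
        return 0.0
    depth = node_depths(parent)
    return sum(depth[v] >= min_depth for v in adopters) / len(adopters)
```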

SLIDE 13

cascade structure

adoptions are much deeper on LI than in previous datasets

SLIDE 14

cascade structure

another measure: what fraction of adoptions are accounted for in large/deep cascades?

SLIDE 15

cascade structure

  • another measure: what fraction of adoptions are accounted for in large/deep cascades?
  • so much more viral transmission that we’re observing qualitatively different behavior

SLIDE 16

cascade structure

  • structural virality (SV) of a cascade: rigorous measure to interpolate between broadcast and viral diffusion
  • [figure: broadcast (low SV) vs. viral (high SV)]
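
The slide does not spell out the formula. A common definition of structural virality is the mean shortest-path distance over all pairs of nodes in the cascade tree, and the sketch below assumes that definition. The function name, the `children` mapping, and the convention of returning 0.0 for a single-node cascade are all assumptions made for illustration.

```python
def structural_virality(children, root):
    """Mean pairwise distance between nodes of the tree rooted at `root`.

    children: dict mapping node -> list of child nodes.
    Uses the identity: sum of all pairwise distances in a tree equals the
    sum over edges of s * (n - s), where s is the size of the subtree
    hanging below that edge.
    """
    # Preorder traversal: every node appears after its ancestors.
    order = []
    stack = [root]
    while stack:
        node = stack.pop()
        order.append(node)
        stack.extend(children.get(node, []))

    n = len(order)
    if n < 2:
        return 0.0  # convention chosen here; the measure needs at least 2 nodes

    size = {}
    total_distance = 0
    for node in reversed(order):  # descendants are processed before ancestors
        size[node] = 1 + sum(size[c] for c in children.get(node, []))
        if node != root:
            total_distance += size[node] * (n - size[node])

    return total_distance / (n * (n - 1) / 2)
```

A star (pure broadcast) on five nodes gives 1.6, while a path (pure chain) on five nodes gives 2.0; the gap widens as cascades get larger.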

SLIDE 17

cascade structure

  • important question: what’s the relationship between cascade size and structural virality?
  • if the correlation is strongly negative or positive, knowing cascade size tells you the mechanism by which it grew
  • if it is close to 0, cascades grow in structurally different ways
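
The size-versus-virality relationship can be summarized with a correlation coefficient. The slides do not say which coefficient is used or whether sizes are transformed first, so the sketch below simply assumes a plain Pearson correlation over hypothetical per-cascade lists of sizes and structural virality values.

```python
import math


def pearson(xs, ys):
    """Pearson correlation between two equal-length sequences of numbers."""
    n = len(xs)
    if n != len(ys) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_x * var_y)
```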

SLIDE 18

cascade structure

  • prior work: Twitter information cascades
  • correlations range from 0.0 to 0.2

SLIDE 19

cascade structure

  • our work: LinkedIn signup cascades
  • strikingly high correlation: 0.72!

SLIDE 20

cascade structure

  • LinkedIn signup cascades are qualitatively different from previously studied online diffusion datasets
  • direct evidence of a large-scale, multi-step diffusion process
  • ...in contrast with previous work

SLIDE 21

global diffusion via cascading invitations

  • 1. structure
  • 2. growth
  • 3. homophily

SLIDE 22

growth dynamics

  • information cascades grow and flame out very quickly (think news, etc.)
  • what timescales do LI cascades operate over?

SLIDE 23

growth dynamics

  • time gap between inviter and invitee signups
  • months and years, not hours!

SLIDE 24

growth dynamics

  • invites accepted quickly
  • invites sent later
  • LI cascades are extremely persistent

SLIDE 25

growth dynamics

  • information cascades grow quickly then stagnate
  • LI cascades are much more persistent: what is the growth trajectory of an LI cascade?

SLIDE 26

growth dynamics

  • tree growth over time for 1K biggest trees
  • surprisingly linear!

SLIDE 27

growth dynamics

  • LI signup cascades accrue members at a steady, persistent, constant rate
  • not the “burn through the network” picture of information diffusion

SLIDE 28

global diffusion via cascading invitations

  • 1. structure
  • 2. growth
  • 3. homophily

SLIDE 29

homophily

  • homophily: the tendency for people to associate with others like themselves (“birds of a feather flock together”)
  • extremely rich user-level data: we can now see how diffusion relates to underlying node attributes

SLIDE 30

homophily

  • we consider all cascades with >= 100 nodes (n > 100K of them)
  • every cascade defines a set of members
  • look at distributions of attributes in individual cascades

SLIDE 31

homophily

  • within-similarity: probability that two randomly chosen nodes from the same group match on an attribute
  • between-similarity: probability that a randomly drawn node from group 1 matches on the attribute with a randomly drawn node from group 2
  • the difference between the two is a measure of homophily
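
Both similarities reduce to counting attribute values. The sketch below is one plausible reading, with assumptions labeled: within-similarity draws two distinct nodes from one group (sampling without replacement), between-similarity draws one node from each of two groups, and each group is passed as a plain list of attribute values. The function names are hypothetical.

```python
from collections import Counter


def within_similarity(values):
    """P(two distinct randomly chosen nodes in one group share the attribute)."""
    n = len(values)
    if n < 2:
        raise ValueError("need at least two nodes")
    counts = Counter(values)
    # Ordered pairs of distinct nodes with equal values, over all ordered pairs.
    return sum(c * (c - 1) for c in counts.values()) / (n * (n - 1))


def between_similarity(values_1, values_2):
    """P(a random node from group 1 matches a random node from group 2)."""
    if not values_1 or not values_2:
        raise ValueError("both groups must be non-empty")
    counts_1, counts_2 = Counter(values_1), Counter(values_2)
    # Counter returns 0 for values missing from group 2.
    matches = sum(counts_1[a] * counts_2[a] for a in counts_1)
    return matches / (len(values_1) * len(values_2))
```

Subtracting `between_similarity` from `within_similarity` then gives the homophily measure described on the slide.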

SLIDE 32

homophily

SLIDE 33

homophily

  • extreme homophily on geography
  • significant homophily on industry
  • minimal homophily on engagement, max seniority level, and age

SLIDE 34

homophily

SLIDE 35

homophily

  • clearly, there is strong homophily on country
  • but does this cascade homophily follow from the obvious edge homophily?

SLIDE 36

homophily

model edge homophily with a first-order Markov chain

SLIDE 37

homophily

model edge homophily with a first-order Markov chain

empirically derived transition matrix:

        BR    CA    FR    IN    US
  BR  0.85  0.01  0.01  0.02  0.11
  CA  0.03  0.60  0.06  0.06  0.25
  FR  0.02  0.10  0.65  0.03  0.20
  IN  0.03  0.02  0.01  0.82  0.12
  US  0.05  0.02  0.01  0.05  0.87
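
The slide says only that the matrix is empirically derived. One natural way to obtain such a matrix, assumed here, is to count inviter-country to invitee-country pairs over all invitation edges and normalize each row to sum to 1. The names `estimate_transitions`, `edges`, and `attribute` are hypothetical.

```python
from collections import Counter, defaultdict


def estimate_transitions(edges, attribute):
    """Row-normalized counts of attribute transitions along invitation edges.

    edges: iterable of (inviter, invitee) pairs.
    attribute: dict mapping member -> attribute value (e.g. country code).
    Returns {inviter_value: {invitee_value: probability}}.
    """
    counts = defaultdict(Counter)
    for u, v in edges:
        counts[attribute[u]][attribute[v]] += 1
    return {
        a: {b: c / sum(row.values()) for b, c in row.items()}
        for a, row in counts.items()
    }
```

Large diagonal entries in the result correspond to edge homophily: inviters mostly invite people with the same attribute value.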

SLIDE 38

homophily

model edge homophily with a first-order Markov chain

edge homophily (transition matrix as on slide 37)

SLIDE 39

homophily

simulate signup diffusion with first-order Markov chain

[figure: example cascade tree with country labels US, US, US, CA, IN, US, IN, US, US, CA]

SLIDE 40

homophily

simulate signup diffusion with first-order Markov chain

[figure: root node labeled US]

SLIDE 41

homophily

simulate signup diffusion with first-order Markov chain

[figure: root node labeled US, shown with the transition matrix from slide 37]

SLIDE 42

homophily

simulate signup diffusion with first-order Markov chain

[figure: root node labeled US, shown with the transition matrix from slide 37]

SLIDE 43

homophily

simulate signup diffusion with first-order Markov chain

[figure: partial cascade tree with country labels US, US, BR, shown with the transition matrix from slide 37]

SLIDE 44

homophily

simulate signup diffusion with first-order Markov chain

[figure: partial cascade tree with country labels US, US, BR, shown with the transition matrix from slide 37]

SLIDE 45

homophily

simulate signup diffusion with first-order Markov chain

[figure: partial cascade tree with country labels US, US, US, BR, US, BR, shown with the transition matrix from slide 37]

SLIDE 46

homophily

simulate signup diffusion with first-order Markov chain

[figure: cascade tree with country labels US, US, US, BR, US, BR, US, US, IN, BR, shown with the transition matrix from slide 37]

SLIDE 47

homophily

  • keep all cascade structures the same
  • run this first-order Markov chain process to generate simulated attribute distributions
  • compute within-similarity as before
  • if the distribution over similarities is similar, then cascade homophily follows from edge homophily
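
The relabeling step of this experiment can be sketched as follows: the tree structure is held fixed, the root keeps a given attribute value, and every other node draws its value from the transition row of its parent. Starting from the real root's value is an assumption, as are the function name and the nested-dictionary format for the transition matrix.

```python
import random


def simulate_first_order(root, children, root_value, transitions, rng=None):
    """Relabel a fixed cascade tree using a first-order Markov chain.

    children: dict mapping node -> list of child nodes (structure is kept as is).
    transitions: {parent_value: {child_value: probability}}.
    Returns a dict mapping node -> simulated attribute value.
    """
    rng = rng or random.Random()
    labels = {root: root_value}
    stack = [root]
    while stack:
        node = stack.pop()
        kids = children.get(node, [])
        if not kids:
            continue
        # Each child's value depends only on its parent's value.
        row = transitions[labels[node]]
        states = list(row)
        weights = [row[s] for s in states]
        for child in kids:
            labels[child] = rng.choices(states, weights=weights)[0]
            stack.append(child)
    return labels
```

Running this over every cascade and recomputing within-similarity on the simulated labels gives the baseline that the observed values are compared against.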

SLIDE 48

homophily

Markov-generated similarities much lower than observed values!

SLIDE 49

homophily

  • this reveals a deep fact: LI signup cascades are not arbitrary sets of members
  • cascade homophily exists above and beyond the already-high edge homophily, which means that there is higher-order structure in the cascades

SLIDE 50

homophily

  • repeat the same experiment with a second-order Markov chain
  • instead of considering just the parent, consider grandparent and parent
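
A second-order version conditions each node's value on the pair (grandparent value, parent value). The sketch below assumes that children of the root, which have no grandparent, fall back to the first-order rows; the talk does not say how this boundary case is handled. All names and the dictionary formats are hypothetical.

```python
import random


def simulate_second_order(root, children, root_value,
                          first_order, second_order, rng=None):
    """Relabel a fixed cascade tree using a second-order Markov chain.

    first_order: {parent_value: {child_value: probability}}, used only for
        the root's children (assumed fallback).
    second_order: {(grandparent_value, parent_value): {child_value: probability}}.
    """
    rng = rng or random.Random()
    labels = {root: root_value}
    # Each stack entry is (node, value of that node's parent or None for the root).
    stack = [(root, None)]
    while stack:
        node, parent_value = stack.pop()
        kids = children.get(node, [])
        if not kids:
            continue
        if parent_value is None:
            row = first_order[labels[node]]
        else:
            # For the children of `node`: grandparent = node's parent, parent = node.
            row = second_order[(parent_value, labels[node])]
        states = list(row)
        weights = [row[s] for s in states]
        for child in kids:
            labels[child] = rng.choices(states, weights=weights)[0]
            stack.append((child, labels[node]))
    return labels
```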

SLIDE 51

homophily

“second-order effects” very large here

SLIDE 52

homophily

  • how long-range is the dependence?
  • root-guessing experiment borrowed from genetics
  • given node attributes at depth d, does the plurality attribute match the root attribute?
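
The root-guessing test for a single cascade and a single depth can be sketched as below. Tie handling is not specified on the slide; this version counts a tie as a match whenever the root's value is among the most frequent values at that depth, which is an assumption. The function name and inputs are hypothetical.

```python
from collections import Counter


def plurality_matches_root(root, children, labels, depth):
    """Does the most common attribute at `depth` equal the root's attribute?

    Returns True or False, or None if the cascade has no nodes at that depth.
    """
    level = [root]
    for _ in range(depth):
        # Move one generation down the tree.
        level = [c for n in level for c in children.get(n, [])]
        if not level:
            return None
    counts = Counter(labels[n] for n in level)
    best = max(counts.values())
    # Assumption: ties count as a match if the root's value is among the leaders.
    return counts[labels[root]] == best
```

Averaging the result over many cascades for each depth d, on real labels and on first-order and second-order simulated labels, shows how quickly the root's influence decays under each model.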

SLIDE 53

homophily

[figure: example cascade tree with country labels US, US, US, BR, US, BR, US, US, IN, BR, US, BR]

SLIDE 54

homophily

[figure: example cascade tree with country labels US, US, US, BR, US, BR, US, US, IN, BR, US, BR; labels shown alongside: US, US]

SLIDE 55

homophily

[figure: example cascade tree with country labels US, US, US, BR, US, BR, US, US, IN, BR, US, BR; labels shown alongside: US, US, US]

SLIDE 56

homophily

[figure: example cascade tree with country labels US, US, US, BR, US, BR, US, US, IN, BR, US, BR; labels shown alongside: US, US, BR, US]

SLIDE 57

homophily

run this experiment on:

  • real attributes
  • first-order Markov generated attributes
  • second-order Markov generated attributes

SLIDE 58

homophily

SLIDE 59

homophily

  • genetic processes are first-order by definition
  • higher-order dependencies in our setting are thus analogous to phenotypes, not genotypes
  • a member profile is like a social phenotype
  • what would a social genotype look like?

SLIDE 60

conclusion

  • LI cascades are much more structurally viral than previously studied diffusion datasets
  • they grow persistently over time
  • significant homophily patterns at cascade level, meaning cascades are coherent sets of members

SLIDE 61

thank you!

SLIDE 62

status effects

SLIDE 63

status effects