Romantic Partnerships and the Dispersion of Social Ties Lars - - PowerPoint PPT Presentation


SLIDE 1

Introduction Embeddedness and Dispersion Evaluation Combining Features

Romantic Partnerships and the Dispersion of Social Ties

Lars Backstrom Jon Kleinberg

presented by

Yehonatan Cohen

2014-11-12

Lars Backstrom, Jon Kleinberg Romantic Partnerships and the Dispersion of Social Ties

SLIDE 2

1 Introduction
    Problem Statement
    Dataset

2 Embeddedness and Dispersion
    Embeddedness
    Dispersion

3 Evaluation
    Take 1
    Take 2
    Time and Space

4 Combining Features
    Machine Learning
    Performance Over Time

SLIDE 5

Problem Statement

Consider a social network user, a, and a’s neighborhood. Let us also assume that a is married. Can we identify a’s spouse?

SLIDE 6

Problem Statement

Formally, our problem is defined as follows:

Spouse Detection: Let a be an ego Facebook node, and denote by $G_a$ the graph consisting of all of a’s friends and the links among them. Given that a has declared a relationship partner (’married’, ’engaged’ or ’in a relationship’), can we identify a’s spouse?

SLIDE 7

Motivation

Detecting such relationships is important for several reasons:
    Romantic relationships are a singular type of social tie that plays a powerful role in social processes over a person’s whole life course.
    They also form an important aspect of the everyday practices and uses of social media.
    They are among the very strongest ties, but it has not been clear whether standard structural features are sufficient to characterize them.

SLIDE 11

Facebook Semantics

Facebook is the most popular on-line social network. A user is represented by a node. Facebook’s friendship relation is undirected. An edge between two nodes represents a friendship between the corresponding users.

SLIDE 13

Datasets Description

Two datasets were used by the authors.

The first consists of the network neighborhoods of approximately 1.3 million Facebook users, selected uniformly at random from among users who meet all of the following criteria:
    Age at least 20.
    Between 50 and 2000 friends.
    A spouse or relationship partner listed in their profile.

SLIDE 14

Datasets Description

The second is a sample of approximately 73,000 neighborhoods from the first dataset selected uniformly at random from among all neighborhoods with at most 25,000 links.

SLIDE 15

Datasets Dimensions

The datasets contain 379 million nodes and 8.6 billion links overall, an average of 291 nodes and 6652 links per neighborhood.

SLIDE 17

Embeddedness

Embeddedness: Given an edge (u, v), its embeddedness is the number of mutual friends shared by its endpoints.

Traditionally, embeddedness is associated with tie strength, and it will be used as a baseline predictor.
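As a minimal sketch of the definition above, embeddedness can be computed as the size of the intersection of two adjacency sets. The adjacency-set representation, the function name `embeddedness` and the toy graph are assumptions made for illustration; they do not come from the original slides.

```python
def embeddedness(adj, u, v):
    """Number of mutual friends shared by the endpoints u and v."""
    return len(adj[u] & adj[v])


# Hypothetical undirected friendship graph, stored as adjacency sets.
adj = {
    "a": {"b", "c", "d"},
    "b": {"a", "c", "d"},
    "c": {"a", "b"},
    "d": {"a", "b"},
}

# a and b share the friends c and d; b and c share only a.
print(embeddedness(adj, "a", "b"))
print(embeddedness(adj, "b", "c"))
```

Because the friendship relation is undirected, the result is the same for (u, v) and (v, u).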

SLIDE 18

Embeddedness

What is the embeddedness of (b, c)?

SLIDE 20

Embeddedness

Can you determine the strongest tie in the network below?

SLIDE 22

Dispersion

Many individuals have large clusters of friends corresponding to well-defined foci of interaction in their lives:
    Co-workers.
    People with whom they attended college.
    Family members.
    Etc.

Since many people within these clusters know each other, the clusters contain links of very high embeddedness, even though they do not necessarily correspond to particularly strong ties.

SLIDE 23

Dispersion

In contrast, the links to a person’s relationship partner may have lower embeddedness, but they will often involve mutual neighbors from several different foci, reflecting the fact that the social orbits of these friends are not bounded within any one focus.

SLIDE 24

Dispersion

Consider the following example: a husband knows several of his wife’s co-workers, family members, and former classmates, even though these people belong to different foci and do not know each other.

SLIDE 25

Dispersion

The mutual neighbors of a married couple are not well-connected to one another.

SLIDE 28

Dispersion

Dispersion: We take the subgraph $G_u$ induced on u and all neighbors of u, and for a node v in $G_u$ we define $C_{uv}$ to be the set of common neighbors of u and v. Then

$$\mathrm{disp}(u, v) = \sum_{s,t \in C_{uv}} d_v(s, t)$$

where $d_v$ is a distance function on the nodes of $C_{uv}$. We do not consider the two-step paths through u and v themselves.
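The definition can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the authors' implementation: the graph is a hypothetical toy example stored as adjacency sets, the sum is taken over unordered pairs {s, t}, and $d_v$ is taken to be the indicator that equals 1 when s and t are not directly linked and have no common neighbors in $G_u$ other than u and v. The names `indicator_distance` and `dispersion` are made up for this sketch.

```python
from itertools import combinations


def indicator_distance(adj, u, v, s, t):
    """1 if s and t are unlinked and share no neighbor in G_u besides u and v."""
    if t in adj[s]:
        return 0
    g_u = adj[u] | {u}  # node set of the subgraph induced on u and its neighbors
    shared = (adj[s] & adj[t] & g_u) - {u, v}
    return 0 if shared else 1


def dispersion(adj, u, v, dist=indicator_distance):
    """Sum of dist over unordered pairs of common neighbors of u and v."""
    common = adj[u] & adj[v]
    return sum(dist(adj, u, v, s, t) for s, t in combinations(sorted(common), 2))


# Hypothetical neighborhood of u: p and q form one cluster, r is on its own.
adj = {
    "u": {"v", "p", "q", "r"},
    "v": {"u", "p", "q", "r"},
    "p": {"u", "v", "q"},
    "q": {"u", "v", "p"},
    "r": {"u", "v"},
}

print(dispersion(adj, "u", "v"))  # pairs (p, r) and (q, r) each contribute 1
print(dispersion(adj, "u", "p"))  # the only pair (q, v) is directly linked
```

Passing the distance function as a parameter keeps the sketch usable with other choices of $d_v$.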

SLIDE 32

Dispersion

The function $d_v$ need not be the standard graph-theoretic distance. Examples include indicator functions equal to 1 when:
    s and t are not directly linked and also have no common neighbors in $G_u$ other than u and v.
    The distance between s and t is greater than a pre-defined threshold T.
    There are fewer than T disjoint paths between s and t.

SLIDE 33

Dispersion

Let us practice computing dispersion under the assumption that $d_v(s, t)$ is equal to 1 when s and t are not directly linked and also have no common neighbors in $G_u$...

SLIDE 34

Dispersion

disp(u, b) = ?

SLIDE 37

Dispersion

disp(u, h) = ?

SLIDE 39

Strengthenings of Dispersion

We can detect a’s romantic partner based on the two functions (dispersion and embeddedness). It has been found empirically that performance is highest for functions that are monotonically increasing in dispersion and monotonically decreasing in embeddedness, e.g. $\mathrm{disp}(u, v)/\mathrm{emb}(u, v)$. Can you find the logic behind this finding?
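A ranking by the ratio $\mathrm{disp}(u, v)/\mathrm{emb}(u, v)$ might look like the sketch below. The toy graph, the function names, the alphabetical tie-breaking and the choice to score a neighbor with zero embeddedness as 0 are all assumptions made for this illustration. The distance used is the indicator that equals 1 when s and t are unlinked and share no neighbor in $G_u$ other than u and v.

```python
from itertools import combinations


def dispersion(adj, u, v):
    """Count unordered pairs of common neighbors that are 'far apart'."""
    common = adj[u] & adj[v]
    total = 0
    for s, t in combinations(sorted(common), 2):
        linked = t in adj[s]
        # Shared neighbors inside G_u, excluding u (never in adj[u]) and v.
        shared = (adj[s] & adj[t] & adj[u]) - {v}
        if not linked and not shared:
            total += 1
    return total


def normalized_dispersion(adj, u, v):
    emb = len(adj[u] & adj[v])
    # Assumption: a neighbor with no mutual friends gets score 0.
    return dispersion(adj, u, v) / emb if emb else 0.0


def rank_candidates(adj, u):
    """Neighbors of u, highest normalized dispersion first (ties: by name)."""
    return sorted(adj[u], key=lambda v: (-normalized_dispersion(adj, u, v), v))


# Hypothetical neighborhood of u.
adj = {
    "u": {"v", "p", "q", "r"},
    "v": {"u", "p", "q", "r"},
    "p": {"u", "v", "q"},
    "q": {"u", "v", "p"},
    "r": {"u", "v"},
}

print(rank_candidates(adj, "u"))
```

In this toy graph v has embeddedness 3 and dispersion 2, so it is ranked first; every other neighbor has dispersion 0.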

SLIDE 40

Dispersion

One can strengthen performance by applying the idea of dispersion recursively as follows: we initially define $X_v = 1$ for all neighbors v of u, and then iteratively update each $X_v$ to be:

Recursive Dispersion

$$X_v = \frac{\sum_{w \in C_{uv}} X_w^2 + 2 \sum_{s,t \in C_{uv}} d_v(s,t)\, X_s X_t}{\mathrm{emb}(u,v)}$$

The authors found that the highest performance is achieved when they rank nodes by the values of Xv after the third iteration.
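The recursive update can be sketched as follows. Several details are assumptions of this sketch rather than statements from the slides: all values are updated synchronously from the previous iteration, the sum over s, t runs over unordered pairs, a neighbor with zero embeddedness is assigned 0, and $d_v$ is the indicator that equals 1 when s and t are unlinked and share no neighbor in $G_u$ other than u and v. The function names and the toy graph are hypothetical.

```python
from itertools import combinations


def indicator_distance(adj, u, v, s, t):
    """1 if s and t are unlinked and share no neighbor in G_u besides u and v."""
    if t in adj[s]:
        return 0
    shared = (adj[s] & adj[t] & adj[u]) - {v}
    return 0 if shared else 1


def recursive_dispersion(adj, u, iterations=3):
    """Return X_v for every neighbor v of u after the given number of updates."""
    x = {v: 1.0 for v in adj[u]}  # initial values X_v = 1
    for _ in range(iterations):
        new_x = {}
        for v in adj[u]:
            common = adj[u] & adj[v]
            if not common:
                new_x[v] = 0.0  # assumption: emb(u, v) = 0 scores 0
                continue
            squares = sum(x[w] ** 2 for w in common)
            cross = sum(
                indicator_distance(adj, u, v, s, t) * x[s] * x[t]
                for s, t in combinations(sorted(common), 2)
            )
            new_x[v] = (squares + 2 * cross) / len(common)
        x = new_x  # synchronous update (assumption)
    return x


# Hypothetical neighborhood of u.
adj = {
    "u": {"v", "p", "q", "r"},
    "v": {"u", "p", "q", "r"},
    "p": {"u", "v", "q"},
    "q": {"u", "v", "p"},
    "r": {"u", "v"},
}

first = recursive_dispersion(adj, "u", iterations=1)
print(first["v"])  # (3 + 2 * 2) / 3
```

With all initial values equal to 1, the first update reduces to 1 + 2 · disp(u, v)/emb(u, v), which gives a quick sanity check for any implementation.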

SLIDE 42

Evaluation - Take 1

Accuracy is defined as the fraction of instances on which the highest-ranked node under the examined measure is in fact the partner.

Results can be compared to other standard measures:
    The number of photos in which u appears with v.
    The total number of times that u has viewed v’s profile page in the previous 90 days.
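The accuracy measure described above amounts to a top-1 hit rate. A minimal sketch follows, assuming each instance is given as a dictionary of candidate scores together with the declared partner; this data layout, the function name `top1_accuracy` and the example scores are hypothetical.

```python
def top1_accuracy(instances):
    """Fraction of instances whose highest-scoring candidate is the partner.

    Each instance is a pair (scores, partner), where scores maps every
    candidate neighbor to its value under the measure being evaluated.
    Ties are broken by dictionary order, which is an arbitrary choice.
    """
    hits = 0
    for scores, partner in instances:
        best = max(scores, key=scores.get)
        hits += best == partner
    return hits / len(instances)


# Two made-up instances: the first is ranked correctly, the second is not.
instances = [
    ({"v": 0.67, "p": 0.0, "q": 0.0}, "v"),
    ({"x": 1.0, "y": 2.5}, "x"),
]

print(top1_accuracy(instances))
```

The same function can score any measure (embeddedness, dispersion, photo counts, profile views) as long as it is expressed as one number per candidate.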

SLIDE 45

Evaluation - Take 1

When the user v who scores highest under one of these measures is not the partner of u, what role does v play among u’s network neighbors? v is often a family member of u.

SLIDE 48

Evaluation - Take 2

Let us consider dispersion measures based on other definitions of $d_v$:
    $d_v(s, t) = 1$ when s and t are at least r hops apart in $G_u \setminus \{u, v\}$, and $d_v(s, t) = 0$ otherwise.
    $d_v(s, t) = 1$ if s and t belong to different connected components of $G_u \setminus \{u, v\}$, and $d_v(s, t) = 0$ otherwise.
    $d_v(s, t) = 1$ if and only if s and t belong to different communities based on the Louvain method.
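The first two definitions can both be built on a breadth-first search restricted to $G_u \setminus \{u, v\}$, as in the sketch below. The helper names, the default r = 3, the toy graph and the decision to treat unreachable pairs as "at least r hops apart" are assumptions of this illustration. The Louvain-based variant is not shown, since it needs a community detection routine.

```python
from collections import deque


def hops_without(adj, u, v, s, t):
    """Shortest-path length from s to t in G_u minus {u, v}; None if unreachable."""
    allowed = adj[u] - {v}  # nodes of G_u other than u and v
    seen = {s: 0}
    queue = deque([s])
    while queue:
        node = queue.popleft()
        if node == t:
            return seen[node]
        for nxt in adj[node] & allowed:
            if nxt not in seen:
                seen[nxt] = seen[node] + 1
                queue.append(nxt)
    return None


def dist_at_least_r(adj, u, v, s, t, r=3):
    """1 when s and t are at least r hops apart (unreachable counts as far)."""
    d = hops_without(adj, u, v, s, t)
    return 1 if d is None or d >= r else 0


def dist_different_component(adj, u, v, s, t):
    """1 when s and t lie in different connected components."""
    return 1 if hops_without(adj, u, v, s, t) is None else 0


# Hypothetical neighborhood: chain p - q - w - r, and z linked only to u and v.
adj = {
    "u": {"v", "p", "q", "r", "w", "z"},
    "v": {"u", "p", "q", "r", "z"},
    "p": {"u", "v", "q"},
    "q": {"u", "v", "p", "w"},
    "r": {"u", "v", "w"},
    "w": {"u", "q", "r"},
    "z": {"u", "v"},
}

print(hops_without(adj, "u", "v", "p", "r"))
print(dist_at_least_r(adj, "u", "v", "q", "r", r=3))
print(dist_different_component(adj, "u", "v", "p", "z"))
```

Here p and r are three hops apart once u and v are removed, q and r are two hops apart, and z is cut off from everyone else.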

SLIDE 50

Evaluation - Time and Space

An important source of variation among users is in the size of their network neighborhoods and the amount of time since they joined Facebook.

Time effects:
    The neighborhood’s complexity.
    The extent to which the network reflects the user’s off-line relationships.

SLIDE 52

Evaluation - Time and Space

Why does the performance of the interaction features increase as a function of neighborhood size?

SLIDE 54

Machine Learning

Why focus on one aspect of the user’s neighborhood? Let’s combine information!

Structural features:
    Absolute and normalized dispersion based on six distinct distance functions.
    Recursive versions using iterations 2 through 7.

Interaction features:
    Number of common photos.
    Number of profile views over the last 30, 60 and 90 days.
    Number of messages sent.
    Etc.

SLIDE 55

Machine Learning

The combination of structural and interaction features improves the accuracy.

SLIDE 56

Machine Learning

Recall that our focus so far has been on partner detection, given that the user is in a romantic relationship. Can we also determine whether or not a user is single?
    Use demographic features (age, gender and country).
    Include structural features.
    Train a classifier.

SLIDE 57

Time Dependency

Notice the decrease of profile viewing...

SLIDE 58

Time Dependency

Say a’s romantic partner is b. Is the dispersion of their link correlated with the probability that they transition to ’single’ status over a 60-day period?

SLIDE 59

Time Dependency

Relationships on which recursive dispersion fails to correctly identify the partner are significantly more likely to transition to ’single’ status over a 60-day period.

SLIDE 62

Conclusions

Understanding the structural roles of significant people in on-line social network neighborhoods is a broad question that requires a combination of different approaches.

Dispersion provides a powerful method for recognizing the structural positions occupied by romantic partners from network data alone.

Romantic relations connect us to people who belong to multiple parts of our social neighborhood, producing a set of shared friends that is not simply large but also diverse.
