Quantifying Social Influence in Epinions



slide-1
SLIDE 1

Quantifying Social Influence in Epinions

Akshay Patil, Golnaz Ghasemiesfeh, Roozbeh Ebrahimi & Jie Gao

slide-2
SLIDE 2

Introduction

Social Network:

  • Structure made up of entities & their relationships.
  • e.g., Facebook, G+, Y!, etc.

Content Generation Websites:

  • e.g., Wikipedia, YouTube, Epinions, Instagram, etc.

Explosion in “online social activity” + “content generation”:

  • Large-scale data availability.
  • Quantitative research into dynamics.

Overlay of “content generation” + “social structure”:

  • Study the mutual influence of content and social structure on each other.

slide-3
SLIDE 3

Epinions: A Consumer Review Site

Epinions.com incorporates a social structure into its rating system:

  • Rating system
      • Users write reviews.
      • Users rate other users’ reviews on a 1-5 scale (1: bad -> 5: excellent).
  • Social structure
      • Trust other users to form a “web of trust” (public).
      • Distrust other users to form a “block list” (private).
  • We are interested in the interplay of “rating system data” and “social structure data”.

slide-4
SLIDE 4

Epinions Dataset

Statistic          Value
#Users             131,828
#Reviews           1,197,816
#Trust Edges       717,667 (85%)
#Distrust Edges    123,705 (15%)
#Ratings           12,943,546
Time Range         Jan’01 to Aug’03

Ratings (chart):

Rating    Share of ratings
1         0.01%
2         2.13%
3         4.63%
4         14.30%
5         78.93%

slide-5
SLIDE 5

1. Relationship Formation

Scenario:

  • User A has a couple of trustees (his “web of trust”, or “friends”).
  • A’s friends have a trust/distrust relationship with user B.

Classification:

  • If the majority of A’s friends trust B, we say they collectively trust B.
  • If the majority of A’s friends distrust B, we say they collectively distrust B.
  • Otherwise they are neutral or in disagreement.

[Diagram: user A, A’s friends 1, 2, …, n, and user B]
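The majority rule above can be sketched in a few lines. This is only an illustration: the function name, the string encoding of opinions, and the choice to take the majority over friends who actually have an edge to B are assumptions, not details given on the slide.

```python
def collective_opinion(friend_opinions):
    """Classify the collective opinion of A's friends about B.

    friend_opinions: one entry per friend of A, either "trust", "distrust",
    or None when that friend has no edge to B (hypothetical encoding).
    Assumption: the majority is taken over friends that have an edge to B.
    """
    trust = sum(1 for o in friend_opinions if o == "trust")
    distrust = sum(1 for o in friend_opinions if o == "distrust")
    opinionated = trust + distrust
    if opinionated and trust > opinionated / 2:
        return "collectively trust"
    if opinionated and distrust > opinionated / 2:
        return "collectively distrust"
    # Ties and friends without any opinion fall into the third category.
    return "neutral"
```

A tie such as one trusting and one distrusting friend falls into the "neutral or in disagreement" case.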

slide-6
SLIDE 6

1. Relationship Formation

Question 1: Is there a correlation between the collective opinion of A’s friends about B and A’s future relationship with B?

  • If A’s friends collectively trust B, is A more likely to trust B as well?
  • If A’s friends collectively distrust B, is A more likely to distrust B as well?

[Diagram: user A, A’s friends 1, 2, …, n, and user B]

slide-7
SLIDE 7

1. Relationship Formation

Raw Observations:

[Bar chart, axis from 80% to 100%: Collectively Trust 95.09%, Collectively Distrust 85.93%, Opinionated 93.85%]

How meaningful (statistically significant) are these results?

slide-8
SLIDE 8

1. Relationship Formation, Random Shuffle

Measure over/under-representation compared to mere chance (approach by Leskovec et al.’10):

  • Randomly shuffle trust/distrust edges, while maintaining the same percentages.
  • Redo the “relationship formation” analysis.
  • Compute the “Surprise” value: $s = \frac{Q - E[Q]}{\sqrt{E[Q]\,(1 - p_0)}}$
  • Q: actual quantity of a scenario, E[Q]: expected quantity under shuffling, $p_0$: a priori probability of the scenario.
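A minimal sketch of the shuffle and the surprise computation, assuming edges are stored as (source, target, sign) tuples and that surprise takes the standard-deviation form (Q − E[Q]) / sqrt(E[Q](1 − p0)). The function names and the edge encoding are hypothetical.

```python
import math
import random

def shuffle_signs(edges, seed=None):
    """Return the same edges with their signs randomly permuted.

    edges: list of (source, target, sign), sign +1 (trust) or -1 (distrust).
    Permuting the signs keeps the trust/distrust percentages unchanged.
    """
    rng = random.Random(seed)
    signs = [sign for _, _, sign in edges]
    rng.shuffle(signs)
    return [(u, v, s) for (u, v, _), s in zip(edges, signs)]

def surprise(q, expected_q, p0):
    """Standard deviations by which q differs from expected_q.

    Assumes expected_q > 0 and 0 <= p0 < 1.
    """
    return (q - expected_q) / math.sqrt(expected_q * (1.0 - p0))
```

In use, the analysis would be rerun on the shuffled edges (ideally several times) to estimate E[Q] before calling `surprise`.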

slide-9
SLIDE 9

1. Relationship Formation, Random Shuffle

Surprise is the number of standard deviations by which the actual quantity differs from the expected number under the random shuffling model.

  • s > 0 -> overrepresentation
  • s < 0 -> underrepresentation
  • s = 6 -> p-value ≈ 10^-8
  • A value of s > 6 results in excellent statistical significance.
slide-10
SLIDE 10

1. Relationship Formation, Random Shuffle

“Agreeing with Friends”: surprise values in excess of 70! Strong correlation between a user’s friends’ opinions and the formation of his future relationships (distrust is hidden).

[Bar chart: Surprise (axis from -120 to 90) for Agreeing with Friends, Disagreeing with Friends, Neutral Friends]

slide-11
SLIDE 11

1. Relationship Formation: Linking Habits

The dataset exhibits users with very different linking habits:

  • Some are very trustworthy/trustful compared to others.

Analysis should not overlook:

  • The quality of the reviews (trustworthiness) written by the user.
  • The degree of trustfulness of the person creating the link.

Looking through the lens of linking habits (Leskovec et al.’10):

Receptive Baseline (trustworthiness):

  • Fraction of a user’s received links that are trust links.

Generative Baseline (trustfulness):

  • Fraction of a user’s given links that are trust links.
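Both baselines can be computed in one pass over the signed edges. The sketch below assumes a list of (source, target, sign) tuples with +1 for trust and -1 for distrust; the function name and data layout are illustrative only.

```python
from collections import defaultdict

def linking_baselines(edges):
    """Return (generative, receptive) baseline dictionaries keyed by user.

    generative[u]: fraction of u's outgoing links that are trust links.
    receptive[v]:  fraction of v's incoming links that are trust links.
    """
    given = defaultdict(lambda: [0, 0])      # user -> [trust given, links given]
    received = defaultdict(lambda: [0, 0])   # user -> [trust received, links received]
    for source, target, sign in edges:
        given[source][1] += 1
        received[target][1] += 1
        if sign > 0:
            given[source][0] += 1
            received[target][0] += 1
    generative = {u: t / n for u, (t, n) in given.items()}
    receptive = {v: t / n for v, (t, n) in received.items()}
    return generative, receptive
```

Users with no outgoing (or incoming) links simply do not appear in the corresponding dictionary.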
slide-12
SLIDE 12

1. Relationship Formation: Linking Habits

Receptive/Generative Surprise: Number of standard deviations by which the quantity is above the expected number.

  • If B was trusted/distrusted based solely on his trustworthiness, Receptive Surprise = 0.
  • If A made his decision based solely on his trustfulness, Generative Surprise = 0.

Observations:

                         Receptive Surprise    Generative Surprise
Collectively Trust       96.76                 34.99
Collectively Distrust    -104.15               -56.31
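One way to read the receptive and generative surprise, following our understanding of the cited Leskovec et al. approach, is to treat each considered A-to-B link as an independent coin flip whose trust probability is the relevant baseline. The sketch below rests on that assumption; the slides do not spell out the exact computation, and the function name is hypothetical.

```python
import math

def baseline_surprise(observed_trust, baselines):
    """Standard deviations by which observed_trust exceeds its expectation.

    baselines: one trust probability per considered A->B link, e.g. A's
    generative baseline or B's receptive baseline (assumed independent).
    Assumes at least one baseline lies strictly between 0 and 1.
    """
    expected = sum(baselines)
    variance = sum(p * (1.0 - p) for p in baselines)
    return (observed_trust - expected) / math.sqrt(variance)
```

With this reading, a surprise of 0 means the links were formed exactly as the baselines alone would predict.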
slide-13
SLIDE 13

1. Relationship Formation: Linking Habits

Collectively Trust

  • Users exceed both generative and receptive baselines in trusting and being trusted.
  • This can be explained by homophily or influence of friends.

Collectively Distrust

  • Users fall behind generative/receptive baselines.
  • This can be explained by heterophobia or lack of context from friends (distrust edges are hidden).

slide-14
SLIDE 14

2. Friend of Friend (FoF) Dynamics

Scenario:

  • User A trusts user B and user B trusts user C.
  • A does not have a trust/distrust edge to C.
  • A rates a review by C.

Question 2: Is A more likely to give a favorable rating to C’s review?

Why is this about the influence of the social structure on the rating system data?

[Diagram: A trusts B, B trusts C, A rates C’s review]

slide-15
SLIDE 15

2. Friend of Friend Dynamics

[Four diagrams of users A, B, C with A rating C’s review, one per scenario: FoF, FoE, EoF, EoE]
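Counting two-step paths from A to C by scenario could look like the sketch below. The mapping of labels to edge signs is an assumption: it reads "EoF" as "C is an enemy of A's friend B" (A trusts B, B distrusts C) and "FoE" as "C is a friend of A's enemy B". The slides only show the four labels, so the opposite reading is possible; the function name and edge layout are also illustrative.

```python
from collections import Counter

# Assumed mapping: (sign of A->B, sign of B->C) -> scenario label.
SCENARIOS = {(1, 1): "FoF", (1, -1): "EoF", (-1, 1): "FoE", (-1, -1): "EoE"}

def count_paths(edges, a, c):
    """Count two-step paths a -> b -> c, grouped by scenario.

    edges: dict mapping (source, target) to a sign, +1 trust / -1 distrust.
    """
    counts = Counter()
    for (source, b), sign_ab in edges.items():
        if source != a or b == c:
            continue  # only edges leaving a, and never the direct a -> c edge
        sign_bc = edges.get((b, c))
        if sign_bc is not None:
            counts[SCENARIOS[(sign_ab, sign_bc)]] += 1
    return counts
```

Because the result is a `Counter`, scenarios with no path read as 0.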

slide-16
SLIDE 16

2. FoF Dynamics: Random Shuffling

[Bar charts: Surprise for each rating (1 to 5) in the FoF and EoF scenarios]

slide-17
SLIDE 17

2. FoF Dynamics: Random Shuffling

[Bar charts: Surprise for each rating (1 to 5) in the FoE and EoE scenarios]

slide-18
SLIDE 18

2. FoF Dynamics: Rating Habits Analysis

Rating    Gen./Rec. Surprise    FoF        EoF        FoE        EoE
1         Gen. sur.             -43.77     66.03      -8.36      -1.43
1         Rec. sur.             -10.91     19.80      57.13      2.76
2         Gen. sur.             -627.54    789.36     -108.83    26.54
2         Rec. sur.             -527.77    89.58      206.16     4.03
3         Gen. sur.             -360.72    2.01       -304.42    -124.23
3         Rec. sur.             -181.17    10.16      65.53      5.89
4         Gen. sur.             -847.21    -115.23    -381.94    -190.69
4         Rec. sur.             -370.22    -3.57      81.06      -4.27
5         Gen. sur.             1065.09    -189.93    531.88     214.75
5         Rec. sur.             519.91     -61.03     -173.88    -1.36
slide-19
SLIDE 19

2. Friend of Friend Dynamics: Summary

Distinct trends in 2 (out of 4) scenarios:

  • FoF: Shift towards assigning higher ratings to C’s review (especially 5).
      • Homophily/Influence
  • EoF: Shift towards assigning lower ratings to C’s review (especially 1 & 2).
      • Heterophobia/Influence?

In the remaining 2 scenarios (FoE & EoE), it is hard to get a solid interpretation.

slide-20
SLIDE 20

3. Building a Predictor: Corr. Analysis

Utilize FoF dynamics as features and “actual rating” as target value.

Dynamics    Correlation Coefficient
FoF         0.1112
EoF         -0.0918
FoE         0.0105
EoE         -0.0001
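The slide does not say which correlation coefficient was used. Assuming it is the Pearson coefficient between a per-pair feature (for example, the number of FoF paths from A to C) and the rating A gave, a minimal sketch is:

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equally long sequences.

    Assumes both sequences have non-zero variance.
    """
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)
```

Here `xs` would hold the feature values and `ys` the actual ratings; both names are placeholders.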
slide-21
SLIDE 21

3. Building a Predictor: Picking Features

Class           Feature                                Information Gain
Trust           A’s Generative Baseline                0.1595
Trust           C’s Generative Baseline                0.2291
Trust           A’s Receptive Baseline                 0.1943
Trust           C’s Receptive Baseline                 0.4496
Rating          Avg. Rating given by A                 0.3316
Rating          Avg. Rating given by C                 0.3776
Rating          Avg. Rating received by A’s Reviews    0.2453
Rating          Avg. Rating received by C’s Reviews    0.5362
FoF Dynamics    Number of FoF Paths                    0.3813
FoF Dynamics    Number of EoF Paths                    0.1894
FoF Dynamics    Number of FoE Paths                    0.0119
FoF Dynamics    Number of EoE Paths                    0.0198
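Information gain measures how much knowing a feature reduces the entropy of the target class. The sketch below computes it for a discrete feature; continuous features such as the baselines would first need to be binned, and how the authors did that is not stated, so the function names and the discretization step are assumptions.

```python
import math
from collections import Counter, defaultdict

def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    return -sum((c / n) * math.log2(c / n) for c in Counter(labels).values())

def information_gain(feature_values, labels):
    """Entropy of labels minus their entropy conditioned on the feature."""
    groups = defaultdict(list)
    for value, label in zip(feature_values, labels):
        groups[value].append(label)
    n = len(labels)
    conditional = sum(len(g) / n * entropy(g) for g in groups.values())
    return entropy(labels) - conditional
```

A feature that splits the classes perfectly has a gain equal to the label entropy, while a constant feature has a gain of 0.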

slide-22
SLIDE 22

3. Prediction Results (Bootstrap Aggregating)

ROC Area (AUC) = 0.91
Overall Accuracy = 76%

[Bar chart, axis from 0.65 to 0.85: Precision, Recall, and F-Score for the rating classes Low (1, 2), Medium (3), High (4, 5)]
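Bootstrap aggregating (bagging) trains several base learners on resampled copies of the training data and combines them by majority vote. The slides do not name the base learner, so the nearest-centroid learner below is a stand-in chosen only to keep the sketch self-contained; all function names are hypothetical.

```python
import random
from collections import Counter

def rating_class(rating):
    """Map a 1-5 rating to the three classes shown on the slide."""
    return "Low" if rating <= 2 else "Medium" if rating == 3 else "High"

def fit_centroids(sample):
    """Toy base learner: mean feature vector per class."""
    sums, counts = {}, Counter()
    for features, label in sample:
        counts[label] += 1
        acc = sums.setdefault(label, [0.0] * len(features))
        for i, value in enumerate(features):
            acc[i] += value
    return {label: [v / counts[label] for v in acc] for label, acc in sums.items()}

def predict_centroids(model, features):
    """Return the class whose centroid is closest to features."""
    def sq_dist(centroid):
        return sum((a - b) ** 2 for a, b in zip(features, centroid))
    return min(model, key=lambda label: sq_dist(model[label]))

def bagging_fit(data, n_models=25, seed=0):
    """Train n_models base learners, each on a bootstrap resample of data."""
    rng = random.Random(seed)
    return [fit_centroids([rng.choice(data) for _ in data]) for _ in range(n_models)]

def bagging_predict(models, features):
    """Aggregate the base learners by majority vote."""
    votes = Counter(predict_centroids(m, features) for m in models)
    return votes.most_common(1)[0][0]
```

Each training example is a (feature vector, class label) pair, where the features could be the baselines, average ratings, and path counts listed for the predictor.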

slide-23
SLIDE 23

Conclusion

Relationship Formation

  • Random Shuffle and Linking Habits: Strong correlation.
  • Exceeding both generative and receptive baselines in trusting and being trusted.

Friend of Friend Dynamics

  • FoF: Shift towards assigning higher ratings to C’s review.
  • EoF: Shift towards assigning lower ratings to C’s review.

Building a Predictor

  • This alignment can be used to predict (recommend) content that a user would like.
  • We achieve good prediction accuracy with a simple feature set.
slide-24
SLIDE 24

Thank You!