Sockpuppets in Online Discussions: Use and Abuse



slide-1
SLIDE 1

Sockpuppets in Online Discussions: Use and Abuse

Srijan Kumar, Justin Cheng, Jure Leskovec, V.S. Subrahmanian

An Army of Me: Sockpuppets in Online Discussion Communities. S. Kumar, J. Cheng, J. Leskovec and V.S. Subrahmanian. Proceedings of the World Wide Web Conference, 2017 (WWW 2017). Best Paper Award Honorable Mention.

slide-2
SLIDE 2


slide-3
SLIDE 3

Example

Eric_17 April 28 2013, 12AM

Thanks. I knew Marvel fans would try to flame me, but they have nothing other than “oh that’s your opinion” instead of coming up with their own argument

bdiaz209 April 28 2013, 11PM

Possibly the best blog I’ve ever read major props to you

Fellstrike April 29 2013, 6PM

Quit talking to yourself, *******. Get back on your meds if you’re going to do that

bdiaz209 posts only on this discussion to support and defend Eric_17

slide-4
SLIDE 4


slide-5
SLIDE 5


Sockpuppets in Wikipedia

USE ABUSE

slide-6
SLIDE 6


Sockpuppets in online discussions

slide-7
SLIDE 7

Data: Sockpuppets

  • 2.9M users
  • 2.1M articles
  • 62M posts

slide-8
SLIDE 8

Defining sockpuppets

No ground truth sockpuppet labels! (Surprise?!) We adopt the definition currently used by Wikipedia, after statistically validating it for our task:

Sockpuppets are accounts that post from the same IP address in the same discussion within a short time of each other (15 min), in at least 3 different instances.

Note: we use IP addresses for the definition, but not for detection.

3,656 sockpuppets, 1,623 puppetmasters
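The definition above can be sketched in Python as follows. This is only an illustration under stated assumptions, not the authors' pipeline: the `(user, ip, discussion_id, timestamp)` tuple layout and the function name `find_sockpuppet_pairs` are hypothetical, and an "instance" is assumed here to mean a distinct discussion in which two accounts post from the same IP within 15 minutes.

```python
from collections import defaultdict
from datetime import datetime, timedelta
from itertools import combinations

WINDOW = timedelta(minutes=15)  # "very close in time" threshold from the slide
MIN_INSTANCES = 3               # minimum number of co-posting instances


def find_sockpuppet_pairs(posts):
    """Return pairs of accounts that satisfy the co-posting definition.

    posts: iterable of (user, ip, discussion_id, timestamp) tuples
    (a hypothetical layout chosen for this sketch).
    """
    # Group posts that share both an IP address and a discussion.
    by_ip_discussion = defaultdict(list)
    for user, ip, discussion, ts in posts:
        by_ip_discussion[(ip, discussion)].append((ts, user))

    # For every pair of accounts, collect the discussions where they
    # posted from the same IP within the time window.
    instances = defaultdict(set)
    for (_ip, discussion), entries in by_ip_discussion.items():
        for (ts_a, user_a), (ts_b, user_b) in combinations(entries, 2):
            if user_a != user_b and abs(ts_a - ts_b) <= WINDOW:
                pair = tuple(sorted((user_a, user_b)))
                instances[pair].add(discussion)

    # Assumption: one instance = one distinct discussion.
    return {pair for pair, seen in instances.items() if len(seen) >= MIN_INSTANCES}
```

Grouping connected pairs into one puppetmaster (for example with a union-find structure) would be a natural next step, but the slide does not describe that step, so it is left out.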

slide-9
SLIDE 9


Characteristics of sockpuppets

slide-10
SLIDE 10

How to compare sockpuppets and ordinary users?

We have to match!

For each sockpuppet, match an ordinary user that makes a similar number of posts on similar discussions.
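One way such a matching could work is sketched below. The slide does not say how "similar" is measured, so everything here is an assumption: the greedy strategy, the scoring rule (relative difference in post count minus Jaccard overlap of discussions), and the names `jaccard` and `match_ordinary_users` are all hypothetical.

```python
def jaccard(a, b):
    """Jaccard overlap between two sets of discussion ids."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def match_ordinary_users(sockpuppets, ordinary):
    """Greedily pair each sockpuppet with one unused ordinary user.

    Both arguments map user -> (num_posts, set_of_discussion_ids).
    Lower score is better: a small relative gap in post count and a
    large discussion overlap both reduce the score (assumed scoring).
    """
    available = dict(ordinary)
    matches = {}
    for sock, (n_posts, discussions) in sockpuppets.items():
        if not available:
            break  # no ordinary users left to match against

        def score(user):
            other_posts, other_discussions = available[user]
            gap = abs(other_posts - n_posts) / max(n_posts, 1)
            return gap - jaccard(other_discussions, discussions)

        best = min(available, key=score)
        matches[sock] = best
        del available[best]  # each ordinary user is matched at most once
    return matches
```

Matching without replacement keeps the control group the same size as the sockpuppet group, which makes the later comparisons easier to read.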

slide-11
SLIDE 11

Where do sockpuppets post?


slide-12
SLIDE 12

Relation between pair of sockpuppets

Smoothzilla Feb 5 2013, 3PM

Thanks for your support!!!!

Falcon-X32 Feb 5 2013, 3PM

I agree. You are absolutely right!

jakey008 Feb 5 2013, 2PM

should have read the reviews first :(

ricobeans27 Feb 5 2013, 3PM

Couldn’t agree more.

Pairs of sockpuppets:

  • Interact more with each other (p < 10^-3)
  • Upvote each other more (p < 10^-3)

slide-13
SLIDE 13

Do puppetmasters lead double lives?

Double life hypothesis: the puppetmaster maintains a distinct personality for each of the two sockpuppets

[Figure: Ordinary and Sockpuppet 1 are more similar; Sockpuppet 1 and Sockpuppet 2 are less similar]

Similarity is measured as cosine similarity between user posts’ features: LIWC, sentiment, number of words, etc.
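As a small illustration of the similarity measure, the sketch below computes cosine similarity between two feature vectors. The vectors stand in for per-user post features (LIWC counts, sentiment, number of words); how those features are extracted is not shown on the slide, so the inputs are assumed to be plain lists of numbers.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length feature vectors."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    if norm_u == 0 or norm_v == 0:
        return 0.0  # convention chosen here for all-zero vectors
    return dot / (norm_u * norm_v)
```

A value near 1 means the two accounts write in a similar way according to the chosen features; a value near 0 means their feature profiles have little in common.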

slide-14
SLIDE 14

Do puppetmasters lead double lives?

Alternate hypothesis: the puppetmaster operates both sockpuppets similarly

[Figure: Ordinary and Sockpuppet 1 are less similar; Sockpuppet 1 and Sockpuppet 2 are more similar]

slide-15
SLIDE 15

Do puppetmasters lead double lives?

Result: the two sockpuppets are more similar to each other than to a non-sockpuppet (p < 10^-3). “Good sock/bad sock” behavior is not common.

[Figure: similarity among Non-sockpuppet, Sockpuppet 1, and Sockpuppet 2]

slide-16
SLIDE 16


Why are sockpuppets created? Only for deception?

slide-17
SLIDE 17

Deceptiveness

Hypothesis: Deceptive sockpuppets of the same master have very different usernames.

[Figure: histogram of the number of pairs (y-axis) by Levenshtein distance between usernames (x-axis), for sock pairs and random pairs; regions labeled Non-Pretenders and Pretenders, with proportions 2/3 and 1/3]

  • Non-pretender example: srijan and srijan2
  • Pretender example: srijan and theRealBatman
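The Levenshtein distance used in this comparison counts the minimum number of single-character insertions, deletions, and substitutions needed to turn one username into the other. A minimal sketch of the standard dynamic-programming computation follows; the function name `levenshtein` is chosen for this example, and the cutoff that separates pretenders from non-pretenders is not given on the slide, so none is assumed.

```python
def levenshtein(a: str, b: str) -> int:
    """Edit distance between two strings (insert, delete, substitute)."""
    if len(a) < len(b):
        a, b = b, a  # keep the shorter string in the inner loop
    previous = list(range(len(b) + 1))  # distances from "" to prefixes of b
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty prefix of b
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(
                previous[j] + 1,         # delete char_a
                current[j - 1] + 1,      # insert char_b
                previous[j - 1] + cost,  # substitute or match
            ))
        previous = current
    return previous[-1]
```

For the non-pretender example, `levenshtein("srijan", "srijan2")` is 1, since a single appended character separates the two names.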

slide-18
SLIDE 18

Pretender vs Non-pretender Sockpuppets

srijan Feb 5 2013, 3PM

i agree.. these morons dont know a thing

theRealBatman Feb 5 2013, 3PM

YOU ARE STUPID AND A *****

srijan Feb 5 2013, 2PM

best article i have read!!!

ricobeans27 Feb 5 2013, 3PM

But this article doesn’t make any sense

  • More opinionated (p < 10^-3)
  • Swear more (p < 10^-3)
  • Downvoted and reported more (p < 10^-3)

slide-19
SLIDE 19

How are sockpuppets used? Do sockpuppets always support one another?

slide-20
SLIDE 20

Neutral sockpuppets

60% Neutral

srijan Feb 5 2013, 3PM

best article ever!

theRealBatman Feb 5 2013, 3PM

why so?

We quantify the amount of support by counting assenting, negation and dissenting words from LIWC
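The word-counting idea can be sketched as below. LIWC is a proprietary lexicon, so the tiny `ASSENT` and `DISSENT` word sets here are hypothetical stand-ins, and the rule that maps counts to a label (whichever count is larger wins, ties are neutral) is an assumption for illustration rather than the rule used in the study.

```python
import re

# Hypothetical stand-ins for LIWC categories; the real lexicon is much larger.
ASSENT = {"agree", "yes", "right", "absolutely", "totally"}
DISSENT = {"no", "not", "don't", "disagree", "never", "wrong"}


def classify_support(reply: str) -> str:
    """Label a reply as 'supporter', 'dissenter', or 'neutral'."""
    # Normalize curly apostrophes so "don’t" matches "don't".
    text = reply.lower().replace("’", "'")
    words = re.findall(r"[a-z']+", text)
    assent = sum(word in ASSENT for word in words)
    dissent = sum(word in DISSENT for word in words)
    if assent > dissent:
        return "supporter"
    if dissent > assent:
        return "dissenter"
    return "neutral"
```

Applied to the example replies in this part of the talk, "why so?" comes out neutral, "Totally agree!!" comes out as supporter, and "I don’t think so" comes out as dissenter.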

slide-21
SLIDE 21

Supporter sockpuppets

60% Neutral
30% Supporter

srijan Feb 5 2013, 3PM

best article ever!

theRealBatman Feb 5 2013, 3PM

Totally agree!!

slide-22
SLIDE 22

Dissenter sockpuppets

60% Neutral
30% Supporter
10% Dissenter

srijan Feb 5 2013, 3PM

best article ever!

theRealBatman Feb 5 2013, 3PM

I don’t think so

slide-23
SLIDE 23

Supportiveness and Deceptiveness

Probability of being a pretender, by sockpuppet type:

Type        Pretender   Non-pretender
Dissenter   0.58        0.42
Neutral     0.70        0.30
Supporter   0.74        0.26

Deception is important to create an illusion of public consensus

slide-24
SLIDE 24


Detecting sockpuppets

slide-25
SLIDE 25

Features

  • Post: number of words, characters, etc., LIWC counts, readability, sentiment, …
  • Community: number of upvotes and downvotes, fraction of reported posts, whether the account is reported, …
  • Activity: number of posts, number of replies, reciprocity of posts, age of account, …

Note: we are not using IP-based features
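To make the three feature groups concrete, here is a small sketch that turns an account's posts into a feature dictionary. The `Post` record and its fields are hypothetical, and only features that need no external lexicon are included; LIWC counts, readability, sentiment, reciprocity, and account age are left out.

```python
from dataclasses import dataclass


@dataclass
class Post:
    # Hypothetical record layout for this sketch.
    text: str
    upvotes: int
    downvotes: int
    reported: bool
    is_reply: bool


def account_features(posts):
    """Compute a few post, community, and activity features for one account."""
    if not posts:
        raise ValueError("an account needs at least one post")
    n = len(posts)
    return {
        # Post features
        "avg_words": sum(len(p.text.split()) for p in posts) / n,
        "avg_chars": sum(len(p.text) for p in posts) / n,
        # Community features
        "upvotes": sum(p.upvotes for p in posts),
        "downvotes": sum(p.downvotes for p in posts),
        "frac_reported": sum(p.reported for p in posts) / n,
        # Activity features
        "num_posts": n,
        "num_replies": sum(p.is_reply for p in posts),
    }
```

Note that no IP address appears anywhere in the record, in line with the note that IP-based features are not used.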

slide-26
SLIDE 26


Is an account a sockpuppet?

slide-27
SLIDE 27

Is an account a sockpuppet?

[Figure: bar chart of AUC by feature set, with a baseline marked]

Features    AUC
Post        0.57
Community   0.54
Activity    0.59
All         0.68
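AUC can be read as the probability that a randomly chosen positive example (here, a sockpuppet) receives a higher classifier score than a randomly chosen negative example, with ties counted as half. The sketch below computes it directly from that definition; the function name `auc` and the 0/1 label encoding are assumptions of this example, and the classifier that produces the scores is not specified on the slide.

```python
def auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of scores.

    labels: 1 for positive (e.g. sockpuppet), 0 for negative.
    scores: classifier scores, higher meaning more likely positive.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))
```

This quadratic version is meant for clarity; for large datasets a rank-based computation gives the same value more efficiently.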

slide-28
SLIDE 28

Do two accounts belong to the same person?


slide-29
SLIDE 29

Do two accounts belong to the same person?

[Figure: bar chart of AUC by feature set, with a baseline marked]

Features    AUC
Post        0.80
Community   0.56
Activity    0.86
All         0.91

slide-30
SLIDE 30

What’s next?

  • Being implemented at Reddit and Wikipedia
  • Creating algorithmic models for detection (random walks, deep learning, etc.)

slide-31
SLIDE 31

You may also be interested in

  • Tutorials on misbehavior and misinformation:

– Data-Driven Approaches towards Malicious Behavior Modeling. Jiang et al., SIGKDD 2017

– Antisocial Behavior on the Web: Characterization and Detection. Kumar et al., WWW 2017

  • Hoaxes in Wikipedia

– Disinformation on the Web: Impact, Characteristics and Detection of Wikipedia Hoaxes. Kumar et al., WWW 2016

  • Vandals in Wikipedia

– VEWS: A Wikipedia Vandal Early Warning System. Kumar et al., SIGKDD 2015

  • Language and deception

– Linguistic Harbingers of Betrayal: A Case Study on an Online Strategic Game. Niculae et al., ACL 2015

  • Social network algorithm for troll detection

– Accurately Detecting Trolls in Slashdot Zoo via Decluttering. Kumar et al., ASONAM 2014

More details at: http://cs.stanford.edu/~srijan

slide-32
SLIDE 32

Upcoming workshop at WSDM 2018

Please submit your papers! Completed research papers, short papers, works in progress, and extended abstracts are welcome!

MIS2: Misinformation and Misbehavior Mining on the Web

  • Feb 9, 2018 at Los Angeles, CA
  • Held in conjunction with WSDM 2018
  • Submissions due: Nov 15, 2017
  • Notifications due: Dec 7, 2017
  • Best paper awards of USD 1,000 sponsored by