The Wisdom of Crowds: Network effects, and the Importance of Experts



SLIDE 1

The Wisdom of Crowds:

Network effects, and the Importance of Experts

Aris Anagnostopoulos Sapienza University of Rome

Aris Anagnostopoulos The Wisdom of Crowds School for Advanced Sciences of Luchon, 2015

SLIDE 2

Online collaboration systems

Systems creating knowledge by massive online collaboration:

  • Tagging/geotagging systems
  • Games with a purpose
  • Content creation systems
  • Crowdsourcing
  • Open source community
  • Polymath project

SLIDE 3

Which photo has more dots?

SLIDE 7

Wisdom of crowds – First experiment

What does the ox weigh? (1198 pounds)

At a 1906 country fair in Plymouth, UK, Sir Francis Galton ran an experiment in which he asked people to estimate the weight of a slaughtered ox. He asked 800 participants. The median of the answers was 1207 pounds (under 1% error).
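
The error figure can be checked with a few lines of arithmetic. This is a minimal sketch that uses only the two numbers quoted on the slide; the variable names are illustrative.

```python
# Figures quoted on the slide (in pounds).
true_weight = 1198
crowd_median = 1207

# Relative error of the crowd's median with respect to the true weight.
relative_error = abs(crowd_median - true_weight) / true_weight
print(f"relative error: {relative_error:.2%}")  # relative error: 0.75%
```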

SLIDE 8

The wisdom of crowds

The premise of the wisdom of crowds is that averaging the opinions of many individuals on a topic can give accurate answers.

Examples and applications:

  • Francis Galton experiment
  • Who wants to be a millionaire
  • Recommendation systems
  • Prediction markets
  • Twitter
  • Democracy
  • James Surowiecki’s book has many examples
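
The averaging premise can be illustrated with a small simulation. This is a sketch under assumptions that are not from the talk: each guess is the true value plus independent, unbiased Gaussian noise, and the crowd size and noise level are made-up parameters.

```python
import random
import statistics


def crowd_vs_individual(true_value, n_people, noise_sd, seed=0):
    """Compare the error of the crowd average with the typical individual error.

    Assumption: guesses are independent and unbiased (Gaussian noise).
    """
    rng = random.Random(seed)
    guesses = [rng.gauss(true_value, noise_sd) for _ in range(n_people)]
    crowd_error = abs(statistics.fmean(guesses) - true_value)
    individual_error = statistics.fmean(abs(g - true_value) for g in guesses)
    return crowd_error, individual_error


# Hypothetical parameters: 800 guessers, noise of 100 pounds around 1198.
crowd_error, individual_error = crowd_vs_individual(1198, 800, 100)
print(f"crowd: {crowd_error:.1f} lb, typical individual: {individual_error:.1f} lb")
```

The error of the average can never exceed the average individual error (triangle inequality); with independent guesses it is usually far smaller. The independence assumption is exactly what the later slides call into question.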

SLIDE 9

This talk

We will look at three dimensions of the problem:

  • Network effect on the wisdom of crowds
  • The role of homophily and polarization in the spreading of (mis)information
  • How to schedule experts in crowdsourcing

SLIDE 11

The wisdom of crowds

Main requirement: independence of opinions and diversity

What happens when we talk and influence each other?

Answer: often bad things

– Think about democracy:
  • Italy, USA, Greece have voters that keep/kept electing terrible governments
– Groupthink
– Spread of conspiracy theories

We want to study the network effect on the wisdom of crowds in a natural setting

SLIDE 12

Instructions to participants

Phase 1:

  • Answer 4 simple questions (5 min)
  • Return the answers
  • Take and wear an RFID tag

Phase 2:

  • Discuss the questions with others (20 min)
  • At the end, answer the questions again and return the tags

SLIDE 13

Tracking individual interactions

We can use RFID tags to track sustained face-to-face proximity among people.

[Figure: RFID reader and RFID tag]

SLIDE 14

Collection of F2F interactions

Each participant wears an RFID tag

A typical scenario…

[Figure: participants talking: “I think…”, “Bla bla bla…”, “I want a steak!”, “Trust me…”]

SLIDE 15

Examples of questions

Innate/Learnt Ability (Class 1)

  • How many spaghetti are in the pack?
  • How many points are there in the following picture?

Knowledge and Reasoning (Class 2)

  • What was the average female population of Italy over the years 1960–1970?
  • What is the value in EUR of the coins thrown into the Trevi fountain in 2012?

Prediction (Class 3)

  • How many goals in total will the following teams score in the first round (3 games each) of the 2014 World Cup? Brazil, Spain, Greece, Italy, France, Argentina, Germany, Russia (asked before the World Cup…)

SLIDE 16

Experiments deployed so far

  1. WSDM 2013 Conference, Feb 2013 (69 attendees)
  2. My 2013 data mining class, May 2013 (37 attendees)
  3. Priverno’s yearly town fair, May 2014 (60 attendees)
  4. My 2014 data mining class, May 2014 (25 attendees)

SLIDE 17

Interaction graph

An interaction graph $G = (V, E)$ represents the interactions between the people: each node in $V$ is a person and each edge in $E$ is an interaction.

SLIDE 18

Interaction graphs

Priverno fair:

  • Undirected graph
  • Nodes: 60
  • Edges: 128
  • Density: 0.072
  • Network diameter: 9
  • Communities: 15
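
The reported density is consistent with the node and edge counts. The sketch below assumes the usual definition of density for a simple undirected graph (edges present divided by edges possible); the function name is illustrative.

```python
def graph_density(num_nodes, num_edges):
    """Density of a simple undirected graph: edges present / edges possible."""
    possible_edges = num_nodes * (num_nodes - 1) / 2
    return num_edges / possible_edges


# Priverno fair graph: 60 nodes, 128 edges.
density = graph_density(60, 128)
print(round(density, 3))  # 0.072
```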

SLIDE 19

Main findings: average improves

Priverno fair (the others are similar):

[Figure: normalized true value, average in 1st round, and average in 2nd round]

SLIDE 20

Main findings: std decreases

Priverno fair (the others are similar):

[Figure: normalized standard deviation (std) for questions Q1–Q4, Round 1 vs. Round 2]

SLIDE 21

Modeling user interactions

Having all these data, we want to design models for opinion formation.

Why?

  • Understand the opinion-formation process
  • Understand effect of peer pressure
  • Explain how interaction can lead to improved results

Hard: different people, lots of noise, missing info

SLIDE 22

Modeling user interactions

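DeGroot model:

$$A'(u) = \frac{A(u) + A(v_1) + A(v_2) + A(v_3) + A(v_4)}{1 + 4}$$

Generalized DeGroot model:

$$A'(u) = \frac{\alpha A(u) + A(v_1) + A(v_2) + A(v_3) + A(v_4)}{\alpha + 4}$$

$A(u)$: answer of $u$ at round 1 (R1)
$A'(u)$: answer of $u$ at round 2 (R2)
$v_1, \dots, v_4$: the people $u$ interacted with (an example with four of them)

But how can we explain the improvement?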
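
The update rule above can be sketched in a few lines. This is an illustrative implementation for an arbitrary number of neighbors, not code from the study; the function name, the parameter `alpha` for the self-weight, and the sample answers are assumptions.

```python
def degroot_update(own_answer, neighbor_answers, alpha=1.0):
    """Round-2 answer as a weighted average of round-1 answers.

    alpha = 1 gives the plain DeGroot model; alpha > 1 puts more weight
    on the person's own round-1 answer (generalized DeGroot model).
    """
    total = alpha * own_answer + sum(neighbor_answers)
    return total / (alpha + len(neighbor_answers))


# Hypothetical round-1 answers: u said 50, the four neighbors said 80..120.
neighbors = [80, 90, 110, 120]
print(degroot_update(50, neighbors))           # 90.0
print(degroot_update(50, neighbors, alpha=6))  # 70.0
```

A larger `alpha` models a more stubborn participant: the same neighbors move the answer from 50 to 70 instead of 90.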

SLIDE 23

Some reflection

  • Peer interaction can lead to a more accurate crowd
  • … in contrast to previous studies in artificial settings where interaction was imposed
  • How can we explain it?
  • When does interaction improve results and when does it harm them?
  • Models…

SLIDE 24

This talk

We will look at three dimensions of the problem:

  • Network effect on the wisdom of crowds
  • The role of homophily and polarization in the spreading of (mis)information
  • How to schedule experts in crowdsourcing

SLIDE 25

Can we always trust the crowd?

Numerous examples where a large part of the population believes false info:

  • Does democracy always work?
  • Conspiracy theories
  • Unsubstantiated science (e.g., homeopathy)
  • How does such info become popular?

SLIDE 26

Facebook study

Posts from 79 Italian Facebook group pages:

  • 34 science group pages
    – 65K posts
    – 2.5M likes, 1.5M shares
  • 39 conspiracy group pages
    – 200K posts
    – 6.5M likes, 16M shares

Crawled the network of likers and found their connections:

  • 1.2M nodes
  • 35M edges

SLIDE 27

A Facebook post

[Figure: example post with 180K likes and 26K shares]

SLIDE 28

User polarization

We have 1.2M users who have liked science/conspiracy posts. Are they consistent with the content they like?

For each user $u$ define the user polarization $\rho(u)$:

$$\rho(u) = \frac{\mathit{consp}}{\mathit{consp} + \mathit{sci}}$$

$\mathit{consp}$: # conspiracy posts $u$ liked
$\mathit{sci}$: # science posts $u$ liked
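
As a concrete reading of the definition, the polarization of a user is the fraction of their likes that went to conspiracy posts. The function below is a minimal sketch; its name and the sample counts are illustrative.

```python
def polarization(conspiracy_likes, science_likes):
    """Fraction of a user's likes that went to conspiracy posts (0 to 1)."""
    total = conspiracy_likes + science_likes
    if total == 0:
        raise ValueError("polarization is undefined for a user with no likes")
    return conspiracy_likes / total


# Hypothetical user: 19 likes on conspiracy posts, 1 like on a science post.
print(polarization(19, 1))  # 0.95
```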

SLIDE 30

User polarization

We can select two subsets of users:

  • Science users: $\{u : \rho(u) \le 5\%\}$
  • Conspiracy users: $\{u : \rho(u) \ge 95\%\}$
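
The two subsets can be selected with a simple filter over per-user like counts. This sketch assumes a hypothetical input format, a dict mapping each user to a pair (conspiracy likes, science likes); users between the two thresholds are left out of both sets.

```python
def split_polarized_users(likes, low=0.05, high=0.95):
    """Return (science_users, conspiracy_users) using the 5% / 95% cutoffs.

    likes: dict mapping user -> (conspiracy_likes, science_likes).
    """
    science, conspiracy = set(), set()
    for user, (consp, sci) in likes.items():
        total = consp + sci
        if total == 0:
            continue  # polarization undefined, skip the user
        rho = consp / total
        if rho <= low:
            science.add(user)
        elif rho >= high:
            conspiracy.add(user)
    return science, conspiracy


# Hypothetical users and like counts.
example_likes = {"anna": (0, 10), "bruno": (10, 0), "carla": (5, 5)}
science_users, conspiracy_users = split_polarized_users(example_likes)
print(science_users, conspiracy_users)  # {'anna'} {'bruno'}
```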

SLIDE 32

Science vs. conspiracy

Science and conspiracy posts and users show very similar behavior.

[Figure panels: post statistics, post lifetime, user lifetime, user subgraph statistics]

SLIDE 33

Largest connected component

SLIDE 34

Homophily

Homophily: tendency of individuals to associate with similar others

$\psi(u)$: normalized liking activity of $u$ (normalized number of likes)

SLIDE 35

Prediction of polarized friends

We can predict the ratio of $u$’s friends who have the same polarization as $u$ as a function of $u$’s #likes.

$\theta(u) = \#\mathit{likes}$: liking activity of $u$
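
The quantity being predicted, the share of a user's friends with the same polarization, can be computed directly from a friendship list. This sketch only measures that ratio; it does not reproduce the predictive model from the talk. The data structures (a dict of friend lists and a dict of labels) and the sample names are assumptions.

```python
def same_polarization_fraction(user, friends, label):
    """Fraction of the user's labeled friends that share the user's label.

    friends: dict mapping user -> list of friends.
    label:   dict mapping user -> "science" or "conspiracy".
    Returns None if the user has no labeled friends.
    """
    labeled = [f for f in friends.get(user, ()) if f in label]
    if not labeled:
        return None
    same = sum(1 for f in labeled if label[f] == label[user])
    return same / len(labeled)


# Hypothetical toy network.
friends = {"anna": ["bruno", "carla", "dario"]}
label = {"anna": "science", "bruno": "science",
         "carla": "conspiracy", "dario": "science"}
fraction = same_polarization_fraction("anna", friends, label)
print(fraction)
```

Plotting this fraction against the number of likes of each user would give the kind of homophily curve the slide refers to.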

SLIDE 36

How do posts become viral?

How does the average user of a viral post look?

$\deg(u)$: # friends of node $u$
$\psi(u)$: normalized liking activity of $u$ (normalized number of likes)

SLIDE 38

Troll posts

We also downloaded info about 4.7K troll posts: posts with clearly useless or wrong information:

“The Italian Senate voted and accepted (257 in favor and 165 abstentions) a law proposed by Senator Cirenga aimed at funding with 134 billion Euro the policy makers to find a job in case of defeat in the political competition.”

36K shares, 1.1K likes

  • 315+5 members in the Italian Senate!
  • Cirenga does not exist!
  • 134B EUR > 1/20 of French GDP!
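
The first objection is plain arithmetic on the numbers in the post and on the slide; a minimal check, with illustrative variable names:

```python
# Numbers quoted in the troll post and on the slide.
senate_seats = 315 + 5      # members of the Italian Senate, as stated above
claimed_votes = 257 + 165   # "in favor" plus "abstentions" in the post

print(claimed_votes, claimed_votes > senate_seats)  # 422 True
```

The post claims more votes than there are senators, before even counting any votes against.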

SLIDE 39

Troll posts: degree and activity

$\deg(u)$: # friends of node $u$
$\psi(u)$: normalized liking activity of $u$ (normalized number of likes)

SLIDE 40

Troll posts: polarization

SLIDE 41

Troll posts: polarization at different virality levels

SLIDE 42

Some reflection

  • Peer influence can reinforce one’s ideas
  • … to the extent that people might believe clearly false info
  • Clear evidence of psychological phenomena such as
    – Cognitive closure: the human desire to eliminate ambiguity and arrive at definite conclusions (sometimes irrationally)
    – Confirmation bias: the tendency to search for, believe, and remember info in a way that is aligned with one’s beliefs

SLIDE 43

This talk

We will look at three dimensions of the problem:

  • Network effect on the wisdom of crowds
  • The role of homophily and polarization in the spreading of (mis)information
  • How to schedule experts in crowdsourcing

SLIDE 44

Rest of the talk

Wisdom of crowds and wisdom of experts:

  • We saw that in some cases the crowd cannot be trusted
  • For some problems experts are indispensable!
  • But experts are scarce and expensive
  • What can we do with (lots of) nonexperts?

SLIDE 45

Online collaboration systems

Systems creating knowledge by massive online collaboration:

  • Tagging/geotagging systems
  • Games with a purpose
  • Content creation systems
  • Crowdsourcing
  • Open source community
  • Polymath project

SLIDE 47

What is crowdsourcing?

Crowdsourcing is the process of obtaining information by using contributions from a large group of people.

There are tasks that are hard for computers but easy for humans (human tasks):

  • Compare 2 photos (to select the one that best represents the Colosseum)
  • Translate a sentence
  • Choose the best search result for a query

Crowdsourcing platforms: online services that provide, through APIs, answers from humans at a low cost

  • Amazon Mechanical Turk
  • CrowdFlower

SLIDE 48

Crowdsourcing – Amazon Mechanical Turk

SLIDE 49

Crowdsourcing – Amazon Mechanical Turk

[Figure: requester, Human Intelligence Tasks (HITs), workers]

SLIDE 52

Which photo has more dots?

SLIDE 54

Accuracy vs. number of responses

[Figure: two panels; axes: relative distance, #questions]

SLIDE 55

Which car is more expensive?

SLIDE 59

Accuracy vs. number of responses

[Figure: two panels; axes: relative distance, #questions]

SLIDE 60

Modeling the error

  • Consider a set of elements with different values
  • Threshold error model:
  • We present to a worker a pair $(e_i, e_j)$
    – If $|e_i - e_j| \ge \theta$, the worker returns the correct answer
    – If $|e_i - e_j| < \theta$, the worker returns an arbitrary answer

Note that if the difference is $< \theta$, no matter how many workers we ask, we cannot obtain a more accurate response.

[Figure: elements $e_1, \dots, e_8$ on a line, with the threshold $\theta$ marked]
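
A worker under the threshold error model can be simulated as follows. This is a sketch with one added assumption: the "arbitrary" answer is modeled as a fair coin flip, which the slide does not specify. Function and parameter names are illustrative.

```python
import random


def worker_compare(value_a, value_b, theta, rng):
    """Return 0 if the worker says the first element is larger, else 1.

    Above the threshold the answer is always correct; below it the answer
    is arbitrary (modeled here, as an assumption, by a fair coin flip).
    """
    if abs(value_a - value_b) >= theta:
        return 0 if value_a > value_b else 1
    return rng.randrange(2)


rng = random.Random(0)
# Clear difference (gap 9 >= theta 5): every worker answers correctly.
clear_answers = [worker_compare(10, 1, 5, rng) for _ in range(100)]
# Small difference (gap 2 < theta 5): answers carry no information.
close_answers = [worker_compare(10, 8, 5, rng) for _ in range(100)]
print(sum(clear_answers), sum(close_answers))
```

Under the coin-flip assumption, a majority vote over the second list is itself close to a coin flip, which is the point of the note above: more workers do not help below the threshold.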

SLIDE 61

Using expert workers

Usually workers are untrained.

An expert is a more capable worker:

  • May have been trained
  • More scarce
  • More expensive

Crowdsourcing systems have started offering experts:

  • “Masters,” “skilled,” …

When should we use regular workers and when experts? Think of “Who Wants to Be a Millionaire.”

SLIDE 63

Modeling the error

  • Consider a set of elements with different values
  • Threshold model:
  • We present to a worker a pair $(e_i, e_j)$
    – If $|e_i - e_j| \ge \theta$, the worker returns the correct answer
    – If $|e_i - e_j| < \theta$, the worker returns an arbitrary answer

Experts have a lower error threshold $\theta_E$

[Figure: elements $e_1, \dots, e_8$ on a line, with the thresholds $\theta$ and $\theta_E$ marked]
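
With two thresholds, every pair falls into one of three cases: a regular worker suffices, only an expert can tell the elements apart, or nobody can. The helper below is an illustrative sketch of that case split; its name, return labels, and sample values are assumptions.

```python
def cheapest_reliable_worker(value_a, value_b, theta, theta_expert):
    """Cheapest worker type that reliably compares the pair, per the model."""
    if not theta_expert < theta:
        raise ValueError("experts are assumed to have the lower threshold")
    gap = abs(value_a - value_b)
    if gap >= theta:
        return "regular"
    if gap >= theta_expert:
        return "expert"
    return "nobody"


# Hypothetical thresholds: regular workers 10, experts 2.
print(cheapest_reliable_worker(100, 120, 10, 2))  # regular
print(cheapest_reliable_worker(100, 105, 10, 2))  # expert
print(cheapest_reliable_worker(100, 101, 10, 2))  # nobody
```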

SLIDE 64

Simple task: compute the MAX

A model allows us to formalize and analyze the problem

  • We provide an algorithm that finds an element as close to the max as possible
  • We prove that it makes as few expert comparisons as possible

Feel free to ask for details after the talk.
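
The slides do not spell out the algorithm, so the following is not the authors' method. It is a naive two-phase sketch of the general idea (cheap workers filter, experts refine) under the threshold model, with several assumptions: arbitrary answers are coin flips, and `max_ties` is a known upper bound on how many other elements lie within the regular threshold of the maximum. All names are hypothetical.

```python
import itertools
import random


def compare(values, i, j, theta, rng):
    """Threshold-model worker: index of the larger element, coin flip if too close."""
    if abs(values[i] - values[j]) >= theta:
        return i if values[i] > values[j] else j
    return rng.choice((i, j))


def round_robin_wins(values, items, theta, rng):
    """Every pair of items is compared once; returns wins per item."""
    wins = {i: 0 for i in items}
    for i, j in itertools.combinations(items, 2):
        wins[compare(values, i, j, theta, rng)] += 1
    return wins


def two_phase_max(values, theta, theta_expert, max_ties, seed=0):
    rng = random.Random(seed)
    everyone = list(range(len(values)))
    # Phase 1 (regular workers): the true max always beats elements at least
    # theta below it, so it wins at least n - 1 - max_ties comparisons.
    wins = round_robin_wins(values, everyone, theta, rng)
    candidates = [i for i in everyone if wins[i] >= len(values) - 1 - max_ties]
    # Phase 2 (experts): round-robin among the few surviving candidates.
    expert_wins = round_robin_wins(values, candidates, theta_expert, rng)
    best = max(candidates, key=lambda i: expert_wins[i])
    return best, candidates


# Hypothetical values: 95 and 97 are within theta = 5 of the max (99).
values = [10, 52, 95, 97, 99, 30]
best, candidates = two_phase_max(values, theta=5, theta_expert=1, max_ties=2)
print(best, candidates)  # 4 [2, 3, 4]
```

In this example the experts make 3 comparisons (among 3 candidates) instead of the 15 a full expert round-robin over 6 elements would need. The sketch makes no optimality claim.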

SLIDE 65

Experiments using the crowd

Tested on a crowdsourcing platform with 3 datasets:

  1. $n = 50$ pictures with DOTS
     Goal: find the picture with the most dots
  2. $n = 50$ CARS
     Goal: find the most expensive car
  3. $n = 50$ QUERY RESULTS
     Goal: find the most relevant result for a given query

SLIDE 66

Results

In all 3 sets of experiments, the combination of nonexpert and expert users finds the best results at a low cost.

SLIDE 67

Future directions

Understand better when we have wisdom or ignorance of the crowds

  • Experiments in more controlled environments
  • Large-scale experiments (Twitter)
  • Models
  • Algorithms
  • More detailed analysis of misinformation

SLIDE 68

Thanks!

Questions, comments, etc.: http://aris.me
