Aggregating information from the crowd Anirban Dasgupta IIT - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Aggregating information from the crowd

Anirban Dasgupta IIT Gandhinagar

Joint work with Flavio Chierichetti, Nilesh Dalvi, Vibhor Rastogi, Ravi Kumar, Silvio Lattanzi

January 07, 2015

slide-2
SLIDE 2

Crowdsourcing

Many different modes of crowdsourcing

slide-3
SLIDE 3

Aggregating information using the Crowd: the expertise issue

Is IISc more than 100 years old? Does IISc have more UG than PG students?

Yes Yes Yes No No No! Yes! Yes Yes No

Typically, the answers to the crowdsourced tasks are unknown!

slide-4
SLIDE 4

Aggregating information using the Crowd: the effort issue

Does this article have appropriate references at all places?

Yes Yes Yes No No Even expert users need to spend effort to give meaningful answers

slide-5
SLIDE 5
Elicitation & Aggregation

  • How to ensure that information collected is “useful”?

– Assume users are strategic
– effort put in when making judgments, truthful opinions
– design the right payment mechanism

  • How to aggregate opinions from different agents?

– user behavior stochastic
– varying levels of expertise, unknown
– might not stick around to develop reputation

slide-6
SLIDE 6

This talk: only aggregation

  • Formalizing a simple crowdsourcing task

– Tasks with hidden labels, varying user expertise

  • Aggregation for binary tasks

– stochastic model of user behaviour
– algorithms to estimate task labels + expertise

  • Continuous feedback
  • Ranking

slide-7
SLIDE 7

Binary Task model

  • Tasks have hidden labels:

– {-1, +1}
– E.g. labeling whether an article is of good quality

  • Each task is evaluated by a number of users

– not too many

  • Each user outputs {-1, +1} per task
  • Users and tasks fixed

n users, m tasks

slide-8
SLIDE 8

Simple User model

  • Each user performs the set of tasks assigned to her
  • Users have proficiency p_j

– Indicates the probability that the true signal is seen
– This is not observable

[Figure: users' ±1 ratings on tasks]

[Dawid, Skene, '79]

Note: This does not model bias

slide-9
SLIDE 9

Stochastic model

G = user-item graph
q = vector of actual qualities
p = vector of user proficiencies
U_ji = rating by user j on item i

Given the n-by-m matrix U, estimate the vectors q and p

[Figure: user-item graph with ±1 ratings on its edges]
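The stochastic model can be made concrete with a small simulation. This is only a sketch under stated assumptions: it takes the complete user-item graph (every user rates every task) and reads "proficiency" as the probability that a user reports the true label; the function name `simulate_ratings` is illustrative and not from the talk.

```python
import random

def simulate_ratings(q, p, seed=0):
    """Return U with U[j][i] = rating of user j on item i, in {-1, +1}.

    q: hidden labels (one per item), p: proficiencies (one per user).
    Assumption: user j reports q[i] with probability p[j], else -q[i].
    """
    rng = random.Random(seed)
    return [[q_i if rng.random() < p_j else -q_i for q_i in q] for p_j in p]

q = [+1, -1, +1, -1, +1]          # hidden task labels
p = [1.0, 0.9, 0.6, 0.0]          # user proficiencies (hypothetical values)
U = simulate_ratings(q, p)        # n-by-m matrix, here 4 users x 5 tasks
```

With proficiency 1.0 a user always reproduces q, and with proficiency 0.0 a user always reports the opposite label, so both extremes are informative.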
slide-10
SLIDE 10

From users to items

  • If all users are the same, then simple majority/average will do
  • Else, some notion of weighted majority
  • We will try to estimate user reliabilities first

[Figure: one item with ratings -1, -1, +1; aggregated label = ??]
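One possible notion of weighted majority, sketched here as an assumption rather than as the rule used in the talk, weights each vote by the log-odds of the user's proficiency. Under the simple user model with a uniform prior on the label this is the maximum a posteriori rule, but it needs the proficiencies to be known and strictly between 0 and 1.

```python
import math

def weighted_majority(votes, p):
    """votes: list of (user, vote) pairs for one item, vote in {-1, +1}.

    p[user] is that user's proficiency, assumed known and in (0, 1).
    """
    score = sum(math.log(p[j] / (1 - p[j])) * v for j, v in votes)
    return 1 if score >= 0 else -1

votes = [(0, -1), (1, -1), (2, +1)]                # two users say -1, one says +1
equal = weighted_majority(votes, [0.6, 0.6, 0.6])   # plain majority
expert = weighted_majority(votes, [0.6, 0.6, 0.95]) # the third user is reliable
```

With equal proficiencies the rule reduces to simple majority; a single highly reliable user can outvote two weak ones.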

slide-11
SLIDE 11

Intuition: if G is complete

  • Consider the user x user matrix UU^t

(UU^t)_jk = (#agreements - #disagreements) between users j and k

E[UU^t] is a rank-one matrix

If we approximate UU^t ≈ E[UU^t], then w (the user reliabilities) is given by the rank-1 approximation of UU^t; the rest is noise
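The rank-one intuition can be sketched with a plain power iteration on UU^t. This is a minimal illustration for the complete graph, not one of the published algorithms; it assumes that most of the crowd is better than random in order to fix the sign of the eigenvector, and the function names are illustrative.

```python
def top_eigenvector(M, iters=100):
    """Power iteration; M is symmetric positive semidefinite."""
    n = len(M)
    v = [1.0] * n  # assumes the start vector is not orthogonal to the answer
    for _ in range(iters):
        w = [sum(M[j][k] * v[k] for k in range(n)) for j in range(n)]
        norm = sum(x * x for x in w) ** 0.5
        v = [x / norm for x in w]
    return v

def spectral_labels(U):
    n, m = len(U), len(U[0])
    # (UU^t)[j][k] = #agreements - #disagreements between users j and k
    A = [[sum(U[j][i] * U[k][i] for i in range(m)) for k in range(n)]
         for j in range(n)]
    w = top_eigenvector(A)
    if sum(w) < 0:               # sign fix: assume the crowd is mostly reliable
        w = [-x for x in w]
    labels = [1 if sum(w[j] * U[j][i] for j in range(n)) >= 0 else -1
              for i in range(m)]
    return w, labels

q = [1, -1, 1, 1, -1]
U = [q, q, q, [-x for x in q]]   # three perfect users, one adversarial user
w, labels = spectral_labels(U)
```

In this noiseless example the adversarial user receives a negative weight, so her votes are flipped before the weighted majority is taken.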

slide-13
SLIDE 13

Arbitrary assignment graphs

Then E[UU^t] is a Hadamard product:

E[UU^t] = (ww^t) ∘ (GG^t)

– (ww^t)_jk = E[agree – disagree] on each shared item
– (GG^t)_jk = number of shared items

Similar spectral intuitions hold, only slightly more work is needed
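For an arbitrary assignment graph, the two factors of the Hadamard product can be computed separately. The sketch below assumes a missing rating is encoded as 0 (an encoding chosen here for illustration) and divides the agreement counts by the number of shared items to obtain a per-item agreement rate for each pair of users.

```python
def pairwise_statistics(U):
    """U[j][i] in {-1, 0, +1}; 0 means user j did not rate item i."""
    n, m = len(U), len(U[0])
    agree = [[sum(U[j][i] * U[k][i] for i in range(m)) for k in range(n)]
             for j in range(n)]
    shared = [[sum(1 for i in range(m) if U[j][i] and U[k][i])
               for k in range(n)] for j in range(n)]
    # Entrywise division undoes the Hadamard product where items are shared.
    rate = [[agree[j][k] / shared[j][k] if shared[j][k] else 0.0
             for k in range(n)] for j in range(n)]
    return agree, shared, rate

U = [[1, -1, 0, 1],
     [1,  1, 1, 0],
     [0, -1, 1, 1]]
agree, shared, rate = pairwise_statistics(U)
```

Pairs of users with no shared items carry no information, which is one reason why sparse graphs need more care than the complete graph.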

slide-14
SLIDE 14

Algorithms

  • Core idea is to recover the “expected” matrix using spectral techniques
  • Ghosh, Kale, McAfee '11

– compute topmost eigenvector of item x item matrix
– proves small error for G dense random graph

  • Karger, Oh, Shah '11

– using belief propagation on U
– proof of convergence for G sparse random

  • Dalvi, D., Kumar, Rastogi '13

– for G an “expander”, use eigenvectors of both GG' and UU'

  • EM based recovery: Dawid & Skene '79
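As an illustration of EM based recovery, here is a sketch of EM for the simple one-proficiency-per-user model on the complete graph. It is a simplification: Dawid and Skene's original method estimates a full confusion matrix per user. The initialisation from vote fractions, the clipping constant `eps`, and the fixed iteration count are choices made here.

```python
import math

def one_coin_em(U, iters=20, eps=0.01):
    n, m = len(U), len(U[0])
    # Initialise P(label_i = +1) with the fraction of +1 votes on item i.
    mu = [sum(1 for j in range(n) if U[j][i] == 1) / n for i in range(m)]
    p = [0.5] * n
    for _ in range(iters):
        # M-step: expected fraction of items on which user j is correct.
        for j in range(n):
            correct = sum(mu[i] if U[j][i] == 1 else 1 - mu[i] for i in range(m))
            p[j] = min(1 - eps, max(eps, correct / m))
        # E-step: posterior of each label given the current proficiencies.
        for i in range(m):
            log_odds = sum(U[j][i] * math.log(p[j] / (1 - p[j])) for j in range(n))
            mu[i] = 1 / (1 + math.exp(-log_odds))
    labels = [1 if x >= 0.5 else -1 for x in mu]
    return labels, p

U = [[1, -1, 1, -1],
     [1, -1, 1, -1],
     [1,  1, 1,  1]]   # the third user always answers +1
labels, p = one_coin_em(U)
```

On this toy input the two consistent users end up with high estimated proficiency, while the user who always answers +1 drifts towards 0.5.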
slide-15
SLIDE 15

Empirical: user proficiency can be more or less estimated

Correlation of predicted and actual proficiency on the Y-axis

[ Aggregating crowdsourced binary ratings, WWW'13 Dalvi, D., Kumar, Rastogi ]

slide-16
SLIDE 16

Aggregation

  • Formalizing a simple crowdsourcing task

– Tasks with hidden labels, varying user expertise

  • Aggregation for binary tasks

– stochastic model of user behaviour
– algorithms to estimate task labels + expertise

  • Continuous feedback
  • Ranking

slide-18
SLIDE 18

Continuous feedback model

  • Tasks are continuous:

– Quality

  • Each user has a reliability
  • Each user outputs a score per task

Minimize max

n users, m tasks

slide-19
SLIDE 19

Some simpler settings & obstacles

slide-21
SLIDE 21

Single item, known variances

Suppose that we know the variances

We want to minimize the loss

It is known that an asymptotically optimal estimate is the variance-weighted average
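For known variances, the standard choice is the inverse-variance weighted average, which has the smallest variance among unbiased linear combinations of independent ratings. The sketch below assumes independent, unbiased ratings; the function name is illustrative and the exact loss used in the talk is not reproduced here.

```python
def inverse_variance_mean(xs, variances):
    """Return (estimate, variance of the estimate)."""
    weights = [1.0 / v for v in variances]
    total = sum(weights)
    estimate = sum(w * x for w, x in zip(weights, xs)) / total
    return estimate, 1.0 / total

est_equal, var_equal = inverse_variance_mean([1.0, 3.0], [1.0, 1.0])
est_mixed, var_mixed = inverse_variance_mean([0.0, 10.0], [1.0, 4.0])
```

With equal variances this reduces to the arithmetic mean; with unequal variances the estimate is pulled towards the more reliable rating.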

slide-22
SLIDE 22

Single item, unknown variances

We want to minimize the loss

Suppose that we do not know the variances

Only one sample per user, so cannot estimate them

Cannot compute the weighted average

slide-26
SLIDE 26

Arithmetic Mean

In the binary case, for a single item, we can obtain the optimum by using a majority rule. In the continuous case, the same approach would compute the arithmetic mean, and the loss follows from it.

Is this optimal?

slide-29
SLIDE 29

Problem with Arithmetic mean

The arithmetic mean (AM) would have a large error

Same problem with the median algorithm

By choosing the nearest pair of points, we have a much better estimate

slide-30
SLIDE 30

Shortest gap algorithm

Maybe the optimal algo is to select one of the two nearest samples?

In this setting, w.h.p., the two closest points are very close to each other

But the arithmetic mean gives a much larger loss
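The contrast between the arithmetic mean and the shortest gap rule can be seen on a hand-picked example. The ratings below are invented for illustration (true value 0, two accurate raters, five noisy ones) and are not data from the talk.

```python
def arithmetic_mean(xs):
    return sum(xs) / len(xs)

def shortest_gap_estimate(xs):
    """Return one endpoint of the closest pair of adjacent sorted ratings."""
    s = sorted(xs)
    i = min(range(len(s) - 1), key=lambda t: s[t + 1] - s[t])
    return s[i]

ratings = [0.01, -0.01, 50.0, -30.0, 80.0, 10.0, 25.0]
mean_estimate = arithmetic_mean(ratings)
gap_estimate = shortest_gap_estimate(ratings)
```

The mean is dragged far from 0 by the noisy raters, while the closest pair consists of the two accurate ratings.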

slide-31
SLIDE 31

Last obstacle

More is not always better

Adding bad raters could actually worsen the shortest gap algorithm

In this setting, w.h.p., the first two closest points are very close to each other

But so will be some other pair

Mean is not good here either

slide-32
SLIDE 32

Single Item case

slide-33
SLIDE 33

Results

Theorem 1: There is an algo with a provable bound on its expected loss

Theorem 2: There is an example where there is a gap between any algo and the known variance setting

[Chierichetti, D., Kumar, Lattanzi '14]

slide-37
SLIDE 37

Algorithm

Combination of two simple algorithms

  • k-median algorithm

– return the rating of one of the k central raters

  • k-shortest gap

– return one of the k closest points

slide-38
SLIDE 38

Algorithm

  • Compute the length of the k-shortest gap
  • Compute the median
  • Find the shortest gap and return a point in it
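A rough sketch of how the two ingredients might be combined: restrict attention to the ratings around the median, then look for the shortest gap inside that window. The window size `num_central` is a hand-set, hypothetical parameter here; how the actual algorithm derives it from the k-shortest gap is not reproduced.

```python
def k_shortest_gap(sorted_xs, k):
    """Length and start index of the tightest run of k consecutive points."""
    best = min(range(len(sorted_xs) - k + 1),
               key=lambda i: sorted_xs[i + k - 1] - sorted_xs[i])
    return sorted_xs[best + k - 1] - sorted_xs[best], best

def median_then_gap(xs, num_central, k=2):
    """Keep the num_central ratings around the median, then return a point
    from the k-shortest gap inside that window (requires num_central >= k)."""
    s = sorted(xs)
    lo = max(0, (len(s) - num_central) // 2)
    central = s[lo:lo + num_central]
    _, start = k_shortest_gap(central, k)
    return central[start]

# True value 0; the two raters near 200 are bad but happen to agree closely.
xs = [0.01, -0.01, 50.0, -30.0, 80.0, 10.0, 25.0, -200.0, 200.0, 200.001]
plain_length, plain_start = k_shortest_gap(sorted(xs), 2)
combined = median_then_gap(xs, num_central=6)
```

On this invented input the plain shortest gap is fooled by the two bad raters that agree with each other, whereas restricting to the central ratings first recovers a point near 0.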

slide-40
SLIDE 40

Proof Sketch

w.h.p. contains

W.h.p., the length of the k-shortest gap is bounded

Select the median points

If we consider these points, then w.h.p. no ratings with large variance are within a small distance of each other

slide-41
SLIDE 41

Proof Sketch

Thus the distance of the shortest gap points to the truth is bounded

slide-43
SLIDE 43

Lower bound

Instance: μ selected at random in {-L, +L}, with a prescribed variance for the j-th user

Optimal algorithm (known variance) has small loss

We will show that maximum likelihood estimation cannot distinguish between -L and +L → loss of order L

slide-45
SLIDE 45

Lower Bound

Consider the two log-likelihoods (for μ = -L and for μ = +L)

Claim: Irrespective of the value of μ, their difference can be positive or negative with constant probability.

slide-47
SLIDE 47

Multiple items

The idea is to use the same algorithm for a constant number of items, but with a smarter version of the k-shortest gap that looks for k points that are close together in all the items

slide-48
SLIDE 48

Multiple items

Theorem: For m = o(log n), complete graph, can get a provable bound on the expected loss

Theorem: For m = Ω(log n), complete or dense random graph, expected loss almost identical to the known variance case

slide-49
SLIDE 49

Aggregation

  • Formalizing a simple crowdsourcing task

– Tasks with hidden labels, varying user expertise

  • Aggregation for binary tasks

– stochastic model of user behaviour
– algorithms to estimate task labels + expertise

  • Continuous feedback
  • Ranking

slide-52
SLIDE 52

Crowdsourced rankings

How can we aggregate noisy rankings?

slide-53
SLIDE 53

Mallows Model [Mallows 1957]

There is a hidden permutation σ and a scale parameter β

A permutation π is generated with probability

$$\Pr[\pi] \propto e^{-\beta\,\kappa(\sigma,\pi)}$$

κ(σ,π) = Kendall-Tau distance

Braverman, Mossel '09: Finding the MLE for single parameter Mallows
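To make the model concrete, the sketch below computes the Kendall-Tau distance and the Mallows probabilities by brute force, assuming the standard form in which π has probability proportional to exp(-β·κ(σ,π)). Enumerating all permutations is only feasible for a handful of items; the function names are illustrative.

```python
import math
from itertools import combinations, permutations

def kendall_tau(sigma, pi):
    """Number of item pairs that sigma and pi order differently."""
    pos = {item: rank for rank, item in enumerate(pi)}
    return sum(1 for a, b in combinations(sigma, 2) if pos[a] > pos[b])

def mallows_pmf(sigma, beta):
    """Probability of every permutation under Mallows(sigma, beta)."""
    weights = {pi: math.exp(-beta * kendall_tau(sigma, pi))
               for pi in permutations(sigma)}
    z = sum(weights.values())
    return {pi: w / z for pi, w in weights.items()}

sigma = (1, 2, 3)
pmf = mallows_pmf(sigma, beta=1.0)
```

The hidden permutation σ is the mode of the distribution, and larger β concentrates more mass on it.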

slide-54
SLIDE 54

Mallows Model

There is a hidden permutation σ and a user-specific scale parameter β_i

slide-55
SLIDE 55

Single item with known parameters

Theorem: For m samples, if the parameters satisfy a sufficient condition, then σ can be recovered w.h.p.

Theorem: There is a condition under which σ cannot be recovered

Approximate reconstruction versions of these theorems also hold

[Chierichetti, D., Kumar, Lattanzi, RANDOM'14]

Algo: Weighted Borda count, weights = thresholded β values
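A weighted Borda count could look like the sketch below. The thresholding rule used here (a user's weight is β if it is at least `threshold`, and 0 otherwise) is an assumption made for illustration; the rule used in the paper may differ.

```python
def weighted_borda(rankings, betas, threshold):
    """rankings: one tuple per user, best item first; betas: known parameters."""
    weights = [b if b >= threshold else 0.0 for b in betas]  # assumed rule
    items = rankings[0]
    n = len(items)
    score = {x: 0.0 for x in items}
    for ranking, w in zip(rankings, weights):
        for position, x in enumerate(ranking):
            score[x] += w * (n - 1 - position)   # Borda points, scaled by weight
    return sorted(items, key=lambda x: -score[x])

rankings = [("a", "b", "c"), ("a", "c", "b"), ("c", "b", "a")]
betas = [2.0, 1.0, 0.1]          # hypothetical known parameters
aggregate = weighted_borda(rankings, betas, threshold=0.5)
```

Here the third user falls below the threshold and is ignored, so the aggregate follows the two more concentrated users.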

slide-56
SLIDE 56

Summary

  • Host of interesting problems in crowdsourcing aggregation

– Especially for structured outputs

  • For binary tasks

– Spectral techniques provide a powerful tool

  • For Gaussians

– new aggregation problems even for a single item
– Combination of k-median & k-shortest gap

  • For ranking

– Main technical contribution is calculating the swapping probs
– aggregation with known parameters is nontrivial

slide-57
SLIDE 57

Open questions

  • Continuous feedback

– More natural algorithms for aggregation?
– Better algorithms for multiple items
– Instance optimal algorithms?
– Non-Gaussian distributions?
– Mixture learning with lots of components and single/constant samples per component?

  • Ranking

– Better estimation of Mallows parameters
– Multiple items, under partial ranking/pairwise preferences?

  • More realistic complex model of user?

– Incorporating user bias?
– different kind of expertise, not just reliability

slide-58
SLIDE 58

Thanks!