Crowdsourcing Quality Control Madalina and Jao-ke CS 286r '12 - - PowerPoint PPT Presentation



slide-1
SLIDE 1

Crowdsourcing Quality Control

Madalina and Jao-ke CS 286r '12

slide-7
SLIDE 7

Motivation

What do we do when we want to answer a question?

  • We ask the question to lots of people!

What do we want to do when some workers are better than others?

  • We want to weight the good workers' answers more heavily than those of the bad workers.

How can we tell which workers are "good" and which workers are "bad"?

  • Proof by majority?
  • Nope. (Sorry, democracy.)
slide-13
SLIDE 13

Set-up

  • Tasks have binary answers
  • Each worker performs the same number of tasks; each task is performed by the same number of workers
  • Each worker $w_j$ has a certain reliability $p_j$, the probability of answering a task correctly, which is the same for all tasks

Example: spammer-hammer model

Reasonable? How so? How not?

  • Crowd has average quality $q := \mathbb{E}[(2p_j - 1)^2]$
  • Roles of reliability and quality in the algorithm?
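The set-up above can be made concrete with a small simulation. This is a minimal sketch, assuming the usual reading of the spammer-hammer model: a "hammer" always answers correctly (p_j = 1) and a "spammer" answers at random (p_j = 1/2). The function names and the 30% hammer fraction are illustrative assumptions, not part of the slides.

```python
import random

def sample_spammer_hammer(n_workers, hammer_fraction, rng):
    """Draw reliabilities p_j: 1.0 (hammer) with the given probability, else 0.5 (spammer)."""
    return [1.0 if rng.random() < hammer_fraction else 0.5 for _ in range(n_workers)]

def crowd_quality(reliabilities):
    """Empirical version of q = E[(2 p_j - 1)^2]."""
    return sum((2.0 * p - 1.0) ** 2 for p in reliabilities) / len(reliabilities)

def answer(truth, p, rng):
    """A worker with reliability p reports the true +1/-1 label with probability p."""
    return truth if rng.random() < p else -truth

rng = random.Random(0)
workers = sample_spammer_hammer(1000, 0.3, rng)  # hypothetical crowd
print(crowd_quality(workers))  # equals the fraction of hammers in this sample
```

Under this model each hammer contributes 1 and each spammer contributes 0 to the average, so q is simply the fraction of hammers. Note that a worker who is always wrong (p_j = 0) also contributes 1: such a worker is as informative as a hammer once the answers are flipped.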
slide-14
SLIDE 14

Graph Theory

  • G(V,E) Bipartite Graph:
slide-15
SLIDE 15

Graph Theory

  • Workers: n = 6, r = ?
  • Tasks: m = 4, l = ?
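Because every task goes to the same number of workers and every worker gets the same number of tasks, the edge count must match on both sides: m · l = n · r. The sketch below builds one such assignment with a simple round-robin rule. It assumes l is the number of workers per task and r the number of tasks per worker; the function name and the choice l = 3 are illustrative only, and a random regular graph would be drawn differently.

```python
from collections import Counter

def regular_assignment(m, n, l):
    """Assign each of m tasks to l distinct workers so that all n workers get equal load.

    Returns (edges, r), where edges is a list of (task, worker) pairs and
    r is the number of tasks per worker.
    """
    if l > n or (m * l) % n != 0:
        raise ValueError("need l <= n and m * l divisible by n")
    r = m * l // n
    # Edge k joins task k // l to worker k % n: consecutive workers, wrapping around.
    edges = [(k // l, k % n) for k in range(m * l)]
    return edges, r

edges, r = regular_assignment(m=4, n=6, l=3)
print(r)                                   # tasks per worker
print(Counter(task for task, _ in edges))  # every task has degree l
print(Counter(w for _, w in edges))        # every worker has degree r
```

With m = 4 and n = 6, the pair l = 3, r = 2 is one consistent choice (4 · 3 = 6 · 2 = 12 edges); other pairs satisfying the same identity are equally valid.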

slide-16
SLIDE 16

Graph Theory

[Figure: bipartite graph with workers 1-6 on one side and tasks 1-4 on the other]

slide-17
SLIDE 17

Algorithm
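A minimal sketch of an iterative message-passing scheme on the worker-task graph. It assumes the update rules of the Karger-Oh-Shah style iterative algorithm, which the slide text itself does not spell out: tasks send workers a running belief about the answer, workers send tasks a running estimate of their own trustworthiness, and each message leaves out the recipient's own contribution. The data layout, function name, and the deterministic all-ones start used when no `rng` is given are assumptions made for illustration.

```python
import random

def iterative_inference(answers, iters=10, rng=None):
    """answers: dict {(task, worker): +1 or -1}. Returns {task: +1 or -1}."""
    edges = list(answers)
    by_task, by_worker = {}, {}
    for i, j in edges:
        by_task.setdefault(i, []).append(j)
        by_worker.setdefault(j, []).append(i)

    # Worker-to-task messages: Normal(1, 1) draws if rng is given, else all ones.
    y = {e: (rng.gauss(1.0, 1.0) if rng else 1.0) for e in edges}

    for _ in range(iters):
        # Task-to-worker: weighted vote on task i, excluding worker j's own answer.
        x = {(i, j): sum(answers[i, k] * y[i, k] for k in by_task[i] if k != j)
             for i, j in edges}
        # Worker-to-task: agreement of worker j with beliefs on their other tasks.
        y = {(i, j): sum(answers[k, j] * x[k, j] for k in by_worker[j] if k != i)
             for i, j in edges}
        # Rescale so magnitudes stay bounded; signs are unaffected.
        scale = max(abs(v) for v in y.values()) or 1.0
        y = {e: v / scale for e, v in y.items()}

    # Final decision: sign of the trust-weighted vote over all workers on the task.
    return {i: 1 if sum(answers[i, j] * y[i, j] for j in ws) >= 0 else -1
            for i, ws in by_task.items()}

truth = {0: 1, 1: -1, 2: 1, 3: -1}
# Six workers answer every task; worker 0 always reverses the true answer.
answers = {(i, j): (-t if j == 0 else t) for i, t in truth.items() for j in range(6)}
print(iterative_inference(answers) == truth)
```

In this toy run the reversing worker ends up with a negative weight, so their flipped answers count in favour of the correct label rather than against it.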

slide-18
SLIDE 18

Algorithm Example: Updates

[Figure: bipartite graph with workers 1-6 on one side and tasks 1-4 on the other]

slide-19
SLIDE 19

Commercial Break

Say you have a biased coin that comes up Heads with probability p ≠ 1/2 and Tails with probability 1 - p. How can you estimate p?
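One standard answer is to flip the coin many times and use the fraction of Heads. The sketch below does that and uses Hoeffding's inequality to pick a number of flips that puts the estimate within ε of p with probability at least 1 - δ. The true bias 0.7 and the tolerance values are hypothetical, chosen only for the demonstration.

```python
import math
import random

def estimate_bias(flips):
    """Fraction of Heads (True) in a sequence of flips."""
    return sum(flips) / len(flips)

def flips_needed(eps, delta):
    """Hoeffding: n >= ln(2 / delta) / (2 eps^2) gives |p_hat - p| <= eps w.p. >= 1 - delta."""
    return math.ceil(math.log(2.0 / delta) / (2.0 * eps ** 2))

rng = random.Random(0)
p = 0.7                        # hypothetical true bias, unknown to the estimator
n = flips_needed(0.05, 0.05)   # flips for +/- 0.05 accuracy at 95% confidence
flips = [rng.random() < p for _ in range(n)]
p_hat = estimate_bias(flips)
print(n, p_hat)
```

The required number of flips grows as 1/ε², so halving the tolerance costs four times as many flips.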

slide-20
SLIDE 20

Algorithm Example: Updates

slide-21
SLIDE 21

Optimality Discussion

Oracle error:

The minimax error rate achieved by the best possible graph G in G(m; l) using the best possible inference algorithm is at least

Majority vote:

Iterative algorithm:
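For reference, majority vote is the baseline that weights every worker equally. A minimal sketch follows, using an assumed data layout of `{(task, worker): +1 or -1}` and an arbitrary tie-break toward +1. The toy crowd is hypothetical and is chosen to show the failure case: when unreliable workers outnumber reliable ones, the unweighted vote follows the unreliable majority.

```python
def majority_vote(answers):
    """answers: dict {(task, worker): +1 or -1}. Ties break toward +1."""
    totals = {}
    for (task, _worker), a in answers.items():
        totals[task] = totals.get(task, 0) + a
    return {task: 1 if s >= 0 else -1 for task, s in totals.items()}

truth = {0: 1, 1: -1}
# Worker 0 is always right; workers 1 and 2 always reverse the true answer.
answers = {(i, j): (t if j == 0 else -t) for i, t in truth.items() for j in range(3)}
print(majority_vote(answers))  # the reversed answers win on every task
```

Any method that estimates per-worker reliability has room to beat this baseline, since it can discount or even invert the answers of workers it finds unreliable.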

slide-22
SLIDE 22

Algorithm Properties

  • Does not require a prior, unlike belief propagation
  • Comes with performance guarantees, unlike expectation maximization
  • Performance:
slide-24
SLIDE 24

Worker Bias

  • Worker A always reverses answers; Worker B always gives the same answer
  • How can we separate out bias from error?
slide-25
SLIDE 25

Worker Bias

Main idea: Given each worker's error rates, error costs, and priors for the correct answer distributions, we transform workers' "hard" answers into "soft" answers that have minimal (error-associated) costs.
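The idea can be sketched for the two workers from the previous slide. This is an illustrative sketch under stated assumptions, not the presenters' implementation: each worker is described by a confusion matrix `confusion[true][assigned]`, a "soft" answer is the Bayes posterior over true labels given the worker's "hard" answer, and the cost of a soft answer is the sum over label pairs (i, j) of soft[i] · soft[j] · cost[i][j]. All names, the uniform prior, and the 0/1 cost matrix are hypothetical.

```python
def soft_label(assigned, confusion, prior):
    """Posterior over true labels given the worker's hard answer (Bayes' rule)."""
    joint = [prior[t] * confusion[t][assigned] for t in range(len(prior))]
    total = sum(joint)
    return [v / total for v in joint]

def expected_cost(soft, cost):
    """Cost of a soft label: sum over label pairs of soft[i] * soft[j] * cost[i][j]."""
    k = len(soft)
    return sum(soft[i] * soft[j] * cost[i][j] for i in range(k) for j in range(k))

def worker_cost(confusion, prior, cost):
    """Average soft-label cost over the answers this worker actually gives."""
    k = len(prior)
    total = 0.0
    for a in range(k):
        p_a = sum(prior[t] * confusion[t][a] for t in range(k))  # P(worker says a)
        if p_a > 0:
            total += p_a * expected_cost(soft_label(a, confusion, prior), cost)
    return total

prior = [0.5, 0.5]
cost = [[0, 1], [1, 0]]                # unit cost for any misclassification
reverser = [[0.0, 1.0], [1.0, 0.0]]    # Worker A: always gives the opposite label
constant = [[1.0, 0.0], [1.0, 0.0]]    # Worker B: always answers label 0
print(worker_cost(reverser, prior, cost), worker_cost(constant, prior, cost))
```

Worker A's bias is fully recoverable, so the corrected answers carry no cost, while Worker B's answers tell us nothing beyond the prior and therefore carry the same cost as guessing from the prior.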

slide-26
SLIDE 26

Thank You!