Crowdsourcing Quality Control
Madalina and Jao-ke
CS 286r '12
Motivation
What do we do when we want to answer a question?
- We ask the question to lots of people!
What do we want to do when some workers are better than others?
- We want to weight the good workers' answers more heavily than those of the bad workers.
How can we tell which workers are "good" and which workers are "bad"?
- Proof by majority?
- Nope. (Sorry, democracy.)
Set-up
- Tasks have binary answers
- Each worker performs the same number of tasks; each task is performed by the same number of workers
- Each worker w_j has a certain reliability p_j: the probability of answering a task correctly, the same for all tasks
  ○ Example: spammer-hammer model
  ○ Reasonable? How so? How not?
- The crowd has average quality q := E[(2p_j - 1)^2]
- Roles of reliability and quality in the algorithm?
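To make the set-up concrete, here is a minimal Python sketch of the spammer-hammer model and the quality q. It assumes the usual reading of that model, where a "hammer" always answers correctly (p_j = 1) and a "spammer" answers at random (p_j = 1/2), and it encodes binary answers as +1 / -1. The function names and the 30% hammer fraction are illustrative, not from the slides.

```python
import random

def sample_reliabilities(num_workers, hammer_fraction, rng):
    """Spammer-hammer model (assumed): p_j is 1.0 for a hammer, 0.5 for a spammer."""
    return [1.0 if rng.random() < hammer_fraction else 0.5
            for _ in range(num_workers)]

def crowd_quality(reliabilities):
    """Empirical version of q = E[(2 p_j - 1)^2]."""
    return sum((2 * p - 1) ** 2 for p in reliabilities) / len(reliabilities)

def worker_answer(truth, p, rng):
    """Return the true label (+1 or -1) with probability p, else the flipped label."""
    return truth if rng.random() < p else -truth

rng = random.Random(0)
reliabilities = sample_reliabilities(1000, 0.3, rng)
# Under this model (2 p_j - 1)^2 is 1 for hammers and 0 for spammers,
# so q should come out near the hammer fraction.
print(crowd_quality(reliabilities))
```

Note that a worker with p_j = 0 (always wrong) contributes as much to q as a worker with p_j = 1: both are fully informative, which is why q squares the term 2p_j - 1.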
Graph Theory
- G(V,E): bipartite graph between workers and tasks
- Workers: n = 6, r = ?
- Tasks: m = 4, l = ?
[Figure: bipartite graph with workers 1-6 on one side and tasks 1-4 on the other]
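Counting edges from either side of the bipartite graph constrains these parameters. The sketch below assumes r is the number of tasks per worker and l is the number of workers per task (the slides do not define them), so that m * l = n * r. The helper names are hypothetical, and the output only lists degree pairs that are consistent with the counts; it does not say which pair the slide's graph uses.

```python
def edge_counts_match(m, l, n, r):
    """Each of m tasks has l workers and each of n workers has r tasks,
    so counting edges from either side must agree: m * l == n * r."""
    return m * l == n * r

def feasible_degrees(m, n, max_l):
    """List (l, r) pairs with 1 <= l <= max_l for which m * l == n * r."""
    pairs = []
    for l in range(1, max_l + 1):
        if (m * l) % n == 0:
            pairs.append((l, m * l // n))
    return pairs

print(feasible_degrees(m=4, n=6, max_l=6))  # [(3, 2), (6, 4)]
```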
Algorithm
Algorithm Example: Updates
[Figure: the worker-task graph (workers 1-6, tasks 1-4) used to illustrate the updates]
Commercial Break
Say you have a biased coin that comes up Heads with probability p ≠ 1/2 and Tails with probability 1 - p. How can you estimate p?
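One standard answer is to flip the coin many times and use the fraction of Heads. A minimal Python sketch, with hypothetical function names and an arbitrary p = 0.7 chosen only for the simulation:

```python
import math
import random

def estimate_p(flips):
    """Fraction of heads, where each flip is True for Heads."""
    return sum(flips) / len(flips)

def standard_error(p_hat, n):
    """Approximate standard error of the estimate after n independent flips."""
    return math.sqrt(p_hat * (1 - p_hat) / n)

rng = random.Random(0)
true_p = 0.7  # assumed value, unknown to the estimator
flips = [rng.random() < true_p for _ in range(10_000)]
p_hat = estimate_p(flips)
print(p_hat, standard_error(p_hat, len(flips)))
```

The standard error shrinks like 1/sqrt(n), so quadrupling the number of flips roughly halves the uncertainty.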
Algorithm Example: Updates
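The update rules themselves are not given in text here. As a rough sketch, the code below assumes they are the message-passing updates of the iterative algorithm by Karger, Oh, and Shah, which this set-up (binary tasks, spammer-hammer model, quality q) appears to follow. Answers are encoded as +1 / -1, the function name is hypothetical, and messages are initialised to 1.0 to keep the sketch deterministic.

```python
def iterative_inference(answers, num_iters=10):
    """answers: dict mapping (task, worker) -> +1 or -1.
    Returns a dict mapping task -> estimated label (+1 or -1)."""
    tasks = sorted({i for i, _ in answers})
    workers = sorted({j for _, j in answers})
    workers_of = {i: [j for j in workers if (i, j) in answers] for i in tasks}
    tasks_of = {j: [i for i in tasks if (i, j) in answers] for j in workers}

    # Worker-to-task messages y[(j, i)]; all start at 1.0 (assumption:
    # a random initialisation is another common choice).
    y = {(j, i): 1.0 for (i, j) in answers}
    for _ in range(num_iters):
        # Task-to-worker message: weighted vote of the OTHER workers on task i.
        x = {(i, j): sum(answers[(i, k)] * y[(k, i)]
                         for k in workers_of[i] if k != j)
             for (i, j) in answers}
        # Worker-to-task message: how well worker j agrees with the current
        # beliefs on the OTHER tasks it answered.
        y = {(j, i): sum(answers[(h, j)] * x[(h, j)]
                         for h in tasks_of[j] if h != i)
             for (i, j) in answers}
    # Final decision: sign of the reliability-weighted vote on each task.
    scores = {i: sum(answers[(i, j)] * y[(j, i)] for j in workers_of[i])
              for i in tasks}
    return {i: 1 if s >= 0 else -1 for i, s in scores.items()}

truth = {0: 1, 1: -1, 2: 1, 3: -1}
answers = {}
for task, label in truth.items():
    answers[(task, "a")] = label    # reliable worker
    answers[(task, "b")] = label    # reliable worker
    answers[(task, "c")] = -label   # worker who always reverses
print(iterative_inference(answers) == truth)  # True
```

In this toy run the worker who always reverses ends up with a negative weight, so their flipped answers count as evidence for the correct label rather than against it.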
Optimality Discussion
- Oracle error:
- The minimax error rate achieved by the best possible graph G in G(m; l) using the best possible inference algorithm is at least:
- Majority vote:
- Iterative algorithm:
Algorithm Properties
- Does not require a prior, unlike belief propagation
- Comes with performance guarantees, unlike expectation maximization
- Performance:
Worker Bias
- Worker A always reverses answers; Worker B always gives the same answer
- How can we separate out bias from error?
Worker Bias
Main idea: Given each worker's error rates, error costs, and priors for the correct answer distributions, we transform workers' "hard" answers into "soft" answers that have minimal (error-associated) costs.
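A minimal Python sketch of this idea, under stated assumptions: each worker's error rates are given as a confusion table confusion[true][given] = P(worker answers `given` | truth is `true`), the soft answer is the Bayes posterior over the true class, and the cost of a soft answer is the expected misclassification cost under that posterior. The function names, the 50/50 prior, and the 0/1 cost table are illustrative choices, not taken from the slides.

```python
def soft_label(hard_label, confusion, prior):
    """Posterior over the true class given one worker's hard answer.
    Assumes the worker can produce `hard_label` with nonzero probability."""
    weights = {k: prior[k] * confusion[k][hard_label] for k in prior}
    total = sum(weights.values())
    return {k: w / total for k, w in weights.items()}

def expected_cost(soft, cost):
    """Sum over (true class k, assigned class k2) of soft[k] * soft[k2] * cost[k][k2]."""
    return sum(soft[k] * soft[k2] * cost[k][k2] for k in soft for k2 in soft)

prior = {"yes": 0.5, "no": 0.5}
cost = {"yes": {"yes": 0.0, "no": 1.0}, "no": {"yes": 1.0, "no": 0.0}}

# Worker A always reverses answers.
worker_a = {"yes": {"yes": 0.0, "no": 1.0}, "no": {"yes": 1.0, "no": 0.0}}
# Worker B always answers "yes".
worker_b = {"yes": {"yes": 1.0, "no": 0.0}, "no": {"yes": 1.0, "no": 0.0}}

soft_a = soft_label("yes", worker_a, prior)
soft_b = soft_label("yes", worker_b, prior)
print(soft_a, expected_cost(soft_a, cost))  # {'yes': 0.0, 'no': 1.0} 0.0
print(soft_b, expected_cost(soft_b, cost))  # {'yes': 0.5, 'no': 0.5} 0.5
```

Under these assumptions Worker A's bias is fully corrected (a "yes" becomes a certain "no" at zero cost), while Worker B's answer carries no information and the soft answer falls back to the prior.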