University of California, Berkeley
Bayesian Bias Mitigation for Crowdsourcing
Fabian L. Wauthier, UC Berkeley with Michael I. Jordan 9th of May, 2012
Fabian L. Wauthier: Bayesian Bias Mitigation for Crowdsourcing, 1
The Problem of Bias in Crowdsourcing
◮ Crowdsourcing: collect data from a crowd and learn a model.
◮ E.g. Amazon Mechanical Turk.
◮ Labelers may be malicious/unhelpful, or tasks ambiguous/hard.
◮ ⇒ Crowdsourced data is biased.
◮ Problem: Can we still learn from partially biased data?
◮ “Robot: Get me the brown guitar behind the couch.”
◮ Human label data would be ambiguous:
◮ There can be structural differences between labelers.
◮ How to learn from this data?
◮ Bias addressed in three stages of a pipeline:
◮ Common Assumptions:
◮ Inappropriate when tasks are subjective or particularly hard.
Outline
◮ Contribution I: Bayesian Preference Model
◮ BBMC Results
◮ Contribution II: Approximate Active Learning
◮ Active Learning Results
◮ Conclusion
Contribution I: Bayesian Preference Model
◮ Unify pipeline steps in a Bayesian model.
◮ Benefits:
◮ Tasks i, labelers l.
◮ Example task: “Is the guitar behind or next to the couch?”
◮ Task covariates xi ∈ R^d, i = 1, ..., n, collected in X.
◮ Labels yi,l ∈ {−1, 0, +1}, i = 1, ..., n; l = 1, ..., m, collected in Y.
Labelers express accumulated, shared preferences.
◮ Parameter γb models the effect of preference b = 1, ..., K.
◮ An m × K binary matrix Z models parameter sharing.
◮ If zl,b = 1, labeler l expresses preference b.
◮ Parameter βl accumulates preferences:
    βl = Σ_{b=1}^{K} zl,b γb
◮ Likelihood:
    p(Y | X, Z, γ) = Π_{i=1}^{n} Π_{l=1}^{m} p(yi,l | βl⊤ xi)
◮ Similar preferences ⇒ similar labeling.
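The accumulation-and-likelihood structure above can be sketched numerically. The sketch below assumes a probit link, p(yi,l = +1 | βl⊤xi) = Φ(βl⊤xi), matching the Φ used in the synthetic experiment later; the function names are illustrative, not the paper's code, and labels equal to 0 are treated as unobserved.

```python
import numpy as np
from math import erf, sqrt

def Phi(z):
    # Standard normal CDF, used as the probit link.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def log_likelihood(Y, X, Z, gamma):
    """Log of p(Y | X, Z, gamma) under the preference model.

    Y: (n, m) labels in {-1, 0, +1}; 0 marks an unobserved pair.
    X: (n, d) task covariates, Z: (m, K) binary sharing matrix,
    gamma: (K, d) preference effects.
    """
    beta = Z @ gamma                  # (m, d): beta_l = sum_b z_lb * gamma_b
    margins = X @ beta.T              # (n, m): x_i^T beta_l
    ll = 0.0
    for i, l in zip(*np.nonzero(Y)):  # only observed labels contribute
        p_plus = Phi(margins[i, l])
        ll += np.log(p_plus if Y[i, l] == +1 else 1.0 - p_plus)
    return ll
```

Note how parameter sharing enters only through `beta = Z @ gamma`: labelers with identical rows of Z get identical βl, and hence identical label distributions.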
◮ Prior on γb: p(γb) = N(0, σ²I) for each b.
◮ Prior on Z: fix Z to be m × K, with
    πb | α ∼ Beta(α/K, 1),  b = 1, ..., K        (1)
    zl,b | πb ∼ Bern(πb),  l = 1, ..., m         (2)
◮ As K → ∞, the distribution over Z converges to the Indian Buffet Process (IBP).
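The finite Beta–Bernoulli prior in (1)–(2) is straightforward to sample, and letting K → ∞ recovers the IBP as the slide notes. A minimal sketch (the helper name is mine):

```python
import numpy as np

def sample_Z(m, K, alpha, rng=None):
    """Draw the m x K sharing matrix from the finite Beta-Bernoulli prior:
    pi_b ~ Beta(alpha/K, 1), then z_lb ~ Bern(pi_b) independently per labeler.
    As K -> infinity this construction converges to the Indian Buffet Process."""
    rng = np.random.default_rng(rng)
    pi = rng.beta(alpha / K, 1.0, size=K)       # per-preference popularity
    Z = (rng.random((m, K)) < pi).astype(int)   # each labeler picks preferences
    return Z
```

Smaller α/K pushes each πb toward 0, so most preferences are expressed by few labelers, which is the usual sparsity behavior of this prior.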
◮ Joint model:
    p(Y, Z, γ | X) = p(Y | X, Z, γ) p(γ | Z) p(Z)
◮ Recall bias: different labelers can have different β's.
    Example: disagreement on whether the guitar is behind or next to the couch.
◮ Want to predict labeler l's labels.
◮ Labeler l could be in the crowd, or the gold standard.
◮ Required inference: p(βl | X, Y), or equivalently p(zl,b, γb, b = 1, ..., K | X, Y).
◮ The model is complex; exact inference is intractable.
◮ Possible alternatives: Gibbs sampling, variational inference, slice sampling, etc.
BBMC Results
◮ X is a 2000 × 4 Gaussian matrix.
◮ Z is a 30 × 2 uniform binary matrix (m = 30, K = 2).
◮ γb Gaussian for b = 1, 2; βl = Σb zl,b γb.
◮ Each label is observed with probability ε = 0.1:
    yi,l = 0 w.p. (1 − ε),  +1 w.p. ε Φ(xi⊤ βl),  −1 w.p. ε (1 − Φ(xi⊤ βl))
◮ Inference: want to recover β1 (say).
◮ Requires p(z1,b, γb, b = 1, ..., K | X, Y).
◮ For inference, set K = 10.
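Under the stated setup, the synthetic data can be generated as follows. The exact sampling details (e.g. the probit draw for the label sign) are my reading of the slide, not verified against the paper's code:

```python
import numpy as np
from math import erf, sqrt

def make_synthetic(n=2000, d=4, m=30, K=2, eps=0.1, seed=0):
    """Generate the synthetic setup: Gaussian X, uniform binary Z,
    Gaussian gamma_b, beta_l = sum_b z_lb gamma_b, and labels that are
    observed with probability eps and signed through a probit link."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))              # 2000 x 4 Gaussian covariates
    Z = rng.integers(0, 2, size=(m, K))          # 30 x 2 uniform binary sharing
    gamma = rng.standard_normal((K, d))          # Gaussian preference effects
    beta = Z @ gamma                             # accumulated labeler parameters
    margins = X @ beta.T
    Phi = np.vectorize(lambda z: 0.5 * (1 + erf(z / sqrt(2))))
    observed = rng.random((n, m)) < eps          # each (i, l) seen w.p. eps
    positive = rng.random((n, m)) < Phi(margins) # +1 w.p. Phi(x_i^T beta_l)
    Y = np.where(observed, np.where(positive, 1, -1), 0)
    return X, Y, Z, gamma
```

With ε = 0.1, only about a tenth of the 2000 × 30 possible labels are observed, so recovering β1 relies on borrowing strength from labelers who share preferences with labeler 1.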
◮ Latent Z mostly correct after 1000 Gibbs steps.
◮ Gibbs sequence for γ1,1: [trace plot of samples over iterations 1000–2000, values roughly in 0.55–0.75.]
◮ True β1 and its posterior mean after 1000 burn-in iterations:
    β1 = (0.6915, 0.0754, −0.6815, 0.6988)
    β̂1 = (0.6514, 0.0535, −0.6473, 0.6957)        (3)
◮ Task: Is the triangle to the left of or above the rectangle?
◮ Labeled on Amazon Mechanical Turk: 523 tasks, 3 labels per task, 76 labelers.
◮ Want to predict the gold standard: compare centroid positions.
◮ All 26 labelers with over 20 labels have error above 0.16.
◮ The researcher also labels, giving 60 gold-standard labels.
◮ Averaged log likelihood and error rate on the test set.
◮ Our model: BBMC.

    Algorithm   Final Loglik      Final Error
    --- No Active Learning ---
    GOLD        −3716 ± 1695      0.0547 ± 0.0102
    CONS        −421.1 ± 2.6      0.0935 ± 0.0031
    BBMC        −219.1 ± 3.1      0.0309 ± 0.0033
Contribution II: Approximate Active Learning
◮ Want to predict labeler l's labels. Need βl.
◮ Not all labelers are useful for inferring βl.
◮ If l and l′ share parameters ⇒ can learn about βl from l′:
    βl = Σb zl,b γb,   βl′ = Σb zl′,b γb        (4)
◮ Active learning: repeatedly select training data that helps in learning βl.
◮ Goal: cheaper training data, faster learning.
◮ Suppose we start with training data Y.
◮ Query the task–labeler pair (i, l) that maximizes the expected utility of adding it:
    (i, l) = argmax_{(i′,l′)} E_{yi′,l′} ‖Mean(·) − µ‖₂²
◮ For each (i′, l′) score, we need the posterior p(β | yi′,l′, X, Y).
◮ Gibbs sampling ⇒ separate Gibbs samplers to score each (i′, l′).
◮ We are already running one Gibbs sampler for basic inference.
◮ Problem: Can we avoid running the extra scoring chains?
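The greedy query-selection step can be sketched as below. The `score_posterior` callback is hypothetical: it stands in for whatever produces the label probabilities and candidate posterior means, e.g. the per-candidate scoring chains just discussed.

```python
import numpy as np

def select_query(candidates, posterior_mean, score_posterior):
    """Greedy active-learning step: pick the (task, labeler) pair whose
    label, in expectation, moves the posterior mean of beta the most.

    score_posterior(i, l, y) must return (prob_of_y, mean_given_y) for
    y in {-1, +1}; it is a stand-in for the candidate scoring chains,
    not an API from the paper.
    """
    best, best_utility = None, -np.inf
    for (i, l) in candidates:
        utility = 0.0
        for y in (-1, +1):
            p_y, mean_y = score_posterior(i, l, y)
            # expected squared l2 shift of the posterior mean
            utility += p_y * np.sum((mean_y - posterior_mean) ** 2)
        if utility > best_utility:
            best, best_utility = (i, l), utility
    return best
```

This loop makes the cost problem concrete: every candidate pair needs two conditional posteriors per round, which is exactly what the approximation in the next slides avoids recomputing from scratch.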
◮ The Gibbs sampler for p(β | X, Y) is a Markov chain for inference.
◮ The sampler for p(β | yi′,l′, X, Y) is a perturbed chain for scoring.
◮ Naïve scoring: run a separate perturbed chain β̂t−1 → β̂t → β̂t+1 → β̂t+2 alongside the inference chain βt−1 → βt → βt+1 → βt+2.
◮ Our method: take perturbed steps β̂t+1, β̂t+2, β̂t+3 directly from samples of p(β | X, Y).
◮ Suppose a chain p(βt | βt−1) and a perturbed chain p̂(β̂t | β̂t−1).
◮ Their stationary distributions are p∞(β) and p̂∞(β̂).
◮ Let βs ∼ p∞(β), s = 1, ..., S, and approximate
    p̂∞(β̂) ≈ ∫ p̂(β̂ | β) p∞(β) dβ ≈ (1/S) Σ_{s=1}^{S} p̂(β̂ | βs).
◮ If p∞(β) = p̂∞(β), the first approximation is exact.
◮ Specialize to active learning:
◮ Suppose W is n × n, positive, and symmetric; let D be the diagonal matrix of its row sums and P = D⁻¹W.
◮ The stationary distribution is the left eigenvector of P. Decompose
    A = D^{−1/2} W D^{−1/2}                         (5)
      = V Λ V⊤,  λ1 ≤ λ2 ≤ ... ≤ λn = 1            (6)
    p∞ ∝ D^{1/2} vn                                 (7)
◮ Perturb the matrix: Ŵ = W + dW ≥ 0, with dW 1 = 0.
◮ Then P̂ = D^{−1} Ŵ = P + D^{−1} dW = P + dP.
◮ Matrix perturbation theory:
    p̃∞ ≈ p∞ + D^{1/2} Σ_{k≠n} (vk vk⊤ / (1 − λk)) dP⊤ D^{−1/2} p∞        (8)
◮ Works for discrete random walks, but not in general.
◮ Our method is general and approximates
    p̂∞ ≈ P̂⊤ p∞ = p∞ + dP⊤ p∞.                                            (9)
◮ If D = I, accuracy depends on the spectral gap: the discrepancy ‖p̃∞ − p̂∞‖ scales with 1/(1 − λn−1).        (10)
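Equation (9) suggests a cheap numeric check: push the unperturbed stationary distribution through the perturbed transition matrix and compare against the exact perturbed stationary distribution. A small sketch on a toy 5-state random walk of my own construction:

```python
import numpy as np

def stationary(P, iters=2000):
    """Stationary distribution of a stochastic matrix P by power iteration."""
    p = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iters):
        p = P.T @ p
    return p

# Base random walk from a positive symmetric W, as on the slides: P = D^{-1} W.
rng = np.random.default_rng(0)
W = rng.random((5, 5)) + 1.0
W = (W + W.T) / 2.0                     # symmetric, strictly positive
P = W / W.sum(axis=1, keepdims=True)

p_inf = stationary(P)

# Perturbation dW with zero row sums (dW 1 = 0), so D is unchanged.
dW = np.zeros((5, 5))
dW[0, 1], dW[0, 2] = 0.05, -0.05
W_hat = W + dW                          # stays nonnegative since W >= 1
P_hat = W_hat / W_hat.sum(axis=1, keepdims=True)

# One-step approximation (9): p_hat_inf ~ P_hat^T p_inf = p_inf + dP^T p_inf.
approx = P_hat.T @ p_inf
exact = stationary(P_hat)
err = np.abs(approx - exact).max()
```

For a small perturbation the one-step push already lands close to the exact perturbed stationary distribution, which is the intuition behind reusing the inference chain's samples for scoring.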
Active Learning Results
◮ Task: Is the triangle to the left of or above the rectangle?
◮ Active learning methods can query 100 labels.
◮ Here: only query the gold standard (could be another labeler).
◮ Averaged log likelihood and error rate on the test set.
◮ BBMC and BBMC-ACT: our method without and with active learning.

    Algorithm    Final Loglik      Final Error
    --- No Active Learning ---
    GOLD         −3716 ± 1695      0.0547 ± 0.0102
    CONS         −421.1 ± 2.6      0.0935 ± 0.0031
    BBMC         −219.1 ± 3.1      0.0309 ± 0.0033
    --- Active Learning ---
    GOLD-ACT     −1957 ± 696       0.0290 ± 0.0037
    CONS-ACT     −396.1 ± 3.6      0.0906 ± 0.0024
    RAND-ACT     −186.0 ± 2.2      0.0292 ± 0.0029
    DIS-ACT      −198.3 ± 5.8      0.0392 ± 0.0052
    MCMC-ACT     −196.1 ± 6.7      0.0492 ± 0.0050
    BBMC-ACT     −160.8 ± 3.9      0.0188 ± 0.0018
Conclusion
◮ Bayesian model to mitigate label bias.
◮ Approximate active learning for Gibbs sampling inference.
◮ Avoids naïve MCMC scoring.