University of California, Berkeley
Bayesian Bias Mitigation for Crowdsourcing
Fabian L. Wauthier, UC Berkeley with Michael I. Jordan 9th of May, 2012
Fabian L. Wauthier: Bayesian Bias Mitigation for Crowdsourcing, 1
The Problem of Bias in Crowdsourcing
◮ Crowdsourcing: collect data from a crowd and learn a model.
◮ E.g. Amazon Mechanical Turk.
◮ Labelers may be malicious/unhelpful, or tasks ambiguous/hard.
◮ ⇒ Crowdsourced data is biased.
◮ Problem: Can we still learn from partially biased data?
◮ “Robot: Get me the brown guitar behind the couch.”
◮ Human label data would be ambiguous:
◮ There can be structural differences between labelers.
◮ How to learn from this data?
◮ Bias addressed in three stages of a pipeline:
◮ Common Assumptions:
◮ Inappropriate when tasks are subjective or particularly hard.
Outline
◮ Contribution I: Bayesian Preference Model
◮ BBMC Results
◮ Contribution II: Approximate Active Learning
◮ Active Learning Results
◮ Conclusion
Contribution I: Bayesian Preference Model
◮ Unify pipeline steps in a Bayesian model.
◮ Benefits:
◮ Tasks i, labelers l.
◮ Example task: “Is the guitar behind or next to the couch?”
◮ Task covariates xi ∈ R^d, i = 1, ..., n, collected in X.
◮ Labels yi,l ∈ {−1, 0, +1}, i = 1, ..., n; l = 1, ..., m, collected in Y.
Labelers express accumulated, shared preferences.
◮ Parameter γb models the effect of preference b = 1, ..., K.
◮ An m × K binary matrix Z models parameter sharing.
◮ If zl,b = 1, labeler l expresses preference b.
◮ Parameter βl accumulates preferences:
    βl = Σ_{b=1}^{K} zl,b γb
◮ Likelihood:
    p(Y | X, Z, γ) = Π_{i=1}^{n} Π_{l=1}^{m} p(yi,l | βl⊤ xi)
◮ Similar preferences ⇒ similar labeling.
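The accumulation-and-likelihood structure above can be sketched numerically. The sketch below assumes a probit link, p(yi,l = +1 | βl⊤xi) = Φ(βl⊤xi), matching the Φ used in the synthetic experiment later; the function names are illustrative, not the paper's code, and labels equal to 0 are treated as unobserved.

```python
import numpy as np
from math import erf, sqrt

def Phi(z):
    # Standard normal CDF, used as the probit link.
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def log_likelihood(Y, X, Z, gamma):
    """Log of p(Y | X, Z, gamma) under the preference model.

    Y: (n, m) labels in {-1, 0, +1}; 0 marks an unobserved pair.
    X: (n, d) task covariates, Z: (m, K) binary sharing matrix,
    gamma: (K, d) preference effects.
    """
    beta = Z @ gamma                  # (m, d): beta_l = sum_b z_lb * gamma_b
    margins = X @ beta.T              # (n, m): x_i^T beta_l
    ll = 0.0
    for i, l in zip(*np.nonzero(Y)):  # only observed labels contribute
        p_plus = Phi(margins[i, l])
        ll += np.log(p_plus if Y[i, l] == +1 else 1.0 - p_plus)
    return ll
```

Note how parameter sharing enters only through `beta = Z @ gamma`: labelers with identical rows of Z get identical βl, and hence identical label distributions.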
◮ Prior on γb: p(γb) = N(0, σ²I) for each b.
◮ Prior on Z: fix Z to be m × K, with
    πb | α ∼ Beta(α/K, 1),  b = 1, ..., K        (1)
    zl,b | πb ∼ Bern(πb),  l = 1, ..., m         (2)
◮ As K → ∞, the distribution over Z converges to the Indian Buffet Process (IBP).
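The finite Beta–Bernoulli prior in (1)–(2) is straightforward to sample, and letting K → ∞ recovers the IBP as the slide notes. A minimal sketch (the helper name is mine):

```python
import numpy as np

def sample_Z(m, K, alpha, rng=None):
    """Draw the m x K sharing matrix from the finite Beta-Bernoulli prior:
    pi_b ~ Beta(alpha/K, 1), then z_lb ~ Bern(pi_b) independently per labeler.
    As K -> infinity this construction converges to the Indian Buffet Process."""
    rng = np.random.default_rng(rng)
    pi = rng.beta(alpha / K, 1.0, size=K)       # per-preference popularity
    Z = (rng.random((m, K)) < pi).astype(int)   # each labeler picks preferences
    return Z
```

Smaller α/K pushes each πb toward 0, so most preferences are expressed by few labelers, which is the usual sparsity behavior of this prior.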
◮ Joint model:
    p(Y, Z, γ | X) = p(Y | X, Z, γ) p(γ | Z) p(Z)
◮ Recall bias: different labelers can have different β's.
    Example: disagreement on whether the guitar is behind or next to the couch.
◮ Want to predict labeler l's labels.
◮ Labeler l could be in the crowd, or the gold standard.
◮ Required inference: p(βl | X, Y), or equivalently p(zl,b, γb, b = 1, ..., K | X, Y).
◮ The model is complex; exact inference is intractable.
◮ Possible alternatives: Gibbs sampling, variational inference, slice sampling, etc.
BBMC Results
◮ X is a 2000 × 4 Gaussian matrix.
◮ Z is a 30 × 2 uniform binary matrix (m = 30, K = 2).
◮ γb Gaussian for b = 1, 2; βl = Σb zl,b γb.
◮ Each label is observed with probability ε = 0.1:
    yi,l = 0 w.p. (1 − ε),  +1 w.p. ε Φ(xi⊤ βl),  −1 w.p. ε (1 − Φ(xi⊤ βl))
◮ Inference: want to recover β1 (say).
◮ Requires p(z1,b, γb, b = 1, ..., K | X, Y).
◮ For inference, set K = 10.
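Under the stated setup, the synthetic data can be generated as follows. The exact sampling details (e.g. the probit draw for the label sign) are my reading of the slide, not verified against the paper's code:

```python
import numpy as np
from math import erf, sqrt

def make_synthetic(n=2000, d=4, m=30, K=2, eps=0.1, seed=0):
    """Generate the synthetic setup: Gaussian X, uniform binary Z,
    Gaussian gamma_b, beta_l = sum_b z_lb gamma_b, and labels that are
    observed with probability eps and signed through a probit link."""
    rng = np.random.default_rng(seed)
    X = rng.standard_normal((n, d))              # 2000 x 4 Gaussian covariates
    Z = rng.integers(0, 2, size=(m, K))          # 30 x 2 uniform binary sharing
    gamma = rng.standard_normal((K, d))          # Gaussian preference effects
    beta = Z @ gamma                             # accumulated labeler parameters
    margins = X @ beta.T
    Phi = np.vectorize(lambda z: 0.5 * (1 + erf(z / sqrt(2))))
    observed = rng.random((n, m)) < eps          # each (i, l) seen w.p. eps
    positive = rng.random((n, m)) < Phi(margins) # +1 w.p. Phi(x_i^T beta_l)
    Y = np.where(observed, np.where(positive, 1, -1), 0)
    return X, Y, Z, gamma
```

With ε = 0.1, only about a tenth of the 2000 × 30 possible labels are observed, so recovering β1 relies on borrowing strength from labelers who share preferences with labeler 1.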
◮ Latent Z mostly correct after 1000 Gibbs steps.
◮ Gibbs sequence for γ1,1: [trace plot of samples over iterations 1000–2000, values roughly in 0.55–0.75.]
◮ True β1 and its posterior mean after 1000 burn-in iterations:
    β1 = (0.6915, 0.0754, −0.6815, 0.6988)
    β̂1 = (0.6514, 0.0535, −0.6473, 0.6957)        (3)
◮ Task: Is the triangle to the left of or above the rectangle?
◮ Labeled on Amazon Mechanical Turk: 523 tasks, 3 labels per task, 76 labelers.
◮ Want to predict the gold standard: compare centroid positions.
◮ All 26 labelers with over 20 labels have error above 0.16.
◮ The researcher also labels, giving 60 gold-standard labels.
◮ Averaged log likelihood and error rate on the test set.
◮ Our model: BBMC.

    Algorithm   Final Loglik      Final Error
    --- No Active Learning ---
    GOLD        −3716 ± 1695      0.0547 ± 0.0102
    CONS        −421.1 ± 2.6      0.0935 ± 0.0031
    BBMC        −219.1 ± 3.1      0.0309 ± 0.0033
Contribution II: Approximate Active Learning
◮ Want to predict labeler l's labels. Need βl.
◮ Not all labelers are useful for inferring βl.
◮ If l and l′ share parameters ⇒ can learn about βl from l′:
    βl = Σb zl,b γb,   βl′ = Σb zl′,b γb        (4)
◮ Active learning: repeatedly select training data that helps in learning βl.
◮ Goal: cheaper training data, faster learning.
◮ Suppose we start with training data Y.
◮ Query the task–labeler pair (i, l) that maximizes the expected utility of adding it:
    (i, l) = argmax_{(i′,l′)} E_{yi′,l′} ‖Mean(·) − µ‖₂²
◮ For each (i′, l′) score, we need the posterior p(β | yi′,l′, X, Y).
◮ Gibbs sampling ⇒ separate Gibbs samplers to score each (i′, l′).
◮ We are already running one Gibbs sampler for basic inference.
◮ Problem: Can we avoid running the extra scoring chains?
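The greedy query-selection step can be sketched as below. The `score_posterior` callback is hypothetical: it stands in for whatever produces the label probabilities and candidate posterior means, e.g. the per-candidate scoring chains just discussed.

```python
import numpy as np

def select_query(candidates, posterior_mean, score_posterior):
    """Greedy active-learning step: pick the (task, labeler) pair whose
    label, in expectation, moves the posterior mean of beta the most.

    score_posterior(i, l, y) must return (prob_of_y, mean_given_y) for
    y in {-1, +1}; it is a stand-in for the candidate scoring chains,
    not an API from the paper.
    """
    best, best_utility = None, -np.inf
    for (i, l) in candidates:
        utility = 0.0
        for y in (-1, +1):
            p_y, mean_y = score_posterior(i, l, y)
            # expected squared l2 shift of the posterior mean
            utility += p_y * np.sum((mean_y - posterior_mean) ** 2)
        if utility > best_utility:
            best, best_utility = (i, l), utility
    return best
```

This loop makes the cost problem concrete: every candidate pair needs two conditional posteriors per round, which is exactly what the approximation in the next slides avoids recomputing from scratch.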
◮ The Gibbs sampler for p(β | X, Y) is a Markov chain for inference.
◮ The sampler for p(β | yi′,l′, X, Y) is a perturbed chain for scoring.
◮ Naïve scoring: run a separate perturbed chain β̂t−1 → β̂t → β̂t+1 → β̂t+2 alongside the inference chain βt−1 → βt → βt+1 → βt+2.
◮ Our method: take perturbed steps β̂t+1, β̂t+2, β̂t+3 directly from samples of p(β | X, Y).
◮ Suppose a chain p(βt | βt−1) and a perturbed chain p̂(β̂t | β̂t−1).
◮ Their stationary distributions are p∞(β) and p̂∞(β̂).
◮ Let βs ∼ p∞(β), s = 1, ..., S, and approximate
    p̂∞(β̂) ≈ ∫ p̂(β̂ | β) p∞(β) dβ ≈ (1/S) Σ_{s=1}^{S} p̂(β̂ | βs).
◮ If p∞(β) = p̂∞(β), the first approximation is exact.
◮ Specialize to active learning:
◮ Suppose W is n × n, positive, and symmetric; let D be the diagonal matrix of its row sums and P = D⁻¹W.
◮ The stationary distribution is the left eigenvector of P. Decompose
    A = D^{−1/2} W D^{−1/2}                         (5)
      = V Λ V⊤,  λ1 ≤ λ2 ≤ ... ≤ λn = 1            (6)
    p∞ ∝ D^{1/2} vn                                 (7)
◮ Perturb the matrix: Ŵ = W + dW ≥ 0, with dW 1 = 0.
◮ Then P̂ = D^{−1} Ŵ = P + D^{−1} dW = P + dP.
◮ Matrix perturbation theory:
    p̃∞ ≈ p∞ + D^{1/2} Σ_{k≠n} (vk vk⊤ / (1 − λk)) dP⊤ D^{−1/2} p∞        (8)
◮ Works for discrete random walks, but not in general.
◮ Our method is general and approximates
    p̂∞ ≈ P̂⊤ p∞ = p∞ + dP⊤ p∞.                                            (9)
◮ If D = I, accuracy depends on the spectral gap: the discrepancy ‖p̃∞ − p̂∞‖ scales with 1/(1 − λn−1).        (10)
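Equation (9) suggests a cheap numeric check: push the unperturbed stationary distribution through the perturbed transition matrix and compare against the exact perturbed stationary distribution. A small sketch on a toy 5-state random walk of my own construction:

```python
import numpy as np

def stationary(P, iters=2000):
    """Stationary distribution of a stochastic matrix P by power iteration."""
    p = np.full(P.shape[0], 1.0 / P.shape[0])
    for _ in range(iters):
        p = P.T @ p
    return p

# Base random walk from a positive symmetric W, as on the slides: P = D^{-1} W.
rng = np.random.default_rng(0)
W = rng.random((5, 5)) + 1.0
W = (W + W.T) / 2.0                     # symmetric, strictly positive
P = W / W.sum(axis=1, keepdims=True)

p_inf = stationary(P)

# Perturbation dW with zero row sums (dW 1 = 0), so D is unchanged.
dW = np.zeros((5, 5))
dW[0, 1], dW[0, 2] = 0.05, -0.05
W_hat = W + dW                          # stays nonnegative since W >= 1
P_hat = W_hat / W_hat.sum(axis=1, keepdims=True)

# One-step approximation (9): p_hat_inf ~ P_hat^T p_inf = p_inf + dP^T p_inf.
approx = P_hat.T @ p_inf
exact = stationary(P_hat)
err = np.abs(approx - exact).max()
```

For a small perturbation the one-step push already lands close to the exact perturbed stationary distribution, which is the intuition behind reusing the inference chain's samples for scoring.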
Active Learning Results
◮ Task: Is the triangle to the left of or above the rectangle?
◮ Active learning methods can query 100 labels.
◮ Here: only query the gold standard (could be another labeler).
◮ Averaged log likelihood and error rate on the test set.
◮ BBMC and BBMC-ACT: our method without and with active learning.

    Algorithm    Final Loglik      Final Error
    --- No Active Learning ---
    GOLD         −3716 ± 1695      0.0547 ± 0.0102
    CONS         −421.1 ± 2.6      0.0935 ± 0.0031
    BBMC         −219.1 ± 3.1      0.0309 ± 0.0033
    --- Active Learning ---
    GOLD-ACT     −1957 ± 696       0.0290 ± 0.0037
    CONS-ACT     −396.1 ± 3.6      0.0906 ± 0.0024
    RAND-ACT     −186.0 ± 2.2      0.0292 ± 0.0029
    DIS-ACT      −198.3 ± 5.8      0.0392 ± 0.0052
    MCMC-ACT     −196.1 ± 6.7      0.0492 ± 0.0050
    BBMC-ACT     −160.8 ± 3.9      0.0188 ± 0.0018
Conclusion
◮ Bayesian model to mitigate label bias.
◮ Approximate active learning for Gibbs sampling inference.
◮ Avoids naïve MCMC scoring.