
Bayesian Bias Mitigation for Crowdsourcing
Fabian L. Wauthier, UC Berkeley, with Michael I. Jordan
University of California, Berkeley
9th of May, 2012


  1. Contribution I: Bayesian Preference Model
     Labelers express accumulated, shared preferences.
     ◮ Parameter γ_b models the effect of preference b = 1, …, K.
     ◮ An m × K binary matrix Z models parameter sharing.
     ◮ If z_{l,b} = 1, labeler l expresses preference b.
     ◮ Parameter β_l accumulates preferences: β_l = Σ_b z_{l,b} γ_b.
     ◮ Likelihood: p(Y | X, Z, γ) = Π_l Π_{i : y_{i,l} ≠ 0} p(y_{i,l} | β_l^⊤ x_i).
     ◮ Similar preferences ⇒ similar labelling.
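A minimal numerical sketch of this accumulation and likelihood (the probit link, the array shapes, and the label-generation scheme below are illustrative assumptions, not the authors' code):

```python
import math
import numpy as np

rng = np.random.default_rng(0)
m, K, d = 5, 3, 4                          # labelers, preferences, feature dim
Z = rng.integers(0, 2, size=(m, K))        # z_{l,b}: who expresses which preference
Gamma = rng.normal(size=(K, d))            # gamma_b: effect of each preference
Beta = Z @ Gamma                           # beta_l = sum_b z_{l,b} gamma_b (row l)

# Standard normal CDF, applied elementwise.
Phi = np.vectorize(lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0))))

def log_likelihood(Y, X, Beta):
    """log p(Y | X, Z, gamma): product over labelers l and labelled tasks i
    (entries with y_{i,l} != 0) of a probit model p(y_{i,l} | beta_l^T x_i)."""
    p_plus = Phi(X @ Beta.T)               # (n, m): P(y_{i,l} = +1)
    ll = 0.0
    for (i, l) in zip(*np.nonzero(Y)):     # only labelled entries contribute
        ll += math.log(p_plus[i, l] if Y[i, l] == +1 else 1.0 - p_plus[i, l])
    return ll

n = 8
X = rng.normal(size=(n, d))
# Sparse labels in {-1, 0, +1}: each task labelled by each labeler w.p. 0.3.
Y = np.sign(X @ Beta.T) * (rng.random((n, m)) < 0.3)
print(log_likelihood(Y, X, Beta))
```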

  2. Contribution I: Bayesian Preference Model — Priors
     ◮ Prior on γ_b: p(γ_b) = N(0, σ²I) for each b.
     ◮ Prior on Z: fix Z to be m × K, with
         π_b | α ∼ Beta(α/K, 1),   b = 1, …, K    (1)
         z_{l,b} | π_b ∼ Bern(π_b),   l = 1, …, m    (2)
     ◮ As K → ∞, the distribution over Z converges to the Indian Buffet Process (IBP).
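Equations (1)–(2) are easy to simulate; a small sketch (the seed and the diagnostic printed at the end are illustrative, not from the talk):

```python
import numpy as np

rng = np.random.default_rng(1)

def sample_Z(m, K, alpha):
    """Draw Z from the finite Beta-Bernoulli prior of eqs. (1)-(2):
    pi_b ~ Beta(alpha/K, 1), then z_{l,b} ~ Bern(pi_b) independently."""
    pi = rng.beta(alpha / K, 1.0, size=K)
    return (rng.random((m, K)) < pi).astype(int)

# As K grows with alpha fixed, the expected number of active preferences per
# labeler, K * (alpha/K) / (alpha/K + 1), tends to alpha -- the hallmark of
# the IBP limit.
m, alpha = 30, 2.0
for K in (10, 100, 1000):
    Z = sample_Z(m, K, alpha)
    print(K, Z.sum(axis=1).mean())
```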

  3. Contribution I: Bayesian Preference Model — Complete model
     p(Y, Z, γ | X) = p(Y | X, Z, γ) p(γ | Z) p(Z)
     ◮ Recall bias: different labellers can have different β's.
       Example: disagreement over whether the guitar is behind or next to the couch.
     ◮ Want to predict labeller l's labels.
     ◮ Labeller l could be in the crowd, or the gold standard.
     ◮ Required inference: p(β_l | X, Y), or equivalently p(z_{l,b}, γ_b, b = 1, …, K | X, Y).
     ◮ The model is complex; exact inference is intractable.
     ◮ Possible alternatives: Gibbs sampling, variational inference, slice sampling, etc.

  4. Overview
     Contribution I: Bayesian Preference Model
     BBMC Results
     Contribution II: Approximate Active Learning
     Active Learning Results
     Conclusion

  5. BBMC Results — Synthetic Data
     ◮ X is a 2000 × 4 Gaussian matrix.
     ◮ Z is a 30 × 2 uniform binary matrix (m = 30, K = 2).
     ◮ γ_b Gaussian for b = 1, 2; β_l = Σ_b z_{l,b} γ_b.
     ◮ Observation probability ε = 0.1:
         y_{i,l} = 0 w.p. (1 − ε),  +1 w.p. ε · Φ(x_i^⊤ β_l),  −1 otherwise.
     ◮ Inference: want to recover β_1 (say).
     ◮ Requires p(z_{1,b}, γ_b, b = 1, …, K | X, Y).
     ◮ For inference set K = 10.
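The synthetic-data recipe above can be reproduced in a few lines (the random seed and the exact sampling order are assumptions):

```python
import math
import numpy as np

rng = np.random.default_rng(2)
n, d, m, K, eps = 2000, 4, 30, 2, 0.1

X = rng.normal(size=(n, d))                 # 2000 x 4 Gaussian matrix
Z = rng.integers(0, 2, size=(m, K))         # 30 x 2 uniform binary matrix
Gamma = rng.normal(size=(K, d))             # gamma_b Gaussian, b = 1, 2
Beta = Z @ Gamma                            # beta_l = sum_b z_{l,b} gamma_b

Phi = np.vectorize(lambda t: 0.5 * (1.0 + math.erf(t / math.sqrt(2.0))))

# y_{i,l} = 0 w.p. (1 - eps); otherwise +1 w.p. Phi(x_i^T beta_l), else -1.
observed = rng.random((n, m)) < eps
signs = np.where(rng.random((n, m)) < Phi(X @ Beta.T), +1, -1)
Y = np.where(observed, signs, 0)

print(np.mean(Y != 0))                      # fraction labelled, near eps
```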

  6. BBMC Results
     ◮ Latent Z mostly correct after 1000 Gibbs steps.
     ◮ [Plot: Gibbs trace of γ_{1,1} over iterations 1000–2000, fluctuating between roughly 0.55 and 0.75.]
     ◮ True β_1 and its posterior mean after 1000 burn-in iterations:
         β_1 = (0.6915, 0.0754, −0.6815, 0.6988)^⊤,   β̂_1 = (0.6514, 0.0535, −0.6473, 0.6957)^⊤    (3)

  7. BBMC Results — Crowdsourced data
     ◮ Task: is the triangle to the left of, or above, the rectangle?
     ◮ Labelled on Amazon Mechanical Turk: 523 tasks, 3 labels per task, 76 labellers.
     ◮ Want to predict the gold standard: compare centroid positions.
     ◮ All 26 labellers with over 20 labels have error above 0.16.
     ◮ The researcher also labels, and gives 60 gold standard labels.

  8. BBMC Results
     ◮ Averaged log likelihood and error rate on the test set (no active learning). Our model: BBMC.

         Algorithm | Final Loglik    | Final Error
         GOLD      | −3716 ± 1695    | 0.0547 ± 0.0102
         CONS      | −421.1 ± 2.6    | 0.0935 ± 0.0031
         BBMC      | −219.1 ± 3.1    | 0.0309 ± 0.0033

  9. Overview
     Contribution I: Bayesian Preference Model
     BBMC Results
     Contribution II: Approximate Active Learning
     Active Learning Results
     Conclusion

  10. Contribution II: Approximate Active Learning — Active Learning
     ◮ Want to predict labeller l's labels. Need β_l.
     ◮ Not all labellers are useful for inferring β_l.
     ◮ If l and l′ share parameters ⇒ can learn about β_l from l′:
         β_l = Σ_b z_{l,b} γ_b,   β_{l′} = Σ_b z_{l′,b} γ_b    (4)
     ◮ Active learning: repeatedly select training data that helps in learning β_l.
     ◮ Goal: cheaper training data, faster learning.

  11. Contribution II: Approximate Active Learning — Approximate inference and Active Learning
     ◮ Suppose we start with training data Y.
     ◮ Query the task–labeller pair (i, l) that maximizes the expected utility of adding it:
         (i, l) = argmax_{(i′, l′)} E_{y_{i′,l′}} [ U( p(β | y_{i′,l′}, X, Y) ) ]
     ◮ Example utilities: U(·) = −Entropy(·);  U_μ(·) = ‖Mean(·) − μ‖₂².
     ◮ To score each (i′, l′), we need the posterior p(β | y_{i′,l′}, X, Y).
     ◮ Gibbs sampling ⇒ separate Gibbs samplers to score each (i′, l′).
     ◮ We are already running one Gibbs sampler for basic inference.
     ◮ Problem: can we avoid running the extra scoring chains?
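The naïve scoring loop described above can be sketched as follows; the posterior sampler is deliberately stubbed out with Gaussian draws (an assumption, just to make the control flow runnable), and `entropy_utility` and `p_label` are likewise illustrative placeholders:

```python
import numpy as np

rng = np.random.default_rng(3)

def posterior_samples(X, Y, n_samples=200):
    """Stub for a Gibbs sampler targeting p(beta | X, Y); here it just
    returns Gaussian draws so the scoring loop below is runnable."""
    return rng.normal(size=(n_samples, X.shape[1]))

def entropy_utility(samples):
    """U = -Entropy, approximated (up to constants) by the Gaussian entropy
    of the sample covariance: -0.5 * logdet(Cov)."""
    cov = np.cov(samples.T) + 1e-9 * np.eye(samples.shape[1])
    return -0.5 * np.linalg.slogdet(cov)[1]

def score_candidates(X, Y, candidates, p_label=0.5):
    """Naive scoring: for each candidate (i', l') and each possible label,
    rerun a fresh sampler on the augmented data and average the utility.
    p_label stands in for p(y_{i',l'} = +1 | X, Y)."""
    scores = {}
    for (i, l) in candidates:
        exp_util = 0.0
        for y, w in ((+1, p_label), (-1, 1.0 - p_label)):
            Y_aug = Y.copy()
            Y_aug[i, l] = y                        # hypothetically add the label
            samples = posterior_samples(X, Y_aug)  # one extra chain per option
            exp_util += w * entropy_utility(samples)
        scores[(i, l)] = exp_util
    return max(scores, key=scores.get), scores

X = rng.normal(size=(50, 4))
Y = np.zeros((50, 3))
best, scores = score_candidates(X, Y, [(0, 0), (1, 2), (7, 1)])
print(best)
```

The point of the sketch is the cost structure: every candidate pair spawns two fresh sampler runs, which is exactly what the approximation in the next slides avoids.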

  12. Contribution II: Approximate Active Learning
     ◮ The Gibbs sampler for p(β | X, Y) is a Markov chain for inference: β_{t−1} → β_t → β_{t+1} → β_{t+2}.
     ◮ The sampler for p(β | y_{i′,l′}, X, Y) is a perturbed chain for scoring: β̂_{t−1} → β̂_t → β̂_{t+1} → β̂_{t+2}.
     ◮ Naïve scoring:
       • Run the perturbed chain; sample from its stationary distribution.
       • Compute U( p(β | y_{i′,l′}, X, Y) ).
     ◮ Our method:
       • Get approximate samples of p(β | y_{i′,l′}, X, Y) by transforming samples of p(β | X, Y).
       • Approximate U( p(β | y_{i′,l′}, X, Y) ) from these.

  13. Contribution II: Approximate Active Learning — Approximate Scoring for Active Learning
     ◮ Suppose a chain p(β_t | β_{t−1}) and a perturbed chain p̂(β̂_t | β̂_{t−1}).
     ◮ Their stationary distributions are p_∞(β) and p̂_∞(β̂).
     ◮ Let β_s ∼ p_∞(β), s = 1, …, S, and approximate
         p̂_∞(β̂) ≈ ∫ p̂(β̂ | β) p_∞(β) dβ ≈ (1/S) Σ_{s=1}^S p̂(β̂ | β_s).
     ◮ If p_∞(β) = p̂_∞(β), the first approximation is exact.
     ◮ Specialized to active learning:
       • Unperturbed chain = Gibbs sampler for p(β | X, Y).
       • Perturbed chain = Gibbs sampler for p(β | y_{i′,l′}, X, Y).
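In a finite state space, the approximation p̂_∞(β̂) ≈ (1/S) Σ_s p̂(β̂ | β_s) becomes, as S → ∞, a single application of the perturbed transition kernel to the unperturbed stationary distribution. A toy sketch (the particular transition matrices are illustrative assumptions):

```python
import numpy as np

def stationary(P):
    """Stationary row vector p of a transition matrix P (p P = p)."""
    vals, vecs = np.linalg.eig(P.T)
    p = np.real(vecs[:, np.argmax(np.real(vals))])
    return p / p.sum()

# Toy 3-state chain P and a slightly perturbed chain P_hat (rows sum to 1).
P = np.array([[0.8, 0.2, 0.0],
              [0.1, 0.8, 0.1],
              [0.0, 0.3, 0.7]])
P_hat = P + np.array([[-0.05, 0.05, 0.0],
                      [ 0.00, 0.00, 0.0],
                      [ 0.05, 0.00, -0.05]])

p_inf = stationary(P)

# One-step approximation: p_hat_inf(b') ~= sum_b p_hat(b' | b) p_inf(b),
# the discrete, S -> infinity analogue of (1/S) sum_s p_hat(beta_hat | beta_s).
approx = p_inf @ P_hat
exact = stationary(P_hat)
print(np.round(approx, 3), np.round(exact, 3))
```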

  14. Contribution II: Approximate Active Learning — Special Case: Discrete Random Walks
     ◮ Suppose W is n × n, positive, symmetric; P = D^{−1} W.
     ◮ The stationary distribution is the left eigenvector of P. Decompose
         A = D^{−1/2} W D^{−1/2}    (5)
           = V Λ V^⊤,   λ_1 ≤ λ_2 ≤ … ≤ λ_n = 1    (6)
         p_∞ ∝ D^{1/2} v_n    (7)
     ◮ Perturb the matrix: Ŵ = W + dW ≥ 0, with dW·1 = 0.
     ◮ Then P̂ = D^{−1} Ŵ = P + D^{−1} dW = P + dP.

  15. Contribution II: Approximate Active Learning — Special Case: Discrete Random Walks
     ◮ Matrix perturbation theory:
         p̃_∞ ≈ p_∞ + D^{1/2} ( Σ_{k ≠ n} v_k v_k^⊤ / (1 − λ_k) ) D^{−1/2} dP^⊤ p_∞    (8)
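The matrix-perturbation approximation above can be checked numerically against the exact stationary distribution of the perturbed walk; a sketch (the specific W and dW below are illustrative assumptions, with dW asymmetric but with zero row sums so the degrees are unchanged):

```python
import numpy as np

rng = np.random.default_rng(6)
n = 4
B = rng.random((n, n)) + 0.2
W = B + B.T                                   # positive, symmetric
d = W.sum(axis=1)
D_half = np.diag(np.sqrt(d))
D_half_inv = np.diag(1.0 / np.sqrt(d))
P = W / d[:, None]                            # P = D^{-1} W

# Eqs. (5)-(7): eigendecompose A; top eigenvector gives p_inf ∝ D^{1/2} v_n.
lam, V = np.linalg.eigh(D_half_inv @ W @ D_half_inv)   # ascending, lam[-1] = 1
p_inf = D_half @ V[:, -1]
p_inf /= p_inf.sum()

# Small perturbation dW with zero row sums (dW 1 = 0): degrees unchanged.
dW = np.zeros((n, n))
dW[0, 0], dW[0, 1] = -0.04, 0.04
dW[1, 0], dW[1, 1] = 0.02, -0.02
dP = dW / d[:, None]                          # dP = D^{-1} dW
P_hat = P + dP

# First-order correction to the stationary distribution.
pinv = sum(np.outer(V[:, k], V[:, k]) / (1.0 - lam[k]) for k in range(n - 1))
p_tilde = p_inf + D_half @ pinv @ D_half_inv @ dP.T @ p_inf

# Exact stationary distribution of the perturbed chain, for comparison.
vals, vecs = np.linalg.eig(P_hat.T)
p_exact = np.real(vecs[:, np.argmax(np.real(vals))])
p_exact /= p_exact.sum()
print(np.round(p_tilde, 4), np.round(p_exact, 4))
```

Note the consistency check built into the formula: a symmetric dW with dW·1 = 0 leaves the stationary distribution exactly unchanged, and indeed then dP^⊤ p_∞ ∝ dW·1 = 0, so the correction vanishes.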
