
A Correlated Worker Model for Grouped, Imbalanced and Multitask Data - PowerPoint PPT Presentation



  1. A Correlated Worker Model for Grouped, Imbalanced and Multitask Data. An T. Nguyen¹, Byron C. Wallace, Matthew Lease. University of Texas at Austin. UAI 2016. (¹ Presenter)

  2–6. Overview
  ◮ A model of workers in crowdsourcing.
  ◮ Idea: transfer knowledge of worker quality.
  ◮ Variational EM learning.
  ◮ Apply to two datasets:
    ◮ Biomed Citation Screening: imbalanced, grouped.
    ◮ Galaxy Classification: multiple tasks.

  7–11. Background
  ◮ Crowdsourcing: collect labels quickly at low cost.
  ◮ But (usually) lower quality.
  ◮ Common solution: collect 5 labels for each instance, then aggregate them.
  ◮ Most previous work: improve (the estimates of) labels.
  ◮ Our work: improve (the estimates of) worker qualities.

  12–15. Motivation for estimating worker qualities
  Diagnostic insights.
  Help workers improve.
  Intelligent task routing (assigning tasks to suitable workers).

  16–18. Worker Quality Measure
  Accuracy: simple but not enough.
  → Confusion matrix: Pr(worker label | true label).
  Binary task (this work):
  ◮ Sensitivity: Pr(positive | positive).
  ◮ Specificity: Pr(negative | negative).
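
As a concrete illustration of these two quantities, here is a minimal numpy sketch (ours, not from the talk) that measures one worker's sensitivity and specificity against gold labels; the function name and data are hypothetical.

```python
import numpy as np

# Hypothetical illustration: one worker's sensitivity and specificity
# measured against gold labels (1 = positive, 0 = negative).
def sens_spec(worker_labels, gold_labels):
    worker = np.asarray(worker_labels)
    gold = np.asarray(gold_labels)
    sens = (worker[gold == 1] == 1).mean()  # Pr(positive | positive)
    spec = (worker[gold == 0] == 0).mean()  # Pr(negative | negative)
    return sens, spec

print(sens_spec([1, 1, 0, 0, 1], [1, 0, 0, 0, 1]))  # (1.0, ~0.667)
```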

  19–22. Setting
  Input
  ◮ Crowd labels for each instance.
  ◮ No instance-level features (future work).
  Output
  ◮ For each worker: sensitivity and specificity.
  Eval. metric
  ◮ RMSE of the estimated sen. and spe. ...
  ◮ ... against gold sen./spe., computed from gold labels on the whole dataset.

  23–24. Challenges
  ◮ Sparsity: many workers label only a few instances.
  ◮ Imbalance:
    ◮ many more negatives than positives,
    ◮ which makes sensitivity difficult to estimate.

  25. Idea
  Transfer knowledge of worker quality:
  ◮ between classes,
  ◮ within a group,
  ◮ across multiple tasks.

  26–27. Previous models (Raykar et al. 2010; Liu & Wang 2012; Kim & Ghahramani 2012)
  Hidden variables:
  ◮ True label for each instance.
  ◮ Confusion matrix (sen. + spe.) for each worker.
  Assumptions:
  ◮ Sen. & spe. are independent parameters.
  ◮ A single group of workers.
  ◮ Multiple tasks: independent models.

  28. Our Model
  Assumptions:
  ◮ Sen. & spe. are correlated.
  ◮ Multiple groups of workers (group membership is known).
  ◮ Sen. & spe. in multiple tasks are correlated.

  29–32. The Base Model
  (i indexes instances, j indexes workers; S is the logistic sigmoid)
  (U_j, V_j) ∼ N(µ, C)
  Z_i ∼ Ber(θ)
  L_ij | Z_i = 1 ∼ Ber(S(U_j))
  L_ij | Z_i = 0 ∼ Ber(S(V_j))
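
The generative story above is easy to simulate. Below is a minimal numpy sketch of sampling from the base model; all numbers (µ, C, θ, sizes) are hypothetical, and S is taken to be the logistic sigmoid, consistent with U_j being a logit sensitivity.

```python
import numpy as np

rng = np.random.default_rng(0)
S = lambda x: 1.0 / (1.0 + np.exp(-x))  # logistic sigmoid (assumed form of S)

n, m = 100, 20                        # instances, workers
mu = np.array([1.5, 2.0])             # hypothetical prior mean of (U_j, V_j)
C = np.array([[1.0, 0.5],
              [0.5, 1.0]])            # hypothetical covariance: sen./spe. correlated
theta = 0.1                           # Pr(Z_i = 1): an imbalanced task

UV = rng.multivariate_normal(mu, C, size=m)   # one (U_j, V_j) per worker
Z = rng.binomial(1, theta, size=n)            # true labels
# Worker j labels correctly with prob S(U_j) if Z_i = 1 (sensitivity)
# and with prob S(V_j) if Z_i = 0 (specificity).
p_label1 = np.where(Z[:, None] == 1, S(UV[:, 0]), 1.0 - S(UV[:, 1]))
L = rng.binomial(1, p_label1)                 # crowd labels, shape (n, m)
```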

  33–36. Extensions (see the sketch after this list)
  1. Worker groups:
  ◮ Group membership is known.
  ◮ Model each group k with its own Normal distribution N(µ_k, C_k).
  2. Multiple tasks:
  ◮ Assume two tasks.
  ◮ (Sen_1, Spe_1) correlates with (Sen_2, Spe_2).
  ◮ (U_j^1, V_j^1, U_j^2, V_j^2) ∼ N(µ, C)
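
Continuing the simulation sketch above, the two-task extension just replaces the per-worker 2-dimensional Gaussian with a 4-dimensional one, so off-diagonal covariance lets quality estimates transfer across tasks. The numbers below are hypothetical.

```python
# Two-task extension: a single 4-dim Gaussian over (U_j^1, V_j^1, U_j^2, V_j^2).
mu4 = np.array([1.5, 2.0, 1.0, 1.8])
C4 = 0.5 * np.eye(4) + 0.5            # unit variances, all pairs correlated at 0.5
UV4 = rng.multivariate_normal(mu4, C4, size=m)
```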

  37–40. Inference for the Base Model
  Approach: variational EM.
  ◮ E-step: infer Pr(U_{1..m}, V_{1..m}, Z_{1..n} | L).
  ◮ M-step: maximize over the parameters µ, C, θ.
  Variational inference:
  ◮ Approximate the (complex) posterior Pr(· | ·)...
  ◮ ...by a simpler function q.
  ◮ Minimize KL(q || p)...
  ◮ ...equivalent to maximizing a lower bound on the log-likelihood.

  41–43. Inference: Mean-field Assumptions
  ◮ q factorizes:
    q(U_{1..m}, V_{1..m}, Z_{1..n}) = ∏_{j=1}^{m} q(U_j) q(V_j) · ∏_{i=1}^{n} q(Z_i)
  ◮ Factors:
    q(U_j) = N(µ̃_uj, σ̃²_uj)
    q(V_j) = N(µ̃_vj, σ̃²_vj)
    q(Z_i) = Ber(θ̃_i)
  ◮ Optimize with respect to {µ̃_uj, σ̃²_uj, µ̃_vj, σ̃²_vj : j = 1..m} and {θ̃_i : i = 1..n}.
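
In code, these factors are just a mean and variance per worker for each of U_j and V_j, plus a Bernoulli parameter per instance. A sketch continuing the simulation above (variable names are ours):

```python
# Variational parameter containers for the mean-field factors:
mu_u, s2_u = np.zeros(m), np.ones(m)   # q(U_j) = N(mu_u[j], s2_u[j])
mu_v, s2_v = np.zeros(m), np.ones(m)   # q(V_j) = N(mu_v[j], s2_v[j])
qz1 = np.full(n, theta)                # q(Z_i = 1), initialized at the prior
```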

  44–47. Optimization
  Coordinate descent: update one variable at a time.
  Update Z_i:
    q*(Z_i = 1) ∝ exp( log Ber(1 | θ) + Σ_j E_{U_j ∼ q(U_j)}[ log Ber(L_ij | S(U_j)) ] )
    q*(Z_i = 0) ∝ exp( log Ber(0 | θ) + Σ_j E_{V_j ∼ q(V_j)}[ log Ber(L_ij | S(V_j)) ] )
  Intuition:
  ◮ Z_i ≈ prior + Σ E(crowd labels for i).
  ◮ The expectation is taken with respect to worker quality.
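
The expectations in this update have no closed form for a sigmoid likelihood, so an implementation must approximate them. The sketch below uses plain Monte Carlo over q(U_j) and q(V_j); it is our illustration of the update's structure, not the authors' code (the paper may use a different approximation). It continues the variables defined above.

```python
from scipy.special import expit, logsumexp  # expit(x) = S(x)

# Monte Carlo sketch of the q(Z_i) update (our illustration).
def update_z(i, L, theta, mu_u, s2_u, mu_v, s2_v, rng, n_samples=50):
    log1, log0 = np.log(theta), np.log(1.0 - theta)
    for j in range(L.shape[1]):   # in the simulation, every worker labels every instance
        u = rng.normal(mu_u[j], np.sqrt(s2_u[j]), n_samples)
        v = rng.normal(mu_v[j], np.sqrt(s2_v[j]), n_samples)
        # E_{U_j ~ q(U_j)}[log Ber(L_ij | S(U_j))], by sampling:
        log1 += np.log(expit(u) if L[i, j] == 1 else 1.0 - expit(u)).mean()
        # E_{V_j ~ q(V_j)}[log Ber(L_ij | S(V_j))]; S(V_j) = Pr(label 0 | Z = 0):
        log0 += np.log(expit(v) if L[i, j] == 0 else 1.0 - expit(v)).mean()
    return np.exp(log1 - logsumexp([log1, log0]))  # updated q(Z_i = 1)

qz1[0] = update_z(0, L, theta, mu_u, s2_u, mu_v, s2_v, rng)
```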

  48–50. Optimization
  Update U_j:
    q*(U_j) ∝ exp( E_{V_j ∼ q(V_j)}[ log N(U_j, V_j | µ, C) ] + Σ_i q(Z_i = 1) log Ber(L_ij | S(U_j)) )
  Intuition:
  ◮ U_j = logit sensitivity of worker j.
  ◮ U_j ≈ E(correlation with specificity) + ...
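
There is no conjugate update for q(U_j) either. The sketch below (again ours, not necessarily the paper's method) evaluates the unnormalized log of q*(U_j) on a grid and moment-matches a Gaussian to it. The prior term is the exact Gaussian conditional of U_j given V_j ≈ µ̃_vj, which is where the correlation with specificity enters the update.

```python
# Grid-based sketch of the q(U_j) update with Gaussian moment matching.
def update_u(j, L, qz1, mu, C, mu_v, grid=np.linspace(-6, 6, 601)):
    # Prior term: E_{V_j}[log N(U_j, V_j | mu, C)] is, up to constants,
    # Gaussian in U_j with the conditional mean/variance given V_j = mu_v[j].
    cond_mean = mu[0] + C[0, 1] / C[1, 1] * (mu_v[j] - mu[1])
    cond_var = C[0, 0] - C[0, 1] ** 2 / C[1, 1]
    logp = -0.5 * (grid - cond_mean) ** 2 / cond_var
    p = expit(grid)
    for i in range(L.shape[0]):  # likelihood terms, weighted by q(Z_i = 1)
        logp += qz1[i] * np.log(p if L[i, j] == 1 else 1.0 - p)
    w = np.exp(logp - logp.max())
    w /= w.sum()
    new_mean = (w * grid).sum()
    new_var = (w * (grid - new_mean) ** 2).sum()
    return new_mean, new_var   # parameters of the new Gaussian q(U_j)

mu_u[0], s2_u[0] = update_u(0, L, qz1, mu, C, mu_v)
```

A full E-step would sweep these coordinate updates over all instances i and workers j until the variational parameters stabilize, followed by an M-step over µ, C, θ.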
