
Aggregating and Predicting Sequence Labels from Crowd Annotations

An T. Nguyen¹* Byron C. Wallace² Jessy Li¹,³ Ani Nenkova³ Matthew Lease¹

¹University of Texas at Austin ²Northeastern University ³University of Pennsylvania

ACL 2017

*Presenter

Problem: Sequence Labeling with Crowd Labels

Example: Named Entity Recognition.

    X:  U.N.  official  Ekeus  heads  for  Baghdad
    Y:  Org   O         Per    O      O    Loc
    W1: Org   O         Org    O      O    Loc
    W2: Org   Per       Per    O      O    Loc
    W3: Org   O         Per    O      O    Loc

X is the input sentence, Y the true (gold) label sequence, and W1, W2, W3 the label sequences from three crowd workers.

Two tasks:

◮ Aggregation: given (X, W1,2,3), estimate Y.
◮ Prediction: given training data (X, W1,2,3), predict Ytest for Xtest.
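The aggregation task can be illustrated with its simplest baseline, token-wise majority voting (a minimal sketch for illustration only, not the talk's HMM-Crowd model):

```python
from collections import Counter

def majority_vote(worker_seqs):
    """Aggregate crowd label sequences by per-token majority vote.

    worker_seqs: list of label sequences, one per worker,
    all aligned to the same tokens.
    """
    aggregated = []
    for token_labels in zip(*worker_seqs):
        # Counter.most_common(1) breaks ties arbitrarily;
        # real aggregation models handle ties more carefully.
        label, _ = Counter(token_labels).most_common(1)[0]
        aggregated.append(label)
    return aggregated

# The slide's example: three workers labeling
# "U.N. official Ekeus heads for Baghdad"
w1 = ["Org", "O", "Org", "O", "O", "Loc"]
w2 = ["Org", "Per", "Per", "O", "O", "Loc"]
w3 = ["Org", "O", "Per", "O", "O", "Loc"]
print(majority_vote([w1, w2, w3]))
# → ['Org', 'O', 'Per', 'O', 'O', 'Loc'], matching the gold sequence Y
```

On this example each worker makes one error, but no two workers err on the same token, so majority voting happens to recover Y exactly; the later slides show where it falls short on real data.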

Our work

Contribution: two joint models of sequences and crowd labels.

1. Aggregation:
◮ Hidden Markov Models (HMMs) + crowd confusion matrices.

2. Prediction:
◮ Long Short-Term Memory (LSTM) + crowd embedding vectors.

Evaluation:
◮ News NER + biomedical IE.
◮ A range of baselines.

Code + data on GitHub.


HMM-Crowd (for task 1: aggregation)

HMM (position i):

    h_{i+1} | h_i ~ Discrete(τ_{h_i})
    v_i | h_i ~ Discrete(Ω_{h_i})

Crowd model (worker j):

    l_{ij} | h_i ~ Discrete(C^{(j)}_{h_i})

Here h_i is the true (hidden) label, v_i the observed token, and l_{ij} worker j's label at position i; C^{(j)} is the confusion matrix for worker j.
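Read generatively, the model says the true labels h_i follow a Markov chain with transitions τ, each token v_i is emitted from Ω, and each worker's label is drawn through that worker's confusion matrix C^{(j)}. A small sampling sketch of this generative story (all toy parameters below are invented for illustration):

```python
import random

def sample_discrete(probs):
    # Draw an index from a discrete distribution.
    r, cum = random.random(), 0.0
    for i, p in enumerate(probs):
        cum += p
        if r < cum:
            return i
    return len(probs) - 1

def sample_hmm_crowd(tau, omega, confusions, init, length):
    """Sample (hidden labels, tokens, worker labels) from the
    HMM-Crowd generative story with toy parameters.

    tau:        transitions, tau[h] = P(h_{i+1} | h_i = h)
    omega:      emissions, omega[h] = P(v_i | h_i = h)
    confusions: confusions[j][h] = P(l_ij | h_i = h) for worker j
    init:       distribution of the first hidden label
    """
    h = sample_discrete(init)
    hidden, tokens, crowd = [], [], [[] for _ in confusions]
    for _ in range(length):
        hidden.append(h)
        tokens.append(sample_discrete(omega[h]))
        for j, conf in enumerate(confusions):
            crowd[j].append(sample_discrete(conf[h]))
        h = sample_discrete(tau[h])
    return hidden, tokens, crowd

# Toy setup: 2 hidden labels (O, Entity), 3 token types, 2 workers.
tau = [[0.8, 0.2], [0.4, 0.6]]
omega = [[0.6, 0.3, 0.1], [0.1, 0.2, 0.7]]
good = [[0.95, 0.05], [0.05, 0.95]]   # near-diagonal confusion matrix
sloppy = [[0.7, 0.3], [0.4, 0.6]]     # noisier worker
hidden, tokens, crowd = sample_hmm_crowd(tau, omega, [good, sloppy],
                                         [0.5, 0.5], 6)
```

Aggregation then runs this story in reverse: given the tokens and the workers' labels, infer the hidden sequence and the confusion matrices.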

slide-20
SLIDE 20

HMM-Crowd: Parameter Learning

Expectation Maximization (EM) algorithm: E-step

◮ Estimate posterior p(h) ◮ Extend Forward-Backward algorithm.

M-step:

◮ Estimate parameters τ, Ω, C ◮ Variational Bayes estimate.

5
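The same E-step/M-step alternation is easiest to see in the simpler, non-sequential Dawid-Skene model (one of the baselines later in the talk), which drops the HMM transitions and treats each token independently. A minimal sketch (smoothing constant and iteration count are arbitrary choices, not the paper's):

```python
def dawid_skene(labels, n_classes, n_iters=20, smooth=0.01):
    """Simplified Dawid-Skene EM over independent items.

    labels[i][j]: label given by worker j to item i.
    Returns per-item posterior distributions over classes.
    """
    n_workers = len(labels[0])
    # Initialize posteriors from per-item vote fractions.
    post = [[row.count(k) / n_workers for k in range(n_classes)]
            for row in labels]
    for _ in range(n_iters):
        # M-step: re-estimate the class prior and each worker's
        # confusion matrix from the soft posteriors.
        prior = [smooth] * n_classes
        conf = [[[smooth] * n_classes for _ in range(n_classes)]
                for _ in range(n_workers)]
        for i, row in enumerate(labels):
            for k in range(n_classes):
                prior[k] += post[i][k]
                for j, l in enumerate(row):
                    conf[j][k][l] += post[i][k]
        z = sum(prior)
        prior = [p / z for p in prior]
        for j in range(n_workers):
            for k in range(n_classes):
                s = sum(conf[j][k])
                conf[j][k] = [c / s for c in conf[j][k]]
        # E-step: recompute posteriors given the parameters.
        for i, row in enumerate(labels):
            probs = []
            for k in range(n_classes):
                p = prior[k]
                for j, l in enumerate(row):
                    p *= conf[j][k][l]
                probs.append(p)
            z = sum(probs)
            post[i] = [p / z for p in probs]
    return post

# Two reliable workers and one worker who always answers 0.
labels = [[0, 0, 0], [1, 1, 0], [0, 0, 0], [1, 1, 0], [1, 1, 0]]
post = dawid_skene(labels, n_classes=2)
estimates = [max(range(2), key=lambda k: p[k]) for p in post]
# estimates recovers [0, 1, 0, 1, 1]: the broken worker's votes
# are discounted once EM learns their confusion matrix.
```

HMM-Crowd's E-step replaces this independent per-item posterior with a Forward-Backward pass that also accounts for label transitions.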

LSTM for NER (Lample et al. 2016)

◮ LSTM: word representations → sentence representation.
◮ Hidden layer: fully connected.
◮ Tag scores: ≈ probability of each label for each word.
◮ CRF: word-level predictions → sentence-level prediction.
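The CRF layer turns the per-word tag scores into a single best sentence-level tag sequence via Viterbi decoding. A generic sketch with toy scores (not the authors' implementation):

```python
def viterbi(emissions, transitions):
    """Find the highest-scoring tag sequence.

    emissions[t][k]: score of tag k at position t (e.g. the
    network's tag scores); transitions[a][b]: score of moving
    from tag a to tag b. Scores add, as in a linear-chain CRF.
    """
    n_tags = len(emissions[0])
    # best[t][k]: best score of any path ending in tag k at time t
    best = [emissions[0][:]]
    back = []
    for t in range(1, len(emissions)):
        scores, ptrs = [], []
        for k in range(n_tags):
            cands = [best[-1][a] + transitions[a][k]
                     for a in range(n_tags)]
            a_star = max(range(n_tags), key=lambda a: cands[a])
            scores.append(cands[a_star] + emissions[t][k])
            ptrs.append(a_star)
        best.append(scores)
        back.append(ptrs)
    # Backtrack from the best final tag.
    k = max(range(n_tags), key=lambda k: best[-1][k])
    path = [k]
    for ptrs in reversed(back):
        k = ptrs[k]
        path.append(k)
    return list(reversed(path))

emissions = [[1.0, 0.2], [0.3, 1.0]]
transitions = [[0.5, 0.0], [0.0, 0.5]]
print(viterbi(emissions, transitions))  # → [0, 1]
```

In the toy case above, the path (0, 1) scores 1.0 + 0.0 + 1.0 = 2.0, beating (0, 0) at 1.0 + 0.5 + 0.3 = 1.8, so the decoder picks it even though tag 0 repeats more cheaply.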


LSTM-Crowd (for task 2: prediction)

◮ Worker vectors represent each worker's label noise.
◮ v(good worker) ≈ 0.
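One way to picture the idea: each worker j gets a learned vector v_j that shifts the model's tag scores when training against that worker's noisy labels, so a reliable worker's vector stays near zero and the shared network absorbs the clean signal. The exact wiring of the crowd vectors is in the paper; this is only a toy additive sketch with made-up scores:

```python
def worker_adjusted_scores(tag_scores, worker_vec):
    """Shift the model's per-tag scores by a per-worker noise
    vector before comparing against that worker's labels.

    tag_scores: list of per-tag scores for one token.
    worker_vec: the worker's learned noise vector, same length.
    """
    return [s + v for s, v in zip(tag_scores, worker_vec)]

tag_scores = [2.1, -0.3, 0.7]        # scores for tags (Org, Per, O)
good_worker = [0.0, 0.0, 0.0]        # v(good worker) ≈ 0
confused_worker = [-0.8, 0.9, 0.0]   # systematically drifts Org → Per

print(worker_adjusted_scores(tag_scores, good_worker))
# → [2.1, -0.3, 0.7]: a good worker leaves the scores unchanged
print(worker_adjusted_scores(tag_scores, confused_worker))
```

At test time the worker vectors are dropped, and only the shared network makes predictions.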

Data

Dataset    Application  Documents  Gold Labels  Crowd Labels
CoNLL'03   NER          1393       All          400
Medical    IE           5000       200          All


Evaluation: Task 1 - Aggregation

Baselines:

1. Non-sequential:
◮ Majority Voting
◮ Dawid & Skene (1979)
◮ MACE (Hovy et al. 2013)

2. Sequential:
◮ CRF-MA (Rodrigues et al. 2014)


Results: NER Task 1 - Aggregation

Method                           F1
Majority Vote                    65.71
MACE (Hovy et al. 2013)          67.37
Dawid-Skene (DS)                 71.39
CRF-MA (Rodrigues et al. 2014)   62.53
HMM-Crowd                        74.76


Evaluation: Task 2 - Prediction

Baselines:

1. Aggregate, then train:
◮ Majority Vote, then CRF
◮ Dawid-Skene, then LSTM

2. Train directly on crowd labels:
◮ CRF-MA (Rodrigues et al. 2014)
◮ LSTM (original, Lample et al. 2016)


Results: NER Task 2 - Prediction

Method                              F1
Majority Vote, then CRF             58.20
CRF-MA (Rodrigues et al. 2014)      62.60
LSTM (Lample et al. 2016)           67.73
Dawid-Skene, then LSTM              66.27
LSTM-Crowd                          70.82
HMM-Crowd, then LSTM                70.87
LSTM on gold labels (upper bound)   84.22


Conclusion

◮ Joint models of sequences and crowd labels.
◮ HMMs work well for aggregation, ...
◮ ... LSTMs work well for prediction.

In the paper:
◮ An alternative LSTM-Crowd model.
◮ Results for biomedical IE.

Acknowledgments: reviewers, crowd workers, NSF & NIH.

Questions?