SLIDE 1

Detecting annotation noise in automatically labelled data

Ines Rehbein & Josef Ruppenhofer

Leibniz ScienceCampus

ACL 2017

SLIDE 2

Motivation

  • Many projects in the Digital Humanities (DH) rely on automatically annotated data
  • The quality of automatic annotations is not always good enough

What we need:

  • A cheap and efficient way to find errors in automatically labelled data

SLIDES 3-5

Related work

  • Many studies on finding errors in manually annotated data
    (Eskin 2000; van Halteren 2000; Kveton and Oliva 2002; Dickinson and Meurers 2003; Boyd et al. 2008; Loftsson 2009; Ambati et al. 2011; Dickinson 2015; Snow et al. 2008; Bian et al. 2009; Hovy et al. 2013; ...)

  • Few studies on finding errors in automatically annotated data
    (Rocio et al. 2007; Loftsson 2009; Rehbein 2014)

  ⇒ Errors in automatic annotations are systematic and consistent

  • Our work builds on Hovy, Berg-Kirkpatrick, Vaswani and Hovy (2013): Learning Whom to Trust with MACE

SLIDES 6-12

MACE: Multi-Annotator Competence Estimation (Hovy et al. 2013)

Annotation matrix (one column per annotator):

  word_j   A1    A2    ...   A_m
  They     PRP   PRP   ...   PRP
  eat      VBP   VG    ...   VBP
  lots     NNS   RB    ...   NN
  of       IN    IN    ...   IN
  meat     NN    NNS   ...   NN
  ...      ...   ...   ...   ...

Parameters:
  θ_j  trustworthiness of annotator j
  ξ_j  behaviour of annotator j if spamming

Output: E – confidence in the model predictions

Models: EM, Bayesian Variational Inference

Generative process:

  procedure GenerateAnnot(A)
    for i = 1 ... I instances do
      T_i ∼ Uniform                      ▷ draw the true label
      for j = 1 ... J annotators do
        S_ij ∼ Bernoulli(1 − θ_j)        ▷ is annotator j spamming here?
        if S_ij = 0 then
          A_ij = T_i                     ▷ competent: copy the true label
        else
          A_ij ∼ Multinomial(ξ_j)        ▷ spamming: draw from j's spam distribution
        end if
      end for
    end for
  end procedure

  procedure UpdateParam(P(A; θ, ξ))
    return posterior entropies E
  end procedure

Marginal likelihood:

  P(A; θ, ξ) = Σ_{T,S} ∏_{i=1}^{N} P(T_i) · ∏_{j=1}^{M} P(S_ij; θ_j) · P(A_ij | S_ij, T_i; ξ_j)
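To make the generative story concrete, here is a minimal Python sketch that simulates GenerateAnnot; the function name and the array-based parameterisation are illustrative assumptions, not part of the MACE release:

    import numpy as np

    def generate_annotations(num_instances, num_labels, theta, xi, seed=0):
        """Simulate MACE's generative story (Hovy et al. 2013).

        theta[j]: trustworthiness of annotator j.
        xi[j]: distribution over labels that j falls back on when spamming.
        """
        rng = np.random.default_rng(seed)
        num_annotators = len(theta)
        true_labels = rng.integers(num_labels, size=num_instances)  # T_i ~ Uniform
        annotations = np.empty((num_instances, num_annotators), dtype=int)
        for i in range(num_instances):
            for j in range(num_annotators):
                spamming = rng.random() < (1.0 - theta[j])  # S_ij ~ Bernoulli(1 - theta_j)
                if not spamming:
                    annotations[i, j] = true_labels[i]      # competent: copy T_i
                else:
                    annotations[i, j] = rng.choice(num_labels, p=xi[j])  # draw from xi_j
        return true_labels, annotations

Inference then inverts this process: given only the annotation matrix, EM or variational inference recovers θ, ξ and a posterior over the true labels T.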

SLIDES 13-15

Estimating the reliability of automatic annotations

  • Task: POS tagging (7 POS taggers as "annotators")
  • Data: English Penn Treebank (in-domain)

  Tagger          Acc.
  bilstm          97.00
  hunpos          96.18
  stanford        96.93
  svmtool         95.86
  treetagger      94.35
  tweb            95.99
  wapiti          94.52
  majority vote   97.28
  MACE            97.27

⇒ MACE doesn't beat the majority vote baseline (a sketch of this baseline follows below)

Idea: guide the Variational Inference model with human feedback from active learning
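For reference, the majority-vote baseline in the table above is straightforward; a minimal sketch (the function name and the first-seen tie-breaking are our own illustrative choices):

    from collections import Counter

    def majority_vote(annotation_matrix):
        """For each token, pick the label most taggers agree on.

        annotation_matrix: one row per token, each row holding the labels
        assigned by the individual taggers. Counter.most_common breaks ties
        by whichever label was encountered first (arbitrary but deterministic).
        """
        return [Counter(row).most_common(1)[0][0] for row in annotation_matrix]

    # e.g. majority_vote([["PRP", "PRP", "PRP"], ["VBP", "VG", "VBP"]])
    # -> ["PRP", "VBP"]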

SLIDE 16

Combining Bayesian Inference with Active Learning

  • Selection strategy 1 (Baseline): Query-by-Committee (QBC)

Use disagreements in the predictions to identify errors:

  1. compute the entropy over the M possible labels:

     H = − Σ_{m=1}^{M} P(y_i = m) log P(y_i = m)

  2. select the N instances with the highest entropy ⇒ potential errors
  3. replace the predicted label with the true label

  • Evaluate accuracy for QBC after updating the N instances ranked highest for entropy (a vote-entropy sketch follows below)
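One common way to instantiate the QBC selection step is vote entropy over the committee's labels; a minimal sketch under that assumption (function names are ours):

    import math
    from collections import Counter

    def vote_entropy(labels):
        """Entropy of the committee's vote distribution for one instance.

        labels: the labels the individual taggers assigned to this token.
        High entropy means strong disagreement, i.e. a likely error.
        """
        total = len(labels)
        return -sum((c / total) * math.log(c / total)
                    for c in Counter(labels).values())

    def qbc_select(annotation_matrix, n):
        """Indices of the n instances with the highest vote entropy."""
        return sorted(range(len(annotation_matrix)),
                      key=lambda i: vote_entropy(annotation_matrix[i]),
                      reverse=True)[:n]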

SLIDE 17

Combining Bayesian Inference with Active Learning

  • Selection strategy 2: Variational Inference & AL (VI-AL)

Maximize the probability of the observed data, using the variational model:

  1. compute the posterior entropy over the M possible labels
  2. select the N instances with the highest entropy ⇒ potential errors
  3. replace the predicted label of one randomly selected annotator with the true label
  4. recompute the probabilities, based on the updated labels

  • Evaluate accuracy of VI-AL after updating the N instances ranked highest for entropy (a sketch of this loop follows below)
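Putting the pieces together, the VI-AL loop can be sketched as follows; run_mace_vi is a placeholder for retraining the variational model on the current matrix (returning per-instance posterior entropies and predicted labels), and oracle stands in for the human annotator:

    import random

    def vi_al_loop(annotation_matrix, oracle, run_mace_vi, iterations, seed=0):
        """Sketch of the VI-AL error-detection loop described above."""
        rng = random.Random(seed)
        queried = set()
        for _ in range(iterations):
            entropies, predictions = run_mace_vi(annotation_matrix)
            # pick the not-yet-corrected instance with the highest posterior entropy
            candidates = [i for i in range(len(entropies)) if i not in queried]
            i = max(candidates, key=lambda k: entropies[k])
            queried.add(i)
            # replace one randomly selected annotator's label with the true label
            j = rng.randrange(len(annotation_matrix[i]))
            annotation_matrix[i][j] = oracle(i)
        return run_mace_vi(annotation_matrix)[1]  # final label predictions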

SLIDES 18-20

Workflow (figure):

  • Preprocessing: classifiers c_1, c_2, ..., c_n fill the annotation matrix
    (e.g. rows DT DT ... DT / N NE ... N / V V ... V)
    EVAL: tagger accuracy
  • AL for N iterations: select instances (QBC: entropy; VI-AL: posterior entropy),
    get the label from the oracle, update the matrix, retrain the VI model
    EVAL: error-detection precision, recall, #true positives
  • Output after N iterations: c_QBC (majority vote, e.g. DT N V ...) and
    c_VI-AL (VI prediction, e.g. DT NE V ...)
    EVAL: label accuracy

SLIDE 21

Experiments

We test our approach

  • on 2 different tasks → POS tagging, NER
  • on 2 different languages → English, German
  • on in-domain data → Penn Treebank
  • on out-of-domain data → web data, EuroParl
  • in AL simulations → Experiments 1-3
  • and in a real-world setting → Experiment 4

SLIDES 22-25

Experiment 1 – In-domain POS tagging

  • Large training set (WSJ), in-domain

          QBC                   VI-AL
  N       label acc  ED prec    label acc  ED prec
  MACE    97.58      –          97.56      –
  100     97.84      13.0       98.42      41.0
  200     97.86       7.0       98.90      33.0
  300     97.90       5.3       99.16      26.3
  400     97.82       3.0       99.26      21.0
  500     97.92       3.4       99.34      17.6    ← 10% of data

Table: Label accuracies on 5,000 tokens of WSJ text after N iterations, and precision for error detection (ED prec).

SLIDES 26-27

Experiment 1 – Errors we were not able to detect

  freq.  gold  predicted      freq.  gold  predicted
  18     JJ    VBN            1      NNP   JJ
  2      IN    CC             1      NNP   NN
  2      NN    NNP            1      PRP   PRP$
  2      RBR   JJR            1      RP    IN
  1      CD    DT             1      VBD   VBN
  1      JJR   JJ             1      VBN   VBD
  1      NN    JJ

(1) companies were closed[JJ/VBN] yesterday – adjective or past participle?

Manning (2011): error categorisation ⇒ underspecified/unclear

SLIDES 28-31

Experiment 2 – Out-of-domain POS tagging

  • No in-domain training data, taggers trained on WSJ
  • Target domain: English Web Treebank (Bies et al., 2012)
  • New tags in the target domain

Tagger accuracies for different web genres:

           answer  email  newsg.  review  weblog
  bilstm   85.5    84.2   86.5    86.9    89.6
  hun      88.5    87.4   89.2    89.7    92.2
  stan     89.0    88.1   89.9    90.7    93.0
  svm      87.4    86.1   88.2    88.8    91.3
  tree     86.8    85.6   87.1    88.7    87.4
  tweb     88.2    87.1   88.5    89.3    92.0
  wapiti   85.2    82.4   84.6    86.5    87.3
  major.   87.4    88.8   89.1    90.9    93.8
  MACE     87.4    88.6   89.1    91.0    93.9

SLIDES 32-35

Experiment 2 – Out-of-domain POS tagging

  N      answer  email  newsg  review  weblog
  MACE   87.4    88.6   89.1   91.0    93.9
  100    88.9    90.0   90.4   92.2    95.2
  200    90.3    91.1   91.3   93.4    96.2
  300    91.6    92.2   92.0   94.4    97.2
  400    92.9    93.3   92.8   95.4    97.5
  500    93.9    94.0   93.5   96.0    97.8   ← 10% of data
  600    94.8    94.9   93.9   96.5    97.9
  700    95.6    95.6   94.1   96.9    98.0
  800    96.2    95.9   94.7   97.3    98.4
  900    96.7    96.2   94.9   97.7    98.6
  1000   97.0    96.8   95.1   97.9    98.6   ← 20% of data

Table: Increase in POS label accuracy on the web genres (5,000 tokens) after N iterations of error correction with VI-AL.

SLIDES 36-37

Experiment 3 – Out-of-domain NER on German

  • New language (German)
  • Out-of-domain test data (EuroParl)
  • Small label set, skewed distribution

⇒ We were able to identify >35% of all errors by querying less than 1% of the data

SLIDES 38-44

Experiment 4 – AL error detection in a realistic scenario

  • Out-of-domain POS tagging with a real human annotator

  VI-AL with human annotator
             answers                   weblog
  N          # tp   ED prec  rec       # tp   ED prec  rec
  100        71     68.0     10.8      62     62.0     20.3
  200        103    63.5     20.2      112    56.0     36.7
  300        177    58.0     27.6      156    52.0     51.1
  400        224    55.3     35.1      170    42.5     55.7
  500        259    51.2     40.6      180    36.0     59.0
  Simulation
  500        282    56.4     48.8      196    39.2     64.5

  • Label accuracies (simulation results in parentheses):
    answers: 87.4 → 92.5% (93.9%)
    weblog:  93.9 → 97.5% (97.8%)

SLIDES 45-47

Sum-up

  • Method for error detection in automatically annotated data:
    guide a Variational Inference model with human feedback from active learning
    ⇒ error detection with high precision and recall
  • Advantages of our method:
    • language-agnostic
    • no need to retrain classifiers (an advantage for AL)
    • can deal with new, unknown target labels

Future work

  • Extend the model to non-sequential annotations (trees)
SLIDE 48

Thanks for listening! Questions?

SLIDE 49

Thanks to Julius Steen for implementing the GUI.

Code: https://github.com/julmaxi/MACE-AL

SLIDE 50

Experiment 2 – Out-of-domain POS tagging

          QBC                     VI-AL
          # tp   ED prec  rec     # tp   ED prec  rec
  answer  282    56.4     44.8    323    64.6     51.3
  email   264    52.8     47.1    261    52.2     46.6
  newsg.  195    39.0     36.0    214    42.8     39.6
  review  227    45.4     49.7    255    51.0     55.8
  weblog  166    33.2     54.6    196    39.2     64.5

Table: No. of true positives (# tp), precision (ED prec) and recall for error detection on 5,000 tokens after 500 iterations on all web genres.

SLIDE 51

Experiment 2 – Out-of-domain POS tagging

          QBC                     VI-AL
  N       # tp   ED prec  rec     # tp   ED prec  rec
  100     85     85.0     13.5    75     75.0     11.9
  200     148    74.0     23.5    146    73.0     23.2
  300     198    66.0     31.4    212    70.7     33.6
  400     239    59.7     37.9    278    69.5     44.1
  500     282    56.4     44.8    323    64.6     51.3
  600     313    52.2     49.7    374    62.3     59.4
  700     331    47.3     52.5    412    58.9     65.4
  800     355    44.4     56.3    441    55.1     70.0
  900     365    40.6     57.9    465    51.7     73.8
  1000    371    37.1     58.9    484    48.4     76.8

Table: No. of true positives (# tp), precision (ED prec) and recall for error detection on 5,000 tokens from the answers set after N iterations.

SLIDE 52

Experiment 3 – Out-of-domain NER on German

  • Small label set, skewed distribution
  • New language (German), out-of-domain test data

          QBC                     VI-AL
  N       # tp   ED prec  rec     # tp   ED prec  rec
  100     54     54.0     3.1     76     76.0     4.7
  200     113    56.5     6.4     155    77.5     9.6
  300     162    54.0     9.2     217    72.3     13.4
  400     209    52.2     11.9    297    74.2     18.2
  500     274    54.8     15.6    352    70.4     22.3
  600     341    56.8     19.4    409    68.2     25.5
  700     406    58.0     23.1    452    64.6     27.8
  800     480    60.0     27.3    483    60.4     29.8
  900     551    61.2     31.4    512    56.9     31.9
  1000    617    61.7     35.1    585    58.5     35.8

  After 1000 iterations – remaining errors: 1,139 (QBC), 1,043 (VI-AL)

Table: Error detection results on the GermEval 2014 NER test set after N iterations (true positives, ED precision and recall).

SLIDE 53

Experiment 4 – AL error detection in a realistic scenario

  • Out-of-domain POS tagging with a real human annotator

  VI-AL with human annotator
          answers                   weblog
  N       # tp   ED prec  rec      # tp   ED prec  rec
  100     71     68.0     10.8     62     62.0     20.3
  200     103    63.5     20.2     112    56.0     36.7
  300     177    58.0     27.6     156    52.0     51.1
  400     224    55.3     35.1     170    42.5     55.7
  500     259    51.2     40.6     180    36.0     59.0

  • High error detection precision and recall also for a real human annotator
  • Label acc. (simulation results in parentheses): answers: 92.5% (93.9%), weblog: 97.5% (97.8%)

Time requirements for correction:

  • 500 instances from answers, annotator 1: 135 minutes
  • 500 instances from weblog, annotator 2: 157 minutes