

SLIDE 1

Evaluating Information Extraction

Andrea Esuli and Fabrizio Sebastiani

Istituto di Scienza e Tecnologie dell’Informazione Consiglio Nazionale delle Ricerche Via Giuseppe Moruzzi, 1 – 56124 Pisa, Italy E-mail: {firstname.lastname}@isti.cnr.it

Conference on Multilingual and Multimodal Information Access Evaluation (CLEF 2010) September 20-23, 2010 – Padova, IT

SLIDE 2

Introduction Defining Information Extraction The Segmentation F-score The Token & Separator Model Experiments Conclusion and further work

(Annotation-based) Information Extraction: an example

2 / 29 Andrea Esuli and Fabrizio Sebastiani Evaluating Information Extraction

SLIDE 3

Outline

1. Introduction
2. Defining Information Extraction
3. The Segmentation F-score
4. The Token & Separator Model
5. Experiments
6. Conclusion and further work

SLIDE 5

Introduction

There has been little past research and discussion on mathematical measures for evaluating Information Extraction (IE), and there is a generalized feeling that no satisfactory measure has been found yet. The most frequently used evaluation model in IE is the segmentation F-score. We claim that it suffers from several problems, and we propose a new evaluation model that does not suffer from them.

SLIDE 8

A formal definition of IE

Let a text U = {t1 ≺ s1 ≺ . . . ≺ sn−1 ≺ tn} consist of a sequence of tokens (e.g., word occurrences) t1, . . . , tn and separators (e.g., sequences of blanks and punctuation symbols) s1, . . . , sn−1. The term textual unit (or simply t-unit) denotes either a token or a separator.

Let C = {c1, . . . , cm} be a predefined set of tags, or tagset.

Let A = {σ11, . . . , σ1k1, . . . , σm1, . . . , σmkm} be an annotation for U, where a segment σij for U is a pair (stij, etij) composed of a start token stij ∈ U and an end token etij ∈ U.
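The definitions above can be sketched in code. This is a minimal illustration of my own (the function name and the tokenization rule are assumptions, not from the paper): a text becomes an alternating sequence of token and separator t-units, and a segment is a pair of token indices.

```python
import re

def tunits(text):
    """Split a text into an alternating list of ('tok', ...) / ('sep', ...) t-units."""
    return [("tok" if m.group()[0].isalnum() else "sep", m.group())
            for m in re.finditer(r"\w+|\W+", text)]

# A segment sigma_ij = (st_ij, et_ij) is then simply a pair of token indices
# into the 'tok' entries of this sequence.
print(tunits("The quick brown fox"))
# [('tok', 'The'), ('sep', ' '), ('tok', 'quick'), ('sep', ' '),
#  ('tok', 'brown'), ('sep', ' '), ('tok', 'fox')]
```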

SLIDE 9

A formal definition of IE (cont’d)

We define Information Extraction (IE) as the task of estimating an unknown target function Φ : U × C → A that defines how a text U ∈ U ought to be annotated (according to a tagset C) by an annotation A ∈ A. The result Φ̂ : U × C → A of this estimation is called a tagger.

Given a true annotation

A = Φ(U, C) = {σ11, . . . , σ1k1, . . . , σm1, . . . , σmkm}

and a predicted annotation

Â = Φ̂(U, C) = {σ̂11, . . . , σ̂1k̂1, . . . , σ̂m1, . . . , σ̂mk̂m}

our aim is that of defining precise criteria for measuring how accurate this estimation is.

SLIDE 10

Single-tag IE or Multi-tag IE?

Our definition allows a given t-unit to be tagged by more than one tag (multi-tag IE).

Example: in the expression “the Ronald Reagan Presidential Library” we might decree the t-units in “Ronald Reagan” to be instances of both the PER (“person”) tag and the ORG (“organization”) tag.

Single-tag IE is a special case of multi-tag IE, and a measure for multi-tag IE by definition accounts for single-tag IE too. Multi-tag IE thus consists of m independent subproblems of estimating Φ̂i : U → Ai, for any i ∈ {1, . . . , m}. We will thus simply deal with ci-annotations, i.e., sets of ci-segments of the form Ai = {σi1, . . . , σiki}, for any i ∈ {1, . . . , m}.

SLIDE 12

The segmentation F-score

Example (a true vs. a predicted annotation of “The quick brown fox jumps over the lazy dog”; under exact match, each true segment counts as a FN and each predicted segment as a FP):

true:      The [quick brown]FN fox jumps over the [lazy dog]FN
predicted: The [quick brown fox]FP jumps over the [lazy]FP [dog]FP

The segmentation F-score model assumes:

1. IE to be a single-tag task
2. F1 = 2TP / (2TP + FP + FN) as the evaluation measure
3. the set of segments (true or predicted) as the event space

These choices give rise to problems.
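The exact-match variant of this measure can be sketched as follows. This is a hedged illustration of mine, not the official shared-task scorer; segments are (start, end) token-index pairs, and the example segments are read off the slide’s sentence.

```python
def seg_f1_exact(true_segs, pred_segs):
    """Segmentation F1, exact match: a TP is a predicted segment identical to a true one."""
    tp = len(set(true_segs) & set(pred_segs))
    fp = len(pred_segs) - tp   # predicted segments with no exact counterpart
    fn = len(true_segs) - tp   # true segments never predicted exactly
    return 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0

# Token indices: The=0 quick=1 brown=2 fox=3 jumps=4 over=5 the=6 lazy=7 dog=8
true_segs = [(1, 2), (7, 8)]           # "quick brown", "lazy dog"
pred_segs = [(1, 3), (7, 7), (8, 8)]   # "quick brown fox", "lazy", "dog"
print(seg_f1_exact(true_segs, pred_segs))  # 0.0: no exact match at all
```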

SLIDE 13

Problems with the segmentation F-score: 1. True negatives

Assumption 3 makes the notion of a true negative (“any segment of any length that is neither a true nor a predicted segment”) too clumsy to be of any real use: there are O(n²) such TNs. While this is not a problem for F1, it would not allow switching to other plausible measures of agreement (e.g., Cohen’s kappa, ROC, accuracy).

SLIDE 14

Problems with the segmentation F-score: 2. Overlap

In the segmentation F-score there are several alternative models of what counts as a TP:

Exact match model (the most frequently used one): only exact matches count as TPs. Too harsh: e.g., for tag ORG, σ = “Ronald Reagan Presidential Library” and σ̂ = “Reagan Presidential Library” count as a double mistake, since σ is a FN and σ̂ is a FP.

Overlap model: if σ and σ̂ overlap even marginally, this is a TP. Too lenient: it encourages “cheating” (e.g., when σ̂ covers the entire document ...).

Constrained overlap model: at most k1 spurious tokens and at most k2 missing tokens are accepted. Too arbitrary, and it does not reward exact matches (e.g., σ̂′ = “the Ronald Reagan Presidential” is given the same credit as σ̂′′ = “Ronald Reagan Presidential Library”).
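The three matching rules can be sketched as predicates over (start, end) token-index pairs. This is an illustration of mine, not the authors’ code:

```python
def exact_match(t, p):
    return t == p

def overlap_match(t, p):
    # true and predicted segments share at least one token
    return max(t[0], p[0]) <= min(t[1], p[1])

def constrained_overlap_match(t, p, k1, k2):
    if not overlap_match(t, p):
        return False
    spurious = max(0, t[0] - p[0]) + max(0, p[1] - t[1])  # predicted tokens outside t
    missing = max(0, p[0] - t[0]) + max(0, t[1] - p[1])   # true tokens not covered by p
    return spurious <= k1 and missing <= k2
```

Note that the constrained variant gives the same full credit to every prediction within the (k1, k2) tolerance, which is precisely the “does not reward exact matches” problem noted above.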

SLIDE 15

Problems with the segmentation F-score: 3. Tag switches

It is not clear how to deal with tag switches, i.e., with cases in which the boundaries of a segment have been recognized (more or less exactly, according to one of the three models above) but the right tag has not, e.g., tagging “San Diego” as PER instead of LOC.

SLIDE 17

The Token & Separator Model

The solution we propose is based on using the set of all t-units as the event space; we dub it the Token & Separator model (or TS model).

Example:

true:      The [quick brown] fox jumps over the [lazy dog]
predicted: The [quick brown fox] jumps over the [lazy] [dog]
outcomes:  TN TN TP TP TP FP FP TN TN TN TN TN TN TN TP FN TP
           (one outcome per t-unit: 9 tokens and 8 separators)

This example returns the following scores:

Segmentation F-score with exact match: F1 = 0
Segmentation F-score with overlap match: F1 = 1
TS model (with F1): F1 = (2 · 5) / (2 · 5 + 2 + 1) = .77
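A sketch (helper name mine) of how the TS score is obtained from the per-t-unit outcomes of the example:

```python
from collections import Counter

def ts_f1(labels):
    """F1 over t-units, given one TP/FP/FN/TN outcome per token and separator."""
    c = Counter(labels)
    return 2 * c["TP"] / (2 * c["TP"] + c["FP"] + c["FN"])

# the 17 t-units (9 tokens, 8 separators) of the example above
labels = ["TN", "TN", "TP", "TP", "TP", "FP", "FP", "TN", "TN",
          "TN", "TN", "TN", "TN", "TN", "TP", "FN", "TP"]
print(round(ts_f1(labels), 2))  # 0.77
```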

SLIDE 18

The Token & Separator Model (cont’d)

The TS model addresses the three shortcomings of the segmentation F-score:

1. The TS model contemplates “reasonable” true negatives
2. The TS model naturally accounts for degree of overlap, with no need for numerical parameters
3. The TS model naturally deals with tag switches, since each tag is addressed separately

SLIDE 19

The Token & Separator Model (cont’d)

Separators are included in the event space so as to correctly evaluate segment boundary recognition: e.g., assume we need to extract PER from “Barack Obama, Hillary Clinton and Joe Biden” ...

Example:

true:      [Barack Obama], [Hillary Clinton] and [Joe Biden]
predicted: [Barack Obama, Hillary Clinton] and [Joe Biden]
outcomes:  TP TP TP FP TP TP TP TN TN TN TP TP TP

All the tokens of “Barack Obama” and “Hillary Clinton” are TPs, but the separator “, ” between them is a FP: this is what penalizes the predicted annotation for merging the two PER segments into one.

SLIDE 20

The Token & Separator Model (cont’d)

F1 or other measures (e.g., Cohen’s kappa) may be used as the measure; macro- or micro-averaging may be used as the averaging method. Sticking to F1 as the measure has several advantages:

it is robust to high imbalance;
it does not encourage a tagger to either undertag or overtag;
it may be modified (as Fβ) to accommodate a higher penalty for overtagging or undertagging;
learning algorithms for IE that are capable of internally optimizing F1 are available.

Adopting macro-averaging as the averaging method also has advantages:

it does not reward systems that are only good at tagging frequent tags.
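The difference between the two averaging methods can be sketched as follows. This is an illustration of mine; the (TP, FP, FN) triples are invented purely to make the point.

```python
def f1(tp, fp, fn):
    return 2 * tp / (2 * tp + fp + fn) if (tp + fp + fn) else 1.0

def macro_f1(tables):
    # average the per-tag F1 scores: every tag counts equally
    return sum(f1(*t) for t in tables) / len(tables)

def micro_f1(tables):
    # pool the per-tag contingency cells: frequent tags dominate
    tp = sum(t[0] for t in tables)
    fp = sum(t[1] for t in tables)
    fn = sum(t[2] for t in tables)
    return f1(tp, fp, fn)

tables = [(90, 5, 5),  # a frequent tag, tagged well
          (1, 4, 4)]   # a rare tag, tagged poorly
print(macro_f1(tables), micro_f1(tables))  # macro ≈ 0.57, micro = 0.91
```

A system that is only good on the frequent tag scores well under micro-averaging but is penalized under macro-averaging, which is the point made above.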

SLIDE 22

Experiments

We have re-evaluated according to the TS-F1^M model the submissions to the CoNLL’02 and CoNLL’03 NER shared tasks. The original evaluation was performed with the segmentation F-score and exact match.

CoNLL’02: 12 participants, Spanish and Dutch NER (we could not re-evaluate Dutch, since the original files are no longer available).
CoNLL’03: 16 participants, English and German NER.

Spanish (CoNLL’02):

Seg-F1 rank:  1    2    3    4    5    6    7    8    9    10   11   12
Seg-F1:       .814 .791 .771 .766 .758 .758 .739 .739 .737 .715 .637 .610
TS-F1^M rank: 1    2    4    5    6    7    10   9    8    3    12   11
TS-F1^M:      .821 .799 .769 .746 .746 .740 .734 .729 .724 .710 .677 .636

English (CoNLL’03):

Seg-F1 rank:  1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   16
Seg-F1:       .888 .883 .861 .855 .850 .849 .847 .843 .840 .839 .825 .817 .798 .782 .770 .602
TS-F1^M rank: 1    2    3    4    11   8    6    5    10   7    9    14   15   13   12   16
TS-F1^M:      .875 .874 .857 .853 .848 .845 .842 .840 .835 .833 .819 .817 .813 .809 .808 .671

German (CoNLL’03):

Seg-F1 rank:  1    2    3    4    5    6    7    8    9    10   11   12   13   14   15   16
Seg-F1:       .724 .719 .713 .700 .692 .689 .684 .681 .678 .665 .663 .657 .630 .573 .544 .477
TS-F1^M rank: 1    9    3    2    4    7    6    5    8    11   10   13   12   14   15   16
TS-F1^M:      .719 .708 .706 .702 .695 .691 .690 .679 .674 .650 .645 .642 .641 .616 .569 .471

(Each “TS-F1^M rank” row lists, in TS-F1^M ranking order, the Seg-F1 rank of the corresponding participant.)

SLIDE 23

Experiments: Anecdotal evaluation

The participant that placed 3rd in CoNLL’02 Spanish is ranked 10th (3rd from last!) by the TS model.

The participant that placed 11th in CoNLL’03 English is ranked 5th by the TS model. With respect to the participant that placed 5th:

− it generated 2% fewer exact matches;
+ it generated 158% more “close matches”, i.e., matches accurate modulo a single token;
+ it totally missed 7% fewer segments.

The participant that placed 9th in CoNLL’03 German is ranked 2nd by the TS model.

Taking a stand between the two models is important!

SLIDE 24

Experiments: Rank correlation

We have computed Spearman’s rank correlation

R(η′, η′′) = 1 − (6 Σ_{k=1..p} (η′(Φ̂k) − η′′(Φ̂k))²) / (p(p² − 1))

(averaged across the English, German, and Spanish tasks) between the results produced by the different evaluation models:

R(η′, η′′)   Seg-F1   TS-F1^M   T-F1^M
Seg-F1       1.0      .832      .832
TS-F1^M      .832     1.0       .990
T-F1^M       .832     .990      1.0

(T-F1^M is the token-only variant, with separators removed from the event space.)
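The correlation used above can be sketched directly from the formula (my own helper; the two arguments are the rank positions 1..p that the two evaluation models assign to the same p systems):

```python
def spearman(eta1, eta2):
    """Spearman's rank correlation: 1 - 6*sum(d^2) / (p*(p^2 - 1))."""
    p = len(eta1)
    d2 = sum((a - b) ** 2 for a, b in zip(eta1, eta2))
    return 1 - 6 * d2 / (p * (p * p - 1))

# identical rankings correlate perfectly; a fully reversed ranking gives -1
print(spearman([1, 2, 3, 4], [1, 2, 3, 4]))  # 1.0
print(spearman([1, 2, 3, 4], [4, 3, 2, 1]))  # -1.0
```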

SLIDE 26

Conclusion and further work

We overcome the shortcomings of the segmentation F-score by

clearly separating the event space from the evaluation measure;
using the set of tokens and separators as the former.

A scorer for the IOB2 format is available at http://patty.isti.cnr.it/~esuli/IEevaluation/ (it computes both the segmentation F-score and the TS-F1^M model).

Problem: the TS model does not work for multi-instance IE (i.e., when the same token/separator may belong to more than one segment for the same tag, as e.g. in opinion extraction under the WWC tagset).

SLIDE 27

The TS model: Potential criticisms

Q: My IE application is actually single-tag, and the TS model was developed for multi-tag IE ...

A: Single-tag is a special case of multi-tag. If the true annotation is single-tag, our evaluation model indeed penalizes a tagger for not generating a single-tag prediction. The same goes for single-segment IE ...

SLIDE 28

The TS model: Potential criticisms (cont’d)

Q: The TS model wrongly treats all tokens (e.g., articles and nouns) as having equal importance ...

A: If desired, different weights may be assigned to individual tokens/separators in the true annotation, since most contingency-table-based measures may be extended to deal with “weighted events”.
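A hedged sketch of such “weighted events” (helper name and weights are my own assumptions, not from the paper): each t-unit carries a weight, and the contingency cells accumulate weights rather than counts.

```python
def weighted_ts_f1(labels, weights):
    """TS F1 where each t-unit outcome contributes its weight, not a count of 1."""
    cell = {"TP": 0.0, "FP": 0.0, "FN": 0.0, "TN": 0.0}
    for lab, w in zip(labels, weights):
        cell[lab] += w
    return 2 * cell["TP"] / (2 * cell["TP"] + cell["FP"] + cell["FN"])

# down-weighting some t-units (e.g., articles, or separators) reduces
# the cost of mistakes on them:
labels = ["TP", "FP", "TP", "FN"]
weights = [1.0, 0.5, 1.0, 0.5]
print(weighted_ts_f1(labels, weights))  # 2*2 / (2*2 + 0.5 + 0.5) = 0.8
```

With all weights equal to 1.0 this reduces to the plain TS F1, which is the sense in which the extension is backward-compatible.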

SLIDE 29

The TS model: Potential criticisms (cont’d)

Q: The TS model places too much importance on separators ...

A: Again, different weights may be assigned to tokens and separators in the true annotation, if desired. Anyway, R(η′, η′′) = .990 shows that rankings are not modified substantially even by completely removing separators from consideration.

SLIDE 30

The TS model: Potential criticisms (cont’d)

Q: Is the TS model too harsh on tag switches? E.g., a system that correctly identifies the boundaries of the segment “San Diego” but incorrectly tags it as PER instead of LOC is assigned three FNs for LOC and three FPs for PER (the two tokens plus the separator between them).

A: This is not too severe a penalty in the general case, in which the two tags are not known to be close in meaning. When tags are known to be close in meaning (e.g., PER, LOC, ORG, MISC), a common supertag (“NE”) may be created and evaluation may also be carried out in terms of it.
