SLIDE 1

Reranking and Self-Training for Parser Adaptation

David McClosky, Eugene Charniak, and Mark Johnson

{dmcc|ec|mj}@cs.brown.edu

Brown Laboratory for Linguistic Information Processing (BLLIP)

SLIDE 2

Overview

- Introduction and Previous Work
- Parser Portability
- Parser Adaptation
- Reranker Portability
- Analysis
- Future Work and Conclusions

SLIDE 3

Parsing

SLIDE 4

Parameters

Parser as in [Charniak and Johnson ACL 2005].

Corpus   # words    # sentences   Parameters
WSJ      950,028    39,832        ∼2,200,000
BROWN    373,152    19,740        ∼1,300,000

The number of parameters is a function of the training data.

SLIDE 5

Parsing

SLIDE 6

n-best Parsing

SLIDE 7

Reranking Parsers

SLIDE 8

More Parameters

Reranking parser as in [Charniak and Johnson 2005]:
- 14 feature schemas
- extract features according to the schemas, then estimate feature weights

Corpus   Parser parameters   Reranker features
WSJ      ∼2,200,000          ∼1,300,000
BROWN    ∼1,300,000          ∼700,000

Again, the number of parameters is a function of the training data.
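For intuition, here is a minimal sketch of the reranking step. It is not the actual Charniak-Johnson reranker: `extract_features` is a hypothetical stand-in for the 14 feature schemas, and the discriminatively estimated model is reduced to a plain dot product with a learned weight vector.

```python
def rerank(nbest, weights, extract_features):
    """Pick the highest-scoring parse from an n-best list.

    nbest: list of (parse, log_prob) pairs from the base parser.
    weights: dict mapping feature name -> learned weight.
    extract_features: hypothetical stand-in for the 14 feature schemas;
        maps a parse to a dict of feature name -> count.
    """
    def score(parse, log_prob):
        feats = extract_features(parse)
        # The base parser's log probability is itself one reranker feature.
        total = weights.get("log_prob", 0.0) * log_prob
        return total + sum(weights.get(f, 0.0) * v for f, v in feats.items())

    best_parse, _ = max(nbest, key=lambda pair: score(*pair))
    return best_parse
```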

SLIDE 9

Corpora and Domains

- WSJ: labeled news text, about 40,000 parses
- NANC: unlabeled news text, about 24 million sentences
- BROWN: labeled text from various domains, about 24,000 parses total

SLIDE 10

Corpora and Domains

- WSJ: labeled news text, about 40,000 parses
- NANC: unlabeled news text, about 24 million sentences
- BROWN: labeled text from various domains, about 24,000 parses total

BROWN divisions as in [Bacchiani et al. 2006] (based on [Gildea 2001]):
- 19,740 train, 2,078 tune, and 2,425 test sentences
- the treebanked sections are predominantly fiction
- each division of the corpus consists of sentences from all available genres

SLIDE 11

Self-Training

[McClosky, Charniak, and Johnson NAACL 2006]

1. Train a model from labeled data (train the reranking parser on WSJ).
2. Use the model to annotate unlabeled data (parse NANC with that model).
3. Combine the annotated data with the labeled training data (merge the parsed NANC data with the WSJ training data).
4. Train a new model from the combined data (train the reranking parser on WSJ+NANC).
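A schematic of these four steps as one round of self-training. `train` and the model's `.parse` method are hypothetical stand-ins for training and running the reranking parser, and any relative weighting of the gold data explored in the paper is omitted.

```python
def self_train(labeled, unlabeled, train):
    """One round of self-training.

    labeled: gold parse trees (here, WSJ).
    unlabeled: raw sentences (here, NANC).
    train: any function from a list of trees to a model with a
        .parse(sentence) method (here, the reranking parser's trainer).
    """
    model = train(labeled)                      # 1. train on labeled data
    auto = [model.parse(s) for s in unlabeled]  # 2. annotate unlabeled data
    combined = labeled + auto                   # 3. merge the two sets
    return train(combined)                      # 4. retrain on the combination
```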

SLIDE 12

Overtrained?

Question: How does setting so many parameters from Wall Street Journal data affect parsing performance on the Brown corpus?

SLIDE 13

Previous Work

Training    Testing   Gildea   Bacchiani
WSJ         WSJ       86.4     87.0
WSJ         BROWN     80.6     81.1
BROWN       BROWN     84.0     84.7
WSJ+BROWN   BROWN     84.3     85.6

f-measures from [Gildea 2001] and [Bacchiani et al. 2006]

SLIDE 18

Summary of findings

The self-trained WSJ+NANC model does not appear to be overtrained, and both the self-training and reranking techniques are fairly portable across domains. With these techniques, WSJ data alone gives performance almost as good as training on the actual BROWN corpus (though they do not work as well on more distant domains).

SLIDE 19

Overview

- Introduction and Previous Work
- Parser Portability
- Parser Adaptation
- Reranker Portability
- Analysis
- Future Work and Conclusions

SLIDE 20

Parser Portability

Task: use existing data/models from a source domain to parse a target domain.

Train: WSJ
Test: BROWN

Variables:
- parser vs. reranking parser
- effect of self-training with NANC

SLIDE 21

Parser Portability

Train   Test    Parser   Reranking Parser
WSJ     WSJ     89.7     91.0
WSJ     BROWN   83.9     85.8

f-scores on WSJ section 23 and the BROWN development section
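All scores in these tables are labeled-bracketing f-scores in the PARSEVAL style (in practice computed with the standard evalb tool). A minimal sketch of the metric, assuming each parse has already been reduced to a multiset of (label, start, end) constituent spans:

```python
from collections import Counter

def bracket_fscore(gold, test):
    """Labeled-bracketing f-score between two parses.

    gold, test: lists of (label, start, end) constituent spans.
    """
    matched = sum((Counter(gold) & Counter(test)).values())
    precision = matched / len(test) if test else 0.0
    recall = matched / len(gold) if gold else 0.0
    if precision + recall == 0.0:
        return 0.0
    # f-score is the harmonic mean of precision and recall.
    return 2 * precision * recall / (precision + recall)
```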

SLIDE 22

Parser Portability

Parsing model     Parser   Reranking Parser
WSJ baseline      83.9     85.8
WSJ+50k NANC      84.8     86.6
WSJ+250k NANC     85.7     87.2
WSJ+500k NANC     86.0     87.3
WSJ+1,000k NANC   86.2     87.3
WSJ+1,500k NANC   86.2     87.6
WSJ+2,500k NANC   86.4     87.7

f-scores on the BROWN development section

SLIDE 23

Parser Portability

Parsing model     Parser   Reranking Parser
WSJ baseline      83.9     85.8
WSJ+50k NANC      84.8     86.6
WSJ+250k NANC     85.7     87.2
WSJ+500k NANC     86.0     87.3
WSJ+1,000k NANC   86.2     87.3
WSJ+1,500k NANC   86.2     87.6
WSJ+2,500k NANC   86.4     87.7
BROWN baseline    86.4     87.7

f-scores on the BROWN development section

SLIDE 24

Parser Adaptation

Task: use existing data/models from a source domain, together with some target-domain material, to parse the target domain.

Train: WSJ and/or BROWN
Test: BROWN

Variables:
- number of self-trained sentences added
- amount of BROWN training data

SLIDE 25

Labeled In-domain Data

Parser model   Parser   Reranker
WSJ alone      83.9     85.8
BROWN alone    86.3     87.4
WSJ+BROWN      86.5     88.1

f-scores on the BROWN development section

SLIDE 26

Adding Self-Trained Data

Parser model          Parser   Reranker
WSJ alone             83.9     85.8
WSJ+2,500k NANC       86.4     87.7
BROWN alone           86.3     87.4
BROWN+250k NANC       86.8     88.1
WSJ+BROWN             86.5     88.1
WSJ+BROWN+250k NANC   86.8     88.1

f-scores on the BROWN development section

SLIDE 27

Reranker Portability

Parser model   Parser alone   WSJ reranker   BROWN reranker
WSJ            82.9           85.2           85.2
WSJ+NANC       87.1           87.8           87.9
BROWN          86.7           88.2           88.4

f-scores on the BROWN test section

SLIDE 30

Analysis Overview

- Oracle scores
- Parser agreement
- Per-category f-scores
- Factor analysis

SLIDE 31

Oracle Scores

Model      1-best   10-best   25-best   50-best
WSJ        82.6     88.9      90.7      91.9
WSJ+NANC   86.4     92.1      93.5      94.3
BROWN      86.3     92.0      93.3      94.2

f-scores on the BROWN development section
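The k-best oracle score picks, for each sentence, the candidate parse closest to the gold tree, so it upper-bounds what any reranker could extract from the list. A sketch, reusing the hypothetical `bracket_fscore` from the earlier slide and averaging per-sentence scores (evalb instead aggregates bracket counts over the corpus):

```python
def oracle_fscore(nbest_lists, golds, k):
    """Mean k-best oracle f-score over a corpus.

    nbest_lists: per sentence, the parser's candidate parses (best-first).
    golds: per sentence, the gold parse. Parses are bracket lists as in
        bracket_fscore above.
    """
    scores = []
    for nbest, gold in zip(nbest_lists, golds):
        # Best achievable score if we could choose from the top k parses.
        scores.append(max(bracket_fscore(gold, p) for p in nbest[:k]))
    return sum(scores) / len(scores)
```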

SLIDE 33

Parser Agreement

Agreement of parses from the WSJ+NANC reranking parser with parses from the BROWN reranking parser:
- bracketing agreement f-score: 88.03%
- complete match: 44.92%
- average crossing brackets: 0.94
- POS tagging agreement: 94.85%
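Two of these metrics are less standard than f-score, so here is a sketch of how they can be computed, treating one parser's output as the reference; spans are (start, end) pairs, and the exact bracket conventions (e.g., punctuation handling) are assumptions.

```python
def crossing_brackets(test_spans, ref_spans):
    """Number of test brackets that cross (overlap without nesting) a reference bracket."""
    def crosses(a, b):
        return a[0] < b[0] < a[1] < b[1] or b[0] < a[0] < b[1] < a[1]
    return sum(any(crosses(t, r) for r in ref_spans) for t in test_spans)

def complete_match_rate(test_trees, ref_trees):
    """Fraction of sentences on which the two parsers agree exactly
    (trees in any comparable representation, e.g., bracket lists)."""
    return sum(t == r for t, r in zip(test_trees, ref_trees)) / len(ref_trees)
```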

SLIDE 34

Per-Category f-scores

Description       Size   BROWN   WSJ+NANC   ∆
Popular Lore      271    87.3    89.6       +2.28
Letters           281    87.6    87.1       -0.45
General fiction   333    87.2    85.9       -1.29
Mystery           318    88.7    88.3       -0.45
Science fiction    76    87.7    88.8       +1.17
Adventure         378    89.7    89.0       -0.64
Romance           338    88.0    86.6       -1.40
Humor              83    84.6    87.0       +2.45

f-scores on the BROWN development section (Size = number of sentences; ∆ = WSJ+NANC − BROWN)

SLIDE 35

Factor Analysis

A generalized linear model with a binomial link; the predicted variable is whether the BROWN model's f-score exceeds the WSJ+NANC model's f-score on a sentence.

Explanatory variables:
- sentence length
- number of prepositions
- number of conjunctions
- BROWN subcorpus ID

SLIDE 36

Factor Analysis

A generalized linear model with a binomial link; the predicted variable is whether the BROWN model's f-score exceeds the WSJ+NANC model's f-score on a sentence.

Explanatory variables (⋆ marks the factors found significant):
- sentence length
- number of prepositions ⋆
- number of conjunctions
- BROWN subcorpus ID ⋆
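A sketch of one way to fit such a model with statsmodels; the paper does not name its software, and the column names and randomly generated toy data below are purely illustrative.

```python
import numpy as np
import pandas as pd
import statsmodels.api as sm
import statsmodels.formula.api as smf

rng = np.random.default_rng(0)
n = 200  # toy stand-in for the ~2,000 BROWN dev sentences
df = pd.DataFrame({
    "length":    rng.integers(5, 45, n),   # sentence length
    "num_prep":  rng.integers(0, 6, n),    # number of prepositions
    "num_conj":  rng.integers(0, 4, n),    # number of conjunctions
    "subcorpus": rng.choice(list("KLNPR"), n),  # BROWN subcorpus ID
})
# Synthetic outcome: 1 iff the BROWN model out-scores WSJ+NANC on the sentence.
df["brown_wins"] = rng.integers(0, 2, n)

# Binomial GLM (logistic link by default); C(...) treats the subcorpus ID
# as a categorical factor.
fit = smf.glm("brown_wins ~ length + num_prep + num_conj + C(subcorpus)",
              data=df, family=sm.families.Binomial()).fit()
print(fit.summary())  # the p-values indicate which factors are significant
```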

SLIDE 37

Per-Category f-scores

Description         Size   BROWN   WSJ+NANC   ∆
Popular Lore        271    87.3    89.6       +2.28
Letters ⋆           281    87.6    87.1       -0.45
General fiction ⋆   333    87.2    85.9       -1.29
Mystery ⋆           318    88.7    88.3       -0.45
Science fiction      76    87.7    88.8       +1.17
Adventure ⋆         378    89.7    89.0       -0.64
Romance ⋆           338    88.0    86.6       -1.40
Humor                83    84.6    87.0       +2.45

f-scores on the BROWN development section (⋆ marks subcorpora whose ID was a significant factor)

SLIDE 38

Future Work

- Self-training on bridging corpora for harder domains (e.g., to parse biomedical text, self-train on biology textbooks)
- Deeper comparison of the BROWN and WSJ rerankers
- Parallel experiments for the Switchboard and biomedical domains
- Further analysis

SLIDE 39

Conclusions

The self-trained WSJ+NANC model does not appear to be overtrained, and both the self-training and reranking techniques are fairly portable across domains. With these techniques, WSJ data alone gives performance almost as good as training on the actual BROWN corpus (though they do not work as well on more distant domains).

SLIDE 40

Acknowledgements

This work was supported by NSF grants LIS9720368 and IIS0095940, and by DARPA GALE contract HR0011-06-2-0001. We would like to thank the BLLIP team for their comments.

Questions?
