
Statistical NLP

Spring 2011

Lecture 10: Phrase Alignment

Dan Klein – UC Berkeley

Phrase Weights


Phrase Scoring

Example: French "les chats aiment le poisson frais ." ↔ English "cats like fresh fish ."

  • Learning weights has been tried, several times:
      • [Marcu and Wong, 02]
      • [DeNero et al., 06]
      • … and others
  • Seems not to work well, for a variety of partially understood reasons
  • Main issue: big chunks get all the weight; obvious priors don’t help
  • Though, see [DeNero et al., 08]

Phrase Size

Phrases do help, but they don’t need to be long. Why should this be?

Lexical Weighting

Phrase Alignment


Identifying Phrasal Translations

Example: "In the past two years , a number of US citizens …" ↔ "过去 两 年 中 , 一 批 美国 公民 …" (gloss: past two year in , lots US citizen)

Phrase alignment models: choose a segmentation and a one-to-one phrase alignment.

Underlying assumption: there is a correct phrasal segmentation.

Unique Segmentations?

Example: "In the past two years , a number of US citizens …" ↔ "过去 两 年 中 , 一 批 美国 公民 …" (gloss: past two year in , lots US citizen)

Problem 1: Overlapping phrases can be useful (and complementary).
Problem 2: Phrases and their sub-phrases can both be useful.
Hypothesis: This is why models of phrase alignment don’t work well.

Identifying Phrasal Translations

This talk: modeling sets of overlapping, multi-scale phrase pairs.

Example: "In the past two years , a number of US citizens …" ↔ "过去 两 年 中 , 一 批 美国 公民 …" (gloss: past two year in , lots US citizen)

Input: sentence pairs. Output: extracted phrases.

… But the Standard Pipeline has Overlap!

MOTIVATION

Example: "In the past two years" ↔ "过去 两 年 中" (gloss: past two year in)

Pipeline: Sentence Pair → Word Alignment → Extracted Phrases

Our Task: Predict Extraction Sets

MOTIVATION

Conditional model of extraction sets given sentence pairs: Sentence Pair → Extracted Phrases.

Example: "In the past two years" ↔ "过去 两 年 中"

Output: extracted phrases + "word alignments"


Alignments Imply Extraction Sets

MODEL

Example: "In the past two years" ↔ "过去 两 年 中" (gloss: past two year in)

Word-level alignment links → word-to-span alignments → extraction set of bispans

Incorporating Possible Alignments

MODEL

Example: "In the past two years" ↔ "过去 两 年 中" (gloss: past two year in)

Sure and possible word links → word-to-span alignments → extraction set of bispans

Linear Model for Extraction Sets

MODEL

Example: "In the past two years" ↔ "过去 两 年 中"

The linear model scores an extraction set with features on sure links and features on all bispans.

Features on Bispans and Sure Links

FEATURES

Example: 过 ↔ "go over", 地球 ↔ "the Earth" (gloss: go over / Earth)

Some features on sure links:
  • HMM posteriors
  • Presence in dictionary
  • Numbers & punctuation

Features on bispans:
  • HMM phrase table features: e.g., phrase relative frequencies
  • Lexical indicator features for phrases with common words
  • Monolingual phrase features: e.g., “the _____”
  • Shape features: e.g., Chinese character counts
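A minimal sketch of how a few of the bispan features above might be computed and combined by a linear model. The function names, the exact feature inventory, and the common-word list are illustrative assumptions, not the talk's actual code.

```python
# Hedged sketch: a few bispan features in the spirit of the list above,
# scored by a linear model (dot product of a feature map with weights).

def bispan_features(en_phrase, zh_phrase, common_words=("the", "of", "a")):
    """Return a sparse feature map for one candidate phrase pair."""
    feats = {}
    en_tokens = en_phrase.split()
    # Lexical indicator features for phrases containing common words.
    for w in en_tokens:
        if w.lower() in common_words:
            feats["contains=" + w.lower()] = 1.0
    # Monolingual pattern feature, e.g. English phrases starting with "the".
    if en_tokens and en_tokens[0].lower() == "the":
        feats["starts_with_the"] = 1.0
    # Shape features: Chinese character count vs. English token count.
    feats["zh_char_count"] = float(len(zh_phrase.replace(" ", "")))
    feats["en_token_count"] = float(len(en_tokens))
    return feats

def score(feats, weights):
    # Linear model: score is the dot product of features and weights.
    return sum(weights.get(k, 0.0) * v for k, v in feats.items())
```

In the full model, such feature maps would be computed for every candidate bispan and every sure link, and their weighted sum gives the score of an entire extraction set.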

Getting Gold Extraction Sets

TRAINING

Hand aligned: sure and possible word links → word-to-span alignments → extraction set of bispans

Deterministic: find the min and max alignment index for each word (giving the word-to-span alignments).
Deterministic: a bispan is included iff every word within the bispan aligns within the bispan.
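The inclusion rule above can be sketched directly. This is a hedged illustration with illustrative names, assuming 0-based word-alignment links and ignoring the sure/possible distinction; it is not the talk's actual implementation.

```python
# Hedged sketch of the deterministic gold extraction-set rule: a bispan
# (e1..e2, f1..f2) is extracted iff it contains at least one link and
# every alignment link touching it lies fully inside it.

def extraction_set(links, e_len, f_len, max_size=4):
    """links: set of (e, f) word-alignment index pairs (0-based)."""
    bispans = []
    for e1 in range(e_len):
        for e2 in range(e1, min(e_len, e1 + max_size)):
            for f1 in range(f_len):
                for f2 in range(f1, min(f_len, f1 + max_size)):
                    # Require at least one link inside the bispan.
                    consistent = any(e1 <= e <= e2 and f1 <= f <= f2
                                     for e, f in links)
                    # Every link must be inside on both sides or neither.
                    for e, f in links:
                        if (e1 <= e <= e2) != (f1 <= f <= f2):
                            consistent = False
                            break
                    if consistent:
                        bispans.append((e1, e2, f1, f2))
    return bispans
```

For a monotone alignment like past-过去, two-两, years-年, this extracts the single-word pairs and their consistent combinations, but rejects bispans that cut a link in half.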

Discriminative Training with MIRA

TRAINING

Loss function: F-score of bispan errors (precision & recall).
Training criterion: the minimal change to w such that the gold is preferred to the guess by a loss-scaled margin.
  Gold: the annotated extraction set. Guess: arg max w·ɸ.
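The MIRA criterion above has a closed-form single-example update: move w just far enough, in the direction of the gold-minus-guess feature difference, that the gold outscores the guess by the loss. A minimal sketch with illustrative names, using dense feature vectors for simplicity:

```python
# Hedged sketch of a single-example MIRA update: the minimal change to w
# such that score(gold) - score(guess) >= loss, with step size capped at C.

def mira_update(w, phi_gold, phi_guess, loss, C=1.0):
    """w, phi_gold, phi_guess: equal-length feature vectors (lists)."""
    delta = [g - b for g, b in zip(phi_gold, phi_guess)]
    margin = sum(wi * di for wi, di in zip(w, delta))  # w . (gold - guess)
    norm_sq = sum(d * d for d in delta)
    if norm_sq == 0:
        return w  # identical features: no direction to move in
    # Step size: just enough to satisfy the loss-scaled margin, capped at C.
    tau = min(C, max(0.0, (loss - margin) / norm_sq))
    return [wi + tau * di for wi, di in zip(w, delta)]
```

In the extraction-set model, `loss` would be the F-score-based bispan error of the guess, and the feature vectors would sum over sure links and bispans.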


Inference: An ITG Parser

INFERENCE

ITG captures some bispans.
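Why only some? An ITG derivation recursively splits a sentence pair into two contiguous blocks that are either kept in order (straight) or swapped (inverted), so alignments requiring other reorderings fall outside its reach. A hedged illustration of the constraint (not the talk's parser; names are illustrative), checking whether an alignment permutation is ITG-derivable:

```python
# Hedged sketch of the ITG reordering constraint: a permutation is
# ITG-derivable iff it can be recursively split into two contiguous
# blocks whose target images are in order (straight) or swapped (inverted).

def itg_derivable(perm):
    """perm: a permutation of range(len(perm)) giving target positions."""
    n = len(perm)
    if n <= 1:
        return True
    for split in range(1, n):
        left, right = perm[:split], perm[split:]
        # Straight: all left targets precede all right targets.
        # Inverted: all left targets follow all right targets.
        if max(left) < min(right) or min(left) > max(right):
            def norm(block):
                # Renormalize a block to its own 0..k-1 permutation.
                order = sorted(block)
                return [order.index(x) for x in block]
            return itg_derivable(norm(left)) and itg_derivable(norm(right))
    return False
```

The classic "inside-out" reorderings (patterns like 2-4-1-3) have no such split, which is why an ITG parser can cover many, but not all, bispans of a hand-aligned sentence pair.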

Experimental Setup

RESULTS

  • Chinese-to-English newswire
  • Parallel corpus: 11.3 million words; sentence length ≤ 40
  • MT systems: tuned and tested on NIST ’04 and ’05
  • Supervised data: 150 training & 191 test sentences (NIST ’02)
  • Unsupervised model: jointly trained HMM (Berkeley Aligner)

Baselines and Limited Systems

RESULTS

HMM:
  • State-of-the-art unsupervised baseline
  • Joint training & competitive posterior decoding
  • Source of many features for supervised models
ITG:
  • Supervised ITG aligner with block terminals
  • State-of-the-art supervised baseline
  • Re-implementation of Haghighi et al., 2009
Coarse:
  • Supervised block ITG + possible alignments
  • Coarse pass of the full extraction set model

Word Alignment Performance

RESULTS

[Bar chart: precision, recall, and 1−AER for the HMM, ITG, Coarse, and Full aligners.]

Extracted Bispan Performance

RESULTS

[Bar chart: precision, recall, F1, and F5 of extracted bispans for the HMM, ITG, Coarse, and Full systems.]

Translation Performance (BLEU)

RESULTS

[Bar chart: BLEU for the Moses and Joshua decoders, each using phrases from the HMM, ITG, Coarse, and Full systems; scores fall roughly between 33 and 36.]

Supervised conditions also included HMM alignments.