
Appeared in Proceedings of the 41st Annual Meeting of the Association for Computational Linguistics (ACL 2003), Companion Volume, Sapporo, July 2003.

Learning Non-Isomorphic Tree Mappings for Machine Translation

Jason Eisner, Computer Science Dept., Johns Hopkins Univ. <jason@cs.jhu.edu>

Abstract

Often one may wish to learn a tree-to-tree mapping, training it on unaligned pairs of trees, or on a mixture of trees and strings. Unlike previous statistical formalisms (limited to isomorphic trees), synchronous TSG allows local distortion of the tree topology. We reformulate it to permit dependency trees, and sketch EM/Viterbi algorithms for alignment, training, and decoding.

1 Introduction: Tree-to-Tree Mappings

Statistical machine translation systems are trained on pairs of sentences that are mutual translations. For example, (beaucoup d’enfants donnent un baiser à Sam, kids kiss Sam quite often). This translation is somewhat free, as is common in naturally occurring data. The first sentence is literally Lots of children give a kiss to Sam.

This short paper outlines “natural” formalisms and algorithms for training on pairs of trees. Our methods work on either dependency trees (as shown) or phrase-structure trees. Note that the depicted trees are not isomorphic.

[Figure: the pair of dependency trees for beaucoup d’enfants donnent un baiser à Sam and kids kiss Sam quite often.]

Our main concern is to develop models that can align and learn from these tree pairs despite the “mismatches” in tree structure. Many “mismatches” are characteristic of a language pair: e.g., preposition insertion (of → ε), multiword locutions (kiss ↔ give a kiss to; misinform ↔ wrongly inform), and head-swapping (float down ↔ descend by floating). Such systematic mismatches should be learned by the model, and used during translation.

It is even helpful to learn mismatches that merely tend to arise during free translation. Knowing that beaucoup d’ is often deleted will help in aligning the rest of the tree.

When would learned tree-to-tree mappings be useful? Obviously, in MT, when one has parsers for both the source and target language. Systems for “deep” analysis and generation might wish to learn mappings between deep and surface trees (Böhmová et al., 2001) or between syntax and semantics (Shieber and Schabes, 1990). Systems for summarization or paraphrase could also be trained on tree pairs (Knight and Marcu, 2000). Non-NLP applications might include comparing student-written programs to one another or to the correct solution.

Our methods can naturally extend to train on pairs of forests (including packed forests obtained by chart parsing). The correct tree is presumed to be an element of the forest. This makes it possible to train even when the correct parse is not fully known, or not known at all.

2 A Natural Proposal: Synchronous TSG

We make the quite natural proposal of using a synchronous tree substitution grammar (STSG). An STSG is a collection of (ordered) pairs of aligned elementary trees. These may be combined into a derived pair of trees. Both the elementary tree pairs and the operation to combine them will be formalized in later sections.

As an example, the tree pair shown in the introduction might have been derived by “vertically” assembling the 6 elementary tree pairs below. The ⌢ symbol denotes a frontier node of an elementary tree, which must be replaced by the circled root of another elementary tree. If two frontier nodes are linked by a dashed line labeled with the state X, then they must be replaced by two roots that are also linked by a dashed line labeled with X.

[Figure: the 6 elementary tree pairs. Dashed links are labeled with states, including Start, NP, and (0,Adv); some nodes are linked to null.]

The elementary trees represent idiomatic translation “chunks.” The frontier nodes represent unfilled roles in the chunks, and the states are effectively nonterminals that specify the type of filler that is required. Thus, donnent un baiser à (“give a kiss to”) corresponds to kiss, with the French subject matched to the English subject, and the French indirect object matched to the English direct object. The states could be more refined than those shown above: the state for the subject, for example, should probably be not NP but a pair (N_pl, NP_3s).

STSG is simply a version of synchronous tree-adjoining grammar or STAG (Shieber and Schabes, 1990) that lacks the adjunction operation. (It is also equivalent to top-down tree transducers.) What, then, is new here?

First, we know of no previous attempt to learn the “chunk-to-chunk” mappings. That is, we do not know at training time how the tree pair of section 1 was derived, or even what it was derived from. Our approach is to reconstruct all possible derivations, using dynamic programming to decompose the tree pair into aligned pairs of elementary trees in all possible ways. This produces a packed forest of derivations, some more probable than
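The substitution operation described in Section 2 can be illustrated with a minimal sketch. It is not the paper's formalization (which is deferred to later sections); the names `Node`, `TreePair`, and `substitute`, and the use of integer link ids to stand in for the dashed lines, are assumptions made for illustration. The sketch also simplifies the example: it omits null-linked nodes, the adverbs, and beaucoup d’, and it assumes link ids are unique across the pairs being combined.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Set


@dataclass
class Node:
    """A dependency-tree node. A frontier node has word=None and carries a
    state plus a link id shared with its partner in the other tree."""
    word: Optional[str] = None
    children: List["Node"] = field(default_factory=list)
    state: Optional[str] = None
    link: Optional[int] = None

    def is_frontier(self) -> bool:
        return self.word is None


@dataclass
class TreePair:
    """An aligned pair of trees whose two roots are linked with root_state."""
    root_state: str
    src: Node
    tgt: Node


def _links(node: Node) -> Set[Optional[int]]:
    """Collect the link ids of all frontier nodes below (or at) `node`."""
    if node.is_frontier():
        return {node.link}
    found: Set[Optional[int]] = set()
    for child in node.children:
        found |= _links(child)
    return found


def _fill(node: Node, link: int, state: str, filler: Node) -> Node:
    """Return a copy of `node` with the frontier node carrying `link` replaced."""
    if node.is_frontier():
        if node.link != link:
            return node
        if node.state != state:
            raise ValueError(f"state mismatch: {node.state} vs {state}")
        return filler
    return Node(node.word, [_fill(c, link, state, filler) for c in node.children])


def substitute(pair: TreePair, link: int, filler: TreePair) -> TreePair:
    """Replace the two frontier nodes joined by `link` with the two roots of
    `filler`, which must be linked with the same state."""
    if link not in _links(pair.src) or link not in _links(pair.tgt):
        raise KeyError(f"link {link} is not present in both trees")
    return TreePair(
        pair.root_state,
        _fill(pair.src, link, filler.root_state, filler.src),
        _fill(pair.tgt, link, filler.root_state, filler.tgt),
    )


def words(node: Node) -> List[str]:
    """Head-first (preorder) listing; unfilled frontier nodes show as <state>."""
    if node.is_frontier():
        return [f"<{node.state}>"]
    out = [node.word]
    for child in node.children:
        out.extend(words(child))
    return out


# Simplified fragment of the running example: the French subject is linked to
# the English subject (link 1), the French indirect object to the English
# direct object (link 2).
start = TreePair(
    "Start",
    Node("donnent", [
        Node(state="NP", link=1),
        Node("baiser", [Node("un")]),
        Node("à", [Node(state="NP", link=2)]),
    ]),
    Node("kiss", [Node(state="NP", link=1), Node(state="NP", link=2)]),
)
kids = TreePair("NP", Node("enfants"), Node("kids"))
sam = TreePair("NP", Node("Sam"), Node("Sam"))

derived = substitute(substitute(start, 1, kids), 2, sam)
print(words(derived.src))  # ['donnent', 'enfants', 'baiser', 'un', 'à', 'Sam']
print(words(derived.tgt))  # ['kiss', 'kids', 'Sam']
```

Because each substitution rewrites both trees at once, the two sides of the derived pair need not be isomorphic: here the French tree gains the extra nodes un, baiser, and à that have no separate counterpart under kiss.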
