
Learning Non-Isomorphic Tree Mappings for Machine Translation

Jason Eisner - Johns Hopkins Univ.


Syntax-Based Machine Translation

  • Previous work assumes essentially isomorphic trees

– Wu 1995, Alshawi et al. 2000, Yamada & Knight 2000

  • But trees are not isomorphic!

– Discrepancies between the languages
– Free translation in the training data

[Figure: two training trees for “wrongly report events to-John” ↔ “him misinform of the events”, a free translation. Annotations: 2 words become 1 (wrongly report → misinform); dependents are reordered; 0 words become 1 (twice).]

Synchronous Tree Substitution Grammar

“beaucoup d’enfants donnent un baiser à Sam” ↔ “kids kiss Sam quite often”

[Figure: two training trees, showing a free translation from French to English. A possible alignment is shown in orange; a much worse alignment is also shown for contrast. The alignment shows how the trees are generated synchronously from “little trees”: NP = enfants ↔ kids; NP = beaucoup d’ NP ↔ NP; NP = Sam ↔ Sam; Start = donnent un baiser à (with two NP frontier nodes) ↔ kiss (with two NP frontier nodes); Adv = null ↔ quite; Adv = null ↔ often.]

Grammar = Set of Elementary Trees

[Figure: the elementary tree pairs from the training pair above, shown one by one. “donnent un baiser à ↔ kiss” is an idiomatic translation; “beaucoup d’” deletes inside the tree, matching nothing in English; the adverbial subtree (“quite”, “often”) matches nothing in French.]

Probability model similar to PCFG

Probability of generating training trees T1, T2 with alignment A

P(T1, T2, A) = ∏ p(t1, t2, a | n)

i.e., the product of the probabilities of the “little” trees that are used. Each factor, e.g.

p( [VP wrongly report NP] | [VP misinform NP] )

is given by a maximum entropy model.

Form of model of big tree pairs

  • Joint model Pθ(T1, T2). Wise to use the noisy-channel form Pθ(T1 | T2) · Pθ(T2): Pθ(T2) could be trained on zillions of target-language trees, while Pθ(T1 | T2) must train on paired trees (hard to get). But any joint model will do.
  • In synchronous TSG, an aligned big tree pair is generated by choosing a sequence of little tree pairs:

P(T1, T2, A) = ∏ p(t1, t2, a | n)
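For concreteness, here is a minimal Python sketch of this factorization, treating a derivation as an explicit list of little tree pairs and multiplying their probabilities. The elementary trees are the ones from the running example, but the probabilities are invented for illustration; in the model they come from the maxent model p(t1, t2, a | n).

    from math import prod  # Python 3.8+

    # One derivation = the sequence of little tree pairs that generates the
    # aligned big tree pair.  Probabilities are made up for illustration.
    derivation = [
        ("Start: donnent un baiser à ↔ kiss", 0.01),  # idiomatic translation
        ("NP: beaucoup d’ NP ↔ NP",           0.05),  # deletes in English
        ("NP: enfants ↔ kids",                0.30),
        ("NP: Sam ↔ Sam",                     0.90),
        ("Adv: null ↔ quite",                 0.02),  # inserted in English
        ("Adv: null ↔ often",                 0.02),
    ]

    # P(T1, T2, A) is just the product of the little-tree probabilities.
    p_joint = prod(p for _, p in derivation)
    print(f"P(T1, T2, A) = {p_joint:.2e}")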

Maxent model of little tree pairs

p( [VP wrongly report NP] | [VP misinform NP] )

Features:
  • report ↔ misinform? (at root)
  • wrongly ↔ misinform?
  • report+wrongly ↔ misinform? (use dictionary)
  • verb incorporates adverb child?
  • verb incorporates child 1 of 3?
  • children 2, 3 switch positions?
  • common tree sizes & shapes?
  • ... etc. ...
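A minimal sketch of a log-linear (maxent) model over little tree pairs. The feature names echo the bullets above, but the representation of a little tree, the feature tests, and the weights below are all hypothetical:

    import math

    def features(t1, t2):
        """Binary features of a little tree pair (toy stand-ins)."""
        return {
            "root_match":       (t1["head"], t2["head"]) == ("report", "misinform"),
            "dict_2_to_1":      "wrongly" in t1["words"] and t2["head"] == "misinform",
            "incorporates_adv": any(w.endswith("ly") for w in t1["words"]),
        }

    WEIGHTS = {"root_match": 1.2, "dict_2_to_1": 2.0, "incorporates_adv": 0.7}

    def unnormalized(t1, t2):
        """exp(w · f(t1, t2)); dividing by Z summed over competing little
        tree pairs with the same root nonterminal n (elided here) would
        give p(t1, t2, a | n)."""
        return math.exp(sum(WEIGHTS[k] for k, v in features(t1, t2).items() if v))

    t1 = {"head": "report",    "words": ["wrongly", "report"]}
    t2 = {"head": "misinform", "words": ["misinform"]}
    print(unnormalized(t1, t2))  # exp(1.2 + 2.0 + 0.7)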

Inside Probabilities

[Figure: the training trees for “wrongly report events to-John” ↔ “him misinform of the events”, with a little tree pair rooted at the aligned VP pair and two aligned NP frontier pairs below it.]

β(c1, c2) = ∑ over little tree pairs (t1, t2, a) rooted at (c1, c2) of p(t1, t2, a | n) · ∏ β(d1, d2) over frontier pairs (d1, d2) matched by a

e.g. β(report, misinform) = ... + p( [VP wrongly report NP NP] | [VP misinform NP NP] ) · β(events, of-the-events) · β(to-John, him) + ...

only O(n²)


  • Alignment: find A to max Pθ(T1,T2,A)
  • Decoding: find T2, A to max Pθ(T1,T2,A)
  • Training: find θ to max ∑A Pθ(T1,T2,A)
  • Do everything on little trees instead!
  • Only need to train & decode a model of pθ(t1,t2,a)
  • But we aren’t sure how to break up the big trees correctly

– So try all possible little trees & all ways of combining them, by dynamic prog.

P(T1, T2, A) = ∏ p(t1, t2, a | n)

Alignment Pseudocode

for each node c1 of T1 (bottom-up)
    for each possible little tree t1 rooted at c1
        for each node c2 of T2 (bottom-up)
            for each possible little tree t2 rooted at c2
                for each matching a between frontier nodes of t1 and t2
                    p = p(t1, t2, a)
                    for each pair (d1, d2) of frontier nodes matched by a
                        p = p * β(d1, d2)        // inside probability of kids
                    β(c1, c2) = β(c1, c2) + p    // our inside probability

Nonterminal states are used in practice but not shown here. For EM training, also find outside probabilities.
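A runnable Python sketch of this inside computation, under simplifying assumptions of my own rather than the slides’: each little tree is exactly one node plus its children, the matching a must pair frontier nodes 1:1 (no null alignments or insertions), and p_little is a toy stand-in for the maxent model.

    import itertools

    # Trees are (word, (child, child, ...)) tuples.
    def word(t): return t[0]
    def children(t): return t[1]

    def p_little(w1, w2, perm):
        """Toy stand-in for p(t1, t2, a | n): reward matching head
        words, mildly penalize reordered children."""
        score = 0.9 if w1 == w2 else 0.1
        for i, j in enumerate(perm):
            if i != j:
                score *= 0.5
        return score

    def beta(t1, t2):
        """Inside probability of generating the subtree pair (t1, t2)."""
        c1, c2 = children(t1), children(t2)
        if len(c1) != len(c2):               # simplification: no null alignments
            return 0.0
        total = 0.0
        # the matching 'a' ranges over 1:1 pairings of the frontier nodes,
        # which under our simplification are just the children
        for perm in itertools.permutations(range(len(c2))):
            p = p_little(word(t1), word(t2), perm)
            for i, j in enumerate(perm):
                p *= beta(c1[i], c2[j])      # inside probabilities of the kids
            total += p
        return total

    # Toy dependency trees (glosses only, flattened for the sketch).
    t_fr = ("give", (("kids", ()), ("kiss", ()), ("Sam", ())))
    t_en = ("kiss", (("kids", ()), ("Sam", ())))
    print(beta(t_fr, t_en))    # 0.0: unequal arity would need null alignment
    t_en2 = ("give", (("kids", ()), ("kiss", ()), ("Sam", ())))
    print(beta(t_fr, t_en2))   # identical trees score highest

Summing over matchings gives the inside probability used by EM; replacing the sum with a max yields the Viterbi alignment.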

An MT Architecture

[Diagram: a dynamic programming engine mediates between the probability model pθ(t1, t2, a) of little trees and two consumers. The Trainer scores all alignments of two big trees T1, T2: the engine asks the model to score each possible little tree pair (t1, t2, a), and inside-outside estimated counts are fed back to update the parameters θ. The Decoder scores all alignments between a big tree T1 and a forest of big trees T2: for each possible t1, the model proposes translations t2 of the little tree t1, the engine scores each proposed (t1, t2, a), and the Viterbi alignment yields the output T2.]
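Read as software, the diagram implies a narrow interface between the shared engine and the model. A sketch of that interface in Python, with hypothetical names (the slides do not specify an API):

    from typing import Any, Iterable, Protocol

    # Hypothetical interface implied by the diagram: the DP engine only
    # ever asks the probability model these two questions.
    class LittleTreeModel(Protocol):
        def score(self, t1: Any, t2: Any, a: Any) -> float:
            """Return p(t1, t2, a | n) for one little tree pair
            (used by the Trainer and for alignment)."""
            ...

        def propose(self, t1: Any) -> Iterable[Any]:
            """Propose candidate translations t2 of little tree t1
            (used by the Decoder)."""
            ...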

Related Work

  • Synchronous grammars (Shieber & Schabes 1990)

– Statistical work has allowed only 1:1 (isomorphic trees)

  • Stochastic inversion transduction grammars (Wu 1995)
  • Head transducer grammars (Alshawi et al. 2000)
  • Statistical tree translation

– Noisy channel model (Yamada & Knight 2000)

  • Infers tree: trains on (string, tree) pair, not (tree, tree) pair
  • But again, allows only 1:1, plus 1:0 at leaves
  • Data-oriented translation (Poutsma 2000)

– Synchronous DOP model trained on already aligned trees

  • Statistical tree generation

– Similar to our decoding: construct forest of appropriate trees, pick by highest prob

– Dynamic prog. search in packed forest (Langkilde 2000)
– Stack decoder (Ratnaparkhi 2000)

What Is New Here?

  • Learning full elementary tree pairs, not rule pairs or subcat pairs

– Previous statistical formalisms have basically assumed isomorphic trees

  • Maximum-entropy modeling of elementary tree pairs
  • New, flexible formalization of synchronous Tree Subst. Grammar

– Allows either dependency trees or phrase-structure trees
– “Empty” trees permit insertion and deletion during translation
– Concrete enough for implementation (cf. informal previous descriptions)
– TSG is more powerful than CFG for modeling trees, but faster than TAG

  • Observation that dynamic programming is surprisingly fast

– Find all possible decompositions into aligned elementary tree pairs
– O(n²) if both input trees are fully known and elem. tree size is bounded

Status & Thanks

  • Developed and implemented during the JHU CLSP summer workshop 2002 (funded by NSF)
  • Other team members: Jan Hajič, Bonnie Dorr, Dan Gildea, Gerald Penn, Drago Radev, Owen Rambow, and students Martin Cmejrek, Yuan Ding, Terry Koo, Kristen Parton

  • Also being used for other kinds of tree mappings:

– between deep structure and surface structure, or semantics and syntax
– between original text and a summarized/paraphrased/plagiarized version

  • Results forthcoming (that’s why I didn’t submit a full paper ☺)
Summary

  • Most MT systems work on strings
  • We want to translate trees – want to respect syntactic structure
  • But don’t assume that translated trees are structurally isomorphic!
  • TSG formalism: Translation locally replaces tree structure and content.
  • Parameters: Probabilities of local substitutions (use maxent model)
  • Algorithms: Dynamic programming (local substitutions can’t overlap)
  • EM training on <English tree, Czech tree> pairs can be fast:

– Align O(n) tree nodes with O(n) tree nodes, respecting subconstituency
– Dynamic programming: find all alignments and retrain using EM (see the toy sketch below)
– Faster than aligning O(n) words with O(n) words
– If the correct training tree is unknown, a well-pruned parse forest still has O(n) nodes
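To make the EM bullet concrete, here is a toy EM loop under strong simplifications that are mine, not the slides’: each training pair’s decompositions are enumerated explicitly as lists of string labels (the real system enumerates them implicitly via inside-outside on the packed forest), and the model is a single categorical distribution rather than a maxent model conditioned on the root nonterminal n.

    from collections import defaultdict
    from math import prod  # Python 3.8+

    training = [
        # pair 1 is ambiguous: two ways to carve it into little tree pairs
        [["give+kiss ↔ kiss", "Sam ↔ Sam"],
         ["give ↔ kiss", "kiss+Sam ↔ Sam"]],
        # pair 2 is unambiguous and reuses one of pair 1's little trees
        [["give+kiss ↔ kiss"]],
    ]

    theta = defaultdict(lambda: 1.0)               # unnormalized, i.e. uniform

    for _ in range(20):
        counts = defaultdict(float)
        for derivations in training:
            scores = [prod(theta[t] for t in d) for d in derivations]
            z = sum(scores)
            for d, s in zip(derivations, scores):  # E-step: posterior counts
                for t in d:
                    counts[t] += s / z
        total = sum(counts.values())               # M-step: renormalize
        theta = defaultdict(float, {t: c / total for t, c in counts.items()})

    for t, p in sorted(theta.items()):
        print(f"{p:.3f}  {t}")

Because pair 2 unambiguously uses “give+kiss ↔ kiss”, EM shifts probability mass toward the matching decomposition of pair 1, illustrating how unambiguous training pairs disambiguate ambiguous ones.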