Discriminative word alignment by learning the alignment structure and syntactic divergence between a language pair


SLIDE 1

Discriminative word alignment by learning the alignment structure and syntactic divergence between a language pair

Sriram Venkatapathy, IIIT Hyderabad
Aravind Joshi, University of Pennsylvania

SLIDE 2

Outline

► Word Alignment - English-Hindi Language Pair
► Related approaches
► Discriminative Re-ranking approach
  • Features
  • Parameter optimization using MIRA
  • Results
► Future Work and Conclusion

SLIDE 3

Word-Alignment

► People of these islands have adopted Hindi as a means of communication .
► इन द्वीपों के लोग ने हिंदी भाषा को एक संपर्क भाषा के रूप में अपना लिया है .
► (Gloss: These islands of people hindi language a communication language in form of adopted-take-be)
► Primary Observation:
  • The alignment between English-Hindi is largely non-monotonic, unlike the alignment between English-French.

SLIDE 4

Comparison

[Figure: Alignments of an example sentence, English-French vs. English-Hindi; axes show English word indices against French/Hindi word indices.]

SLIDE 5

Outline

► Word Alignment - English-Hindi Language Pair
► Related approaches
► Discriminative Re-ranking approach
  • Features
  • Parameter optimization using MIRA
  • Results
► Future Work and Conclusion

SLIDE 6

Related approaches

► Generative models
  • IBM Models, HMM models (implemented in GIZA++)
► Discriminative models
  • (Taskar et al., 2005)
  • (Moore et al., 2005)

SLIDE 7

Generative models - Limitations

  • Difficult to add new parameters.
► The generative story needs to be modified appropriately to incorporate the new parameters.
  • Parameters are not optimized.
► All the parameters used have equal weights. For example, translation probabilities have the same importance as distortion probabilities.
► As more complex features are added to the model, the parameters need to be optimized appropriately.

SLIDE 8

(Taskar et al., 2005) - Limitations

  • The alignment search and optimization requires that the features are local to the alignment link.
  • There is 0th-order correlation with other alignment links in an alignment.
  • (Lacoste-Julien et al., 2006) include first-order features (similar to HMM parameters) and fertility, but there still isn't much room for more complex global features required for aligning diverse language pairs such as English-Hindi.

SLIDE 9

(Moore et al., 2006) - Limitations

► Structural features are applied on partial structures (i.e., every time a new alignment link is considered).
  • May lead to ruling out good alignments at an early stage.
  • Restricts us from using more complex syntactic features (as it is a left-to-right search).

SLIDE 10

Outline

► Word Alignment - English-Hindi Language Pair
► Related approaches
► Discriminative Re-ranking approach
  • Features
  • Parameter optimization using MIRA
  • Results
► Future Work and Conclusion

SLIDE 11

Discriminative Re-ranking Approach

► The best alignment â = argmax_a score(a | e, h)
► Here, e is the English and h is the Hindi sentence.
► score(a | e, h) = score_La(a | e, h) + score_S(a | e, h)

SLIDE 12

Alignment search (Discriminative Re-ranking)

► Three main steps
  • Populate the Beam
► Use local features to determine K-best alignments of source words with words in the target sentence.
  • Re-order the Beam
► Re-order the above alignments using structural features.
  • Post-processing
► Extend alignments to include other links that can be inferred using simple rules.
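The three steps above can be sketched in Python as follows. This is an illustrative sketch, not the paper's implementation: it enumerates candidate alignments exhaustively for clarity, whereas the actual search populates the beam with a priority queue, and `local_score` / `structural_score` are assumed callables.

```python
from itertools import product

def align(src, tgt, K, local_score, structural_score):
    """Three-step re-ranking search (sketch; exhaustive for clarity)."""
    # Step 1 (Populate the Beam): K-best alignments under local scores alone.
    # An alignment maps each source index j to a target index or None (NULL).
    candidates = list(product(list(range(len(tgt))) + [None], repeat=len(src)))
    local = lambda a: sum(local_score(j, k) for j, k in enumerate(a))
    beam = sorted(candidates, key=local, reverse=True)[:K]
    # Step 2 (Re-order the Beam): add structural scores and re-sort.
    beam.sort(key=lambda a: local(a) + structural_score(a), reverse=True)
    # Step 3 (Post-processing) would extend beam[0] with links inferred by
    # simple rules; omitted here.
    return beam[0]
```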

SLIDE 13

Alignment search (Discriminative Re-ranking)

[Figure: illustration of the alignment search with K = 4]

SLIDE 14

Populate the Beam

► Obtain K-best candidate alignments using local scores.
► Local score is computed by looking at the features of the individual alignment links independently.
► score_L(e_j, h_k) = W . f_L(e_j, h_k)
► score_La(a | e, h) = ∑ score_L(e_j, h_k)
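In code, the link score and its aggregate over an alignment might look like this (a sketch; representing the weight vector W and the feature function f_L as dicts is an illustrative choice, not the paper's):

```python
def score_L(W, f_L, e_j, h_k):
    """Local score of one alignment link: dot product W . f_L(e_j, h_k)."""
    return sum(W.get(name, 0.0) * value
               for name, value in f_L(e_j, h_k).items())

def score_La(W, f_L, alignment):
    """Local score of a whole alignment: sum of its links' local scores."""
    return sum(score_L(W, f_L, e_j, h_k) for e_j, h_k in alignment)
```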

SLIDE 15

Populate the Beam - 2

► Task: Populate the beam in the decreasing order of score_La(a | e, h).
► Compute the local score of each source word with every target word (including NULL).
► Top-k alignment links of each source word are chosen.

SLIDE 16

Populate the Beam - 2

► Populating K-best alignments
  • Implemented using Priority Queues.
► Initial State of Priority Queue
  • One entry representing the best alignment (set of best alignment links).
► At every iteration
  • Pop the best entry from the PQ.
  • Add its k successor entries back into the PQ.
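The priority-queue loop above can be sketched as follows. This is illustrative: `link_scores` is an assumed best-first candidate list per source word, and successor entries are generated by demoting one word to its next-best link.

```python
import heapq

def k_best_alignments(link_scores, K):
    """Enumerate the K best alignments from per-word candidate links.

    link_scores[j] is a best-first list of (score, target) pairs for source
    word j; an alignment is identified by a tuple of ranks, one per word.
    """
    n = len(link_scores)
    total = lambda ranks: sum(link_scores[j][r][0] for j, r in enumerate(ranks))
    best = (0,) * n                           # every word takes its best link
    heap = [(-total(best), best)]             # max-heap via negated scores
    seen = {best}
    results = []
    while heap and len(results) < K:
        neg, ranks = heapq.heappop(heap)      # pop the best entry from the PQ
        results.append(([link_scores[j][r][1] for j, r in enumerate(ranks)],
                        -neg))
        for j in range(n):                    # add its successor entries
            if ranks[j] + 1 < len(link_scores[j]):
                succ = ranks[:j] + (ranks[j] + 1,) + ranks[j + 1:]
                if succ not in seen:
                    seen.add(succ)
                    heapq.heappush(heap, (-total(succ), succ))
    return results
```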

SLIDE 17

Re-order the Beam

► Structural scores are now added to the local scores of the alignments in the beam in order to re-order the beam.
  • score_S(a) = W . f_S(a)
► Overall score = score_La(a) + score_S(a)
► Structural features look at properties of the entire alignment structure instead of individual alignment links.

SLIDE 18

Post-processing

► Previous two steps produce alignments which contain one-to-one and many-to-one mappings.
► Goal is to extend the best alignment structure from the previous step to include other alignment links of one-to-many/many-to-many types.
► New alignment links are added while processing source words in the breadth-first order of the dependency structure.

SLIDE 19

Post-processing

► Algorithm:
► Let w be the next word considered; pw = parent(w).
  • If w, pw are linked to one or more common words: align w to all words already aligned with pw.
  • Else, use simple target-specific rules to extend alignments of w.
► Recursively consider all the children of w.
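A sketch of this algorithm over the source dependency tree (illustrative: `children` and `extend_rules` are assumed inputs standing in for the parser output and the paper's target-specific rules):

```python
from collections import deque

def postprocess(alignment, children, root, extend_rules):
    """Extend an alignment over the source dependency tree, breadth-first.

    alignment maps each source word to a set of aligned target positions;
    children maps each source word to its dependents.
    """
    parent = {c: w for w, cs in children.items() for c in cs}
    queue = deque(children.get(root, []))
    while queue:
        w = queue.popleft()                  # next word w; pw = parent(w)
        pw = parent[w]
        if alignment[w] & alignment[pw]:     # linked to common word(s)?
            alignment[w] |= alignment[pw]    # align w to all of pw's words
        else:
            alignment[w] |= extend_rules(w, alignment[w])
        queue.extend(children.get(w, []))    # consider all children of w
    return alignment
```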

SLIDE 20

Post-processing

[Figure: post-processing example]

SLIDE 21

Outline

► Word Alignment - English-Hindi Language Pair
► Related approaches
► Discriminative Re-ranking approach
  • Features
  • Parameter optimization using MIRA
  • Results
► Future Work and Conclusion

SLIDE 22

Features - Local

► DiceWords (Taskar et al., 2005)
► DiceRoots: Lemmatized forms of e_j and h_k.
► Dict: Whether there exists an entry from source word e_j to target word h_k.
► Null(POS): Binary feature which is active when a source word with a particular part-of-speech tag is aligned with NULL.
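For reference, the Dice features reduce to the standard Dice coefficient over sentence-pair co-occurrence counts; DiceRoots is the same computation applied to lemmatized forms. A sketch (the count containers are assumed):

```python
def dice(cooc, count_e, count_h, e, h):
    """Dice coefficient for word pair (e, h): 2 * C(e, h) / (C(e) + C(h)),
    where C counts (co-)occurrences over the sentence pairs of the corpus."""
    return 2.0 * cooc.get((e, h), 0) / (count_e[e] + count_h[h])
```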

SLIDE 23

Structural Features

► Overlap
  • This feature considers the instances in a sentence pair where a source word links to a target word which is a participant in more than one alignment link.
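A minimal sketch of how such a feature could be counted (representing an alignment as a dict of source words to sets of target positions is an assumption, not the paper's data structure):

```python
from collections import Counter

def overlap(alignment):
    """Count instances where a source word links to a target word that
    participates in more than one alignment link."""
    fertility = Counter(k for links in alignment.values() for k in links)
    return sum(1 for links in alignment.values()
               for k in links if fertility[k] > 1)
```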

SLIDE 24

Structural Features

► Null Percent
  • This feature measures the percentage of words in the target sentence with zero fertility.

SLIDE 25

Structural Features

► Direction of Dependency Pair
  • Captures first-order interdependence between the alignment links connected to two source words connected by a dependency relation.
  • One way to measure such interdependence is by noting the order of the target-sentence words aligned to the child and the parent of a source-sentence dependency relation.
  • Three possible orders (next slide).
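One possible encoding of the three orders (illustrative; the label names are invented here, and the inputs are the target positions aligned to the child and parent):

```python
def dependency_pair_direction(child_pos, parent_pos):
    """Relative order, in the target sentence, of the words aligned to the
    child and the parent of a source dependency pair."""
    if child_pos < parent_pos:
        return 'child-before-parent'
    if child_pos > parent_pos:
        return 'child-after-parent'
    return 'same-word'
```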

SLIDE 26

Structural Features

► Direction of Dependency Pair
► The feature thus captures a simple divergence between the source and target dependency structures.

SLIDE 27

Outline

► Word Alignment - English-Hindi Language Pair
► Related approaches
► Discriminative Re-ranking approach
  • Features
  • Parameter optimization using MIRA
  • Results
► Future Work and Conclusion

SLIDE 28

Online large margin Training using MIRA

► For parameter optimization, we used an online large-margin algorithm called MIRA (Crammer and Singer, 2005; McDonald et al., 2005).
► If T = { (x_i, y_i) }, i = 1..m, is the gold data, where x_i is the i-th sentence pair and y_i is the corresponding gold alignment, the task is to learn the weight vector W such that:

SLIDE 29

Online large margin Training using MIRA

► For a sentence pair, the weights should be optimized in the following fashion.
► Online training algorithm:

  Minimize ||w_{i+1} - w_i||
  such that
    w . f(x_i, y_i) - w . f(x_i, y'_i) >= loss(y_i, y'_i)
  for all (x_i, y_i) ∈ T, y'_i ∈ K-best Predictions(x_i)
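A single-constraint MIRA step consistent with the constraint above can be sketched as follows (illustrative: sparse vectors as dicts, and a cap C on the step size are assumptions not specified on the slide):

```python
def mira_update(w, f_gold, f_pred, loss, C=1.0):
    """One MIRA step for a single K-best prediction: change w minimally so
    the gold alignment outscores the prediction by at least its loss."""
    keys = set(f_gold) | set(f_pred)
    diff = {k: f_gold.get(k, 0.0) - f_pred.get(k, 0.0) for k in keys}
    margin = sum(w.get(k, 0.0) * v for k, v in diff.items())
    norm_sq = sum(v * v for v in diff.values())
    if norm_sq == 0.0:
        return dict(w)                       # identical features: no update
    tau = min(C, max(0.0, (loss - margin) / norm_sq))
    new_w = dict(w)
    for k, v in diff.items():
        new_w[k] = new_w.get(k, 0.0) + tau * v
    return new_w
```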

SLIDE 30

Outline

► Word Alignment - English-Hindi Language Pair
► Related approaches
► Discriminative Re-ranking approach
  • Features
  • Parameter optimization using MIRA
  • Results
► Future Work and Conclusion

SLIDE 31

Data

► Unsupervised data: 50,000 sentence pairs
► Supervised data
  • Training: 4252 sentence pairs
  • Testing: 100 sentence pairs

SLIDE 32

GIZA++ results

[Table: GIZA++ baseline results]

SLIDE 33

Results using local features

Features                 Precision  Recall  F-measure  AER
Dicewords + Diceroots    41.49      38.71   40.05      59.95
+ Null_POS               42.82      38.29   40.43      59.57
+ Dict                   43.94      39.30   41.49      58.51
+ Word pairs             46.27      41.07   43.52      56.48

SLIDE 34

Results after adding Global features

Features                          Precision  Recall  F-measure  AER
Local feats.                      46.27      41.07   43.52      56.48
Local feats. + Overlap            48.17      42.76   45.30      54.70
Local feats. + Direct_Deppair     47.93      42.55   45.08      54.92
Local feats. + All struct. feats  48.81      43.31   45.90      54.10

SLIDE 35

Adding structural features to Giza transition probabilities

Features                          Precision  Recall  F-measure  AER
IBM Model-4 Pars. + Local feats.  48.85      43.98   46.29      52.71
Local feats. + All struct. feats  48.95      50.06   49.50      50.50

SLIDE 36

Outline

► Word Alignment - English-Hindi Language Pair
► Related approaches
► Discriminative Re-ranking approach
  • Features
  • Parameter optimization using MIRA
  • Results
► Future Work and Conclusion

SLIDE 37

Future work

► Experiment with more sophisticated structural features.
► Design a transducer (dependency based) which uses parameter weights learnt by our approach and the LM.

SLIDE 38

Future work

► Merge the two alignment search steps to make better use of structural features.

SLIDE 39

THANK YOU

Questions and Suggestions?