Machine Translation and Sequence-to-sequence Models


Graham Neubig, Carnegie Mellon University, CS 11-731
http://phontron.com/class/mtandseq2seq2018/


What is Machine Translation?

kare wa ringo wo tabeta . → He ate an apple .


What are Sequence-to-sequence Models?

Sequence-to-sequence models map an input sequence to an output sequence:

Machine translation: kare wa ringo wo tabeta → he ate an apple
Tagging: he ate an apple → PRN VBD DET PP
Dialog: he ate an apple → good, he needs to slim down
Speech recognition: (speech) → he ate an apple
And just about anything...: 1010000111101 → 00011010001101


Why MT as a Representative?

Useful!

"Global MT Market Expected To Reach $983.3 Million by 2022"
(Sources: The Register; Grand View Research)

Imperfect...


MT and Machine Learning

• Big Data! Billions of words for major languages… but little for others
• Well-defined, Difficult Problem! A good testbed for algorithms, math, etc.
• Algorithms Widely Applicable!


MT and Linguistics

트레이나 베이커는 좋은 사람이니까요

Machine output: "Baker yinikkayo tray or a good man"
Correct output: Trina Baker is a good person

• Morphology! 이니까요 is a variant of 이다 (to be)
• Syntax! Should keep the subject together
• Semantics! "Trina" is probably not a man...
• … and so much more!


Class Organization


Class Format

  • Before class:
  • Read the assigned material
  • Ask questions via web (piazza/email)
  • In class:
  • Take a small quiz about material
  • Discussion, questions, elaboration
  • Pseudo-code walk

Assignments

  • Assignment 1: Create a neural sequence-to-sequence modeling system. Turn in code to run it, and write a report.
  • Assignment 2: Create a system for a challenge task, to be decided in class.
  • Final project: Come up with an interesting new idea and test it.


Assignment Instructions

  • Work in groups of 2-3.
  • Use a shared git repository, commit the code that you write, and note in reports who did which part of the project.
  • All implementations must be basically your own, although you can use small code snippets.
  • We recommend implementing in Python, using DyNet or PyTorch as your neural network library.

Class Grading

  • Short quizzes: 20%
  • Assignment 1: 20%
  • Assignment 2: 20%
  • Final Project: 40%

Class Plan

  • 1. Introduction (Today): 1 class
  • 2. Language Models: 3 classes
  • 3. Neural MT: 3 classes
  • 4. Evaluation/Analysis: 2 classes
  • 5. Applications: 2 classes
  • 6. Symbolic MT: 3 classes
  • 7. Advanced Topics: 11 classes
  • 8. Final Project Presentations: 2 classes

Guest Lectures

  • Bob Frederking (9/13): Rule/Knowledge-based Translation
  • Bhiksha Raj (11/27): Speech Applications


Models for Machine Translation


Machine Learning for Machine Translation

F = kare wa ringo wo tabeta .
E = He ate an apple .

Probability model: P(E|F;Θ), where Θ are the model parameters


Problems in MT

  • Modeling: How do we define P(E|F;Θ)?
  • Learning: How do we learn Θ?
  • Search: Given F, how do we find the highest-scoring translation E' = argmax_E P(E|F;Θ)? (A toy sketch follows.)
  • Evaluation: Given E' and a human reference E, how do we determine how good E' is?
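As a toy illustration of the search problem only, the sketch below picks E' = argmax_E P(E|F;Θ) from a small candidate list. The stand-in model, the candidates, and the scoring rule are all hypothetical; a real P(E|F;Θ) is learned from data.

```python
# Toy sketch of search: E' = argmax_E P(E|F; theta) over a candidate list.
# model_prob is a hypothetical stand-in, not a real translation model.
def model_prob(e, f):
    # Pretend that translations close in length to the source are more probable.
    return 1.0 / (1 + abs(len(e.split()) - len(f.split())))

f = "kare wa ringo wo tabeta"   # punctuation dropped for the toy example
candidates = ["he ate an apple .", "he ate .", "he ate an apple in the park ."]
e_best = max(candidates, key=lambda e: model_prob(e, f))
print(e_best)   # "he ate an apple ." (closest in length to the source)
```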


Part 1: Neural Models


Language Models 1: n-gram Language Models

Given multiple candidates, which is most likely as an English sentence?

E1 = he ate an apple
E2 = he ate an apples
E3 = he insulted an apple
E4 = preliminary orange orange

  • Definition of language modeling
  • Count-based n-gram language models
  • Evaluating language models
  • Code Example: n-gram language model (a minimal sketch follows)
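As a concrete starting point, here is a minimal count-based bigram language model with add-one smoothing. The two-sentence corpus is invented for illustration; real LMs train on far more data with better smoothing (e.g., Kneser-Ney).

```python
# Minimal count-based bigram LM with add-one (Laplace) smoothing.
import math
from collections import Counter

corpus = [["he", "ate", "an", "apple"], ["he", "ate", "an", "orange"]]  # toy data

unigrams, bigrams, vocab = Counter(), Counter(), set()
for sent in corpus:
    words = ["<s>"] + sent + ["</s>"]
    vocab.update(words)
    unigrams.update(words[:-1])                 # context counts
    bigrams.update(zip(words[:-1], words[1:]))  # bigram counts

def bigram_prob(prev, word):
    # Add-one smoothing gives unseen bigrams a small nonzero probability.
    return (bigrams[(prev, word)] + 1) / (unigrams[prev] + len(vocab))

def sentence_logprob(sent):
    words = ["<s>"] + sent + ["</s>"]
    return sum(math.log(bigram_prob(p, w)) for p, w in zip(words[:-1], words[1:]))

print(sentence_logprob("he ate an apple".split()))            # relatively high
print(sentence_logprob("preliminary orange orange".split()))  # much lower
```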

Language Models 2: Log-linear/Feed-forward Language Models

  • Log-linear/feed-forward language models
  • Stochastic gradient descent and mini-batching
  • Features for language modeling
  • Implement: Feed-forward language model (a minimal sketch follows)

(Figure: the score vector s is the sum of context-word weight vectors, e.g. w_{2,giving} and w_{1,a}, plus a bias b, each a vector over the vocabulary: a, the, talk, gift, hat, ...)
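To make the figure concrete, here is a minimal log-linear LM sketch: the score for each possible next word is the sum of weight vectors selected by the context words, plus a bias, followed by a softmax. The vocabulary, dimensions, and random weights are toy assumptions.

```python
# Minimal log-linear LM sketch: s = w_{2,ctx} + w_{1,ctx} + b, then softmax.
import numpy as np

vocab = ["giving", "a", "the", "talk", "gift", "hat"]   # toy vocabulary
V = len(vocab)
rng = np.random.default_rng(0)
W1 = rng.normal(size=(V, V))   # one weight vector per possible previous word
W2 = rng.normal(size=(V, V))   # one weight vector per word two positions back
b = np.zeros(V)

def next_word_probs(w2, w1):
    s = W2[vocab.index(w2)] + W1[vocab.index(w1)] + b   # log-linear scores
    e = np.exp(s - s.max())                             # numerically stable softmax
    return e / e.sum()

probs = next_word_probs("giving", "a")   # P(next word | context "giving a")
```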


Language Models 3: Recurrent LMs

  • Recurrent neural networks
  • Vanishing Gradient and LSTMs/GRUs
  • Regularization and dropout
  • Implement: Recurrent neural network LM (a minimal sketch follows)

(Figure: a recurrent LM unrolled over "<s> this is a pen </s>")
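Below is a minimal sketch of one step of a plain (Elman) recurrent LM; the LSTMs and GRUs covered in this unit replace the tanh cell with gated updates. Sizes and random weights are toy assumptions, so the probabilities are meaningless until trained.

```python
# One step of a plain recurrent LM: new hidden state from the previous state
# and the current word, then a softmax over the vocabulary for the next word.
import numpy as np

V, H = 1000, 64                        # toy vocabulary and hidden sizes
rng = np.random.default_rng(0)
E = rng.normal(0, 0.1, (V, H))         # word embeddings
W_hh = rng.normal(0, 0.1, (H, H))      # recurrent weights
W_hy = rng.normal(0, 0.1, (H, V))      # output projection

def rnn_step(h_prev, word_id):
    h = np.tanh(E[word_id] + h_prev @ W_hh)    # update the hidden state
    s = h @ W_hy                               # scores over the vocabulary
    p = np.exp(s - s.max()); p /= p.sum()      # softmax
    return h, p

h = np.zeros(H)
for w in [0, 17, 42]:      # hypothetical word ids for "<s> this is"
    h, p = rnn_step(h, w)
```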


Neural MT 1: Encoder-decoder Models

(Figure: an encoder reads "this is a pen </s>" and a decoder generates "kore wa pen desu </s>")

  • Encoder-decoder Models
  • Searching for hypotheses
  • Mini-batched training
  • Implement: Encoder-decoder model (a minimal sketch follows)
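A minimal encoder-decoder sketch, assuming plain RNN cells and greedy search: the encoder's final hidden state initializes the decoder, which emits one target word at a time. All ids, sizes, and weights are toy assumptions (untrained, so the output is arbitrary).

```python
# Minimal RNN encoder-decoder with greedy decoding (toy, untrained weights).
import numpy as np

V, H = 1000, 64
rng = np.random.default_rng(0)
E_src, E_trg = rng.normal(0, .1, (V, H)), rng.normal(0, .1, (V, H))
W_enc, W_dec = rng.normal(0, .1, (H, H)), rng.normal(0, .1, (H, H))
W_out = rng.normal(0, .1, (H, V))

def encode(src_ids):
    h = np.zeros(H)
    for w in src_ids:                     # read the source left to right
        h = np.tanh(E_src[w] + h @ W_enc)
    return h

def greedy_decode(h, bos=0, eos=1, max_len=10):   # assumed special token ids
    out, w = [], bos
    for _ in range(max_len):
        h = np.tanh(E_trg[w] + h @ W_dec)
        w = int(np.argmax(h @ W_out))     # greedy: take the best-scoring word
        if w == eos:
            break
        out.append(w)
    return out

print(greedy_decode(encode([5, 8, 13, 21])))   # hypothetical source word ids
```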

Neural MT 2: Attentional Models

  • Attention in its various varieties
  • Unknown word replacement
  • Attention improvements, coverage models
  • Implement: Attentional model (a minimal sketch follows)

(Figure: attention weights a1, ..., a4 over encoder states g1, ..., g4 combine with the decoder state to compute P(ei | F, e1, ..., ei-1); the example source is "kouen wo okonaimasu </s>")
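The core computation can be sketched in a few lines: dot-product attention scores each encoder state against the current decoder state, softmaxes the scores into weights a1, ..., a4, and returns their weighted sum as a context vector. Shapes are toy assumptions; the variants covered in this unit mostly change how the scores are computed.

```python
# Minimal dot-product attention over encoder states.
import numpy as np

def attend(g, h):
    scores = g @ h                                     # one score per source position
    a = np.exp(scores - scores.max()); a /= a.sum()    # attention weights (sum to 1)
    return a @ g, a                                    # context vector, weights

rng = np.random.default_rng(0)
g = rng.normal(size=(4, 8))    # 4 encoder states, hidden size 8 (toy)
h = rng.normal(size=8)         # current decoder state
context, a = attend(g, h)
```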


Neural MT 3: Self-attention, CNNs

  • Self-attention
  • Convolutional neural networks
  • A case study: the Transformer
  • Implement: Self-attentional models (a minimal sketch follows)
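As a preview, here is a minimal single-head self-attention sketch (the core operation of the Transformer): every position attends to every position in the same sequence. Dimensions and random weights are toy assumptions; real models add multiple heads, masking, positional encodings, and feed-forward layers.

```python
# Minimal single-head scaled dot-product self-attention.
import numpy as np

def self_attention(X, Wq, Wk, Wv):
    Q, K, V = X @ Wq, X @ Wk, X @ Wv
    scores = Q @ K.T / np.sqrt(K.shape[1])           # scaled dot products
    A = np.exp(scores - scores.max(axis=1, keepdims=True))
    A /= A.sum(axis=1, keepdims=True)                # row-wise softmax
    return A @ V                                     # each position mixes all others

rng = np.random.default_rng(0)
X = rng.normal(size=(5, 16))                         # 5 tokens, dimension 16 (toy)
Wq, Wk, Wv = (rng.normal(size=(16, 16)) for _ in range(3))
out = self_attention(X, Wq, Wk, Wv)
```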

Data and Evaluation


Data/Evaluation 1a: Creating Data

  • Preprocessing
  • Document harvesting and crowdsourcing
  • Other tasks: dialog, captioning
  • Implement: Find/preprocess data

Data/Evaluation 1b: Evaluation

Source: taro ga hanako wo otozureta

                                   Adequate?  Fluent?
  A: Taro visited Hanako               ○         ○
  B: the Taro visited the Hanako       ○         ☓
  C: Hanako visited Taro               ☓         ○

Better? B, C  C

  • Human evaluation
  • Automatic evaluation
  • Significance tests and meta-evaluation
  • Implement: BLEU and measure correlation (a minimal sketch follows)
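Here is a minimal single-sentence BLEU sketch: clipped n-gram precisions for n = 1..4, geometrically averaged, times a brevity penalty. Real BLEU is computed at the corpus level and sentence-level variants need smoothing; the tiny floor constant below is an ad-hoc stand-in for that.

```python
# Minimal sentence-level BLEU: clipped n-gram precision + brevity penalty.
import math
from collections import Counter

def ngrams(words, n):
    return Counter(tuple(words[i:i + n]) for i in range(len(words) - n + 1))

def bleu(hyp, ref, max_n=4):
    log_prec = 0.0
    for n in range(1, max_n + 1):
        h, r = ngrams(hyp, n), ngrams(ref, n)
        match = sum(min(c, r[g]) for g, c in h.items())   # clip by reference counts
        total = max(sum(h.values()), 1)
        log_prec += math.log(max(match, 1e-9) / total) / max_n
    bp = min(1.0, math.exp(1 - len(ref) / len(hyp)))      # brevity penalty
    return bp * math.exp(log_prec)

ref = "taro visited hanako".split()
print(bleu("taro visited hanako".split(), ref))           # 1.0 (exact match)
print(bleu("the taro visited the hanako".split(), ref))   # near 0: n-grams break down
```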

Data/Evaluation 2: Analysis and Interpretation

  • Analyzing results
  • Visualization of neural MT models
  • Implement: Visualization of results

Application Examples


Applications 1: Summarization and Data-to-text Generation

  • Generating shorter summaries of long texts
  • Generating written summaries of data
  • Necessary improvements to models
  • Implement: Summarization model

Input: "President Trump said Monday that the United States and Mexico had reached agreement to revise key portions of the North American Free Trade Agreement and would finalize it within days, suggesting he was ready to jettison Canada from the trilateral trade pact if the country did not get on board quickly."

Summary: "Trump Says Nafta Deal Reached Between U.S. and Mexico"


Applications 2: Dialog

  • Models for dialogs
  • Ensuring diversity in outputs
  • Coherence in generation
  • Implement: Dialog generation

he ate an apple → good, he needs to slim down


Symbolic Translation Models


Symbolic Methods 1: Word Alignment

  • The IBM/HMM models
  • The EM algorithm
  • Finding word alignments
  • Implement: Word alignment (a minimal EM sketch follows)

太郎 が 花子 を 訪問 した 。 ↔ taro visited hanako .
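Here is a minimal IBM Model 1 EM sketch that learns word translation probabilities t(f|e) from a few toy sentence pairs; the data are invented (and chosen to break symmetry), and real training runs over large corpora with NULL alignment and more iterations.

```python
# Minimal IBM Model 1 EM: learn t(f|e) from parallel sentence pairs (toy data).
from collections import defaultdict

pairs = [(["taro", "visited", "hanako"], ["太郎", "訪問", "花子"]),
         (["taro"], ["太郎"]),
         (["visited"], ["訪問"])]

t = defaultdict(lambda: 1.0)     # uniform-ish initialization; EM normalizes
for _ in range(10):              # EM iterations
    count, total = defaultdict(float), defaultdict(float)
    for es, fs in pairs:
        for f in fs:
            z = sum(t[(f, e)] for e in es)    # E-step: soft alignment weights
            for e in es:
                c = t[(f, e)] / z
                count[(f, e)] += c
                total[e] += c
    for (f, e), c in count.items():           # M-step: re-estimate t(f|e)
        t[(f, e)] = c / total[e]

print(t[("太郎", "taro")])   # grows toward 1.0 as EM converges
```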


Symbolic Methods 2: Monotonic Transduction and FSTs

  • Models for sequence transduction
  • The Viterbi algorithm
  • Weighted finite-state transducers
  • Implement: A part-of-speech tagger (a minimal Viterbi sketch follows)

he ate an apple → PRN VBD DET PP
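A minimal Viterbi sketch for an HMM-style tagger is below. The transition and emission log-scores are hand-set toy values (matching the slide's tag sequence), not trained parameters.

```python
# Minimal Viterbi decoding for POS tagging with toy log-scores.
tags = ["PRN", "VBD", "DET", "PP"]

def viterbi(words, trans, emit, start, default=-9.0):
    # best[t] = (log-score of the best path ending in tag t, the path itself)
    best = {t: (start.get(t, default) + emit.get((t, words[0]), default), [t])
            for t in tags}
    for w in words[1:]:
        new = {}
        for t in tags:
            s, path = max(((best[p][0] + trans.get((p, t), default), best[p][1])
                           for p in tags), key=lambda x: x[0])
            new[t] = (s + emit.get((t, w), default), path + [t])
        best = new
    return max(best.values(), key=lambda x: x[0])[1]

trans = {("PRN", "VBD"): -0.1, ("VBD", "DET"): -0.1, ("DET", "PP"): -0.1}
emit = {("PRN", "he"): -0.1, ("VBD", "ate"): -0.1,
        ("DET", "an"): -0.1, ("PP", "apple"): -0.1}
print(viterbi("he ate an apple".split(), trans, emit, {"PRN": -0.1}))
# -> ['PRN', 'VBD', 'DET', 'PP']
```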


Symbolic Methods 3: Phrase-based MT

E = I will give a talk at CMU .
F = watashi wa CMU de kouen wo okonaimasu .

Phrase pairs: watashi wa ↔ I, CMU de ↔ at CMU, kouen wo ↔ a talk, okonaimasu ↔ will give, . ↔ .

  • Phrase extraction and scoring
  • Reordering models
  • Phrase-based decoding
  • Implement: Phrase extraction or … (a toy extraction sketch follows)
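Below is a toy phrase-extraction sketch: for each source span it finds the minimal aligned target span, and keeps the pair if no alignment link crosses out of it. The word alignment is hand-made to match the example above.

```python
# Toy consistent-phrase-pair extraction from a word alignment.
def extract_phrases(align, src_len, max_len=4):
    pairs = []
    for fs in range(src_len):
        for fe in range(fs, min(fs + max_len, src_len)):
            linked = [e for f, e in align if fs <= f <= fe]
            if not linked:
                continue
            es, ee = min(linked), max(linked)
            # consistent iff every link into [es, ee] comes from [fs, fe]
            if all(fs <= f <= fe for f, e in align if es <= e <= ee):
                pairs.append(((fs, fe), (es, ee)))
    return pairs

# F = watashi wa CMU de kouen wo okonaimasu .  (source positions 0-7)
# E = I will give a talk at CMU .              (target positions 0-7)
align = [(0, 0), (2, 6), (3, 5), (4, 4), (5, 3), (6, 1), (6, 2), (7, 7)]
for (fs, fe), (es, ee) in extract_phrases(align, src_len=8):
    print(f"F[{fs}:{fe}] <-> E[{es}:{ee}]")   # includes CMU de <-> at CMU, etc.
```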

Advanced Topics


Tree-based MT

  • Graphs and hyper-graphs
  • Synchronous grammars
  • Tree structure in neural models
  • Implement: Tree-structured encoder (a toy sketch follows)

(Figure: a synchronous grammar derivation translating "CMU de kouen wo okonaimasu" into "give a talk at CMU")
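As a small taste of tree structure in neural models, here is a toy tree-structured encoder that composes child vectors bottom-up with one shared matrix; the tree, vocabulary, and random weights are assumptions for illustration only.

```python
# Toy recursive (tree-structured) encoder over a binary tree.
import numpy as np

rng = np.random.default_rng(0)
H = 8
W = rng.normal(0, 0.3, (2 * H, H))    # combines two child vectors into a parent
embed = {w: rng.normal(size=H) for w in ["give", "a", "talk", "at", "CMU"]}

def encode(tree):
    if isinstance(tree, str):          # leaf: look up the word embedding
        return embed[tree]
    left, right = tree                 # internal node: binary children assumed
    return np.tanh(np.concatenate([encode(left), encode(right)]) @ W)

vec = encode(("give", (("a", "talk"), ("at", "CMU"))))   # encodes the whole phrase
```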


Parameter Optimization

  • Loss functions
  • Deciding the hypothesis space
  • Optimization criteria
  • Implement: Optimization of NMT or PBMT

Weighted linear model with weights (LM, TM, RM) = (0.2, 0.3, 0.5):

                                  LM  TM  RM  weighted score
  ○ Taro visited Hanako            4   3   1      2.2
  ☓ the Taro visited the Hanako    5   4   1      2.7  ← highest
  ☓ Hanako visited Taro            2   3   2      2.3

With these weights the wrong hypothesis scores highest, so the weights must be optimized.
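The table above is just a dot product; the sketch below reproduces it and shows that, with these weights, the wrong hypothesis wins, which is exactly what parameter optimization must fix.

```python
# Weighted linear model: score = weights · (LM, TM, RM) feature vector.
hyps = {"Taro visited Hanako":         (4, 3, 1),
        "the Taro visited the Hanako": (5, 4, 1),
        "Hanako visited Taro":         (2, 3, 2)}
weights = (0.2, 0.3, 0.5)

def score(features):
    return sum(w * f for w, f in zip(weights, features))

best = max(hyps, key=lambda h: score(hyps[h]))
print(best, score(hyps[best]))   # the wrong hypothesis wins with 2.7
```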


Incorporating External Knowledge into NMT

  • Symbolic models with neural components
  • Neural models with symbolic components
  • Implement: Lexicons in NMT or neural feature functions

Lexicon example: watashi wa → I, CMU de → at CMU


Subword Models

  • Character models
  • Subword models
  • Morphology models
  • Implement: Subword splitting (a minimal BPE sketch follows)

reconstructed → re+ construct+ ed
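Here is a minimal byte-pair encoding (BPE) learning sketch, one common subword-splitting method: repeatedly merge the most frequent adjacent symbol pair. The word counts are toy assumptions.

```python
# Minimal BPE: learn merges by repeatedly joining the most frequent pair.
from collections import Counter

vocab = {tuple("reconstructed"): 3,    # toy word counts, split into characters
         tuple("construct"): 5,
         tuple("redo"): 2}

def learn_bpe(vocab, num_merges=10):
    merges = []
    for _ in range(num_merges):
        pairs = Counter()
        for word, n in vocab.items():
            for a, b in zip(word, word[1:]):
                pairs[(a, b)] += n
        if not pairs:
            break
        (a, b), _ = pairs.most_common(1)[0]    # most frequent adjacent pair
        merges.append((a, b))
        new_vocab = {}
        for word, n in vocab.items():          # apply the merge everywhere
            out, i = [], 0
            while i < len(word):
                if i + 1 < len(word) and (word[i], word[i + 1]) == (a, b):
                    out.append(a + b); i += 2
                else:
                    out.append(word[i]); i += 1
            new_vocab[tuple(out)] = n
        vocab = new_vocab
    return merges, vocab

merges, vocab = learn_bpe(vocab)
print(merges[:5])    # the first learned merges (frequent pieces emerge)
```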


Multi-lingual and Multi-task Learning

  • Learning for multiple tasks
  • Learning for multiple languages
  • Implement: A multi-lingual neural system

hello ↔ こんにちは ↔ hola


Adaptation/Transfer Learning

  • Domain adaptation
  • Cross-task adaptation
  • Implement: Adaptation methods

General Model → Domain Model


Ensembling/System Combination

  • Ensembles and distillation
  • Post-hoc hypothesis combination
  • Reranking
  • Implement: Ensembled decoding (a minimal sketch follows)

Model 1 + Model 2 → combined prediction
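At decoding time, the simplest ensemble linearly interpolates the models' next-word distributions at each step; the two toy distributions below stand in for real model outputs.

```python
# Minimal ensembling sketch: average next-word distributions from two models.
import numpy as np

def ensemble(dists):
    return np.mean(dists, axis=0)    # linear interpolation of predictions

p1 = np.array([0.6, 0.3, 0.1])       # model 1's next-word distribution (toy)
p2 = np.array([0.2, 0.7, 0.1])       # model 2's next-word distribution (toy)
print(ensemble([p1, p2]))            # [0.4, 0.5, 0.1]
```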


For Next Class


Homework

  • Read the n-gram language modeling materials
  • Get the software working on your machine so you can follow along with the code walks
  • By Thursday 1/19: Python
  • By Tuesday 1/24: DyNet neural net library (use of DyNet is not mandatory for assignments, but examples will be in DyNet)