
SLIDE 1

Exploiting Cross-Sentence Context for Neural Machine Translation

Longyue Wang♥ Zhaopeng Tu♠ Andy Way♥ Qun Liu♥

♥ ADAPT Centre, Dublin City University ♠ Tencent AI Lab

SLIDE 2

Motivation

  • The majority of NMT models are sentence-level

[Figure: standard attention-based NMT over a single sentence; example source 这是 一个 生态 网络 。 <eos> ("this is an ecological network"), with attention weights over source positions and decoder states s_0 … s_T]

SLIDE 3

Motivation

  • The continuous vector representation of a symbol encodes multiple dimensions of similarity (Choi et al., 2016).

    Word       Axis   Nearest Neighbours
    notebook   1      diary, notebooks (notebook), sketchbook, jottings
               2      palmtop, notebooks (notebook), ipaq, laptop
    power      1      powers, authority (power), powerbase, sovereignity
               2      powers, electrohydraulic, microwatts, hydel (power)

SLIDE 4

Motivation

  • The continuous vector representation of a symbol encodes multiple dimensions of similarity.

  • Consistency is another critical issue in document-level translation.

    Past      那么 在 这个 问题 上 , 伊朗 的 …             well, on this issue , iran has a relatively …
    Past      在 任内 解决 伊朗 核 问题 , 不管是 用 和平 …   to resolve the iranian nuclear issue in his term , …
    Current   那 刚刚 提到 这个 … 谈判 的 问题 。            that just mentioned the issue of the talks …

SLIDE 5

Motivation

  • The cross-sentence context has proven helpful for the aforementioned two problems in multiple sequential tasks (Sordoni et al., 2015; Vinyals and Le, 2015; Serban et al., 2016).

SLIDE 6

Motivation

  • The cross-sentence context has proven helpful for the aforementioned two problems in multiple sequential tasks (Sordoni et al., 2015; Vinyals and Le, 2015; Serban et al., 2016).

  • However, it has received relatively little attention from the NMT research community.

SLIDE 7

Data and Setting

  • Chinese-English translation task
  • Training data: 1M sentence pairs from LDC corpora that contain document information
  • Tuning: NIST MT05; Test: NIST MT06 and MT08
  • Built on top of Nematus (https://github.com/EdinburghNLP/nematus)
  • Vocabulary size: 35K for both languages
  • Word embedding: 600; Hidden size: 1000
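For reference, these settings can be collected in one place. A minimal sketch; the key names below are illustrative and are not Nematus's actual configuration options:

```python
# Hypothetical configuration mirroring the settings on this slide.
# Key names are illustrative; they are not Nematus's actual options.
config = {
    "source_lang": "zh",
    "target_lang": "en",
    "train_pairs": 1_000_000,             # LDC sentence pairs with document information
    "dev_set": "NIST MT05",
    "test_sets": ["NIST MT06", "NIST MT08"],
    "vocab_size": 35_000,                 # for both languages
    "dim_word": 600,                      # word embedding size
    "dim_hidden": 1000,                   # RNN hidden size
}
```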
SLIDE 8

Approach

  • Use a hierarchical RNN to summarize the previous M source sentences (see the sketch below)

[Figure: hierarchical summarizer — a word-level RNN encodes each previous source sentence; a sentence-level RNN over the resulting sentence vectors produces the cross-sentence context]
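A minimal PyTorch sketch of this two-level summarizer, assuming GRU cells and batch size 1; the paper's implementation builds on Nematus (Theano), so the class and variable names here are illustrative:

```python
import torch
import torch.nn as nn

class CrossSentenceSummarizer(nn.Module):
    """Hierarchical RNN: a word-level GRU encodes each of the previous M
    source sentences into a vector; a sentence-level GRU runs over those
    vectors, and its final state serves as the cross-sentence context D."""

    def __init__(self, vocab_size=35_000, emb_dim=600, hid_dim=1000):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb_dim)
        self.word_rnn = nn.GRU(emb_dim, hid_dim, batch_first=True)
        self.sent_rnn = nn.GRU(hid_dim, hid_dim, batch_first=True)

    def forward(self, history):
        # history: list of M LongTensors, each of shape (1, T_m) holding word ids
        sent_vecs = []
        for sent in history:
            _, h = self.word_rnn(self.embed(sent))  # h: (1, 1, hid_dim)
            sent_vecs.append(h.squeeze(0))          # (1, hid_dim)
        sents = torch.stack(sent_vecs, dim=1)       # (1, M, hid_dim)
        _, D = self.sent_rnn(sents)                 # D: (1, 1, hid_dim)
        return D.squeeze(0)                         # (1, hid_dim)
```

The final sentence-level state is the single vector D consumed by the two strategies that follow.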

SLIDE 9

Approach

  • Strategy I: Initialization — Encoder

[Figure: the cross-sentence context initializes the encoder's hidden state; example source 这是 一个 生态 网络 。 <eos>]

SLIDE 10

Approach

  • Strategy I: Initialization — Decoder

[Figure: the cross-sentence context initializes the decoder's hidden state s_0; decoder states s_0 … s_T]

SLIDE 11

Approach

  • Strategy I: Initialization — Both (sketched below)

[Figure: the cross-sentence context initializes both the encoder and the decoder]
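A minimal sketch of Strategy I under the same assumptions: D is mapped through a learned projection and a tanh nonlinearity to give the initial encoder state, the initial decoder state, or both (projection names are illustrative):

```python
import torch
import torch.nn as nn

class ContextInitializer(nn.Module):
    """Strategy I: turn the cross-sentence context D into initial
    hidden states for the encoder and/or the decoder."""

    def __init__(self, hid_dim=1000, mode="both"):
        super().__init__()
        assert mode in ("encoder", "decoder", "both")
        self.mode = mode
        self.enc_proj = nn.Linear(hid_dim, hid_dim)  # D -> encoder h_0
        self.dec_proj = nn.Linear(hid_dim, hid_dim)  # D -> decoder s_0

    def forward(self, D):
        # D: (batch, hid_dim) summary of the previous source sentences
        h0 = torch.tanh(self.enc_proj(D)) if self.mode in ("encoder", "both") else None
        s0 = torch.tanh(self.dec_proj(D)) if self.mode in ("decoder", "both") else None
        return h0, s0
```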

SLIDE 12

Results

  • Impact of components

    System       BLEU
    Baseline     30.57
    +Init_Enc    31.55
    +Init_Dec    31.90
    +Init_Both   32.00

SLIDE 13

Approach

  • Strategy II: Auxiliary Context

[Figure: the decoder attends over the current source sentence (intra-sentence context c_t) and additionally consumes the cross-sentence context at each step; example source 这是 一个 生态 网络 。 <eos>]

SLIDE 14

Approach

  • Strategy II: Auxiliary Context — three decoder variants (see the sketch below):

    (a) standard decoder:                       s_t = f(s_{t-1}, y_{t-1}, c_t)
    (b) decoder with auxiliary context:         s_t = f(s_{t-1}, y_{t-1}, c_t, D)
    (c) decoder with gating auxiliary context:  s_t = f(s_{t-1}, y_{t-1}, c_t, z_t ⊗ D),
        where the gate z_t is computed from s_{t-1}, y_{t-1} and D
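A minimal sketch of variant (c), assuming a GRU transition: the gate z_t is computed from s_{t-1}, y_{t-1} and D, and the gated context z_t ⊗ D enters the recurrent update together with y_{t-1} and the attention context c_t (layer shapes and names are illustrative):

```python
import torch
import torch.nn as nn

class GatedAuxContextDecoderCell(nn.Module):
    """One decoder step with gating auxiliary context, variant (c):
    a sigmoid gate z_t decides how much of the cross-sentence context D
    flows into the GRU transition at this step."""

    def __init__(self, emb_dim=600, hid_dim=1000):
        super().__init__()
        self.gate = nn.Linear(hid_dim + emb_dim + hid_dim, hid_dim)
        self.cell = nn.GRUCell(emb_dim + hid_dim + hid_dim, hid_dim)

    def forward(self, y_prev, c_t, D, s_prev):
        # y_prev: (B, emb_dim)  embedding of the previous target word
        # c_t:    (B, hid_dim)  intra-sentence attention context
        # D:      (B, hid_dim)  cross-sentence context
        # s_prev: (B, hid_dim)  previous decoder state
        z_t = torch.sigmoid(self.gate(torch.cat([s_prev, y_prev, D], dim=-1)))
        s_t = self.cell(torch.cat([y_prev, c_t, z_t * D], dim=-1), s_prev)
        return s_t
```

Feeding D directly without the gate gives variant (b); dropping D entirely recovers the standard decoder (a).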

SLIDE 15

Results

  • Impact of components

    System              BLEU
    Baseline            30.57
    +Aux. Ctx.          31.30
    +Gating Aux. Ctx.   32.24

SLIDE 16

Approach

  • Initialization + Gating Auxiliary Context (both strategies combined; see the sketch below)

[Figure: the cross-sentence context both initializes the encoder/decoder and feeds the gated auxiliary input at each decoding step; example source 这是 一个 生态 网络 。 <eos>, with intra-sentence context c_t]
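Wiring the two strategies together, reusing the hypothetical classes sketched on the earlier slides (shapes follow the 600/1000 settings from the Data and Setting slide):

```python
import torch

summarizer   = CrossSentenceSummarizer()          # slide 8
initializer  = ContextInitializer(mode="both")    # slides 9-11
decoder_step = GatedAuxContextDecoderCell()       # slide 14

history = [torch.randint(0, 35_000, (1, 8)) for _ in range(3)]  # 3 previous sentences
D = summarizer(history)                     # cross-sentence context, (1, 1000)
h0_enc, s_prev = initializer(D)             # Strategy I: initialize encoder and decoder
y_prev = torch.zeros(1, 600)                # embedding of the start-of-sentence token
c_t = torch.zeros(1, 1000)                  # attention context from the encoder (placeholder)
s_t = decoder_step(y_prev, c_t, D, s_prev)  # Strategy II: one gated decoder step
```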

SLIDE 17

Results

  • Impact of components

    System              BLEU
    Baseline            30.57
    +Init_Both          32.00
    +Gating Aux. Ctx.   32.24
    +Both               32.67

SLIDE 18

Analysis

  • Translation error statistics

    Errors   Ambiguity   Inconsistency   All
    Total        38           32          70
    Fixed        29           24          53
    New           7            8          15

SLIDE 19

Analysis

  • Case Study

    Hist.   [Chinese context sentence; unrecoverable in extraction]
    Input   [Chinese input sentence; unrecoverable in extraction]
    Ref.    Can it inhibit and deter corrupt officials?
    NMT     Can we contain and deter the enemy?
    Our     Can it contain and deter the corrupt officials?
SLIDE 20

Summary

  • We propose using a hierarchical RNN (HRNN) to summarize the previous source sentences, providing cross-sentence context for NMT.

  • Limitations:
    • Computationally expensive
    • Only exploits source-side sentences, because target-side history would suffer from error propagation
    • The context is encoded into a single fixed-length vector, which is not flexible
SLIDE 21

Publicly Available

  • The source code is publicly available at https://github.com/tuzhaopeng/LC-NMT
  • The trained models and translation results will be released

SLIDE 22
Reference

  • 1. Heeyoul Choi, Kyunghyun Cho, and Yoshua Bengio. Context-dependent word representation for neural machine translation. arXiv, 2016.

  • 2. Alessandro Sordoni, Yoshua Bengio, Hossein Vahabi, Christina Lioma, Jakob Grue Simonsen, and Jian-Yun Nie. A hierarchical recurrent encoder-decoder for generative context-aware query suggestion. CIKM 2015.

  • 3. Iulian V. Serban, Alessandro Sordoni, Yoshua Bengio, Aaron Courville, and Joelle Pineau. Building end-to-end dialogue systems using generative hierarchical neural network models. AAAI 2016.

  • 4. Oriol Vinyals and Quoc Le. A neural conversational model. ICML Deep Learning Workshop, 2015.

SLIDE 23

Question & Answer