
Document Context Neural Machine Translation with Memory Networks - PowerPoint PPT Presentation

Sameen Maruf, Gholamreza Haffari. Faculty of Information Technology, Monash University. July 17, 2017.


  1. Document Context Neural Machine Translation with Memory Networks. Sameen Maruf, Gholamreza Haffari. Faculty of Information Technology, Monash University. July 17, 2017.

  2. Overview: 1 Introduction; 2 Document MT as Structured Prediction; 3 Document NMT with MemNets; 4 Experiments and Analysis; 5 Conclusion; 6 References.

  3. Section 1: Introduction.

  4. Why document-level machine translation? Most MT models translate sentences independently, so discourse phenomena are ignored, e.g. pronominal anaphora and lexical consistency, which may involve long-range dependencies.

  9. Why document-level machine translation? Statistical MT approaches to document-level MT have not yielded significant empirical improvements [Hardmeier and Federico, 2010; Gong et al., 2011; Garcia et al., 2014]. Previous context-aware NMT models use only local context and report deteriorated performance when using target-side context [Jean et al., 2017; Wang et al., 2017; Bawden et al., 2018]. We incorporate global source and target document contexts.

  13. Section 2: Document MT as Structured Prediction.

  14. Document MT as Structured Prediction. [Figure built up over slides 14-19; not preserved in this transcript.]

  20. Document MT as Structured Prediction. Two types of factors: $f_\theta(y_t; x_t, x_{-t})$ and $g_\theta(y_t; y_{-t})$.

  21. Document MT as Structured Prediction. Training objective: maximise $P(y_1, \ldots, y_{|d|} \mid x_1, \ldots, x_{|d|})$ $\Longrightarrow$ maximise the pseudo-likelihood $\arg\max_\theta \prod_{t=1}^{|d|} P_\theta(y_t \mid x_t, y_{-t}, x_{-t})$ (1), where $f_\theta$ and $g_\theta$ are subsumed in $P_\theta(y_t \mid x_t, y_{-t}, x_{-t})$.
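
The pseudo-likelihood objective in (1) can be illustrated with a small sketch. It works in log space, so the product over sentences becomes a sum. The names `pseudo_log_likelihood` and `toy_cond_prob` are hypothetical, and the toy probability function is only a stand-in for the model $P_\theta$, which the slides do not specify at this point.

```python
import math

def pseudo_log_likelihood(doc_src, doc_trg, cond_prob):
    """Sum over t of log P(y_t | x_t, y_{-t}, x_{-t}) for one document.

    cond_prob is a hypothetical callable standing in for the model.
    """
    total = 0.0
    for t in range(len(doc_src)):
        x_rest = doc_src[:t] + doc_src[t + 1:]  # x_{-t}: all other source sentences
        y_rest = doc_trg[:t] + doc_trg[t + 1:]  # y_{-t}: all other target sentences
        total += math.log(cond_prob(doc_trg[t], doc_src[t], y_rest, x_rest))
    return total

def toy_cond_prob(y_t, x_t, y_rest, x_rest):
    # Toy assumption: a sentence is more probable when it shares a word
    # with some other target sentence in the document.
    shared = any(set(y_t.split()) & set(y.split()) for y in y_rest)
    return 0.5 if shared else 0.25

src = ["a b", "c d", "e f"]
trg = ["u v", "v w", "z"]
score = pseudo_log_likelihood(src, trg, toy_cond_prob)
```

Each term conditions on the rest of the document in both languages, which is what distinguishes this objective from a sentence-level likelihood.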

  25. Document MT as Structured Prediction. Challenge: at test time, the target document is not given. This is addressed with coordinate ascent (i.e., iterative decoding).

  31. Iterative Decoding. [Figure built up over slides 31-36; not preserved in this transcript.]
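
Since the iterative decoding figures are not available here, the following is only a rough sketch of coordinate ascent over the sentences of a document. It assumes an initial pass that translates each sentence without target context, followed by repeated passes that re-translate one sentence at a time while the others are held fixed. The function names, the `max_passes` cap, the stopping rule and the toy candidate table are all assumptions for illustration, not the authors' procedure.

```python
def iterative_decode(doc_src, init_translate, retranslate, max_passes=5):
    # Initial pass: no target-side context is available yet.
    doc_trg = [init_translate(x) for x in doc_src]
    for _ in range(max_passes):
        changed = False
        for t, x_t in enumerate(doc_src):
            x_rest = doc_src[:t] + doc_src[t + 1:]  # x_{-t}
            y_rest = doc_trg[:t] + doc_trg[t + 1:]  # current y_{-t}
            y_new = retranslate(x_t, x_rest, y_rest)
            if y_new != doc_trg[t]:
                doc_trg[t] = y_new
                changed = True
        if not changed:  # no coordinate changed in a full pass: stop
            break
    return doc_trg

# Hypothetical candidate translations per source sentence.
CANDIDATES = {"s1": ["the car"], "s2": ["the auto", "the car"], "s3": ["a car"]}

def toy_init(x):
    return CANDIDATES[x][0]

def toy_retranslate(x_t, x_rest, y_rest):
    # Toy scoring: prefer the candidate sharing the most words with the
    # current translations of the other sentences (lexical consistency).
    context = {w for y in y_rest for w in y.split()}
    return max(CANDIDATES[x_t], key=lambda c: len(set(c.split()) & context))

result = iterative_decode(["s1", "s2", "s3"], toy_init, toy_retranslate)
```

In this toy run the second sentence starts as "the auto" and is revised to "the car" once the neighbouring translations are visible, a small instance of the lexical consistency the introduction mentions.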

  37. Section 3: Document NMT with MemNets.

  38. Document NMT with MemNets $\Longrightarrow P_\theta(y_t \mid x_t, y_{-t}, x_{-t})$. [Architecture figures on slides 38-41 are not preserved in this transcript.]

  42. Document NMT with MemNets. Memory-to-Context: $s_{t,j} = \mathrm{GRU}(s_{t,j-1}, E_T[y_{t,j-1}], c_{t,j}, c_t^{\mathrm{src}}, c_t^{\mathrm{trg}})$.
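
A minimal sketch of the Memory-to-Context update follows, in which the document contexts enter the decoder state through the recurrent unit. It assumes the GRU inputs are simply concatenated and uses a bias-free GRU cell with random weights. The dimensions, parameter names and the concatenation itself are illustrative assumptions, not details confirmed by the slides.

```python
import math
import random

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def gru_step(h_prev, inp, p):
    # Standard GRU equations, biases omitted for brevity.
    z = [sigmoid(a + b) for a, b in zip(matvec(p["Wz"], inp), matvec(p["Uz"], h_prev))]
    r = [sigmoid(a + b) for a, b in zip(matvec(p["Wr"], inp), matvec(p["Ur"], h_prev))]
    gated = [ri * hi for ri, hi in zip(r, h_prev)]
    cand = [math.tanh(a + b) for a, b in zip(matvec(p["Wh"], inp), matvec(p["Uh"], gated))]
    return [(1 - zi) * hi + zi * ci for zi, hi, ci in zip(z, h_prev, cand)]

def memory_to_context_step(s_prev, y_prev_emb, c_attn, c_src, c_trg, p):
    # Assumption: E_T[y_{t,j-1}], c_{t,j}, c_t^src and c_t^trg are concatenated.
    inp = y_prev_emb + c_attn + c_src + c_trg
    return gru_step(s_prev, inp, p)

def random_matrix(rng, rows, cols):
    return [[rng.uniform(-0.5, 0.5) for _ in range(cols)] for _ in range(rows)]

rng = random.Random(0)
hidden, emb, ctx = 3, 2, 2
in_dim = emb + 3 * ctx
params = {k: random_matrix(rng, hidden, in_dim) for k in ("Wz", "Wr", "Wh")}
params.update({k: random_matrix(rng, hidden, hidden) for k in ("Uz", "Ur", "Uh")})

s_prev = [0.0] * hidden
# Same inputs except for the target-side document context c_t^trg.
s_a = memory_to_context_step(s_prev, [0.1, -0.2], [0.3, 0.0], [0.5, 0.5], [0.0, 0.0], params)
s_b = memory_to_context_step(s_prev, [0.1, -0.2], [0.3, 0.0], [0.5, 0.5], [1.0, -1.0], params)
```

The two calls differ only in the target-side document context, so any difference between `s_a` and `s_b` comes from that context alone.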

  43. Document NMT with MemNets. Memory-to-Output: $y_{t,j} \sim \mathrm{softmax}(W_y \cdot r_{t,j} + W_{ym} \cdot c_t^{\mathrm{src}} + W_{yt} \cdot c_t^{\mathrm{trg}} + b_y)$.
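
The Memory-to-Output formula can be sketched directly: the two document contexts contribute additive terms to the logits before the softmax. The tiny vocabulary, the weight values and the function names below are made up for illustration; only the shape of the computation follows the formula.

```python
import math

def matvec(W, v):
    return [sum(w * x for w, x in zip(row, v)) for row in W]

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(l - m) for l in logits]
    total = sum(exps)
    return [e / total for e in exps]

def memory_to_output(r, c_src, c_trg, W_y, W_ym, W_yt, b_y):
    # logits = W_y r + W_ym c_src + W_yt c_trg + b_y
    logits = [a + b + c + d for a, b, c, d in zip(
        matvec(W_y, r), matvec(W_ym, c_src), matvec(W_yt, c_trg), b_y)]
    return softmax(logits)

# Toy setup: vocabulary of 3 words, all vectors of size 2.
W_y = [[1.0, 0.0], [0.0, 1.0], [0.0, 0.0]]
W_ym = [[0.0, 0.0], [0.0, 0.0], [0.0, 0.0]]
W_yt = [[0.0, 0.0], [0.0, 0.0], [1.0, 1.0]]
b_y = [0.0, 0.0, 0.0]
r = [0.0, 0.0]

p_no_ctx = memory_to_output(r, [0.0, 0.0], [0.0, 0.0], W_y, W_ym, W_yt, b_y)
p_with_ctx = memory_to_output(r, [0.0, 0.0], [1.0, 1.0], W_y, W_ym, W_yt, b_y)
```

With a zero target context the toy distribution is uniform; a non-zero target context shifts probability mass toward the third word, which is the mechanism by which document context can bias word choice at the output layer.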
