STACL: Simultaneous Translation with Integrated Anticipation & Controllable Latency

  1. STACL: Simultaneous Translation with Integrated Anticipation & Controllable Latency
     Liang Huang
     Principal Scientist, Baidu Research
     Assistant Professor (on-leave), Oregon State University
     Joint work between Baidu Research (Sunnyvale) and Baidu NLP (Beijing)

  2. Breakthrough in Simultaneous Translation
     • Baidu World Conference, November 2017: full-sentence (non-simultaneous) translation
     • Baidu World Conference, November 2018: simultaneous translation, latency ~3 secs (STACL)

  6. Background: Consecutive vs. Simultaneous
     • consecutive interpretation: multiplicative latency (x2)
     • simultaneous interpretation: additive latency (+3 secs)
     • simultaneous interpretation is extremely difficult
     • only ~3,000 qualified simultaneous interpreters world-wide
     • each interpreter can only sustain for at most 10-30 minutes
     • the best interpreters can only cover ~60% of the source material
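
The difference between multiplicative and additive latency can be made concrete with a little arithmetic. The sketch below is illustrative only: the function names are made up, and it assumes that a consecutive interpreter starts after the speaker finishes and takes about as long as the original speech (the x2 on the slide), while a simultaneous interpreter trails the speaker by a fixed 3 seconds.

```python
def consecutive_finish(speech_seconds: float, factor: float = 2.0) -> float:
    """End time of a consecutive interpretation (assumed: total time = factor x speech)."""
    return factor * speech_seconds


def simultaneous_finish(speech_seconds: float, lag_seconds: float = 3.0) -> float:
    """End time of a simultaneous interpretation (assumed: fixed lag behind the speaker)."""
    return speech_seconds + lag_seconds


for minutes in (1, 10, 30):
    speech = minutes * 60.0
    extra_consecutive = consecutive_finish(speech) - speech
    extra_simultaneous = simultaneous_finish(speech) - speech
    print(f"{minutes:>2} min speech: consecutive +{extra_consecutive:.0f} s, "
          f"simultaneous +{extra_simultaneous:.0f} s")
```

Under these assumptions the extra delay of consecutive interpretation grows with the length of the speech, while the simultaneous delay stays at 3 seconds.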

  7. Tradeoff between Latency and Quality
     [Figure: quality (low to high) against latency (high, about 1 sentence, to low, about 3 seconds). Labeled points: consecutive interpreters, machine translation, simultaneous interpreters, word-by-word translation, and “our goal”.]

  8. Industrial Work in Simultaneous Translation
     • almost all existing “real-time” translation systems use conventional full-sentence translation techniques, causing at least one-sentence delay
     • some systems repeatedly retranslate, but constantly changing translations is annoying to the user and can’t be used for speech-to-speech translation
     Examples: Baidu, Nov. 2017 (~12 seconds delay); Sogou, Oct. 2018 (~12 seconds delay)

  11. Academic Work in Simultaneous Translation
     • prediction of German verb (Grissom et al, 2014)
     • reinforcement learning (Grissom et al, 2014; Gu et al, 2017)
     • learning Read/Write sequences on top of a pretrained NMT model
     • “encourages” latency requirements, but can’t force them in testing
     • complicated, and slow to train
     (Grissom et al, 2014)

  15. Challenge: Word Order Difference
     • e.g. translate from SOV language (Japanese, German) to SVO (English)
     • German is underlyingly SOV, and Chinese is a mix of SVO and SOV
     • human simultaneous interpreters routinely “anticipate” (e.g., predicting German verb) (Grissom et al, 2014)
     Example: President Bush meets with Russian President Putin in Moscow
     non-anticipative: President Bush ( …… waiting …… ) meets with Russian …
     anticipative: President Bush meets with Russian President Putin in Moscow

  16. Our Solution: Prefix-to-Prefix
     • seq-to-seq is only suitable for conventional full-sentence MT
     • we propose prefix-to-prefix, tailored to simultaneous MT
     • special case: wait-k policy: translation is always k words behind source sentence
     • training in this way enables anticipation
     [Figure: seq-to-seq: source 1 2 3 4 5 …, target waits for the whole source sentence, then 1 2 …; prefix-to-prefix (wait-k): source 1 2 3 4 5 …, target waits k words, then 1 2 …]
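
One way to write down the contrast between the two bullets, using notation that does not appear on the slide: x is the source sentence, y the target sentence, and g(t) is an assumed symbol for the number of source words that have been read when target word t is written.

```latex
% seq-to-seq (full-sentence MT): every target word conditions on the whole source
p(\mathbf{y} \mid \mathbf{x}) = \prod_{t=1}^{|\mathbf{y}|} p\bigl(y_t \mid \mathbf{x},\ \mathbf{y}_{<t}\bigr)

% prefix-to-prefix: target word t conditions only on the first g(t) source words
p_g(\mathbf{y} \mid \mathbf{x}) = \prod_{t=1}^{|\mathbf{y}|} p\bigl(y_t \mid \mathbf{x}_{\le g(t)},\ \mathbf{y}_{<t}\bigr)

% wait-k as a special case: stay k words behind until the source runs out
g_{\text{wait-}k}(t) = \min\bigl\{\, k + t - 1,\ |\mathbf{x}| \,\bigr\}
```

Read this way, a model trained on the prefix-to-prefix objective only ever sees a source prefix when it predicts each target word, which is a plausible reading of why training in this way enables anticipation.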

  17. Our Solution: Prefix-to-Prefix (example)
     source: 布什 总统 (Bùshí zǒngtǒng; gloss: Bush President)
     target: President

  18. Our Solution: Prefix-to-Prefix (example)
     source: 布什 总统 在 (Bùshí zǒngtǒng zài; gloss: Bush President in)
     target: President Bush

  19. Our Solution: Prefix-to-Prefix (example)
     source: 布什 总统 在 莫斯科 (Bùshí zǒngtǒng zài Mòsīkē; gloss: Bush President in Moscow)
     target: President Bush meets

  20. Our Solution: Prefix-to-Prefix (example)
     source: 布什 总统 在 莫斯科 与 (Bùshí zǒngtǒng zài Mòsīkē yǔ; gloss: Bush President in Moscow with)
     target: President Bush meets with

  21. Our Solution: Prefix-to-Prefix (example)
     source: 布什 总统 在 莫斯科 与 俄罗斯 (Bùshí zǒngtǒng zài Mòsīkē yǔ Éluósī; gloss: Bush President in Moscow with Russian)
     target: President Bush meets with Russian

  22. Our Solution: Prefix-to-Prefix (example)
     source: 布什 总统 在 莫斯科 与 俄罗斯 总统 (Bùshí zǒngtǒng zài Mòsīkē yǔ Éluósī zǒngtǒng; gloss: Bush President in Moscow with Russian President)
     target: President Bush meets with Russian President
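
The step-by-step example can be replayed with a few lines of Python. This is a minimal sketch, not the authors' code: `wait_k_schedule` and `replay` are hypothetical names, the lag k = 2 is inferred from the slides (the first target word appears once two source words have been read), and the target words are copied from the slides rather than produced by a translation model.

```python
from typing import List, Sequence, Tuple


def wait_k_schedule(k: int, src_len: int, tgt_len: int) -> List[int]:
    """Source words read before each target word: min(k + t - 1, src_len), t = 1..tgt_len."""
    return [min(k + t - 1, src_len) for t in range(1, tgt_len + 1)]


def replay(source: Sequence[str], target: Sequence[str], k: int) -> List[Tuple[List[str], List[str]]]:
    """Pair each source prefix with the target prefix written so far under wait-k."""
    schedule = wait_k_schedule(k, len(source), len(target))
    return [
        (list(source[:read]), list(target[:t]))
        for t, read in enumerate(schedule, start=1)
    ]


# Pinyin source and English target taken from the slides; no model is involved.
source = ["Bùshí", "zǒngtǒng", "zài", "Mòsīkē", "yǔ", "Éluósī", "zǒngtǒng"]
target = ["President", "Bush", "meets", "with", "Russian", "President"]

for src_prefix, tgt_prefix in replay(source, target, k=2):
    print(f"{' '.join(src_prefix):<45} -> {' '.join(tgt_prefix)}")
```

Note that "meets" is written when the source prefix ends at Mòsīkē (Moscow), before any verb has appeared in the source. In a real system that word would have to come from the model's anticipation; here it is simply replayed from the slide.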
