Simultaneous Speech Translation


Graham Neubig, Nara Institute of Science and Technology (NAIST), 5/18/2015
Joint work with: Satoshi Nakamura, Tomoki Toda, Sakriani Sakti, Tomoki Fujita, Hiroaki Shimizu, Yusuke Oda, Takashi Mieno


Background


Speech Translation Systems

  • Translate speech from the source language to the target

[Pipeline: ASR → MT → TTS. Example: こんにちは、駅はどこですか? → "Hello, where is the station?"]


Problem: Delay

  • Wait for the whole utterance to end before translating

[Pipeline: same ASR → MT → TTS, but translation of こんにちは、駅はどこですか? cannot start until the utterance finishes, causing delay]


Solution: Divide into Smaller Chunks

  • Choose an appropriate timing to start translating

[Figure: the input is divided into chunks こんにちは、 / 駅は / どこですか?, each passed through MT and TTS as it arrives, producing "Hello," / "the station" / "where is it?": delay reduced]


Four Problems

  • Segmentation: When do we start translating?
  • Prediction: Can we predict things that haven't been said yet?
  • Data: Can we learn something from actual simultaneous interpreters?
  • Evaluation: How do we decide which results are better?


1) Sentence Segmentation for Simultaneous Speech Translation


Previous Work: Incremental Dependency Parsing/Manual Rules [Ryu+ 04]

  • Utilize knowledge of English/Japanese to derive rules

[Example: for "I went to the park with your brother", a rule fires once the first prepositional phrase completes, so the two parts are translated separately: 私は公園に行きました, then あなたの弟と]

  • − Requires a bilingual linguist to design rules
  • − Requires an accurate incremental dependency parser

Previous Work: Division on Pauses [Fugen+ 08, Bangalore+ 12]

  • Simply divide on short pauses in the utterance

[Example: the ASR output "hello <pause> where is the station" is split at the pause]

  • − Cannot capture the relationship between the languages
  • − Results change greatly with speech speed and disfluencies


Previous Work: Division on Predicted Commas [Sridhar+ 13]

  • Guess where commas would appear in the text

[Example: a classifier runs after each word: "hello" → no comma, wait; "hello where" → no comma, wait; ... "hello where is the" → comma! → translate]

  • + Simple, and surprisingly effective
  • − No parameter to adjust the granularity
  • − Can't capture features of the target language
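As a rough illustration, a minimal sketch of this strategy, assuming a hypothetical trained classifier predict_comma(prefix) and an MT hook translate(chunk) (neither name is from the slides):

```python
def comma_segmenter(words, predict_comma, translate):
    """Translate the buffered words whenever a comma is predicted.

    words: iterable of words arriving incrementally from ASR.
    predict_comma(prefix) -> bool: hypothetical trained comma classifier.
    translate(chunk): hypothetical hook that sends one chunk to MT.
    """
    buffer = []
    for word in words:
        buffer.append(word)
        if predict_comma(buffer):        # classifier predicts a comma here
            translate(" ".join(buffer))  # so this chunk can be translated now
            buffer = []
    if buffer:                           # flush the rest at utterance end
        translate(" ".join(buffer))
```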


Considering Reordering Probabilities in Sentence Segmentation [Fujita et al., Interspeech 2013]


Phrase-Based Machine Translation

  • Divide the sentence into small phrases and translate each

[Example: "Today | I will give | a lecture on | machine translation | ." aligns to 今日は、| を行います | の講義 | 機械翻訳 | 。, reordered into 今日は、機械翻訳の講義を行います。]

  • Score translations with a translation model (TM), reordering model (RM), and language model (LM)
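For background, the three models are typically combined in a log-linear score that the decoder maximizes (a generic formula from standard phrase-based SMT, not one given on the slides):

```latex
\mathrm{score}(e, f) = \lambda_{TM} \log P_{TM}(f \mid e)
                     + \lambda_{RM} \log P_{RM}(o \mid e, f)
                     + \lambda_{LM} \log P_{LM}(e)
```

where e is the candidate translation, f the source sentence, and o the sequence of phrase orientations scored by the reordering model.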

Translation Model Creation

  • Perform automatic word alignment of the bitext
  • From the aligned text, extract phrases for translation

[Example: aligning "the hotel front desk" with ホテルの受付 yields the phrase pairs:]
ホテル の → hotel
ホテル の → the hotel
受付 → front desk
ホテルの受付 → hotel front desk
ホテルの受付 → the hotel front desk
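A simplified sketch of the standard consistent-phrase-pair extraction underlying this step (real extractors such as Moses' also expand spans over unaligned boundary words, which is omitted here):

```python
def extract_phrases(src_len, tgt_len, alignment, max_len=5):
    """Enumerate phrase pairs consistent with a word alignment.

    alignment: set of (i, j) pairs linking source word i to target word j.
    A span pair is consistent if no alignment link crosses its boundary.
    Returns a list of ((s1, s2), (t1, t2)) inclusive span pairs.
    """
    phrases = []
    for s1 in range(src_len):
        for s2 in range(s1, min(s1 + max_len, src_len)):
            tgt = [j for (i, j) in alignment if s1 <= i <= s2]
            if not tgt:
                continue
            t1, t2 = min(tgt), max(tgt)
            # reject if any target word inside [t1, t2] links outside [s1, s2]
            if all(s1 <= i <= s2 for (i, j) in alignment if t1 <= j <= t2):
                phrases.append(((s1, s2), (t1, t2)))
    return phrases

# Hypothetical toy alignment for ホテル の 受付 <-> the hotel front desk:
# (ホテル-hotel, 受付-front, 受付-desk)
print(extract_phrases(3, 4, {(0, 1), (2, 2), (2, 3)}))
```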


Lexicalized Reordering Model

  • Probabilistically models reordering for increased translation accuracy
  • Given the current phrase and the next phrase, the orientation is classified; "monotone" + "discontinuous right" together give the "right probability"

[Examples of orientations between adjacent phrases:
Monotone: 背 の 高い 男 / the tall man
Swap: 太郎 を 訪問 した / visited Taro
Discontinuous right / left: 私 は 太郎 を 訪問した / I visited Taro; 背 の 高い 男 を 訪問 した / visited the tall man]
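A sketch of how orientations can be read off target-side spans (simplified; the helper name is illustrative, not from the slides):

```python
def orientation(prev_span, cur_span):
    """Classify the orientation of the current phrase's target span
    relative to the previous phrase's target span (inclusive indices)."""
    (p1, p2), (c1, c2) = prev_span, cur_span
    if c1 == p2 + 1:
        return "monotone"             # continues directly to the right
    if c2 == p1 - 1:
        return "swap"                 # placed directly to the left
    return "discontinuous-right" if c1 > p2 else "discontinuous-left"

# The segmentation method below uses the "right probability":
# P(right) = P(monotone) + P(discontinuous-right),
# i.e. the probability that the next phrase is placed to the right.
print(orientation((0, 1), (2, 3)))    # monotone
```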


Method One: Choosing Translation Timing with Phrases

  • Input words one at a time from ASR
  • While words exist in phrase table, don't translate yet

Input string: hello where is the station

Phrase table:
hello → こんにちは
where → どこ
where is → どこですか
the → その
the station → 駅

Walkthrough:
"hello": phrase exists → wait
"hello where": phrase missing → translate "hello"
"where is": phrase exists → wait
"where is the": phrase missing → translate "where is"
"the station": utterance ends → translate "the station"
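A minimal sketch of method one's longest-match loop, assuming phrase_table is simply a set of source-phrase strings and translate() is a hook into the MT system (both hypothetical names):

```python
def phrase_timing_segmenter(words, phrase_table, translate):
    """Method one: keep buffering while the buffer still matches a phrase
    in the table; when extending it would leave the table, translate the
    buffered phrase and start over with the current word."""
    buffer = []
    for word in words:                    # words arrive one at a time from ASR
        if not buffer or " ".join(buffer + [word]) in phrase_table:
            buffer.append(word)           # phrase exists: wait
        else:
            translate(" ".join(buffer))   # phrase missing: translate the cache
            buffer = [word]
    if buffer:
        translate(" ".join(buffer))       # utterance ends: translate the rest

# Reproduces the walkthrough above:
table = {"hello", "where", "where is", "the", "the station"}
phrase_timing_segmenter("hello where is the station".split(), table, print)
# -> hello / where is / the station
```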


Problem with Method One

  • Has the potential to degrade translation accuracy:

Normal phrase-based translation: こんにちは 駅 は どこ ですか → "Hello, where is the station"
Translation with early timing (chunk by chunk): こんにちは / 駅 は / どこ ですか → "Hello, the station where is it"


Method Two: Adjusting Timing with Reordering Probabilities

  • First, tentatively choose phrase strings according to method one
  • Next, if that phrase's right probability exceeds a threshold, actually translate the words in the cache
  • Threshold 1.0 = traditional full-sentence translation, 0.0 = method one

Example (threshold = 0.8): hello where is the station

"hello": phrase exists → wait
"hello where": phrase missing → choose "hello"; right probability 0.9 > 0.8 → translate "hello"
"where is": phrase exists → wait
"where is the": phrase missing → choose "where is"; right probability 0.6 < 0.8 → do not translate yet
"the station": utterance ends → translate "where is the station"
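A sketch of method two on top of the same loop; right_prob() stands in for a lookup of the phrase's right probability in the lexicalized reordering model (hypothetical helper names again):

```python
def reordering_timing_segmenter(words, phrase_table, right_prob, translate,
                                threshold=0.8):
    """Method two: tentatively choose phrases as in method one, but only
    translate the cached words once the chosen phrase's right probability
    exceeds the threshold (1.0 ~ full-sentence MT, 0.0 ~ method one)."""
    buffer, cache = [], []
    for word in words:
        if not buffer or " ".join(buffer + [word]) in phrase_table:
            buffer.append(word)
        else:
            cache.extend(buffer)                   # tentatively chosen phrase
            if right_prob(" ".join(buffer)) > threshold:
                translate(" ".join(cache))         # unlikely to reorder back
                cache = []
            buffer = [word]
    if cache or buffer:
        translate(" ".join(cache + buffer))        # flush at utterance end

probs = {"hello": 0.9, "where is": 0.6}
reordering_timing_segmenter("hello where is the station".split(),
                            {"hello", "where", "where is", "the",
                             "the station"},
                            lambda p: probs.get(p, 0.0), print)
# -> hello / where is the station   (matches the example above)
```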


Experimental Setup

  • Four types of experiments:
    • Japanese-English BTEC travel conversation (ja-en)
    • Japanese-English BTEC with 11+ words (ja-en 11+)
    • English-Japanese BTEC travel conversation (en-ja)
    • French-English WMT news (fr-en)
  • Evaluation measures:
    • Accuracy: 14-reference BLEU for BTEC, 1-reference BLEU for news; manually graded acceptability
    • Delay (seconds)

Accuracy: Comparison with Pause-based Segmentation

[Graph: accuracy (BLEU, roughly 20-50) vs. delay (roughly 1-4.5 seconds) for the proposed and pause-based methods]

  • In faster settings (shorter delay), the proposed method is best
  • In slower settings (longer delay), the pause-based method is best

Manual Evaluation

  • A decrease in manual evaluation as well, but less obvious than when evaluated by BLEU

System Demo


Optimizing Segmentation Strategies [Oda et al., ACL 2014]


Motivation

  • All previous segmentation strategies were based on heuristics
  • They don't directly take into account the effect on translation accuracy

What if we could directly optimize sentence segmentation for translation accuracy?


S* Search Method 1: Greedy Search

  • Choose the next segmentation point to maximize accuracy → use the chosen segments as training data for a classifier

[Figure: each candidate boundary in "I ate lunch but she left" is scored (ω = 0.7, 0.5, 0.8, 0.6, 0.6), the best-scoring boundary is fixed, and the remaining boundaries are re-scored]
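A sketch of the greedy loop, where score() is a hypothetical function that segments the input at the given boundaries, translates, and returns an accuracy such as BLEU (in the paper this is done over a whole corpus; one sentence is shown for brevity):

```python
def greedy_segmentation(words, score, n_boundaries):
    """Greedily add the segmentation boundary that maximizes accuracy.

    score(boundaries) -> float: hypothetical evaluation of translating
    the sentence segmented at `boundaries` (e.g. BLEU vs. a reference).
    """
    boundaries = set()
    for _ in range(n_boundaries):
        candidates = [b for b in range(1, len(words)) if b not in boundaries]
        if not candidates:
            break
        best = max(candidates, key=lambda b: score(boundaries | {b}))
        boundaries.add(best)              # fix the best point, then repeat
    return sorted(boundaries)
```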


S* Search Method 2: Grouping by Features

  • Because MT/evaluation is complicated, there is the potential to overfit
  • Solution: group boundaries by their features

[Example: boundaries are grouped by the surrounding POS bigram, e.g. Pronoun+Verb, Noun+Conjunction, Determiner+Noun:
I/PRN ate/VBD lunch/NN but/CC she/PRN left/VBD
I/PRN ate/VBD an/DET apple/NN and/CC an/DET orange/NN]

  • The search can be performed using dynamic programming
  • The features make the model trivial; no learning is needed (see the sketch below)
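A sketch of the grouped variant: candidate boundaries are bucketed by the POS bigram around them and whole groups are selected at once (the paper performs the search with dynamic programming; greedy group selection is shown here for brevity):

```python
from collections import defaultdict

def grouped_segmentation(pos_tags, score, n_groups):
    """Select whole feature groups of boundaries instead of single points,
    which reduces overfitting to the MT/evaluation pipeline.

    pos_tags: one POS tag per word; score(boundaries) -> float as before.
    """
    groups = defaultdict(set)
    for b in range(1, len(pos_tags)):
        groups[(pos_tags[b - 1], pos_tags[b])].add(b)   # e.g. ("PRN", "VBD")
    chosen = set()
    for _ in range(min(n_groups, len(groups))):
        feat = max(groups, key=lambda f: score(chosen | groups[f]))
        chosen |= groups.pop(feat)        # take every boundary in the group
    return sorted(chosen)
```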


Results on TED Talks

→ 2-3 times faster with no loss in BLEU


Problems of Overfitting


2) Predicting Future Information for Simultaneous Speech Translation


Why Prediction?

[Example: for 私 は 店 に 行った ([I] [to the store] [went]), "I went to the store" can only be produced once the final verb is heard. In the longer 私 は 店 に 友達と 時々 本 を 買い に 行った ([I] [to the store] [with a friend] [sometimes] [to buy books] [went]), predicting the verb lets the system output "I went to the store" early and continue with "with a friend to buy books sometimes"]


Predicting Sentence-final Verbs [Grissom et al., EMNLP 14]

  • Train a classifier to predict the sentence-final verb
  • Use reinforcement learning to decide whether to "wait", "predict", or "commit"


Predicting Unseen Syntactic Constituents for Syntax-based Translation [Oda et al., ACL 2015]


Problems with Phrase-based Translation

[Example: 友達 と ご飯 を 食べた should be translated "I ate rice with my friend" (友達 と = with my friend), but a phrase-based system may render と as "and", producing "I ate rice and my friend"]


Syntactic Parsing

[Figure: the parse tree of 友達 と ご飯 を 食べた groups 友達 と and ご飯 を into PPs under the VP headed by 食べた, so the structure (friend + rice-OBJ + ate) disambiguates と as "with"]


Tree-to-String Translation [Liu+ 06]

[Figure: the parse tree of 友達 と ご飯 を 食べ た (with spans PP0-1, PP2-3, VP4-5, VP0-5) is matched against tree-to-string rules such as "x1 x0", translating the pieces as "my friend", "rice", "ate" and reordering them into "ate rice with my friend"]


Problems with Syntactic Simultaneous Translation

  • Example: "in the next 18 minutes I 'm going to take ..."
  • In the most usual case, a verb phrase (VP) occurs after "I"
  • But here the unit is cut off, so the VP is missing
  • Translation units may not be syntactically complete by themselves


Motivation

  • Predict and consider unseen syntactic constituents for syntactically incomplete translation units
  • Use these to improve translation

[Example: for the translation unit "In the next 18 minutes I", predict the additional syntactic constituent VP; without it the unit translates as the garbled 今から 18 分私, with it as 今から 18 分で、私は VP]


Segmentation-based Simultaneous Translation

  • A simultaneous (speech) translation system based on segmentation

[Pipeline: ASR retrieves words from speech → segmentation groups the words into translation units → each translation unit is translated separately → output. Example: "this is | a pen" → これです / ペン]


Two Proposed Methods

  • Proposed method 1: filling in unseen constituents
  • Proposed method 2: waiting to translate based on unseen constituents
    • Translation results are output if specific conditions are satisfied
    • Otherwise, the unit is concatenated with the next translation unit

[Example for input "this is": method 1 predicts the missing NP, parses "this is NP", and translates これは NP です; method 2 waits, concatenates with "a pen", and translates "this is a pen" → これはペンです]


Predicting Unseen Constituents

1. Forcibly parse the translation unit (e.g. "in the next 18 min. I", parsed as PP + NP)
2. Extract features (e.g. Word:R1=I, POS:R1=NN, Word:R1-2=I,min., POS:R1-2=NN,NNS, ROOT=PP, ROOT-L=IN, ROOT-R=NP, ...)
3. Run multi-class classification over possible constituents (e.g. VP: 0.65, NP: 0.28, nil: 0.04, ...)
4. Add the most probable constituent (here VP) to the tail of the tree
5. Recursively continue until 'nil' is generated

A sketch of this loop follows below.
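Here parse(), extract_features(), and classify() are hypothetical stand-ins for the forced parser, the feature extractor, and the multi-class classifier:

```python
def predict_unseen_constituents(unit, parse, extract_features, classify,
                                max_extra=5):
    """Predict the syntactic constituents that should follow an incomplete
    translation unit, recursing until 'nil' is generated."""
    tree = parse(unit)                         # 1. forcibly parse the unit
    predicted = []
    for _ in range(max_extra):
        feats = extract_features(tree, predicted)  # 2. word/POS/root features
        probs = classify(feats)                # 3. distribution over labels
        label = max(probs, key=probs.get)      #    e.g. {"VP": 0.65, ...}
        if label == "nil":                     # 5. stop once 'nil' wins
            break
        predicted.append(label)                # 4. append, e.g. "VP"
    return predicted                           # constituents to add at the tail
```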


Generating Training Data

  • Decompose parse trees in the Penn Treebank to generate training data

[Example: from the parse of "This is a pen", (S (NP (DT This)) (VP (VBZ is) (NP (DT a) (NN pen)))), truncated units yield labeled examples such as:
"This is" → [NP] [nil]
"is a" → [NN] [nil]
"is a pen" → [nil]]
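A sketch of the decomposition using nltk's Tree (the paper's unit extraction is more involved; this version only lists, for each sentence prefix, the labels of the completely unseen constituents, outermost first):

```python
from nltk import Tree

def unseen_labels(tree, k):
    """Labels of constituents lying entirely after the first k leaves,
    outermost first, terminated by 'nil'."""
    labels = []
    def walk(node, start):
        if isinstance(node, str):              # a leaf consumes one position
            return start + 1
        end = start + len(node.leaves())
        if start >= k:                         # entirely unseen: record label
            labels.append(node.label())
            return end
        for child in node:                     # partially seen: descend
            start = walk(child, start)
        return end
    walk(tree, 0)
    return labels + ["nil"]

t = Tree.fromstring("(S (NP (DT This)) (VP (VBZ is) (NP (DT a) (NN pen))))")
for k in range(1, 4):
    print(" ".join(t.leaves()[:k]), "->", unseen_labels(t, k))
# This -> ['VP', 'nil'];  This is -> ['NP', 'nil'];  This is a -> ['NN', 'nil']
```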


Tree-to-String SMT

  • Tree-to-string (T2S) SMT uses parse trees of the source language to translate
  • T2S is generally better than phrase-based SMT (PBMT) for syntactically distant language pairs, e.g. English→Japanese

[Pipeline: "This is a pen" → parse → source tree → MT → これ は ペン です]


Tree-to-String SMT with Predicted Constituents

  • In addition to the properties above, T2S can use the filled-in constituents explicitly

[Pipeline: "this is NP" → parse, with the predicted NP as a constituent of the tree → MT → これ は NP です]


Problems with Reordering

Input sentence after tag prediction: in the next 18 minutes i 'm going to take [NP] you on a journey

Translation result: 18 分 で あ る [NP] を 行 っ て い ま す 旅 の 途中 で あ る の か

After translating, the rightmost syntactic constituent is sometimes placed in a non-rightmost position in the translation result. ⇒ Reordering has occurred, and the translation may fail.


Solutions to Reordering

Predicted tags: in the next 18 minutes i 'm going to take [NP] → (wait) → i 'm going to take you on a journey

Translation result: 18 分 で あ る [NP] を 行 っ て い ま す → 皆さん を 旅 に お連れ します

When the rightmost constituent would be reordered into a non-rightmost position, cancel the output and wait for the next translation unit. ⇒ Correct result!


Experiments

  • Setup:

Domain: TED Talks [WIT3]
Languages: English → Japanese
Tokenization: Stanford Tokenizer (English), KyTea (Japanese)
Parsing: Ckylark [Oda+ 2015]
MT decoders: Moses (PBMT), Travatar (T2S)
Evaluation: BLEU, RIBES
Segmentation: n-words, GreedySeg [Oda+ 2014]

  • Compared methods:

Baselines:
PBMT: phrase-based MT (Moses)
T2S: tree-to-string MT (Travatar) without constituent prediction
Proposed:
T2S-tag: T2S (Travatar) with constituent prediction
T2S-wait: T2S (Travatar) with constituent prediction & waiting


Results (BLEU) (1)

[Graph: BLEU (0.07-0.15) vs. mean segment length in words (2-18) for PBMT, T2S, T2S-tag, and T2S-wait]


Results (BLEU) (2)

  • With few segmentations (right side of the graph), tree-to-string is higher
  • With many segmentations (left side of the graph), PBMT is higher: syntactic deficiencies occur due to segmentation

[Graph: same axes as above (BLEU vs. mean #words); the PBMT and T2S accuracies reverse as segments get shorter]


Results (BLEU) (3)

  • Filling in syntactic constituents (T2S-tag, T2S-wait) keeps translation accuracy even under settings with many segmentations: the effect of the additional constituents

[Graph: same axes (BLEU vs. mean #words); T2S-tag and T2S-wait keep their accuracy even with many segmentations]


3) Using Information from Simultaneous Interpreters [Shimizu et al., IWSLT 2014]


Simultaneous Interpretation Techniques

Simultaneous interpreters have a variety of techniques to reduce delay!

  • The salami technique [Jones 02]: split long sentences into shorter ones
    [Example: "last year I went to Japan" → 去年 ... 日本に行った, delivered as two short pieces]
  • Word choice: can reduce reordering in structurally different languages
    [Example: "A because B" is B だから A in translation (reordered), but an interpreter can keep the order: A なぜならば B]


Simultaneous Interpretation Data

  • Recorded data: TED Talks (English-Japanese)
    • Advantage: can compare with translations (subtitles)
  • Simultaneous interpreters: 3 professionals with varying years of experience, ranked S, A, and B
    • Advantage: can analyze differences between skill levels

Experience / Rank:
15 years: S rank
4 years: A rank
1 year: B rank


Using Simultaneous Interpretation Data

  • Approach: incorporate simultaneous interpretation data into training the MT system

[Diagram: traditionally [Paulik+ 09, Sridhar+ 13], the translation system is trained on translated data; in the proposed approach it is trained on interpreted data, producing interpretation-like results]


Incorporating Interpretation Data

Apply simultaneous interpretation data to three processes in MT training:

  • Tuning (Tu): tune the parameters of the translation system to match the interpretation data
  • Language model (LM), linear interpolation: match the style of simultaneous interpreters
  • Translation model (TM), fill-up [Bisazza+ 11]: like the LM, adapt to match the interpretation data
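For reference, LM linear interpolation takes the standard form (generic background, not a formula from the slides): the interpreter-style LM and the baseline LM are mixed with a weight λ tuned on held-out data,

```latex
P(w \mid h) = \lambda \, P_{\mathrm{interp}}(w \mid h)
            + (1 - \lambda) \, P_{\mathrm{base}}(w \mid h),
\qquad 0 \le \lambda \le 1
```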


Experimental Evaluation

[Graph: improvement in accuracy, measured at the phrase and sentence level]


4) Evaluating Simultaneous Speech Translation Systems [Mieno et al., 2015]


Problems with Evaluation

  • Difficult to make clear which of two systems is better

[Example, for the source もっと 手頃 な ホテル は あり ませ ん か:
Split finely: "more / reasonable / is there a hotel ?" (low delay, low accuracy)
Don't split: "do you have a more reasonable hotel ?" (high accuracy, high delay)]


Goal of Evaluation

  • Based on speed and accuracy, determine which system is better

[Figure: outputs with different accuracy/delay combinations must be placed on a single high-to-low scale]


How to Create an Evaluation Function? (Based on Data)

[Pipeline: movie data with various delays and accuracies → human evaluation → training data with delay/accuracy features → machine learning → evaluation function]


Data Format

  • Rank-based evaluation
    • Perform comparative evaluation of which output is "better"
    • Allows for consideration of both speed and accuracy

[Example: for one input video, evaluators rank the outputs of systems A-E, e.g. A: 4, B: 1, C: 3, D: 2, E: 5]


Learning an Evaluation Function

  • Define a linear function that takes a displayed video as input and returns a score: a weight vector over features useful in evaluation (i.e., delay and accuracy)
  • This function can be learned from the ranked data using learning to rank
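A minimal sketch of pairwise learning to rank with a perceptron-style update (illustrative only; the paper's exact learner and feature set may differ):

```python
import numpy as np

def learn_ranker(pairs, dim=2, epochs=100, lr=0.1):
    """Learn weights w so that w . x_better > w . x_worse for ranked pairs.

    pairs: list of (x_better, x_worse) feature vectors, e.g.
           x = [accuracy, delay] measured on one subtitled video.
    """
    w = np.zeros(dim)
    for _ in range(epochs):
        for better, worse in pairs:
            better, worse = np.asarray(better), np.asarray(worse)
            if w @ better <= w @ worse:        # ranked wrongly: nudge w
                w += lr * (better - worse)
    return w

# Hypothetical judgment: a video with BLEU 0.35 and 2 s delay was ranked
# above one with BLEU 0.32 and 9 s delay.
print(learn_ranker([([0.35, 2.0], [0.32, 9.0])]))
# -> positive weight on accuracy, negative weight on delay
```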


Experimental Setup

  • Target video: TED Talks, English → Japanese (① realtime translation is important; ② often used in MT evaluation)
  • Gathered data:

Video: 20 clips, 20-30 seconds each
Delay: 7 settings (0, 1, 2, 3, 5, 7, 10 seconds)
Subjects: 15 Japanese speakers
Method: ranking (1-3)

  • Translation data (5 varieties): translator, interpreter 1 (S rank), interpreter 2 (A rank), syntax-based MT, phrase-based MT


Learned Evaluation Functions

[Graphs: 5-level acceptability vs. delay (seconds), for speech output and subtitle output]

Learned weights:
Subtitle output: accuracy 1.40, delay −0.059
Speech output: accuracy 1.99, delay −0.018

Conclusion


Conclusion and Challenges

  • Segmentation: reordering-based and optimization-based methods
    • Next: joint consideration of lexical/acoustic features?
  • Prediction: unseen syntactic constituents
    • Next: rewording and correcting prediction mistakes?
  • Interpreter data: adapted translation models
    • Next: learning summarization, or other strategies?
  • Evaluation: made the accuracy/delay relationship clear
    • Next: more expressive models, features?