

SLIDE 1

http://www.naist.jp/

Infinite possibilities, the cutting edge is here -Outgrow your limits-

Toward Automatic Speech Interpretation

Nara Institute of Science and Technology Data Science Center, and Graduate School of Science and Technology

Satoshi Nakamura

with Katsuhito Sudo, Graham Neubig, Sakriani Sakti, Hiroki Tanaka, Katsuki Chosa, Do Quoc Truong

2019/06/15 CLI9 Keynote Satoshi Nakamura, NAIST


SLIDE 2

Speech-to-Speech Translation System

(Diagram) Multilingual speech recognition → spoken language translation → multilingual speech synthesis. Example: Japanese 「私は学校に行く」 (Watashi wa gakko ni iku) → English "I go to school".

SLIDE 3

Speech Translation and Text Translation

Speech Translation
- Translation of spoken language
- Must handle speech recognition errors
- Translates source-language speech into target-language speech (or text)
- Needs short latency for real-time human communication

Translation of Spoken Language
- The object is real-time communication and understanding
- Para-linguistic and non-linguistic information is necessary
- Context-dependent and non-syntactic utterances
- No punctuation, no upper/lower case


SLIDE 4

Technical Background around 2000

Corpus-based Approach
- Statistical modeling with large-scale training data

Machine Translation
- Rule-based: linguists created translation rules
- Corpus-based:
  • Example-based: automatic extraction of translation rules [M. Nagao 1984, etc.]
  • Statistical MT: extract rules statistically based on the noisy channel model [P. F. Brown et al., 1993]


SLIDE 5

Contents

1. History of Automatic Speech Translation Research
2. Automatic Speech Interpretation Technologies
3. Current Project and Data Collection
4. Summary and Future Works


SLIDE 6

Speech Translation Projects

Japan
- ATR Speech-to-Speech Translation (1986-2008)
- NICT Speech-to-Speech Translation (2008-2011, 2014-2020)

EU
- Verbmobil (1993-2000)
- Nespole (2001-2003)
- TC-STAR (2004-2006)
- EU-Bridge (2012-2014)

US
- DARPA TransTac, Communicator (2006-2010)
- DARPA GALE (2006-2010)
- DARPA BOLT (2011-2015)

International
- C-STAR Consortium (1991-2003)
- IWSLT (2004-)
- A-STAR Consortium (2006-2008)
- U-STAR Consortium (2009-)


SLIDE 7

History of Speech Translation Research in Japan

(Timeline diagram, 1986-2011)
- From 1986 (ATR): fundamentals with read speech: syntactically correct, clear utterances, limited domain (e.g., "conference registration"); hand-made, rule-based technology
- From 1992 (ATR): daily conversation: standard expressions, unclear utterances, limited domain (e.g., "hotel reservation")
- From 1999 (ATR): wider and real domains ("international travel"): realistic expressions, noisy speech, J-E and J-C speech translation; corpus-based technology (large-scale corpora + machine learning)
- C-STAR: multilateral translation for 7 world languages; IWSLT: evaluation campaign of S2S technologies
- From 2008 (NICT, A-STAR): more languages for translation: multilateral translation for 8 Asian languages, network-based S2ST; 2010: 21-language multilateral text translation
- 2011: VoiceTra; NAIST

SLIDE 8

Mechanism of Speech Translation System

(Diagram) Example: Japanese 「私は学校へ行く」 (Watashi wa gakko e iku) → English "I go to school".
- Multilingual speech recognition: convert the phoneme sequence "w a t a sh i w a g a k k o ..." into a word sequence using a lexicon and grammar, trained on large-scale Japanese speech corpora.
- Spoken language translation: convert the Japanese word sequence into English words with a dictionary (「私は」 ⇒ "I", 「学校に」 ⇒ "to school", 「行く」 ⇒ "go"), then reorder "I to school go" into "I go to school" according to English grammar, using large-scale Japanese-English parallel corpora and text corpora.
- Multilingual speech synthesis: select appropriate waveforms for the English text from large-scale English speech corpora.

SLIDE 9

Phrase Based Machine Translation

Divide the sentence into small phrases and translate


Today | I will give | a lecture on | machine translation | .
今日は、 | を行います | の講義 | 機械翻訳 | 。

今日は、機械翻訳の講義を行います。
(kyo wa kikaihonyaku no kogi wo okonaimasu)

 Score translations with a translation model (TM), a reordering model (RM), and a language model (LM)
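As a toy illustration of this model-combination scoring, here is a minimal sketch in Python. All phrase pairs, probabilities, and weights below are invented for the example; a real decoder estimates them from corpora, adds the reordering model, and searches over segmentations rather than scoring one fixed hypothesis.

```python
import math

# Hypothetical toy model tables (real TM/LM probabilities are estimated
# from large parallel and monolingual corpora).
TM = {("today", "今日は、"): 0.7, ("i will give", "を行います"): 0.5,
      ("a lecture on", "の講義"): 0.6, ("machine translation", "機械翻訳"): 0.8,
      (".", "。"): 0.9}
LM = {("<s>", "今日は、"): 0.6, ("今日は、", "機械翻訳"): 0.4,
      ("機械翻訳", "の講義"): 0.5, ("の講義", "を行います"): 0.5,
      ("を行います", "。"): 0.7}

def score(phrase_pairs, w_tm=1.0, w_lm=1.0):
    """Weighted sum of log TM and phrase-bigram LM probabilities
    (the reordering model is omitted in this sketch)."""
    total, prev = 0.0, "<s>"
    for src, tgt in phrase_pairs:
        total += w_tm * math.log(TM.get((src, tgt), 1e-6))  # adequacy
        total += w_lm * math.log(LM.get((prev, tgt), 1e-6))  # fluency
        prev = tgt
    return total

good = [("today", "今日は、"), ("machine translation", "機械翻訳"),
        ("a lecture on", "の講義"), ("i will give", "を行います"), (".", "。")]
bad = [("today", "機械翻訳")]
print(score(good) > score(bad))  # the well-formed hypothesis scores higher
```

The decoder's job is then to find the segmentation and ordering that maximize this score.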

9

SLIDE 10

Translation Model Creation

 Perform automatic alignment of parallel text
 Extract phrases from the aligned text for translation


the hotel front desk ↔ ホテル (hoteru) の (no) 受付 (uketsuke)

Extracted phrases:
- ホテル の (hoteru no) → hotel / the hotel
- 受付 (uketsuke) → front desk
- ホテルの受付 → hotel front desk / the hotel front desk
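The consistency criterion behind this extraction can be sketched as follows. The example sentence and alignment follow the slide; note that this minimal version does not attach unaligned words such as "the", so it recovers "hotel front desk" but not the "the hotel ..." variants.

```python
def extract_phrases(n_src, n_tgt, align, max_len=4):
    """Enumerate phrase pairs consistent with the word alignment:
    no alignment link may cross the phrase-pair box."""
    pairs = set()
    for i1 in range(n_src):
        for i2 in range(i1, min(i1 + max_len, n_src)):
            # Target positions linked to the source span [i1, i2].
            ts = [t for s, t in align if i1 <= s <= i2]
            if not ts:
                continue
            j1, j2 = min(ts), max(ts)
            # Consistency: no link from outside the source span into [j1, j2].
            if any(j1 <= t <= j2 and not (i1 <= s <= i2) for s, t in align):
                continue
            pairs.add(((i1, i2), (j1, j2)))
    return pairs

# ホテル(0) の(1) 受付(2)  ↔  the(0) hotel(1) front(2) desk(3)
align = {(0, 1), (2, 2), (2, 3)}
phrases = extract_phrases(3, 4, align)
```

Here `phrases` contains span pairs such as ((2, 2), (2, 3)), i.e. 受付 → "front desk", and ((0, 2), (1, 3)), i.e. ホテルの受付 → "hotel front desk".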

SLIDE 11

Statistical MT


  • Components: translation model, reordering model, language model

(Diagram) Parameter estimation from a source-target parallel text corpus yields the translation model (phrase substitution) and the reordering model; parameter estimation from a target-language text corpus yields the language model (grammatical correctness). Decoding combines all three to turn input text in the source language into translated text in the target language.

SLIDE 12

Parallel Corpus

Japanese: “mado wo aketemo iidesuka”


English:
1. may i open the window
2. ok if i open the window
3. can i open the window
4. could we crack the window
5. is it okay if i open the window
6. would you mind if i opened the window
7. is it okay to open the window
8. do you mind if i open the window
9. would it be all right to open the window
10. i'd like to open the window

(Diagram) Parallel across Japanese, English, Chinese, Korean, and new languages.

SLIDE 13

ATR BTEC Corpus
(Spoken Language Communication Research Laboratories)

Domain distribution (share of corpus, number of sub-topics in parentheses):
- Basic: 12.2% (7): greet someone, ask a question, state one's purpose
- Trouble: 12.1% (20): luggage, emergency, medicine, assistance
- Shopping: 10.0% (13): buy something, gather information, price, wrapping
- Move: 8.4% (8): transportation, buy a ticket, rental car, trouble
- Stay: 8.2% (11): make/change a reservation, check-in, trouble
- Sightseeing: 7.7% (11)
- Restaurant: 7.3% (11)
- Communication: 6.4% (6)
- Airport: 5.5% (14)
- Business: 5.3% (26)
- Contact: 4.0% (6)
- Airplane: 3.6% (11)
- Homestay: 2.3% (11)
- Study Overseas: 1.6% (14)
- Drink: 1.3% (4)
- Exchange: 1.2% (5)
- Snack: 1.2% (4)
- Beauty: 0.8% (5)
- Go Home: 0.6% (4)
- Research: 0.1% (12)

SLIDE 14

Mechanism of Speech Translation System

(Slide 8 repeated.)

SLIDE 15

Speech and Language Corpus for ASR


Language | Acoustic model | Language model
Japanese | 4,200 speakers (271 hrs) | 852k sentences
English | 532 speakers (202 hrs): US, BRT, AUS | 710k sentences
Chinese | 536 speakers (249 hrs): Beijing, Shanghai, Canton, Taiwan | 510k sentences


SLIDE 16

Speech to Speech Translation

- "VoiceTra", network-based speech translation, released in July 2010
- 21 language pairs for text I/O; 6 language pairs for speech I/O
- 800k downloads and 4M accesses worldwide as of March 2011

Languages: Japanese, English, Mandarin, Taiwanese Mandarin, German, French, Dutch, Danish, Italian, Spanish, Portuguese, Brazilian Portuguese, Russian, Arabic, Hindi, Indonesian, Malay, Thai, Tagalog, Vietnamese, Korean

※ Languages shown in red on the slide can be input/output by voice.
※ There is no text input support for Hindi or Vietnamese.

VoiceTra (screenshots: top screen, voice input screen, translation result screen)

"Shabette Hon'yaku" 「しゃべって翻訳」 (NTT Docomo, Japanese-English), launched in November 2007: the first network-based S2ST service.

SLIDE 17

Performance Improvements

(Chart) Subjective evaluation (% of utterances rated A/B/C; A good, B fair, C acceptable, D nonsense, NIL no output) and word error rate for J-E and J-C, plotted against the number of utterances used for adaptation. Quality improves from the initial nationwide models, to models with added named entities and expressions, to models updated with real user data.

SLIDE 18

Basic Travel Expression Corpus: Parallel Sentences

(Diagram) BTEC parallel sentences across Japanese, English, Chinese, Korean, and new languages.

SLIDE 19

Standardization Image

(Diagram) Server A (e.g., Japan) and Server B (e.g., Thailand) exchange data (ASR results, MT results, etc.) via HTTP in an XML format. Each server hosts its own processing modules, user interface, parallel corpus, speech data, and lexicon; the corpus format, lexicon, and user interface are the targets of standardization.

SLIDE 20

Standardization at ITU-T SG16

 Activity started on standardization of network-based S2ST at ITU-T SG16
 Session period: October 2009 to March 2010
 NICT is the editor for S2ST standardization at ITU-T SG16, WP2 Q21/22
 Not only language conversion but also potentially added modules such as sign language are taken into account: S2ST -> modality conversion

Document | Title | Scope
F.745 | Functional Requirements for Network-based S2ST | definition of network-based S2ST; functions and service requirements of network-based S2ST
H.625 | Architectural Requirements for Network-based S2ST | requirements of the S2ST architecture; definition of interfaces for network-based S2ST


SLIDE 21

Research Topics at NAIST

(Diagram) Research topics: speech translation and machine translation, multilingual speech recognition, emotion and environment recognition, spoken dialog systems (example exchange: "Which lab do you recommend?" / "Nakamura-lab is best!"), multimodal concept learning, knowledge acquisition and QA systems, brain measurement, persona modeling, affective computing, natural language processing, deep neural networks, and big data analytics (NAIST Data Science Center). These fundamental technologies are integrated into augmented human-communication systems.

SLIDE 22

Recent Progress of ASR after 2000

Traditional Technologies
- Template matching, dynamic programming [Sakoe 71]
- Hidden Markov models, n-gram models [Mercer 83, etc.]
- Neural networks: TDNN [Waibel 89], LSTM [Hochreiter 97]
- Weighted finite-state transducers [Mohri 2006]
- Big training data; data collection through trial services

Deep Learning (Hinton visited MSR)
- DNN-HMM [Hinton 2012]: estimate state posterior probabilities with a DNN
- Connectionist Temporal Classification [Graves 2013]: predict a phoneme label every frame
- Listen, Attend and Spell [Chan 2016]: CTC + attention, end-to-end modeling


SLIDE 23

Recent Speech Synthesis

Traditional Technologies
- Formant-based synthesis, waveform concatenation
- Statistical speech synthesis (HTS): speech synthesis by HMM
  - Tokuda et al., "Speech parameter generation algorithms for HMM-based speech synthesis", ICASSP 2000

Deep Learning
- WaveNet: waveform convolution
  - van den Oord et al., "WaveNet: A Generative Model for Raw Audio", arXiv:1609.03499, 2016
- Tacotron: end-to-end speech synthesis from character input; waveform generation by Griffin-Lim
  - Wang et al., "Tacotron: Towards End-to-End Speech Synthesis", arXiv:1703.10135, 2017
- Tacotron 2: Tacotron + WaveNet


SLIDE 24

Recent MT progress

Traditional Technologies
- Rule-based MT: linguists generate translation rules
- Corpus-based MT:
  - Example-based: automatic rule extraction from a corpus [M. Nagao 84, Sato et al. 89, Sumita et al. 91]
  - Statistical MT: statistical modeling of translation; model parameters extracted from a corpus, translation based on the noisy channel model [P. F. Brown et al. 93]
  - Phrase-based SMT
  - Tree-to-string: statistical MT based on tree structure

Deep Learning
- Neural machine translation [2014]: encoder and decoder combined via LSTM
- Attentional NMT [2015]: attention added between encoder and decoder
- Self-attention NMT [2017]: self-attention with multiple heads; the Transformer


SLIDE 25

Contents

1. History of Automatic Speech Translation Research
2. Automatic Speech Interpretation Technologies
3. Speech Translation with Para-linguistic Information
4. Current Project and Data Collection
5. Summary and Future Works


SLIDE 26

Communication with Translation

(Diagram) Source-language input (text, speech, video, gesture) passes through ASR (speech ⇒ text) or image recognition (image ⇒ text), then MT conversion with dialog control, then TTS (text ⇒ speech) or image synthesis, producing target-language output (text, speech, video, gesture). The process should be real-time, incremental, and end-to-end. Besides linguistic information, paralinguistic information (emotion, style, personality, prosody, gesture) and discourse context (domain knowledge, ontology) must be carried across. Example: speech "to o kyo e i ku" → MT result /I/go/to/Tokyo/ → TTS result "ai go tu tokyo". Two requirements for communication: (1) simultaneity, incrementality, low latency; (2) para/non-linguistic information.

SLIDE 27

Human Interpreting [A.Mizuno 2016]


E-J interpretation example:
(1) The relief workers (2) say (3) they don't have (4) enough food, water, shelter, and medical supplies (5) to deal with (6) the gigantic wave of refugees (7) who are ransacking the countryside (8) in search of the basics (9) to stay alive.

Written-translation style (chunk order 1-9-8-7-6-5-4-3-2; the interpreter must hold more than 3 chunks in memory):
(1) 救援担当者は (9) 生きるための (8) 食料を求めて (7) 村を荒らし回っている (6) 大量の難民達の (5) 世話をするための (4) 十分な食料や水,宿泊施設,医療品が (3) 無いと (2) 言っています.

Simultaneous style (fewer than 3 chunks held in memory at a time):
(1) 救援担当者達の (2) 話では (4) 食料,水,宿泊施設,医薬品が, (3) 足りず (6) 大量の難民達の (5) 世話が出来ないとのことです. (7) 難民達は今村々を荒らし回って, (9) 生きるための (8) 食料を求めているのです.

SLIDE 28

Problem: Delay (Ear-Voice Span)


(Diagram) ASR: 「こんにちは、駅はどこですか?」 (konnichiwa, eki wa doko desu ka) → MT: "Hello, where is the station?" → TTS. Translation waits for the end of the utterance, causing delay (ear-voice span).

SLIDE 29

Simultaneous Incremental Speech Interpretation

(Diagram) The input is processed chunk by chunk:
- ASR 「こんにちは、」 (konnichiwa) → MT "Hello," → TTS
- ASR 「駅は」 (eki wa) → MT "the station" → TTS
- ASR 「どこですか?」 (doko desu ka) → MT "where is it?" → TTS

Delay: reduced. But this is not easy!

SLIDE 30

Can We Do the Same in Automatic Speech Interpretation?

Four problems:
 Segmentation: when do we start interpretation?
 Prediction: can we predict things that haven't been said?
 Rewording: can we reword sentences to be conducive to simultaneous interpretation?
 Evaluation: how do we decide which results are better?

SLIDE 31

Re-ordering

Crucial for translation accuracy:

Normal phrase-based translation: こんにちは 駅 は どこ ですか → "Hello, where is the station"
Translation with early timing: こんにちは 駅 は どこ ですか → "Hello, the station where is it"

SLIDE 32

Lexicalized Reordering Model

 Probabilistically models reordering for increased translation accuracy
 Given the current phrase and the next phrase, the orientation is classified:
  - Monotone: 背 の 高い 男 → the tall man
  - Swap: 太郎 を 訪問 した → visited Taro
  - Discontinuous right / discontinuous left: 私 は 太郎 を 訪問した → I visited Taro; 背 の 高い 男 を 訪問 した → visited the tall man
 "monotone" + "discontinuous right" = "right probability"
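A minimal sketch of how such a "right probability" could be computed from orientation counts. The counts below are invented for illustration; real models collect them per phrase pair from word-aligned training data.

```python
from collections import Counter

def orientation(prev_span, cur_span):
    """Orientation of the current phrase's target span relative to the
    previous phrase's target span (spans are inclusive index pairs)."""
    (p1, p2), (c1, c2) = prev_span, cur_span
    if c1 == p2 + 1:
        return "monotone"        # continues immediately to the right
    if c2 == p1 - 1:
        return "swap"            # placed immediately to the left
    return "disc-right" if c1 > p2 else "disc-left"

def right_probability(counts):
    """'monotone' + 'discontinuous right' mass: how likely the next
    phrase continues rightward."""
    return (counts["monotone"] + counts["disc-right"]) / sum(counts.values())

# Hypothetical orientation counts observed for one phrase:
counts = Counter({"monotone": 70, "swap": 10, "disc-right": 20, "disc-left": 0})
print(right_probability(counts))  # 0.9
```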

SLIDE 33

Adjusting Timing with Reordering Probabilities, 2012

First, temporarily choose strings according to method one. Next, if that phrase's right probability exceeds a threshold, actually translate the words in the cache.


Example (threshold = 0.8), input "hello where is the station":
1. "hello" phrase exists → wait
2. "hello where" phrase missing → choose "hello" → right probability 0.9 > 0.8 → translate "hello"
3. "where is" phrase exists → wait
4. "where is the" phrase missing → choose "where is" → right probability 0.6 < 0.8 → do not translate yet
5. "the station": utterance ends → translate "where is the station"

Threshold 1.0 = traditional; 0.0 = method one.

Fujita et al., 2013
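The trace above can be turned into a small sketch. The phrase table and right probabilities below are hypothetical, chosen to reproduce the slide's example; Fujita et al.'s system works over a full SMT phrase table rather than a toy dictionary.

```python
# Hypothetical phrase table: phrase -> (translation, right probability)
PT = {"hello": ("こんにちは、", 0.9),
      "where is": ("どこですか", 0.6),
      "where is the station": ("駅はどこですか", 0.9)}

def incremental_translate(tokens, threshold=0.8):
    out, cache = [], []
    for tok in tokens:
        if " ".join(cache + [tok]) in PT:
            cache.append(tok)            # a longer phrase exists: keep waiting
            continue
        phrase = " ".join(cache)
        if phrase in PT and PT[phrase][1] > threshold:
            out.append(PT[phrase][0])    # confident the next phrase goes right
            cache = [tok]
        else:
            cache.append(tok)            # low right probability: wait for more
    if cache and " ".join(cache) in PT:
        out.append(PT[" ".join(cache)][0])  # utterance ended: flush the cache
    return out

print(incremental_translate("hello where is the station".split()))
```

With threshold 0.8 this emits 「こんにちは、」 as soon as "where" arrives, then holds "where is" until the utterance ends, matching the trace above.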

SLIDE 34

Comparison Across Settings

(Chart) Accuracy (BLEU) vs. delay (seconds) for en-ja, ja-en, ja-en (11+ words), and fr-en, on news and travel data, sweeping the threshold t from 0.0 to 1.0. Delay decreases in all settings; the delay/accuracy tradeoff is better for long sentences and for similar language pairs.

SLIDE 35

Experiments (IWSLT2013)

Contents: TED Talks (English ⇒ Japanese)
- Translation (captions) vs. interpretation
- Human interpreters: three professionals with different skill levels

Skill rank | Years of interpreter experience
S | 15 years
A | 4 years
B | 1 year

SLIDE 36

SS2S vs. Human Interpreter Results on TED Talks

(Chart) RIBES vs. delay (seconds) on TED Talks, comparing the system (LM+Tu, with by-phrase and by-sentence output) against human interpreters of A rank (4 years' experience) and B rank (1 year). The system performs roughly on par with a B-rank interpreter with 1 year of experience.

SLIDE 37

Translation Timing Control by Syntactic Prediction, 2015

Syntactic Prediction
- Incremental bottom-up parsing
- Feature extraction and syntactic prediction

Wait to emit MT output when specific labels appear
- Control MT output timing according to the predicted reordering

Oda, Yusuke et al., "Syntax-based Simultaneous Translation through Prediction of Unseen Syntactic Constituents", Proc. of ACL-IJCNLP 2015.

Example of incremental parsing and syntactic prediction: "in the next 18 minutes i'm going to take [NP]" → (waiting) → "i'm going to take you on a journey". MT results: 「18分である[NP]を行っています」 → 「皆さんを旅にお連れします」

SLIDE 38

Sample 1, 2015

Conventional automatic speech interpretation, with delay to wait for the end of speech (HirofumiSeo-trad.mp4)

SLIDE 39

Sample 2, 2015

Actual interpreter (HirofumiSeo-interpreter.mp4)

SLIDE 40

Sample 3, 2015

Proposed automatic speech interpretation (HirofumiSeo-simul.mp4)

SLIDE 41

Statistical Translation Frameworks

Symbolic models:
- Phrase-based MT [Koehn+ 03]: "he has a cold" ↔ 「彼 は 風邪 を 引いている」, segmented into phrase pairs ("he" ↔ 彼 は, "has" ↔ 引いている, "a cold" ↔ 風邪 を)
- Tree-to-string MT [Liu+ 06]: translation guided by the source-side parse tree (S → NP VP; PRP VBZ DET NN)

Continuous-space (neural) models:
- Encoder-decoder [Sutskever+ 14]: read "he has a cold", then generate 「彼 は 風邪 を 引いて いる」 token by token
- Attentional model [Bahdanau+ 15]: each output token attends over the encoder states g1,...,g4 with weights a1,...,a4 to compute P(ei|F, e1,...,ei-1)

SLIDE 42

Encoder-decoder Model

- Memorize the input sentence with an LSTM recurrent neural network (encoder)
- Generate the output sentence with an LSTM recurrent neural network (decoder)

(Diagram) これ (kore) は (wa) 機械 (kikai) 翻訳 (honnyaku) です (desu) → vector representation → "This is a machine translation". The encoder memorizes the input sentence; the decoder generates the translated sentence while looking back at that memory.
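The memorize/generate structure can be sketched with plain NumPy. The weights here are random, so the "translation" is meaningless; a trained model learns them from parallel text. The point is only the two-phase shape: fold the source into one vector, then generate from it. (A real model would use LSTM cells rather than this simple tanh recurrence.)

```python
import numpy as np

rng = np.random.default_rng(0)
V_SRC, V_TGT, H = 6, 6, 8            # toy vocabulary sizes, hidden size

# Randomly initialized parameters (training would fit these to parallel text).
E_src = rng.normal(size=(V_SRC, H))  # source embeddings
E_tgt = rng.normal(size=(V_TGT, H))  # target embeddings
W_enc = 0.1 * rng.normal(size=(H, H))
W_dec = 0.1 * rng.normal(size=(H, H))
W_out = rng.normal(size=(H, V_TGT))  # hidden state -> target vocab logits

def encode(src_ids):
    """Fold the whole source sentence into a single hidden vector."""
    h = np.zeros(H)
    for i in src_ids:
        h = np.tanh(E_src[i] + W_enc @ h)
    return h

def decode(h, max_len=5):
    """Generate target ids one by one, starting from the encoder's state."""
    out, y = [], 0                    # token 0 plays the role of <s>
    for _ in range(max_len):
        h = np.tanh(E_tgt[y] + W_dec @ h)
        y = int(np.argmax(h @ W_out))
        out.append(y)
    return out

hyp = decode(encode([1, 2, 3, 4, 5]))
```

Squeezing an arbitrarily long sentence into one fixed vector is the bottleneck that the attention mechanism on the next slide addresses.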

SLIDE 43

Attention Mechanism

Better memorization of the sentence, plus a looking-back mechanism
- The decoder uses a weighted sum of encoder states, with weights given by the attention

(Diagram) これ (kore) は (wa) 機械 (kikai) 翻訳 (honnyaku) です (desu) → vector representations → "This is a machine translation", with attention linking each output word back to input positions.
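The weighted sum itself is only a few lines. Dot-product scoring is shown here as one simple choice; Bahdanau-style attention uses a small feed-forward network for the scores instead. The encoder states and decoder query below are toy values for illustration.

```python
import numpy as np

def attend(query, enc_states):
    """Score each encoder state against the decoder state, softmax the
    scores into weights, and return the weighted sum (the context vector)."""
    scores = enc_states @ query              # one score per source position
    weights = np.exp(scores - scores.max())  # numerically stable softmax
    weights /= weights.sum()
    return weights, weights @ enc_states

# One toy encoder state per source word (e.g., これ/は/機械翻訳/です):
enc_states = np.array([[1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [0.0, 0.0]])
query = np.array([2.0, 0.0])                 # current decoder state
weights, context = attend(query, enc_states)
```

Source positions whose states align with the decoder state receive higher weights, so the decoder "looks back" at the relevant input words at each output step.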

SLIDE 44

Results

(Neubig et al., WAT 2015)

(Charts) BLEU and RIBES for en-ja, ja-en, zh-ja, and ja-zh, base system vs. neural reranking, with gains between +1.4 and +2.8 points: this confirms what we knew, that neural reranking helps automatic evaluation. Human evaluation scores (+12.5, +23.7, +10.0, +4.2) show what we didn't know: it also helps manual evaluation.

SLIDE 45

Wait-k Algorithm

(Diagram) Source: ブッシュ (Bush) 大統領 (daitoryo) は (wa) プーチン (puchin) と (to) 会談 (kaidan) する (suru); target: "President Bush meets with Putin".
- Conventional method: wait until the source sentence ends, then translate (long delay).
- Proposed method: wait k tokens, then emit one target token per source token read; latency is controllable, and the model must predict content (e.g., the verb "meets") before it is heard.

Mingbo Ma et al., "STACL: Simultaneous Translation with Integrated Anticipation and Controllable Latency", arXiv:1810.08398, 2018.
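A schedule-level sketch of wait-k decoding. The `oracle` stand-in below simply replays a fixed reference so the read/write schedule is visible; in STACL a trained NMT model produces each target token from the source prefix, anticipating content such as the verb before the source verb arrives.

```python
def wait_k(src_tokens, translate_step, k=2):
    """Read k source tokens, then alternate read-one / write-one;
    once the source ends, let the model finish the sentence."""
    out = []
    for t in range(len(src_tokens)):
        if t >= k - 1:                       # k tokens read: write one token
            tok = translate_step(src_tokens[:t + 1], out)
            if tok is not None:
                out.append(tok)
    while (tok := translate_step(src_tokens, out)) is not None:
        out.append(tok)                      # tail of the translation
    return out

# Hypothetical stand-in for the NMT model: replays a fixed reference.
REF = ["President", "Bush", "meets", "with", "Putin"]
def oracle(src_prefix, out_so_far):
    return REF[len(out_so_far)] if len(out_so_far) < len(REF) else None

src = "ブッシュ 大統領 は プーチン と 会談 する".split()
hyp = wait_k(src, oracle, k=2)
```

With k=2, "President" is emitted after only two source tokens have been read; larger k trades latency for more source context.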

SLIDE 46

Contents

1. History of Automatic Speech Translation Research
2. Automatic Speech Interpretation Technologies
3. Current Project and Data Collection
4. Summary and Future Works


SLIDE 47

JSPS Next Generation Speech Interpretation Research Project

Objectives
- Incremental automatic speech interpretation algorithm
- Corpus collection
- Evaluation measures

Duration: 2017-2021 (5 years)

Members:
- Leader: Satoshi Nakamura (NAIST)
- Acoustic signal processing: Hiroshi Saruwatari (U. Tokyo)
- Speech recognition: Sakriani Sakti (NAIST), Tatsuya Kawahara (Kyoto U.)
- Machine translation: Katsuhito Sudo, Yuji Matsumoto (NAIST)
- Speech synthesis: Tomoki Toda (Nagoya U.), Shinnosuke Takamichi (U. Tokyo), Sakriani Sakti (NAIST)
- Audio-visual translation: Shigeo Morishima (Waseda U.)
- Cognitive load measurement: Hiroki Tanaka (NAIST)
- Corpus collection: Katsuhito Sudo, Manami Matsuda (NAIST)


SLIDE 48

Project Overview

(Diagram)
- Task 1: incremental speech interpretation algorithm: noise reduction (noise, reverberation), incremental ASR, incremental MT, incremental TTS, caption generation
- Task 2: paralinguistic speech translation: extraction of paralinguistics, paralinguistic MT, paralinguistic TTS
- Task 3: video MT: face modeling, speaking-face MT, speaking-face conversion
- Task 4: real-time cognitive load measurement by human sensing: 2x 32-ch EEG, gaze, heart rate
- Task 5: corpus collection and prototyping: collect 400 hours of Japanese and English speech interpretation data; build a prototype of the incremental speech interpretation system

SLIDE 49

NAIST Interpreter Corpus

2012-2016
- Source speech: MP4 (TED), MP3 (CNN), PCM
- Interpreter speech: 24-bit 48 kHz PCM
- Skill: S (10+ years), A (3+ years), B
- Some data includes speech from multiple interpreters

Direction | Domain | Source #files | Source #hours | Interpreter #files | Interpreter #hours
E->J | TED | 74 | 15.2 | 58 | 12.3
E->J | CNN | 13 | 0.731 | 7 | 0.389
E->J | total | 87 | 15.9 | 65 | 12.7
J->E | TED | 60 | 11.9 | 60 | 11.9
J->E | CSJ | 31 | 5.51 | 31 | 5.51
J->E | NHK | 10 | 0.304 | 10 | 0.304
J->E | total | 101 | 17.7 | 101 | 17.7

SLIDE 50

NAIST Interpreter Corpus 2018

As of 2018
- Source speech: MP4 (TED, TEDx), PCM (CSJ)
- Interpreter speech: 16-bit 16 kHz PCM
- Skill: S (10+ years), A (3+ years), B
- Training set: 100 hours in total, by rank-A interpreters
- Test set: 24 hours in total, by one interpreter from each rank

Direction | Domain | Source #files | Source #hours | Interpreter #files | Interpreter #hours
E->J | TED | 302 | 66.8 | 302 | 66.8
E->J | TED (test) | 16 | 4 | 16 | 4
E->J | total | 318 | 70.8 | 318 | 70.8
J->E | CSJ | 146 | 33 | 146 | 33
J->E | TEDx (test) | 19 | 4 | 19 | 4
J->E | total | 165 | 37 | 165 | 37

SLIDE 51

Book (Japanese version)


SLIDE 52

Contents

1. History of Automatic Speech Translation Research
2. Automatic Speech Interpretation Technologies
3. Current Project and Data Collection
4. Summary and Future Works


SLIDE 53

Summary

Remarkable progress
- Statistical machine translation
- Deep neural networks
- Progress in speech translation

Automatic speech interpretation
- Data collection
- Algorithms for both automatic speech interpretation and interpreter support systems

Further research
- Para-linguistics / multi-modality
- Context / situation dependency
- Common sense and domain knowledge
- Semantics and discourse analysis
- Towards better communication


SLIDE 54

SLIDE 55

Communication with Translation

(Slide 26 repeated.)

SLIDE 56

Research Focus Up to Now

Emphasis Speech Translation

- Translates speech while preserving emphasis information

(Diagram) English "It is hot today" → ASR → MT → TTS → Japanese 「今日は熱いです」, with ES (source emphasis information) and ET (target emphasis information) alongside the text pipeline.

(1) Emphasis estimation (ES): estimate emphasis information given the speech and its corresponding word sequence.
(2) Emphasis translation (ET): translate the estimated emphasis information into the other language.

SLIDE 57

Speech Translation Samples

English-Japanese emphasis translation (audio samples through the ASR → MT → TTS pipeline): for each of English and Japanese, natural speech, a baseline, ET (CRF), ET (CRF) + pause, and ET (LSTM) variants.