Error Analysis Applied to End-to-End Spoken Language Understanding

Antoine Caubrière, Sahar Ghannay, Natalia Tomashenko, Renato De Mori, Antoine Laurent, Emmanuel Morin, Yannick Estève
ICASSP, May 2020

Introduction
Context
  Analysis of our End-to-End (E2E) Spoken Language Understanding (SLU) system
  This system reaches state-of-the-art performance on a French SLU task
Goal
  Analyze the errors produced by the system
  Understand the weaknesses of this E2E system
  From these weaknesses, discover how to improve our approach
A. Caubrière et al. ICASSP 2020

Analysed system
Deep Speech 2 (DS2) [Amodei et al.] (2016)
  End-to-end speech recognition system
  Connectionist Temporal Classification (CTC)
  Lets the system learn the alignment between the speech and the output sequence to produce
End-to-End Spoken Language Understanding (SLU) [Ghannay et al.] (2018)
  Injection of tag boundaries into the output sequence
  ASR: The sculptor Caesar died yesterday in Paris at the age of seventy-seven years
  NER: The sculptor <pers Caesar > died <time yesterday > in <loc Paris > at the age of <amount seventy-seven years >
Curriculum-based transfer learning (CTL) [Caubrière et al.] (2019)
  Train the same model through a sequence of training processes and transfer learning
  Keep all parameters except the top layer
  Use different tasks sorted from the most generic to the most specific
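The output format on this slide can be illustrated with a minimal sketch. It assumes a greedy CTC decoding step (merge repeated symbols, then drop blanks) and the bracket notation shown in the NER example. The blank symbol `_`, the regular expression, and the function names are hypothetical choices for illustration, not the authors' implementation.

```python
import re
from itertools import groupby

BLANK = "_"  # hypothetical symbol standing for the CTC blank

def ctc_collapse(path):
    """Greedy CTC post-processing: merge repeated symbols, then drop blanks."""
    merged = [symbol for symbol, _ in groupby(path)]
    return "".join(symbol for symbol in merged if symbol != BLANK)

# An opening tag such as "<pers", the tagged words, then a closing ">".
TAG_PATTERN = re.compile(r"<(\S+)\s+([^<>]*?)\s*>")

def extract_concepts(tagged):
    """Return (tag, value) pairs from an output in the slide's bracket format."""
    return TAG_PATTERN.findall(tagged)

def strip_tags(tagged):
    """Recover the plain word sequence by removing the tag boundaries."""
    words_only = TAG_PATTERN.sub(lambda match: match.group(2), tagged)
    return " ".join(words_only.split())

if __name__ == "__main__":
    ner = ("The sculptor <pers Caesar > died <time yesterday > in <loc Paris > "
           "at the age of <amount seventy-seven years >")
    print(ctc_collapse("hh_e_ll_l_oo"))
    print(extract_concepts(ner))
    print(strip_tags(ner))
```

Because the tags are part of the character sequence, one CTC-trained model can emit both the words and the concept boundaries; stripping the tags gives back the plain ASR hypothesis.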

Learned tasks
Automatic Speech Recognition (ASR)
Named Entity Recognition (NER)
  Annotation according to 8 entity types (pers, loc, amount, etc.)
Merged semantic concepts extraction (SC_mer)
  MEDIA: French hotel booking task
  PORTMEDIA: French theater ticket booking task
  Annotation according to 76 semantic concepts (location-town, stay-nbNight, nb-reservation, etc.)
Semantic concepts extraction on MEDIA (M)
  Our target task
Order of learned tasks
  We order the tasks from the most generic to the most specific: Speech, then Named Entities, then Semantic Concepts

Data
French data sets
  We use as much of the data at our disposal as possible
  Broadcast news, telephone and human-human dialogue
  Speech: ~360h
  NE: ~300h
  SC: ~25h

CTL approach
[Figure: the same network (CNN, bLSTM, FC, softmax, from bottom to top) is trained on a chain of tasks going from generic concepts to specific concepts. At each transfer the lower layers are kept and the top of the network is reset.]
  ASR: character sequence
  NER: character sequence & named entity
  SC_mer: character sequence & merged semantic concepts
  M: character sequence & target semantic concepts
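The keep/reset chain can be sketched as follows. This is only an illustration under stated assumptions: layers are modelled as plain lists of numbers, `softmax` is treated as the single top layer that is reset, the output sizes are made-up placeholders, and `train_step` is a hypothetical callback standing in for a real training run.

```python
import random

# Layer names follow the slide's stack, bottom to top. Treating "softmax" as
# the single top layer that gets reset is an assumption of this sketch.
LAYERS = ("cnn", "blstm", "fc", "softmax")
TOP_LAYER = "softmax"

# Curriculum from most generic to most specific. The output sizes are
# made-up placeholders, not the real output vocabulary sizes.
CURRICULUM = (("ASR", 40), ("NER", 50), ("SC_mer", 120), ("M", 110))

def new_layer(size, rng):
    """Stand-in for real weight initialisation."""
    return [rng.uniform(-0.1, 0.1) for _ in range(size)]

def transfer(model, output_size, rng):
    """Copy every layer except the top one, which is re-initialised."""
    next_model = {name: list(weights) for name, weights in model.items()}
    next_model[TOP_LAYER] = new_layer(output_size, rng)
    return next_model

def run_curriculum(train_step, seed=0):
    """Chain the tasks: train, transfer, train, ... and return the last model."""
    rng = random.Random(seed)
    first_task, first_size = CURRICULUM[0]
    model = {name: new_layer(8, rng) for name in LAYERS}
    model[TOP_LAYER] = new_layer(first_size, rng)
    train_step(model, first_task)
    for task, output_size in CURRICULUM[1:]:
        model = transfer(model, output_size, rng)
        train_step(model, task)
    return model

if __name__ == "__main__":
    run_curriculum(lambda model, task: print(task, len(model[TOP_LAYER])))
```

The top layer has to be re-created at each step because the output symbol set changes from one task to the next, while the lower layers carry over what was learned on the more generic task.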

Error distribution
System outputs for MEDIA concepts (development dataset)
  Figure: the thirty most common mistakes
Error characteristics
  Mainly concept deletion errors
  Concentrated on a few concepts
  Frequent errors correspond to concepts supported by small words (connectprop, lienref-coref, ...)
  -> “I'd like to know <lienref-coref the > <object price > of the night <connectprop and > if there are any <object rooms > left”
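A distribution of this kind can be obtained by aligning the reference and hypothesis concept sequences and counting the non-matching operations. The sketch below assumes a standard Levenshtein alignment over concept labels only (values are ignored); the function names are hypothetical and this is not necessarily the scoring tool used by the authors.

```python
from collections import Counter

def align(ref, hyp):
    """Levenshtein alignment; returns (operation, ref_item, hyp_item) triples."""
    rows, cols = len(ref) + 1, len(hyp) + 1
    cost = [[0] * cols for _ in range(rows)]
    for i in range(rows):
        cost[i][0] = i
    for j in range(cols):
        cost[0][j] = j
    for i in range(1, rows):
        for j in range(1, cols):
            diagonal = cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            cost[i][j] = min(diagonal, cost[i - 1][j] + 1, cost[i][j - 1] + 1)
    # Walk back from the bottom-right corner to recover the operations.
    ops, i, j = [], len(ref), len(hyp)
    while i > 0 or j > 0:
        if (i > 0 and j > 0
                and cost[i][j] == cost[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])):
            op = "correct" if ref[i - 1] == hyp[j - 1] else "substitution"
            ops.append((op, ref[i - 1], hyp[j - 1]))
            i, j = i - 1, j - 1
        elif i > 0 and cost[i][j] == cost[i - 1][j] + 1:
            ops.append(("deletion", ref[i - 1], None))
            i -= 1
        else:
            ops.append(("insertion", None, hyp[j - 1]))
            j -= 1
    return ops[::-1]

def error_counts(pairs):
    """Count errors per (operation, concept) over (reference, hypothesis) pairs."""
    counts = Counter()
    for ref, hyp in pairs:
        for op, ref_item, hyp_item in align(ref, hyp):
            if op != "correct":
                concept = ref_item if ref_item is not None else hyp_item
                counts[(op, concept)] += 1
    return counts

if __name__ == "__main__":
    reference = ["lienref-coref", "object", "connectprop", "object"]
    hypothesis = ["object", "object"]
    print(error_counts([(reference, hypothesis)]).most_common(30))
```

Calling `most_common(30)` on the resulting counter gives a ranking comparable to the thirty most common mistakes mentioned on the slide.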

Transcription problem
Cases of concept deletion (MEDIA development dataset)
Reference of an example
  -> “From <time-date nineteen > to <time-date twenty-two > october <connectprop and > in <location-town Périgueux > ”
Correct automatic transcription
  -> “From <time-date nineteen > to <time-date twenty-two > october and in <location-town Périgueux > ”
Incorrect automatic transcription
  -> “From <time-date nineteen > to <time-date twenty-two > october <connectprop and > par lieu ”
Correct automatic transcription, but the value is nested in another concept
  -> “From <time-date nineteen > to <time-date twenty-two > october <location-town and in Périgueux > ”

Transcription problem
Focused concept   Nb deletions   Correct ASR   Wrong ASR   Nested
connectProp       39             28            6           5
lienref-coref     33             19            10          4
objet             38             31            4           3
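The counts above can be turned into shares per cause. The sketch below reads each row as a number of deletions followed by its split into correct ASR, wrong ASR and nested cases, which is an interpretation of the flattened slide table (the three parts do sum to the total in every row). The dictionary keys and the function name are hypothetical.

```python
# Deletion counts per focused concept, as read from the slide table.
DELETIONS = {
    "connectProp": {"correct_asr": 28, "wrong_asr": 6, "nested": 5},
    "lienref-coref": {"correct_asr": 19, "wrong_asr": 10, "nested": 4},
    "objet": {"correct_asr": 31, "wrong_asr": 4, "nested": 3},
}

def shares(causes):
    """Percentage of deletions attributed to each cause, rounded to 0.1."""
    total = sum(causes.values())
    return {cause: round(100 * count / total, 1) for cause, count in causes.items()}

if __name__ == "__main__":
    for concept, causes in DELETIONS.items():
        print(concept, sum(causes.values()), shares(causes))
```

For all three concepts, most deletions occur even though the words were transcribed correctly.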
