Challenges and Techniques for Dialectal Arabic Speech Recognition and Machine Translation


SLIDE 1

Challenges and Techniques for Dialectal Arabic Speech Recognition and Machine Translation

Mohamed Elmahdy, Mark Hasegawa-Johnson, Eiman Mustafawi, Rehab Duwairi, Wolfgang Minker

  • Nov. 21, 2011

Qatar University, University of Illinois, Ulm University

SLIDE 2

Page 2 Challenges and techniques for dialectal Arabic ASR and MT | Mohamed Elmahdy | Qatar | Doha | Nov. 21, 2011

Arabic Language

  • Largest living Semitic language
  • 250+ million native speakers

Formal Arabic

  • Modern Standard Arabic (MSA)
  • Standardized
  • A lot of ASR and MT research
  • Not used in everyday life

Dialectal Arabic

  • Used in everyday life
  • Not standardized (mainly spoken)
  • Many different dialects
  • Very little ASR and MT research

Significant differences between MSA and Dialectal Arabic

  • Can be considered completely different languages

Introduction | Approaches | Experiments and results | Conclusions

SLIDE 3

MSA Versus Dialectal Arabic

  • Let's take Egyptian Colloquial Arabic (ECA) as a typical Arabic dialect
  • Phonological

→ /t/ or /s/ in ECA instead of /T/ in MSA, e.g. /tala:tah/ (three) in ECA versus /Tala:Tah/ in MSA

  • Lexical

→ /t'ArAbE:zA/ (table) in ECA versus /t'awila/ in MSA

  • Syntactic

→ SVO in ECA versus VSO in MSA


SLIDE 4

Automatic Speech Recognition

  • High level diagram for a state-of-the-art ASR system

[Diagram: Speech → Feature Extraction → Features (O) → Decoder → Words (\hat{W}); the Decoder draws on the Acoustic Model P(O | W), the Pronunciation Model, and the Language Model P(W)]

$$\hat{W} = \arg\max_{W \in L} P(O \mid W) \, P(W)$$

For dialectal Arabic, only sparse and low-quality corpora are available
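The decoding rule on this slide can be illustrated with a toy example. This is a minimal sketch, not the authors' decoder: the candidate list and all scores are invented for illustration, with `acoustic[w]` standing in for P(O | W) and `language[w]` for P(W). A real decoder searches a very large hypothesis space rather than a two-entry dictionary.

```python
import math

# Hypothetical scores for two candidate transcriptions of one utterance.
# acoustic[w] plays the role of P(O | W); language[w] plays the role of P(W).
acoustic = {"tala:tah": 0.020, "Tala:Tah": 0.015}
language = {"tala:tah": 0.30, "Tala:Tah": 0.05}


def decode(acoustic, language):
    """Return the candidate W that maximizes log P(O|W) + log P(W)."""
    return max(
        acoustic,
        key=lambda w: math.log(acoustic[w]) + math.log(language[w]),
    )


best = decode(acoustic, language)
print(best)
```

Working in log space avoids numerical underflow when many small probabilities are multiplied, which is why the sum of logs replaces the product.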

SLIDE 5

Statistical Machine Translation

  • High level diagram for a SMT system

[Diagram: Arabic sentence (A) → Decoder → English sentence (\hat{E}); the Decoder draws on the Translation Model P(A | E) and the Language Model P(E)]

$$\hat{E} = \arg\max_{E \in \text{English}} P(A \mid E) \, P(E)$$

Large parallel corpora are required
For dialectal Arabic, parallel corpora are not available
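For readers who want the step behind the decoding rule, the product of translation model and language model follows from Bayes' rule. The derivation below is the standard noisy-channel argument, not something stated on the slide:

```latex
\begin{aligned}
\hat{E} &= \arg\max_{E} P(E \mid A) \\
        &= \arg\max_{E} \frac{P(A \mid E)\, P(E)}{P(A)} \\
        &= \arg\max_{E} P(A \mid E)\, P(E)
\end{aligned}
```

The denominator P(A) can be dropped because it does not depend on E. The same argument, with O in place of A and W in place of E, gives the ASR decoding rule.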

SLIDE 6

Objectives

  • ASR and MT for dialectal Arabic, where little data exists
  • Use existing MSA speech data to improve dialectal Arabic ASR and MT
  • Ultimate goal: “speech-to-text MT” for dialectal Arabic


SLIDE 7

Outline

  • Introduction
  • Approaches
  • Experiments and results
  • Conclusions and future directions


SLIDE 8

Proposed Approaches for Dialectal Arabic ASR

  • Phonemic acoustic modeling

→ Dialectal speech data where phonetic transcription is available

  • Graphemic acoustic modeling
  • Unsupervised acoustic modeling
  • Arabic Chat Alphabet-based acoustic modeling


SLIDE 9

Phonemic Cross-Lingual Acoustic Modeling

  • Benefit from existing large MSA speech corpora
  • Assumptions:

→ MSA is always a second language for any Arabic speaker
→ A large amount of MSA speech data (from a large number of speakers) implicitly covers all the acoustic features of the different Arabic dialects

  • Approach:

→ Train an acoustic model on a large amount of MSA speech data
→ Adapt the MSA acoustic model with a small amount of dialectal speech data


SLIDE 10

Phonemic Cross-Lingual Acoustic Modeling (cont.)

  • State-of-the-art AM adaptation techniques include:

→ Maximum Likelihood Linear Regression (MLLR)
→ Maximum A Posteriori (MAP)

  • Requirement: the adaptation data and the AM have to share the same language and phoneme set
  • Egyptian Colloquial Arabic (ECA) is chosen as a typical dialect
  • INITIALLY: MSA and ECA do not share the same phoneme inventory


[Figure: MSA and ECA phoneme sets]

MAP: $$\lambda_{MAP} = \arg\max_{\lambda} P(O \mid \lambda) \, P(\lambda)$$

MLLR: $$\hat{\mu} = A\mu + b$$

Acoustic model adaptation is not possible
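The two adaptation techniques named on this slide can be sketched for a single Gaussian mean. This is an illustrative sketch in their commonly used textbook forms, not the authors' implementation. The function names, the relevance factor `tau`, and the occupancy weights `gammas` are assumptions introduced here; the slides themselves only name the techniques.

```python
def mllr_mean(A, b, mu):
    """MLLR-style mean transform: mu_hat = A * mu + b.

    A is a list of rows, b and mu are lists of floats.
    """
    return [
        sum(a_ij * m_j for a_ij, m_j in zip(row, mu)) + b_i
        for row, b_i in zip(A, b)
    ]


def map_mean(prior_mean, frames, gammas, tau):
    """MAP-style mean update.

    Interpolates the prior mean (weight tau, an assumed relevance factor)
    with the occupancy-weighted adaptation frames.
    """
    occupancy = sum(gammas)
    dim = len(prior_mean)
    weighted = [
        sum(g * frame[d] for g, frame in zip(gammas, frames))
        for d in range(dim)
    ]
    return [
        (tau * prior_mean[d] + weighted[d]) / (tau + occupancy)
        for d in range(dim)
    ]


# Identity rotation plus a bias shifts the mean.
print(mllr_mean([[1, 0], [0, 1]], [0.5, -0.5], [1.0, 2.0]))
# Two adaptation frames pull a prior mean of 0.0 toward the data.
print(map_mean([0.0], [[2.0], [4.0]], [1.0, 1.0], tau=2.0))
```

With no adaptation frames, `map_mean` returns the prior mean unchanged, which reflects why MAP behaves conservatively when dialectal data is scarce. MLLR shares one transform across many Gaussians, so it can move models for which little or no adaptation data was seen.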

SLIDE 11

Phonemic Cross-Lingual Acoustic Modeling (cont.)

  • SOLUTION: Phoneme set normalization

→ AM adaptation becomes possible

  • Phoneme set normalization

→ Several phone mapping rules are applied
→ ECA phonemes are mapped to their origins in MSA (even if they are acoustically different)


[Diagram: ECA phoneme set → normalization (phone mapping rules) → MSA phoneme set]

ECA: /b/ /g/ /j/ /e/ /i/ /o/ /u/ /t/ ...
MSA: /b/ /dZ/ /i/ /u/ /t/ ...

Example: جزر (carrot)
ECA: /g/ /A/ /z/ /A/ /r/
MSA: /dZ/ /a/ /z/ /a/ /r/
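Phoneme set normalization amounts to a table lookup applied to each phone of a transcription. The sketch below is a minimal illustration under stated assumptions: only the rules /g/ → /dZ/ and /A/ → /a/ are taken from the carrot example on this slide, while the two vowel merges are guesses added to show how further rules would be listed. The names `ECA_TO_MSA` and `normalize` are hypothetical.

```python
# ECA -> MSA phone mapping rules.
ECA_TO_MSA = {
    "g": "dZ",  # from the carrot example on the slide
    "A": "a",   # from the carrot example on the slide
    "e": "i",   # assumption, not confirmed by the slide
    "o": "u",   # assumption, not confirmed by the slide
}


def normalize(phones, rules=ECA_TO_MSA):
    """Map each ECA phone to its MSA origin; unmapped phones pass through."""
    return [rules.get(p, p) for p in phones]


# The carrot example: /g/ /A/ /z/ /A/ /r/ becomes /dZ/ /a/ /z/ /a/ /r/.
print(normalize(["g", "A", "z", "A", "r"]))
```

After this step the ECA transcriptions use only symbols that exist in the MSA acoustic model, which is the precondition for MLLR and MAP adaptation.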

SLIDE 12

Phonemic Cross-Lingual Acoustic Modeling (cont.)

  • Block diagram for the proposed approach
  • The adapted ECA AM is evaluated against the ECA baseline AM


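[Block diagram: the MSA corpus and the ECA corpus each pass through phoneme normalization. The normalized MSA corpus is used to train the MSA acoustic model. The normalized ECA corpus is used to train the ECA baseline model and to adapt the MSA acoustic model (MLLR adaptation, MAP adaptation), giving the ECA final model.]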

SLIDE 13

Proposed Approaches for Dialectal Arabic ASR


  • Phonemic acoustic modeling

→ Dialectal speech data where phonetic transcription is available

  • Graphemic acoustic modeling

→ Phonetic transcription is not possible or is difficult
→ Short vowels are missing
→ Phonetic transcription is approximated by the letters of the word

  • Unsupervised acoustic modeling

→ Transcriptions are not available at all
→ Dialectal speech is automatically transcribed using an MSA model

  • Arabic Chat Alphabet-based acoustic modeling

→ Latin letters are used instead of Arabic ones
→ Includes short vowels that are missing in traditional Arabic orthography
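The graphemic approach listed above, where the pronunciation of a word is approximated by its letters, can be sketched in a few lines. This is an illustrative sketch, not the authors' lexicon builder: the function names are hypothetical, and dropping combining marks (which covers Arabic short-vowel diacritics) is an assumption about preprocessing.

```python
import unicodedata


def graphemic_pronunciation(word):
    """Approximate a pronunciation as the word's letters.

    Combining marks (Unicode category Mn), which include Arabic short-vowel
    diacritics, are dropped so each remaining letter is one acoustic unit.
    """
    return [ch for ch in word if unicodedata.category(ch) != "Mn"]


def build_lexicon(words):
    """Build a word -> grapheme-sequence pronunciation lexicon."""
    return {word: graphemic_pronunciation(word) for word in words}


# The carrot word from the earlier slide yields three grapheme units.
print(build_lexicon(["جزر"]))
```

Because short vowels are not written, each grapheme model has to absorb the acoustics of whatever vowel follows it, which is the approximation the slide refers to.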

SLIDE 14

Outline

  • Introduction
  • Approaches
  • Experiments and results
  • Conclusions


SLIDE 15

Phonemic Cross-Lingual Adaptation Results

  • ECA corpus:

→ 65% for training/adaptation
→ 35% for testing

  • Word Error Rate (WER)

$$WER = \frac{Sub + Ins + Del}{N}$$

[Bar chart: ECA phonemic AM. WER (%) from 0 to 30 for ECA baseline, MSA only, MSA+ECA data pooling, and MSA+ECA adaptation]

41.8% relative reduction in WER
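The WER definition and the notion of relative reduction can be made concrete with a short sketch. It computes the minimum number of substitutions, insertions, and deletions with a standard edit-distance table and assumes a non-empty reference. The example sentences and the WER values passed to `relative_reduction` are invented for illustration and are not results from the slides.

```python
def wer(reference, hypothesis):
    """Word error rate: (Sub + Ins + Del) / N, with N = len(reference)."""
    n, m = len(reference), len(hypothesis)
    # dist[i][j] = minimum edits turning reference[:i] into hypothesis[:j]
    dist = [[0] * (m + 1) for _ in range(n + 1)]
    for i in range(n + 1):
        dist[i][0] = i  # i deletions
    for j in range(m + 1):
        dist[0][j] = j  # j insertions
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            substitution = dist[i - 1][j - 1] + (reference[i - 1] != hypothesis[j - 1])
            deletion = dist[i - 1][j] + 1
            insertion = dist[i][j - 1] + 1
            dist[i][j] = min(substitution, deletion, insertion)
    return dist[n][m] / n


def relative_reduction(baseline_wer, new_wer):
    """Relative WER reduction of a new system over a baseline."""
    return (baseline_wer - new_wer) / baseline_wer


# One substitution and one deletion against a four-word reference.
print(wer("a b c d".split(), "a x c".split()))
# Hypothetical WERs, in percent.
print(relative_reduction(20.0, 15.0))
```

A relative reduction is measured against the baseline error rate, so it is larger than the absolute difference in percentage points whenever the baseline WER is below 100%.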

SLIDE 16

Effect of MSA Speech Data Amount

  • Varying the amount of MSA speech data
  • Effect on phonemic cross-lingual adaptation

[Plot: WER (%) of MSA+ECA adaptation versus MSA speech amount (0.5 to 32 hours)]

Consistent decrease in WER as more MSA speech data is added

SLIDE 17

Outline

  • Introduction
  • Approaches
  • Experiments and results
  • Conclusions


SLIDE 18

Conclusions and Future Directions

  • Conclusions

→ Problems in ASR and MT for dialectal Arabic
→ Cross-lingual acoustic modeling for dialectal Arabic ASR
→ Improvements are observed in both phonemic and graphemic modeling
→ Consistent reduction in WER by adding more MSA data

  • Future directions

→ Data collection (with a focus on the Qatari dialect)
→ Extension to all the Arabic dialects
→ Dialectal Arabic MT and LM


SLIDE 19

Thank you for your attention
