SLIDE 1

Challenges and Techniques for Dialectal Arabic Speech Recognition and Machine Translation

Mohamed Elmahdy, Mark Hasegawa-Johnson, Eiman Mustafawi, Rehab Duwairi, Wolfgang Minker

Nov. 21, 2011

Qatar University | University of Illinois | Ulm University

SLIDE 2

Page 2 Challenges and techniques for dialectal Arabic ASR and MT | Mohamed Elmahdy | Qatar | Doha | Nov. 21, 2011

Arabic Language

Largest living Semitic language
250+ million native speakers

Arabic: Formal and Dialectal

Formal: Modern Standard Arabic (MSA)
→ Standardized
→ A lot of ASR and MT research
→ Not used in everyday life

Dialectal Arabic
→ Used in everyday life
→ Not standardized (mainly spoken)
→ Many different dialects
→ Very little ASR and MT research

Significant differences between MSA and Dialectal Arabic
→ Can be considered completely different languages

Introduction | Approaches | Experiments and results | Conclusions

SLIDE 3

MSA Versus Dialectal Arabic

Take Egyptian Colloquial Arabic (ECA) as a typical Arabic dialect

Phonological
→ /t/ or /s/ in ECA instead of /T/ in MSA, e.g. /tala:tah/ (three) in ECA versus /Tala:Tah/ in MSA

Lexical
→ /t„ArAbE:zA/ (table) in ECA versus /t„awila/ in MSA

Syntactic
→ SVO word order in ECA versus VSO in MSA

SLIDE 4

Automatic Speech Recognition

High-level diagram of a state-of-the-art ASR system:

Speech → Feature Extraction → Features → Decoder → Words
The Decoder uses the Acoustic Model and the Language Model

For dialectal Arabic, only sparse and low-quality corpora are available

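$$\hat{W} = \arg\max_{W \in L} P(O \mid W)\, P(W)$$

O: feature sequence; $\hat{W}$: recognized word sequence; $P(O \mid W)$: Acoustic Model; $P(W)$: Language Model. The diagram also includes a Pronunciation Model.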
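
The decoding rule on this slide can be illustrated with a minimal sketch. It assumes a tiny, hypothetical candidate list with made-up probabilities; the names `decode`, `acoustic_logp`, and `lm_logp` and all numbers are illustrative and do not come from the presentation. A real decoder searches a far larger hypothesis space.

```python
import math

def decode(candidates, acoustic_logp, lm_logp):
    """Return the candidate W that maximizes log P(O|W) + log P(W)."""
    return max(candidates, key=lambda w: acoustic_logp[w] + lm_logp[w])

# Hypothetical scores for a single utterance O (illustrative values only).
candidates = ["tala:tah", "Tala:Tah"]
acoustic_logp = {"tala:tah": math.log(0.6), "Tala:Tah": math.log(0.4)}  # log P(O|W)
lm_logp = {"tala:tah": math.log(0.7), "Tala:Tah": math.log(0.3)}        # log P(W)

best = decode(candidates, acoustic_logp, lm_logp)
```

Working in log-probabilities turns the product P(O|W) P(W) into a sum, which avoids numerical underflow on long utterances.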

SLIDE 5

Statistical Machine Translation

High-level diagram of an SMT system:

Arabic sentence → Decoder → English sentence
The Decoder uses the Translation Model and the Language Model

Large parallel corpora are required
For dialectal Arabic, parallel corpora are not available

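$$\hat{E} = \arg\max_{E \in \text{English}} P(A \mid E)\, P(E)$$

A: Arabic sentence; $\hat{E}$: English translation; $P(A \mid E)$: Translation Model; $P(E)$: Language Model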

SLIDE 6

Objectives

ASR and MT for dialectal Arabic, where little data exists
Benefit from existing MSA speech data to improve dialectal Arabic ASR and MT

Ultimate goal: “Speech-to-text MT” for dialectal Arabic

SLIDE 7

Outline

Introduction
Approaches
Experiments and results
Conclusions and future directions

SLIDE 8

Proposed Approaches for Dialectal Arabic ASR

Phonemic acoustic modeling

→ Dialectal speech data where phonetic transcription is available

Graphemic acoustic modeling
Unsupervised acoustic modeling
Arabic Chat Alphabet-based acoustic modeling

SLIDE 9

Phonemic Cross-Lingual Acoustic Modeling

Benefit from existing large MSA speech corpora
Assumptions:
→ MSA is always a second language for any Arabic speaker
→ A large amount of MSA speech data (large number of speakers) implicitly covers all the acoustic features of the different Arabic dialects

Approach:
→ Train an acoustic model using a large amount of MSA speech data
→ Adapt the MSA acoustic models with a small amount of dialectal speech data

SLIDE 10

Phonemic Cross-Lingual Acoustic Modeling (cont.)

State-of-the-art AM adaptation techniques include:
→ Maximum Likelihood Linear Regression (MLLR)
→ Maximum A Posteriori (MAP)

Requirement: the adaptation data and the AM have to share the same language and phoneme set

Egyptian Colloquial Arabic (ECA) is chosen as a typical dialect
INITIALLY: MSA and ECA do not share the same phoneme inventory

MAP: $$\hat{\lambda}_{MAP} = \arg\max_{\lambda} P(O \mid \lambda)\, P(\lambda)$$

MLLR: $$\hat{\mu} = A\mu + b$$

[Diagram: MSA and ECA phoneme sets]

Acoustic model adaptation is not possible
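
As a rough illustration of the two adaptation techniques named on this slide, the sketch below applies the commonly used MAP mean update and the MLLR mean transform to a single Gaussian mean. Everything here is an assumption for illustration: the function names, the prior weight `tau`, the hard assignment of frames to one Gaussian, and all numeric values. In practice the MLLR parameters A and b are estimated from the adaptation data rather than given.

```python
def map_adapt_mean(prior_mean, frames, tau=10.0):
    """MAP update of one Gaussian mean, dimension by dimension.

    prior_mean: mean taken from the source (e.g. MSA) model
    frames: adaptation feature vectors assigned to this Gaussian
    tau: prior weight (assumed value; larger means trust the prior more)
    """
    n = len(frames)
    dims = len(prior_mean)
    sums = [sum(f[d] for f in frames) for d in range(dims)]
    return [(tau * prior_mean[d] + sums[d]) / (tau + n) for d in range(dims)]

def mllr_transform_mean(mean, A, b):
    """MLLR mean transform: mu_hat = A * mu + b."""
    return [sum(A[i][j] * mean[j] for j in range(len(mean))) + b[i]
            for i in range(len(b))]

msa_mean = [0.0, 2.0]                     # illustrative source-model mean
eca_frames = [[1.0, 2.0]] * 10            # ten identical illustrative frames
adapted = map_adapt_mean(msa_mean, eca_frames, tau=10.0)

A = [[1.0, 0.0], [0.0, 1.0]]              # identity rotation (assumed)
b = [0.5, -0.5]                           # bias (assumed)
shifted = mllr_transform_mean(msa_mean, A, b)
```

With few adaptation frames the MAP estimate stays close to the prior mean, and it moves toward the adaptation data as more frames are seen. MLLR instead shares one transform across many Gaussians, so it can help even when data is very limited.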

SLIDE 11

Phonemic Cross-Lingual Acoustic Modeling (cont.)

SOLUTION: Phoneme set normalization
→ AM adaptation is possible

Phoneme set normalization
→ Several phone mapping rules are applied
→ ECA phonemes are mapped to their origins in MSA (even if they are acoustically different)

[Diagram: MSA and ECA phoneme sets, normalization via phone mapping rules]

ECA phonemes: /b/ /g/ /j/ /e/ /i/ /o/ /u/ /t/ ...
MSA phonemes: /b/ /dZ/ /i/ /u/ /t/ ...

Example: جزر (carrot)
ECA: /g/ /A/ /z/ /A/ /r/
MSA: /dZ/ /a/ /z/ /a/ /r/
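
The normalization step can be sketched as a simple table lookup. The mapping below contains only the two rules visible in the carrot example on this slide (/g/ to /dZ/ and /A/ to /a/); the full rule set is not listed in the slides, and the names `ECA_TO_MSA` and `normalize_phones` are hypothetical.

```python
# ECA -> MSA phone mapping. Only the two rules visible in the carrot
# example are included; the complete rule set is not given in the slides.
ECA_TO_MSA = {"g": "dZ", "A": "a"}

def normalize_phones(phones, mapping=ECA_TO_MSA):
    """Map each ECA phone to its MSA origin; unmapped phones pass through."""
    return [mapping.get(p, p) for p in phones]

carrot_eca = ["g", "A", "z", "A", "r"]
carrot_msa = normalize_phones(carrot_eca)
```

After this step the ECA transcriptions use only MSA phone labels, which is what allows the MSA acoustic model to be adapted with ECA data.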

SLIDE 12

Phonemic Cross-Lingual Acoustic Modeling (cont.)

Block diagram for the proposed approach
The adapted ECA AM is evaluated against the ECA baseline AM

[Block diagram]
MSA corpus → Phoneme normalization → Normalized MSA corpus → Training → MSA acoustic model
ECA corpus → Phoneme normalization → Normalized ECA corpus → Training → ECA baseline model
MSA acoustic model + Normalized ECA corpus → MLLR adaptation → MAP adaptation → ECA final model

SLIDE 13

Proposed Approaches for Dialectal Arabic ASR

Phonemic acoustic modeling
→ Dialectal speech data where phonetic transcription is available

Graphemic acoustic modeling
→ Phonetic transcription is not possible or is difficult
→ Short vowels are missing
→ Phonetic transcription is approximated by the letters of each word

Unsupervised acoustic modeling
→ Transcriptions are not available at all
→ Dialectal speech is automatically transcribed using an MSA model

Arabic Chat Alphabet-based acoustic modeling
→ Latin letters are used instead of Arabic ones
→ Includes the short vowels that are missing in traditional Arabic orthography
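
To make the graphemic idea concrete, here is a minimal sketch of a lexicon in which each letter of a word stands in for one acoustic unit. It assumes undiacritized input text (so short vowels are simply absent), and the function name `graphemic_lexicon` is hypothetical.

```python
def graphemic_lexicon(words):
    """Build a lexicon where each word is 'pronounced' as its own letters."""
    return {w: list(w) for w in words}

# One undiacritized word: each of its letters becomes one modeling unit.
lexicon = graphemic_lexicon(["جزر"])
```

No phonetic dictionary is needed with this approach, at the cost of leaving the acoustic model to absorb the unwritten short vowels.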

SLIDE 14

Outline

Introduction
Approaches
Experiments and results
Conclusions

SLIDE 15

Phonemic Cross-Lingual Adaptation Results

ECA corpus:
→ 65% for training/adaptation
→ 35% for testing

Word Error Rate (WER)

$$\mathrm{WER} = \frac{\mathrm{Sub} + \mathrm{Ins} + \mathrm{Del}}{N}$$

[Bar chart: ECA Phonemic AM, WER (%) on a 0 to 30 scale, for ECA baseline, MSA only, MSA+ECA data pooling, and MSA+ECA adaptation]

41.8% relative reduction in WER
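
The WER metric and the notion of relative reduction can be sketched as follows. This is a generic word-level edit-distance implementation, not the scoring tool used for the reported results; it assumes a non-empty reference, and the example strings and WER values are made up for illustration.

```python
def wer(reference, hypothesis):
    """Word error rate: (Sub + Del + Ins) / N, via edit distance over words."""
    ref, hyp = reference.split(), hypothesis.split()
    # dist[i][j] = minimum edits needed to turn ref[:i] into hyp[:j]
    dist = [[0] * (len(hyp) + 1) for _ in range(len(ref) + 1)]
    for i in range(len(ref) + 1):
        dist[i][0] = i  # delete all i reference words
    for j in range(len(hyp) + 1):
        dist[0][j] = j  # insert all j hypothesis words
    for i in range(1, len(ref) + 1):
        for j in range(1, len(hyp) + 1):
            substitution = dist[i - 1][j - 1] + (ref[i - 1] != hyp[j - 1])
            deletion = dist[i - 1][j] + 1
            insertion = dist[i][j - 1] + 1
            dist[i][j] = min(substitution, deletion, insertion)
    return dist[len(ref)][len(hyp)] / len(ref)

def relative_reduction(baseline_wer, new_wer):
    """Relative WER reduction of a new system with respect to a baseline."""
    return (baseline_wer - new_wer) / baseline_wer

score = wer("a b c d", "a x c")           # one substitution, one deletion
gain = relative_reduction(20.0, 15.0)     # hypothetical WER values in percent
```

A relative reduction expresses the improvement as a fraction of the baseline error, so the same absolute gain counts for more when the baseline WER is already low.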

SLIDE 16

Effect of MSA Speech Data Amount

Varying the amount of MSA speech data
Effect on phonemic cross-lingual adaptation

[Line chart: WER (%) of MSA+ECA adaptation versus MSA speech amount (hours), at 0.5, 1, 2, 4, 8, 16, and 32 hours]

Consistent decrease in WER

SLIDE 17

Outline

Introduction
Approaches
Experiments and results
Conclusions

SLIDE 18

Conclusions and Future Directions

Conclusions
→ Problems in ASR and MT for dialectal Arabic
→ Cross-lingual acoustic modeling for dialectal Arabic ASR
→ Improvements are observed in both phonemic and graphemic modeling
→ Consistent reduction in WER by adding more MSA data

Future directions
→ Data collection (a focus is placed on the Qatari dialect)
→ Extension to all the Arabic dialects
→ Dialectal Arabic MT and LM

SLIDE 19

Thank you for your attention
