On the use of phone-gram units in recurrent neural networks for language identification
Christian Salamea, Luis F. D'Haro, Ricardo de Córdoba, Rubén San-Segundo
Speech Technology Group, Dept. of Electronic Engineering, Universidad Politécnica de Madrid
Odyssey 2016 - Bilbao
Using a 1-of-N codification (N being the total number of phoneme units), we incorporate contextual information at the NN input: uniphones, diphones, and triphones; see the sketch below.
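A minimal sketch of one plausible reading of this coding, in which each phone-gram is treated as a vocabulary unit and one-hot encoded over the joint unit inventory (toy data; `make_phonegrams` and `one_hot` are illustrative names, not from the paper):

```python
import numpy as np

def make_phonegrams(phonemes, order):
    """Overlapping phone-grams of a given order from one phoneme sequence."""
    return ["_".join(phonemes[i:i + order]) for i in range(len(phonemes) - order + 1)]

# Toy decoded phoneme sequence (in practice, the output of a phonetic recognizer).
sequence = ["s", "p", "i", "tS", "s", "p", "i"]

# Joint vocabulary of uniphones, diphones, and triphones.
vocab = sorted({pg for n in (1, 2, 3) for pg in make_phonegrams(sequence, n)})
index = {pg: i for i, pg in enumerate(vocab)}

def one_hot(phonegram):
    """1-of-N input vector for a phone-gram unit (N = vocabulary size)."""
    v = np.zeros(len(vocab))
    v[index[phonegram]] = 1.0
    return v

print(one_hot("s_p"))  # diphone unit fed to the RNN input layer
```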
The system is based on a PPRLM architecture: each phonetic recognizer produces a phoneme sequence. In evaluation, an entropy score is obtained for each utterance from every RNNLM; the entropy scores are then calibrated and fused (see the sketch below).
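A rough sketch of the entropy scoring step, assuming a trained RNNLM exposed here as a hypothetical `rnnlm_prob(unit, history)` callable (not a real API):

```python
import math

def utterance_entropy(units, rnnlm_prob):
    """Average negative log2-probability of an utterance's unit sequence
    under one language-specific RNNLM (lower = better match)."""
    history, total = [], 0.0
    for unit in units:
        p = rnnlm_prob(unit, history)       # P(unit | history) from the RNNLM
        total += -math.log2(max(p, 1e-12))  # guard against zero probabilities
        history.append(unit)
    return total / max(len(units), 1)
```

One such score is produced per (phonetic recognizer, language model) pair; the resulting score vectors are what gets calibrated and fused.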
We use K-means to group phone-grams, based on neural embeddings trained with the Skip-gram model; we work at the phone level. This vocabulary reduction yields a 7.3% relative improvement (a sketch follows below).
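A minimal sketch of the grouping step, assuming gensim (>= 4) for the Skip-gram embeddings and scikit-learn for K-means; the corpus and sizes are toy values:

```python
import numpy as np
from gensim.models import Word2Vec
from sklearn.cluster import KMeans

# Toy corpus: each "sentence" is the phone-gram sequence of one utterance.
corpus = [["s_p", "p_i", "i_tS"], ["s_p", "p_i", "i_n"], ["i_n", "n_s"]]

# sg=1 selects the Skip-gram training objective.
model = Word2Vec(sentences=corpus, vector_size=16, window=2, min_count=1, sg=1)

units = list(model.wv.index_to_key)
vectors = np.stack([model.wv[u] for u in units])

# Similar phone-grams fall into the same cluster; each cluster becomes one
# unit of the reduced vocabulary.
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(vectors)
print({u: int(c) for u, c in zip(units, labels)})
```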
The main RNN hyperparameters (a sketch of the output factorization follows below):
Number of neurons in the state layer (NNE).
Number of classes (NCS): phone-grams are grouped in the output layer through a factorization process. A high NCS value speeds up RNN training, but the final language model is less accurate.
Number of state layers (MEM) corresponding to previous time steps, so that previous context information is taken into account.
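A simplified sketch of class-based output factorization in the style of Mikolov's RNNLM, which is the usual reading of NCS: P(unit | state) = P(class | state) * P(unit | class, state), so each step evaluates a softmax over NCS classes plus one class's members instead of the whole vocabulary. All sizes and weights below are toy values:

```python
import numpy as np

rng = np.random.default_rng(0)
NNE, NCS, V = 32, 4, 20                    # state size, classes, vocabulary
unit2class = rng.integers(0, NCS, size=V)  # frequency-based in practice

W_class = 0.1 * rng.standard_normal((NCS, NNE))  # state -> class scores
W_unit = 0.1 * rng.standard_normal((V, NNE))     # state -> unit scores

def softmax(x):
    e = np.exp(x - x.max())
    return e / e.sum()

def unit_probability(state, unit):
    """P(unit | state) via the class factorization."""
    c = unit2class[unit]
    p_class = softmax(W_class @ state)[c]
    members = np.flatnonzero(unit2class == c)   # units sharing class c
    p_within = softmax(W_unit[members] @ state)
    return p_class * p_within[list(members).index(unit)]

print(unit_probability(rng.standard_normal(NNE), unit=3))
```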
Results (Cavg):

System     Cavg (abs)
MFCCs          7.60
PPRLM         11.57
RNNLM-P       10.87

Fusion                 Cavg (abs)   Improve %
RNNLM-P+PPRLM              10.51         9.2
PPRLM+MFCCs                 5.10        32.9
RNNLM-P+MFCCs               5.04        33.7
RNNLM-P+PPRLM+MFCCs         4.80        36.8

The Improve % values correspond to the relative Cavg reduction over PPRLM alone for the first fusion and over MFCCs alone for the remaining ones (e.g., (11.57 - 10.51) / 11.57 ≈ 9.2%).
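A quick arithmetic check of the Improve % column, with the per-row baselines made explicit (values copied from the tables above):

```python
rows = [
    # (fusion, fused Cavg, baseline Cavg)
    ("RNNLM-P+PPRLM", 10.51, 11.57),      # baseline: PPRLM alone
    ("PPRLM+MFCCs", 5.10, 7.60),          # baseline: MFCCs alone
    ("RNNLM-P+MFCCs", 5.04, 7.60),        # baseline: MFCCs alone
    ("RNNLM-P+PPRLM+MFCCs", 4.80, 7.60),  # baseline: MFCCs alone
]
for name, fused, base in rows:
    print(f"{name}: {100 * (base - fused) / base:.1f}% relative improvement")
# -> 9.2, 32.9, 33.7, 36.8, matching the table.
```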