A PLDA Approach for Language and Text Independent Speaker Recognition


Jan 22, 2023


SLIDE 1

A PLDA Approach for Language and Text Independent Speaker Recognition

Abbas Khosravani, Mohammad M. Homayounpour, Dijana Petrovska-Delacrétaz, Gérard Chollet

Laboratory for Intelligent Multimedia Processing, Amirkabir University of Technology, Iran
Institut Mines-Télécom, Télécom SudParis, France
CNRS-LTCI, Institut Mines-Télécom, France
Intelligent Voice Ltd., England

SLIDE 2

What is the problem?

✦ The acoustic content of a speech segment affects the variability of the i-vector extracted from that segment.
✦ Probabilistic Linear Discriminant Analysis (PLDA) aims to model all sources of undesirable variability within a single covariance matrix.
✦ A lack of multilingual utterances for each speaker in the development data limits PLDA's ability to model language as a source of variability.
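For reference, the PLDA model behind the second bullet can be written as below. This is the commonly used simplified (Gaussian) PLDA formulation, assumed here because the slides do not state the model explicitly; the symbols $m$, $V$, $y_i$, $\epsilon_{ij}$ and $\Sigma$ are illustrative and do not come from the slides.

```latex
w_{ij} = m + V y_i + \epsilon_{ij},
\qquad y_i \sim \mathcal{N}(0, I),
\qquad \epsilon_{ij} \sim \mathcal{N}(0, \Sigma)
```

Here $w_{ij}$ is the $j$-th i-vector of speaker $i$, $m$ is the global mean, and $V y_i$ is the speaker-dependent part. Everything that is not speaker identity (channel, acoustic content, language) falls into the single residual covariance $\Sigma$, so language effects can only be captured there if the training speakers actually have utterances in more than one language.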

Abbas Khosravani: Speaker and Language Recognition

SLIDE 3

Language Normalized WCCN

✦ Language source normalization is an effective technique for reducing language dependency in state-of-the-art i-vector/PLDA speaker recognition systems.
✦ It can be implemented by extending Source-Normalized WCCN to mitigate the variation that separates languages.
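A minimal sketch of how such a language-normalized WCCN could be computed is shown below. It assumes the usual source-normalization recipe (between-speaker scatter measured around each language's own mean, within-class scatter taken as total scatter minus that between-speaker scatter), which the slides do not spell out. The function name `ln_wccn`, its arguments, the `eps` regularizer and the random toy data are all hypothetical.

```python
import numpy as np

def ln_wccn(X, spk, lang, eps=1e-6):
    """Sketch of language-normalized WCCN (assumed recipe, not the authors' code).

    X    : (n, d) array of i-vectors
    spk  : (n,) speaker labels
    lang : (n,) language labels
    Returns the projection B (apply as X @ B) and the within-class matrix S_W.
    """
    X = np.asarray(X, dtype=float)
    spk = np.asarray(spk)
    lang = np.asarray(lang)
    n, d = X.shape

    # Total scatter around the global mean.
    Xc = X - X.mean(axis=0)
    S_T = Xc.T @ Xc

    # Between-speaker scatter, measured around each language's own mean,
    # so that language offsets do not count as speaker information.
    S_B = np.zeros((d, d))
    for l in np.unique(lang):
        in_l = lang == l
        mu_l = X[in_l].mean(axis=0)
        for s in np.unique(spk[in_l]):
            cell = in_l & (spk == s)
            diff = X[cell].mean(axis=0) - mu_l
            S_B += cell.sum() * np.outer(diff, diff)

    # What remains (session and language variation) is treated as within-class.
    S_W = (S_T - S_B) / n + eps * np.eye(d)

    # WCCN projection: B @ B.T equals the inverse of S_W.
    B = np.linalg.cholesky(np.linalg.inv(S_W))
    return B, S_W

# Toy data only: 60 random 4-dimensional "i-vectors", 6 speakers, 2 languages.
rng = np.random.default_rng(0)
X = rng.normal(size=(60, 4))
spk = rng.integers(0, 6, size=60)
lang = rng.integers(0, 2, size=60)

B, S_W = ln_wccn(X, spk, lang)
X_norm = X @ B  # normalized i-vectors, one per row
```

Because the language means are excluded from the between-speaker scatter, the shift between languages ends up in `S_W` and is down-weighted by the projection, which is the intended effect of the normalization.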


SLIDE 4

What did we propose?

✦ We propose a PLDA training algorithm that reduces the effect of language on the performance of speaker recognition.
✦ If speaker and channel subspaces that are free of language variability can be estimated from a multilingual training set, they can help PLDA work independently of the language.


The idea is to estimate speaker and channel variability void of language variability.

SLIDE 5

How did we evaluate it?

✦ We evaluated the system on multilingual telephone trials as well as English trials of the SRE'08 core condition (3832 target and 33218 non-target trials).
✦ The development data contains 13338 utterances from 1108 speakers in 5 languages: English (12047), Russian (314), Spanish (146), Arabic (488) and Mandarin (343). Of these speakers, 204 have utterances in more than one language.
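The figures on this slide can be tallied with a few lines of Python to see how unbalanced the development data is. The numbers are taken from the slide; the variable names are illustrative only.

```python
# Utterance counts per language, as listed on the slide.
dev_utterances = {
    "English": 12047,
    "Russian": 314,
    "Spanish": 146,
    "Arabic": 488,
    "Mandarin": 343,
}
n_speakers = 1108
n_multilingual_speakers = 204
n_target, n_nontarget = 3832, 33218

total = sum(dev_utterances.values())

# Share of the development utterances contributed by each language.
shares = {name: count / total for name, count in dev_utterances.items()}

# Fraction of speakers with utterances in more than one language.
multilingual_fraction = n_multilingual_speakers / n_speakers

# Fraction of evaluation trials that are target trials.
target_fraction = n_target / (n_target + n_nontarget)

for name, share in shares.items():
    print(f"{name:9s} {share:6.1%}")
print(f"multilingual speakers: {multilingual_fraction:.1%}")
print(f"target trials:         {target_fraction:.1%}")
```

The per-language counts add up to the stated 13338 utterances. English accounts for roughly 90% of them and fewer than one speaker in five is multilingual, which is the data scarcity described on Slide 2.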

Systems compared: WCCN+PLDA, LN-WCCN+PLDA, WCCN+LI-PLDA, LN-WCCN+LI-PLDA