Age and Gender Recognition from Speech Patterns Based on Supervised Non-Negative Matrix Factorization
July 2011
Mohamad Hasan Bahari
Hugo Van hamme
Outline
Introduction and Motivations
Age and Gender Recognition Corpora
Supervised Nonnegative Matrix Factorization
Proposed Method
Results
Conclusions and Future Research
(kidnapping, threatening calls, ...)
the speaker from his/her voice patterns.
Physical:
1. Gender
2. Age
3. Accent
4. ...
Psychological:
1. Anxiousness
2. Stress
3. Confidence
4. ...
The corpus contains live and read commentaries, news, interviews, and reports broadcast in Belgium.
Different age groups and genders:
Age group            18-35  18-35  36-45  36-45  46-81  46-81
Number of Speakers   85     53     160    41     191    25
[1] D. A. Van Leeuwen, J. Kessens, E. Sanders, and H. van den Heuvel, in Proc. Interspeech, pp. 2571-2574, 2009.
[1] H. Van hamme, in Proc. Interspeech, Australia, pp. 2554-2557, 2008.
Problem Statement:
Given a training dataset: S_tr = {(x_1, y_1), ..., (x_n, y_n), ..., (x_N, y_N)}
x_n is a vector of observed characteristics for the data item
y_n denotes a label vector which represents the class that x_n belongs to
Goal: approximate a classifier function g such that ŷ = g(x_tst) is as close as possible to the true label, where x_tst is an unseen observation.
SNMF in Training Phase:
First step:
Second step:
Multiplicative updating formula:
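As a rough illustration of the training phase, the sketch below assumes one common way of making NMF supervised: the label vectors and observations are stacked into a single nonnegative matrix [Y; X], which is then factorized as [Wy; Wx] H using the standard Lee-Seung multiplicative updates for the extended Kullback-Leibler divergence. The stacking scheme, the function names `snmf_train` and `kl_divergence`, and all parameters are assumptions made for illustration, not the authors' exact formulation.

```python
import numpy as np

def kl_divergence(V, R, eps=1e-12):
    """Extended KL divergence D(V || R), summed over all entries."""
    return float(np.sum(V * np.log((V + eps) / (R + eps)) - V + R))

def snmf_train(X, Y, rank, n_iter=200, seed=0, eps=1e-12):
    """Factorize the stacked matrix [Y; X] ~= [Wy; Wx] H (assumed scheme).

    X: (n_features, N) nonnegative observations, one column per item.
    Y: (n_classes, N) nonnegative label vectors, one column per item.
    """
    rng = np.random.default_rng(seed)
    V = np.vstack([Y, X])
    W = rng.random((V.shape[0], rank)) + eps
    H = rng.random((rank, V.shape[1])) + eps
    for _ in range(n_iter):
        # Update activations H, then the dictionary W (Lee-Seung KL updates).
        R = W @ H + eps
        H *= (W.T @ (V / R)) / (W.sum(axis=0)[:, None] + eps)
        R = W @ H + eps
        W *= ((V / R) @ H.T) / (H.sum(axis=1)[None, :] + eps)
    n_classes = Y.shape[0]
    return W[:n_classes], W[n_classes:], H

# Toy usage with random data and two alternating classes (illustrative only).
rng = np.random.default_rng(1)
X_toy = rng.random((6, 20))
Y_toy = np.eye(2)[:, np.arange(20) % 2]   # one-hot label vectors as columns
Wy, Wx, H = snmf_train(X_toy, Y_toy, rank=4)
print(Wy.shape, Wx.shape, H.shape)
```

The label part Wy and the observation part Wx share the same activations H, which is what lets labels be read off from activations estimated on unseen data.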
SNMF in Testing Phase:
First step:
Second step:
Extended Kullback-Leibler divergence:
Multiplicative updating formula:
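A minimal sketch of how the two testing steps could look, assuming the trained dictionary is split into an observation part Wx and a label part Wy: first the activations h of the unseen observation x_tst are estimated with Wx held fixed, using the standard multiplicative update for the extended Kullback-Leibler divergence; then the label estimate is read off as ŷ = Wy h. The function name `snmf_predict` and the toy matrices are hypothetical.

```python
import numpy as np

def snmf_predict(x_tst, Wx, Wy, n_iter=200, eps=1e-12):
    """Estimate activations h with Wx fixed, then return (Wy @ h, h)."""
    h = np.full(Wx.shape[1], 1.0 / Wx.shape[1])   # uniform positive start
    col_sums = Wx.sum(axis=0) + eps
    for _ in range(n_iter):
        r = Wx @ h + eps                           # current reconstruction
        h *= (Wx.T @ (x_tst / r)) / col_sums       # KL multiplicative update
    return Wy @ h, h

# Toy dictionary: column 0 models class 0, column 1 models class 1.
Wx = np.array([[1.0, 0.0],
               [1.0, 0.0],
               [0.0, 1.0],
               [0.0, 1.0]])
Wy = np.eye(2)
y_hat, h = snmf_predict(np.array([2.0, 2.0, 0.0, 0.0]), Wx, Wy)
print(int(np.argmax(y_hat)))   # index of the largest label score
```

Taking the arg max of ŷ is one simple decision rule; other rules, such as thresholding, are equally compatible with this sketch.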
Block diagram: Speech Signal, Feature selection, Feature Vectors, Speaker Independent Model, Speaker Adaptation Method
3. Supervector making procedure
Gaussian Mixture Model (GMM) of each speaker-adapted HMM is:
Three types of supervectors:
Weights supervectors: The result of this step is 555 supervectors, one for each of the 555 speakers.
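To make the idea of a weights supervector concrete, the sketch below assumes that it is formed by concatenating the GMM mixture weights of every state of a speaker-adapted HMM into one nonnegative vector, and that the supervectors of all speakers are then placed side by side as columns of a data matrix. The function names, the validation rule, and the toy weights are hypothetical.

```python
import numpy as np

def weights_supervector(state_weights):
    """Concatenate per-state GMM mixture weights into one nonnegative vector."""
    parts = [np.asarray(w, dtype=float) for w in state_weights]
    for w in parts:
        # Mixture weights of a GMM are nonnegative and sum to one.
        if (w < 0).any() or not np.isclose(w.sum(), 1.0):
            raise ValueError("mixture weights must be nonnegative and sum to 1")
    return np.concatenate(parts)

def stack_speakers(supervectors):
    """Place one supervector per speaker as a column of the data matrix."""
    return np.column_stack(supervectors)

# Two toy speakers, each with a 2-component and a 3-component state GMM.
speaker_a = weights_supervector([[0.7, 0.3], [0.2, 0.5, 0.3]])
speaker_b = weights_supervector([[0.4, 0.6], [0.1, 0.1, 0.8]])
X = stack_speakers([speaker_a, speaker_b])
print(X.shape)
```

Because mixture weights are nonnegative by construction, a matrix built this way can be passed to a nonnegative factorization without further transformation.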
Evaluation Methodology
Run 1 Run 2
Gender recognition accuracy is 96%.
Relative confusion matrix:
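For reference, a relative confusion matrix is usually obtained by dividing each row of the raw confusion counts by the number of items in that true class, so that each row sums to one. The sketch below shows that computation; the function name and the toy labels are made up for illustration and are not results from this work.

```python
import numpy as np

def relative_confusion_matrix(true_labels, predicted_labels, n_classes):
    """Row-normalized confusion matrix: entry (i, j) is the fraction of
    items of true class i that were assigned to class j."""
    counts = np.zeros((n_classes, n_classes))
    for t, p in zip(true_labels, predicted_labels):
        counts[t, p] += 1
    row_totals = counts.sum(axis=1, keepdims=True)
    # Rows for classes with no items stay all-zero instead of dividing by 0.
    return np.divide(counts, row_totals,
                     out=np.zeros_like(counts), where=row_totals > 0)

# Toy labels (illustrative only): three items of class 0, one of class 1.
rel = relative_confusion_matrix([0, 0, 0, 1], [0, 0, 1, 1], n_classes=2)
print(rel)
```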