SLIDE 1

Age and Gender Recognition from Speech Patterns Based on Supervised Nonnegative Matrix Factorization

July 2011

Mohamad Hasan Bahari

Hugo Van hamme

SLIDE 2

Outline

Introduction and Motivations
Age and Gender Recognition
Corpora
Supervised Nonnegative Matrix Factorization
Proposed Method
Results
Conclusions and Future Research

SLIDE 3

Introduction

Confirming the identity of individuals
Biometric Characteristics
Fingerprint
Face
Iris
Hand Geometry
Ear Shape
…
Choosing a characteristic
Availability
Reliability

SLIDE 4

Motivation

In many real-world cases, only speech patterns are available (kidnapping, threatening calls, …)

Speech patterns can carry a lot of useful information about the speaker
Gender
Age
Dialect (original or previous regions)
Membership of a particular social group
…

SLIDE 5

Goal

To extract different physical and psychological characteristics of the speaker from his/her voice patterns.

Physical:
Gender
Age
Accent
…

Psychological:
Anxiousness
Stress
Confidence

SLIDE 6

Age and Gender Recognition

SLIDE 7

Age and Gender Recognition

SLIDE 8

Age and Gender Recognition

[1] W. S. Brown, R. J. Morris, H. Hollien, and E. Howell, Journal of Voice, vol. 5, pp. 310–315, 1991.

SLIDE 9

Age and Gender Recognition

SLIDE 10

Age and Gender Recognition

SLIDE 11

Corpora

555 speakers from the N-Best evaluation corpus [1]

The corpus contains live and read commentaries, news, interviews, and reports broadcast in Belgium

Different age groups and genders

Age group (years)     18–35   18–35   36–45   36–45   46–81   46–81
Number of speakers       85      53     160      41     191      25

[1] D. A. Van Leeuwen, J. Kessens, E. Sanders, and H. van den Heuvel, in Proc. Interspeech, pp. 2571–2574, 2009.

SLIDE 12

SNMF

[1] H. Van hamme, in Proc. Interspeech, Australia, pp. 2554–2557, 2008.

SLIDE 13

SNMF

Problem Statement:

Given a training dataset S_tr = {(x_1, y_1), …, (x_n, y_n), …, (x_N, y_N)}
x_n is a vector of observed characteristics for the n-th data item
y_n denotes a label vector which represents the class that x_n belongs to

Goal:

Approximate a classifier function g such that ŷ = g(x_tst) is as close as possible to the true label, where x_tst is an unseen observation
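One common way to realize a label vector is one-hot encoding, where the entry for the item's class is 1 and all others are 0. The sketch below assumes that encoding, which the slide does not confirm; the helper `one_hot`, the feature values, and the three-class setup are hypothetical.

```python
def one_hot(class_index, n_classes):
    """Return a label vector with a 1 at class_index and 0 elsewhere."""
    if not 0 <= class_index < n_classes:
        raise ValueError("class_index out of range")
    return [1.0 if k == class_index else 0.0 for k in range(n_classes)]

# Hypothetical training set S_tr: pairs (x_n, y_n) with three classes.
training_set = [
    ([0.2, 0.5, 0.3], one_hot(0, 3)),
    ([0.6, 0.1, 0.3], one_hot(2, 3)),
]

for x_n, y_n in training_set:
    print(x_n, y_n)
```

A one-hot label vector is nonnegative, so it can be stacked with nonnegative features in a matrix factorization.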

SLIDE 14

SNMF

SNMF in Training Phase:

First step
Second step
Cost function: extended Kullback-Leibler divergence
Optimization: multiplicative update formulas

[Equations not recoverable; the surviving symbols are ≈, ∑, and ρ.]
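As a rough illustration of a training phase of this kind, the sketch below stacks label vectors on top of feature vectors and factorizes the result with the standard multiplicative updates for the extended Kullback-Leibler cost. It is an assumption that this mirrors the slide's formulas; the toy data, the rank, and the names `nmf_kl`, `W_y`, and `W_x` are hypothetical, and no weighting parameter is modeled.

```python
import numpy as np

def extended_kl(V, R, eps=1e-12):
    """Extended KL divergence D(V || R) = sum(V * log(V / R) - V + R)."""
    return float(np.sum(V * np.log((V + eps) / (R + eps)) - V + R))

def nmf_kl(V, rank, n_iter=200, seed=0, eps=1e-12):
    """Factorize V ~ W @ H with multiplicative updates for the KL cost."""
    rng = np.random.default_rng(seed)
    W = rng.random((V.shape[0], rank)) + 0.1
    H = rng.random((rank, V.shape[1])) + 0.1
    history = []
    for _ in range(n_iter):
        R = W @ H + eps
        H *= (W.T @ (V / R)) / (W.sum(axis=0)[:, None] + eps)
        R = W @ H + eps
        W *= ((V / R) @ H.T) / (H.sum(axis=1)[None, :] + eps)
        history.append(extended_kl(V, W @ H))
    return W, H, history

# Hypothetical toy data: 8 training items, 3 classes, 5 nonnegative features.
rng = np.random.default_rng(1)
class_ids = np.array([0, 1, 2, 0, 1, 2, 0, 1])
Y = np.eye(3)[:, class_ids]      # one-hot label vectors as columns
X = rng.random((5, 8))           # feature vectors as columns
V = np.vstack([Y, X])            # labels stacked on top of features

W, H, history = nmf_kl(V, rank=4)
W_y, W_x = W[:3], W[3:]          # label part and feature part of the dictionary
print(history[0], history[-1])
```

Because labels and features share the same activations `H`, each dictionary column pairs a feature pattern (`W_x`) with a class pattern (`W_y`).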

SLIDE 15

SNMF

SNMF in Testing Phase:

First step
Second step
Cost function: extended Kullback-Leibler divergence
Optimization: multiplicative update formula

[Equations not recoverable; the surviving symbols are ≈, ∑, and ρ.]
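A testing phase of this kind is often implemented by keeping the trained dictionary fixed, estimating nonnegative activations for the unseen observation, and then reading the label estimate from the label part of the dictionary. The sketch below assumes that scheme, which the slide does not confirm; the factors `W_y` and `W_x`, the observation `x_tst`, and the helper `estimate_activations` are hypothetical values chosen by hand.

```python
import numpy as np

def estimate_activations(x, W_x, n_iter=200, eps=1e-12):
    """Estimate h >= 0 with x ~ W_x @ h by KL multiplicative updates, W_x fixed."""
    h = np.ones(W_x.shape[1])
    col_sums = W_x.sum(axis=0) + eps
    for _ in range(n_iter):
        r = W_x @ h + eps
        h *= (W_x.T @ (x / r)) / col_sums
    return h

# Hypothetical trained factors: 2 classes, 3 features, 2 dictionary columns.
W_y = np.array([[1.0, 0.0],
                [0.0, 1.0]])
W_x = np.array([[0.8, 0.1],
                [0.1, 0.8],
                [0.1, 0.1]])

x_tst = np.array([0.75, 0.15, 0.10])     # unseen observation
h = estimate_activations(x_tst, W_x)     # activations of the dictionary columns
y_hat = W_y @ h                          # estimated label vector
predicted_class = int(np.argmax(y_hat))
print(y_hat, predicted_class)
```

Only the activations are updated here; the dictionary learned in training is left unchanged.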

SLIDE 16

Proposed Method

SLIDE 17

Proposed Method

Diagram labels: Speech Signal, Feature selection, Feature Vectors, …

SLIDE 18

Proposed Method

Speaker Independent Model
Speaker Adaptation Method

SLIDE 19

Proposed Method

Supervector making procedure

The Gaussian mixture model (GMM) of each speaker-adapted HMM is a weighted sum of Gaussian components.

Three types of supervectors:
1. Means
2. Variances
3. Weights

Weight supervectors: the result of this step is 555 supervectors, one for each of the 555 speakers.
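A weight supervector can be pictured as the concatenation of all mixture weights in a speaker's adapted model. The sketch below assumes that construction; the helper `weight_supervector` and the toy model with three states and two Gaussians per state are hypothetical.

```python
import numpy as np

def weight_supervector(state_weights):
    """Concatenate per-state GMM mixture weights into one nonnegative vector."""
    return np.concatenate([np.asarray(w, dtype=float) for w in state_weights])

# Hypothetical speaker model: 3 HMM states, 2 Gaussian components per state.
speaker_gmm_weights = [[0.7, 0.3], [0.5, 0.5], [0.1, 0.9]]
sv = weight_supervector(speaker_gmm_weights)
print(sv)
```

Mixture weights are nonnegative by definition, which makes this type of supervector directly usable as input to a nonnegative factorization.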

SLIDE 20

Proposed Method

SLIDE 21

Results

Evaluation Methodology

5-fold cross-validation (five independent runs)
In each of the five runs:
Training set is the speech data of 444 speakers
Testing set is the speech data of 111 speakers

Diagram labels: Database, Run 1, Run 2
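The 444/111 split follows from dividing 555 speakers into five equal folds and holding out one fold per run. The sketch below shows one way to generate such splits; the random shuffling, the seed, and the helper `five_fold_splits` are assumptions, and no balancing by age group or gender is attempted.

```python
import random

def five_fold_splits(speaker_ids, n_folds=5, seed=0):
    """Yield (train, test) speaker lists, holding out one fold per run."""
    ids = list(speaker_ids)
    random.Random(seed).shuffle(ids)
    folds = [ids[i::n_folds] for i in range(n_folds)]
    for k in range(n_folds):
        test = folds[k]
        train = [s for j, fold in enumerate(folds) if j != k for s in fold]
        yield train, test

splits = list(five_fold_splits(range(555)))
for run, (train, test) in enumerate(splits, start=1):
    print(f"Run {run}: {len(train)} training speakers, {len(test)} testing speakers")
```

Splitting by speaker rather than by utterance keeps every test speaker unseen during training.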

SLIDE 22

Results

Gender recognition rate: 96%

Relative confusion matrix: [values not recoverable]

Age group recognition: [results not recoverable]
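A relative confusion matrix is usually a count matrix whose rows are normalized by the number of items in each true class. The sketch below assumes that row-wise definition; the labels are invented purely for illustration and are not the results of this work.

```python
import numpy as np

def relative_confusion_matrix(true_labels, predicted_labels, n_classes):
    """Row-normalized confusion matrix: entry (i, j) is the share of class i predicted as j."""
    counts = np.zeros((n_classes, n_classes))
    for t, p in zip(true_labels, predicted_labels):
        counts[t, p] += 1
    row_totals = counts.sum(axis=1, keepdims=True)
    return counts / np.maximum(row_totals, 1)

# Hypothetical labels for 8 test items and 2 classes.
true_labels = [0, 0, 0, 0, 1, 1, 1, 1]
predicted_labels = [0, 0, 0, 1, 1, 1, 1, 1]

rel = relative_confusion_matrix(true_labels, predicted_labels, 2)
accuracy = float(np.mean(np.array(true_labels) == np.array(predicted_labels)))
print(rel, accuracy)
```

Row normalization makes classes with very different speaker counts comparable, since each row sums to 1.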

SLIDE 23

Conclusions and Future Research


SLIDE 24