Speaker Verification – The present and future of voiceprint based security


SLIDE 1

APSIPA

Asia-Pacific Signal and Information Processing Association

Speaker Verification – The present and future of voiceprint based security

  • Prof. Eliathamby Ambikairajah

Head of School of Electrical Engineering & Telecommunications, University of New South Wales, Australia

21 Oct 2013

SLIDE 2

APSIPA Distinguished Lecture Series @ IIU, Malaysia

Outline

  • Introduction
  • Speaker Verification Applications
  • Speaker Verification System
  • Performance measure
  • NIST Speaker Recognition Evaluation (SRE)
  • Discussion


SLIDE 3

Introduction

  • Speech conveys several types of information:

– Linguistic: message and language information
– Paralinguistic: emotional and physiological characteristics

From a single utterance such as “How are you?”, different systems recover different information: speech recognition (the message, “How are you?”), language recognition (English), speaker recognition (the speaker, e.g. Hsing Ming), emotion recognition (happy) and accent recognition (Taiwanese). The first two are linguistic; the rest are paralinguistic.

SLIDE 4

Introduction

  • Speaker Identification: determines who is speaking, given a set of enrolled speakers
  • Speaker Verification: determines whether an unknown voice is from the claimed speaker
  • Speaker Diarization: partitions an input audio stream into homogeneous segments according to speaker identity

SLIDE 5

Identification vs. verification (block diagrams):

  • Speaker identification: features from the unknown speaker are scored against every model in the repository (Speaker 1 Model … Speaker M Model) and the best-matching speaker is returned.
  • Speaker verification: the input is scored only against the claimed speaker's model (e.g. Speaker 2 Model) and the claim is accepted or rejected.

SLIDE 6

Speaker Verification Applications - Biometrics

  • Access control, e.g. to physical facilities
  • Transaction authentication, e.g. telephone credit-card purchases

SLIDE 7

Speaker Verification System – Basic Overview

  • In automatic speaker verification:

– The front-end converts the speech signal into a more convenient representation (typically a set of feature vectors)
– The back-end compares this representation to a model of a speaker to determine how well they match

Pipeline: Speech → Feature Extraction (front-end) → Classification against a Speaker Model → Decision Making → Accept/Reject (back-end)

SLIDE 8

Speaker Verification System – Verification Example

UBM: a general, speaker-independent model against which a person-specific model is compared when making an accept or reject decision.

A claimant says “I am John”. Feature extraction produces a cepstral feature vector (c0, c1, c2, …, cn) per frame. The system determines the level of match against John's model (likelihood of John) and against the UBM (likelihood of a generic male); decision making is based on the likelihood ratio, and in this example the outcome is NOT JOHN. Universal Background Models (generic male, generic female) and speaker models (Speaker 1 Model, John's Model) are held in model repositories.

SLIDE 9

Speaker Verification System – Speaker Enrolment

Enrolment proceeds in two steps:

Step 1 – Creating a male UBM: feature extraction from background male speaker data (Speaker 1 … Speaker N) followed by model training yields the generic male Universal Background Model (a generic female UBM is built likewise).

Step 2 – Creating male speaker-specific models: feature extraction from each target male speaker (x1 … xM) followed by model adaptation of the UBM yields the speaker models (Speaker x1 Model … Speaker xM Model).

SLIDE 10

Detailed Speaker Verification System

Pipeline: Speech → Feature Extraction → Feature Normalisation → Speaker Modelling → Model Normalisation → Classification (Scoring) → Score Normalisation → Decision Making → Accept/Reject

  • Feature normalisation: Cepstral Mean Subtraction (CMS), RelAtive SpecTrAl (RASTA), Feature Warping, Feature Mapping
  • Model normalisation: Nuisance Attribute Projection (NAP), Joint Factor Analysis (JFA), i-vectors, Within-Class Covariance Normalisation (WCCN), Linear Discriminant Analysis (LDA), Probabilistic Linear Discriminant Analysis (PLDA)
  • Score normalisation: Zero-normalisation (Z-norm), Test-normalisation (T-norm)

SLIDE 11

Front-end: Feature Extraction

The speech signal is divided into short frames (Frame 1 … Frame N, e.g. 25 ms each). Each frame is windowed and passed through feature extraction, yielding a cepstral feature vector (c0, c1, c2, …, cn) per frame, which is then feature-normalised. (Figure: the distribution of c0 before and after normalisation.)
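The framing and normalisation steps above can be sketched in a few lines. This is a minimal illustration, not the lecture's implementation: the 25 ms frame length is from the slide, while the 10 ms frame shift, the Hamming window and plain cepstral mean subtraction are common assumptions.

```python
import numpy as np

def frame_signal(x, fs, frame_ms=25.0, shift_ms=10.0):
    """Split a signal into overlapping Hamming-windowed frames
    (25 ms frames as on the slide; the 10 ms shift is an assumption)."""
    flen = int(fs * frame_ms / 1000)
    shift = int(fs * shift_ms / 1000)
    n_frames = 1 + (len(x) - flen) // shift
    win = np.hamming(flen)
    return np.stack([x[i * shift : i * shift + flen] * win
                     for i in range(n_frames)])

def cms(features):
    """Cepstral Mean Subtraction: remove each coefficient's mean over
    the utterance, a simple form of feature normalisation."""
    return features - features.mean(axis=0, keepdims=True)
```

With an 8 kHz signal, one second of audio gives 98 frames of 200 samples; after CMS each cepstral coefficient has zero mean over the utterance.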

SLIDE 12

Temporal Derivative

c0 c1 cn c2 c0 c1 cn c2 c0 c1 cn c2

Normalised Feature vectors (Frame 1) Normalised Feature vectors (Frame 2) Normalised Feature vectors (Frame P)

d0d1 dn d2

Delta Feature vectors (Frame 1) Delta Feature vectors (Frame 2) Delta Feature vectors (Frame P)

d0d1 dn d2 d0d1 dn d2

Temporal Derivative

a0a1 an a2

Acceleration Feature vectors (Frame 1) Acceleration Feature vectors (Frame 2) Acceleration Feature vectors (Frame P)

a0a1 an a2 a0a1 an a2

c0c1 cn c2

d0d1 dn d2

a0a1 an a2

Frame 1 Features: (e.g: 39 dimensions)
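A sketch of the delta computation, using the standard regression formula over a ±K frame window (K = 2 and edge-frame repetition are common assumptions, not stated on the slide):

```python
import numpy as np

def deltas(feats, K=2):
    """Temporal derivative via the regression formula
    d_t = sum_k k*(c_{t+k} - c_{t-k}) / (2*sum_k k^2) over a +/-K window;
    edges are handled by repeating the first/last frame.
    feats: (P, D) array of per-frame features."""
    P = len(feats)
    padded = np.pad(feats, ((K, K), (0, 0)), mode="edge")
    denom = 2.0 * sum(k * k for k in range(1, K + 1))
    out = np.zeros_like(feats, dtype=float)
    for k in range(1, K + 1):
        out += k * (padded[K + k : K + k + P] - padded[K - k : K - k + P])
    return out / denom

# static -> delta -> acceleration, concatenated per frame:
# full = np.hstack([c, deltas(c), deltas(deltas(c))])   # (P, 3*D)
```

Applying `deltas` twice gives the acceleration features; stacking all three blocks turns 13 static coefficients into the 39-dimensional vector mentioned on the slide.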

SLIDE 13

Detailed Speaker Verification System

(Recap: the same pipeline and normalisation techniques shown on Slide 10.)

SLIDE 14

Speaker Modelling

  • The probability density function is approximated by a 3-component Gaussian mixture model (GMM)
  • Each Gaussian component consists of a mean (µ), a covariance (Σ) and a weight (w); all weights must sum to 1

(Figure: the overall PDF is the sum of Weighted Gaussians 1–3; feature-space modelling and the resulting probability distribution are shown over Dimension 1 (C0) and Dimension 2 (C1).)
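Evaluating such a mixture density can be sketched as follows. This assumes diagonal covariances (a common choice in speaker verification, though the slide does not specify the covariance structure):

```python
import numpy as np

def gmm_logpdf(x, weights, means, variances):
    """Log-density of a diagonal-covariance GMM evaluated at frames x
    (N x D). weights must sum to 1; each component contributes
    w_m * N(x; mu_m, Sigma_m), summed in the log domain for stability."""
    x = np.atleast_2d(np.asarray(x, float))
    comp = []
    for w, mu, var in zip(weights, means, variances):
        ll = -0.5 * (np.sum(np.log(2 * np.pi * np.asarray(var, float)))
                     + np.sum((x - mu) ** 2 / var, axis=1))
        comp.append(np.log(w) + ll)
    return np.logaddexp.reduce(np.stack(comp), axis=0)
```

A single standard-normal component recovers the familiar Gaussian log-density, and splitting it into two identical half-weight components leaves the density unchanged, illustrating why the weights must sum to 1.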

SLIDE 15

Database for creating UBM (example)

  • Training set

– 56 male speakers (2 minutes of active speech each) for creating the UBM

  • Target set

– 20 male speakers (2 minutes of active speech each) for the speaker-specific models

  • Test set

– 250 male utterances (each speaker has many test utterances) with known identities

SLIDE 16

  • The Universal Background Model (UBM) consists of 1024 Gaussian mixture components; the target speaker model likewise consists of 1024 components
  • Each Gaussian mixture component consists of a mean (µ), covariance (Σ) and weight (w)

(Figure: target speaker data adapts UBM components 1, 2, …, 998, …, 1024 towards the target model over Feature Dimensions 1 and 2; illustrative values show one component's weight moving from 0.2 to 0.3, its mean from 0.8 to 0.9 and its covariance from 0.5 to 0.9.)
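The adaptation step can be sketched for the means alone. Mean-only MAP adaptation with a relevance factor of 16 is a common configuration in GMM-UBM systems, assumed here rather than taken from the slide:

```python
import numpy as np

def map_adapt_means(ubm_means, responsibilities, data, r=16.0):
    """Mean-only MAP adaptation of UBM component means towards a target
    speaker. ubm_means: (M, D); responsibilities: (T, M) per-frame
    component posteriors under the UBM; data: (T, D) target features;
    r: relevance factor (r=16 is a common assumption)."""
    n = responsibilities.sum(axis=0)                  # soft count per component
    ex = responsibilities.T @ data / np.maximum(n, 1e-10)[:, None]
    alpha = (n / (n + r))[:, None]                    # data-dependent adaptation weight
    return alpha * ex + (1.0 - alpha) * ubm_means     # interpolate UBM towards the data
```

Components that see a lot of target data move almost all the way to the data mean, while components with no responsibility stay at the UBM mean, which matches the picture of only some mixtures shifting towards the target speaker.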

SLIDE 17

Representing GMMs

  • The UBM and each speaker model is a GMM
  • Each mixture component (Mixture 1 … Mixture 1024) is represented by a 1×1 weight, a 39×1 mean vector and a 39×1 covariance vector
  • Stacked across components, each GMM is therefore represented by a 1×1024 weight vector, a 39×1024 means matrix and a 39×1024 covariances matrix

SLIDE 18

Decision Making

Score:

L = log [ (likelihood S came from the speaker model) / (likelihood S did not come from the speaker model) ]

i.e. the log-ratio of the likelihood of utterance S under the claimed speaker's model (e.g. John's model) to its likelihood under the Universal Background Model (e.g. generic male). The claim is accepted if L ≥ θ and rejected otherwise, for a decision threshold θ.
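The log-likelihood-ratio decision can be sketched as below. For brevity a single diagonal Gaussian stands in for each GMM likelihood (the real system would use the 1024-component models described earlier); the threshold θ = 0 is an arbitrary illustrative choice:

```python
import numpy as np

def avg_loglik(x, mu, var):
    """Average per-frame log-likelihood under a diagonal Gaussian
    (a single-component stand-in for the GMM likelihoods)."""
    x = np.atleast_2d(x)
    return float(np.mean(-0.5 * np.sum(np.log(2 * np.pi * var)
                                       + (x - mu) ** 2 / var, axis=1)))

def verify(x, spk_mu, spk_var, ubm_mu, ubm_var, theta=0.0):
    """Score L = log p(S|speaker) - log p(S|UBM); accept iff L >= theta."""
    L = avg_loglik(x, spk_mu, spk_var) - avg_loglik(x, ubm_mu, ubm_var)
    return L, L >= theta
```

Frames near the speaker's mean give a positive L (accept); swapping the two models flips the sign and the decision.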

SLIDE 19

Score Normalisation

Multiple systems (Speaker Verification System 1 … System N) perform speaker verification on the same speech in parallel.

SLIDE 20

Score Normalisation

Each system produces a raw score (Score 1 … Score N). These raw scores may not fall in the same range, i.e. they are NOT directly comparable.

SLIDE 21

Score Normalisation

Passing each raw score through score normalisation yields Normalised Score 1 … Normalised Score N; the normalised scores (L) are comparable.
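Z-norm, one of the score-normalisation techniques listed earlier, can be sketched as a simple standardisation. The impostor-score statistics would in practice be estimated offline per target model; here they are passed in directly:

```python
import numpy as np

def z_norm(raw_score, impostor_scores):
    """Zero-normalisation (Z-norm): standardise a trial score using the
    mean and standard deviation of impostor scores against the same
    target model, so scores from different systems become comparable.
    (T-norm is analogous, using a cohort of models at test time.)"""
    imp = np.asarray(impostor_scores, float)
    return float((raw_score - imp.mean()) / imp.std())
```

After Z-norm, a score is expressed in standard deviations above the impostor mean, the same scale for every system.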

SLIDE 22

Fusion

The final score is a weighted sum of the normalised scores s1 … sN from each system:

Final Score = w1·s1 + w2·s2 + … + wN·sN
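The weighted sum above is a one-liner; the fusion weights are assumed given (in practice they would be tuned on a held-out development set, which the slide does not cover):

```python
import numpy as np

def fuse(normalised_scores, weights):
    """Linear score fusion: Final Score = w1*s1 + ... + wN*sN,
    combining the N systems' normalised scores."""
    return float(np.dot(weights, normalised_scores))
```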

SLIDE 23

Performance measure

  • Types of error:

– Misses: a valid identity is rejected

  • Probability of miss: ratio of the number of falsely rejected speaker tests to the total number of true-speaker trials

– False alarms: an invalid identity is accepted

  • Probability of false alarm: ratio of the number of falsely accepted impostor tests to the total number of impostor trials

                  ACCEPT CLAIM        REJECT CLAIM
  TRUE SPEAKER    correct decision    miss
  IMPOSTOR        false acceptance    correct decision
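The two error probabilities defined above can be computed directly from trial scores and labels; this is a minimal sketch with an explicit threshold argument:

```python
import numpy as np

def error_rates(scores, is_target, theta):
    """Miss and false-alarm probabilities at decision threshold theta.
    is_target: True for true-speaker trials, False for impostor trials."""
    scores = np.asarray(scores, float)
    is_target = np.asarray(is_target, bool)
    p_miss = float(np.mean(scores[is_target] < theta))   # valid identity rejected
    p_fa = float(np.mean(scores[~is_target] >= theta))   # invalid identity accepted
    return p_miss, p_fa
```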

SLIDE 24

Performance measure – Detection Error Trade-off (DET) curve

The DET curve plots the miss rate (in %) against the false acceptance rate (in %); each point on the curve corresponds to a different decision threshold θ.

SLIDE 25

Performance measure – Detection Error Trade-off (DET) curve (continued)

The application operating point on the DET curve depends on the relative costs of the two error types (example shown: Equal Error Rate (EER) = 1%):

  • High security (e.g. wire transfer): false acceptance is very costly, and users may tolerate rejections for security
  • High convenience (e.g. customisation): false rejections alienate customers, while any customisation is beneficial
  • Or a balanced operating point between the two

(Axes: false acceptance rate in % vs. miss rate in %.)
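The EER mentioned above is the point where the two error rates coincide. A simple threshold sweep over the observed scores approximates it (real evaluations interpolate the DET curve; this sketch just returns the average of the two rates at their closest point):

```python
import numpy as np

def eer(scores, is_target):
    """Equal Error Rate: sweep the threshold over the observed scores and
    return (p_miss + p_fa) / 2 where the two error rates are closest,
    approximating the DET-curve crossover."""
    scores = np.asarray(scores, float)
    is_target = np.asarray(is_target, bool)
    best_gap, best_rate = np.inf, 1.0
    for theta in np.sort(scores):
        p_miss = np.mean(scores[is_target] < theta)
        p_fa = np.mean(scores[~is_target] >= theta)
        if abs(p_miss - p_fa) < best_gap:
            best_gap = abs(p_miss - p_fa)
            best_rate = (p_miss + p_fa) / 2
    return float(best_rate)
```

Perfectly separated scores give an EER of 0; one impostor scoring as high as a target pushes the EER up.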

SLIDE 26

NIST Speaker Recognition Evaluation (SRE)

  • Ongoing text-independent speaker recognition evaluations conducted by NIST (http://www.itl.nist.gov/iad/mig/tests/spk/)

– A driving force in advancing the state of the art
– Conditions for different amounts of data:

  • 10 seconds
  • 3–5 minutes
  • 8 minutes
  • Separate-channel and summed-channel conditions

– English speakers, non-English speakers, multilingual speakers

SLIDE 27

NIST SRE Trends

  • 1996 – First SRE in the current series
  • 2000 – AHUMADA Spanish data, first non-English speech
  • 2001 – Cellular data, Automatic Speech Recognition (ASR) transcripts provided
  • 2005 – Multiple languages with bilingual speakers, room-microphone recordings, cross-channel trials
  • 2008 – Interview data
  • 2010 – High and low vocal effort, aging, HASR (Human-Assisted Speaker Recognition) evaluation
  • 2012 – Broad range of test conditions, with added noise and reverberation; target speakers defined beforehand

SLIDE 28

Basic System

Pipeline: Speech → Feature Extraction → MAP adaptation (of the UBM) → Log-likelihood scoring

SLIDE 29

Trends

  • Around 2004: classification with Support Vector Machines (SVMs)

Pipeline: Speech → Feature Extraction → MAP adaptation (UBM) → Extract Supervectors → SVM Scoring

SLIDE 30

Trends

  • Around 2005: channel compensation with Nuisance Attribute Projection (NAP)

Pipeline: Speech → Feature Extraction → MAP adaptation (UBM) → Extract Supervectors → NAP → SVM Scoring

SLIDE 31

Trends

  • Around 2007: channel compensation with Joint Factor Analysis (JFA)

Pipeline: Speech → Feature Extraction → Baum-Welch statistics estimation → Factor analysis (UBM; JFA hyperparameters V, U, D) → Speaker factor extraction → WCCN → SVM

SLIDE 32

Trends

  • Around 2009: channel compensation with i-vectors

Pipeline: Speech → Feature Extraction → Baum-Welch statistics estimation → Factor analysis / i-vector extraction (UBM; Total Variability Matrix) → WCCN → LDA → Cosine Distance Scoring
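The final cosine distance scoring step is simply the cosine of the angle between the target and test i-vectors (after the WCCN/LDA compensation stages), compared against a threshold:

```python
import numpy as np

def cosine_score(target_ivec, test_ivec):
    """Cosine distance scoring: the cosine of the angle between the
    (channel-compensated) target and test i-vectors; higher means a
    better match, with a threshold deciding accept/reject."""
    t = np.asarray(target_ivec, float)
    s = np.asarray(test_ivec, float)
    return float(t @ s / (np.linalg.norm(t) * np.linalg.norm(s)))
```

Because only the angle matters, the score is invariant to the length of either i-vector.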

SLIDE 33

Trends

  • Around 2009: i-vectors scored with Probabilistic Linear Discriminant Analysis (PLDA)

Pipeline: Speech → Feature Extraction → Baum-Welch statistics estimation → Factor analysis / i-vector extraction (UBM; Total Variability Matrix) → PLDA → Log-likelihood scoring
