

SLIDE 1

High-Performance Session Variability Compensation in Forensic Automatic Speaker Recognition

Daniel Ramos, Javier Gonzalez-Dominguez, Eugenio Arevalo and Joaquin Gonzalez-Rodriguez

ATVS – Biometric Recognition Group, Universidad Autonoma de Madrid. daniel.ramos@uam.es, http://atvs.ii.uam.es

3aSC5 Special Session on Forensic Voice Comparison and Forensic Acoustics @ 2nd Pan-American/Iberian Meeting on Acoustics, Cancún, México, 15–19 November, 2010 http://cancun2010.forensic-voice-comparison.net

SLIDE 2

Outline

Forensic Automatic Speaker Recognition:

Where are we?

State of the art dominated by high-performance session variability compensation

Some challenges affecting session var. comp.

Database mismatch

Sparse background data

Duration variability

Research trends

Facing the challenges

2nd Pan-American Meeting on Acoustics, ASA, Cancun, Mexico, November 2010

SLIDE 3

Where Are We?

Automatic Speaker Recognition (ASpkrR) technology

Driven by NIST Speaker Recognition Evaluations (SRE)

State of the art dominated by

Spectral systems

High-performance session variability compensation

Factor Analysis, its flavors and evolutions

Data-driven

Currently a mature technology

Usable in many applications

SLIDE 4

Where Are We?

Discrimination performance (DET plots)

ATVS single spectral system in NIST SRE 2010

i-Vectors, session variability compensation

Primary Male: EER = 5.0%; Primary Female: EER = 7.1%; Contrastive Male: EER = 6.0%; Contrastive Female: EER = 8.1%
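An i-vector back-end of this kind typically scores two utterances by the cosine similarity of their session-compensated, length-normalized i-vectors. The sketch below illustrates only that final scoring step, with toy low-dimensional vectors standing in for real i-vectors; it is not necessarily the exact ATVS back-end.

```python
import math

def length_normalize(v):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]

def cosine_score(ivector_a, ivector_b):
    """Cosine similarity between two i-vectors; after length
    normalization this reduces to a plain inner product."""
    a = length_normalize(ivector_a)
    b = length_normalize(ivector_b)
    return sum(x * y for x, y in zip(a, b))

# Toy 4-dimensional "i-vectors" (real ones have a few hundred dims).
same_speaker = cosine_score([1.0, 2.0, 0.5, -1.0], [1.1, 1.9, 0.4, -0.9])
diff_speaker = cosine_score([1.0, 2.0, 0.5, -1.0], [-2.0, 0.1, 1.5, 2.0])
```

In a full system the session-variability compensation (e.g. within-class covariance normalization) is applied to the i-vectors before this scoring step.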

SLIDE 5

Where Are We?

To consider in Forensic ASpkrR

Convergence to scientific standards

“Emulating DNA”, Likelihood Ratio (LR) paradigm

Unfavorable environment

Mostly uncontrolled conditions

Sparse amount of speech (comparison and background)

SLIDE 6

Where Are We?

LR paradigm in Forensic ASpkrR

Speaker Recognition System → Score-to-LR Transformation (calibration) → LR

Score taken as Evidence (E):

LR = p(E | H_p, I) / p(E | H_d, I)

Two stages

Discrimination stage (standard, score-based architecture)

Calibration stage (LR computation)
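The calibration stage is often implemented as an affine score-to-log-LR map fitted by logistic regression (one common choice; the slide does not specify the method, so treat this sketch as an illustrative assumption rather than the authors' exact system).

```python
import math

def fit_linear_calibration(target_scores, nontarget_scores,
                           learn_rate=0.05, epochs=2000):
    """Fit log LR = a*score + b by logistic regression: gradient
    descent on the cross-entropy of same/different-speaker labels.
    With a balanced training set, the posterior log-odds a*s + b
    can be read directly as a (natural-log) likelihood ratio."""
    data = ([(s, 1.0) for s in target_scores] +
            [(s, 0.0) for s in nontarget_scores])
    a, b = 1.0, 0.0
    n = float(len(data))
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in data:
            p = 1.0 / (1.0 + math.exp(-(a * s + b)))
            grad_a += (p - y) * s / n
            grad_b += (p - y) / n
        a -= learn_rate * grad_a
        b -= learn_rate * grad_b
    return a, b

def score_to_llr(score, a, b):
    """Calibrated log likelihood ratio for a new comparison score."""
    return a * score + b

# Toy training scores: same-speaker comparisons score higher on average.
target = [2.1, 1.8, 2.5, 1.5, 2.9]
nontarget = [-1.2, -0.5, -2.0, -0.8, -1.5]
a, b = fit_linear_calibration(target, nontarget)
```

After fitting, a high score maps to a positive log-LR (supporting same speaker) and a low score to a negative one, which is exactly the two-stage architecture the slide describes.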

SLIDE 7

Where Are We?

Discrimination performance

Example with AhumadaIV-Baeza database

Thanks to Guardia Civil Española

NIST-SRE-like task: comparison between

120 s of GSM or microphone (controlled) speech

Acquired following Guardia Civil protocols

120 s of GSM-SITEL speech

Acquired using SITEL, the Spanish national wire-tapping system

SLIDE 8

NIST SRE vs. Forensic ASpkrR

Main commonalities

Highly variable environment (telephone, different microphones, interview, etc.)

LR paradigm

NIST SRE allows LR calibration (assessed by Cllr)…

…although we believe this should be further encouraged

But in Forensic ASpkrR (and not in NIST SRE)

Typical lack of representative background data

NIST SRE: lots of speech from past SRE

Utterance duration is uncontrolled

NIST SRE: conditions of fixed, controlled duration
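Cllr, the metric mentioned above, measures the average information loss (in bits) of a set of likelihood ratios: 0 for perfect LRs, 1 for uninformative ones. A direct sketch of its standard definition:

```python
import math

def cllr(target_lrs, nontarget_lrs):
    """Log-likelihood-ratio cost over a set of LRs from
    same-speaker (target) and different-speaker (non-target) trials."""
    c_tgt = sum(math.log2(1.0 + 1.0 / lr) for lr in target_lrs) / len(target_lrs)
    c_non = sum(math.log2(1.0 + lr) for lr in nontarget_lrs) / len(nontarget_lrs)
    return 0.5 * (c_tgt + c_non)

# A system that always answers LR = 1 carries no information: Cllr = 1.
neutral = cllr([1.0, 1.0], [1.0, 1.0])  # -> 1.0
```

Well-calibrated, discriminating LRs (large for targets, small for non-targets) drive Cllr toward zero; poorly calibrated ones can push it above 1.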

SLIDE 9

Challenges of Session Variability Compensation

Some typical forensic scenarios where session variability compensation degrades

Strong database mismatch

Sparse background data

Extreme duration variability

Scenarios not present in NIST SRE

Hence, little attention has been paid to these problems

SLIDE 10

Challenges: Database Mismatch

Speaker Recognition System → Score-to-LR Transformation (calibration) → LR

Background database conditions (different from Q and S conditions)

Database mismatch: background and comparison (Questioned Q, Suspect S) databases are different

Additional problem to mismatch between Q and S

Degrades performance of session variability compensation

Subspaces are not representative of comparison speech

SLIDE 11

Challenges: Database Mismatch

Example in NIST SRE 2008

Comparison of two speech utterances

Speech from a single channel (microphone m3 or m5)

Speech from any channel in SRE08

Speech from m3/m5 included or not in the background (UBM, normalization and session variability compensation)

[DET plot: m5 match, EER = 7.28%; m5 mismatch (no m5 in background), EER = 8.82%; m3 match, EER = 21.06%; m3 mismatch (no m3 in background), EER = 22.60%]

SLIDE 12

Challenges: Database Mismatch

Example: AhumadaIV-Baeza

Background: NIST SRE telephone-only speech

Bad performance at low false-acceptance rates when microphone speech is used for training

Even when the microphone speech is controlled and of higher quality

Following the standard acquisition procedures of Guardia Civil Española

SLIDE 13

Database Mismatch: Research

Need for collection of more representative databases

Case study: continuous efforts of Guardia Civil Española

Ahumada-Gaudi (2000, spontaneous speech, landline telephone and microphone)

AhumadaIII (2008, real forensic cases, multidialect, GSM over magnetic tape)

AhumadaIV (2009, speech from SITEL)

SLIDE 14

Database Mismatch: Research

Predictors of database mismatch

E.g., log-likelihood with respect to the UBM (UBML)

Low UBML indicates database mismatch

Performance degrades
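The UBML predictor is the average per-frame log-likelihood of an utterance under the UBM (a GMM trained on background data). A minimal sketch with a toy one-dimensional, two-component UBM whose parameters are hypothetical:

```python
import math

def gmm_loglike(frame, weights, means, variances):
    """Log-likelihood of one 1-D feature frame under a diagonal GMM (the UBM)."""
    total = 0.0
    for w, m, v in zip(weights, means, variances):
        total += w * math.exp(-0.5 * (frame - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
    return math.log(total)

def ubml(frames, weights, means, variances):
    """Average per-frame log-likelihood: a low value suggests the
    utterance is mismatched with the UBM (background) training data."""
    return sum(gmm_loglike(f, weights, means, variances) for f in frames) / len(frames)

# Toy 2-component 1-D UBM (hypothetical parameters).
w, m, v = [0.5, 0.5], [-1.0, 1.0], [1.0, 1.0]
matched = ubml([-0.9, 1.1, 0.0, -1.2], w, m, v)
mismatched = ubml([8.0, 9.5, 7.2, 10.0], w, m, v)  # frames far from UBM mass
```

Frames that fall far from the UBM's probability mass produce a low UBML, flagging the database mismatch that degrades compensation performance.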

SLIDE 15

Challenges: Sparse Background Data

Typical in forensics: some representative background data is available

But typically a sparse corpus

Optimal use of this background data for session variability compensation

Speaker Recognition System → Score-to-LR Transformation (calibration) → LR, with the background database supporting both stages and the Q and S material

SLIDE 16

Sparse Background Data: Research

Example: simulation using NIST SRE 2008

Rich background corpus of telephone data

Sparse background corpus of microphone data

Microphone and telephone data to be compared

Session variability compensation strategies

Joining compensation matrices

Pooling Gaussian statistics

Scaling Gaussian statistics
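The slides name "pooling" and "scaling" of Gaussian statistics without defining them; one plausible reading, sketched here purely as an assumption, combines the zeroth-order (N) and first-order (F) Baum-Welch statistics of the two corpora either by direct summation or by up-weighting the sparse corpus so it is not swamped.

```python
def pool_stats(n_tel, f_tel, n_mic, f_mic):
    """Pooling: sum the zeroth-order (N) and first-order (F)
    Baum-Welch statistics of both corpora per Gaussian component."""
    n = [a + b for a, b in zip(n_tel, n_mic)]
    f = [a + b for a, b in zip(f_tel, f_mic)]
    return n, f

def scale_stats(n_tel, f_tel, n_mic, f_mic, alpha):
    """Scaling: up-weight the sparse microphone statistics by alpha
    so they are not dominated by the much larger telephone corpus."""
    n = [a + alpha * b for a, b in zip(n_tel, n_mic)]
    f = [a + alpha * b for a, b in zip(f_tel, f_mic)]
    return n, f

# Toy 2-component statistics: the telephone corpus is ~100x larger.
n_tel, f_tel = [1000.0, 800.0], [500.0, -200.0]
n_mic, f_mic = [10.0, 8.0], [6.0, -3.0]
n_pool, f_pool = pool_stats(n_tel, f_tel, n_mic, f_mic)
n_scal, f_scal = scale_stats(n_tel, f_tel, n_mic, f_mic, alpha=100.0)
```

The combined statistics would then feed the estimation of the session-variability subspaces; the third strategy on the slide, joining compensation matrices, instead trains a subspace per corpus and concatenates them.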

SLIDE 17

Sparse Background Data: Research

Combination strategies of available data

Rich corpus: telephone data (dTel)

Small corpus, sparse microphone data (dMic3)

[Bar chart: EER (%) for train-test conditions 1conv4w–1conv4w, 1conv4w–1mic, 1mic–1conv4w and 1mic–1mic, comparing strategies U=0, dTel, dMic3, Joint, Pooling and Scaling]

SLIDE 18

Challenges: Duration Variability

Impact on session variability compensation and score normalization

Subspaces/cohorts trained with long utterances

Comparison with short utterances

Other effects

Misalignment in the scores due to duration variability

Degrades global discrimination performance

Seriously affects calibration

SLIDE 19

Challenges: Duration Variability

Impact on score normalization

Cohorts trained with fixed-length utterances

Example in Ahumada III (10 s test utterances)

EER with ZT-Norm: 12.48% with NIST SRE cohorts (roughly 150 s) vs. 10.46% with cohorts adjusted in length

Impact on session variability compensation

More difficult to avoid

Supervectors from short utterances are highly variable

More research needed

SLIDE 20

Challenges: Duration Variability

Duration variability: misalignment effects

Different ranges for different test segment durations

Even after score normalization (T-Norm)
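T-Norm standardizes each test score against the scores that the same test segment obtains versus a cohort of impostor models; fixed-length cohorts are what cause the duration misalignment described here. A minimal sketch:

```python
import math

def t_norm(raw_score, cohort_scores):
    """T-Norm: standardize a test score using the mean and standard
    deviation of the same test segment's scores against a cohort of
    impostor speaker models."""
    mu = sum(cohort_scores) / len(cohort_scores)
    var = sum((s - mu) ** 2 for s in cohort_scores) / len(cohort_scores)
    return (raw_score - mu) / math.sqrt(var)

# Toy cohort scores for one test segment against five impostor models.
cohort = [0.2, -0.1, 0.4, 0.1, -0.3]
normalized = t_norm(1.5, cohort)
```

If the cohort utterances are much longer than the test segment, the estimated mean and variance no longer match the test score's distribution, and scores from different durations land in different ranges even after normalization.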

SLIDE 21

Duration Variability: Research

Calibration incorporating duration variability

Corrects misalignments due to fixed-cohort normalizations

Improves overall discrimination performance

Score → Calibration Transformation (using duration information and training scores) → Log-Likelihood Ratio: different score alignment for different durations becomes better score alignment
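The duration-aware calibration in the diagram is not specified in detail; as a toy illustration (an assumption, not the authors' method), one can estimate from training scores how far each test-segment duration drifts from a reference duration and remove that shift before calibrating.

```python
def duration_shift(training, ref_duration):
    """Estimate, per duration, how far scores drift from the reference
    duration's mean, and return a correction function.
    `training` maps duration (s) -> list of training scores
    (e.g. T-normed non-target scores) at that duration."""
    ref_mean = sum(training[ref_duration]) / len(training[ref_duration])
    shift = {d: sum(s) / len(s) - ref_mean for d, s in training.items()}
    return lambda score, d: score - shift[d]

# Toy training scores: 10 s segments sit roughly 0.8 higher than 150 s ones.
training = {150: [-0.1, 0.1, 0.0], 10: [0.7, 0.9, 0.8]}
align = duration_shift(training, ref_duration=150)
aligned_short = align(1.0, 10)   # roughly 0.2 after removing the shift
aligned_long = align(1.0, 150)   # -> 1.0 (reference duration, no shift)
```

After this alignment, scores from different durations share a common range and a single score-to-LR calibration can be applied to all of them.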

SLIDE 22

Duration Variability: Research

Calibration incorporating duration variability

Corrects misalignments due to fixed-cohort normalizations

Improves overall discrimination performance

Exception…

SLIDE 23

Conclusions

High-performance session variability compensation

Works for NIST SRE scenarios

Works for forensic scenarios comparable to NIST

Forensic scenarios where session var. comp. degrades

Database mismatch

Sparse background data

Duration of utterances

Research directions

Predicting and compensating database mismatch

Robustness to the lack of background data

Robustness to variability in the duration of the utterances

SLIDE 24

High-Performance Session Variability Compensation in Forensic Automatic Speaker Recognition

Daniel Ramos, Javier Gonzalez-Dominguez, Eugenio Arevalo and Joaquin Gonzalez-Rodriguez

ATVS – Biometric Recognition Group, Universidad Autonoma de Madrid. daniel.ramos@uam.es, http://atvs.ii.uam.es