Beyond the Equal Error Rate
About the inter-relationship between algorithm and application
Renana Peres, Comverse Technology
ISCA Archive
authentication
Market needs
Effective authentication tools for remote services: Direct banking, Service Centers, Home shopping, Calling Cards, E-commerce, Mobile Commerce, Smart cards
RF Signatures, Questions, Profiles, PIN codes
UNSAFE, EXPENSIVE, NOT FRIENDLY
Telecom, Service centers, Calling Cards ('97, US), Visa ('96, US), AT&T ('94, US)
10-30 B$/Y, 1 B$, 2 B$, 0.5 B$
IMAGE, SERVICE, ECONOMIC
Customer Satisfaction, Expenses, Profitability
Effective authentication
The barrier in the expansion of remote commerce services
Operational Scenarios
Free speech and vocal password applications
Free speech (Text Independent)
Applications: Call Centers, Cellular Roamers, Calling Cards, Voice / IP
Claimed id. -> Verify -> Accept/Reject
Vocal password (Text Dependent)
Applications: Credit Cards, IVR Interactions, Physical Access, E-commerce
Claimed id. -> Verify -> Accept/Reject
Voice based verification
Authentication solution for any remote services
Friendly: combines with transaction flow, no passwords, use of natural speech
Personal, biometric verification
Safe, Saves Costs
Fraud prevention, reduces bureaucracy, increases service volumes, shortens call duration
Typical Architecture
Integrated into the service provider infrastructures
Audio system, Storage system, Calling application, Management, Processing units, Coordinator
Research challenges:
Speaker separation and segmentation
Segmentation with unknown
Non-speech and silence vox
Non-password speech vox
Audio issues
IVR
Transfer volumes:
Call Center: 100 - 3000 agents = 100 concurrent calls
Telecom: 10 - 30 trunks = 300-900 concurrent calls
Free speech Vocal password
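The telecom figures above are consistent with 30 voice channels per trunk (10 x 30 = 300, 30 x 30 = 900). A minimal sketch of that arithmetic follows. It assumes E1-style trunks and 8 kHz, 8-bit PCM (64 kbit/s per call); neither assumption is stated on the slide, and the function names are illustrative only.

```python
CHANNELS_PER_TRUNK = 30       # assumption: E1-style trunk with 30 voice channels
PCM_BYTES_PER_SECOND = 8000   # assumption: 8 kHz sampling, 8 bits per sample


def concurrent_calls(trunks: int, channels_per_trunk: int = CHANNELS_PER_TRUNK) -> int:
    """Upper bound on simultaneous calls carried by the given number of trunks."""
    return trunks * channels_per_trunk


def audio_rate_bytes_per_second(calls: int) -> int:
    """Raw PCM audio the audio system must move per second for that many calls."""
    return calls * PCM_BYTES_PER_SECOND


if __name__ == "__main__":
    for trunks in (10, 30):
        calls = concurrent_calls(trunks)
        rate = audio_rate_bytes_per_second(calls)
        print(f"{trunks} trunks: {calls} calls, {rate / 1e6:.1f} MB/s of audio")
```

The point of the sketch is the order of magnitude: at the upper end the audio system handles several megabytes of raw audio per second before any recognition work starts.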
Storage
Internal vs. external storage architecture
[Diagram: Internal storage versus External storage; components shown: Audio system, Processing units, Coordinator, Disk chase]
Storage
[Diagram: Coordinator, Processing units, Audio System]
World models & data, Claimed identities, Verification results statistics, Verification audio
Storage issues:
Large storage volumes: 1 minute audio = 0.5 Mbyte (PCM)
Storage of audio objects
Backup, redundancy
Voice signature maintenance
Large, dynamic storage, containing voice & data
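The figure of 0.5 Mbyte per minute matches 8 kHz, 8-bit telephony PCM: 60 s x 8000 bytes/s = 480,000 bytes. The sketch below reproduces that number and scales it to a whole subscriber base. The sampling parameters are an assumption inferred from the figure, and the subscriber count and minutes per signature in the usage lines are hypothetical.

```python
SAMPLE_RATE_HZ = 8000    # assumption: telephone-band sampling rate
BYTES_PER_SAMPLE = 1     # assumption: 8-bit PCM


def pcm_bytes(minutes: float) -> int:
    """Size in bytes of uncompressed PCM audio of the given duration."""
    return int(minutes * 60 * SAMPLE_RATE_HZ * BYTES_PER_SAMPLE)


def signature_store_bytes(subscribers: int, minutes_each: float) -> int:
    """Audio volume if every subscriber keeps minutes_each of enrolment audio."""
    return subscribers * pcm_bytes(minutes_each)


if __name__ == "__main__":
    print(pcm_bytes(1))                        # bytes in one minute of audio
    total = signature_store_bytes(100_000, 3)  # hypothetical deployment size
    print(f"{total / 1e9:.0f} GB")
```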
Storage operations:
Create new VS
Add audio to VS
Remove session from VS
Remove VS
Add audio to world model
Store speaker model
Modify claimed id.
Get VS data
Stored objects: Voice signatures, Speaker models, Audio
[Diagram labels: Add to VS, Verify, car, Training, Training]
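The storage operations listed above can be sketched as a small in-memory interface. This is an illustration under stated assumptions, not the system's actual API: the class names, method names, and the use of Python dictionaries in place of a real storage back end are all hypothetical.

```python
from dataclasses import dataclass, field
from typing import Dict, List, Optional


@dataclass
class VoiceSignature:
    """One subscriber's voice signature (VS): audio sessions plus a trained model."""
    claimed_id: str
    sessions: Dict[str, bytes] = field(default_factory=dict)  # session id -> audio
    speaker_model: Optional[bytes] = None


class SignatureStore:
    """Hypothetical in-memory stand-in for the storage system."""

    def __init__(self) -> None:
        self._signatures: Dict[str, VoiceSignature] = {}
        self.world_audio: List[bytes] = []

    def create_vs(self, claimed_id: str) -> VoiceSignature:
        if claimed_id in self._signatures:
            raise KeyError(f"VS already exists for {claimed_id!r}")
        vs = VoiceSignature(claimed_id)
        self._signatures[claimed_id] = vs
        return vs

    def add_audio(self, claimed_id: str, session_id: str, audio: bytes) -> None:
        self._signatures[claimed_id].sessions[session_id] = audio

    def remove_session(self, claimed_id: str, session_id: str) -> None:
        del self._signatures[claimed_id].sessions[session_id]

    def remove_vs(self, claimed_id: str) -> None:
        del self._signatures[claimed_id]

    def add_world_audio(self, audio: bytes) -> None:
        self.world_audio.append(audio)

    def store_speaker_model(self, claimed_id: str, model: bytes) -> None:
        self._signatures[claimed_id].speaker_model = model

    def modify_claimed_id(self, old_id: str, new_id: str) -> None:
        if new_id in self._signatures:
            raise KeyError(f"VS already exists for {new_id!r}")
        vs = self._signatures.pop(old_id)
        vs.claimed_id = new_id
        self._signatures[new_id] = vs

    def get_vs(self, claimed_id: str) -> VoiceSignature:
        return self._signatures[claimed_id]
```

Keeping sessions addressable by id is what makes "Remove session from VS" possible, which matters once faulty sessions have to be taken out of a signature.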
Storage
Voice signature maintenance
Audio sessions are added to VS; VS is re-trained
[Timeline: call1, call2, call3 over Time; cellular; Training after each call]
Research challenges:
Time evolution of VS
VS update policy
Identification of faulty sessions
Re-training without audio
Compact speaker models
Recognition phases
Calibration, enrolment, verification
[Timeline: Calibration (Add to Cal, Train), Enrolment (Add to VS, Train), Verify; subscribers]
Calibration: Initial parameter settings, creating world models
Enrolment: VS data accumulation, creating speaker model
Verification: Match an incoming call against a claimed identity
Calibration
Initial parameter setting
Calibration data: world models, tuning data, other params
Large amount of audio
Heavy computation
No source labeling
Research challenges:
Calibration with mixed source data (unsupervised clustering?)
Time evolution of world models
Calibration for text-dependent applications (no impostor repetitions, no language info)
Enrolment
Voice Signature for each subscriber
During enrolment, alternative authentication methods are used
Research challenges:
Minimum user involvement
Signature robustness
Mixed source signature
Mixed source corpora
Measurement for VS quality
Free speech (off-line operation): First 2-3 calls -> Add call to VS -> Train -> VS ready for verification, or Problem in VS
Vocal Password: Enrolment session -> Repeat password -> Train -> More audio? -> VS ready for verification
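The free-speech enrolment flow (accumulate the first few calls, train, then declare the VS ready or report a problem) can be sketched as follows. The class name, the `train` callback, and the default of three calls are illustrative assumptions; a `train` that returns `None` stands in here for the "Problem in VS" outcome.

```python
from typing import Callable, List, Optional


class FreeSpeechEnrolment:
    """Collects calls for one subscriber and trains once enough audio exists."""

    def __init__(self, train: Callable[[List[bytes]], Optional[object]],
                 min_calls: int = 3) -> None:
        self.train = train          # hypothetical training routine
        self.min_calls = min_calls  # assumption: "first 2-3 calls"
        self.calls: List[bytes] = []
        self.model: Optional[object] = None

    def add_call(self, audio: bytes) -> bool:
        """Add one call; return True once the VS is ready for verification."""
        self.calls.append(audio)
        if len(self.calls) >= self.min_calls:
            # Re-train on every call after the minimum; None means a problem in the VS.
            self.model = self.train(self.calls)
        return self.model is not None


if __name__ == "__main__":
    enrolment = FreeSpeechEnrolment(train=len)  # dummy trainer for illustration
    for call in (b"call1", b"call2", b"call3"):
        print(enrolment.add_call(call))
```

Until `add_call` returns True, the application would keep using its alternative authentication method for this subscriber.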
Verification
The most frequent mission
[Diagram: DTMF, Speech recognition, CLI -> Claimed id. -> Verification API (Verify) -> Accept / Reject, or Another trial required]
Research challenges:
Multi trial verification
Share info between trials
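A three-way outcome (accept, reject, another trial required) can be sketched with two thresholds and a trial limit. Averaging the scores of all trials so far is only one simple way to share information between trials and is an assumption of this sketch, as are the function name, the threshold values, and the limit of three trials.

```python
from typing import List

ACCEPT = "accept"
REJECT = "reject"
RETRY = "another trial required"


def decide(scores: List[float], accept_at: float, reject_at: float,
           max_trials: int = 3) -> str:
    """Decide from the scores of all trials made so far in this verification."""
    if not scores:
        raise ValueError("at least one trial score is required")
    mean = sum(scores) / len(scores)  # assumption: pool trials by averaging
    if mean >= accept_at:
        return ACCEPT
    if mean <= reject_at or len(scores) >= max_trials:
        return REJECT
    return RETRY


if __name__ == "__main__":
    print(decide([0.0], accept_at=1.0, reject_at=-1.0))
    print(decide([0.0, 2.5], accept_at=1.0, reject_at=-1.0))
```

A score between the two thresholds asks the caller for another utterance instead of forcing an early decision, at the price of a longer interaction.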
Result update policies
For free speech applications
[Timeline: Call Start, Transaction 1, Transaction 2, Call End]
Upon request
Fixed intervals
Confidence level
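One way to read the three policies is as rules for when a running free-speech verification result is handed to the calling application during a call. The sketch below encodes that reading; the interpretation itself, the names, and the default interval and confidence values are assumptions.

```python
from enum import Enum


class UpdatePolicy(Enum):
    UPON_REQUEST = "upon request"
    FIXED_INTERVALS = "fixed intervals"
    CONFIDENCE_LEVEL = "confidence level"


def should_report(policy: UpdatePolicy, *, requested: bool = False,
                  seconds_since_last: float = 0.0, interval_s: float = 10.0,
                  confidence: float = 0.0, min_confidence: float = 0.9) -> bool:
    """Return True if the current result should be delivered now."""
    if policy is UpdatePolicy.UPON_REQUEST:
        return requested                          # e.g. just before a transaction
    if policy is UpdatePolicy.FIXED_INTERVALS:
        return seconds_since_last >= interval_s   # periodic update
    if policy is UpdatePolicy.CONFIDENCE_LEVEL:
        return confidence >= min_confidence       # report once the result is firm
    raise ValueError(f"unknown policy: {policy!r}")
```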
Decision Policy
[Plot: FA (false acceptance) and FR (false rejection) versus Threshold; operating points: Service oriented, Security oriented]
Algorithmic results + application cost function
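Combining algorithmic results with an application cost function can be illustrated by choosing the threshold that minimises a weighted sum of FA and FR rates. The scores and cost weights below are made-up illustration data, and the linear cost is an assumption; a real application could use a different cost function.

```python
from typing import Sequence, Tuple


def error_rates(genuine: Sequence[float], impostor: Sequence[float],
                threshold: float) -> Tuple[float, float]:
    """Return (FA rate, FR rate) when scores >= threshold are accepted."""
    fa = sum(s >= threshold for s in impostor) / len(impostor)
    fr = sum(s < threshold for s in genuine) / len(genuine)
    return fa, fr


def best_threshold(genuine: Sequence[float], impostor: Sequence[float],
                   cost_fa: float, cost_fr: float) -> float:
    """Pick the observed score that minimises cost_fa * FA + cost_fr * FR."""
    def cost(threshold: float) -> float:
        fa, fr = error_rates(genuine, impostor, threshold)
        return cost_fa * fa + cost_fr * fr

    return min(sorted(set(genuine) | set(impostor)), key=cost)


if __name__ == "__main__":
    genuine = [2.0, 1.5, 1.0, 0.2]      # made-up true-speaker scores
    impostor = [-1.0, -0.5, 0.5, 1.2]   # made-up impostor scores
    print(best_threshold(genuine, impostor, cost_fa=10, cost_fr=1))  # security oriented
    print(best_threshold(genuine, impostor, cost_fa=1, cost_fr=10))  # service oriented
    print(error_rates(genuine, impostor, 1.0))  # FA equals FR at this threshold
```

On this toy data the security-oriented weights push the threshold up and the service-oriented weights push it down, and neither choice coincides with the threshold where FA equals FR.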
Decision and Scoring
Research challenges:
Effective scoring
Likelihood ratio -> FA / FR
[Plot: Intra speaker and Inter speaker distributions over Likelihood Ratio; Decision Threshold; FA, FR; Posterior Probability]
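The step from a likelihood ratio to a posterior probability follows from Bayes' rule: the posterior log-odds equal the log likelihood ratio plus the prior log-odds. A minimal sketch, assuming the score is a natural-log likelihood ratio and that a prior probability of a true claim is available; the function name and the prior values are illustrative.

```python
import math


def posterior_from_llr(llr: float, prior_target: float) -> float:
    """Posterior probability that the claimed identity is true.

    llr is log(p(x | claimed speaker) / p(x | impostor)); prior_target is the
    assumed prior probability that a claim is genuine.
    """
    if not 0.0 < prior_target < 1.0:
        raise ValueError("prior_target must be strictly between 0 and 1")
    log_odds = llr + math.log(prior_target / (1.0 - prior_target))
    # Numerically stable logistic function.
    if log_odds >= 0:
        return 1.0 / (1.0 + math.exp(-log_odds))
    e = math.exp(log_odds)
    return e / (1.0 + e)


if __name__ == "__main__":
    print(posterior_from_llr(0.0, 0.5))
    print(posterior_from_llr(math.log(9), 0.5))
```

Because the prior enters the posterior directly, the same likelihood ratio can justify different decisions in applications with different fraud rates.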
Management
Tools for system monitoring and maintenance
General information: System status, Mission status, Loads
Speaker Recognition information: Data collection status, Rejection cases, Performance measurements, Feedback
Summary
Algorithms, Applications
Telephony, Transfer volumes, Service, User behavior
New research challenges
Summary of Research challenges
Audio: segmentation
Calibration: (unsupervised clustering?); text-dependent applications (no impostor repetitions, no language info)
Enrolment
Verification
Storage
Scoring
Corpora