Oxford Wave Research Ltd (OWR): WHO WE ARE

Oct 07, 2022

SLIDE 1

OPENING THE BLACK BOX FOR FORENSIC AUTOMATIC SPEAKER RECOGNITION

Anil Alexander and Finnian Kelly Oxford Wave Research Ltd

SLIDE 2

• Oxford Wave Research Ltd (OWR) is an audio and speech R&D company based in Oxford, UK.
• We develop systems for:
  • Automatic Speaker Recognition
  • Speaker Diarization
  • Audio Fingerprinting

Our products are used by law enforcement in the UK, US, Europe and the Middle East, including the MET police, UK MoD, Netherlands Forensic Institute, German BKA etc.

WHO WE ARE

SLIDE 3

“AUTOMATIC SPEAKER RECOGNITION IS A …”

SLIDE 4

“AUTOMATIC SPEAKER RECOGNITION IS A …”

Lay people
Juries, judges and lawyers
Even forensic experts!

SLIDE 5

FORENSICS AND ‘THE BLACK BOX’

Recent advances in Speaker Recognition involve a huge number of variables: training and evaluation data, feature modelling and parameter choices. A lot of the focus has been on incremental improvements on large datasets where the variability is designed or controlled. How does this sit in the context of opening the black box in real forensic casework?

SLIDE 6

THE ENFSI GUIDELINES (2015)

Balance Logic Robustness Transparency

SLIDE 7

A TYPICAL AUTOMATIC PIPELINE

SLIDE 8

SOURCES OF DATA VARIABILITY

Multiple data selection decisions before you even get started

LDA/PLDA UBM Training TV Matrix

How does this affect the Likelihood Ratios or the Strength of evidence?
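One way to see why data selection matters is a minimal sketch of a score-based likelihood ratio, in which the same comparison score is evaluated against two different reference populations. The Gaussian score models, the numbers and the names `score_lr`, `population_a` and `population_b` are all assumptions made for illustration; they are not taken from the slides or from any particular system.

```python
from statistics import NormalDist


def score_lr(score, same_speaker, different_speaker):
    """Score-based likelihood ratio: density of the score under the
    same-speaker model divided by its density under the
    different-speaker (reference population) model."""
    return same_speaker.pdf(score) / different_speaker.pdf(score)


# Hypothetical score distributions (illustrative numbers only).
same_speaker = NormalDist(mu=2.0, sigma=1.0)
# Reference population whose scores are well separated from same-speaker scores.
population_a = NormalDist(mu=-2.0, sigma=1.0)
# Reference population whose scores overlap more with same-speaker scores.
population_b = NormalDist(mu=0.0, sigma=1.0)

score = 1.5  # one and the same comparison score
lr_a = score_lr(score, same_speaker, population_a)
lr_b = score_lr(score, same_speaker, population_b)

print(f"LR with reference population A: {lr_a:.1f}")
print(f"LR with reference population B: {lr_b:.1f}")
```

In this toy setting the score is identical in both cases, yet the reported strength of evidence differs by orders of magnitude, purely because of which data was chosen to model the different-speaker hypothesis.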

SLIDE 9

BENEFITTING FROM THE EXPERTISE OF FORENSIC PHONETICIANS

Most of the forensic speaker recognition casework is performed by forensic phoneticians who:

Have a lot of experience and knowledge in voice comparison and an understanding of the legal requirements in their area

Want to include automatic methods, but do not have any straightforward means of incorporating their knowledge into an automatic analysis.

Would like to make their speaker recognition analysis more objective using likelihood ratios and evaluating system performance for each case.

SLIDE 10

SEMI-AUTOMATIC AND AUTOMATIC SPEAKER RECOGNITION

*LTF illustration from Catalina Manual

SLIDE 11

TOWARDS A COMMON METHODOLOGICAL PLATFORM

Bayesian Likelihood Ratios
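As a reminder of how a likelihood ratio fits into the Bayesian framework, the odds form of Bayes' theorem states that posterior odds equal prior odds multiplied by the likelihood ratio. The sketch below illustrates that arithmetic only; the function names and the example numbers are hypothetical.

```python
def posterior_odds(prior_odds, likelihood_ratio):
    """Odds form of Bayes' theorem: posterior odds = prior odds x LR."""
    return prior_odds * likelihood_ratio


def odds_to_probability(odds):
    """Convert odds in favour of a hypothesis to a probability."""
    return odds / (1.0 + odds)


# The same likelihood ratio leads to different posteriors
# depending on the prior odds (illustrative values only).
lr = 100.0
for prior in (1 / 1000, 1.0):
    post = posterior_odds(prior, lr)
    print(f"prior odds {prior:g} -> posterior odds {post:g} "
          f"(probability {odds_to_probability(post):.3f})")
```

The likelihood ratio expresses the strength of the evidence on its own, which is why it can serve as a common reporting measure across semi-automatic and automatic methods while the prior odds remain outside the analysis.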

SLIDE 12

OPENING UP THE BLACK BOX

The ‘black box’ creates a situation in which the forensic expert is unable to look inside, or indeed adapt, the automatic system to their own requirements.

The expert should be able to change the system parameters and introduce new data at every step of the speaker recognition process.

The expert should not be limited to manufacturer-provided models or configurations, and should have the ability to train the system specifically for their problem domain.

SLIDE 13

OUR APPROACH

VOCALISE: Voice Comparison and Analysis of the Likelihood of Speech Evidence

Flexible Features
• ‘Automatic’ spectral features
• ‘Traditional’ forensic phonetic parameters
• ‘User’-provided features

Flexible Modeling
• State of the art i-vector/PLDA
• ‘Classical’ GMM/GMM-UBM

The ‘Session’ Concept:
• Pre-trained and optimised models provided
• Ability to introduce data at all stages of the i-vector pipeline
• Ability to adapt the system to the conditions of the case
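For readers unfamiliar with the ‘classical’ GMM-UBM option listed on this slide, it is commonly scored as the difference between the average log-likelihood of the test frames under a speaker model and under a universal background model (UBM). The toy sketch below uses one-dimensional features and hand-written model parameters; it is an assumption-laden illustration of that scoring idea, not VOCALISE's implementation, and it omits the adaptation step that would normally derive the speaker model from the UBM.

```python
import math


def gmm_log_likelihood(frames, weights, means, variances):
    """Average per-frame log-likelihood of 1-D frames under a GMM."""
    total = 0.0
    for x in frames:
        density = sum(
            w * math.exp(-0.5 * (x - m) ** 2 / v) / math.sqrt(2 * math.pi * v)
            for w, m, v in zip(weights, means, variances)
        )
        total += math.log(density)
    return total / len(frames)


def gmm_ubm_score(frames, speaker_model, ubm):
    """Log-likelihood ratio of the speaker model against the UBM."""
    return (gmm_log_likelihood(frames, *speaker_model)
            - gmm_log_likelihood(frames, *ubm))


# Hypothetical models as (weights, means, variances); illustrative values only.
ubm = ([0.5, 0.5], [-2.0, 2.0], [0.25, 0.25])
speaker_model = ([0.5, 0.5], [-1.5, 2.5], [0.25, 0.25])

frames_from_speaker = [-1.5, 2.5, -1.4, 2.6]  # close to the speaker model means
frames_from_other = [-2.0, 2.0, -2.1, 1.9]    # close to the UBM means

print(gmm_ubm_score(frames_from_speaker, speaker_model, ubm))  # positive score
print(gmm_ubm_score(frames_from_other, speaker_model, ubm))    # negative score
```

Every quantity here (the background data behind the UBM, the model parameters, the features) is a point at which an open system would let the expert introduce case-relevant data.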