Odyssey 2016 The Speaker and Language Recognition Workshop June - - PowerPoint PPT Presentation



Odyssey 2016: The Speaker and Language Recognition Workshop, June 21-24, 2016, Bilbao, Spain. A new feature for automatic speaker verification anti-spoofing: constant Q cepstral coefficients. Massimiliano Todisco, Héctor Delgado and Nicholas Evans.


SLIDE 1

A new feature for automatic speaker verification anti-spoofing: constant Q cepstral coefficients

Odyssey 2016

The Speaker and Language Recognition Workshop

June 21-24, 2016, Bilbao, Spain

The OCTAVE project has received funding from the European Union’s Horizon 2020 research and innovation programme under grant agreement No 647850.

Massimiliano Todisco, Héctor Delgado and Nicholas Evans
Department of Digital Security, EURECOM, Sophia Antipolis, France

SLIDE 2

Introduction

  • Spoofing is the manipulation of a biometric system by a fraudulent user
  • Automatic speaker verification is vulnerable to spoofing
  • The spoofing algorithm cannot be known in advance
  • Generalised countermeasures are therefore needed
  • A new feature based on the constant Q transform is proposed

SLIDE 3

From Fourier to constant Q

  • The Fourier transform may lack frequency resolution
  • In the STFT, the time and frequency resolutions are constant
  • The constant Q transform (CQT) is an alternative which more closely reflects human perception
  • CQT employs a variable time/frequency resolution:
      • greater time resolution for higher frequencies
      • greater frequency resolution for lower frequencies
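
The variable resolution described above follows from keeping the ratio Q = centre frequency / bandwidth the same for every bin. The sketch below illustrates this under assumed settings: the minimum frequency, the number of bins per octave and the sample rate are illustrative values, not parameters taken from the slides.

```python
import math

def cqt_bins(f_min, f_max, bins_per_octave, sample_rate):
    """Return (centre frequency, bandwidth, window length) for each CQT bin."""
    # Constant quality factor: Q = f_k / bandwidth_k is identical for every bin.
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    n_bins = int(math.floor(bins_per_octave * math.log2(f_max / f_min))) + 1
    bins = []
    for k in range(n_bins):
        f_k = f_min * 2.0 ** (k / bins_per_octave)  # geometrically spaced centres
        bandwidth = f_k / q                          # Hz, grows with frequency
        window = q * sample_rate / f_k               # samples, shrinks with frequency
        bins.append((f_k, bandwidth, window))
    return bins

# Assumed, illustrative settings (not from the slides).
bins = cqt_bins(f_min=62.5, f_max=8000.0, bins_per_octave=12, sample_rate=16000)
for label, (f_k, bw, win) in (("lowest", bins[0]), ("highest", bins[-1])):
    print(f"{label} bin: f = {f_k:.1f} Hz, bandwidth = {bw:.2f} Hz, window = {win:.0f} samples")
```

Low-frequency bins get long windows and narrow bandwidths (fine frequency resolution), while high-frequency bins get short windows (fine time resolution). In the STFT, by contrast, one window length is shared by all bins.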

SLIDE 4

Constant Q Cepstral Coefficients (CQCC)

Block diagram of CQCC feature extraction:

speech signal → Constant-Q Transform → Power spectrum → LOG → Uniform resampling → DCT → CQCC

  • Combining CQT (in place of STFT) with traditional cepstral analysis
  • Issue: the discrete cosine transform (DCT) cannot be applied directly
      • CQT and DCT have different scales (geometric vs linear)
      • Geometric DCT bases are no longer orthogonal
  • Solution: uniformly resample the non-uniform frequency scale of the CQT to a linear frequency scale
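
The extraction steps above can be sketched for a single frame as follows. This is a minimal illustration, not the authors' implementation: it assumes the CQT magnitudes are already available, and it swaps in plain linear interpolation for the resampling step because the slides do not specify a resampling method. All function names are hypothetical.

```python
import math

def uniform_resample(values, freqs_geo, n_out):
    """Linearly interpolate values from geometric to uniform frequency spacing.

    Linear interpolation is an assumption made for brevity.
    """
    f_lo, f_hi = freqs_geo[0], freqs_geo[-1]
    out, j = [], 0
    for i in range(n_out):
        f = f_lo + (f_hi - f_lo) * i / (n_out - 1)   # uniformly spaced target
        while j < len(freqs_geo) - 2 and freqs_geo[j + 1] < f:
            j += 1                                    # find the enclosing segment
        t = (f - freqs_geo[j]) / (freqs_geo[j + 1] - freqs_geo[j])
        out.append((1.0 - t) * values[j] + t * values[j + 1])
    return out

def dct_ii(x, n_coeffs):
    """First n_coeffs coefficients of the (unnormalised) DCT-II."""
    n = len(x)
    return [sum(x[m] * math.cos(math.pi * p * (m + 0.5) / n) for m in range(n))
            for p in range(n_coeffs)]

def cqcc_frame(cqt_magnitudes, freqs_geo, n_out, n_coeffs):
    """Power spectrum -> log -> uniform resampling -> DCT for one frame."""
    log_power = [math.log(m * m + 1e-12) for m in cqt_magnitudes]  # small floor avoids log(0)
    linear_scale = uniform_resample(log_power, freqs_geo, n_out)
    return dct_ii(linear_scale, n_coeffs)

# Toy input: a flat spectrum on 85 geometrically spaced bins (assumed values).
freqs = [62.5 * 2.0 ** (k / 12) for k in range(85)]
coeffs = cqcc_frame([2.0] * 85, freqs, n_out=64, n_coeffs=20)
```

Because the DCT is applied only after the spectrum is on a linear scale, its cosine bases remain orthogonal. For the flat toy spectrum, all energy therefore lands in the 0th coefficient.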

SLIDE 5

Experiments & Results on ASVspoof 2015 Database

Comparison of results (EER [%]) on ASVspoof2015 Database

Front-end: CQCC-A (second-derivative coefficients of the 19 + 0th cepstral coefficients)
Back-end: 2 GMMs (512 components each, EM training), one for human speech and one for spoofed speech
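
A two-model back-end of this kind is commonly scored with a log-likelihood ratio between the human and spoofed models. The sketch below assumes that scoring rule and diagonal covariances, neither of which is stated on the slide, and uses tiny hand-written toy models in place of trained 512-component GMMs.

```python
import math

def gmm_log_likelihood(frame, weights, means, variances):
    """Log-likelihood of one feature frame under a diagonal-covariance GMM."""
    comp_logs = []
    for w, mu, var in zip(weights, means, variances):
        log_p = math.log(w)
        for x, m, v in zip(frame, mu, var):
            log_p += -0.5 * (math.log(2.0 * math.pi * v) + (x - m) ** 2 / v)
        comp_logs.append(log_p)
    peak = max(comp_logs)  # log-sum-exp for numerical stability
    return peak + math.log(sum(math.exp(c - peak) for c in comp_logs))

def llr_score(frames, human_gmm, spoof_gmm):
    """Average per-frame log-likelihood ratio; positive favours human speech."""
    total = sum(gmm_log_likelihood(f, *human_gmm) - gmm_log_likelihood(f, *spoof_gmm)
                for f in frames)
    return total / len(frames)

# Toy models as (weights, means, variances); hypothetical values, not trained.
human_gmm = ([0.5, 0.5], [[0.0, 0.0], [1.0, 1.0]], [[1.0, 1.0], [1.0, 1.0]])
spoof_gmm = ([1.0], [[4.0, 4.0]], [[1.0, 1.0]])

human_like = llr_score([[0.4, 0.6], [0.9, 1.1]], human_gmm, spoof_gmm)
spoof_like = llr_score([[3.9, 4.2], [4.1, 3.8]], human_gmm, spoof_gmm)
```

Comparing the score with a threshold gives the accept/reject decision. The EER is the error rate at the threshold where false acceptances and false rejections are equal.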

  • Known attacks: all the systems deliver excellent error rates
  • Unknown attacks: CQCC features give the best performance
      • Attack S10 (unit selection synthesis): EER = 1.065% → 87% relative improvement
  • Best spoofing detection performance reported to date (72% relative improvement)
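
The relative improvements quoted above are ratios of EERs, not differences in percentage points. As a rough check, assuming the usual definition (baseline − new) / baseline, the S10 figures imply a previous best EER of about 8%. The exact baseline is not given on the slide, and the 87% is itself rounded.

```python
def relative_improvement(baseline_eer, new_eer):
    """Relative EER reduction in percent: 100 * (baseline - new) / baseline."""
    return 100.0 * (baseline_eer - new_eer) / baseline_eer

def implied_baseline(new_eer, improvement_pct):
    """Baseline EER implied by a new EER and a relative improvement in percent."""
    return new_eer / (1.0 - improvement_pct / 100.0)

# S10 figures from the slide; the baseline is back-derived, not quoted.
baseline_s10 = implied_baseline(new_eer=1.065, improvement_pct=87.0)
print(f"implied S10 baseline EER: about {baseline_s10:.1f}%")
```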

SLIDE 6

Poster place

SLIDE 7

Poster place