Pattern Recognition – Part 4: Feature Extraction
Gerhard Schmidt, Christian-Albrechts-Universität zu Kiel, Faculty of Engineering, Institute of Electrical and Information Engineering, Digital Signal Processing and System Theory
Digital Signal Processing and System Theory | Pattern Recognition | Feature Extraction Slide 2
❑ Introduction
❑ Features for speech and speaker recognition
   ❑ Fundamental frequency
   ❑ Spectral envelope
❑ Representation of the spectral envelope
   ❑ Predictor coefficients
   ❑ Cepstral coefficients
   ❑ Mel-filtered cepstral coefficients (MFCCs)
[Block diagram: preprocessing for reduction of distortions (noise reduction, beamforming) → feature extraction → previously trained data banks with models → speech recognition / speech encoding / speaker encoding]
Estimation of the fundamental frequency:
❑ W. Hess: Pitch Determination of Speech Signals: Algorithms and Devices, Springer, 1983
Prediction:
❑ M. H. Hayes: Statistical Digital Signal Processing and Modeling, Chapters 4 and 5 (Signal Modeling, The Levinson Recursion), Wiley, 1996
❑ E. Hänsler, G. Schmidt: Acoustic Echo and Noise Control, Chapter 6 (Linear Prediction), Wiley, 2004
Mel-filtered cepstral coefficients:
❑ E. G. Schukat-Talamazzini: Automatische Spracherkennung: Grundlagen, statistische Modelle und effiziente Algorithmen, Vieweg, 1995 (in German)
❑ L. Rabiner, B.-H. Juang: Fundamentals of Speech Recognition, Prentice-Hall, 1993
Fundamental frequency:
❑ Feature extraction mostly with autocorrelation-based methods.
❑ Used for (rough) discrimination between male, female, and children's speech.
❑ The contour of the fundamental frequency can be used for estimating accentuations in speech (helpful for recognizing questions or grouped phone numbers) or the emotional state of the speaker.
❑ Certain types of noise can be distinguished from speech by estimating the fundamental frequency (e.g. "GSM buzz").
❑ It can be advantageous to "normalize" the frequency axis to the average fundamental frequency of a speaker.
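A minimal autocorrelation-based pitch estimator can be sketched as follows (illustrative Python, not from the slides; the search range and frame handling are assumptions):

```python
import numpy as np

def estimate_f0(frame, fs, f_min=50.0, f_max=400.0):
    """Autocorrelation-based estimate of the fundamental frequency of a
    voiced frame (no voicing decision, no lag interpolation)."""
    frame = frame - np.mean(frame)
    # One-sided autocorrelation of the frame
    r = np.correlate(frame, frame, mode="full")[len(frame) - 1:]
    # Search for the dominant peak in the plausible lag range
    lag_min = int(fs / f_max)
    lag_max = int(fs / f_min)
    lag = lag_min + np.argmax(r[lag_min:lag_max])
    return fs / lag
```

Practical systems add a voicing decision (e.g. based on the peak height relative to r[0]) and sub-sample lag interpolation on top of this basic scheme.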
Spectral envelope:
❑ The spectral envelope is currently the most important feature in speech and speaker recognition.
❑ The spectral envelope is extracted every 10 to 20 ms and then used in subsequent algorithms such as speech recognition.
❑ In order to reduce the computational complexity of the subsequent signal processing, the envelope should be computed in a compact form (with a small number of relevant parameters) and in a form that is suitable for a cost function.
❑ Some signal processing techniques (e.g. bandwidth extension, speech reconstruction) need a representation of the spectral envelope that can also be used in the signal path. Other methods (e.g. speech and speaker recognition) are not bound to this condition.
❑ Typically, either cepstral coefficients or so-called mel-filtered cepstral coefficients, also known as mel-frequency cepstral coefficients (MFCCs), are used.
Block extraction, downsampling (possibly windowing) → estimation of the autocorrelation → computation of the predictor coefficients → conversion into cepstral coefficients
Cost function for optimizing the coefficients:
Frequency components with high signal power are attenuated first (Parseval's theorem). This causes a spectral flattening (whitening) of the spectrum.
Structure of a prediction error filter:
Structure of a prediction error filter and an inverse filter:
The FIR version of the filter removes the spectral envelope. The IIR version of the filter reconstructs it.
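The whitening/reconstruction property of the two structures can be illustrated with a short sketch (hypothetical predictor coefficients; both filters are written out as explicit difference equations):

```python
import numpy as np

# Hypothetical predictor coefficients a_k (minimum-phase example)
a = np.array([0.8, -0.15])

rng = np.random.default_rng(0)
x = rng.standard_normal(200)   # stand-in for a speech frame

# FIR prediction error filter: e[n] = x[n] - sum_k a_k x[n-k]   (whitening)
e = np.zeros_like(x)
for n in range(len(x)):
    e[n] = x[n] - sum(a[k] * x[n - k - 1] for k in range(len(a)) if n - k - 1 >= 0)

# IIR inverse filter: x_rec[n] = e[n] + sum_k a_k x_rec[n-k]    (reconstruction)
x_rec = np.zeros_like(x)
for n in range(len(x)):
    x_rec[n] = e[n] + sum(a[k] * x_rec[n - k - 1] for k in range(len(a)) if n - k - 1 >= 0)
```

Cascading the FIR analysis filter and the IIR synthesis filter (with matching initial states) reproduces the input exactly, which is what makes this representation usable directly in the signal path.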
Frequency responses of inverse predictor error filters:
Typically, prediction orders between 10 and 20 are used for representing the spectral envelope.
Derivation:
❑ Cost function
❑ Error signal
❑ Differentiating the cost function
Derivation:
❑ Differentiating the cost function resulted in
❑ Setting the derivative to zero
Derivation:
❑ Setting the derivative to zero resulted in
❑ Equation system with N equations
Derivation:
❑ Matrix-vector notation
❑ Compact notation
Computationally efficient and robust solution of the equation system, e.g. using the Levinson-Durbin recursion.
Matlab example:
Requirements:
❑ A cost function should capture "distances" between spectral envelopes. Similar envelopes should cause a small distance, envelopes that differ a lot should lead to large distances, and identical envelopes should cause a distance of zero.
❑ The cost function should be invariant to variations in the recording level/gain of the input signal.
❑ The cost function should be "easy" to compute.
❑ The cost function should be similar to the human perception of sound (e.g. regarding the logarithmic loudness perception).
Approach: cepstral distance
Approach:
[Figure: two spectral envelopes over frequency in Hz and the resulting cepstral distance]
A well-known alternative: the quadratic distance
[Figure: two spectral envelopes over frequency in Hz and the resulting quadratic distance]
Cepstral distance (derivation via Parseval's theorem):
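By Parseval's theorem, the mean squared difference of two log magnitude spectra equals the squared Euclidean distance of their (two-sided) cepstral coefficient vectors. A small sketch (the example envelopes are illustrative, not from the slides):

```python
import numpy as np

def cepstral_distance(log_mag1, log_mag2):
    """Cepstral distance between two log magnitude spectra sampled on a DFT
    grid. By Parseval's theorem this equals the root mean square difference
    of the log spectra themselves."""
    c1 = np.fft.ifft(log_mag1).real
    c2 = np.fft.ifft(log_mag2).real
    return np.sqrt(np.sum((c1 - c2) ** 2))

# Two hypothetical all-pole envelopes 1/|A(e^jw)| on a 512-point grid
n_fft = 512
A1 = np.fft.fft(np.array([1.0, -0.8, 0.15]), n_fft)
A2 = np.fft.fft(np.array([1.0, -0.5]), n_fft)
lm1, lm2 = np.log(1.0 / np.abs(A1)), np.log(1.0 / np.abs(A2))
d = cepstral_distance(lm1, lm2)
```

In practice the sum is truncated to a small number of cepstral coefficients, which is what makes this distance cheap compared with evaluating the full spectra.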
Computationally efficient transformation from prediction to cepstral coefficients:
❑ Definition
❑ Fourier transform for time-discrete signals and systems
❑ Replacing by
Computationally efficient transformation from prediction to cepstral coefficients:
❑ Result so far
❑ Inserting the structure of the inverse prediction error filter
Computationally efficient transformation from prediction to cepstral coefficients:
❑ Result so far
❑ Computation of the coefficients with non-negative indices
❑ Using the series
❑ Insert
Computationally efficient transformation from prediction to cepstral coefficients:
❑ Computation of the coefficients with non-negative indices
❑ Result after inserting the series
❑ This results in
All coefficients with negative indices are zero.
Computationally efficient transformation from prediction to cepstral coefficients:
❑ Result so far
❑ Take the derivative
❑ Multiply both sides with […]
Computationally efficient transformation from prediction to cepstral coefficients:
❑ Result so far
❑ Comparing the coefficients for
❑ Comparing the coefficients for
Computationally efficient transformation from prediction to cepstral coefficients:
Recursive computation with very low complexity. The summation can be stopped with low error after about 3N/2 terms, because cepstral coefficients with a higher index contribute only very little to the underlying cost function.
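The recursion can be sketched as follows (one common textbook form; sign conventions for the predictor coefficients vary between references):

```python
import numpy as np

def lpc_to_cepstrum(a, n_ceps):
    """Recursively convert predictor coefficients a_1 ... a_N (stored in a,
    predictor x_hat[n] = sum_k a_k x[n-k]) into the cepstral coefficients
    c_1 ... c_{n_ceps} of the inverse prediction error filter 1/A(z)."""
    p = len(a)
    c = np.zeros(n_ceps + 1)             # c[0] (gain term) not computed here
    for m in range(1, n_ceps + 1):
        acc = a[m - 1] if m <= p else 0.0
        for k in range(max(1, m - p), m):
            acc += (k / m) * c[k] * a[m - k - 1]
        c[m] = acc
    return c[1:]

# Example: predictor from A(z) = (1 - 0.5 z^-1)(1 - 0.3 z^-1); the exact
# cepstrum of 1/A(z) is then c_m = (0.5^m + 0.3^m) / m.
c = lpc_to_cepstrum(np.array([0.8, -0.15]), 8)
```

Each coefficient is obtained from the previously computed ones, so no FFT or logarithm is needed at run time.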
Block extraction, downsampling (possibly windowing) → estimation of the autocorrelation → computation of the predictor coefficients → conversion into cepstral coefficients
❑ Typically, 15 to 30 cepstral coefficients are computed every 5 to 20 ms.
❑ For this purpose, 10 to 20 predictor coefficients are computed.
❑ The autocorrelation values needed for this are estimated from 20 to 50 ms of signal.
❑ This type of feature is commonly used when both the spectral envelope and the prediction error signal are used (coding, bandwidth extension, speech reconstruction).
Overview:
Block extraction, downsampling, and windowing → discrete Fourier transform → (squared) magnitude computation → mel filtering → logarithm → discrete cosine transform
Block extraction, downsampling, and windowing:
❑ Block extraction
❑ Downsampling
❑ Windowing
Discrete Fourier transform:
❑ Discrete Fourier transform
❑ In matrix-vector notation
Influence of the window function:
Input signal: two sinusoids with frequencies 300 Hz and 5000 Hz, amplitude ratio 66 dB. FFT order and window length: 512.
[Figure: magnitude spectra over frequency in Hz for a rectangular window and a Hann window]
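The effect can be reproduced with a few lines (the sample rate of 16 kHz is an assumption; the slide only specifies the two tone frequencies, the 66 dB amplitude ratio, and the length 512):

```python
import numpy as np

fs, n_fft = 16000, 512                     # fs is an assumption
n = np.arange(n_fft)
x = (np.sin(2 * np.pi * 300 * n / fs)      # strong tone at 300 Hz (0 dB)
     + 10 ** (-66 / 20) * np.sin(2 * np.pi * 5000 * n / fs))  # tone at -66 dB

mag_rect = np.abs(np.fft.rfft(x))                      # rectangular window
mag_hann = np.abs(np.fft.rfft(x * np.hanning(n_fft)))  # Hann window
```

With the rectangular window, the sidelobe leakage of the strong tone buries the weak 5 kHz component; with the Hann window, the lower sidelobes make it clearly visible.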
(Squared) magnitude computation:
❑ Squared magnitude
❑ Approximation of the magnitude (reduced dynamic range, reduced computational load)
❑ In matrix-vector notation
Mel filtering – part 1:
❑ Mel-frequency relation
❑ Linear splitting of the mel domain into N intervals of the same width
❑ Overlapping of the intervals by 50 % with the left and right neighbor
❑ Usually, triangular-shaped windows (in the linear frequency domain) are used
❑ The triangular filters are usually normalized such that they produce the same output
Mel filtering – part 2:
[Figure: splitting the mel range into 11 equally wide intervals; axes: frequency in Hz and frequency in mel]
Mel filtering – part 3:
[Figure: mel filter transfer functions over frequency in Hz, shown as linear and logarithmic plots]
Mel filtering – part 4:
❑ Typically, 15 to 30 mel filters are used for sample rates between 8 and 16 kHz
❑ Matrix-vector notation
❑ The filter matrix M [Figure: matrix entries over subband index and mel index]
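A typical construction of the triangular mel filter matrix might look as follows (a sketch; the exact edge handling and normalization differ between implementations):

```python
import numpy as np

def hz_to_mel(f):
    return 2595.0 * np.log10(1.0 + f / 700.0)

def mel_to_hz(m):
    return 700.0 * (10.0 ** (m / 2595.0) - 1.0)

def mel_filterbank(n_mel, n_fft, fs):
    """Triangular mel filters with 50 % overlap; rows are normalized to unit
    sum so that every filter responds equally to a flat spectrum."""
    mel_pts = np.linspace(0.0, hz_to_mel(fs / 2.0), n_mel + 2)
    bin_pts = np.floor((n_fft + 1) * mel_to_hz(mel_pts) / fs).astype(int)
    M = np.zeros((n_mel, n_fft // 2 + 1))
    for i in range(n_mel):
        left, center, right = bin_pts[i], bin_pts[i + 1], bin_pts[i + 2]
        for k in range(left, center):            # rising edge
            M[i, k] = (k - left) / max(center - left, 1)
        for k in range(center, right):           # falling edge (peak at center)
            M[i, k] = (right - k) / max(right - center, 1)
    M /= np.maximum(M.sum(axis=1, keepdims=True), 1e-12)
    return M
```

Applying M to the (squared) magnitude spectrum reduces the several hundred DFT bins to the 15 to 30 mel values mentioned above.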
Logarithm – part 1:
❑ Logarithm
❑ Alternatively, another base can be used for the logarithm.
❑ Similar to the mel filter bank, the logarithm, too, is motivated by human hearing: it is a simple approximation of the loudness.
Logarithm – part 2:
[Figure: three representations of the signal over frequency in Hz; the size of each picture represents the amount of data]
Discrete cosine transform – part 1:
❑ Symmetric extension of the logarithmic mel regions
❑ Extension matrix E
❑ Transform into the "time domain"
Discrete cosine transform – part 2:
❑ Because the input vectors are real-valued, the IDFT can be transformed into (a variant of) the IDCT.
❑ Shortening of the inversely transformed vector
❑ The transformation causes a "decorrelation" of the logarithmic features. It is an approximation of a principal component analysis.
❑ The shortening should reduce the influence of the fundamental speech frequency, i.e. coefficients for the high frequencies are omitted. Typically, the last third of the vector is removed.
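The symmetric-extension/IDFT construction can be sketched as follows (illustrative; up to scaling and a phase factor this equals a DCT-II of the log mel vector, and the truncation keeps only the first coefficients):

```python
import numpy as np

def log_mel_to_cepstrum(v, keep):
    """DCT of a log mel vector v via symmetric extension and the IDFT;
    only the first `keep` coefficients are returned."""
    M = len(v)
    w = np.concatenate([v, v[::-1]])                  # symmetric extension
    C = np.fft.ifft(w)                                # IDFT (complex-valued)
    phase = np.exp(1j * np.pi * np.arange(2 * M) / (2 * M))
    c = (C * phase).real[:M]                          # real DCT-II values (scaled)
    return c[:keep]
```

In practice one would call, e.g., `log_mel_to_cepstrum(v, 2 * len(v) // 3)` to drop the last third of the coefficients, as described above.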
Discrete cosine transform – part 3:
❑ For analyzing the decorrelation property of the inverse DCT, the feature vectors are first normalized by their variance after the mean has been removed. The normalization matrices contain the inverse standard deviations on their main diagonals.
❑ Afterwards, the autocorrelation matrices of both types of feature vectors are estimated:
Discrete cosine transform – part 4:
[Figures: variance-normalized autocorrelation matrices before and after the DCT]
Outlook:
❑ Often, several subsequent features are combined after the feature extraction. In some cases, the difference of two subsequent vectors is formed (so-called delta features) or even the difference of two subsequent differences (so-called delta-delta features).
❑ As an alternative, so-called supervectors can be formed by appending several subsequent feature vectors. Because the feature dimensionality is increased by doing so, so-called LDA matrices may be applied (LDA = linear discriminant analysis). The goal is to reduce the variance of features that belong to one class while maximizing the distance between classes. This allows reducing the dimensionality of the feature space without losing too much of the accuracy of the model.
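Simple difference-based delta and delta-delta features can be sketched as follows (one common variant; regression-based deltas over several frames are also widespread):

```python
import numpy as np

def add_deltas(features):
    """Append delta and delta-delta features formed as differences of
    consecutive feature vectors (rows = frames, columns = coefficients).
    The first frame's deltas are set to zero by repeating it."""
    d = np.diff(features, axis=0, prepend=features[:1])    # delta features
    dd = np.diff(d, axis=0, prepend=d[:1])                 # delta-delta features
    return np.concatenate([features, d, dd], axis=1)
```

For a sequence of D-dimensional feature vectors this yields 3D-dimensional vectors, which is why a subsequent LDA-based dimensionality reduction is often applied.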
Partner exercise:
❑ Please answer (in groups of two people) the questions that you will get during the lecture!
Summary:
❑ Introduction
❑ Features for speech and speaker recognition
   ❑ Pitch frequency
   ❑ Spectral envelope
❑ Representations for the spectral envelope
   ❑ Coefficients of a prediction filter
   ❑ Cepstral coefficients
   ❑ Mel-filtered/mel-frequency cepstral coefficients (MFCCs)
Next week:
❑ Training of codebooks