filter banks

Filter Banks: SPEECH RECOGNITION 40833, Spectral Analysis - PowerPoint PPT Presentation



  1. Filter Banks: SPEECH RECOGNITION 40833

  2. Spectral Analysis Models: (a) pattern recognition and (b) acoustic-phonetic approaches to speech recognition

  3. Spectral Analysis Models: LPC analysis model

  4. THE BANK-OF-FILTERS FRONT-END PROCESSOR: Complete bank-of-filters analysis model

  5. THE BANK-OF-FILTERS FRONT-END PROCESSOR

  6. THE BANK-OF-FILTERS FRONT-END PROCESSOR: Typical waveforms and spectra for analysis of a pure sinusoid in the filter-bank model

  7. THE BANK-OF-FILTERS FRONT-END PROCESSOR: Typical waveforms and spectra of a voiced speech signal in the bank-of-filters analysis model

  8. THE BANK-OF-FILTERS FRONT-END PROCESSOR: Ideal (a) and realistic (b) set of filter responses of a Q-channel filter bank covering the frequency range $F_s/N$ to $(Q + 1/2)F_s/N$

  9. Types of Filter Bank Used for Speech Recognition (uniform filter bank): $f_i = \frac{F_s}{N}\, i,\ 1 \le i \le Q$; $Q \le N/2$; $b_i \ge \frac{F_s}{N}$

  10. Nonuniform Filter Banks: $b_1 = c$; $b_i = \alpha\, b_{i-1},\ 2 \le i \le Q$; $f_i = f_1 + \sum_{j=1}^{i-1} b_j + \frac{b_i - b_1}{2}$

  11. Nonuniform Filter Banks:
      Filter 1: $f_1 = 300$ Hz, $b_1 = 200$ Hz
      Filter 2: $f_2 = 600$ Hz, $b_2 = 400$ Hz
      Filter 3: $f_3 = 1200$ Hz, $b_3 = 800$ Hz
      Filter 4: $f_4 = 2400$ Hz, $b_4 = 1600$ Hz

  12. Types of Filter Bank Used for Speech Recognition: Ideal specification of a 4-channel octave band-filter bank (a), a 12-channel third-octave band filter bank (b), and a 7-channel critical band scale filter bank (c)

  13. Types of Filter Bank Used for Speech Recognition: Ideal specification of a 4-channel octave band-filter bank (a), a 12-channel third-octave band filter bank (b), and a 7-channel critical band scale filter bank (c) covering the telephone bandwidth range (200-3200 Hz). The variation of bandwidth with frequency for the perceptually based critical band scale.

  14. Implementations of Filter Banks: Direct convolution is computationally expensive. Instead, the impulse response of the i-th bandpass filter, $h_i(n)$, is assumed to be a fixed lowpass window, $w(n)$, modulated by a complex exponential: $h_i(n) = w(n)\, e^{j\omega_i n}$

  15. Implementations of Filter Banks: The signals s(m) and w(n-m) used in evaluation of the short-time Fourier transform

  16. Frequency Domain Interpretation of the Short-Time Fourier Transform: Short-time Fourier transform using a long (500 points or 50 msec) Hamming window on a section of voiced speech [Figure: value vs. sample; log magnitude (dB) vs. frequency]

  17. Frequency Domain Interpretation of the Short-Time Fourier Transform: Short-time Fourier transform using a short (50 points or 5 msec) Hamming window on a section of voiced speech [Figure: value vs. sample; log magnitude (dB) vs. frequency]

  18. Frequency Domain Interpretation of the Short-Time Fourier Transform: Short-time Fourier transform using a long (500 points or 50 msec) Hamming window on a section of unvoiced speech [Figure: value vs. sample; log magnitude (dB) vs. frequency]

  19. Frequency Domain Interpretation of the Short-Time Fourier Transform: Short-time Fourier transform using a short (50 points or 5 msec) Hamming window on a section of unvoiced speech [Figure: value vs. sample; log magnitude (dB) vs. frequency]

  20. Linear Filter Interpretation of the STFT [Block diagram: $s(n)$ is multiplied by $e^{-j\omega_i n}$ to give $\tilde{s}(n)$, which is filtered by $w(n)$ to give $S_n(e^{j\omega_i})$]

  21. FFT Implementation of a Uniform Filter Bank

  22. Direct implementation of an arbitrary filter bank [Block diagram: the input $s(n)$ feeds the filters $h_1(n), h_2(n), \ldots, h_Q(n)$ in parallel, producing the outputs $X_1(n), X_2(n), \ldots, X_Q(n)$]

  23. Nonuniform FIR Filter Bank Implementations: Two arbitrary nonuniform filter-bank filter specifications consisting of either 3 bands (part a) or 7 bands (part b).

  24. Tree Structure Realizations of Nonuniform Filter Banks

  25. Practical Examples of Speech-Recognition Filter Banks [Figure: value vs. time in samples; magnitude (dB) vs. frequency (kHz)]

  26. Practical Examples of Speech-Recognition Filter Banks: Window sequence, w(n), (part a), the individual filter response (part b), and the composite response (part c) of a Q = 15 channel, uniform filter bank, designed using a 101-point Kaiser window smoothed lowpass window (after Dautrich et al.). [Figure: magnitude (dB) vs. frequency (kHz)]

  27. Practical Examples of Speech-Recognition Filter Banks [Figure: value vs. time in samples; magnitude (dB) vs. frequency (kHz)]

  28. Practical Examples of Speech-Recognition Filter Banks: Window sequence, w(n), (part a), the individual filter response (part b), and the composite response (part c) of a Q = 15 channel, uniform filter bank, designed using a 101-point Kaiser window directly as the lowpass window (after Dautrich et al.). [Figure: magnitude (dB) vs. frequency (kHz)]

  29. Generalizations of Filter-Bank Analyzer

  30. Generalizations of Filter-Bank Analyzer

  31. Generalizations of Filter-Bank Analyzer

  32. Generalizations of Filter-Bank Analyzer
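
The filter-bank layouts on slides 9 to 11 can be checked numerically. The following is a minimal Python sketch, not taken from the slides: the function names `uniform_bank` and `nonuniform_bank` are hypothetical, and the sampling rate and channel count in the uniform example are assumed values chosen only for illustration. It assumes the uniform rule is f_i = (Fs/N) i with Q <= N/2, and the nonuniform rule is b_1 = c, b_i = alpha * b_(i-1), f_i = f_1 + sum of b_1..b_(i-1) + (b_i - b_1)/2.

```python
def uniform_bank(fs, n, q):
    """Center frequencies f_i = (fs / n) * i for i = 1..q (requires q <= n / 2)."""
    if q > n / 2:
        raise ValueError("q must not exceed n / 2")
    return [fs / n * i for i in range(1, q + 1)]


def nonuniform_bank(f1, c, alpha, q):
    """Return (center frequency, bandwidth) pairs for i = 1..q.

    Bandwidths follow b_1 = c and b_i = alpha * b_{i-1}; centers follow
    f_i = f1 + sum(b_1 .. b_{i-1}) + (b_i - b_1) / 2.
    """
    bands = []
    b = c        # bandwidth of the current channel
    cum = 0.0    # running sum of the bandwidths of all earlier channels
    for _ in range(q):
        bands.append((f1 + cum + (b - c) / 2, b))
        cum += b
        b *= alpha
    return bands


if __name__ == "__main__":
    # Octave-spaced bank with f1 = 300 Hz, b1 = 200 Hz, alpha = 2;
    # this reproduces the four filters listed on slide 11.
    for f, b in nonuniform_bank(300, 200, 2, 4):
        print(f"center = {f:.0f} Hz, bandwidth = {b} Hz")

    # Uniform bank; fs = 8000 Hz and n = 32 are assumed values for illustration.
    print(uniform_bank(8000, 32, 4))
```

With alpha = 2 each bandwidth doubles, so the bands are contiguous octaves: the first spans 200 to 400 Hz and the fourth spans 1600 to 3200 Hz.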
