Filter Banks (SPEECH RECOGNITION 40833) - PowerPoint PPT Presentation



SLIDE 1

Filter Banks

SPEECH RECOGNITION 40833

SLIDE 2

Spectral Analysis Models

(a) Pattern recognition and (b) acoustic-phonetic approaches to speech recognition

SLIDE 3

Spectral Analysis Models

  • LPC analysis model

SLIDE 4

THE BANK-OF-FILTERS FRONT-END PROCESSOR

  • Complete bank-of-filters analysis model

SLIDE 5

THE BANK-OF-FILTERS FRONT-END PROCESSOR

SLIDE 6

THE BANK-OF-FILTERS FRONT-END PROCESSOR

Typical waveforms and spectra for analysis of a pure sinusoid in the filter-bank model
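To make the sinusoid case concrete, here is a minimal sketch of what one channel does after its bandpass filter. It assumes the channel nonlinearity is a full-wave rectifier and the lowpass stage is a simple moving average; neither choice is stated on the slide. The sampling rate, tone frequency, and function name are hypothetical. For a unit-amplitude sinusoid, the smoothed output should settle close to 2/π.

```python
import math

def rectify_and_smooth(x, span):
    """Full-wave rectify x, then smooth with a moving average of `span` samples."""
    rectified = [abs(v) for v in x]
    out = []
    for n in range(span - 1, len(rectified)):
        window = rectified[n - span + 1 : n + 1]
        out.append(sum(window) / span)
    return out

fs = 8000.0   # hypothetical sampling rate in Hz
f0 = 500.0    # hypothetical sinusoid frequency in Hz (assumed to lie inside the channel passband)
x = [math.sin(2 * math.pi * f0 * n / fs) for n in range(400)]

# 80 samples span ten periods of the rectified tone, so the ripple averages out.
envelope = rectify_and_smooth(x, span=80)
print(min(envelope), max(envelope), 2 / math.pi)
```

The nearly constant envelope is the slowly varying channel output that a recognizer would sample at a reduced rate.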

SLIDE 7

THE BANK-OF-FILTERS FRONT-END PROCESSOR

Typical waveforms and spectra of a voiced speech signal in the bank-of-filters analysis model

SLIDE 8

THE BANK-OF-FILTERS FRONT-END PROCESSOR

Ideal (a) and realistic (b) sets of filter responses of a Q-channel filter bank covering the frequency range F_s/N to (Q + 1/2) F_s/N

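SLIDE 9

Types of Filter Bank Used for Speech Recognition

$$f_i = \frac{F_s}{N}\, i, \quad 1 \le i \le Q, \qquad Q \le N/2, \qquad b_i \ge \frac{F_s}{N}$$

Here F_s is the sampling rate, and f_i and b_i are the center frequency and bandwidth of the i-th filter.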
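As a small illustration of the uniform spacing, the sketch below lists the center frequencies under the assumption that f_i = (F_s/N) i for 1 ≤ i ≤ Q with Q ≤ N/2, and that each channel uses the minimum, non-overlapping bandwidth F_s/N. The function name and the numeric values are hypothetical.

```python
def uniform_filter_bank(fs, n, q):
    """Return (center_frequency, bandwidth) pairs for a uniform filter bank.

    Assumes f_i = (fs / n) * i for 1 <= i <= q, with q <= n / 2, and gives
    every channel the minimum bandwidth fs / n (no overlap between channels).
    """
    if 2 * q > n:
        raise ValueError("q must not exceed n / 2")
    spacing = fs / n
    return [(spacing * i, spacing) for i in range(1, q + 1)]

# Hypothetical values chosen only for illustration.
bank = uniform_filter_bank(fs=8000.0, n=32, q=15)
centers = [center for center, _ in bank]
print(centers[0], centers[-1])
```

With these values the channels are 250 Hz apart, and the top of the covered range, (Q + 1/2) F_s/N, is 3875 Hz.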

SLIDE 10

Nonuniform Filter Banks

$$b_1 = c$$

$$b_i = \alpha\, b_{i-1}, \quad 2 \le i \le Q$$

$$f_i = f_1 + \sum_{j=1}^{i-1} b_j + \frac{b_i - b_1}{2}$$

SLIDE 11

Nonuniform Filter Banks

Filter 1: f_1 = 300 Hz, b_1 = 200 Hz
Filter 2: f_2 = 600 Hz, b_2 = 400 Hz
Filter 3: f_3 = 1200 Hz, b_3 = 800 Hz
Filter 4: f_4 = 2400 Hz, b_4 = 1600 Hz
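The four filters listed above are consistent with a bandwidth recursion b_1 = c, b_i = α b_{i-1}, and center frequencies f_i = f_1 + Σ_{j<i} b_j + (b_i − b_1)/2, taking c = 200 Hz, f_1 = 300 Hz, and α = 2. A short sketch that reproduces them, with a hypothetical function name:

```python
def nonuniform_filter_bank(f1, c, alpha, q):
    """Return (center_frequency, bandwidth) pairs for a nonuniform filter bank.

    Bandwidths:  b_1 = c, b_i = alpha * b_{i-1}
    Centers:     f_i = f_1 + sum_{j=1}^{i-1} b_j + (b_i - b_1) / 2
    """
    bandwidths = [c]
    for _ in range(1, q):
        bandwidths.append(alpha * bandwidths[-1])
    bank = []
    for i in range(q):
        center = f1 + sum(bandwidths[:i]) + (bandwidths[i] - bandwidths[0]) / 2
        bank.append((center, bandwidths[i]))
    return bank

bank = nonuniform_filter_bank(f1=300.0, c=200.0, alpha=2.0, q=4)
for index, (center, bandwidth) in enumerate(bank, start=1):
    print(f"Filter {index}: f = {center:.0f} Hz, b = {bandwidth:.0f} Hz")
```

Setting α = 2 doubles the bandwidth from one channel to the next, which is the octave-band spacing.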

SLIDE 12

Types of Filter Bank Used for Speech Recognition

Ideal specification of a 4-channel octave band-filter bank (a), a 12-channel third-octave band filter bank (b), and a 7-channel critical band scale filter bank (c) covering the telephone bandwidth range (200-3200 Hz)

SLIDE 13

Types of Filter Bank Used for Speech Recognition

The variation of bandwidth with frequency for the perceptually based critical band scale

SLIDE 14

Implementations of Filter Banks

  • Instead of direct convolution, which is computationally expensive, we assume that each bandpass filter impulse response can be represented by:

$$h_i(n) = w(n)\, e^{j\omega_i n}$$

  • h_i(n), the impulse response of the i-th bandpass filter, is a fixed lowpass window, w(n), modulated by the complex exponential e^{j\omega_i n}.

SLIDE 15

Implementations of Filter Banks

The signals s(m) and w(n-m) used in evaluation of the short-time Fourier transform
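The link between the modulated-window filter and the short-time Fourier transform can be checked numerically. The sketch below assumes h_i(n) = w(n) e^{jω_i n} and S_n(e^{jω_i}) = Σ_m s(m) w(n−m) e^{−jω_i m}, and confirms that the direct convolution output equals e^{jω_i n} S_n(e^{jω_i}). The Hamming window, test signal, and function names are illustrative assumptions.

```python
import cmath
import math

def hamming(length):
    """Symmetric Hamming window, used here as the lowpass window w(n)."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

def direct_output(s, w, omega, n):
    """x_i(n) by direct convolution of s with h_i(m) = w(m) * exp(j*omega*m)."""
    total = 0j
    for m, w_m in enumerate(w):
        if 0 <= n - m < len(s):
            total += w_m * cmath.exp(1j * omega * m) * s[n - m]
    return total

def stft_output(s, w, omega, n):
    """exp(j*omega*n) * S_n, with S_n = sum_m s(m) * w(n - m) * exp(-j*omega*m)."""
    total = 0j
    for m, s_m in enumerate(s):
        if 0 <= n - m < len(w):
            total += s_m * w[n - m] * cmath.exp(-1j * omega * m)
    return cmath.exp(1j * omega * n) * total

# Hypothetical test signal and channel frequency.
s = [math.sin(0.3 * n) + 0.5 * math.cos(0.9 * n) for n in range(64)]
w = hamming(21)
omega_i = 2 * math.pi * 0.1
difference = abs(direct_output(s, w, omega_i, 40) - stft_output(s, w, omega_i, 40))
print(difference)
```

The two forms agree to rounding error, so each channel output can be obtained from the short-time transform rather than from a separate convolution per channel.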

SLIDE 16

Frequency Domain Interpretation of the Short-Time Fourier Transform

Short-time Fourier transform using a long (500 points or 50 msec) Hamming window on a section of voiced speech

[Figure axes: sample vs. value (waveform); frequency vs. log magnitude in dB (spectrum)]

SLIDE 17

Frequency Domain Interpretation of the Short-Time Fourier Transform

Short-time Fourier transform using a short (50 points or 5 msec) Hamming window on a section of voiced speech

[Figure axes: sample vs. value (waveform); frequency vs. log magnitude in dB (spectrum)]

SLIDE 18

Frequency Domain Interpretation of the Short-Time Fourier Transform

Short-time Fourier transform using a long (500 points or 50 msec) Hamming window on a section of unvoiced speech

[Figure axes: sample vs. value (waveform); frequency vs. log magnitude in dB (spectrum)]

SLIDE 19

Frequency Domain Interpretation of the Short-Time Fourier Transform

Short-time Fourier transform using a short (50 points or 5 msec) Hamming window on a section of unvoiced speech

[Figure axes: sample vs. value (waveform); frequency vs. log magnitude in dB (spectrum)]
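The effect of window length on frequency resolution can be sketched with a synthetic signal. The example below is an illustration, not a reproduction of the figures. It assumes a 10 kHz sampling rate (implied by 500 points = 50 msec) and uses two equal sinusoids 100 Hz apart as a hypothetical stand-in for adjacent pitch harmonics of voiced speech. It compares the spectral magnitude at a harmonic with the magnitude midway between the two harmonics.

```python
import cmath
import math

def hamming(length):
    """Symmetric Hamming window of the given length."""
    return [0.54 - 0.46 * math.cos(2 * math.pi * n / (length - 1))
            for n in range(length)]

def windowed_magnitude(frame, window, freq_hz, fs):
    """Magnitude of the windowed frame's Fourier transform at one frequency."""
    omega = 2 * math.pi * freq_hz / fs
    total = sum(x * w * cmath.exp(-1j * omega * n)
                for n, (x, w) in enumerate(zip(frame, window)))
    return abs(total)

fs = 10000.0  # assumed: 500 points = 50 msec
signal = [math.cos(2 * math.pi * 1000.0 * n / fs) +
          math.cos(2 * math.pi * 1100.0 * n / fs) for n in range(500)]

def dip_ratio(window_length):
    """Magnitude midway between the harmonics divided by magnitude at 1000 Hz."""
    window = hamming(window_length)
    frame = signal[:window_length]
    peak = windowed_magnitude(frame, window, 1000.0, fs)
    valley = windowed_magnitude(frame, window, 1050.0, fs)
    return valley / peak

long_ratio = dip_ratio(500)   # 50 msec window
short_ratio = dip_ratio(50)   # 5 msec window
print(long_ratio, short_ratio)
```

With the long window the spectrum drops sharply between the two components, so individual harmonics are resolved. With the short window there is no dip, and only the broad spectral envelope remains.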

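SLIDE 20

Linear Filter Interpretation of the STFT

[Figure: block diagram. The input s(n) is multiplied by e^{-j\omega_i n} to give \tilde{s}(n), which is passed through the lowpass filter w(n) to produce S_n(e^{j\omega_i}).]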

SLIDE 21

FFT Implementation of a Uniform Filter Bank

FFT implementation of a uniform filter bank
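A simplified sketch of the idea: window one frame of the signal, take an N-point FFT, and read the magnitudes of bins 1 through Q as the outputs of channels centered at i F_s/N. This is an illustration under assumed parameters (Hamming window, N = 16, F_s = 8000 Hz) and does not reproduce the exact structure shown in the figure. The small recursive radix-2 FFT is included only to keep the example self-contained.

```python
import cmath
import math

def fft(x):
    """Recursive radix-2 FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 1:
        return [complex(x[0])]
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        twiddle = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + twiddle
        out[k + n // 2] = even[k] - twiddle
    return out

def channel_magnitudes(frame, window, q):
    """Magnitudes of channels 1..q (center frequencies i * fs / N) for one frame."""
    spectrum = fft([x * w for x, w in zip(frame, window)])
    return [abs(spectrum[i]) for i in range(1, q + 1)]

fs = 8000.0   # hypothetical sampling rate in Hz
n_fft = 16    # hypothetical N, so channels are fs / N = 500 Hz apart
window = [0.54 - 0.46 * math.cos(2 * math.pi * n / (n_fft - 1)) for n in range(n_fft)]
tone = [math.cos(2 * math.pi * 1000.0 * n / fs) for n in range(n_fft)]

mags = channel_magnitudes(tone, window, q=7)
strongest_channel = 1 + mags.index(max(mags))
print(strongest_channel)
```

A single FFT per frame yields all Q channel outputs at once, which is where the saving over Q separate convolutions comes from.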

SLIDE 22

Direct implementation of an arbitrary filter bank

[Figure: block diagram. The input s(n) feeds Q parallel filters h_1(n), h_2(n), ..., h_Q(n), producing the channel outputs X_1(n), X_2(n), ..., X_Q(n).]
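The direct form can be sketched as Q independent FIR convolutions applied to the same input. The toy impulse responses below are hypothetical and chosen so the outputs are easy to verify by hand; a real bank would use bandpass designs.

```python
def fir_filter(signal, impulse_response):
    """Direct-form FIR filtering: y(n) = sum_m h(m) * s(n - m)."""
    output = []
    for n in range(len(signal)):
        acc = 0.0
        for m, h_m in enumerate(impulse_response):
            if n - m >= 0:
                acc += h_m * signal[n - m]
        output.append(acc)
    return output

def direct_filter_bank(signal, impulse_responses):
    """Run every filter h_i over the same input and return one output per channel."""
    return [fir_filter(signal, h) for h in impulse_responses]

# Toy filters (hypothetical): identity, one-sample delay, two-point average.
signal = [1.0, 2.0, 3.0, 4.0]
filters = [[1.0], [0.0, 1.0], [0.5, 0.5]]
outputs = direct_filter_bank(signal, filters)
print(outputs)
```

Each output sample costs one multiplication per filter tap in every channel, which is why the direct form becomes expensive for long filters and many channels.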

SLIDE 23

Nonuniform FIR Filter Bank Implementations

Two arbitrary nonuniform filter-bank filter specifications consisting of either 3 bands (part a) or 7 bands (part b).

SLIDE 24

Tree Structure Realizations of Nonuniform Filter Banks

SLIDE 25

Practical Examples of Speech-Recognition Filter Banks

[Figure axes: time in samples vs. value; frequency (kHz) vs. magnitude (dB)]

SLIDE 26

Practical Examples of Speech-Recognition Filter Banks

Window sequence, w(n) (part a), the individual filter responses (part b), and the composite response (part c) of a Q = 15 channel uniform filter bank, designed using a 101-point Kaiser window smoothed lowpass window (after Dautrich et al.).

[Figure axes: frequency (kHz) vs. magnitude (dB)]

SLIDE 27

Practical Examples of Speech-Recognition Filter Banks

[Figure axes: time in samples vs. value; frequency (kHz) vs. magnitude (dB)]

SLIDE 28

Practical Examples of Speech-Recognition Filter Banks

Window sequence, w(n) (part a), the individual filter responses (part b), and the composite response (part c) of a Q = 15 channel uniform filter bank, designed using a 101-point Kaiser window directly as the lowpass window (after Dautrich et al.).

[Figure axes: frequency (kHz) vs. magnitude (dB)]

SLIDE 29

Generalizations of Filter-Bank Analyzer

SLIDE 30

Generalizations of Filter-Bank Analyzer

SLIDE 31

Generalizations of Filter-Bank Analyzer

SLIDE 32

Generalizations of Filter-Bank Analyzer