SLIDE 1

Linear Prediction Analysis of Speech Sounds

References:

  • 1. X. Huang et al., Spoken Language Processing, Chapters 5 and 6
  • 2. J. R. Deller et al., Discrete-Time Processing of Speech Signals, Chapters 4–6
  • 3. J. W. Picone, "Signal modeling techniques in speech recognition," Proceedings of the IEEE, September 1993, pp. 1215–1247

Berlin Chen, 2004

SLIDE 2

Linear Predictive Coefficients (LPC)

  • An all-pole filter with a sufficient number of poles is a good approximation to model the vocal tract (filter) for speech signals
    – It predicts the current sample as a linear combination of its several past samples
  • Also known as linear predictive coding, LPC analysis, or auto-regressive modeling

Vocal tract parameters: $a_1, a_2, \ldots, a_p$

$H(z) = \frac{X(z)}{E(z)} = \frac{1}{A(z)} = \frac{1}{1 - \sum_{k=1}^{p} a_k z^{-k}}$

$\tilde{x}[n] = \sum_{k=1}^{p} a_k\, x[n-k]$

$\therefore\ e[n] = x[n] - \tilde{x}[n] = x[n] - \sum_{k=1}^{p} a_k\, x[n-k]$

The filter $H(z)$ maps the excitation $e[n]$ to the speech signal $x[n]$.
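As a quick numerical check of the predictor above (a sketch in Python rather than the course's MATLAB; the signal and the coefficient are made up): a signal that is exactly auto-regressive gives zero residual once past samples are available.

```python
import numpy as np

def prediction_error(x, a):
    """e[n] = x[n] - sum_{k=1}^{p} a[k-1] * x[n-k], with x[n] = 0 for n < 0."""
    p = len(a)
    e = np.zeros(len(x))
    for n in range(len(x)):
        pred = sum(a[k] * x[n - 1 - k] for k in range(p) if n - 1 - k >= 0)
        e[n] = x[n] - pred
    return e

# A signal obeying x[n] = 0.5 * x[n-1] exactly: residual is zero for n >= 1
x = np.array([1.0, 0.5, 0.25, 0.125])
e = prediction_error(x, [0.5])
```

Only e[0] is nonzero, because no past sample exists to predict the first one.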

SLIDE 3

Short-Term Analysis: Algebra Approach

  • Estimate the corresponding LPC coefficients as those that minimize the total short-term prediction error (minimum mean squared error)

After framing/windowing, the total short-term prediction error for a specific frame m is

$E_m = \sum_{n} e_m^2[n] = \sum_{n} \left( x_m[n] - \tilde{x}_m[n] \right)^2 = \sum_{n} \left( x_m[n] - \sum_{j=1}^{p} a_j\, x_m[n-j] \right)^2$

Take the derivative with respect to each coefficient and set it to zero:

$\frac{\partial E_m}{\partial a_i} = \frac{\partial}{\partial a_i} \sum_{n} \left( x_m[n] - \sum_{j=1}^{p} a_j\, x_m[n-j] \right)^2 = -2 \sum_{n} \left( x_m[n] - \sum_{j=1}^{p} a_j\, x_m[n-j] \right) x_m[n-i] = 0, \quad \forall\, 1 \le i \le p$

$\Rightarrow \sum_{n} e_m[n]\, x_m[n-i] = 0, \quad \forall\, 1 \le i \le p$

The error vector is orthogonal to the past vectors. This property will be used later on!

SLIDE 4

Short-Term Analysis: Algebra Approach

Substituting the derivative condition gives the normal equations:

$\sum_{n} \left( x_m[n] - \sum_{j=1}^{p} a_j\, x_m[n-j] \right) x_m[n-i] = 0, \quad \forall\, 1 \le i \le p$

$\Rightarrow \sum_{j=1}^{p} a_j \sum_{n} x_m[n-i]\, x_m[n-j] = \sum_{n} x_m[n]\, x_m[n-i], \quad \forall\, 1 \le i \le p$

Define correlation coefficients: $\phi_m[i,j] = \sum_{n} x_m[n-i]\, x_m[n-j]$

$\Rightarrow \sum_{j=1}^{p} a_j\, \phi_m[i,j] = \phi_m[i,0], \quad \forall\, 1 \le i \le p$   (to be used on the next page!)

In matrix form, $\boldsymbol{\Phi}\mathbf{a} = \boldsymbol{\Psi}$:

$\begin{bmatrix} \phi_m[1,1] & \phi_m[1,2] & \cdots & \phi_m[1,p] \\ \phi_m[2,1] & \phi_m[2,2] & \cdots & \phi_m[2,p] \\ \vdots & \vdots & \ddots & \vdots \\ \phi_m[p,1] & \phi_m[p,2] & \cdots & \phi_m[p,p] \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} \phi_m[1,0] \\ \phi_m[2,0] \\ \vdots \\ \phi_m[p,0] \end{bmatrix}$
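The normal equations can be sketched directly from the definitions (Python used for illustration; the frame x and the order p are made-up inputs, and x is treated as zero outside its support, i.e. the autocorrelation convention):

```python
import numpy as np

def phi(x, i, j):
    """phi_m[i, j] = sum_n x[n-i] * x[n-j], with x taken as 0 outside 0..N-1."""
    N = len(x)
    total = 0.0
    for n in range(N + max(i, j)):
        xi = x[n - i] if 0 <= n - i < N else 0.0
        xj = x[n - j] if 0 <= n - j < N else 0.0
        total += xi * xj
    return total

def lpc_from_normal_equations(x, p):
    """Solve sum_j a_j phi[i,j] = phi[i,0] for i = 1..p."""
    Phi = np.array([[phi(x, i, j) for j in range(1, p + 1)] for i in range(1, p + 1)])
    Psi = np.array([phi(x, i, 0) for i in range(1, p + 1)])
    return np.linalg.solve(Phi, Psi)

x = [1.0, 2.0, 3.0]
a = lpc_from_normal_equations(x, 1)   # a1 = phi[1,0] / phi[1,1] = 8/14
```

Brute-force sums are used here to mirror the derivation; the following slides show how the structure of phi lets this be computed and solved far more efficiently.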

SLIDE 5

Short-Term Analysis: Algebra Approach

  • The minimum error for the optimal $a_j,\ 1 \le j \le p$:

$E_m = \sum_{n} e_m^2[n] = \sum_{n} \left( x_m[n] - \sum_{j=1}^{p} a_j\, x_m[n-j] \right)^2 = \sum_{n} x_m^2[n] - 2 \sum_{n} x_m[n] \sum_{j=1}^{p} a_j\, x_m[n-j] + \sum_{n} \left( \sum_{j=1}^{p} a_j\, x_m[n-j] \right) \left( \sum_{k=1}^{p} a_k\, x_m[n-k] \right)$

Use the property derived on the previous page: by the normal equations, the quadratic term equals the cross term,

$\sum_{n} \left( \sum_{j=1}^{p} a_j\, x_m[n-j] \right) \left( \sum_{k=1}^{p} a_k\, x_m[n-k] \right) = \sum_{j=1}^{p} \sum_{k=1}^{p} a_j a_k \sum_{n} x_m[n-j]\, x_m[n-k] = \sum_{j=1}^{p} a_j \sum_{n} x_m[n-j]\, x_m[n]$

so the total prediction error is

$E_m = \sum_{n} x_m^2[n] - \sum_{j=1}^{p} a_j \sum_{n} x_m[n]\, x_m[n-j] = \phi_m[0,0] - \sum_{j=1}^{p} a_j\, \phi_m[0,j]$

The error can be monitored to help establish p.

SLIDE 6

Short-Term Analysis: Geometric Approach

  • Vector representations of error and speech signals

$\mathbf{e}_m \perp \mathbf{x}_m^i, \quad \forall\, 1 \le i \le p$

The prediction error vector must be orthogonal to the past vectors (this property has been shown previously, on slide 3).

$x_m[n] = \sum_{k=1}^{p} a_k\, x_m[n-k] + e_m[n], \quad 0 \le n \le N-1$

With $\mathbf{x}_m = \left( x_m[0], \ldots, x_m[N-1] \right)^T$, $\mathbf{e}_m = \left( e_m[0], \ldots, e_m[N-1] \right)^T$, and the past vectors $\mathbf{x}_m^i = \left( x_m[-i], x_m[1-i], \ldots, x_m[N-1-i] \right)^T$ collected as the column vectors of $\mathbf{X} = \left[ \mathbf{x}_m^1\ \mathbf{x}_m^2\ \cdots\ \mathbf{x}_m^p \right]$, this reads in matrix form

$\mathbf{X}\mathbf{a} + \mathbf{e}_m = \mathbf{x}_m$

If $\mathbf{e}_m$ is minimal, then $\mathbf{X}^T \mathbf{e}_m = \mathbf{0}$:

$\mathbf{X}^T \left( \mathbf{x}_m - \mathbf{X}\mathbf{a} \right) = \mathbf{0} \;\Rightarrow\; \mathbf{X}^T \mathbf{X}\mathbf{a} = \mathbf{X}^T \mathbf{x}_m \;\Rightarrow\; \mathbf{a} = \left( \mathbf{X}^T \mathbf{X} \right)^{-1} \mathbf{X}^T \mathbf{x}_m$
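The geometric picture can be checked numerically: stacking the shifted (here zero-padded) past vectors as columns of X, the least-squares solution is exactly a = (XᵀX)⁻¹Xᵀx_m, and the residual is orthogonal to every column. A Python sketch with a made-up frame:

```python
import numpy as np

x = np.array([1.0, 2.0, 3.0, 2.0, 1.0])   # made-up frame of length N
p = 2                                      # model order

x_pad = np.concatenate((x, np.zeros(p)))   # target vector, zero-padded to N + p
# Column i holds x_m[n-i]: the frame delayed by i samples
X = np.column_stack([np.concatenate((np.zeros(i), x, np.zeros(p - i)))
                     for i in range(1, p + 1)])

a = np.linalg.solve(X.T @ X, X.T @ x_pad)  # a = (X^T X)^{-1} X^T x_m
e = x_pad - X @ a                          # prediction error vector
ortho = X.T @ e                            # ~0: e is orthogonal to the past vectors
```

The inner products in `ortho` vanish to machine precision, which is the defining property of the least-squares residual.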

SLIDE 7

Short-Term Analysis: Autocorrelation Method

  • $x_m[n]$ is identically zero outside $0 \le n \le N-1$
  • The mean-squared error is calculated within $n = 0 \sim N-1+p$

Framing/windowing:

$\tilde{x}_m[n] = x[n + mL]$

$x_m[n] = \tilde{x}_m[n]\, w[n] = \begin{cases} x[n + mL]\, w[n], & 0 \le n \le N-1 \\ 0, & \text{otherwise} \end{cases}$

L: frame period, the length of time between successive frames. Each frame covers samples $mL$ to $mL+N-1$ of $x[n]$, shifted by $L$ from frame to frame.
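The framing/windowing step can be sketched as follows (Python for illustration; the frame length N, frame period L, and toy signal are arbitrary choices):

```python
import numpy as np

def make_frames(x, N, L, window=None):
    """x_m[n] = x[n + m*L] * w[n] for n = 0..N-1; rectangular window by default."""
    w = np.ones(N) if window is None else window
    num_frames = 1 + (len(x) - N) // L
    return np.array([x[m * L : m * L + N] * w for m in range(num_frames)])

x = np.arange(10.0)
F = make_frames(x, N=4, L=2)                       # frames start at samples 0, 2, 4, 6
H = make_frames(x, N=4, L=2, window=np.hamming(4)) # Hamming-windowed frames
```

With N = 4 and L = 2, consecutive frames overlap by N - L = 2 samples, the usual arrangement for short-term analysis.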

SLIDE 8

Short-Term Analysis: Autocorrelation Method

  • The mean-squared error will be:

$E_m = \sum_{n=0}^{N-1+p} e_m^2[n] = \sum_{n=0}^{N-1+p} \left( x_m[n] - \tilde{x}_m[n] \right)^2$

Why these limits? Since $x_m[n]$ is zero outside $0 \le n \le N-1$, the prediction error $e_m[n]$ can be nonzero only for $0 \le n \le N-1+p$.

Take the derivative ($\partial E_m / \partial a_i$) as before:

$\Rightarrow \sum_{j=1}^{p} a_j\, \phi_m[i,j] = \phi_m[i,0], \quad \forall\, 1 \le i \le p$

where, for $i \ge j$,

$\phi_m[i,j] = \sum_{n=0}^{N-1+p} x_m[n-i]\, x_m[n-j] = \sum_{n=i-j}^{N-1} x_m[n-(i-j)]\, x_m[n] = \sum_{n=0}^{N-1-(i-j)} x_m[n]\, x_m[n+(i-j)]$

So $\phi_m[i,j]$ depends only on the difference $i - j$.

SLIDE 9

Short-Term Analysis: Autocorrelation Method

  • Alternatively, $\phi_m[i,j] = R_m[i-j]$
    – where $R_m[k] = \sum_{n=0}^{N-1-k} x_m[n]\, x_m[n+k]$ is the autocorrelation function of $x_m[n]$
    – and $R_m[-k] = R_m[k]$
  • Therefore:

$\sum_{j=1}^{p} a_j\, \phi_m[i,j] = \phi_m[i,0] \;\Rightarrow\; \sum_{j=1}^{p} a_j\, R_m[i-j] = R_m[i], \quad \forall\, 1 \le i \le p$

$\begin{bmatrix} R_m[0] & R_m[1] & \cdots & R_m[p-1] \\ R_m[1] & R_m[0] & \cdots & R_m[p-2] \\ \vdots & \vdots & \ddots & \vdots \\ R_m[p-1] & R_m[p-2] & \cdots & R_m[0] \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} R_m[1] \\ R_m[2] \\ \vdots \\ R_m[p] \end{bmatrix}$

A Toeplitz matrix: symmetric, with all elements along each diagonal equal. Why? Because $\phi_m[i,j]$ depends only on $i - j$.
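A direct Python sketch of the autocorrelation method on a toy frame (the Toeplitz system is solved with a general solver here; the Levinson-Durbin recursion on the next slide exploits the structure to do it in O(p²)):

```python
import numpy as np

def autocorr(x, p):
    """R[k] = sum_{n=0}^{N-1-k} x[n] * x[n+k], for k = 0..p."""
    x = np.asarray(x, dtype=float)
    return np.array([np.dot(x[:len(x) - k], x[k:]) for k in range(p + 1)])

def lpc_autocorrelation(x, p):
    """Solve sum_j a_j R[|i-j|] = R[i], i = 1..p (symmetric Toeplitz system)."""
    R = autocorr(x, p)
    Phi = np.array([[R[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(Phi, R[1:])

x = [1.0, 2.0, 3.0, 2.0, 1.0]   # made-up frame
a = lpc_autocorrelation(x, 2)
```

For this frame, R = [19, 16, 10], and solving the 2x2 Toeplitz system gives a = [144/105, -66/105].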

SLIDE 10

Short-Term Analysis: Autocorrelation Method

  • Levinson-Durbin Recursion

1. Initialization: $E^{(0)} = R_m[0]$

2. Iteration: for $i = 1, \ldots, p$ do the following recursion:

$k_i = \frac{R_m[i] - \sum_{j=1}^{i-1} a_j^{(i-1)} R_m[i-j]}{E^{(i-1)}}$

$a_i^{(i)} = k_i$

$a_j^{(i)} = a_j^{(i-1)} - k_i\, a_{i-j}^{(i-1)}, \quad \text{for } 1 \le j \le i-1$

$E^{(i)} = \left( 1 - k_i^2 \right) E^{(i-1)}, \quad \text{where } -1 \le k_i \le 1$

3. Final solution: $a_j = a_j^{(p)}, \quad 1 \le j \le p$

A new, higher-order coefficient is produced at each iteration i. As a by-product, $E^{(p)}$ equals the minimum total prediction error derived earlier:

$E_m = \sum_{n} x_m^2[n] - \sum_{j=1}^{p} a_j \sum_{n} x_m[n]\, x_m[n-j] = \phi_m[0,0] - \sum_{j=1}^{p} a_j\, \phi_m[0,j]$
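The recursion above can be sketched in Python and checked against a direct solve of the same Toeplitz system (R is a made-up autocorrelation sequence):

```python
import numpy as np

def levinson_durbin(R, p):
    """Solve sum_j a_j R[|i-j|] = R[i], i = 1..p, in O(p^2).
    Returns (a[1..p], E) where E = E^(p) is the final prediction error."""
    a = np.zeros(p + 1)              # a[0] is unused padding
    E = R[0]                         # initialization: E^(0) = R[0]
    for i in range(1, p + 1):
        # k_i = (R[i] - sum_{j=1}^{i-1} a_j^(i-1) R[i-j]) / E^(i-1)
        k = (R[i] - np.dot(a[1:i], R[i - 1:0:-1])) / E
        a_prev = a.copy()
        a[i] = k                     # a_i^(i) = k_i
        for j in range(1, i):        # a_j^(i) = a_j^(i-1) - k_i * a_{i-j}^(i-1)
            a[j] = a_prev[j] - k * a_prev[i - j]
        E = (1.0 - k * k) * E        # E^(i) = (1 - k_i^2) E^(i-1)
    return a[1:], E

R = np.array([19.0, 16.0, 10.0])     # R[0], R[1], R[2] of some frame
a, E = levinson_durbin(R, 2)
a_direct = np.linalg.solve([[19.0, 16.0], [16.0, 19.0]], [16.0, 10.0])
```

Both solutions agree, and E shrinks below R[0] at every step since each (1 - k_i²) factor lies in (0, 1].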

SLIDE 11

Short-Term Analysis: Covariance Method

  • $x_m[n]$ is not identically zero outside $0 \le n \le N-1$
    – No window function is applied
  • The mean-squared error is calculated within $n = 0 \sim N-1$
  • The mean-squared error will be:

$E_m = \sum_{n=0}^{N-1} e_m^2[n] = \sum_{n=0}^{N-1} \left( x_m[n] - \tilde{x}_m[n] \right)^2$

Framing (no windowing): $x_m[n] = x[n + mL]$, each frame covering samples $mL$ to $mL+N-1$ of $x[n]$, shifted by $L$ from frame to frame.

SLIDE 12

Short-Term Analysis: Covariance Method

Take the derivative ($\partial E_m / \partial a_i$) as before:

$\Rightarrow \sum_{j=1}^{p} a_j\, \phi_m[i,j] = \phi_m[i,0], \quad \forall\, 1 \le i \le p$

where, for $i \ge j$,

$\phi_m[i,j] = \sum_{n=0}^{N-1} x_m[n-i]\, x_m[n-j] = \sum_{n=-i}^{N-1-i} x_m[n]\, x_m[n+(i-j)]$

Note that samples before the frame ($n < 0$) are needed. In matrix form:

$\begin{bmatrix} \phi_m[1,1] & \phi_m[1,2] & \cdots & \phi_m[1,p] \\ \phi_m[2,1] & \phi_m[2,2] & \cdots & \phi_m[2,p] \\ \vdots & \vdots & \ddots & \vdots \\ \phi_m[p,1] & \phi_m[p,2] & \cdots & \phi_m[p,p] \end{bmatrix} \begin{bmatrix} a_1 \\ a_2 \\ \vdots \\ a_p \end{bmatrix} = \begin{bmatrix} \phi_m[1,0] \\ \phi_m[2,0] \\ \vdots \\ \phi_m[p,0] \end{bmatrix}$

Not a Toeplitz matrix: symmetric, but not all elements along the diagonal are equal:

$\phi_m[1,1] \ne \phi_m[2,2] \ne \cdots \ne \phi_m[p,p]$
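A Python sketch of the covariance method (made-up example; the input array holds p history samples followed by the N-sample frame, so x_m[n-i] for n < i is a real sample rather than zero). Because the signal here satisfies x[n] = 0.9·x[n-1] exactly, the method recovers a1 = 0.9 exactly, which the autocorrelation method's zero-padding would not:

```python
import numpy as np

def lpc_covariance(xh, p):
    """xh = p history samples followed by the N-sample frame; no windowing.
    phi[i,j] = sum_{n=0}^{N-1} x_m[n-i] * x_m[n-j]."""
    N = len(xh) - p
    def phi(i, j):
        return sum(xh[p + n - i] * xh[p + n - j] for n in range(N))
    Phi = np.array([[phi(i, j) for j in range(1, p + 1)] for i in range(1, p + 1)])
    Psi = np.array([phi(i, 0) for i in range(1, p + 1)])
    return np.linalg.solve(Phi, Psi)   # not Toeplitz, so no Levinson-Durbin here

xh = 0.9 ** np.arange(8.0)            # exactly AR(1): x[n] = 0.9 * x[n-1]
a = lpc_covariance(xh, 1)
```

Since the matrix is symmetric positive definite but not Toeplitz, a Cholesky factorization is the usual efficient solver in practice; a general solve is used above for brevity.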

SLIDE 13

LPC Spectra

$H'(e^{j\omega}) = G \cdot H(e^{j\omega})$

  • The LPC spectrum matches the peaks more closely than the valleys
    – Because the regions where $\left| X_m(e^{j\omega}) \right| > \left| H'(e^{j\omega}) \right|$ contribute more to the error than those where $\left| H'(e^{j\omega}) \right| > \left| X_m(e^{j\omega}) \right|$

By Parseval's theorem:

$E_m = \sum_{n=0}^{N-1+p} e_m^2[n] = \frac{1}{2\pi} \int_{-\pi}^{\pi} \left| E_m(e^{j\omega}) \right|^2 d\omega = \frac{G^2}{2\pi} \int_{-\pi}^{\pi} \frac{\left| X_m(e^{j\omega}) \right|^2}{\left| H'(e^{j\omega}) \right|^2}\, d\omega$

where $H'(e^{j\omega})$ is the gain-scaled model spectrum. Minimizing $E_m$ therefore penalizes frequencies where the signal spectrum exceeds the model spectrum more heavily than the reverse.
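The model spectrum can be evaluated by taking a zero-padded FFT of the inverse-filter coefficients [1, -a1, ..., -ap], since |H'(e^{jω})|² = G²/|A(e^{jω})|². A Python sketch (the coefficient and gain are made up):

```python
import numpy as np

def lpc_power_spectrum(a, G, nfft=512):
    """|H'(e^{jw})|^2 = G^2 / |A(e^{jw})|^2, with A(z) = 1 - sum_k a_k z^{-k}."""
    A = np.fft.rfft(np.concatenate(([1.0], -np.asarray(a, dtype=float))), nfft)
    return (G * G) / np.abs(A) ** 2

spec = lpc_power_spectrum([0.5], G=1.0)   # single-pole example
# At w = 0: A = 1 - 0.5 = 0.5, so the power is 1 / 0.25 = 4
```

`rfft` with nfft = 512 returns 257 one-sided frequency bins, matching the half-spectrum plotting done in the homework code later in these slides.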

SLIDE 14

LPC Spectra

  • LPC provides an estimate of the gross shape of the short-term spectrum

(Figure: LPC spectra for model orders 6, 14, 24, and 128)

SLIDE 15

LPC Prediction Errors

SLIDE 16

MFCC vs. LPC Cepstrum Coefficients

  • MFCC outperforms LPC cepstrum coefficients
    – The perceptually motivated mel-scale representation indeed helps recognition
  • Higher-order MFCCs do not further reduce the error rate compared with 13th-order MFCCs
  • Other perceptually motivated features, such as first- and second-order delta features, can significantly reduce recognition errors

SLIDE 17

Homework 7 Fall 2004

  • Implement short-term linear predictive coding (LPC) for speech signals
  • Follow these instructions:
  • 1. Use the autocorrelation method with the Levinson-Durbin recursion and rectangular/Hamming windowing
  • 2. Analyze the vowel (or FINAL) portions of the speech signal with different model orders (different p, e.g. p = 6, 14, 24, and 128)
  • 3. Plot the LPC spectra as well as the original speech spectrum
  • 4. Use the speech wave file bk6_1.wav (no header, raw 16 kHz PCM data) as the exemplar

SLIDE 18

Homework 7 Fall 2004

  • Hints:
  • 1. Once the LPC coefficients $a_j$ are derived, you can construct the impulse response signal $h[n]$, $0 \le n \le N-1$ (N: frame size), by the all-pole recursion $h[n] = \delta[n] + \sum_{k=1}^{p} a_k\, h[n-k]$
  • 2. The prediction error E can be expressed by the autocorrelation function as $E = R_m[0] - \sum_{j=1}^{p} a_j\, R_m[j]$ (cf. the minimum-error formula on slide 5 with $\phi_m[i,j] = R_m[i-j]$)
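Both hints can be sketched in Python, assuming the standard all-pole forms h[n] = δ[n] + Σ a_k h[n-k] and E = R[0] - Σ a_j R[j] (the test values below are made up):

```python
import numpy as np

def impulse_response(a, N):
    """h[n] of H(z) = 1 / (1 - sum_k a_k z^{-k}): h[n] = delta[n] + sum_k a_k h[n-k]."""
    h = np.zeros(N)
    for n in range(N):
        h[n] = (1.0 if n == 0 else 0.0) + sum(
            a[k] * h[n - 1 - k] for k in range(len(a)) if n - 1 - k >= 0)
    return h

def prediction_error_from_autocorr(a, R):
    """E = R[0] - sum_{j=1}^{p} a_j * R[j] (autocorrelation method)."""
    return R[0] - np.dot(a, R[1:len(a) + 1])

h = impulse_response([0.5], 4)            # -> [1, 0.5, 0.25, 0.125]
E = prediction_error_from_autocorr([16.0/19.0], np.array([19.0, 16.0]))
```

Note h[0] = 1, matching the first sample (1.0000) of the filter response array in the MATLAB code on the next slide.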

SLIDE 19

Homework 7 Fall 2004

  • 3. MATLAB example code:

x=[184.6400 184.1251 . . . 197.7890 -26.8000];  % original signal, dimension: frame size
y=[1.0000 2.0105 . . . 0.0738 0.0565];          % filter's impulse response h[n], dimension: frame size
gain=valG;                               % valG: the prediction error E
X=fft(x,512);                            % fast Fourier transform, so the frame size <= 512
Y=fft(y,512);                            % fast Fourier transform
X(1)=[];                                 % remove X(1), the DC component
Y(1)=[];                                 % remove Y(1), the DC component
M=512;
powerX=abs(X(1:M/2)).^2;                 % the power spectrum of x
logPX=10*log10(powerX);                  % the power spectrum of x in dB
powerY=abs(Y(1:M/2)).^2;                 % the power spectrum of y
logPY=10*log10(powerY)+10*log10(gain);   % the power spectrum of the filter in dB,
                                         % plus the gain (error) in dB
nyquist=8000;                            % the Nyquist frequency (16 kHz sampling rate)
freq=(1:M/2)/(M/2)*nyquist;              % an array storing the frequency indices
figure(1);
plot(freq,logPX,'b',freq,logPY,'r');     % plot the result:
                                         % b: blue line, power spectrum of the original signal
                                         % r: red line, power spectrum of the filter

SLIDE 20

Homework 7 Fall 2004

  • Example figures of LPC spectra

(Figures: Order = 6 / 14 / 24 / 128, rectangular window, no pre-emphasis; Order = 128, rectangular window, pre-emphasis; Order = 128, Hamming window, pre-emphasis; Order = 128, Hamming window, no pre-emphasis)

Plotted by Roger Kuo, Fall 2002