Linear Prediction
Outline
- Windowing
- LPC
- Introduction to Vocoders
- Excitation modeling
- Pitch Detection
Short-Time Processing
- The speech signal is inherently non-stationary.
- For continuant phonemes there are stationary periods of at least 20-25 ms.
- The short-time speech frames are assumed stationary.
- The frame length should be chosen to include just one phoneme or allophone.
- Frame lengths are usually chosen to be between 10-50 ms.
- We consider rectangular and Hamming windows here.
Rectangular Window
Hamming Window
Comparison of Windows
Comparison of Windows (cont’d)
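The two windows compared above can be generated directly; a minimal NumPy sketch (the frame length and sampling rate are assumed for illustration):

```python
import numpy as np

N = 400                      # assumed 25 ms frame at 16 kHz
n = np.arange(N)

# Rectangular window: all ones. Narrow main lobe but high side lobes
# (about -13 dB), so strong spectral leakage.
rect = np.ones(N)

# Hamming window: raised-cosine taper. Wider main lobe but side lobes
# near -43 dB, which suppresses leakage between harmonics.
hamming = 0.54 - 0.46 * np.cos(2 * np.pi * n / (N - 1))

# A frame is windowed by pointwise multiplication:
frame = np.random.randn(N)   # stand-in for a real speech frame
windowed = frame * hamming
```

The Hamming taper goes to 0.08 at the frame edges rather than cutting off abruptly, which is why it is usually preferred for short-time spectral analysis of speech.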
Linear Prediction Coding (LPC)
Based on an all-pole model for the speech production system:

H(z) = A / (1 - Σ_{k=1}^{p} a_k z^{-k})

In the time domain, we get:

s[n] = Σ_{k=1}^{p} a_k s[n-k] + A u_g[n]

In other words, we can predict s[n] as a function of the p previous signal samples (and the excitation).
The set of {a_k} is one way of representing the time-varying filter. There are many other ways to represent this filter (e.g., pole values, lattice filter values, LSPs, ...).
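The time-domain recursion can be sketched directly; the coefficients, gain, and impulse excitation below are illustrative values, not estimated from real speech:

```python
import numpy as np

p = 2
a = [1.5, -0.7]               # illustrative predictor coefficients a_1, a_2
A = 1.0                       # gain
N = 50
u = np.zeros(N)
u[0] = 1.0                    # a single excitation impulse

# s[n] = sum_{k=1}^{p} a_k s[n-k] + A u[n]
s = np.zeros(N)
for n in range(N):
    s[n] = A * u[n] + sum(a[k - 1] * s[n - k]
                          for k in range(1, p + 1) if n - k >= 0)
```

With this pole pair inside the unit circle, the impulse response is a damped oscillation, the kind of resonant behavior the all-pole model uses to represent formants.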
LPC parameter estimation
There are many methods to estimate the LPC parameters:
- Autocorrelation method: reduces the estimation of the a(i) to solving a set of p linear equations.
- Covariance method
Efficient procedures (such as Levinson-Durbin, Burg, Le Roux) obtain these parameters.
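A minimal sketch of the Levinson-Durbin recursion mentioned above, assuming the autocorrelation values r(0), ..., r(p) are already computed:

```python
def levinson_durbin(r, p):
    """Solve the autocorrelation normal equations in O(p^2) operations.

    r : autocorrelation values r(0), ..., r(p)
    Returns the predictor coefficients [a(1), ..., a(p)] and the final
    prediction error energy E.
    """
    a = []          # a[j] holds a(j+1)
    E = r[0]
    for i in range(1, p + 1):
        # Reflection coefficient for order i:
        k = (r[i] - sum(a[j] * r[i - 1 - j] for j in range(i - 1))) / E
        # Update all lower-order coefficients and append the new one:
        a = [a[j] - k * a[i - 2 - j] for j in range(i - 1)] + [k]
        E *= 1.0 - k * k
    return a, E
```

For r = [1, 0.9, 0.81] (an AR(1)-like sequence) this returns a ≈ [0.9, 0]: the recursion correctly finds that a first-order predictor suffices.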
LPC Parameters in Coding (vocoders)
[Block diagrams: (top) full speech production model, in which a V/UV switch selects a DT impulse generator (at pitch period P) or a white-noise generator, with a gain Θ0, driving the glottal filter G(z), vocal tract filter H(z), and lip radiation filter R(z) to produce the speech signal s(n); (bottom) simplified vocoder model, in which the same V/UV excitation switch and gain drive a single all-pole filter.]
Linear Prediction (Introduction):
The object of linear prediction is to estimate the output sequence from a linear combination of input samples, past output samples, or both:

ŷ(n) = Σ_{j=1}^{q} b(j) x(n-j) + Σ_{i=1}^{p} a(i) y(n-i)

The factors a(i) and b(j) are called predictor coefficients.
Linear Prediction (Introduction):
Many systems of interest to us are describable by a linear, constant-coefficient difference equation:

Σ_{i=0}^{p} a(i) y(n-i) = Σ_{j=0}^{q} b(j) x(n-j)

If Y(z)/X(z) = H(z), where H(z) is a ratio of polynomials N(z)/D(z), then

N(z) = Σ_{j=0}^{q} b(j) z^{-j}   and   D(z) = Σ_{i=0}^{p} a(i) z^{-i}

Thus the predictor coefficients give us immediate access to the poles and zeros of H(z).
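Since N(z) and D(z) are polynomials whose coefficients are exactly the b(j) and a(i), a root finder exposes the zeros and poles directly; a small NumPy sketch with illustrative coefficients:

```python
import numpy as np

# Illustrative difference-equation coefficients (a(0) = b(0) = 1):
a = [1.0, -1.2, 0.5]   # D(z) = 1 - 1.2 z^{-1} + 0.5 z^{-2}
b = [1.0, 0.4]         # N(z) = 1 + 0.4 z^{-1}

# Viewed as polynomials in z (multiply through by z^p or z^q),
# the roots of D give the poles and the roots of N give the zeros:
poles = np.roots(a)
zeros = np.roots(b)

stable = np.all(np.abs(poles) < 1.0)   # all poles inside the unit circle
```

For speech analysis the pole locations matter most: complex-conjugate pole pairs inside the unit circle correspond to formant resonances.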
Linear Prediction (Types of System Model):
There are two important variants:
- All-pole model (in statistics, autoregressive (AR) model): the numerator N(z) is a constant.
- All-zero model (in statistics, moving-average (MA) model): the denominator D(z) is equal to unity.
The mixed pole-zero model is called the autoregressive moving-average (ARMA) model.
Linear Prediction (Derivation of LP equations):
Given a zero-mean signal y(n), in the AR model:

ŷ(n) = Σ_{i=1}^{p} a(i) y(n-i)

The error is:

e(n) = y(n) - ŷ(n) = y(n) - Σ_{i=1}^{p} a(i) y(n-i)

To derive the predictor we use the orthogonality principle: the desired coefficients are those which make the error orthogonal to the samples y(n-1), y(n-2), ..., y(n-p).
Linear Prediction (Derivation of LP equations):
Thus we require that

⟨y(n-j) e(n)⟩ = 0   for j = 1, 2, ..., p

Or,

⟨y(n-j) [y(n) - Σ_{i=1}^{p} a(i) y(n-i)]⟩ = 0,   j = 1, ..., p

Interchanging the operations of averaging and summing, and representing ⟨·⟩ by summing over n, we have

Σ_{i=1}^{p} a(i) Σ_n y(n-i) y(n-j) = Σ_n y(n) y(n-j),   j = 1, ..., p

The required predictors are found by solving these equations.
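The summed normal equations form a symmetric Toeplitz system in the a(i); a direct sketch (a fast recursion such as Levinson-Durbin would normally replace np.linalg.solve):

```python
import numpy as np

def lpc_autocorr(y, p):
    """Estimate [a(1), ..., a(p)] from the normal equations
        sum_i a(i) sum_n y(n-i) y(n-j) = sum_n y(n) y(n-j),  j = 1..p.
    """
    # r(i) = sum_n y(n) y(n-i), computed over the finite signal:
    r = np.array([np.dot(y[: len(y) - i], y[i:]) for i in range(p + 1)])
    # r is even, so the system matrix R[i, j] = r(|i - j|) is Toeplitz:
    R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
    return np.linalg.solve(R, r[1:])
```

For the exactly predictable signal y(n) = 0.9ⁿ, the first-order predictor comes out as a(1) ≈ 0.9, as expected.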
Linear Prediction (Derivation of LP equations):
The orthogonality principle also states that the resulting minimum error is given by

E = ⟨e²(n)⟩ = ⟨e(n) y(n)⟩

Or,

E = r(0) - Σ_{i=1}^{p} a(i) r(i)

We can minimize the error over all time:

Σ_{i=1}^{p} a(i) r(j-i) = r(j),   j = 1, 2, ..., p

where

r(i) = Σ_n y(n) y(n-i)
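The two expressions for E can be checked against each other numerically; a sketch on an arbitrary zero-mean signal (the random signal here is just a stand-in):

```python
import numpy as np

rng = np.random.default_rng(0)
y = rng.standard_normal(10_000)   # stand-in zero-mean signal
p = 2

# r(i) = sum_n y(n) y(n-i), then solve the normal equations for a:
r = np.array([np.dot(y[: len(y) - i], y[i:]) for i in range(p + 1)])
R = np.array([[r[abs(i - j)] for j in range(p)] for i in range(p)])
a = np.linalg.solve(R, r[1:])

# Minimum error from the formula E = r(0) - sum_i a(i) r(i):
E_formula = r[0] - np.dot(a, r[1:])

# Minimum error computed directly as sum_n e^2(n):
e = y[p:] - sum(a[i] * y[p - 1 - i : len(y) - 1 - i] for i in range(p))
E_direct = np.dot(e, e)
# The two agree up to edge effects of the finite frame.
```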
Linear Prediction (Applications):
Autocorrelation matching:
We have a signal y(n) with known autocorrelation r_yy(n). We model this with the AR system shown below:

H(z) = σ / A(z) = σ / (1 - Σ_{i=1}^{p} a_i z^{-i})

[Block diagram: excitation e(n) → gain σ → all-pole filter 1/A(z) → output y(n).]
Linear Prediction (Order of Linear Prediction):
The choice of predictor order depends on the analysis bandwidth. The rule of thumb is:
- For a normal vocal tract, there is an average of about one formant per kilohertz of bandwidth.
- One formant requires two complex-conjugate poles.
- Hence for every formant we require two predictor coefficients, or two coefficients per kilohertz of bandwidth:

p = 2·BW/1000 + c

(BW in Hz; c is a few additional coefficients, commonly used to absorb glottal and lip-radiation effects.)
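As arithmetic: for telephone-band speech with BW = 4 kHz, the rule gives 2·4 = 8 coefficients for the formants plus the small correction c; a one-line sketch (taking c = 2 as an assumed value):

```python
def lpc_order(bandwidth_hz, c=2):
    """Rule-of-thumb LPC order: two coefficients per kHz of bandwidth,
    plus c extra coefficients (c = 2 is an assumed default here)."""
    return int(2 * bandwidth_hz / 1000) + c

p_telephone = lpc_order(4000)   # 8 + 2 = 10
p_wideband = lpc_order(8000)    # 16 + 2 = 18
```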
Linear Prediction (AR Modeling of Speech Signal):
True model:

[Block diagram: a V/UV switch selects a DT impulse generator (at the pitch period) for voiced speech or an uncorrelated noise generator for unvoiced speech; with a gain, this drives the glottal filter G(z), producing the voiced volume velocity u(n), followed by the vocal tract filter H(z) and lip-radiation filter R(z), yielding the speech signal s(n).]
Linear Prediction (AR Modeling of Speech Signal):
Using LP analysis:

[Block diagram: a V/UV switch selects a DT impulse generator (pitch) or a white-noise generator; with a gain estimate, this drives a single all-pole filter H(z) to produce the speech signal s(n).]
Introduction to Vocoders
- Besides the estimation of the vocal tract parameters, a vocoder needs excitation estimation.
- In early vocoders, this was achieved by the estimation of V/UV, pitch, and gain.
- More modern vocoders involve more sophisticated estimation of the excitation, such as in CELP, where vector quantization is used.

[Block diagram: original speech signal s(n) → vocoder analysis → channel (or storage), carrying V/UV, pitch, and filter parameters → vocoder synthesizer → synthesized speech signal ŝ(n).]
Pitch Detection
- Since the speech signal in voiced frames is quasi-periodic (and not fully periodic), pitch detection is not always easy.
- Pitch detection is especially difficult in phonemes that manifest less periodic behavior.
- Some pitch detection methods:
  - AMDF (Average Magnitude Difference Function)
  - Autocorrelation with center clipping
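Both methods can be sketched compactly; the clipping fraction and pitch search range below are illustrative choices, not prescribed values:

```python
import numpy as np

def amdf(frame, lag):
    """Average Magnitude Difference Function at one lag; it dips toward
    zero when the lag matches the pitch period of a voiced frame."""
    return np.mean(np.abs(frame[lag:] - frame[:-lag]))

def pitch_autocorr(frame, fs, fmin=60.0, fmax=400.0, clip_frac=0.3):
    """Pitch estimate (Hz) by autocorrelation with center clipping."""
    # Center clipping zeroes low-level samples, suppressing formant
    # structure that otherwise creates spurious autocorrelation peaks.
    c = clip_frac * np.max(np.abs(frame))
    x = np.where(frame > c, frame - c,
                 np.where(frame < -c, frame + c, 0.0))
    # Search only lags corresponding to plausible pitch frequencies:
    lags = np.arange(int(fs / fmax), int(fs / fmin) + 1)
    ac = np.array([np.dot(x[:-l], x[l:]) for l in lags])
    return fs / lags[np.argmax(ac)]
```

On a clean periodic tone both methods agree; for real voiced speech a separate V/UV decision is still needed before the pitch value can be trusted.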