Cepstral analysis in speech processing


Nov 01, 2022

Lecture-oct4-a, 03 October 2010 11:20

SLIDE 1

Cepstral analysis in speech processing

From the speech production model, we have:

s[n] = (p[n] * g[n] + u[n]) * v[n] * r[n]

where * denotes convolution and
p[n] => periodic impulse train
u[n] => random white noise
g[n] => glottal filter impulse response
v[n] => vocal tract impulse response
r[n] => lip radiation system impulse response

Consider voiced speech: s[n] = p[n] * g[n] * v[n] * r[n] => S(z) = P(z)H(z) where H(z) = G(z)V(z)R(z)

The convolved components p[n] and h[n] = g[n] * v[n] * r[n] are additive in the complex cepstrum.

H(z) gives a complex cepstrum which is non-zero for both positive and negative time (quefrency) and which decays rapidly for large |n|.

P(z) gives a complex cepstrum consisting of decaying impulses at multiples of the pitch period.
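The additivity can be checked numerically. The sketch below is an illustration only: it uses the real cepstrum (inverse DFT of the log magnitude) instead of the complex cepstrum so that no phase unwrapping is needed, and the excitation `p`, the two-pole resonator `h`, and all numeric values are hypothetical choices, not taken from the lecture.

```python
import numpy as np

def real_cepstrum(x, n_fft):
    """Real cepstrum: inverse DFT of the log magnitude spectrum."""
    spectrum = np.fft.fft(x, n_fft)
    return np.fft.ifft(np.log(np.abs(spectrum))).real

# Hypothetical excitation p[n]: decaying impulses every 40 samples.
pitch_period = 40
p = np.zeros(200)
p[::pitch_period] = 0.9 ** np.arange(5)

# Hypothetical h[n]: impulse response of a two-pole resonator (illustrative values).
n = np.arange(128)
r, w0 = 0.9, 0.3 * np.pi
h = r ** n * np.sin((n + 1) * w0) / np.sin(w0)

s = np.convolve(p, h)   # s[n] = p[n] * h[n]
n_fft = 512             # at least len(s), so the DFT of s equals P[k] H[k]

c_s = real_cepstrum(s, n_fft)
c_p = real_cepstrum(p, n_fft)
c_h = real_cepstrum(h, n_fft)

# Convolution in time becomes addition in the cepstrum.
additive = np.allclose(c_s, c_p + c_h)
```

The DFT length is chosen no shorter than the convolved sequence; with a shorter DFT the product relation S[k] = P[k]H[k] would no longer hold exactly.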

The real cepstrum is the even part of the complex cepstrum
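A small numerical check of this statement, as a sketch under stated assumptions: the test sequence x[n] = δ[n] + a δ[n-1] with |a| < 1 is a hypothetical choice whose phase never leaves (-π/2, π/2), so the complex cepstrum can be computed here without phase unwrapping. General signals need unwrapping, which this example does not cover.

```python
import numpy as np

n_fft = 256
a = 0.5                      # hypothetical value, |a| < 1
x = np.array([1.0, a])       # x[n] = delta[n] + a * delta[n-1]

X = np.fft.fft(x, n_fft)
log_mag = np.log(np.abs(X))
phase = np.angle(X)          # stays within (-pi/2, pi/2) for this x, so no unwrapping

# Complex cepstrum: inverse DFT of the complex logarithm of the spectrum.
complex_cep = np.fft.ifft(log_mag + 1j * phase).real
# Real cepstrum: inverse DFT of the log magnitude only.
real_cep = np.fft.ifft(log_mag).real

# Even part: (c[n] + c[-n]) / 2, with -n taken modulo n_fft.
reversed_cep = np.roll(complex_cep[::-1], 1)
even_part = 0.5 * (complex_cep + reversed_cep)

matches = np.allclose(real_cep, even_part)
```

For this sequence the complex cepstrum is one-sided (zero for negative n), so the real cepstrum at n = 1 is half the complex cepstrum value there.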


SLIDE 2


From Oppenheim and Schafer, Discrete-time Signal Processing, PHI, 1989

Example of some real cepstra:


From: cepstrum*murphy.pdf

The example suggests that a window applied to the cepstrum can separate the two components.
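One way to try such a window (often called a lifter) is sketched below on a synthetic signal. Everything here is an assumption made for illustration: the excitation, the resonator standing in for h[n], the cutoff of 30 samples, and the use of the real cepstrum. The low-quefrency window is compared against the known log magnitude of h[n], which is only available because the signal is synthetic.

```python
import numpy as np

n_fft = 1024
pitch_period = 40                      # hypothetical pitch period in samples

# Hypothetical excitation: decaying impulses at multiples of the pitch period.
p = np.zeros(200)
p[::pitch_period] = 0.7 ** np.arange(5)

# Hypothetical h[n]: two-pole resonator impulse response (illustrative values).
n = np.arange(128)
r, w0 = 0.9, 0.3 * np.pi
h = r ** n * np.sin((n + 1) * w0) / np.sin(w0)

s = np.convolve(p, h)

log_mag_s = np.log(np.abs(np.fft.fft(s, n_fft)))
cep = np.fft.ifft(log_mag_s).real      # real cepstrum of s[n]

# Low-quefrency window: keep |n| < cutoff, with cutoff below the pitch period.
cutoff = 30
lifter = np.zeros(n_fft)
lifter[:cutoff] = 1.0
lifter[-(cutoff - 1):] = 1.0           # mirror for negative quefrencies

# Back to the frequency domain: a smoothed log spectrum (envelope estimate).
envelope = np.fft.fft(cep * lifter).real

log_mag_h = np.log(np.abs(np.fft.fft(h, n_fft)))
envelope_error = np.max(np.abs(envelope - log_mag_h))
raw_error = np.max(np.abs(log_mag_s - log_mag_h))
```

In this synthetic case the windowed cepstrum tracks the log magnitude of h[n] far more closely than the unprocessed log spectrum does, because the excitation impulses at 40, 80, ... samples fall outside the window.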


SLIDE 3


SLIDE 4

Short-time analysis needed for: speech parameter estimation, formant estimation, pitch and voicing detection.

The low-quefrency part of the cepstrum corresponds primarily to the vocal tract, glottal shaping and radiation. The high-quefrency part is due primarily to the excitation.

(From O&S, DT signal processing, PHI, 1989)

Part of "chase" [y-axis: increasing time]

Lecture-oct4-c, 03 October 2010 12:43

SLIDE 5
