DESIGN OF A CELP CODER AND A STUDY OF ITS PERFORMANCE USING VARIOUS QUANTIZATION METHODS

SLIDE 1

DESIGN OF A CELP CODER AND A STUDY OF ITS PERFORMANCE USING VARIOUS QUANTIZATION METHODS

EECS 651: PROJECT PRESENTATION

UNIVERSITY OF MICHIGAN, ANN ARBOR
APRIL 18, 2005

By
Awais M. Kamboh
Krispian C. Lawrence
Aditya M. Thomas
Philip I. Tsai

SLIDE 2

PROJECT GOALS

  • To design and implement a CELP coder in MATLAB
  • To use different quantization methods to quantize the LP parameters of the coder
  • To evaluate the performance of the coder in terms of MSE and ‘perceptual MSE’ using the various methods of quantization

SLIDE 3

Presentation Outline

  • Introduction to Speech coding
  • CELP
  • CELP coder
  • Quantization Methods
  • Results and Comparisons
  • Conclusions and recommendations
  • Q&A

SLIDE 4

Introduction to Speech Coding

  • Concerned with obtaining a compact digital representation of voice signals for more efficient transmission or smaller storage size.
  • Objective is to represent the speech signal with the minimum number of bits yet maintain the perceptual quality.

SLIDE 5

Speech Production

  • Speech
      – Air pushed from the lungs past the vocal cords and along the vocal tract
      – The basic vibrations: vocal cords
      – The sound is altered by the disposition of the vocal tract (tongue and mouth)
  • Model the vocal tract as a filter
      – The shape changes relatively slowly
  • The vibrations at the vocal cords
      – The excitation signal

SLIDE 6

Speech sounds

  • Voiced sound
      – The vocal cords vibrate open and close
      – Quasi-periodic pulses of air
      – The rate of the opening and closing: the pitch
  • Unvoiced sounds
      – Forcing air at high velocities through a constriction
      – Noise-like turbulence
      – Show little long-term periodicity
      – Short-term correlations still present
  • Plosive sounds
      – A complete closure in the vocal tract
      – Air pressure is built up and released suddenly

SLIDE 7

Code-Excited Linear Predictor (CELP)

  • Variants of CELP (LD-CELP, ACELP etc.)
  • Main differences are in the generation of the excitation signal, the filters and the bit rate.
  • Performance
      – Bit rates of 4 kbps or lower give synthetic quality / mechanical speech.
      – Most modern CELP variants run at relatively higher bit rates and give good quality speech.
      – Performance cannot be judged by MSE alone.

SLIDE 8

Linear Predictive Coding

  • Lungs generate an excitation signal which is modeled as white noise.
  • Vocal cords either remain open or vibrate with some frequency, called ‘Pitch’.
  • The resulting speech is either unvoiced or voiced respectively.
  • Vocal tract acts as an IIR filter.
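
The LP analysis behind this model can be sketched as below. This is only an illustration, not the project's MATLAB code: it assumes the autocorrelation method with the Levinson-Durbin recursion and the predictor convention s[n] ≈ a_1 s[n-1] + ... + a_p s[n-p]. The function names `autocorr` and `levinson_durbin` are hypothetical.

```python
def autocorr(x, max_lag):
    """Autocorrelation r[0..max_lag] of one frame of samples."""
    return [sum(x[n] * x[n - k] for n in range(k, len(x)))
            for k in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the Yule-Walker equations for the predictor coefficients.

    Assumed convention: s[n] is predicted as sum_k a[k] * s[n - k].
    Returns (a, err) with a = [a_1, ..., a_order] and err the final
    prediction-error energy. Assumes r[0] > 0 (a non-silent frame).
    """
    a = [0.0] * (order + 1)  # a[0] is unused
    err = r[0]
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / err  # reflection coefficient of stage i
        new_a = a[:]
        new_a[i] = k
        for j in range(1, i):
            new_a[j] = a[j] - k * a[i - j]
        a = new_a
        err *= 1.0 - k * k
    return a[1:], err
```

The all-pole (IIR) vocal tract filter is then 1/A(z) with A(z) = 1 - a_1 z^{-1} - ... - a_p z^{-p} under this sign convention.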

SLIDE 9

CELP Parameters (In this Implementation)

  • Excitation Signal: A number of signals are stored in a codebook. We choose the signal that best suits a particular chunk of data (frame).
  • LP Coefficients: The coefficients of the vocal tract filter.
  • Gain: Represents the loudness/energy of speech.
  • Pitch Filter Coefficient: We determine pitch by modeling it as a long delay correlation filter which produces quasi-periodic signals when excited.
  • Pitch: Pitch of the sound. In the range 50 Hz to 500 Hz. In this case it is referred to as Pitch Delay, measured in # of samples.

SLIDE 10

Rate of CELP

  • Frame Size: 160 samples (20 ms)
  • Subframe Size: 40 samples (5 ms)
  • LP coefficients are transmitted once per frame. All others are transmitted once per subframe.
  • Code Book: 512 entries; 9 bits
  • Gain: Generally between -2 and +2; 8 bits
  • Pitch: 50 Hz to 500 Hz => 16 to 160 samples (at 8 kHz sampling); 8 bits
  • Pitch Filter Coeff: 0 to 1.4; 6 bits
  • LP Coefficients: Different for different rates.
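
The rate implied by these numbers can be worked out with a short sketch. The per-subframe bit counts, frame sizes and 8 kHz sampling rate come from the slide; the LP bit budget is left as a parameter because the slide does not fix it, and the names below are hypothetical.

```python
SAMPLE_RATE_HZ = 8000
FRAME_SAMPLES = 160
SUBFRAME_SAMPLES = 40

# Bits sent once per subframe, as listed on the slide.
SUBFRAME_BITS = {
    "codebook_index": 9,  # 512 entries
    "gain": 8,
    "pitch_delay": 8,
    "pitch_filter_coeff": 6,
}


def pitch_delay_range(f_min_hz=50, f_max_hz=500, fs=SAMPLE_RATE_HZ):
    """Pitch range in Hz converted to a delay range in samples."""
    return fs // f_max_hz, fs // f_min_hz


def bit_rate_bps(lp_bits_per_frame):
    """Total bit rate for a given LP bit budget per frame.

    lp_bits_per_frame is an assumption supplied by the caller; it
    depends on the LP order and on the quantizer size M.
    """
    subframes_per_frame = FRAME_SAMPLES // SUBFRAME_SAMPLES
    bits_per_frame = (subframes_per_frame * sum(SUBFRAME_BITS.values())
                      + lp_bits_per_frame)
    frames_per_second = SAMPLE_RATE_HZ // FRAME_SAMPLES
    return bits_per_frame * frames_per_second
```

With these figures the non-LP parameters cost 31 bits per subframe, 124 bits per frame, or 6200 bits/s at 50 frames/s. As a purely hypothetical case, 10 LP coefficients at 8 bits each would add 80 bits per frame, giving `bit_rate_bps(80)` = 10200 bits/s.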

SLIDE 11

CELP Encoder

[Block diagram: CELP encoder. Labels recovered from the figure:]

  • Input: Speech
  • LP Analyzer -> LP Coefficients ‘a’
  • Code Book -> Excitation Sequence $e_k$
  • Gain (multiplier)
  • Pitch Filter: $\frac{1}{1 - b z^{-P}}$
  • Reconstruction Filter: $\frac{1}{A(z)}$
  • Perceptual Filter: $\frac{A(z)}{A(z/c)}$
  • Select Min Energy: $\min E$

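SLIDE 12

CELP Encoder (Contd.)

[Block diagram: quantization and binary encoding of the parameters. Labels recovered from the figure:]

  • Inputs: Gain ‘G’, Pitch Filter Coefficient ‘b’, Pitch Delay ‘P’, Excitation Sequence ‘k’, Linear Predictor Coefficients ‘a’
  • Quantizer blocks: Scalar Quantizer; SQ / VQ / DPCM
  • Output: Binary Encoded Data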

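SLIDE 13

CELP Decoder

[Block diagram: CELP decoder. Labels recovered from the figure:]

  • Binary Decoding
  • Reconstruction of Gain ‘G’, Pitch Filter Coefficient ‘b’, Pitch Delay ‘P’, Excitation Sequence ‘k’
  • Reconstruction of Linear Predictor Coefficients ‘a’
  • Excitation $e_k$ multiplied by the Gain
  • Pitch filter: $\frac{1}{1 - b z^{-P}}$
  • Synthesis filter: $\frac{1}{A(z)}$
  • Output: Reconstructed Speech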
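
The decoder chain in the diagram (excitation, gain, pitch filter, synthesis filter) can be sketched for one subframe as follows. This is a minimal illustration, not the project's MATLAB implementation. It assumes A(z) = 1 - a_1 z^{-1} - ... - a_p z^{-p}, and the function name and the way filter histories are passed are hypothetical.

```python
def celp_decode_subframe(codeword, gain, b, delay, a, u_hist, s_hist):
    """Synthesize one subframe of speech from decoded parameters.

    codeword : excitation sequence e_k selected from the codebook
    gain     : gain G
    b, delay : pitch filter coefficient and pitch delay P (samples)
    a        : LP coefficients [a_1, ..., a_p] (assumed sign convention)
    u_hist   : past pitch-filter outputs, at least `delay` values
    s_hist   : past speech outputs, at least len(a) values
    Returns (speech, u_hist, s_hist) with the updated histories.
    """
    u_hist = list(u_hist)
    s_hist = list(s_hist)
    speech = []
    for e in codeword:
        # Pitch filter 1 / (1 - b z^-P): u[n] = G e[n] + b u[n - P]
        u = gain * e + b * u_hist[-delay]
        u_hist.append(u)
        # Synthesis filter 1 / A(z): s[n] = u[n] + sum_k a_k s[n - k]
        s = u + sum(a_k * s_hist[-k] for k, a_k in enumerate(a, start=1))
        s_hist.append(s)
        speech.append(s)
    return speech, u_hist, s_hist
```

Passing the returned histories into the next call keeps both filters continuous across subframe boundaries.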

SLIDE 14

Perceptual Filtering

$H(z) = \frac{A(z)}{A(z/c)}$, with $c = 0.8$

[Figure: frequency responses plotted against Frequency (Hz). Red: $\frac{A(z)}{A(z/c)}$; green: $\frac{1}{A(z/c)}$; blue: $\frac{1}{A(z)}$.]

SLIDE 15

Perceptual Filtering (Contd.)

[Figure: response of $\frac{A(z)}{A(z/c)}$ for different values of ‘c’ in the perceptual filter.]
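
A sketch of how the perceptual filter A(z)/A(z/c) can be applied is given below. It relies on the identity that replacing z by z/c scales the k-th coefficient of A(z) by c^k. The representation of A(z) as a list `[1, alpha_1, ..., alpha_p]` in powers of z^{-1} and the function names are assumptions made for this illustration.

```python
def expand_bandwidth(a_poly, c=0.8):
    """Coefficients of A(z/c) given those of A(z).

    a_poly = [1, alpha_1, ..., alpha_p] in powers of z^-1 (assumed layout).
    """
    return [coef * c ** k for k, coef in enumerate(a_poly)]


def perceptual_weight(x, a_poly, c=0.8):
    """Filter x through H(z) = A(z) / A(z/c), starting from zero state."""
    den = expand_bandwidth(a_poly, c)
    y = []
    for n in range(len(x)):
        # Numerator A(z): feed-forward part
        acc = sum(a_poly[k] * x[n - k]
                  for k in range(len(a_poly)) if n - k >= 0)
        # Denominator A(z/c): feedback part (den[0] is 1)
        acc -= sum(den[k] * y[n - k]
                   for k in range(1, len(den)) if n - k >= 0)
        y.append(acc)
    return y
```

With c = 1 the numerator and denominator cancel and the filter passes the signal unchanged; smaller c moves the response further from flat.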

SLIDE 16

Performance of CELP (Unquantized)

mse = 0.0041

[Figure panels: Original, Unquantized]

SLIDE 17

Performance of CELP (Quantized)

mse = 0.0120

  • LP Coefficients: Unquantized
  • Other Parameters: Quantized

SLIDE 18

Quantization Methods Used

  • Scalar Quantization
  • DPCM
  • Vector Quantization
  • TSVQ

SLIDE 19

Scalar Quantization

  • Quantize one sample at a time
  • The simplest quantization scheme
  • Design quantizers with sizes M = 2, 4, 8, 16, 32, 64, 128, 256

SLIDE 20

Scalar Quantizer Design

  • Lloyd algorithm
  • Initial guess: a uniform codebook

SLIDE 21

Scalar Quantizer Design

  • Training data: 15000 samples of LP coefficients generated from different speech sources
      – 15000/256 ≈ 58 points/cell for M = 256
      – 15000/2 = 7500 points/cell for M = 2
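
The design procedure on these slides (Lloyd algorithm started from a uniform codebook) can be sketched as follows. This is an illustration under stated assumptions, not the project's MATLAB code: squared-error distortion, a uniform initial codebook spanning the range of the training data, and a hypothetical iteration cap.

```python
def lloyd_scalar_quantizer(samples, levels, iterations=50):
    """Design a scalar codebook of `levels` codewords from training data."""
    lo, hi = min(samples), max(samples)
    step = (hi - lo) / levels
    # Initial guess: uniform codebook over the range of the training data
    codebook = [lo + (i + 0.5) * step for i in range(levels)]
    for _ in range(iterations):
        # Partition step: assign each sample to its nearest codeword
        cells = [[] for _ in codebook]
        for x in samples:
            nearest = min(range(levels), key=lambda i: (x - codebook[i]) ** 2)
            cells[nearest].append(x)
        # Centroid step: move each codeword to the mean of its cell
        # (an empty cell keeps its previous codeword)
        updated = [sum(cell) / len(cell) if cell else codebook[i]
                   for i, cell in enumerate(cells)]
        if updated == codebook:
            break
        codebook = updated
    return codebook


def quantize(x, codebook):
    """Return the codeword nearest to x."""
    return min(codebook, key=lambda y: (x - y) ** 2)
```

The points-per-cell figures above matter for the centroid step: with few training points in a cell, the cell mean is a noisy estimate of the best codeword.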

SLIDE 22

Performance of the SQ

SLIDE 23

DPCM

  • Quantizing the prediction error, one sample at a time
  • Essentially a scalar quantizer
  • Good for slowly varying sources
  • Need a model for the source to design the linear predictor

SLIDE 24

DPCM Design – Predictor

  • Assume a source model
  • First-order AR, zero-mean Gaussian

SLIDE 25

DPCM Design – Predictor

  • Gaussian? Many different kinds of speech, and LP coefficients
  • Zero-mean? Empirical mean is near to zero

SLIDE 26

DPCM Design – Predictor

  • First-order AR? Correlation analysis indicates a large first-order correlation coefficient, near 0.8, and small higher-order coefficients, smaller than 0.01

SLIDE 27

DPCM Design – Quantizer

  • Designed to be optimal for the random variables $V_i = X_i - a_1 X_{i-1}$
  • Extract $a_1$ from correlation analysis, like solving the Yule-Walker equation
  • Avoid calculating the limiting density of the prediction error
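
A first-order DPCM loop of this kind can be sketched as below. The estimate a_1 = r(1)/r(0) is the order-1 Yule-Walker solution for a zero-mean source. The sketch predicts from the previous reconstructed value (the usual closed-loop form), which is an assumption here since the slides only give the design variable V_i. The function names and the error codebook passed in are hypothetical.

```python
def estimate_a1(x):
    """First-order predictor coefficient r(1) / r(0) for a zero-mean source."""
    r0 = sum(v * v for v in x)
    r1 = sum(x[i] * x[i - 1] for i in range(1, len(x)))
    return r1 / r0


def dpcm_encode_decode(x, a1, codebook):
    """Run first-order DPCM over x.

    codebook holds the scalar quantizer levels for the prediction error.
    Returns (indices, recon): the transmitted indices and the values the
    decoder would reconstruct from them.
    """
    indices, recon = [], []
    prev = 0.0  # previous reconstructed value, shared by encoder and decoder
    for sample in x:
        prediction = a1 * prev
        error = sample - prediction
        idx = min(range(len(codebook)),
                  key=lambda i: (error - codebook[i]) ** 2)
        prev = prediction + codebook[idx]
        indices.append(idx)
        recon.append(prev)
    return indices, recon
```

Because the error has a smaller variance than the source when a_1 is near 0.8, a codebook of the same size M covers it more finely than it would cover the raw samples.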

SLIDE 28

DPCM Performance

SLIDE 29

SQ vs. DPCM

SLIDE 30

SQ vs. DPCM

SLIDE 31

SQ vs. DPCM

For DPCM:

  • Significant improvement over SQ at lower rates
  • The simple models for the source and the quantizer input are effective

SLIDE 32

Vector Quantization

  • Key challenge
      – Given a source distribution, how to select the codebook (*) and partitions (-----) to result in the smallest average distortion

SLIDE 33

VQ Design

  • LBG algorithm was designed and implemented in Matlab
  • Computes a codebook of a desired size given a training sequence
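
A compact sketch of LBG codebook design is shown below, in Python rather than the MATLAB used in the project. It assumes squared-error distortion, the common splitting initialization with a small additive perturbation, and a codebook size that is a power of two (as in M = 2 ... 256). All names and the `eps` and `iterations` defaults are hypothetical.

```python
def nearest(vec, codebook):
    """Index of the codeword closest to vec in squared error."""
    return min(range(len(codebook)),
               key=lambda i: sum((v - c) ** 2
                                 for v, c in zip(vec, codebook[i])))


def centroid(vectors):
    """Component-wise mean of a non-empty list of vectors."""
    n = len(vectors)
    return [sum(column) / n for column in zip(*vectors)]


def lbg(training, size, eps=0.01, iterations=20):
    """Grow a codebook by repeated splitting and Lloyd refinement."""
    codebook = [centroid(training)]
    while len(codebook) < size:
        # Split every codeword into two slightly perturbed copies
        codebook = [[c + s for c in cw]
                    for cw in codebook for s in (eps, -eps)]
        for _ in range(iterations):
            cells = [[] for _ in codebook]
            for vec in training:
                cells[nearest(vec, codebook)].append(vec)
            # An empty cell keeps its previous codeword
            codebook = [centroid(cell) if cell else codebook[i]
                        for i, cell in enumerate(cells)]
    return codebook
```

Encoding a vector then amounts to sending `nearest(vec, codebook)`, which costs log2(size) bits per vector.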

SLIDE 34

Performance of the CELP coder

  • MOS, Mean Opinion Score
      – A sample of 20 people
      – Listen to reconstructed speech sample and rate the intelligibility
  • Excellent – 5
  • Good – 4
  • Fair – 3
  • Poor – 2
  • Bad – 1

SLIDE 35

Performance of Coder with DPCM

  M      MOS
  2      1
  4      1
  8      1
  16     1
  32     2.3
  64     3.1
  128    3.9
  256    4.5

SLIDE 36

Performance of Coder with SQ

  M      MOS
  2      1
  4      1
  8      1
  16     1
  32     1.8
  64     2.9
  128    3.6
  256    4.1

SLIDE 37

Performance of Coder with VQ

  M      MOS
  2      1.7
  4      1.9
  8      2.5
  16     2.9
  32     3.1
  64     3.1
  128    2.9
  256    3.0

SLIDE 38

Conclusions

  • Improvement in the quantization of LP coefficients improves the performance of the coder
  • For a given codebook size, VQ performed better in terms of MSE
  • DPCM performed better in terms of perceptual MSE

SLIDE 39

Questions?

SLIDE 40

THANK YOU