Multimedia Communications Spring 2006-07 Voice Traffic - - PowerPoint PPT Presentation




SLIDE 1

CS 584 / CMPE 584

Multimedia Communications

Spring 2006-07

Voice Traffic Characteristics

Shahab Baqai LUMS

SLIDE 2

Voice Communication Characteristics

Speech produces a signal that varies slowly in time

Bandwidth of about 4 kHz

SLIDE 3

Voice Coding

Voice processing comprises two steps

– Speech analysis: converts an analogue voice signal to digital form
– Speech synthesis: converts digital voice data back to analogue form

Two methods are used for voice processing

– Waveform coding
  • Pulse Code Modulation (PCM)
  • Code-excited Linear Prediction (CELP) coding
– Vocoding

SLIDE 4

PCM

Signal is sampled at regular intervals

– Sampling rate = 8 kHz (the Nyquist rate for a 4 kHz bandwidth)

Samples are quantized and transmitted

– 8 bits/sample ⇒ 64 kbps
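The 64 kbps figure follows directly from the sampling rate and the sample size. A minimal sketch of that arithmetic (the variable names are illustrative, not from the slides):

```python
# PCM bit rate = sampling rate x bits per sample.
bandwidth_hz = 4_000                  # voice bandwidth quoted on the slides
sampling_rate_hz = 2 * bandwidth_hz   # Nyquist rate: twice the bandwidth
bits_per_sample = 8

bit_rate_bps = sampling_rate_hz * bits_per_sample
print(f"{bit_rate_bps / 1000:.0f} kbps")  # prints "64 kbps"
```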

SLIDE 5

Sampling and Quantization

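SLIDE 6

Voice Quality Measure

Quantization is a source of degradation (noise)

May be measured by the signal-to-quantization-noise ratio

$$
SNR = \frac{\sigma_x^2}{\sigma_q^2}
    = \frac{\sum_{k=1}^{N} \int_{X_{k-1}}^{X_k} x^2 \, p(x) \, dx}
           {\sum_{k=1}^{N} \int_{X_{k-1}}^{X_k} (x - y_k)^2 \, p(x) \, dx}
$$

Where

  • $p(x)$ is the probability density function of the signal
  • $X_k$ is the decision level
  • $y_k$ (not defined on the slide) is presumably the output level for interval $k$, and $N$ the number of intervals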
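As a rough check on the measure above, the ratio of signal power to quantization noise power can be evaluated numerically. The sketch below assumes a signal uniformly distributed on [-1, 1] and 16 equal intervals with midpoint output levels; the function name `quantizer_snr` and the midpoint-rule integration are illustrative choices, not part of the slides.

```python
import math

def quantizer_snr(pdf, decision_levels, output_levels, steps=1000):
    """Evaluate SNR = signal power / quantization noise power by midpoint-rule integration."""
    signal_power = 0.0
    noise_power = 0.0
    for k in range(1, len(decision_levels)):
        lo, hi = decision_levels[k - 1], decision_levels[k]
        y_k = output_levels[k - 1]          # output level for interval k
        dx = (hi - lo) / steps
        for i in range(steps):
            x = lo + (i + 0.5) * dx
            signal_power += x * x * pdf(x) * dx
            noise_power += (x - y_k) ** 2 * pdf(x) * dx
    return signal_power / noise_power

# Assumed example: uniform pdf on [-1, 1], N = 16 equal intervals.
N = 16
delta = 2.0 / N
levels = [-1.0 + k * delta for k in range(N + 1)]                      # X_0 .. X_N
outputs = [(levels[k - 1] + levels[k]) / 2 for k in range(1, N + 1)]   # y_1 .. y_N

snr = quantizer_snr(lambda x: 0.5, levels, outputs)
snr_db = 10 * math.log10(snr)
print(f"SNR = {snr:.1f} ({snr_db:.1f} dB)")
```

For this particular case the integrals can also be done by hand: the signal power is 1/3 and the noise power is Δ²/12, so the ratio is N², which is 256 (about 24 dB) for N = 16.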

SLIDE 7

Uniform Quantizer

Interval between consecutive decision levels is constant

  • $X_k - X_{k-1} = \Delta$ (constant)

Problem: SNR is not constant

– Depends on signal amplitude
– The soft speaker is penalized more than a loud speaker
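The amplitude dependence can be seen by quantizing a loud and a soft tone with the same uniform quantizer. This is a sketch under stated assumptions: an 8-bit mid-rise quantizer over [-1, 1), sine-wave test signals, and the hypothetical helper names `uniform_quantize` and `sine_snr_db`.

```python
import math

def uniform_quantize(x, bits=8):
    """Mid-rise uniform quantizer over [-1, 1) with 2**bits levels."""
    levels = 1 << bits
    delta = 2.0 / levels                                      # constant step size
    index = math.floor(x / delta)
    index = max(-levels // 2, min(levels // 2 - 1, index))    # clip to the valid range
    return (index + 0.5) * delta

def sine_snr_db(amplitude, bits=8, n=8000):
    """Measured SNR in dB for a sine wave of the given peak amplitude."""
    signal = [amplitude * math.sin(0.1 * i) for i in range(n)]
    noise = [x - uniform_quantize(x, bits) for x in signal]
    p_signal = sum(x * x for x in signal) / n
    p_noise = sum(e * e for e in noise) / n
    return 10 * math.log10(p_signal / p_noise)

loud = sine_snr_db(1.0)     # full-scale talker
soft = sine_snr_db(0.01)    # talker 40 dB quieter
print(f"loud: {loud:.1f} dB, soft: {soft:.1f} dB")
```

Because the noise power stays near Δ²/12 regardless of the input level, the quiet signal loses roughly the same 40 dB in SNR that it loses in power.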

SLIDE 8

Non-Uniform Quantizer

μ-Law (North America)

A-Law (Europe)
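A non-uniform quantizer is commonly described as a compressor followed by a uniform quantizer. The sketch below uses the continuous μ-law curve y = sgn(x) ln(1 + μ|x|) / ln(1 + μ) with μ = 255; it is an illustration of the companding idea only, not the segmented bit-exact encoding used in deployed codecs, and the function names are hypothetical.

```python
import math

MU = 255.0  # value commonly used with 8-bit mu-law coding

def mu_compress(x, mu=MU):
    """Map x in [-1, 1] to [-1, 1], expanding small amplitudes."""
    return math.copysign(math.log1p(mu * abs(x)) / math.log1p(mu), x)

def mu_expand(y, mu=MU):
    """Inverse of mu_compress."""
    return math.copysign(math.expm1(abs(y) * math.log1p(mu)) / mu, y)

for x in (0.01, 0.1, 1.0):
    print(f"x = {x:<5} -> compressed {mu_compress(x):.3f}")
```

A sample at 1% of full scale is mapped to roughly 23% of the compressed range, so a soft speaker is spread over many more quantization levels than with a uniform quantizer.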

SLIDE 9

Adaptive Differential PCM

Takes advantage of the slow rate of change in the voice signal:

– Quantizes and transmits the difference between consecutive samples
– May use linear prediction of the signal
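The difference-coding idea can be sketched with plain DPCM. This is a simplified illustration, not ADPCM proper: the step size is fixed rather than adaptive, the predictor is just the previous reconstructed sample, and the names, 4-bit code width, and step value are assumptions.

```python
import math

def dpcm_encode(samples, step=0.02, bits=4):
    """Quantize the difference between each sample and the running reconstruction."""
    lo, hi = -(1 << (bits - 1)), (1 << (bits - 1)) - 1
    codes, predicted = [], 0.0
    for x in samples:
        code = max(lo, min(hi, round((x - predicted) / step)))
        codes.append(code)
        predicted += code * step      # mirror what the decoder will reconstruct
    return codes

def dpcm_decode(codes, step=0.02):
    """Rebuild the signal by accumulating the quantized differences."""
    out, predicted = [], 0.0
    for code in codes:
        predicted += code * step
        out.append(predicted)
    return out

# Slowly varying test signal: a 200 Hz tone sampled at 8 kHz.
signal = [0.5 * math.sin(2 * math.pi * 200 * n / 8000) for n in range(400)]
codes = dpcm_encode(signal)
decoded = dpcm_decode(codes)
max_err = max(abs(a - b) for a, b in zip(signal, decoded))
print(f"4-bit codes, max reconstruction error {max_err:.4f}")
```

Because consecutive samples differ by little, 4-bit difference codes track the signal to within about half a step here, where 8 bits per sample would otherwise be sent.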

SLIDE 10

CELP (Code-excited Linear Prediction) Coding

Coder

– Voice is analyzed in frames of 10~30 ms, each represented by:
  • Synthesis filter: updated by linear prediction
  • Excitation: optimally selected so as to minimize a “perceptually” weighted measure of distortion; makes use of a codebook
– A data frame is produced & transmitted

Decoder

– Excitation Signal → LP filter → Reproduced Waveform
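The excitation selection can be sketched as an analysis-by-synthesis search: each codebook entry is passed through the synthesis filter and the entry giving the smallest error is kept. This toy version makes several assumptions that are not in the slides: plain squared error instead of a perceptually weighted measure, a random 64-entry codebook, a first-order filter, and the hypothetical names `synthesize` and `search_codebook`.

```python
import random

def synthesize(excitation, lpc):
    """All-pole LP synthesis filter: s[n] = e[n] + sum_i lpc[i] * s[n-1-i]."""
    out = []
    for n, e in enumerate(excitation):
        past = sum(a * out[n - 1 - i] for i, a in enumerate(lpc) if n - 1 - i >= 0)
        out.append(e + past)
    return out

def search_codebook(target, codebook, lpc):
    """Return (index, gain, error) of the entry whose synthesized output best matches target."""
    best = None
    for index, vector in enumerate(codebook):
        synth = synthesize(vector, lpc)
        energy = sum(s * s for s in synth)
        gain = sum(t * s for t, s in zip(target, synth)) / energy if energy else 0.0
        error = sum((t - gain * s) ** 2 for t, s in zip(target, synth))
        if best is None or error < best[2]:
            best = (index, gain, error)
    return best

rng = random.Random(0)
FRAME = 80        # 10 ms at 8 kHz
LPC = [0.9]       # hypothetical first-order predictor coefficient
codebook = [[rng.gauss(0, 1) for _ in range(FRAME)] for _ in range(64)]

# Build a target that is entry 5 scaled by 2, then check that the search recovers it.
target = synthesize([2.0 * e for e in codebook[5]], LPC)
index, gain, error = search_codebook(target, codebook, LPC)
print(index, round(gain, 3))
```

Only the codebook index, the gain, and the filter coefficients would need to be sent for the frame; the decoder repeats the `synthesize` step to reproduce the waveform.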

SLIDE 11

VoCoding

For very low bit rates (≅ 2 kbps)

Based on modeling the speech production mechanisms rather than the waveform

– Speech is processed in frames of 10~25 ms
– Distinction between voiced & unvoiced frames
  • Voiced speech: vocal cords vibrating (e.g. vowels)
  • Unvoiced speech: vocal cords held firm without vibration (e.g. consonants such as "s" and "f")

Speech is represented by

– Coefficients that define vocal tract resonance characteristics
– Excitation energy
– Pitch value
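To get a feel for how little data a vocoder frame carries, the bit budget per frame can be worked out from the bit rate and frame length quoted above. A small sketch (the helper name `bits_per_frame` is illustrative):

```python
def bits_per_frame(bit_rate_bps, frame_ms):
    """Number of bits available to describe one frame."""
    return bit_rate_bps * frame_ms / 1000

for frame_ms in (10, 25):
    vocoder = bits_per_frame(2_000, frame_ms)     # ~2 kbps vocoder
    pcm = bits_per_frame(64_000, frame_ms)        # 64 kbps PCM for comparison
    print(f"{frame_ms} ms frame: vocoder {vocoder:.0f} bits, PCM {pcm:.0f} bits")
```

At 2 kbps the filter coefficients, excitation energy, and pitch value must all fit in 20 to 50 bits per frame, against 640 to 1600 bits for the same frame in PCM.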

SLIDE 12

VoCoding (cont)

Low quality

– Unnatural, buzzy

Works only for human speech

– Not optimized for other audio signals

Little current interest

– No international standard yet

SLIDE 13

Motivating Voice Compression

– MOS: Mean Opinion Score, a subjective measure of voice quality
– CELP: Code-excited Linear Prediction
– LD: Low Delay
– CS-ACELP: Conjugate Structure Algebraic CELP
– MP-MLQ: Multi-Pulse Excitation with a Maximum Likelihood Quantizer

SLIDE 14

Speech Activity

Speech alternates between two states

– Silence
– Talk spurt

SLIDE 15

Speech Activity (cont)

– One speaker talking: 64 ~ 73 %
– Both speakers talking: 3 ~ 7 %
– Both speakers silent: 33 ~ 20 %

Average state durations

– Silence: avg time ≈ 1.8 sec
– Talkspurt: avg time ≈ 1.2 sec
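The average durations of about 1.2 s for a talkspurt and 1.8 s for a silence imply how often a source is active. The sketch below computes that fraction and checks it against a simple simulation; the exponentially distributed state durations are an assumption made for illustration, not something the slides state.

```python
import random

MEAN_TALK_S = 1.2       # average talkspurt duration from the slide
MEAN_SILENCE_S = 1.8    # average silence duration from the slide

# Long-run fraction of time spent in the talkspurt state.
activity = MEAN_TALK_S / (MEAN_TALK_S + MEAN_SILENCE_S)

def simulate_activity(cycles=20_000, seed=1):
    """Alternate talkspurt/silence periods with exponential durations (assumed)."""
    rng = random.Random(seed)
    talk = sum(rng.expovariate(1 / MEAN_TALK_S) for _ in range(cycles))
    silence = sum(rng.expovariate(1 / MEAN_SILENCE_S) for _ in range(cycles))
    return talk / (talk + silence)

print(f"analytic: {activity:.2f}, simulated: {simulate_activity():.2f}")
```

An activity factor of about 0.4 means the source is silent roughly 60% of the time.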

SLIDE 16

Silence Suppression

Voice Activity Detector (VAD)

– When silence is detected, background noise is transmitted
– When speech is detected, full fixed bit rate stream is transmitted

About 60% reduction in data rate

– Resulting traffic is no longer constant bit rate
– Statistical Multiplexing gain may be significant
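The multiplexing gain can be estimated with a binomial model: if each source is active about 40% of the time, a link rarely needs to carry all sources at once. The sketch assumes independent sources, a hypothetical group of 30 calls, and a 1% tolerance for momentary overload; none of these figures come from the slides.

```python
from math import comb

def overflow_probability(n_sources, p_active, capacity):
    """P(more than `capacity` of n independent sources are active at once)."""
    return sum(
        comb(n_sources, k) * p_active**k * (1 - p_active) ** (n_sources - k)
        for k in range(capacity + 1, n_sources + 1)
    )

def channels_needed(n_sources, p_active, max_overflow):
    """Smallest capacity whose overflow probability does not exceed max_overflow."""
    for capacity in range(n_sources + 1):
        if overflow_probability(n_sources, p_active, capacity) <= max_overflow:
            return capacity
    return n_sources

N_SOURCES = 30    # hypothetical number of simultaneous calls
P_ACTIVE = 0.4    # activity after roughly 60% silence suppression
needed = channels_needed(N_SOURCES, P_ACTIVE, 0.01)
gain = N_SOURCES / needed
print(f"{needed} channels for {N_SOURCES} calls, multiplexing gain {gain:.2f}")
```

The gain grows with the number of multiplexed sources, since the number of simultaneously active talkers concentrates around its mean.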