MELP Vocoder - PowerPoint PPT Presentation

SLIDE 1

MELP Vocoder

Page 0 of 23

SLIDE 2

Outline

Introduction

MELP Vocoder Features

Algorithm Description

Parameters & Comparison

SLIDE 3

Introduction

Traditional pitch-excited LPC vocoders drive the synthesis filter with either a periodic pulse train or white noise, which yields intelligible speech at very low bit rates.

However, the result sometimes sounds mechanical or buzzy and is prone to tonal noise.

SLIDE 4

Introduction

These problems arise from the inability of a simple pulse train to reproduce all kinds of voiced speech.

The MELP vocoder uses a mixed-excitation model that can represent a richer ensemble of speech characteristics, and so produces more natural-sounding speech.

SLIDE 5

MELP vocoder

Robust in background noise environments

Based on the traditional LPC model, with additional features:

  • Aperiodic pulses
  • Adaptive spectral enhancement
  • Mixed excitation
  • Pulse dispersion

SLIDE 6

Encoder

[Figure: MELP encoder block diagram; recoverable labels: LPC, LSF, MSVQ]

MELP vocoder

SLIDE 7

MELP vocoder

SLIDE 8

Computing the Fourier transform magnitudes

  • FFT

SLIDE 9

Computing the voicing strengths and setting the aperiodic flag

L = 40, 41, …, 160

SLIDE 10

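Peakiness (degree of dispersion of the peaks)

$$p = \frac{\left[\frac{1}{160}\sum_{n=-80}^{79} e^2[n]\right]^{1/2}}{\frac{1}{160}\sum_{n=-80}^{79} \left|e[n]\right|}$$

[Figure: example waveforms with peakiness values P = 12.64, P = 6.77, P = 1.1 and P = 1.16]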

SLIDE 11

Peakiness (degree of dispersion of the peaks)

$$p = \frac{\left[\frac{1}{160}\sum_{n=-80}^{79} e^2[n]\right]^{1/2}}{\frac{1}{160}\sum_{n=-80}^{79} \left|e[n]\right|}$$

SLIDE 12

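Bit allocation table

Parameters                           Voiced   Unvoiced
LSF                                  25       25
Fourier magnitudes                   8        -
Gain (2 per frame)                   8        8
Pitch and overall voicing (VS1)      7        7
Bandpass voicing                     4        -
Aperiodic flag                       1        -
Error protection                     -        13
Sync bit                             1        1
Total bits                           54       54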

SLIDE 13

Mixed Excitation

Mixed excitation is implemented using a multi-band mixing model.

This model can simulate frequency-dependent voicing strength by using a mixture of periodic or aperiodic pulses and white noise as the excitation.

The primary effect of this unit is to reduce the buzz usually associated with LPC vocoders, especially in broadband acoustic noise.
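The multi-band mixing idea can be illustrated with a minimal sketch. It is not the standard's filter bank: the mixing is done per DFT bin instead of with bandpass filters, and the 8 kHz sampling rate, the five band edges, and the names `mixed_excitation` and `band_of` are assumptions made for illustration.

```python
import cmath
import math
import random

FS = 8000  # sampling rate in Hz (assumed)
BAND_EDGES_HZ = [0, 500, 1000, 2000, 3000, 4000]  # assumed five-band split


def dft(x):
    """Naive discrete Fourier transform (adequate for one short frame)."""
    n = len(x)
    return [sum(x[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
            for k in range(n)]


def idft_real(spec):
    """Inverse DFT, returning the real part of each sample."""
    n = len(spec)
    return [sum(spec[k] * cmath.exp(2j * math.pi * k * t / n)
                for k in range(n)).real / n
            for t in range(n)]


def band_of(freq_hz):
    """Index of the band that contains freq_hz."""
    for b in range(len(BAND_EDGES_HZ) - 1):
        if freq_hz < BAND_EDGES_HZ[b + 1]:
            return b
    return len(BAND_EDGES_HZ) - 2


def mixed_excitation(pulses, noise, voicing):
    """Mix pulse and noise excitation band by band.

    voicing[b] is the voicing strength of band b in [0, 1]:
    1 keeps only the pulse component, 0 keeps only the noise component.
    """
    n = len(pulses)
    p_spec, n_spec = dft(pulses), dft(noise)
    mixed = []
    for k in range(n):
        # Bins k and n - k carry the same physical frequency.
        freq = min(k, n - k) * FS / n
        v = voicing[band_of(freq)]
        mixed.append(v * p_spec[k] + (1.0 - v) * n_spec[k])
    return idft_real(mixed)


# Usage: voiced below 2 kHz, noisy above.
random.seed(0)
frame_len, pitch = 180, 60
pulses = [1.0 if t % pitch == 0 else 0.0 for t in range(frame_len)]
noise = [random.gauss(0.0, 0.1) for _ in range(frame_len)]
excitation = mixed_excitation(pulses, noise, [1.0, 1.0, 1.0, 0.0, 0.0])
```

Setting every band to 1 reproduces the pulse train and setting every band to 0 reproduces the noise, so the voicing vector moves the excitation between the two classical LPC cases.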

SLIDE 14

Aperiodic pulses

When the input signal is voiced, the MELP vocoder can synthesize speech using either periodic or aperiodic pulses.

Aperiodic pulses are used during transition regions between voiced and unvoiced segments of the speech signal, which lets the decoder produce erratic glottal pulses without introducing tonal noise.
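One way to picture aperiodic pulses is as a pulse train whose period is randomly perturbed. The sketch below is an illustration under assumptions: the function name `pulse_positions`, the uniform jitter distribution, and the default jitter of 25% of the pitch period are not taken from the slides.

```python
import random


def pulse_positions(frame_len, pitch, aperiodic, jitter=0.25, rng=None):
    """Return the sample indices of the excitation pulses in one frame.

    When `aperiodic` is true, each pitch period is scaled by a random
    factor in [1 - jitter, 1 + jitter] (assumed jitter model).
    """
    rng = rng or random.Random(0)
    positions, t = [], 0.0
    while t < frame_len:
        positions.append(int(t))
        period = float(pitch)
        if aperiodic:
            period *= 1.0 + rng.uniform(-jitter, jitter)
        t += period
    return positions


# Usage: a strictly periodic frame versus a jittered one.
periodic = pulse_positions(180, 60, aperiodic=False)   # [0, 60, 120]
jittered = pulse_positions(180, 60, aperiodic=True)
```

Because the jittered spacing never repeats exactly, the synthetic spectrum has no sharp harmonic lines, which is the intuition behind "erratic glottal pulses without tonal noise".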

SLIDE 15

Pulse Dispersion

Pulse dispersion is implemented using a fixed pulse dispersion filter based on a spectrally flattened triangle pulse.

The pulse dispersion filter improves the match between bandpass-filtered synthetic and natural speech waveforms in frequency bands that do not contain a formant resonance.

It spreads the excitation energy within a pitch period, which reduces the harsh quality of the synthetic speech.

SLIDE 16

Adaptive spectral enhancement filter

The filter is based on the poles of the vocal tract filter.

It is used to enhance the formant structure of the synthetic speech.

It improves the match between synthetic and natural bandpass waveforms, giving more natural speech output.

SLIDE 17

MELP Algorithm Description (Encoder)

1. Filter out any low-frequency noise.

2. Filter the result again in order to perform the initial pitch search for the pitch estimate.

3. Perform the bandpass voicing analysis. In this step the encoder decides whether to use a periodic or aperiodic pulse train or a white-noise model.
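The initial pitch search in step 2 can be sketched as a search for the lag that maximizes a normalized correlation. This is a simplified illustration: the lag range 40 to 160 is the range L = 40, …, 160 quoted earlier in the deck, while the function names and the choice of analysis window are assumptions.

```python
import math


def normalized_correlation(x, lag, length):
    """Normalized correlation between x[n] and x[n + lag] over `length` samples."""
    num = sum(x[n] * x[n + lag] for n in range(length))
    e0 = sum(x[n] ** 2 for n in range(length))
    e1 = sum(x[n + lag] ** 2 for n in range(length))
    denom = math.sqrt(e0 * e1)
    return num / denom if denom > 0.0 else 0.0


def integer_pitch_search(x, length, lags=range(40, 161)):
    """Return (best_lag, correlation) over the candidate integer lags.

    x must hold at least length + max(lags) samples.
    """
    best_lag = max(lags, key=lambda lag: normalized_correlation(x, lag, length))
    return best_lag, normalized_correlation(x, best_lag, length)


# Usage: a sine wave with a 90-sample period.
signal = [math.sin(2 * math.pi * n / 90) for n in range(320)]
lag, corr = integer_pitch_search(signal, 160)
print(lag, round(corr, 3))
```

A real encoder would refine this integer estimate (the later "final pitch estimate" step); the sketch only covers the coarse search.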

SLIDE 18

MELP Algorithm Description (Encoder) cont’d

In this stage a voicing strength is estimated in each band, based on the normalized correlation function of the speech signal and, in the non-DC bands, of the smoothed rectified signal.

Let s_k(n) denote the speech signal in band k and u_k(n) the DC-removed, smoothed, rectified version of s_k(n). The normalized correlation function of a signal x is:

$$R_x(p) = \frac{\sum_{n=1}^{N} x(n)\,x(n+p)}{\left[\sum_{n=1}^{N} x^2(n)\,\sum_{n=1}^{N} x^2(n+p)\right]^{1/2}}$$

where P is the pitch of the current frame and N is the frame length. The voicing strength for band k is defined as max(R_{s_k}(P), R_{u_k}(P)).
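The band voicing strength can be sketched as follows. The correlation follows the formula on this slide; the envelope detector is an assumption (full-wave rectification, a one-pole smoother with a hypothetical coefficient of 0.97, then mean removal), as are the function names.

```python
import math


def norm_corr(x, p, n_len):
    """R_x(p): normalized correlation of x(n) and x(n + p) over n_len samples."""
    num = sum(x[n] * x[n + p] for n in range(n_len))
    den = math.sqrt(sum(x[n] ** 2 for n in range(n_len)) *
                    sum(x[n + p] ** 2 for n in range(n_len)))
    return num / den if den > 0.0 else 0.0


def envelope(s, smooth=0.97):
    """u_k(n): rectified, smoothed, DC-removed version of s (assumed design)."""
    out, state = [], 0.0
    for v in s:
        state = smooth * state + (1.0 - smooth) * abs(v)
        out.append(state)
    mean = sum(out) / len(out)
    return [v - mean for v in out]


def band_voicing_strength(s_k, pitch, n_len):
    """max(R_sk(P), R_uk(P)) for one band; s_k needs n_len + pitch samples."""
    u_k = envelope(s_k)
    return max(norm_corr(s_k, pitch, n_len), norm_corr(u_k, pitch, n_len))


# Usage: a band signal that repeats every 40 samples is strongly voiced.
band = [math.sin(2 * math.pi * n / 40) for n in range(240)]
print(round(band_voicing_strength(band, 40, 160), 3))
```

Taking the maximum of the two correlations lets a band count as voiced when its envelope is periodic even if the waveform itself does not correlate well at the pitch lag.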

SLIDE 19

MELP Algorithm Description (Encoder ) cont’d

The jittery state is determined by the peakiness of the full-wave rectified LP residual e(n):

$$\text{Peakiness} = \frac{\left[\frac{1}{N}\sum_{n=1}^{N} e^2(n)\right]^{1/2}}{\frac{1}{N}\sum_{n=1}^{N} \left|e(n)\right|}$$

If the peakiness is greater than some threshold, the speech frame is flagged as jittered (the aperiodic flag is set).
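The peakiness measure is the ratio of the RMS value to the mean absolute value of the residual, which a short sketch makes concrete. The function names are illustrative, and the threshold is left as a parameter because the slides do not give its value; the 1.5 used in the usage lines is a hypothetical number.

```python
import math


def peakiness(residual):
    """Ratio of RMS value to mean absolute value of the LP residual."""
    n = len(residual)
    rms = math.sqrt(sum(e * e for e in residual) / n)
    mean_abs = sum(abs(e) for e in residual) / n
    return rms / mean_abs if mean_abs > 0.0 else 0.0


def is_jittered(residual, threshold):
    """Flag the frame as jittered when peakiness exceeds the threshold."""
    return peakiness(residual) > threshold


# Usage: one isolated pulse is very peaky, a flat residual is not.
single_pulse = [1.0] + [0.0] * 159   # peakiness is sqrt(160), about 12.65
flat = [0.5] * 160                   # peakiness is 1.0
print(is_jittered(single_pulse, threshold=1.5), is_jittered(flat, threshold=1.5))
```

The measure is scale-invariant: multiplying the residual by a constant changes numerator and denominator equally, so only the shape of the residual matters.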

SLIDE 20

MELP Algorithm Description (Encoder) cont’d

4. Apply LPC analysis.

5. Calculate the final pitch estimate.

6. Calculate the gain estimate.

7. Quantize the LPC coefficients, pitch, gain and bandpass voicing.

8. Determine and quantize the Fourier magnitudes. The information in these coefficients improves the accuracy of the speech production model at the perceptually important lower frequencies.

SLIDE 21

MELP Encoder

[Figure: MELP encoder block diagram. Blocks: Pre-filter; Pitch Search; Bandpass Voicing Decision; Gain Calculator; LPC Analysis Filter; Final Pitch and Voicing Decision; LSF Quantization; Quantize Gain, Pitch, Voicing, Jitter; Fourier Magnitude Calculation; Apply Forward Error Correction. Input: input signal. Output: transmitted bitstream.]

SLIDE 22

MELP Algorithm (Decoder)

1. Decode the pitch.

2. Apply gain attenuation.

3. Linearly interpolate all of the synthesis parameters pitch-synchronously.

4. Generate the mixed excitation.
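Step 3 can be sketched as a linear blend between the previous and current frame parameters, evaluated once per pitch pulse. The function name, the dictionary layout, and the frame length of 180 samples (22.5 ms at an assumed 8 kHz sampling rate) are assumptions for illustration.

```python
def interpolate_parameters(prev, curr, pulse_start, frame_len=180):
    """Linearly interpolate every parameter at the start of a pitch period.

    prev and curr map parameter names to values decoded for the previous
    and current frame; pulse_start is the pulse position within the frame.
    """
    w = pulse_start / frame_len
    return {name: (1.0 - w) * prev[name] + w * curr[name] for name in curr}


# Usage: parameters half-way through the frame.
prev_frame = {"pitch": 50.0, "gain": 10.0}
curr_frame = {"pitch": 60.0, "gain": 20.0}
print(interpolate_parameters(prev_frame, curr_frame, pulse_start=90))
```

Evaluating the blend at pulse positions rather than on a fixed grid is what "pitch-synchronously" refers to: each pitch period is synthesized with one consistent set of parameters.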

SLIDE 23

MELP Algorithm (Decoder) cont’d

5. Apply the adaptive spectral enhancement filter.

6. Perform LPC synthesis and apply the gain factor.

7. Apply dispersion filtering.
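Step 6 amounts to running the excitation through an all-pole filter and scaling the result. The sketch below assumes the convention A(z) = 1 + a1 z^-1 + … + ap z^-p for the prediction polynomial and a single gain per call; both the convention and the name `lpc_synthesis` are assumptions, not taken from the slides.

```python
def lpc_synthesis(excitation, a, gain=1.0):
    """All-pole synthesis 1/A(z) followed by a gain.

    a = [a1, ..., ap] with A(z) = 1 + a1 z^-1 + ... + ap z^-p (assumed sign
    convention), so s(n) = e(n) - sum_k a_k * s(n - k).
    """
    out = []
    for n, e in enumerate(excitation):
        acc = e
        for k, ak in enumerate(a, start=1):
            if n - k >= 0:
                acc -= ak * out[n - k]
        out.append(acc)
    return [gain * v for v in out]


# Usage: a one-pole filter turns an impulse into a decaying exponential.
print(lpc_synthesis([1.0, 0.0, 0.0, 0.0], a=[-0.5], gain=2.0))
# [2.0, 1.0, 0.5, 0.25]
```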

SLIDE 24

MELP Decoder

[Figure: MELP decoder block diagram. Blocks: Decode Parameters; Noise Generator; Noise Shaping Filter; Pulse Generator; Pulse Position Jitter; Pulse Shaping Filter; Adaptive Spectral Enhancement; summing node; LPC Synthesis Filter; Pulse Dispersion Filter; Gain. Input: received bitstream. Output: synthesized speech.]

SLIDE 25

Parameter Quantization

Parameters                   Voiced   Unvoiced
LSF parameters               25       25
Fourier magnitudes           8        -
Gain (2 per frame)           8        8
Pitch, overall voicing       7        7
Bandpass voicing             4        -
Aperiodic flag               1        -
Error protection             -        13
Sync bit                     1        1
Total bits / 22.5 ms frame   54       54
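The table can be checked with a few lines of arithmetic: both columns should sum to 54 bits, and 54 bits every 22.5 ms should give 2400 bps. The dictionary names are illustrative, and the 8 kHz sampling rate used for the samples-per-frame figure is an assumption.

```python
FRAME_MS = 22.5

VOICED_BITS = {
    "lsf": 25, "fourier_magnitudes": 8, "gain": 8,
    "pitch_overall_voicing": 7, "bandpass_voicing": 4,
    "aperiodic_flag": 1, "sync": 1,
}
UNVOICED_BITS = {
    "lsf": 25, "gain": 8, "pitch_overall_voicing": 7,
    "error_protection": 13, "sync": 1,
}


def bit_rate_bps(bits):
    """Bits per second for one allocation, given one frame every FRAME_MS."""
    return sum(bits.values()) * 1000 / FRAME_MS


print(sum(VOICED_BITS.values()), sum(UNVOICED_BITS.values()))  # 54 54
print(bit_rate_bps(VOICED_BITS))                               # 2400.0
print(FRAME_MS * 8)  # samples per frame at an assumed 8 kHz rate: 180.0
```

The bits that carry Fourier magnitudes, bandpass voicing and the aperiodic flag in voiced frames are not needed in unvoiced frames, and the table shows those 13 bits being spent on error protection instead.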

SLIDE 26

Bit transmission order

SLIDE 27

Comparison of the 2400 BPS MELP with other Standard Coders

Diagnostic Acceptability Measure

Two Conditions

Quiet

Office

Continuously Variable Slope Delta Modulation (CVSD)

16,000 bps

Code Excited Linear Prediction (CELP)

4800 bps

FS1016

Mixed Excitation Linear Prediction (MELP)

2400 bps

FIPS Publication 137

Linear Predictive Coding (LPC)

2400 bps
SLIDE 28

Comparison of the 2400 BPS MELP with other Standard Coders (cont’d)

Mean Opinion Score in Six Conditions

Quiet: Anechoic Sound Chamber, Dynamic Microphone

Quiet - H250: Anechoic Sound Chamber, H250 Microphone

1% Random Bit Errors: Anechoic Sound Chamber, Dynamic Microphone

0.5% Random Block Errors: Anechoic Sound Chamber, Dynamic Microphone, 50% Errors within a 35 ms block

Office: Modern Office Environment, Dynamic Microphone

Mobile Command Environment: Field Shelter, EV M87 Microphone

SLIDE 29

Comparison of the 2400 BPS MELP with other Standard Coders (cont’d)

Complexity with three Measurements

RAM

ROM

MIPS

SLIDE 30


LPC 10

Voice samples

SLIDE 31

Voice samples


Original Sound MELP 1800 MELP 2000 MELP 2200