
SLIDE 1

MELP Vocoder

SLIDE 2

Outline

- Introduction
- MELP Vocoder Features
- Algorithm Description
- Parameters & Comparison

SLIDE 3

Introduction

- Traditional pitch-excited LPC vocoders drive the synthesis filter with either a periodic pulse train or white noise
  - This produces intelligible speech at very low bit rates
- But the result sometimes sounds mechanical or buzzy and is prone to tonal noise

SLIDE 4

Introduction

- These problems arise from:
  - The inability of a simple pulse train to reproduce all kinds of voiced speech
- The MELP vocoder uses a mixed-excitation model, which represents a richer ensemble of speech characteristics
  - It produces more natural-sounding speech

SLIDE 5

MELP vocoder

- Robust in background-noise environments
- Based on the traditional LPC model, with additional features:
  - Aperiodic pulses
  - Adaptive spectral enhancement
  - Mixed excitation
  - Pulse dispersion

SLIDE 6

Encoder

[Block diagram of the MELP encoder. Block labels: input speech; Hamming windowing; pitch computation; LPC computation; LPC bandwidth expansion; prediction error filter; prediction error; computation of voicing strengths and peakiness; final pitch computation; Fourier magnitude computation; LPC-to-LSF conversion; LSF sorting; algorithm enforcing a minimum separation of 50 Hz; LSF vector; MSVQ; array of vectors; quantized vector; quantization of the Fourier magnitudes.]

SLIDE 7

Position of the analysis windows

SLIDE 8

Fourier magnitude computation

- The pulse-generation filter is responsible for producing the pulse train. This is done using an FFT on 200 samples of the signal and extracting the envelope of the impulse response.

SLIDE 9

Computing the voicing strengths and setting the aperiodic flag

1. First-stage estimate (L = 40, 41, …, 160)
2. Determine the voicing strength of the lowest band
3. Determine the voicing strengths of the other 4 bands

SLIDE 10

Peakiness

$$p = \frac{\left[\frac{1}{160}\sum_{n=-80}^{79} e^2[n]\right]^{1/2}}{\frac{1}{160}\sum_{n=-80}^{79} \left|e[n]\right|}$$

Example values: P = 12.64, P = 6.77, P = 1.1, P = 1.16

SLIDE 11

Peakiness

$$p = \frac{\left[\frac{1}{160}\sum_{n=-80}^{79} e^2[n]\right]^{1/2}}{\frac{1}{160}\sum_{n=-80}^{79} \left|e[n]\right|}$$

SLIDE 12

Bit allocation table

Parameter                     Voiced   Unvoiced
LSF coefficients              25       25
Fourier magnitudes            8        -
Gain (2 per frame)            8        8
Pitch period + VS1            7        7
Voicing strengths             4        -
Aperiodic flag                1        -
Error protection              -        13
Synchronization bit           1        1
Total bits allocated          54       54

SLIDE 13

Mixed Excitation

- Mixed excitation is implemented using a multi-band mixing model
- This model can simulate frequency-dependent voicing strength
- It uses a mixture of aperiodic/periodic pulses and white noise as the excitation
- The primary effect of this unit is to reduce the buzz in broadband acoustic noise
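The multi-band mixing idea can be sketched in a few lines. This is a minimal illustration, not the standard's implementation: it assumes the pulse and noise signals have already been split into bands by bandpass filters that are not shown, and the function name `mix_excitation` and its arguments are hypothetical.

```python
def mix_excitation(pulse_bands, noise_bands, voicing):
    """Mix band-limited pulse and noise signals, band by band.

    pulse_bands[k] and noise_bands[k] are equal-length sample lists for
    band k (assumed to be bandpass filtered already). voicing[k] in [0, 1]
    is the voicing strength of band k: 1 means all pulse, 0 means all noise.
    """
    if not (len(pulse_bands) == len(noise_bands) == len(voicing)):
        raise ValueError("need one pulse signal, noise signal and strength per band")
    if not pulse_bands:
        return []
    length = len(pulse_bands[0])
    excitation = [0.0] * length
    for pulse, noise, v in zip(pulse_bands, noise_bands, voicing):
        if len(pulse) != length or len(noise) != length:
            raise ValueError("all band signals must have the same length")
        if not 0.0 <= v <= 1.0:
            raise ValueError("voicing strength must lie in [0, 1]")
        for n in range(length):
            # Weighted sum of the two sources in this band.
            excitation[n] += v * pulse[n] + (1.0 - v) * noise[n]
    return excitation


# Two bands: the low band fully voiced, the high band fully unvoiced.
pulse_bands = [[1.0, 0.0, 0.0, 0.0], [0.5, 0.0, 0.0, 0.0]]
noise_bands = [[0.1, -0.2, 0.3, -0.1], [0.2, 0.1, -0.3, 0.4]]
print(mix_excitation(pulse_bands, noise_bands, [1.0, 0.0]))
```

Setting a different strength per band is what lets the model represent speech that is voiced at low frequencies and noisy at high frequencies.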

SLIDE 14

Aperiodic pulses

- When the input signal is voiced, the MELP vocoder can synthesize speech using either aperiodic or periodic pulses
- Aperiodic pulses are used during transition regions between voiced and unvoiced segments of the speech signal
- This produces erratic glottal pulses without tonal noise
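One way to picture aperiodic pulses is as a pulse train whose period is randomly perturbed. The sketch below is illustrative only: the function name `pulse_positions`, the uniform distribution, and the 25% jitter value in the usage lines are assumptions, not values taken from these slides.

```python
import random


def pulse_positions(pitch_period, num_samples, jitter=0.0, rng=None):
    """Return sample indices of excitation pulses.

    jitter = 0 gives a strictly periodic train. jitter = j perturbs each
    period by a uniformly drawn factor in [1 - j, 1 + j) (an assumption).
    """
    if pitch_period <= 0:
        raise ValueError("pitch_period must be positive")
    if not 0.0 <= jitter < 1.0:
        raise ValueError("jitter must lie in [0, 1)")
    rng = rng or random.Random()
    positions = []
    t = 0.0
    while t < num_samples:
        positions.append(int(t))  # floor to a sample index
        t += pitch_period * (1.0 + jitter * rng.uniform(-1.0, 1.0))
    return positions


periodic = pulse_positions(40, 200)
aperiodic = pulse_positions(40, 200, jitter=0.25, rng=random.Random(0))
print(periodic)
print(aperiodic)
```

The periodic call places a pulse every 40 samples; the jittered call spaces them irregularly, which is the erratic glottal behaviour the slide describes.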

SLIDE 15

Pulse Dispersion

- Pulse dispersion is implemented using a fixed pulse dispersion filter based on a flattened triangle pulse
- The pulse dispersion filter improves the match of bandpass-filtered synthetic and natural speech waveforms in frequency bands that do not contain a formant resonance
- It spreads the excitation energy within a pitch period
- This reduces the harsh quality of the synthetic speech

SLIDE 16

Adaptive spectral enhancement filter

- Based on the poles of the vocal tract filter
- Used to enhance the formant structure in the synthetic speech
- This filter improves the match between synthetic and natural bandpass waveforms
  - The result is more natural speech output

SLIDE 17

MELP Algorithm Description (Encoder)

1. Filter out any low-frequency noise
2. Filter the resulting speech again to perform the initial pitch search for the pitch estimation
3. Perform the bandpass voicing analysis
   - This step decides whether to use a periodic/aperiodic pulse train or the white-noise model

SLIDE 18

MELP Algorithm Description (Encoder) cont’d

- In this stage a voicing-strength parameter is estimated in each band, based on the normalized correlation function of the speech signal and of the smoothed rectified signal in the non-DC band
- Let s_k(n) denote the speech signal in band k and u_k(n) the DC-removed smoothed rectified signal of s_k(n). The correlation function is:

$$R_x(p) = \frac{\sum_{n=1}^{N} x(n)\,x(n+p)}{\left[\sum_{n=1}^{N} x^2(n)\,\sum_{n=1}^{N} x^2(n+p)\right]^{1/2}}$$

where
  P is the pitch of the current frame
  N is the frame length
  the voicing strength for band k is defined as max(R_{s_k}(P), R_{u_k}(P))
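The normalized correlation and the per-band voicing strength can be written out directly from these definitions. This is a minimal sketch under stated assumptions: the function names are hypothetical, indices are 0-based rather than starting at n = 1, and the band signals s_k and u_k are taken as given (the bandpass filtering and envelope smoothing are not shown).

```python
import math


def normalized_correlation(x, p, n):
    """R_x(p) over n samples; x must hold at least n + p samples."""
    if p < 0 or n <= 0 or len(x) < n + p:
        raise ValueError("need p >= 0, n > 0 and len(x) >= n + p")
    num = sum(x[i] * x[i + p] for i in range(n))
    energy_now = sum(x[i] ** 2 for i in range(n))
    energy_lagged = sum(x[i + p] ** 2 for i in range(n))
    den = math.sqrt(energy_now * energy_lagged)
    # A silent segment has no defined correlation; report 0 (a choice made here).
    return num / den if den > 0.0 else 0.0


def band_voicing_strength(s_k, u_k, pitch, n):
    """max(R_sk(P), R_uk(P)) for one band."""
    return max(normalized_correlation(s_k, pitch, n),
               normalized_correlation(u_k, pitch, n))


# A signal that repeats every 4 samples correlates perfectly at lag 4.
x = [1.0, 0.0, -1.0, 0.0] * 50
print(normalized_correlation(x, 4, 100))
print(normalized_correlation(x, 2, 100))
```

Taking the maximum over the signal and its envelope means a band counts as voiced if either one shows strong periodicity at the pitch lag.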

SLIDE 19

MELP Algorithm Description (Encoder) cont’d

- The jittery state is determined by the peakiness of the full-wave rectified LP residual e(n):

$$\text{Peakiness} = \frac{\left[\frac{1}{N}\sum_{n=1}^{N} e^2(n)\right]^{1/2}}{\frac{1}{N}\sum_{n=1}^{N} \left|e(n)\right|}$$

- If the peakiness is greater than some threshold, the speech frame is flagged as jittered (the aperiodic flag is set)
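The peakiness measure is the ratio of the RMS value of the residual to its mean absolute value, which is easy to check numerically. The sketch below assumes hypothetical function names, and the threshold is left as a caller-supplied parameter because the slides do not state its value.

```python
import math


def peakiness(residual):
    """RMS value divided by mean absolute value of an LP residual frame."""
    n = len(residual)
    if n == 0:
        raise ValueError("residual frame must not be empty")
    rms = math.sqrt(sum(e * e for e in residual) / n)
    mean_abs = sum(abs(e) for e in residual) / n
    # An all-zero frame has no peaks; report 0 (a choice made here).
    return rms / mean_abs if mean_abs > 0.0 else 0.0


def is_jittered(residual, threshold):
    """Flag the frame when peakiness exceeds the (unspecified) threshold."""
    return peakiness(residual) > threshold


# A single pulse in 160 samples gives sqrt(160), about 12.65.
single_pulse = [0.0] * 159 + [1.0]
print(peakiness(single_pulse))

# A constant-magnitude signal gives the minimum value, 1.
flat = [1.0, -1.0] * 80
print(peakiness(flat))
```

The two extremes show why the measure works: energy concentrated in a few samples drives the ratio up, while evenly spread energy keeps it near 1.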

SLIDE 20

MELP Algorithm Description (Encoder) cont’d

4. Apply LPC analysis
5. Calculate the final pitch estimate
6. Calculate the gain estimate
7. Quantize the LPC coefficients, pitch, gain and bandpass voicing
8. Determine and quantize the Fourier magnitudes
   - The information in these coefficients improves the accuracy of the speech production model at the perceptually important lower frequencies

SLIDE 21

MELP Encoder

[Block diagram of the MELP encoder, from input signal to transmitted bitstream. Block labels: pre-filter; pitch search; bandpass voicing decision; gain calculator; LPC analysis filter; final pitch and voicing decision; LSF quantization; quantization of gain, pitch, voicing and jitter; Fourier magnitude calculation; forward error correction.]

SLIDE 22

MELP Algorithm (Decoder)

1. Decode the pitch
2. Apply gain attenuation
3. Linearly interpolate all of the synthesis parameters pitch-synchronously
4. Generate the mixed excitation

SLIDE 23

MELP Algorithm (Decoder) cont’d

5. Apply the adaptive spectral enhancement filter
6. Perform LPC synthesis and apply the gain factor
7. Apply dispersion filtering

SLIDE 24

MELP Decoder

[Block diagram of the MELP decoder, from received bitstream to synthesized speech. Block labels: decode parameters; noise generator; noise shaping filter; pulse generator; pulse position jitter; pulse shaping filter; adder (+); adaptive spectral enhancement; LPC synthesis filter; pulse dispersion filter; gain.]

SLIDE 25

Parameter Quantization

Parameters                    Voiced   Unvoiced
LSF parameters                25       25
Fourier magnitudes            8        -
Gain (2 per frame)            8        8
Pitch, overall voicing        7        7
Bandpass voicing              4        -
Aperiodic flag                1        -
Error protection              -        13
Sync bit                      1        1
Total bits / 22.5 ms frame    54       54
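As a quick check of the table, the per-parameter bits should sum to 54 in both modes, and 54 bits every 22.5 ms works out to 2400 bps. The dictionary and function names below are illustrative only.

```python
FRAME_MS = 22.5

# Bits per frame, copied from the table above.
VOICED_BITS = {
    "LSF parameters": 25,
    "Fourier magnitudes": 8,
    "Gain (2 per frame)": 8,
    "Pitch, overall voicing": 7,
    "Bandpass voicing": 4,
    "Aperiodic flag": 1,
    "Sync bit": 1,
}
UNVOICED_BITS = {
    "LSF parameters": 25,
    "Gain (2 per frame)": 8,
    "Pitch, overall voicing": 7,
    "Error protection": 13,
    "Sync bit": 1,
}


def bit_rate_bps(allocation, frame_ms=FRAME_MS):
    """Bits per second implied by a per-frame bit allocation."""
    return sum(allocation.values()) * 1000.0 / frame_ms


for name, allocation in (("voiced", VOICED_BITS), ("unvoiced", UNVOICED_BITS)):
    print(name, sum(allocation.values()), "bits/frame,", bit_rate_bps(allocation), "bps")
```

In unvoiced frames the 13 bits freed by dropping the Fourier magnitudes, bandpass voicing and aperiodic flag are reused for error protection, so the frame size stays constant.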

SLIDE 26

Bit transmission order


SLIDE 27

Comparison of the 2400 bps MELP with Other Standard Coders

- Diagnostic Acceptability Measure
- Two conditions
  - Quiet
  - Office
- Continuously Variable Slope Delta Modulation (CVSD)
  - 16,000 bps
- Code Excited Linear Prediction (CELP)
  - 4800 bps
  - FS1016
- Mixed Excitation Linear Prediction (MELP)
  - 2400 bps
  - FIPS Publication 137
- Linear Predictive Coding (LPC)
  - 2400 bps

SLIDE 28

Comparison of the 2400 bps MELP with Other Standard Coders (cont’d)

- Mean Opinion Score in six conditions
  - Quiet
    - Anechoic sound chamber
    - Dynamic microphone
  - Quiet - H250
    - Anechoic sound chamber
    - H250 microphone
  - 1% random bit errors
    - Anechoic sound chamber
    - Dynamic microphone
  - 0.5% random block errors
    - Anechoic sound chamber
    - Dynamic microphone
    - 50% errors within a 35 ms block
  - Office
    - Modern office environment
    - Dynamic microphone
  - Mobile Command Environment
    - Field shelter
    - EV M87 microphone

SLIDE 29

Comparison of the 2400 bps MELP with Other Standard Coders (cont’d)

- Complexity, by three measurements:
  - RAM
  - ROM
  - MIPS

SLIDE 30

Voice samples


LPC 10

SLIDE 31

Voice samples

- Original sound
- MELP 1800
- MELP 2000
- MELP 2200