MELP Vocoder
Outline
Introduction
MELP Vocoder Features
Algorithm Description
Parameters & Comparison

Introduction
Traditional pitch-excited LPC vocoders use either a periodic train or …
But this sometimes results in a mechanical or buzzy sound and is prone to tonal noise
Inability of a simple pulse train to reproduce all kinds of voiced speech
Produce more natural sounding speech
Robust in background noise environments
Based on the traditional LPC model, but also includes additional features
Aperiodic pulses
Adaptive spectral enhancement
Mixed excitation
Pulse dispersion
[Figure: LPC, LSF, and MSVQ blocks]
Calculating the voicing strengths and determining the aperiodic flag
L=40,41,…,160
Peakiness (degree of peak dispersion)

$$p = \frac{\sqrt{\frac{1}{160}\sum_{n=-80}^{79} e^{2}[n]}}{\frac{1}{160}\sum_{n=-80}^{79} \lvert e[n] \rvert}$$

Example values from the figure: P=12.64, P=6.77, P=1.1, P=1.16
Bit allocation table (LSF: 25 / 25 bits; total: 54 / 54 bits)
The mixed excitation model can simulate frequency-dependent voicing strength
The primary effect of this unit is to reduce the buzz in broadband acoustic noise
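The frequency-dependent mixing described above can be sketched as follows. This is a minimal illustration under stated assumptions, not the coder's actual filter bank: NumPy, an 8 kHz sampling rate, five bands with the edges shown, and FFT masks in place of bandpass filters are all illustrative choices, and `mixed_excitation` is a hypothetical name.

```python
import numpy as np

def mixed_excitation(pitch, voicing, n=360, fs=8000, seed=0):
    """Mix a pulse train and white noise band by band.

    voicing[k] in [0, 1] is the voicing strength of band k
    (1 = fully periodic, 0 = fully noisy).
    """
    # Assumed band edges in Hz (illustrative, not from the slides).
    edges = [0, 500, 1000, 2000, 3000, 4000]
    rng = np.random.default_rng(seed)

    pulses = np.zeros(n)
    pulses[::pitch] = np.sqrt(pitch)   # roughly unit-power periodic pulse train
    noise = rng.standard_normal(n)     # roughly unit-power white noise

    P = np.fft.rfft(pulses)
    W = np.fft.rfft(noise)
    freqs = np.fft.rfftfreq(n, d=1.0 / fs)

    mixed = np.zeros_like(P)
    for k, v in enumerate(voicing):
        lo, hi = edges[k], edges[k + 1]
        last = (k == len(edges) - 2)
        # The last band also includes the Nyquist bin.
        band = (freqs >= lo) & ((freqs <= hi) if last else (freqs < hi))
        # Per-band weighted sum of the periodic and noise spectra.
        mixed[band] = v * P[band] + (1.0 - v) * W[band]
    return np.fft.irfft(mixed, n)

# Strongly voiced low bands, noisy high bands.
excitation = mixed_excitation(pitch=60, voicing=[1.0, 0.9, 0.6, 0.2, 0.0])
```

With all strengths at 1 the output is the plain pulse train, and with all at 0 it is plain noise; intermediate values give the partially voiced excitation that a binary voiced/unvoiced switch cannot produce.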
Aperiodic pulses produce erratic glottal pulses without tonal noise
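One way to picture aperiodic pulses is to perturb each pitch period by a random amount. The sketch below assumes a uniform perturbation of up to 25% of the period; that figure, the function name `pulse_positions`, and the fixed seed are assumptions for illustration only.

```python
import numpy as np

def pulse_positions(pitch, n, aperiodic, jitter=0.25, seed=0):
    """Return pulse positions (sample indices) within n samples.

    When aperiodic is True, each period is scaled by a uniform random
    factor in [1 - jitter, 1 + jitter]. The 25% default is an assumed
    value, not taken from the slides.
    """
    rng = np.random.default_rng(seed)
    positions = []
    t = 0.0
    while t < n:
        positions.append(int(t))       # floor keeps every index below n
        period = float(pitch)
        if aperiodic:
            period *= 1.0 + jitter * rng.uniform(-1.0, 1.0)
        t += period
    return positions

regular = pulse_positions(50, 200, aperiodic=False)
erratic = pulse_positions(50, 200, aperiodic=True)
```

The regular train places a pulse every 50 samples, while the jittered train keeps the same average rate but breaks the strict periodicity.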
Pulse dispersion is implemented using a fixed pulse dispersion filter based on a flattened triangle pulse
The pulse dispersion filter improves the match of bandpass-filtered synthetic and natural speech waveforms in frequency bands which do not contain a formant resonance
Spreads the excitation energy within a pitch period
Reduces the harsh quality of the synthetic speech
Adaptive spectral enhancement is used to enhance the formant structure in the synthetic speech
This filter improves the match between synthetic and natural bandpass waveforms, giving more natural-sounding speech
1.
2. This filtered speech is again filtered in …
3. The next step is to perform the bandpass voicing analysis
white noise model
In this stage, a voicing degree parameter is estimated in each band, based on the normalized correlation function of the speech signal and the smoothed rectified signal in the non-DC band
Let s_k(n) denote the speech signal in band k and u_k(n) denote the DC-removed, smoothed, rectified signal of s_k(n). The correlation function is:

$$R_x(p) = \frac{\sum_{n=1}^{N} x(n)\,x(n+p)}{\left[\sum_{n=1}^{N} x^{2}(n)\,\sum_{n=1}^{N} x^{2}(n+p)\right]^{1/2}}$$

P – the pitch of the current frame
N – the frame length
The voicing strength for band k is defined as max(R_{s_k}(P), R_{u_k}(P))
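The correlation and voicing-strength definitions above can be sketched in a few lines. This assumes NumPy, 0-based indexing (n = 0..N-1 instead of 1..N), and input arrays with at least P + N samples; the function names are hypothetical, and the band signals s_k and u_k are taken as given.

```python
import numpy as np

def normalized_correlation(x, p, N):
    """R_x(p) = sum x(n) x(n+p) / sqrt(sum x(n)^2 * sum x(n+p)^2)."""
    x = np.asarray(x, dtype=float)
    a = x[:N]              # x(n)
    b = x[p:p + N]         # x(n + p)
    denom = np.sqrt(np.dot(a, a) * np.dot(b, b))
    # Return 0 for an all-zero window instead of dividing by zero.
    return float(np.dot(a, b) / denom) if denom > 0 else 0.0

def band_voicing_strength(s_k, u_k, P, N):
    """Voicing strength of band k: max(R_{s_k}(P), R_{u_k}(P))."""
    return max(normalized_correlation(s_k, P, N),
               normalized_correlation(u_k, P, N))

# A signal that repeats every 40 samples correlates fully at lag 40.
n = np.arange(400)
s = np.sin(2 * np.pi * n / 40)
strength = band_voicing_strength(s, np.zeros(400), P=40, N=160)
```

A perfectly periodic band gives a strength near 1 at the true pitch lag, while a noise-like band gives a value near 0.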
The jittery state is determined by the peakiness of the full-wave rectified LP residual e(n):

$$\text{Peakiness} = \frac{\left[\frac{1}{N}\sum_{n=1}^{N} e^{2}(n)\right]^{1/2}}{\frac{1}{N}\sum_{n=1}^{N} \lvert e(n) \rvert}$$
If the peakiness is greater than some threshold, the speech frame is flagged as jittered (the aperiodic flag is set)
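The peakiness test can be sketched as the ratio of the RMS value to the mean absolute value of the residual. The sketch assumes NumPy; the function names are hypothetical, and the threshold is left as a caller-supplied parameter because the slides do not give its value.

```python
import numpy as np

def peakiness(e):
    """Ratio of RMS to mean absolute value of the LP residual e(n)."""
    e = np.asarray(e, dtype=float)
    rms = np.sqrt(np.mean(e ** 2))
    mean_abs = np.mean(np.abs(e))
    return float(rms / mean_abs) if mean_abs > 0 else 0.0

def is_jittered(e, threshold):
    """Flag the frame as jittered when peakiness exceeds the threshold."""
    return peakiness(e) > threshold

# A single impulse in a 160-sample window is the most "peaky" case.
impulse = np.zeros(160)
impulse[80] = 1.0
flat = np.ones(160)
```

For a single impulse in N samples the measure equals the square root of N (about 12.65 for N = 160), and for a constant-magnitude residual it equals 1, so larger values indicate energy concentrated in a few isolated pulses.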
4. Applying an LPC analysis
5. Calculating final pitch estimate
6. Calculating gain estimate
7. Quantizing the LPC coefficients, pitch, gain and bandpass voicing
8. Fourier magnitudes are determined and quantized
The information in these coefficients improves the accuracy of the speech production model at the perceptually-important lower frequencies
[Figure: MELP encoder block diagram. Blocks: input signal, pre-filter, pitch search, bandpass voicing decision, gain calculator, LPC analysis filter, final pitch and voicing decision, LSF quantization, quantization of gain, pitch, voicing and jitter, Fourier magnitude calculation, forward error correction, transmitted bitstream]
1.
2. Applying gain attenuation
3. Interpolating linearly all of the synthesis parameters pitch-synchronously
4. Generating mixed excitation
5.
6. LPC synthesis and applying gain factor
7.
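The pitch-synchronous linear interpolation in step 3 can be sketched as follows: at the start of each pitch period, every parameter is interpolated between the previous and current frame values according to the position in the frame. The 180-sample frame (22.5 ms at an assumed 8 kHz rate), the dictionary layout, and the function names are assumptions for illustration.

```python
def interpolate_params(prev, cur, t0, frame_len=180):
    """Linearly interpolate every parameter at sample offset t0."""
    alpha = t0 / frame_len          # 0 at frame start, 1 at frame end
    return {name: (1.0 - alpha) * prev[name] + alpha * cur[name]
            for name in cur}

def pitch_synchronous_params(prev, cur, frame_len=180):
    """Step through one frame, one pitch period at a time."""
    t0 = 0
    out = []
    while t0 < frame_len:
        p = interpolate_params(prev, cur, t0, frame_len)
        out.append((t0, p))
        # Advance by the interpolated pitch period (at least 1 sample).
        t0 += max(1, int(round(p["pitch"])))
    return out

prev = {"pitch": 40.0, "gain": 10.0}
cur = {"pitch": 60.0, "gain": 20.0}
periods = pitch_synchronous_params(prev, cur)
```

Each entry of `periods` holds the start offset of a pitch period and the parameter values used to synthesize it, so the pitch and gain change gradually across the frame rather than jumping at frame boundaries.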
[Figure: MELP decoder block diagram. Blocks: received bitstream, decode parameters, noise generator, noise shaping filter, pulse generator, pulse position jitter, pulse shaping filter, summation, adaptive spectral enhancement, LPC synthesis filter, gain, pulse dispersion filter, synthesized speech]
Parameters                   Voiced   Unvoiced
LSF parameters               25       25
Fourier magnitudes           8        -
Gain (2 per frame)           8        8
Pitch, overall voicing       7        7
Bandpass voicing             4        -
Aperiodic flag               1        -
Error protection             -        13
Sync bit                     1        1
Total bits / 22.5 ms frame   54       54
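The frame total ties directly to the coder's bit rate, which the short check below works out. The 8 kHz sampling rate used for the samples-per-frame figure is an assumption not stated in the table, and the function name is hypothetical.

```python
BITS_PER_FRAME = 54      # total from the table
FRAME_MS = 22.5          # frame duration from the table
SAMPLE_RATE_HZ = 8000    # assumed sampling rate (not stated in the table)

def bit_rate_bps(bits_per_frame, frame_ms):
    """Bits per second for a fixed number of bits per frame."""
    return bits_per_frame * 1000.0 / frame_ms

rate = bit_rate_bps(BITS_PER_FRAME, FRAME_MS)           # 2400.0
samples_per_frame = SAMPLE_RATE_HZ * FRAME_MS / 1000.0  # 180.0
```

So 54 bits every 22.5 ms gives the 2400 bps rate, for voiced and unvoiced frames alike.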
Comparison of the 2400 bps MELP with …
Diagnostic Acceptability Measure
Two Conditions
  Quiet
  Office
Continuously Variable Slope Delta Modulation (CVSD)
  ○ 16,000 bps
Code Excited Linear Prediction (CELP)
  ○ 4800 bps
  ○ FS1016
Mixed Excitation Linear Prediction (MELP)
  ○ 2400 bps
  ○ FIPS Publication 137
Linear Predictive Coding (LPC)
  ○ 2400 bps
Mean Opinion Score in Six Conditions
Quiet
  Anechoic Sound Chamber
  Dynamic Microphone
Quiet - H250
  Anechoic Sound Chamber
  H250 Microphone
1% Random Bit Errors
  Anechoic Sound Chamber
  Dynamic Microphone
0.5% Random Block Errors
  Anechoic Sound Chamber
  Dynamic Microphone
  50% Errors within a 35 ms block
Office
  Modern Office Environment
  Dynamic Microphone
Mobile Command Environment
  Field Shelter
  EV M87 Microphone
Complexity with three Measurements
RAM
ROM
MIPS
LPC 10
Original Sound
MELP 1800
MELP 2000
MELP 2200