MELP Vocoder
Outline
Introduction
MELP Vocoder Features
Algorithm Description
Parameters & Comparison
Introduction
Traditional pitch-excited LPC vocoders drive the synthesis filter with either a periodic pulse train or white noise
They produce intelligible speech at very low bit rates
But they sometimes sound mechanical or buzzy and are prone to tonal noise
Introduction (cont’d)
These problems arise from:
The inability of a simple pulse train to reproduce all kinds of voiced speech
The MELP vocoder uses a mixed-excitation model that represents a richer ensemble of speech characteristics
It produces more natural-sounding speech
MELP vocoder
Robust in background-noise environments
Based on the traditional LPC model, with additional features:
Aperiodic pulses
Adaptive spectral enhancement
Mixed excitation
Pulse dispersion
Encoder
Block diagram, blocks as labeled in the figure:
Input speech
Hamming windowing
Pitch computation
LPC computation
LPC bandwidth expansion
Prediction error filter
Prediction error
Computation of the voicing strengths and the peakiness
Final pitch computation
Fourier magnitude computation
LPC to LSF conversion
LSF sorting
Algorithm enforcing a minimum separation of 50 Hz
LSF vector
MSVQ
Codebook (array of vectors)
Quantized vector
Algorithm enforcing a minimum separation of 50 Hz
Quantization of the Fourier magnitudes
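The LSF sorting and 50 Hz minimum-separation steps named in the encoder diagram could be sketched roughly as follows. This is a simplified illustration, not the procedure of the MELP standard: the function name `enforce_min_separation` is hypothetical, and the rule of moving only the upper frequency of a too-close pair is an assumption made to keep the example short.

```python
def enforce_min_separation(lsf_hz, min_sep_hz=50.0):
    """Sort LSFs (in Hz) and widen any gap smaller than min_sep_hz.

    Assumption: only the upper frequency of a too-close pair is moved.
    A real coder may adjust both neighbours and must also keep the
    result below the Nyquist frequency, which this sketch does not check.
    """
    lsf = sorted(lsf_hz)
    for i in range(1, len(lsf)):
        if lsf[i] - lsf[i - 1] < min_sep_hz:
            lsf[i] = lsf[i - 1] + min_sep_hz
    return lsf


print(enforce_min_separation([310.0, 300.0, 1000.0]))
```

Keeping the LSFs ordered and separated matters because the synthesis filter derived from them is only guaranteed stable when the frequencies are strictly increasing.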
Position of the analysis windows
Computation of the Fourier magnitudes
The pulse-generation filter is responsible for producing the pulse train. This is done using an FFT on 200 samples of the signal and extracting the envelope of the impulse response.
Computation of the voicing strengths and determination of the aperiodic flag
1. First estimation stage (L = 40, 41, …, 160)
2. Determine the voicing strength of the lowest band
3. Determine the voicing strengths of the other 4 bands
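The first estimation stage over L = 40, 41, …, 160 can be illustrated with a small search, assuming L denotes candidate pitch lags in samples and that the best lag is the one with the highest normalized correlation. Both assumptions, and the names `normalized_correlation` and `first_stage_lag`, are illustrative and not taken from the slides.

```python
import math


def normalized_correlation(x, lag, n_len):
    """Normalized correlation between x[n] and x[n + lag] over n_len samples."""
    num = sum(x[n] * x[n + lag] for n in range(n_len))
    e0 = sum(x[n] ** 2 for n in range(n_len))
    e1 = sum(x[n + lag] ** 2 for n in range(n_len))
    den = math.sqrt(e0 * e1)
    return num / den if den > 0.0 else 0.0


def first_stage_lag(x, n_len, lags=range(40, 161)):
    """Return the candidate lag with the largest normalized correlation."""
    return max(lags, key=lambda lag: normalized_correlation(x, lag, n_len))


# Synthetic test signal: one unit pulse every 90 samples.
signal = [1.0 if n % 90 == 0 else 0.0 for n in range(400)]
best = first_stage_lag(signal, n_len=200)
print(best)
```

The input must hold at least `n_len + 160` samples so that the largest lag stays inside the buffer.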
Peakiness
\[ p = \frac{\sqrt{\frac{1}{160}\sum_{n=-80}^{79} e^{2}[n]}}{\frac{1}{160}\sum_{n=-80}^{79} \left| e[n] \right|} \]
Example values from the figure: p = 12.64, p = 6.77, p = 1.1, p = 1.16
Bit allocation table
Parameter                     Voiced   Unvoiced
LSF coefficients                25        25
Fourier magnitudes               8         -
Gain (2 per frame)               8         8
Pitch period + VS1               7         7
Voicing strengths                4         -
Aperiodic flag                   1         -
Error protection                 -        13
Synchronization bit              1         1
Total allocated bits            54        54
Mixed Excitation
Mixed excitation is implemented using a multi-band mixing model
This model can simulate frequency-dependent voicing strength
The excitation is a mixture of periodic or aperiodic pulses and white noise
The primary effect of this unit is to reduce the buzz in broadband acoustic noise
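A minimal sketch of the multi-band mixing idea, assuming the pulse and noise signals have already been split into matching frequency bands. The function name `mix_excitation` and the linear weights `v` and `1 - v` are assumptions for illustration; the slides do not specify how the per-band gains are derived from the voicing strengths.

```python
def mix_excitation(pulse_bands, noise_bands, voicing):
    """Sum band-limited pulse and noise signals, weighted per band.

    pulse_bands, noise_bands: one list of samples per band, all the same
    length and assumed to be bandpass filtered already.
    voicing: one voicing strength in [0, 1] per band.
    """
    n_samples = len(pulse_bands[0])
    out = [0.0] * n_samples
    for pulse, noise, v in zip(pulse_bands, noise_bands, voicing):
        for n in range(n_samples):
            out[n] += v * pulse[n] + (1.0 - v) * noise[n]
    return out


pulses = [[1.0, 0.0], [0.5, 0.5]]
noises = [[0.2, 0.2], [0.1, -0.1]]
# Low band fully voiced, high band fully unvoiced.
print(mix_excitation(pulses, noises, [1.0, 0.0]))
```

With every strength at 1 the result is the pure pulse excitation, and with every strength at 0 it is pure noise, which are the two cases a traditional LPC vocoder is limited to.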
Aperiodic pulses
When the input signal is voiced, the MELP vocoder can synthesize speech using either aperiodic or periodic pulses
Aperiodic pulses are used during transition regions between voiced and unvoiced segments of the speech signal
This produces erratic glottal pulses without tonal noise
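The difference between periodic and aperiodic pulse placement can be sketched as below. The jitter range of ±25% of the pitch period, the fixed random seed, and the function name `pulse_positions` are assumptions chosen for illustration; the slides do not give the jitter amount.

```python
import random


def pulse_positions(pitch_period, frame_length, aperiodic, jitter=0.25, seed=0):
    """Return pulse start positions (in samples) inside one frame.

    Periodic mode spaces pulses exactly one pitch period apart.
    Aperiodic mode scales each period by a random factor in
    [1 - jitter, 1 + jitter] (assumed range).
    """
    rng = random.Random(seed)
    positions = []
    pos = 0.0
    while pos < frame_length:
        positions.append(int(pos))
        step = float(pitch_period)
        if aperiodic:
            step *= 1.0 + rng.uniform(-jitter, jitter)
        pos += step
    return positions


print(pulse_positions(40, 180, aperiodic=False))
print(pulse_positions(40, 180, aperiodic=True))
```

Irregular spacing breaks up the strict harmonic structure of a periodic train, which is what lets the synthesizer model erratic glottal pulses without introducing tones.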
Pulse Dispersion
Pulse dispersion is implemented using a fixed pulse dispersion filter based on a flattened triangle pulse
The pulse dispersion filter improves the match of bandpass-filtered synthetic and natural speech waveforms in frequency bands that do not contain a formant resonance
It spreads the excitation energy within a pitch period
It reduces the harsh quality of the synthetic speech
Adaptive spectral enhancement filter
Based on the poles of the vocal tract filter
Used to enhance the formant structure in the synthetic speech
Improves the match between synthetic and natural bandpass waveforms, giving more natural speech output
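One common way to derive an enhancement filter from the vocal tract poles is bandwidth expansion of the LPC polynomial A(z), where coefficient k is scaled by a factor raised to the power k. The sketch below shows only that coefficient scaling. The pole-zero form A(z/0.5)/A(z/0.8), the factors 0.5 and 0.8, and the example predictor are assumptions for illustration, not values stated in the slides.

```python
def bandwidth_expand(a, gamma):
    """Return the coefficients of A(z / gamma) for A(z) = sum(a[k] * z**-k).

    Scaling a[k] by gamma**k multiplies every root radius by gamma,
    so gamma < 1 pulls the roots toward the origin and widens the peaks.
    """
    return [coef * gamma ** k for k, coef in enumerate(a)]


# Hypothetical 2nd-order predictor with complex poles at radius 0.9.
a = [1.0, -1.6, 0.81]
numerator = bandwidth_expand(a, 0.5)    # zeros at radius 0.45
denominator = bandwidth_expand(a, 0.8)  # poles at radius 0.72
print(numerator, denominator)
```

Because the numerator and denominator share the same formant frequencies but differ in radius, a filter built this way emphasizes formant peaks relative to the valleys between them.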
MELP Algorithm Description (Encoder)
1. Filter out any low-frequency noise
2. The filtered speech is filtered again to perform the initial pitch search for the pitch estimate
3. Perform the bandpass voicing analysis
   In this step the encoder decides whether to use the periodic/aperiodic pulse train or the white-noise model
MELP Algorithm Description (Encoder) cont’d
In this stage a voicing strength parameter is estimated in each band, based on the normalized correlation function of the speech signal and of the smoothed rectified signal in the non-DC bands
Let s_k(n) denote the speech signal in band k and u_k(n) the DC-removed, smoothed, rectified version of s_k(n). The correlation function is:
\[ R_x(P) = \frac{\sum_{n=1}^{N} x(n)\, x(n+P)}{\left[ \sum_{n=1}^{N} x^{2}(n) \sum_{n=1}^{N} x^{2}(n+P) \right]^{1/2}} \]
P – the pitch of the current frame
N – the frame length
Voicing strength for band k – defined as max(R_{s_k}(P), R_{u_k}(P))
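The per-band voicing strength described on this slide can be sketched as follows, assuming the band signal s_k and its smoothed rectified envelope u_k are already available as sample lists. The function names, the 0-based indexing, and the use of x(n + P) for the shifted term are assumptions of this sketch.

```python
import math


def normalized_correlation(x, pitch, n_len):
    """Normalized correlation of x at lag `pitch` over n_len samples."""
    num = sum(x[n] * x[n + pitch] for n in range(n_len))
    den = math.sqrt(sum(x[n] ** 2 for n in range(n_len))
                    * sum(x[n + pitch] ** 2 for n in range(n_len)))
    return num / den if den > 0.0 else 0.0


def band_voicing_strength(s_k, u_k, pitch, n_len):
    """Voicing strength of one band: the larger of the two correlations."""
    return max(normalized_correlation(s_k, pitch, n_len),
               normalized_correlation(u_k, pitch, n_len))


# Band signal with a period of 20 samples; an all-zero envelope as a stand-in.
s_k = [math.sin(2.0 * math.pi * n / 20.0) for n in range(200)]
u_k = [0.0] * 200
print(band_voicing_strength(s_k, u_k, pitch=20, n_len=100))
```

Taking the maximum lets a band count as voiced when either the waveform itself or its envelope repeats at the pitch period, which helps in higher bands where the fine structure is less regular.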
MELP Algorithm Description (Encoder ) cont’d
The jittery state is determined by the peakiness of the full-wave rectified LP residual e(n):
\[ \text{Peakiness} = \frac{\left[ \frac{1}{N} \sum_{n=1}^{N} e^{2}(n) \right]^{1/2}}{\frac{1}{N} \sum_{n=1}^{N} \left| e(n) \right|} \]
If the peakiness is greater than some threshold, the speech frame is flagged as jittered (the aperiodic flag is set)
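The peakiness measure is the ratio of the RMS value of the residual to its mean absolute value, which can be sketched as below. The function names are illustrative, and the threshold is left as a required argument because the slides do not state its value.

```python
import math


def peakiness(e):
    """Ratio of the RMS value to the mean absolute value of the residual e."""
    n = len(e)
    rms = math.sqrt(sum(v * v for v in e) / n)
    mean_abs = sum(abs(v) for v in e) / n
    return rms / mean_abs if mean_abs > 0.0 else 0.0


def is_jittered(e, threshold):
    """Flag the frame when its peakiness exceeds the caller-supplied threshold."""
    return peakiness(e) > threshold


impulse = [0.0] * 160
impulse[80] = 1.0
flat = [2.0] * 160
print(peakiness(impulse), peakiness(flat))
```

For a frame of N samples the measure ranges from 1 (constant magnitude) to the square root of N (a single impulse), so a 160-sample frame tops out near 12.65.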
MELP Algorithm Description (Encoder) cont’d
4. Apply LPC analysis
5. Calculate the final pitch estimate
6. Calculate the gain estimate
7. Quantize the LPC coefficients, pitch, gain and bandpass voicing
8. Determine and quantize the Fourier magnitudes
   The information in these coefficients improves the accuracy of the speech production model at the perceptually important lower frequencies
MELP Encoder
Block diagram, blocks as labeled in the figure:
Input signal
Pre-filter
Pitch search
Bandpass voicing decision
Gain calculator
LPC analysis filter
Final pitch and voicing decision
LSF quantization
Quantize gain, pitch, voicing, jitter
Fourier magnitude calculation
Apply forward error correction
Transmitted bitstream
MELP Algorithm (Decoder)
1. Decode the pitch
2. Apply gain attenuation
3. Linearly interpolate all synthesis parameters pitch-synchronously
4. Generate the mixed excitation
MELP Algorithm (Decoder) cont’d
5. Apply the adaptive spectral enhancement filter
6. Perform LPC synthesis and apply the gain factor
7. Apply dispersion filtering
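Step 3 of the decoder, pitch-synchronous linear interpolation, can be sketched as computing one set of parameters per pitch pulse, weighted by where the pulse starts inside the frame. The frame length of 180 samples assumes 8 kHz sampling of a 22.5 ms frame, and the function and parameter names are hypothetical.

```python
def interpolate_parameters(prev, curr, pulse_start, frame_length=180):
    """Linearly interpolate each parameter between two frames.

    prev, curr: dicts with the same keys (previous and current frame values).
    pulse_start: start of the pitch pulse in samples from the frame start.
    frame_length: 180 samples is an assumption (22.5 ms at 8 kHz).
    """
    w = pulse_start / frame_length
    return {name: (1.0 - w) * prev[name] + w * curr[name] for name in prev}


prev = {"pitch": 40.0, "gain": 10.0}
curr = {"pitch": 60.0, "gain": 20.0}
print(interpolate_parameters(prev, curr, pulse_start=90))
```

Updating the parameters once per pitch pulse rather than once per frame avoids abrupt changes at frame boundaries.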
MELP Decoder
Block diagram, blocks as labeled in the figure:
Received bitstream
Decode parameters
Noise generator
Noise shaping filter
Pulse generator
Pulse position jitter
Pulse shaping filter
Summing node (+)
Adaptive spectral enhancement
LPC synthesis filter
Gain
Pulse dispersion filter
Synthesized speech
Parameter Quantization
Parameters                    Voiced   Unvoiced
LSF parameters                  25        25
Fourier magnitudes               8         -
Gain (2 per frame)               8         8
Pitch, overall voicing           7         7
Bandpass voicing                 4         -
Aperiodic flag                   1         -
Error protection                 -        13
Sync bit                         1         1
Total bits / 22.5 ms frame      54        54
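As a quick arithmetic check, the per-parameter allocations add up to 54 bits in both modes, and 54 bits every 22.5 ms corresponds to 2400 bps. The dictionary layout below is only an illustration, with `None` standing for the "-" entries of the table.

```python
FRAME_MS = 22.5

# (voiced bits, unvoiced bits); None marks a parameter that is not sent.
BITS = {
    "LSF parameters": (25, 25),
    "Fourier magnitudes": (8, None),
    "Gain (2 per frame)": (8, 8),
    "Pitch, overall voicing": (7, 7),
    "Bandpass voicing": (4, None),
    "Aperiodic flag": (1, None),
    "Error protection": (None, 13),
    "Sync bit": (1, 1),
}

voiced_total = sum(v for v, _ in BITS.values() if v is not None)
unvoiced_total = sum(u for _, u in BITS.values() if u is not None)
bit_rate = voiced_total * 1000 / FRAME_MS

print(voiced_total, unvoiced_total, bit_rate)
```

In unvoiced frames the 13 bits freed by dropping the Fourier magnitudes, bandpass voicing and aperiodic flag are spent on error protection, so the frame size stays constant.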
Bit transmission order
Comparison of the 2400 bps MELP with Other Standard Coders
Diagnostic Acceptability Measure
Two Conditions
Quiet
Office
Continuously Variable Slope Delta Modulation (CVSD)
16,000 bps
Code Excited Linear Prediction (CELP)
4800 bps
FS1016
Mixed Excitation Linear Prediction (MELP)
2400 bps
FIPS Publication 137
Linear Predictive Coding (LPC)
2400 bps
Comparison of the 2400 bps MELP with Other Standard Coders (cont’d)
Mean Opinion Score in Six Conditions
Quiet
Anechoic Sound Chamber
Dynamic Microphone
Quiet - H250
Anechoic Sound Chamber
H250 Microphone
1% Random Bit Errors
Anechoic Sound Chamber
Dynamic Microphone
0.5% Random Block Errors
Anechoic Sound Chamber
Dynamic Microphone
50% Errors within a 35ms block
Office
Modern Office Environment
Dynamic Microphone
Mobile Command Environment
Field Shelter
EV M87 Microphone
Comparison of the 2400 bps MELP with Other Standard Coders (cont’d)
Complexity, by three measurements:
RAM
ROM
MIPS
Voice samples
LPC 10
Original Sound
MELP 1800
MELP 2000
MELP 2200