Improved Soft Decisions in Missing Data ASR: Using Harmonicity in Conjunction with Local SNR Estimates
Speech and Hearing Research Group, Dept. Computer Science, University of Sheffield, UK
January 24, 2001
Fuzzy Harmonicity Mask (applied to each channel)
Noisy Signal -> Gammatone Filterbank (32 frequency channels) -> Instantaneous Envelope -> Haircell -> Autocorrelation -> Correlogram (32 frequency channels, 150 lags)
Correlogram -> Sum Across Channels -> Summary Autocorrelogram -> Pitch Peak Tracking
Peak's lag index (~1/f0): select lag from correlogram (freq, lag)
Peak's height: degree of voicing
[Diagram labels: Time, Frequency, f0 Model; parameter pairs (v, T_1), (s_1, T_1)]
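The correlogram stages above can be sketched roughly as follows. This is a minimal illustration, not the authors' implementation: it assumes the per-channel envelopes have already been computed, and the function names, the fixed-length correlation window, the lag search range (`min_lag`, `max_lag`) and the zero-lag normalisation are all assumptions made for the example.

```python
import numpy as np

def correlogram(envelopes, n_lags=150):
    """Short-time autocorrelation of every channel for lags 0..n_lags-1.

    envelopes: array (n_channels, n_samples), one row per filterbank channel.
    A fixed window is used for every lag so all lags sum the same number
    of products (an assumption of this sketch).
    """
    envelopes = np.asarray(envelopes, dtype=float)
    n_channels, n_samples = envelopes.shape
    window = n_samples - n_lags + 1
    if window < 1:
        raise ValueError("need at least n_lags samples per channel")
    acg = np.zeros((n_channels, n_lags))
    for lag in range(n_lags):
        # Product of each channel with itself delayed by `lag` samples.
        acg[:, lag] = np.sum(
            envelopes[:, :window] * envelopes[:, lag:lag + window], axis=1
        )
    return acg

def pitch_peak(acg, min_lag=20, max_lag=None):
    """Sum across channels and pick the largest value in [min_lag, max_lag]."""
    summary = acg.sum(axis=0)
    if max_lag is None:
        max_lag = len(summary) - 1
    lag = min_lag + int(np.argmax(summary[min_lag:max_lag + 1]))
    # Peak height relative to zero-lag energy: a rough degree of voicing.
    voicing = summary[lag] / summary[0] if summary[0] > 0 else 0.0
    return lag, voicing

def raw_harmonicity(acg, lag):
    """Per-channel correlogram value at the pitch lag, normalised by lag 0."""
    energy = np.maximum(acg[:, 0], 1e-12)
    return acg[:, lag] / energy
```

Channels whose envelope is periodic at the selected lag score close to 1 in `raw_harmonicity`, while channels dominated by other periodicities score lower. A real system would track the pitch peak over time rather than picking it independently per frame.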
Mask Combination
Harmonicity Mask $M_h$, SNR Mask $M_s$, weight $w$:
$M = w M_h + (1 - w) M_s$
Parameter pairs: $(s_1, T_1)$, $(s_2, T_2)$, $(s_3, T_3)$
Inputs: Raw Harmonicity Data, Degree of Voicing, Local SNR Estimate. Output: Hybrid Mask.
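The mask combination above can be illustrated with a short sketch. The slides give only the weighted sum of the harmonicity and SNR masks and three parameter pairs (s, T). Reading each pair as the slope and threshold of a sigmoid that turns raw values into soft decisions is an assumption here, as are the function names and the clipping of the weight to [0, 1].

```python
import numpy as np

def sigmoid(x, slope, threshold):
    """Soft decision in (0, 1); slope and threshold play the role of (s, T).

    Assumed form: the slides do not state the exact function.
    """
    x = np.asarray(x, dtype=float)
    return 1.0 / (1.0 + np.exp(-slope * (x - threshold)))

def combine_masks(m_h, m_s, w):
    """Hybrid mask M = w * M_h + (1 - w) * M_s.

    m_h: harmonicity mask, m_s: SNR mask, both with values in [0, 1].
    w: scalar or array weight, clipped to [0, 1] in this sketch.
    """
    m_h = np.asarray(m_h, dtype=float)
    m_s = np.asarray(m_s, dtype=float)
    w = np.clip(w, 0.0, 1.0)
    return w * m_h + (1.0 - w) * m_s
```

With `w` near 1 the hybrid mask follows the harmonicity mask, and with `w` near 0 it falls back to the local SNR mask. Because both inputs lie in [0, 1] and the weights sum to 1, the result stays in [0, 1].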
[Figure: example masks, 50 Hz to 3800 Hz, 1.7 secs. Panels: A priori Mask, Local SNR Estimate Mask, Harmonicity Based Mask, Combined Mask]
[Figure: WER vs. SNR (dB), Subway Noise. Curves: Discrete SNR, Fuzzy SNR, +Harmonicity]
[Figure: WER vs. SNR (dB), Babble Noise. Curves: Discrete SNR, Fuzzy SNR, +Harmonicity]
[Figure: WER vs. SNR (dB), Car Noise. Curves: Discrete SNR, Fuzzy SNR, +Harmonicity]
[Figure: WER vs. SNR (dB), Exhibition Noise. Curves: Discrete SNR, Fuzzy SNR, +Harmonicity]
[Figure: WER vs. SNR (dB). Curves: Discrete SNR, Fuzzy SNR, +Harmonicity, (MultiCondition)]
[Figure: WER vs. SNR (dB). Curves: 16 State Models, DC Models]
[Figure: Digit recognition accuracy vs. SNR (dB). Legend: Discrete SNR, Fuzzy SNR (ICSLP), Tuned Fuzzy, Autoc/SNR, (Apriori)]