A Priori SNR Estimation Using Weibull Mixture Model

12. ITG Fachtagung Sprachkommunikation (12th ITG Conference on Speech Communication)
Aleksej Chinaev, Jens Heitkaemper, Reinhold Haeb-Umbach
Department of Communications Engineering, Paderborn University
7 October 2016
Computer Science, Electrical Engineering and Mathematics; Communications Engineering (NT), Prof. Dr.-Ing. Reinhold Häb-Umbach

Table of contents
1 Problem formulation and motivation
2 A priori SNR estimation based on Weibull mixture model
3 Experimental evaluation
4 Conclusions and outlook

Problem formulation and motivation
Single-channel clean speech $s(t)$ contaminated by an additive noise $n(t)$. The STFT maps $y(t) = s(t) + n(t)$ to $Y(k,\ell) = S(k,\ell) + N(k,\ell)$, with frequency bin $k$ and frame index $\ell$.
[Block diagram: $y(t)$ → STFT → $Y(k,\ell)$ → $|\cdot|^2$ → noise PSD tracker, giving the noise power spectral density (PSD) estimate $\hat{\lambda}_N(k,\ell)$ → a priori SNR estimator, giving $\hat{\xi}(k,\ell)$ → gain function $G(k,\ell)$, applied to $Y(k,\ell)$ → $\hat{S}(k,\ell)$ → ISTFT → $\hat{s}(t)$]
A priori SNR $\xi(k,\ell) = \lambda_S(k,\ell) / \lambda_N(k,\ell)$: a key component in an enhancement system
$\lambda_S(k,\ell) = E[|S(k,\ell)|^2]$: clean speech PSD; $\lambda_N(k,\ell) = E[|N(k,\ell)|^2]$: noise PSD
Motivated by generalized spectral subtraction (GSS): denoising $|Y(k,\ell)|^\alpha$ for $\alpha \in \mathbb{R}_{>0}$, not restricted to $\alpha = 1$ or $\alpha = 2$, with the assumption $|Y(k,\ell)|^\alpha = |S(k,\ell)|^\alpha + |N(k,\ell)|^\alpha$
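
As a minimal sketch of the definitions on this slide, the a priori SNR of one frequency bin can be computed from synthetic data with known PSDs. The helper `complex_gaussian`, the PSD values and the replacement of the expectations by sample means are assumptions made for illustration only; they are not taken from the slides.

```python
import numpy as np

def complex_gaussian(rng, variance, size):
    """Zero-mean circular complex Gaussian samples with E[|X|^2] = variance."""
    scale = np.sqrt(variance / 2.0)
    return scale * (rng.standard_normal(size) + 1j * rng.standard_normal(size))

rng = np.random.default_rng(0)
num_frames = 200_000
lambda_s, lambda_n = 4.0, 1.0   # hypothetical speech and noise PSDs of one bin k

S = complex_gaussian(rng, lambda_s, num_frames)   # clean speech coefficients S(k, l)
N = complex_gaussian(rng, lambda_n, num_frames)   # noise coefficients N(k, l)
Y = S + N                                         # additive model in the STFT domain

# PSDs are expectations of squared magnitudes; sample means stand in for them here
lambda_s_hat = np.mean(np.abs(S) ** 2)
lambda_n_hat = np.mean(np.abs(N) ** 2)

xi = lambda_s_hat / lambda_n_hat                  # a priori SNR (linear scale)
xi_db = 10.0 * np.log10(xi)                       # a priori SNR in dB
print(f"a priori SNR: {xi:.2f} ({xi_db:.1f} dB)")
```

In a real enhancement system neither $S(k,\ell)$ nor $N(k,\ell)$ is observed, which is why both PSDs have to be estimated from $Y(k,\ell)$.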

Normalized $\alpha$-order magnitude (NAOM) domain
[Block diagram of the a priori SNR estimator with inputs $|Y(k,\ell)|^2$ and $\hat{\lambda}_N(k,\ell)$: estimate $P_{S_\alpha}(k)$ and go into the NAOM domain, giving $Y_\alpha(k,\ell)$ and $\lambda_{N_\alpha}(k,\ell)$ → estimate the parameters $\lambda_m(k,\ell)$, $\pi_m(k,\ell)$ of the WMM $p_{S_\alpha}(s)$ → estimate the clean speech NAOMs $\hat{S}_\alpha(k,\ell)$ → calculate the a priori SNR $\hat{\xi}(k,\ell)$]
Normalize $|Y(k,\ell)|^\alpha$ to a root of an averaged power $P_{S_\alpha}(k)$ of $|S(k,\ell)|^\alpha$:
$Y_\alpha(k,\ell) = \frac{|Y(k,\ell)|^\alpha}{\sqrt{P_{S_\alpha}(k)}} = S_\alpha(k,\ell) + N_\alpha(k,\ell)$ with $P_{S_\alpha}(k) = \frac{1}{L} \sum_{\ell=1}^{L} |S(k,\ell)|^{2\alpha}$
Statistical models independent of speaker loudness
Normalized energy of clean speech NAOMs: $E[S_\alpha^2(k)] = 1$
$S_\alpha(k,\ell)$ and $N_\alpha(k,\ell)$ are realizations of the random variables $S_\alpha(k)$ and $N_\alpha(k)$
Estimate $S_\alpha(k,\ell)$ from $Y_\alpha(k,\ell)$ given models for $S_\alpha(k)$ and $N_\alpha(k)$
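
The normalization can be sketched as follows, assuming the clean speech magnitudes are available as an array of shape (bins, frames). The function name `to_naom`, the synthetic Rayleigh magnitudes and the per-bin loudness values are illustrative assumptions, not part of the slides.

```python
import numpy as np

def to_naom(mag, p_s_alpha, alpha):
    """Map magnitudes |X(k, l)| to NAOMs |X(k, l)|^alpha / sqrt(P_S_alpha(k))."""
    return mag ** alpha / np.sqrt(p_s_alpha)[:, None]

rng = np.random.default_rng(1)
alpha = 0.7
num_bins, num_frames = 4, 1000

# Stand-in for clean speech magnitudes with very different loudness per bin
loudness = np.array([0.1, 1.0, 5.0, 20.0])[:, None]
S_mag = loudness * rng.rayleigh(size=(num_bins, num_frames))

# P_S_alpha(k) = (1/L) * sum over frames of |S(k, l)|^(2 alpha)
P = np.mean(S_mag ** (2 * alpha), axis=1)

S_alpha = to_naom(S_mag, P, alpha)
energy = np.mean(S_alpha ** 2, axis=1)   # normalized energy per bin
print(energy)
```

By construction the normalized energy is 1 in every bin regardless of loudness, which is the property the slide uses to make the statistical models independent of speaker loudness.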

Modeling of noise NAOM coefficients $N_\alpha(k,\ell)$
$N(k,\ell) \sim \mathcal{N}_c(n; 0, \lambda_N(k,\ell))$
$N_\alpha(k,\ell)$ is Weibull distributed: $p_{N_\alpha(k,\ell)}(n) = \mathrm{Weib}(n; \lambda_{N_\alpha}(k,\ell), \alpha)$
Shape parameter $\alpha \in \mathbb{R}_{>0}$
Scale parameter $\lambda_{N_\alpha}(k,\ell) = \frac{\lambda_N(k,\ell)}{\sqrt[\alpha]{P_{S_\alpha}(k)}} \in \mathbb{R}_{>0}$
[Figure: Weibull PDF $\mathrm{Weib}(n; 1, \alpha)$ for $\lambda = 1$ and $\alpha \in \{0.5, 1, 1.5, 2\}$, plotted over $n$ from 0 to 2]
Model $N_\alpha(k)$ with the Weibull PDF $p_{N_\alpha(k)}(n) = \mathrm{Weib}(n; \lambda_{N_\alpha}(k), \alpha)$ with $\lambda_{N_\alpha}(k) = \frac{1}{L} \sum_{\ell=1}^{L} \lambda_{N_\alpha}(k,\ell)$
[Figure: Histogram and Weibull PDF for $\alpha = 0.7$: NAOM coefficients of a white noise signal and the estimated $p_{N_\alpha(k)}(n)$, plotted over $n$ from 0 to 0.9]
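
The Weibull claim can be checked numerically. The sketch below rests on one assumption that the slides do not spell out: that $\mathrm{Weib}(n; \lambda, \alpha)$ is parameterized such that $n^{2/\alpha}$ is exponentially distributed with mean $\lambda$. Under that assumption, dividing $\lambda_N$ by the $\alpha$-th root of $P_{S_\alpha}$ gives the scale of the noise NAOMs. All numeric values are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(2)
alpha, lambda_n, p_s_alpha = 0.7, 2.0, 3.0   # hypothetical values for one bin
num_frames = 400_000

# Complex Gaussian noise coefficients with E[|N|^2] = lambda_n
N = np.sqrt(lambda_n / 2) * (
    rng.standard_normal(num_frames) + 1j * rng.standard_normal(num_frames)
)

# Noise NAOMs: |N|^alpha normalized by the root of the averaged speech power
N_alpha = np.abs(N) ** alpha / np.sqrt(p_s_alpha)

# Power transform: N_alpha^(2/alpha) = |N|^2 / P^(1/alpha), which is exponential
u = N_alpha ** (2 / alpha)
scale_expected = lambda_n / p_s_alpha ** (1 / alpha)

# An exponential variable has mean and standard deviation equal to its scale
print(u.mean() / scale_expected, u.std() / scale_expected)
```

Both printed ratios should be close to 1, since $|N|^2$ of a complex Gaussian coefficient is exponentially distributed.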

Modeling of NAOM coefficients of clean speech $S_\alpha(k,\ell)$
$S(k,\ell) \sim \mathcal{N}_c(s; 0, \lambda_S(k,\ell))$
Bimodal Weibull mixture model (WMM) to model $S_\alpha(k)$:
$p_{S_\alpha(k)}(s) = \sum_{m=1}^{2} \pi_m(k) \cdot \mathrm{Weib}(s; \lambda_m(k), \beta)$
$m = 1$: silence; $m = 2$: activity
$\pi_m(k) \in [0, 1]$: weights
$\lambda_m(k)$: scale parameters
$\beta$: shape parameter
$\beta \neq \alpha$: additional degree of freedom in the model
[Figure: Histogram and estimated WMM: clean speech NAOMs and the estimated bimodal WMM ($\alpha = 0.7$; $\beta = 2.5$) with its $m = 1$ and $m = 2$ components; $p_{S_\alpha}(s)$ on a logarithmic axis from 0.1 to 10 over $s$ from 0 to 1.5]

Estimation of WMM parameters and clean speech NAOMs
[Block diagram of the a priori SNR estimator, stage "Estimate parameters of WMM $p_{S_\alpha}(s)$"]
Set $\lambda_1(k)$ according to the $\xi_{\min}$ usually used in a priori SNR estimation [Cappe 94]
Expectation Maximization (EM) algorithm to estimate $\lambda_2(k)$, $\pi_m(k)$
After EM, the weights $\pi_m(k)$ are corrected with the constraint $E[S_\alpha^2(k)] = 1$
[Block diagram of the a priori SNR estimator, stage "Estimate clean speech NAOMs"]
Maximum a posteriori (MAP) estimation: $\hat{S}_\alpha^{\mathrm{MAP}}(k,\ell) = \operatorname{argmax}_s \; p_{S_\alpha(k) | Y_\alpha(k,\ell)}(s \mid y)$
$Y_\alpha(k,\ell)$ is a realisation of the random variable $Y_\alpha(k) = S_\alpha(k) + N_\alpha(k)$
Approximative computationally efficient solution for $\beta = \alpha = 1$
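
A rough sketch of the EM step is given below; it is not the authors' implementation. It uses the standard Weibull parameterization (shape $c$, scale $b$), which may differ from the slides' $\mathrm{Weib}(\cdot; \lambda, \beta)$, and it exploits the fact that a Weibull variable raised to the power of its shape is exponentially distributed, so the mixture with a common known shape becomes a mixture of exponentials. The scale of the silence component is held fixed, as on the slide; the weight correction and the MAP step are omitted. All names and numeric values are hypothetical.

```python
import numpy as np

def em_fixed_first_scale(u, theta1, theta2, pi2, num_iter=100):
    """EM for (1 - pi2) * Exp(theta1) + pi2 * Exp(theta2), theta1 fixed (means)."""
    for _ in range(num_iter):
        # E-step: posterior probability of the "activity" component m = 2
        p1 = (1.0 - pi2) / theta1 * np.exp(-u / theta1)
        p2 = pi2 / theta2 * np.exp(-u / theta2)
        gamma2 = p2 / (p1 + p2)
        # M-step: update the weight and the scale of component 2 only
        pi2 = gamma2.mean()
        theta2 = np.sum(gamma2 * u) / np.sum(gamma2)
    return theta2, pi2

rng = np.random.default_rng(3)
beta = 2.5                            # common shape parameter
b1, b2, true_pi2 = 0.05, 1.0, 0.4     # hypothetical scales and activity weight
num_samples = 50_000

# Draw from the two-component Weibull mixture (silence and activity)
active = rng.random(num_samples) < true_pi2
s = np.where(active, b2, b1) * rng.weibull(beta, size=num_samples)

u = s ** beta                         # power transform: Weibull -> exponential
theta2_hat, pi2_hat = em_fixed_first_scale(
    u, theta1=b1 ** beta, theta2=0.5, pi2=0.5
)
b2_hat = theta2_hat ** (1.0 / beta)   # back to the Weibull scale
print(b2_hat, pi2_hat)
```

Fixing the silence component leaves only two free quantities per bin, which keeps each EM iteration cheap.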

Calculation of a priori SNR and causal implementation
[Block diagram of the a priori SNR estimator, stage "Calculate a priori SNR"]
Go back into the domain of power spectral density by calculating
$\hat{\xi}(k,\ell) = \max\left( \frac{\left[ \hat{S}_\alpha(k,\ell) \cdot \sqrt{P_{S_\alpha}(k)} \right]^{2/\alpha}}{\hat{\lambda}_N(k,\ell)}, \; \xi_{\min} \right)$
Causal implementation of WMM-based a priori SNR estimators
Calculate $P_{S_\alpha}(k)$ and $\lambda_{N_\alpha}(k)$ in a causal way
Causal EM for $\lambda_2(k)$ and $\pi_2(k)$ with one EM iteration per time frame
Note: the parameters $\alpha$ and $\beta$ have to be set appropriately → optimization
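
The back-transformation can be written as a short function. This is a sketch: the function name and argument names are hypothetical, and the default floor of -18 dB is the value quoted in the experimental setup of this talk.

```python
import numpy as np

def a_priori_snr(s_alpha_hat, p_s_alpha, lambda_n_hat, alpha, xi_min_db=-18.0):
    """Map estimated clean speech NAOMs back to an a priori SNR estimate."""
    xi_min = 10.0 ** (xi_min_db / 10.0)
    # Undo the normalization, then raise to 2/alpha to get a power |S|^2
    speech_power = (s_alpha_hat * np.sqrt(p_s_alpha)) ** (2.0 / alpha)
    # Divide by the noise PSD estimate and apply the lower bound xi_min
    return np.maximum(speech_power / lambda_n_hat, xi_min)

# Round trip: a magnitude of 3 with unit noise PSD gives an SNR of 9
alpha, p = 0.64, 2.0
s_alpha = 3.0 ** alpha / np.sqrt(p)
print(a_priori_snr(s_alpha, p, 1.0, alpha))
```

The lower bound only takes effect when the estimated speech power is very small compared to the noise PSD.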

Experimental evaluation
Data and setup
  Clean speech: Wall Street Journal database, 16 kHz (male and female)
  7 different noise types of the Noisex92 database: white, pink, f16, hfchannel, factory-1, factory-2, babble
  Input global SNR from -5 dB up to 25 dB in 5 dB steps
Spectral speech enhancement framework
  Noise PSD tracking using the minimum statistics approach [Martin 01]
  A priori SNR estimation with $\xi_{\min} = -18$ dB [Cappe 94]
  Proposed WMM-based approach with Wiener filter
  Reference approach: decision-directed (DD) [Ephraim 84]
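
For orientation, the reference system can be sketched as the widely used textbook form of the decision-directed estimator combined with a Wiener gain. The smoothing factor of 0.98 is a common choice and an assumption here; the slides do not state the value used. The function name and the single-bin interface are hypothetical.

```python
import numpy as np

def decision_directed_wiener(Y, lambda_n, a=0.98, xi_min_db=-18.0):
    """Y: complex STFT of one frequency bin over frames; lambda_n: noise PSD per frame."""
    xi_min = 10.0 ** (xi_min_db / 10.0)
    xi = np.empty(len(Y))
    S_hat = np.empty(len(Y), dtype=complex)
    prev = 0.0  # |S_hat(l-1)|^2 / lambda_n(l-1); zero before the first frame
    for l in range(len(Y)):
        gamma = np.abs(Y[l]) ** 2 / lambda_n[l]      # a posteriori SNR
        # Weighted sum of the previous estimate and the instantaneous SNR
        xi[l] = max(a * prev + (1.0 - a) * max(gamma - 1.0, 0.0), xi_min)
        gain = xi[l] / (1.0 + xi[l])                 # Wiener gain
        S_hat[l] = gain * Y[l]
        prev = np.abs(S_hat[l]) ** 2 / lambda_n[l]
    return xi, S_hat

rng = np.random.default_rng(4)
Y = rng.standard_normal(100) + 1j * rng.standard_normal(100)
xi, S_hat = decision_directed_wiener(Y, np.ones(100))
```

The Wiener gain lies between 0 and 1, so the enhanced coefficients never exceed the noisy ones in magnitude.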

Optimization of $\alpha$ and $\beta$
Speech quality maximization in terms of the wide-band mean opinion score, listening quality objective (MOS-LQO), with $\Delta\text{MOS-LQO} = \max(\text{MOS-LQO}_{\mathrm{WMM}} - \text{MOS-LQO}_{\mathrm{DD}}, 0)$
Averaging over genders, noise types and input global SNR values
$(\alpha_{\mathrm{opt}}, \beta_{\mathrm{opt}}) = (0.64, 2.7)$
[Figure: surface plot of $\Delta$MOS-LQO (axis from 0 to 0.1) over $\alpha$ (ticks from 0.4 to 1) and $\beta$ (ticks at 2 and 4)]

Final experimental results
Clean speech: WSJ database signals other than those used for optimization
Estimation error: Itakura-Saito distance (ISD); estimator's variance: logarithmic error variance (LEV); for both, the smaller the better
Resulting ISD, LEV and MOS-LQO values averaged over noise types

SNR, dB            -5      0       5       10      15      20      25      AVG
ISD      DD        48.8    44.0    39.6    34.9    30.2    24.5    19.1    34.4
ISD      WMM       42.6    38.1    34.1    30.4    27.3    23.0    18.9    30.6
LEV      DD        53.1    49.0    46.4    45.1    45.5    47.4    50.5    48.1
LEV      WMM       45.6    43.9    42.6    41.1    39.0    37.0    35.9    40.7
MOS-LQO  DD        1.11    1.30    1.63    2.09    2.57    3.00    3.39    2.16
MOS-LQO  WMM       1.18    1.46    1.77    2.13    2.62    3.16    3.61    2.28
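
The AVG column can be reproduced from the per-SNR values. The short script below only restates the numbers of this slide; the dictionary layout and variable names are illustrative.

```python
# Per-SNR results from the slide, for input SNRs of -5, 0, 5, 10, 15, 20, 25 dB
results = {
    ("ISD", "DD"): [48.8, 44.0, 39.6, 34.9, 30.2, 24.5, 19.1],
    ("ISD", "WMM"): [42.6, 38.1, 34.1, 30.4, 27.3, 23.0, 18.9],
    ("LEV", "DD"): [53.1, 49.0, 46.4, 45.1, 45.5, 47.4, 50.5],
    ("LEV", "WMM"): [45.6, 43.9, 42.6, 41.1, 39.0, 37.0, 35.9],
    ("MOS-LQO", "DD"): [1.11, 1.30, 1.63, 2.09, 2.57, 3.00, 3.39],
    ("MOS-LQO", "WMM"): [1.18, 1.46, 1.77, 2.13, 2.62, 3.16, 3.61],
}

# Average over the seven SNR conditions
averages = {key: sum(vals) / len(vals) for key, vals in results.items()}

for measure in ("ISD", "LEV", "MOS-LQO"):
    dd, wmm = averages[(measure, "DD")], averages[(measure, "WMM")]
    print(f"{measure}: DD {dd:.2f}, WMM {wmm:.2f}, difference {wmm - dd:+.2f}")
```

On average the WMM-based estimator has lower ISD and LEV and a higher MOS-LQO than the decision-directed reference.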
