Enhanced Robot Audition Based on Microphone Array Source Separation


Enhanced Robot Audition Based on Microphone Array Source Separation with Post-Filter. Jean-Marc Valin, Jean Rouat, François Michaud. Department of Electrical Engineering and Computer Engineering, Université de Sherbrooke, Québec, Canada.


SLIDE 1

Enhanced Robot Audition Based on Microphone Array Source Separation with Post-Filter

Jean-Marc Valin, Jean Rouat, François Michaud
Department of Electrical Engineering and Computer Engineering
Université de Sherbrooke, Québec, Canada
Jean-Marc.Valin@USherbrooke.ca

SLIDE 2

Motivations

The context: mobile robot and cocktail party effect
The problem: separating sound sources
The solution: microphone array with both linear and non-linear processing

[Figure: processing chain. Sources S_m(k,l) -> microphones X_n(k,l) -> geometric source separation -> Y_m(k,l) -> post-filter -> separated sources Ŝ_m(k,l)]

SLIDE 3

Approach

Frequency-domain processing
Geometric Source Separation (GSS)

Minimize leakage under constraints
Adapted for real-time processing

Post-filter

Cancels remaining interferences
Based on Ephraim and Malah estimator
Handles both stationary and non-stationary noise/interference

SLIDE 4

Geometric Source Separation

Frequency domain:
Constrained optimization

Minimize correlation of the outputs:
Subject to geometric constraint:

Modifications to the original GSS algorithm

Instantaneous computation of correlations
Stochastic-gradient descent
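
The equations for this slide are not present in the text. As a hedged sketch, the formulation commonly given for GSS (Parra and Alvino) and an instantaneous, stochastic-gradient variant of it can be written as below. The symbols W, A, R_xx, R_yy, E, the step size \mu and the weighting \alpha(k) are assumptions introduced here, the geometric constraint W(k)A(k) = I is shown in penalty form, and the constant factors in the gradients depend on the complex-derivative convention.

```latex
\begin{align}
% Separation in the frequency domain (bin k, frame l)
\mathbf{y}(k,l) &= \mathbf{W}(k)\,\mathbf{x}(k,l) \\
% Cost 1: correlation between different outputs
J_1(\mathbf{W}(k)) &= \left\| \mathbf{R}_{yy}(k) - \mathrm{diag}\left[\mathbf{R}_{yy}(k)\right] \right\|^2 \\
% Cost 2: geometric constraint W(k) A(k) = I, as a penalty
J_2(\mathbf{W}(k)) &= \left\| \mathbf{W}(k)\mathbf{A}(k) - \mathbf{I} \right\|^2 \\
% Instantaneous correlation estimates (no averaging over frames)
\mathbf{R}_{xx}(k) &= \mathbf{x}(k,l)\,\mathbf{x}(k,l)^H, \qquad
\mathbf{R}_{yy}(k) = \mathbf{y}(k,l)\,\mathbf{y}(k,l)^H \\
% Gradients, with E(k) = R_yy(k) - diag[R_yy(k)]
\frac{\partial J_1}{\partial \mathbf{W}^*(k)} &= 4\,\mathbf{E}(k)\,\mathbf{W}(k)\,\mathbf{R}_{xx}(k), \qquad
\frac{\partial J_2}{\partial \mathbf{W}^*(k)} = 2\left[\mathbf{W}(k)\mathbf{A}(k) - \mathbf{I}\right]\mathbf{A}(k)^H \\
% Stochastic-gradient update, one step per frame
\mathbf{W}^{n+1}(k) &= \mathbf{W}^{n}(k) - \mu\left[\alpha(k)\,\frac{\partial J_1}{\partial \mathbf{W}^*(k)} + \frac{\partial J_2}{\partial \mathbf{W}^*(k)}\right]
\end{align}
```

Here A(k) would hold the transfer functions from the localized source positions to the microphones, and \alpha(k) is a normalization of the decorrelation term (for instance the inverse squared norm of R_xx(k)); both choices are assumptions rather than details recovered from the slide.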

SLIDE 5

Post-Filter Overview

Noise estimate as the sum of two components (stationary + transient)
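
As a worked equation, the two-component noise estimate can be written as follows. The symbol names are assumptions chosen to match the notation of the earlier slides: \lambda_m is the total noise variance for source m, the "stat" term is the stationary background noise, and the "leak" term is the transient interference from the other sources.

```latex
\lambda_m(k,l) = \lambda_m^{\mathrm{stat}}(k,l) + \lambda_m^{\mathrm{leak}}(k,l)
```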

SLIDE 6

Background Noise Estimation

Minima-Controlled Recursive Average (Cohen)

Noise estimate is adapted during quiet periods
Applied for each source of interest

Initial estimate provided directly from the microphones
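
The idea of adapting the noise estimate only during quiet periods can be illustrated with a simplified, single-bin sketch in the spirit of minima-controlled recursive averaging. It is not the implementation used in the presentation: the function name, all parameter values, and the omission of frequency smoothing are assumptions made for illustration.

```python
def mcra_noise_track(power, alpha_s=0.8, alpha_d=0.95, alpha_p=0.2,
                     window=50, delta=5.0):
    """Track stationary noise power in one frequency bin (simplified sketch).

    power: list of |Y(k, l)|^2 values over frames l for a fixed bin k.
    Returns one noise estimate per frame. Parameter values are illustrative.
    """
    smoothed = power[0]   # recursively smoothed power S(l)
    s_min = power[0]      # minimum of S used for the presence decision
    s_tmp = power[0]      # minimum being collected for the next window
    p_speech = 0.0        # smoothed speech-presence probability
    noise = power[0]      # noise estimate, initialised from the first frame
    estimates = []
    for l, p in enumerate(power):
        smoothed = alpha_s * smoothed + (1.0 - alpha_s) * p
        if l > 0 and l % window == 0:
            # Window boundary: restart the minimum search.
            s_min = min(s_tmp, smoothed)
            s_tmp = smoothed
        else:
            s_min = min(s_min, smoothed)
            s_tmp = min(s_tmp, smoothed)
        # Speech is assumed present when power is well above the minimum.
        present = 1.0 if smoothed > delta * s_min else 0.0
        p_speech = alpha_p * p_speech + (1.0 - alpha_p) * present
        # The update slows down (alpha_eff -> 1) when speech is likely.
        alpha_eff = alpha_d + (1.0 - alpha_d) * p_speech
        noise = alpha_eff * noise + (1.0 - alpha_eff) * p
        estimates.append(noise)
    return estimates
```

With a constant noise floor interrupted by a short loud burst, the estimate should stay close to the floor because the update nearly freezes while the burst is detected.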

SLIDE 7

Interference Estimation

Source separation leaks

Incomplete adaptation
Inaccuracy in localization
Reverberation
Imperfect microphones

Estimation from other separated sources
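
Estimating the interference from the other separated sources can be sketched as below for a single time-frequency bin. The slide does not give the formula, so this assumes a common form: the leakage into source m is a fixed fraction of the smoothed power of all other outputs. The function names and the values of `alpha_s` and `eta` are hypothetical.

```python
def smooth_powers(prev_z, outputs, alpha_s=0.7):
    """Recursively smooth |Y_m|^2 for every separated output in one bin.

    prev_z: smoothed powers from the previous frame, one per source.
    outputs: complex (or real) separated outputs Y_m for the current frame.
    """
    return [alpha_s * z + (1.0 - alpha_s) * abs(y) ** 2
            for z, y in zip(prev_z, outputs)]


def leakage_estimates(z, eta=0.1):
    """Leakage into each source: eta times the summed power of the others."""
    total = sum(z)
    return [eta * (total - z_m) for z_m in z]
```

A loud source therefore raises the interference estimate of every other source but not its own, which is what lets the post-filter attenuate bins dominated by a competing talker.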

SLIDE 8

Suppression Rule

Ephraim & Malah spectral estimator
Gain is modified to take into account probability of source being present (Cohen)
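
A minimal sketch of a presence-weighted suppression gain for one time-frequency bin follows. It keeps the decision-directed a priori SNR estimate from Ephraim and Malah and the presence-probability weighting in the style of Cohen, but it deliberately swaps in the simpler Wiener gain for the Ephraim-Malah amplitude estimator to stay dependency-free. The function name and the values of `alpha` and `g_min` are assumptions.

```python
def suppression_gain(y_power, noise_power, prev_clean_power,
                     p_present, alpha=0.98, g_min=0.1):
    """Gain applied to one bin of a separated output (illustrative sketch).

    y_power: |Y_m(k, l)|^2 for the current frame.
    noise_power: total noise estimate (stationary + leakage) for this bin.
    prev_clean_power: |S_hat_m(k, l-1)|^2 from the previous frame.
    p_present: probability in [0, 1] that the source is active in this bin.
    """
    noise_power = max(noise_power, 1e-12)   # guard against division by zero
    gamma = y_power / noise_power           # a posteriori SNR
    # Decision-directed a priori SNR estimate.
    xi = (alpha * prev_clean_power / noise_power
          + (1.0 - alpha) * max(gamma - 1.0, 0.0))
    # Wiener gain, used here as a stand-in for the Ephraim-Malah estimator.
    g_speech = xi / (1.0 + xi)
    # Geometric weighting between the speech gain and a floor gain.
    return (g_speech ** p_present) * (g_min ** (1.0 - p_present))
```

When the source is judged absent the gain falls to the floor `g_min`, and when it is judged present with high SNR the gain approaches one.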

SLIDE 9

Experimental Setup

Array of 8 inexpensive microphones on a Pioneer2 robot
Automatic localization
Noisy conditions
350 ms reverberation time

SLIDE 10

Results (Signal-to-Noise Ratio)

Three voices recorded separately so clean signal is available
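
Having a clean reference for each voice makes a direct SNR measurement possible. The slide does not state the exact definition used, so the sketch below assumes the simplest one: everything in the processed signal that differs from the time-aligned clean reference counts as noise. The function name is hypothetical.

```python
import math


def snr_db(reference, processed):
    """SNR in dB, treating (processed - reference) as residual noise.

    Both inputs are equal-length sequences of time-aligned samples.
    Raises ZeroDivisionError if the two signals are identical.
    """
    signal = sum(r * r for r in reference)
    noise = sum((p - r) ** 2 for p, r in zip(processed, reference))
    return 10.0 * math.log10(signal / noise)
```

Computing this once on the raw microphone signal and once on each processing stage would give the kind of per-source comparison the slide title refers to.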

SLIDE 11

Results (spectrograms)

[Figure: spectrograms in four panels: Input, GSS, Post-filter output, Reference]

SLIDE 12

Results (recognition with post-filter)

Japanese isolated word recognition (SIG2 robot)

3 simultaneous sources
200-word vocabulary
90-degree separation

14% reduction in error rate

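[Figure: bar chart of recognition results for the right, left, and center sources under three conditions (mixed, GSS only, GSS+pf). Data labels in extraction order: 66%, 15%, 41%, 71%, 21%, 53%]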

SLIDE 13

Conclusion

Geometric Source Separation

Real-time minimization of leakage

Source separation post-filter

Interference estimated using other sources

Future work

Robustness to reverberation
Better integration with speech recognition

Using the post-filter to estimate ASR feature reliability

original

processed

SLIDE 14

Questions?