Auditory System For a Mobile Robot
Jean-Marc Valin Department of Electrical Engineering and Computer Engineering Université de Sherbrooke, Québec, Canada Jean-Marc.Valin@USherbrooke.ca
PhD Thesis

Motivations
– Robots need information
– Human-robot interaction
– Unreliable
– Imitate human auditory system
– Limited localisation and separation
– More information available
– Simpler processing
– Complexity, algorithmic delay
– Robustness to noise and reverberation
– Weight/space/adaptability
– Moving sources, moving robot
– Lab (E1): 350 ms reverberation time
– Hall (E2): 1 s reverberation time
– Interaural phase difference (delay)
– Interaural intensity difference
– Estimation through TDOAs
– Subspace methods (MUSIC)
– Direct search (steered beamformer)
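The delay cue underlying these methods can be illustrated with a toy example. This is a minimal sketch, not the thesis method: the thesis searches source directions directly with a frequency-domain steered beamformer, while this example recovers a single inter-microphone delay by plain time-domain cross-correlation.

```python
import math

# Toy TDOA estimation between two microphone signals by time-domain
# cross-correlation. Illustrative only; all signals here are invented.

def tdoa(sig_a, sig_b, max_lag):
    """Return the lag (in samples) of sig_b relative to sig_a that
    maximises the cross-correlation, searched over +/- max_lag."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        score = 0.0
        for i, a in enumerate(sig_a):
            j = i + lag
            if 0 <= j < len(sig_b):
                score += a * sig_b[j]
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag

if __name__ == "__main__":
    # A short Gaussian pulse arriving 3 samples later at the second mic.
    src = [math.exp(-0.5 * ((i - 10) / 2.0) ** 2) for i in range(40)]
    mic_a = src
    mic_b = [0.0] * 3 + src[:-3]
    print(tdoa(mic_a, mic_b, 8))  # → 3
```

Given the microphone spacing and the speed of sound, each such delay constrains the source to lie on a cone of possible directions; combining delays across microphone pairs yields a direction estimate.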
– Kalman filtering
– Particle filtering
– Weight according to noise and reverberation
– Models the precedence effect
– As a function of steered beamformer energy
– Need to know which observation is related to which source
– Compute assignment probabilities
– Merging past and present information
– Taking into account source-observation assignment
– Weighted mean of the particle positions
– Simple, low complexity
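The tracking step can be sketched with a minimal bootstrap particle filter. This is illustrative only, with invented names and parameters: the state is a single scalar rather than a direction on a sphere, and the source-observation assignment probabilities are omitted.

```python
import math
import random

# Minimal bootstrap particle filter for tracking one source position.
# The estimate is the weighted mean of the particle positions, as in
# the thesis; everything else here is a simplified sketch.

def particle_filter_step(particles, weights, obs, obs_std=0.1, drift_std=0.02):
    """One predict/update/estimate cycle; returns (particles, weights, estimate)."""
    # Predict: let each particle drift (excitation noise).
    particles = [p + random.gauss(0.0, drift_std) for p in particles]
    # Update: weight each particle by the observation likelihood.
    weights = [w * math.exp(-0.5 * ((obs - p) / obs_std) ** 2)
               for p, w in zip(particles, weights)]
    total = sum(weights) or 1.0
    weights = [w / total for w in weights]
    # Estimate: weighted mean of the particle positions.
    estimate = sum(p * w for p, w in zip(particles, weights))
    # Resample when the effective number of particles collapses.
    n = len(particles)
    if 1.0 / sum(w * w for w in weights) < n / 2:
        particles = random.choices(particles, weights=weights, k=n)
        weights = [1.0 / n] * n
    return particles, weights, estimate

if __name__ == "__main__":
    random.seed(0)
    n = 500
    particles = [random.uniform(-1.0, 1.0) for _ in range(n)]
    weights = [1.0 / n] * n
    for _ in range(30):  # noisy observations of a source at 0.4
        obs = 0.4 + random.gauss(0.0, 0.1)
        particles, weights, est = particle_filter_step(particles, weights, obs)
    print(round(est, 2))  # estimate converges near the true position 0.4
```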
Geometric source separation

[Block diagram: microphone signals X_n(k,l) enter the geometric source separation stage, guided by tracking information; a post-filter then produces the separated outputs Y_m(k,l), estimates of the sources S_m(k,l)]
– Minimize correlation of the outputs
– Subject to geometric constraint
– Instantaneous computation of correlations
– Regularisation
– Incomplete adaptation
– Inaccuracy in localization
– Reverberation/diffraction
– Imperfect microphones
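The geometric constraint can be made concrete with a toy single-frequency-bin example using invented numbers. With a mixing (steering) matrix A known from localisation, any demixing matrix W satisfying W A = I passes each tracked source through undistorted; with as many microphones as sources the matrix inverse satisfies the constraint, whereas the actual system additionally adapts W to minimise the cross-correlation of the outputs.

```python
# Toy illustration of the geometric constraint in GSS for one frequency
# bin, two sources, two microphones. All numbers are made up.

def inv2(a):
    """Inverse of a 2x2 matrix given as [[p, q], [r, s]]."""
    (p, q), (r, s) = a
    det = p * s - q * r
    return [[s / det, -q / det], [-r / det, p / det]]

def matvec(m, v):
    return [sum(mij * vj for mij, vj in zip(row, v)) for row in m]

A = [[1.0, 0.6],
     [0.4, 1.0]]          # mixing matrix: x = A s
s = [1.0, -2.0]           # source samples at this bin
x = matvec(A, s)          # observed microphone samples
W = inv2(A)               # one matrix satisfying W A = I
y = matvec(W, x)          # separated outputs equal s in the noiseless case
print([round(v, 6) for v in y])  # → [1.0, -2.0]
```

With more microphones than sources the constraint no longer pins W down, which is exactly the freedom the correlation-minimisation criterion exploits.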
[Bar chart: output SNR (dB), scale 2.5 to 15 dB, for Source 1, Source 2, and Source 3; conditions compared: unprocessed input, delay-and-sum, GSS, GSS + single-source post-filter, GSS + multi-source post-filter]
[Bar chart: word correct (%), scale 50% to 90%, for the Right, Front, and Left sources in E2, C2, 3 speakers; systems compared: GSS only, post-filter without dereverberation, and the proposed system]
[Bar chart: word correct (%), scale 50% to 90%, for Listeners 1 through 5 and the proposed system, comparing the microphone and separated signals]
– Yes and no!
– Compute missing feature mask
– Use the mask to compute probabilities
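These two steps can be sketched as follows, assuming (for illustration only) a diagonal-Gaussian acoustic model and a hard binary mask; marginalisation simply drops unreliable features from the likelihood. All names and values are invented.

```python
import math

# Missing-feature sketch: a binary mask flags spectral features judged
# reliable after separation, and the acoustic likelihood counts only
# the reliable features (marginalisation over the unreliable ones).

def masked_log_likelihood(feats, mask, means, variances):
    """Log-likelihood of one feature frame under a diagonal Gaussian,
    using only features whose mask entry is 1 (reliable)."""
    ll = 0.0
    for f, m, mu, var in zip(feats, mask, means, variances):
        if m:  # unreliable features (mask 0) are skipped entirely
            ll += -0.5 * (math.log(2 * math.pi * var) + (f - mu) ** 2 / var)
    return ll

# A frame whose third feature was corrupted by an interfering source:
feats = [1.0, 0.5, 9.0, -0.2]
mask  = [1,   1,   0,   1]      # 0 = unreliable, ignore
means = [1.1, 0.4, 0.0, 0.0]
variances = [1.0, 1.0, 1.0, 1.0]
print(masked_log_likelihood(feats, mask, means, variances))
```

Ignoring the corrupted feature keeps it from dominating the score, which is why masking helps recognition on separated speech.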
– 3 simultaneous sources
– 200-word vocabulary
– 30, 60, 90 degrees separation
[Bar chart: word correct (%), scale 10 to 80, for the Right, Front, and Left sources; systems compared: GSS, GSS + post-filter, and GSS + post-filter + MFT]
– Localisation and tracking of sound sources – Separation of multiple sources – Robust basis for human-robot interaction
– Frequency-domain steered beamformer – Particle filtering source-observation assignment – Separation post-filtering for multiple sources
– Integration with missing feature theory
– Complete dialogue system
– Echo cancellation for the robot's own voice
– Use human-inspired techniques
– Environmental sound recognition
– Embedded implementation
– Video-conference: automatically follow the active speaker
– Automatic transcription