
LOUD: A 1020-Node Microphone Array and Acoustic Beamformer



  1. LOUD: A 1020-Node Microphone Array and Acoustic Beamformer*
     Eugene Weinstein (1), Kenneth Steele (2), Anant Agarwal (2,3), James Glass (3)
     (1) Courant Institute of Mathematical Sciences
     (2) Tilera Corporation
     (3) MIT Computer Science and Artificial Intelligence Lab
     * Based on work done at MIT CSAIL

  2. Introduction
     • Recording sound in high-noise settings is difficult
     • e.g., noisy lab or conference room
     • Can use close-talking microphones (e.g., lapel mic)
     • However, an untethered solution is more natural
     • Idea: use software-steerable microphone arrays
     • Isolate and amplify sound using beamforming
     • Target application: speech recognition

  3. Large Microphone Arrays
     • Large acOUstic Data (LOUD) array: 1020 microphones
     • Microphone array gain increases linearly with the number of microphones
     • Past large-array speech recognition experiments scarce
     • Processing large quantities of data in real-time is a compelling application for novel computing architectures
     • LOUD generates 400 Mbits/sec
     • We use Raw, a 16-tile parallel architecture
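The linear-gain bullet can be made concrete under a standard idealization, which is an assumption here and not stated on the slide: after delay alignment the target signal s(t) is identical on all N channels, while the noise n_i(t) at each microphone is uncorrelated with equal power σ². Averaging the channels then leaves the signal power P_s unchanged and divides the noise power by N:

```latex
y(t) = \frac{1}{N}\sum_{i=1}^{N}\bigl(s(t) + n_i(t)\bigr)
\quad\Longrightarrow\quad
\frac{\mathrm{SNR}_{\text{out}}}{\mathrm{SNR}_{\text{in}}}
= \frac{P_s / (\sigma^2 / N)}{P_s / \sigma^2} = N,
\qquad 10\log_{10}(1020) \approx 30.1\ \text{dB}
```

In a real room the noise is partly correlated across microphones, so this figure is best read as an idealized bound and not as a measured result for LOUD.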

  4. Acoustic Beamforming
     • Selectively amplify a sound source at a particular location
     • Take advantage of sound propagation through space
     • Use simple delay-and-sum beamforming
     [Figure: delay-and-sum diagram. Sound from the source reaches the microphones at times t1 … t8; the channels are delayed by 0, t8-t7, …, t8-t1 and then summed (+).]
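Delay-and-sum beamforming as described on this slide can be sketched in a few lines of plain Python. This is a minimal illustration and not the LOUD implementation: the function names, the integer-sample delays, and the 343 m/s speed of sound are assumptions made for the example.

```python
import math

SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def steering_delays(mic_positions, source, sample_rate):
    """Per-microphone delays (in whole samples) that align sound from `source`."""
    dists = [math.dist(m, source) for m in mic_positions]
    farthest = max(dists)
    # The farthest microphone hears the sound last, so it gets zero delay;
    # nearer microphones are delayed by the difference in travel time.
    return [round((farthest - d) / SPEED_OF_SOUND * sample_rate) for d in dists]


def delay_and_sum(signals, delays):
    """Delay each channel by its integer sample count, then average the channels."""
    n = len(signals[0])
    out = [0.0] * n
    for sig, d in zip(signals, delays):
        for i in range(d, n):
            out[i] += sig[i - d]
    return [v / len(signals) for v in out]


# Toy example: two microphones on a line, source to the left of both.
mics = [(0.0, 0.0), (3.43, 0.0)]
source = (-3.43, 0.0)
delays = steering_delays(mics, source, sample_rate=1000)

# An impulse emitted at t = 0 arrives 10 samples later at the near microphone
# and 20 samples later at the far one.
near = [1.0 if i == 10 else 0.0 for i in range(32)]
far = [1.0 if i == 20 else 0.0 for i in range(32)]
beam = delay_and_sum([near, far], delays)
```

After steering, both impulses line up on the same sample and add coherently, while sound arriving from other positions would be misaligned and partially cancel. A real-time system would likely need fractional-sample delays, which this sketch omits.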

  5. Two-microphone PCB
     • On-board A/D converter feeds into CPLD
     • Data streamed to CPU using time-division multiplexing

  6. 1020-Microphone Array

  7. Microphone Positions
     • Automated procedure to calibrate microphone positions
     • Play a test audio “chirp” through a speaker
     • Record with reference mic at speaker position and at each array mic
     • Peak of cross-correlation function between reference and array microphone signals gives propagation delay
     • Solve for precise array geometry
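The cross-correlation step of this calibration procedure can be sketched as follows. It is an illustrative brute-force version in plain Python; the function names, the toy signal, the 16 kHz sample rate, and the 343 m/s speed of sound are assumptions, and the final geometry solve from the slide is not shown.

```python
SPEED_OF_SOUND = 343.0  # m/s; assumed room-temperature value


def cross_correlation_delay(reference, recorded, max_lag):
    """Return the lag (in samples) where `recorded` best matches `reference`."""
    best_lag, best_score = 0, float("-inf")
    for lag in range(max_lag + 1):
        # Correlation of the reference with the recording shifted by `lag`.
        score = sum(
            r * recorded[i + lag]
            for i, r in enumerate(reference)
            if i + lag < len(recorded)
        )
        if score > best_score:
            best_lag, best_score = lag, score
    return best_lag


def delay_to_distance(lag_samples, sample_rate):
    """Convert a propagation delay in samples to a distance in meters."""
    return lag_samples / sample_rate * SPEED_OF_SOUND


# Toy example: the array microphone hears an attenuated copy of the
# reference signal, 7 samples late.
reference = [0.0, 1.0, -2.0, 3.0, -1.0, 0.5]
recorded = [0.0] * 7 + [0.5 * x for x in reference] + [0.0] * 5

lag = cross_correlation_delay(reference, recorded, max_lag=10)
distance = delay_to_distance(lag, sample_rate=16000)
```

Each microphone yields one speaker-to-microphone distance this way; recovering the array geometry then requires combining such distances, presumably from several speaker positions, which the slide summarizes only as "solve for precise array geometry".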

  8. Experiments
     • Setting: extremely noisy hardware lab
     • Subject and “interferer” talking at the same time
     • Goal: demonstrate that speech recognition accuracy improves with microphone array size
     • Speaker-independent recognizer for digit strings
     • Record 150 utterances with interferer, 110 without
     • Baseline: high quality close-talking mic, 80 utterances

  9. Recognition Accuracy
     • Word error rate (WER) decreases with array size
     • WER drops by 87% (w/ interferer), 91% (no interferer) from one to 1020 mics
     • Accuracy approaches close-talking microphone levels!
     [Figure: Word Error Rate (%) (0 to 100) vs. Number of Microphones (log scale, 10^0 to 10^3). Series: array with interferer, array without interferer, close-talking microphone.]

  10. LOUD Demo

  11. Summary/Future Work
     • LOUD allows high-quality untethered recording in very noisy settings
     • Speech recognition experiments demonstrate benefit of large arrays
     • Future work:
       • Implement more sophisticated beamforming techniques
       • Automatic speaker tracking
       • Conduct more experiments with different geometries, noise settings
