LOUD: A 1020-Node Microphone Array and Acoustic Beamformer* Eugene - - PowerPoint PPT Presentation

SLIDE 1

LOUD: A 1020-Node Microphone Array and Acoustic Beamformer*

Eugene Weinstein (1), Kenneth Steele (2), Anant Agarwal (2, 3), James Glass (3)

(1) Courant Institute of Mathematical Sciences
(2) Tilera Corporation
(3) MIT Computer Science and Artificial Intelligence Lab

* Based on work done at MIT CSAIL

SLIDE 2

Introduction

  • Recording sound in high-noise settings is difficult
  • e.g., noisy lab or conference room
  • Can use close-talking microphones (e.g., lapel mic)
  • However, an untethered solution is more natural
  • Idea: use software-steerable microphone arrays
  • Isolate and amplify sound using beamforming
  • Target application: speech recognition

SLIDE 3

Large Microphone Arrays

  • Large acOUstic Data (LOUD) array: 1020 microphones
  • Microphone array gain increases linearly with the number of microphones
  • Past large-array speech recognition experiments scarce
  • Processing large quantities of data in real-time is a compelling application for novel computing architectures

  • LOUD generates 400 Mbits/sec
  • We use Raw, a 16-tile parallel architecture
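
The linear-gain claim can be checked numerically. The sketch below rests on a simplifying assumption that is not stated on the slide: every microphone receives the same, already time-aligned signal plus independent noise of equal power. Averaging N such channels leaves the signal power unchanged and divides the noise power by N, so the SNR improves by a factor of N, or 10·log10(N) dB (about 30 dB for 1020 microphones). The function name and the noise model are illustrative only.

```python
import numpy as np

def snr_gain_db(num_mics, num_samples=4000, seed=0):
    """Measured SNR gain (dB) of averaging `num_mics` aligned channels.

    Assumption: identical signal on every channel, independent
    unit-variance Gaussian noise per channel.
    """
    rng = np.random.default_rng(seed)
    t = np.arange(num_samples)
    signal = np.sin(2 * np.pi * 0.01 * t)            # common target signal
    noise = rng.standard_normal((num_mics, num_samples))

    signal_power = np.mean(signal ** 2)
    single_mic_noise_power = np.mean(noise ** 2)      # average over all mics
    array_noise_power = np.mean(noise.mean(axis=0) ** 2)

    snr_single = signal_power / single_mic_noise_power
    snr_array = signal_power / array_noise_power
    return 10 * np.log10(snr_array / snr_single)

if __name__ == "__main__":
    for n in (1, 16, 256, 1020):
        print(n, round(snr_gain_db(n), 1), round(10 * np.log10(n), 1))
```

Real rooms violate the independence assumption (reverberation and directional interferers are correlated across microphones), so the measured gain of a physical array can be lower than this idealized figure.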

SLIDE 4

Acoustic Beamforming

  • Selectively amplify a sound source at a particular location
  • Take advantage of sound propagation through space
  • Use simple delay-and-sum beamforming

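[Figure: delay-and-sum beamformer. Sound from the source reaches microphones 1 to 8 at times t1 … t8. Each channel is delayed (t8-t1 for microphone 1, …, t8-t7 for microphone 7) so that all channels line up, and the delayed signals are summed.]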
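
A minimal sketch of delay-and-sum beamforming, as described above: compute each microphone's propagation time from the focus point, delay every channel so that it lines up with the latest-arriving one, then average. The speed of sound, the rounding of delays to whole samples, and all function names are assumptions made for illustration; this is not the LOUD implementation.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed room-temperature value

def steering_delays(mic_positions, focus_point, sample_rate):
    """Per-microphone delays (in whole samples) that align all channels.

    mic_positions: (num_mics, 3) array in metres.
    focus_point:   (3,) array in metres.
    """
    distances = np.linalg.norm(mic_positions - focus_point, axis=1)
    arrival_times = distances / SPEED_OF_SOUND
    # The farthest microphone hears the source last and gets zero delay.
    delays = arrival_times.max() - arrival_times
    return np.rint(delays * sample_rate).astype(int)

def delay_and_sum(channels, delays):
    """Shift each channel right by its delay, then average.

    channels: (num_mics, num_samples) array; delays must be smaller
    than num_samples.
    """
    num_mics, num_samples = channels.shape
    out = np.zeros(num_samples)
    for channel, d in zip(channels, delays):
        out[d:] += channel[:num_samples - d]
    return out / num_mics
```

Sound arriving from the focus point adds coherently after the shift, while sound from other locations stays misaligned and is attenuated by the averaging.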

SLIDE 5

Two-microphone PCB

  • On-board A/D converter feeds into CPLD
  • Data streamed to CPU using time-division multiplexing
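
Time-division multiplexing shares one link by giving each channel a fixed slot in a repeating frame. The slide does not give the LOUD frame format, so the sketch below assumes the simplest possible layout (one sample per channel per frame, channels in a fixed order); the function names are hypothetical.

```python
import numpy as np

def tdm_multiplex(channels):
    """Interleave (num_channels, num_samples) samples into one stream.

    Assumed frame layout: [ch0[0], ch1[0], ..., ch0[1], ch1[1], ...].
    """
    return channels.T.reshape(-1)

def tdm_demultiplex(stream, num_channels):
    """Recover the (num_channels, num_samples) array from the stream."""
    if stream.size % num_channels != 0:
        raise ValueError("stream length is not a whole number of frames")
    return stream.reshape(-1, num_channels).T
```

With this layout the receiver needs only the channel count and frame alignment to separate the microphones again.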

SLIDE 6

1020-Microphone Array

SLIDE 7

Microphone Positions

  • Automated procedure to calibrate microphone positions
  • Play a test audio “chirp” through a speaker
  • Record with reference mic at speaker position and at each array mic
  • Peak of cross-correlation function between reference and array microphones gives propagation delay
  • Solve for precise array geometry
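
The calibration steps above can be sketched in two parts. `estimate_delay` finds the cross-correlation peak between the reference and array recordings. `locate_microphone` then turns delays into a position. The second part is an assumption: the slide does not say how the geometry is solved, so this sketch supposes the chirp is played from at least four known, non-coplanar speaker positions and uses a linearized least-squares fit. All names and the speed-of-sound value are illustrative.

```python
import numpy as np

SPEED_OF_SOUND = 343.0  # m/s, assumed

def estimate_delay(reference, recorded, sample_rate):
    """Delay (seconds) of `recorded` relative to `reference`,
    taken from the peak of their cross-correlation."""
    xcorr = np.correlate(recorded, reference, mode="full")
    # Index len(reference) - 1 of the 'full' output is zero lag.
    lag = int(np.argmax(xcorr)) - (len(reference) - 1)
    return lag / sample_rate

def locate_microphone(speaker_positions, delays):
    """Least-squares microphone position from propagation delays.

    speaker_positions: (K, 3) known positions, K >= 4, not coplanar.
    delays:            (K,) propagation delays in seconds.
    Subtracting the first range equation |x - s_0|^2 = d_0^2 from the
    others removes |x|^2 and leaves a linear system in x.
    """
    s = np.asarray(speaker_positions, dtype=float)
    d = SPEED_OF_SOUND * np.asarray(delays, dtype=float)
    a = 2.0 * (s[1:] - s[0])
    b = (np.sum(s[1:] ** 2, axis=1) - np.sum(s[0] ** 2)
         - (d[1:] ** 2 - d[0] ** 2))
    position, *_ = np.linalg.lstsq(a, b, rcond=None)
    return position
```

A chirp is a good test signal for this because its autocorrelation has a single sharp peak, which keeps the delay estimate unambiguous.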

SLIDE 8

Experiments

  • Setting: extremely noisy hardware lab
  • Subject and “interferer” talking at the same time
  • Goal: demonstrate that speech recognition accuracy improves with microphone array size
  • Speaker-independent recognizer for digit strings
  • Record 150 utterances with interferer, 110 without
  • Baseline: high quality close-talking mic, 80 utterances

SLIDE 9

Recognition Accuracy

  • Word error rate (WER) decreases with array size
  • WER drops by 87% (w/ interferer), 91% (no interferer) from one to 1020 mics
  • Accuracy approaches close-talking microphone levels!
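
For readers unfamiliar with the metric, WER is conventionally the word-level edit distance (substitutions, deletions, insertions) between the recognizer output and the reference transcript, divided by the number of reference words. The sketch below shows that standard definition together with a relative-reduction helper. It is not the scoring tool used in the experiments, and reading the 87% and 91% figures as relative reductions is an assumption.

```python
def word_error_rate(reference, hypothesis):
    """WER = word-level edit distance / number of reference words."""
    ref = reference.split()
    hyp = hypothesis.split()
    if not ref:
        raise ValueError("reference transcript is empty")
    # prev[j] = edit distance between the ref words seen so far and hyp[:j]
    prev = list(range(len(hyp) + 1))
    for i, r in enumerate(ref, start=1):
        cur = [i]
        for j, h in enumerate(hyp, start=1):
            cur.append(min(prev[j] + 1,              # deletion
                           cur[j - 1] + 1,           # insertion
                           prev[j - 1] + (r != h)))  # match or substitution
        prev = cur
    return prev[-1] / len(ref)

def relative_reduction(wer_before, wer_after):
    """Fraction by which the error rate fell, relative to the start."""
    return (wer_before - wer_after) / wer_before
```

Under the relative reading, an 87% drop means the 1020-microphone error rate is 13% of the single-microphone error rate.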

[Figure: word error rate (%) versus number of microphones (logarithmic axis, 1 to 1000). Curves: array with interferer, array without interferer, close-talking microphone.]

SLIDE 10

LOUD Demo

SLIDE 11

Summary/Future Work

  • LOUD allows high-quality untethered recording in very noisy settings
  • Speech recognition experiments demonstrate benefit of large arrays
  • Future work:
  • Implement more sophisticated beamforming techniques
  • Automatic speaker tracking
  • Conduct more experiments with different geometries, noise settings
