
National Science Foundation Engineering Research Center
Integrated Media Systems Center

Array Audio Signal Processing and Virtual Microphones

Chris Kyriakakis

IMSC Immersive Audio Laboratory, University of Southern California


Smart Microphones

- Sound source direction finding, null- and beam-steering in the presence of interference
- Arrays with local processing power
  - Compensate for moving sensors
  - Blind calibration
  - Change directivity characteristics
- New models required to deal with real-world signals
  - Alpha-stable distributions
- Virtual microphones
  - Synthesize signals in locations where there are no microphones


Array Methods in Noisy Environments

- Traditional Gaussian modeling of noise signals fails when the signals exhibit impulsive behavior
- The Symmetric α-Stable (SαS) model can better account for the outliers that exist in real-world signals
  - Example: Time Delay Estimation
- In many array applications we encounter signals that are corrupted by multiplicative noise
  - Traditional approach: stochastic Gaussian signal corrupted by Gaussian noise
  - Alternative: multi-dimensional Gaussian signal with Lévy noise
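To make the contrast between Gaussian and SαS noise concrete, here is a minimal sketch that draws standard SαS samples with the Chambers-Mallows-Stuck method and counts how often each model produces large outliers. The value α = 1.5, the sample count, and the helper names `sample_sas` and `outlier_fraction` are illustrative assumptions, not taken from the slides.

```python
import numpy as np

def sample_sas(alpha, size, rng):
    """Draw standard symmetric alpha-stable samples (Chambers-Mallows-Stuck)."""
    v = rng.uniform(-np.pi / 2, np.pi / 2, size)  # uniform phase
    w = rng.exponential(1.0, size)                # unit-mean exponential
    if alpha == 1.0:
        return np.tan(v)                          # Cauchy special case
    return (np.sin(alpha * v) / np.cos(v) ** (1.0 / alpha)
            * (np.cos((1.0 - alpha) * v) / w) ** ((1.0 - alpha) / alpha))

def outlier_fraction(x, k=5.0):
    """Fraction of samples farther than k median-absolute-values from zero."""
    scale = np.median(np.abs(x))
    return np.mean(np.abs(x) > k * scale)

rng = np.random.default_rng(0)
n = 100_000                                   # illustrative sample count
gauss = rng.normal(0.0, np.sqrt(2.0), n)      # the alpha = 2 limit has variance 2
sas = sample_sas(1.5, n, rng)                 # alpha = 1.5 is an assumed value

frac_gauss = outlier_fraction(gauss)
frac_sas = outlier_fraction(sas)
print(frac_gauss, frac_sas)
```

The SαS samples should show a far larger share of extreme values, which is the impulsive behavior a Gaussian fit cannot capture.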


Examples of Impulsive Noise


Comparison of Gaussian and SαS models

- Real measurements in a typical room

[Figure: two panels, CD tray and Footsteps, each comparing Measured, Gaussian, and SαS curves; fitted values α = 1.68 and α = 1.8, in the order extracted]


Comparison of Gaussian and SαS models

[Figure: two panels, Chair and Typing, each comparing Measured, Gaussian, and SαS curves; fitted values α = 1.69 and α = 1.44, in the order extracted]


New Algorithms for Time-Delay Estimation

- TDE techniques based on second-order statistics fail when the noise is SαS
- Alternative methods, based on Fractional Lower-Order Statistics (FLOS)
  - Fractional Lower-Order Correlation Function instead of Cross Spectrum (PHAT)

FLOS-PHAT algorithm: [equation garbled in extraction; surviving symbols: A with subscripts R1 and R2, w, e^{jω_k τ}, and ε]
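The sketch below shows one way a FLOS-based PHAT estimator could be put together; it is not the slide's formula, which did not survive extraction. It assumes the fractional lower-order correlation is obtained by applying the signed-power nonlinearity |x|^p sign(x) to each signal before correlating, and that ε is a small regularizer in the PHAT weighting. The names `signed_power` and `flos_phat_delay`, the exponent p = 0.5, and the test signal are all hypothetical.

```python
import numpy as np

def signed_power(x, p):
    """Fractional lower-order nonlinearity: |x|**p * sign(x)."""
    return np.abs(x) ** p * np.sign(x)

def flos_phat_delay(x1, x2, p=0.5, eps=1e-8):
    """Estimate the delay of x2 relative to x1, in whole samples."""
    n = len(x1) + len(x2)                      # zero-pad to avoid circular wrap
    X1 = np.fft.rfft(signed_power(x1, p), n)
    X2 = np.fft.rfft(signed_power(x2, p), n)
    cross = X2 * np.conj(X1)                   # spectrum of the lower-order correlation
    weighted = cross / (np.abs(cross) + eps)   # PHAT weighting; eps avoids 0/0
    corr = np.fft.irfft(weighted, n)
    lag = int(np.argmax(corr))
    return lag if lag < n // 2 else lag - n    # map the index to a signed lag

# Illustrative data: one Gaussian source seen by two sensors with impulsive noise
rng = np.random.default_rng(0)
n, true_delay = 4096, 7
s = rng.normal(size=n + true_delay)
x1 = s[true_delay:] + 0.1 * rng.standard_cauchy(n)   # Cauchy noise is SaS with alpha = 1
x2 = s[:n] + 0.1 * rng.standard_cauchy(n)            # x2 lags x1 by true_delay samples
estimated_delay = flos_phat_delay(x1, x2)
print(estimated_delay)
```

The only extra work compared with ordinary PHAT is the element-wise nonlinearity, which is consistent with the low added cost claimed for FLOS-PHAT.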


Performance Comparison

- FLOS-PHAT performs better than the second-order based PHAT algorithm and adds little computational expense

[Figure: detection score (0.1 to 1) versus α parameter (1 to 2) for PHAT and FLOS-PHAT; curves labeled 6 dB, 12 dB, and 25 dB; annotations "Better" and "Worse"]


Multiple Sound Sources

- κ sources and ρ sensors
- Each sensor receives:

$$x_r(t) = \sum_{k=1}^{\kappa} a_{r,k}\, s_k(t) + u_r(t)$$

- Array receives:

$$\mathbf{x}(t) = \mathbf{A}\,\mathbf{s}(t) + \mathbf{u}(t)$$
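As a quick check that the per-sensor sum and the matrix form describe the same data, the following sketch builds both for a small array. The sizes, the random complex gains in `A`, and the Gaussian noise level are illustrative assumptions only.

```python
import numpy as np

rng = np.random.default_rng(1)
n_sources, n_sensors, n_samples = 2, 4, 1000   # kappa, rho, snapshots (assumed sizes)

# A[r, k] is the gain a_{r,k} from source k to sensor r (random, for illustration)
A = (rng.normal(size=(n_sensors, n_sources))
     + 1j * rng.normal(size=(n_sensors, n_sources)))
s = rng.normal(size=(n_sources, n_samples))          # source signals s_k(t)
u = 0.1 * rng.normal(size=(n_sensors, n_samples))    # additive noise u_r(t)

# Array form: every snapshot of x(t) = A s(t) + u(t) at once
x = A @ s + u

# Per-sensor form for one sensor r: sum over sources plus that sensor's noise
r = 2
x_r = sum(A[r, k] * s[k] for k in range(n_sources)) + u[r]

print(np.allclose(x[r], x_r))
```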


Sub-Gaussian Signal

- Lévy distribution: an α-stable distribution with α = 0.5 that is completely skewed to the right (β = 1)
- The sub-Gaussian signal is formed as the product of a Gaussian variable with the square root of a totally skewed α-stable variable
- The transmitted signal v(t) is therefore corrupted by multiplicative noise u(t), which follows the Lévy distribution
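A minimal sketch of this construction, assuming a standard Lévy variable (unit scale) and an identity covariance for the 2-D Gaussian part; neither choice comes from the slides. It uses the known fact that 1/Z² is standard Lévy when Z is standard normal.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000                                  # illustrative sample count

# Standard Levy samples (alpha = 0.5, beta = 1): 1 / Z**2 with Z ~ N(0, 1)
z = rng.normal(size=n)
levy = 1.0 / z ** 2

# 2-D Gaussian vector; identity covariance is an assumption for illustration
g = rng.normal(size=(2, n))

# Sub-Gaussian vector: Gaussian vector scaled by the root of the positive stable variable
x = np.sqrt(levy) * g

print(np.median(levy), np.median(np.abs(x[0])))
```

With these choices each component of `x` is a ratio of independent standard normals, so it is Cauchy distributed, which shows how heavy the tails of the resulting signal are.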


Time Domain

- Time domain of a 2-dimensional sub-Gaussian process:


Maximum Likelihood Estimator

- From the density of the sub-Gaussian distribution we can find the maximum likelihood estimator to be:
- Simulations can be made to show the effectiveness of this ML estimator


Simulations

Assuming:

- Σ = [1 0.2; 0.2 1]
- θ1 = −0.4, θ2 = 0.6
- 15 linearly spaced sensors
- 2000 samples received
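One way to generate data matching these settings is sketched below. Several details are assumptions, since the slides do not state them: the angles are treated as radians from broadside, the array is a half-wavelength uniform linear array, the Lévy variable is drawn once per snapshot, and no additive noise is included.

```python
import numpy as np

rng = np.random.default_rng(3)
n_sensors, n_samples = 15, 2000
thetas = np.array([-0.4, 0.6])                 # assumed to be radians from broadside
Sigma = np.array([[1.0, 0.2], [0.2, 1.0]])

# Assumed half-wavelength uniform linear array: a_r(theta) = exp(-j*pi*r*sin(theta))
r = np.arange(n_sensors)[:, None]
A = np.exp(-1j * np.pi * r * np.sin(thetas)[None, :])   # shape (15, 2)

# Gaussian source vector v(t) ~ N(0, Sigma) via a Cholesky factor
L = np.linalg.cholesky(Sigma)
v = L @ rng.normal(size=(2, n_samples))

# Multiplicative Levy variable u(t) (alpha = 0.5, beta = 1), one draw per snapshot
u = 1.0 / rng.normal(size=n_samples) ** 2
s = np.sqrt(u) * v                             # sub-Gaussian source signals

x = A @ s                                      # array snapshots, shape (15, 2000)
print(x.shape)
```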


Virtual Microphones

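- Synthesize signal in a location where there is no microphone

[Diagram labels: S, S1, S2, H1, H2]

$$S_1 = H_1 S, \qquad S_2 = H_2 S$$

[Equation relating H_p to H_1 and H_2; not recoverable from the extracted text]

- Minimize appropriate cost function to find filter coefficients
- L_p norms:

$$F = \left( \sum_{n} \left| s(n) - \hat{s}(n) \right|^{p} \right)^{1/p}, \qquad 1 \le p < 2$$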
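The sketch below shows one possible way to minimize an L_p cost with 1 ≤ p < 2 when fitting filter coefficients: iteratively reweighted least squares (IRLS) over FIR taps. The slides do not say which optimizer or filter structure was used, so IRLS, the FIR form, the function names, and the test filter `h_true` are all assumptions.

```python
import numpy as np

def lp_cost(s, s_hat, p):
    """F = (sum_n |s(n) - s_hat(n)|**p) ** (1/p)."""
    return np.sum(np.abs(s - s_hat) ** p) ** (1.0 / p)

def fit_fir_lp(ref, target, n_taps, p=1.5, n_iter=20, eps=1e-6):
    """Fit FIR taps h so that ref filtered by h approximates target in the L_p sense."""
    # Convolution matrix: column k holds ref delayed by k samples
    X = np.column_stack([np.concatenate([np.zeros(k), ref[:len(ref) - k]])
                         for k in range(n_taps)])
    h = np.linalg.lstsq(X, target, rcond=None)[0]      # least-squares starting point
    for _ in range(n_iter):
        e = target - X @ h
        w = np.maximum(np.abs(e), eps) ** (p - 2.0)    # IRLS weights, floored by eps
        Xw = X * w[:, None]
        h = np.linalg.solve(X.T @ Xw, Xw.T @ target)   # weighted normal equations
    return h

# Illustrative check: recover a known (hypothetical) filter from noise-free data
rng = np.random.default_rng(4)
ref = rng.normal(size=2000)                            # signal at the real microphone
h_true = np.array([0.5, -0.3, 0.2])
target = np.convolve(ref, h_true)[:len(ref)]           # signal at the other location
h_est = fit_fir_lp(ref, target, n_taps=3)
print(h_est)
```

Choosing p below 2 reduces the influence of large residuals, which fits the impulsive signal models used earlier in the talk.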


Audio "Morphing"

[Figure labels: ORTF, Left, Tymp]


Virtual Mic Performance

[Figure: normalized error (dB), from −100 to −10, versus frequency (kHz), from 2 to 20]


Multichannel Transmission Scheme

- Send one channel
- Synthesize remaining channels at the receiving end from a set of stored filters
  - Local processing can be used to generate filters

[Diagram: 1 channel → Network → Stored Filters → Multichannel]
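On the receiving side, the scheme amounts to filtering the one transmitted channel once per stored filter. A minimal sketch follows, assuming the stored filters are FIR impulse responses; the function name `synthesize_channels` and the two example filters are hypothetical.

```python
import numpy as np

def synthesize_channels(reference, stored_filters):
    """Rebuild each output channel by filtering the single transmitted channel."""
    n = len(reference)
    return np.stack([np.convolve(reference, h)[:n] for h in stored_filters])

# Illustrative data: a short reference channel and two made-up stored filters
reference = np.array([1.0, 2.0, 3.0, 4.0])
stored_filters = [np.array([1.0]),          # pass-through
                  np.array([0.0, 0.5])]     # one-sample delay with gain 0.5
channels = synthesize_channels(reference, stored_filters)
print(channels)
```

Only the reference channel crosses the network; the filters are assumed to be available at the receiver already.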


Conclusions

- Local processing at each microphone in an array can be used to enhance performance in TDE applications and source localization in the presence of noise
- Non-traditional models give better performance
- Virtual microphone signals can be synthesized remotely from a single reference and a set of filters computed at the sensor