Unified Stochastic Reverberation Modeling

Roland Badeau
LTCI, Télécom ParisTech, Université Paris-Saclay, Paris, France
roland.badeau@telecom-paristech.fr

26th European Signal Processing Conference (EUSIPCO), September 6, 2018



Why this research work?

Applications of reverberation models:

  • Dereverberation (Belhomme et al., 2017)
  • Source separation (Leglaive et al., 2018)
  • Source localization, denoising, audio inpainting...

Existing stochastic models of late reverberation:

  • Time domain (Schroeder, 1962; Moorer, 1979)
  • Frequency domain (Schroeder, 1962)
  • Space-frequency domain (Cook et al., 1955)
  • Time-frequency domain (Polack, 1988)

References:

  • A. Belhomme, R. Badeau, Y. Grenier, and E. Humbert. Amplitude and phase dereverberation of harmonic signals. In Proc. of IEEE WASPAA, New Paltz, New York, USA, October 2017
  • S. Leglaive, R. Badeau, and G. Richard. Student's t source and mixing models for multichannel audio source separation. IEEE Trans. Audio, Speech, Language Process., 26(5):1–15, May 2018


Outline

  • II Properties of reverberation
  • III Review of reverberation models
  • IV Definition of the new stochastic model
  • V Statistical properties of the model
  • VI Experimental validation
  • VII Conclusion


Part II Properties of reverberation


Time-frequency profile of reverberation

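[Figure: time-frequency profile of reverberation. Along the time axis t, the room impulse response (RIR) consists of the direct sound, early reflections and, after the transition time, late reverberation. Along the frequency axis f, the room frequency response (RFR) consists of isolated room modes and, beyond Schroeder's frequency, dense room modes. The validity domain of the stochastic model is marked in the (t, f) plane.]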


Space domain: diffuse sound field

Diffusion: reflections on the room surfaces are not specular (mirror-like), but rather scattered in various directions

[Figure: an incident waveform hitting a room surface, with a specular reflection compared to a diffuse reflection.]

The acoustic field can be approximated as diffuse (Schultz, 1971):

  • inside the time-frequency validity domain of the stochastic model
  • if source/sensors are at least a half-wavelength away from walls

After many reflections, the acoustic field is uniform and isotropic.

Reference:

  • T. J. Schultz. Diffusion in reverberation rooms. Journal of Sound and Vibration, 16(1):17–28, 1971


Experiments

Measured RIRs from C4DM database (169 RIRs, Fs = 96 kHz)

  • Octagon room: 8 walls of 7.5 m length and a domed ceiling of 21 m height
  • 13 x 13 sensor positions distributed on a uniform square grid
  • Space sampling of the omnidirectional microphone grid: D = 1 m
  • Reverberation time: RT60 ≈ 2 s

Synthetic RIRs from Roomsimove toolbox (400 RIRs, Fs = 16 kHz)

  • Shoebox room: 4 x 5 x 2.5 m³
  • Random source and sensor positions, random sensor orientations
  • Distance between the omnidirectional microphones: D = 20 cm
  • Reverberation time: RT60 ≈ 0.1 s

References:

  • R. Stewart and M. Sandler. Database of omnidirectional and B-format room impulse responses. In IEEE ICASSP, pages 165–168, Center for Digital Music (C4DM), QMUL, London, March 2010
  • Emmanuel Vincent and Douglas R. Campbell. Roomsimove. GNU Public License, 2008. http://homepages.loria.fr/evincent/software/Roomsimove.zip


Time-frequency profile (C4DM database)


Part III Review of reverberation models


Time domain

Schroeder (1962) and Moorer (1979): the RIR at microphone $i$ is

$$h_i(t) = b_i(t)\, e^{-\alpha t}\, \mathbb{1}_{t \ge 0}$$

  • $b_i(t)$ is a centered white Gaussian process
  • $\alpha > 0$ is related to the reverberation time: $RT_{60} = \frac{3 \ln(10)}{\alpha}$

References:

  • Manfred R. Schroeder. Frequency-correlation functions of frequency responses in rooms. The Journal of the Acoustical Society of America, 34(12):1819–1823, 1962
  • James A. Moorer. About this reverberation business. Computer Music Journal, 3(2):13–28, 1979
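The time-domain model above can be illustrated with a minimal sketch: white Gaussian noise multiplied by the envelope e^{-αt}, with α derived from RT60 = 3 ln(10)/α. The function names, the sampling rate and the duration are illustrative assumptions, not values or code from the talk.

```python
import numpy as np

def alpha_from_rt60(rt60):
    """Decay rate alpha (in 1/s) such that the energy e^{-2 alpha t} drops by 60 dB at t = rt60."""
    return 3.0 * np.log(10.0) / rt60

def synth_late_rir(rt60, fs, duration, seed=0):
    """Draw one realization of h(t) = b(t) e^{-alpha t} for t >= 0, with b white Gaussian."""
    rng = np.random.default_rng(seed)
    t = np.arange(int(round(duration * fs))) / fs
    b = rng.standard_normal(t.size)  # centered white Gaussian process
    return t, b * np.exp(-alpha_from_rt60(rt60) * t)

# Illustrative values (assumed): RT60 = 2 s, Fs = 16 kHz, 2.5 s of signal.
alpha = alpha_from_rt60(2.0)
t, h = synth_late_rir(rt60=2.0, fs=16000, duration=2.5)
# Level of the amplitude envelope at t = RT60, in dB.
decay_db = 20.0 * np.log10(np.exp(-alpha * 2.0))
```

Here `decay_db` evaluates the envelope at t = RT60, which is how the factor 3 ln(10) arises: the energy envelope e^{-2αt} must equal 10^{-6} at that instant.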


Validation of time model (C4DM database)


Frequency domain

The RFR is the Fourier transform of the RIR:

$$F_{h_i}(f) = \int_{t \in \mathbb{R}} h_i(t)\, e^{-2\imath\pi f t}\, dt$$

Schroeder (1962): $F_{h_i}(f)$ is a stationary random process.

Complex autocorrelation function of $F_{h_i}(f)$:

$$\mathrm{corr}\left[F_{h_i}(f_1), F_{h_i}(f_2)\right] = \frac{1}{1 + \imath\pi \frac{f_1 - f_2}{\alpha}}$$

Reference:

  • Manfred R. Schroeder. Frequency-correlation functions of frequency responses in rooms. The Journal of the Acoustical Society of America, 34(12):1819–1823, 1962
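As a sanity check of the frequency-correlation formula, the sketch below compares the closed form 1 / (1 + ıπ(f1 − f2)/α) with a numerical integration. It assumes the white-noise time-domain model h(t) = b(t) e^{-αt}, for which E[F(f1) F(f2)*] is proportional to the integral of e^{-2αt} e^{-2ıπ(f1−f2)t} over t ≥ 0. The function names and numerical values are illustrative assumptions.

```python
import numpy as np

def schroeder_corr(delta_f, alpha):
    """Closed-form correlation between two RFR values separated by delta_f (Hz)."""
    return 1.0 / (1.0 + 1j * np.pi * delta_f / alpha)

def numeric_corr(delta_f, alpha, t_max=20.0, n=200_001):
    """Same quantity by trapezoidal integration of e^{-2 alpha t} e^{-2 i pi delta_f t}."""
    t = np.linspace(0.0, t_max, n)
    dt = t[1] - t[0]

    def trapezoid(y):
        return (y.sum() - 0.5 * (y[0] + y[-1])) * dt

    envelope = np.exp(-2.0 * alpha * t)
    cross = trapezoid(envelope * np.exp(-2j * np.pi * delta_f * t))
    power = trapezoid(envelope)  # normalization: value at delta_f = 0
    return cross / power

alpha, lag = 3.45, 1.0  # assumed values: alpha for RT60 close to 2 s, 1 Hz lag
closed = schroeder_corr(lag, alpha)
numeric = numeric_corr(lag, alpha)
```

The correlation decays with the frequency lag at a rate set by α: a longer reverberation time (smaller α) makes the RFR decorrelate over a narrower frequency range.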


Validation of spectral model (Roomsimove)

[Figure, two panels: real part and imaginary part of the autocorrelation function of the late RFR; amplitude versus frequency (Hz).]


Space-frequency domain

Correlation at frequency $f$ between sensors (Cook et al., 1955):

$$\mathrm{corr}\left[F_{h_1}(f), F_{h_2}(f)\right] = \mathrm{sinc}\left(\frac{2\pi f D}{c}\right)$$

  • $D$ is the distance between microphones
  • $c$ is the speed of sound in the air (≈ 343 m/s)

Assumptions:

  • Plane waves (far field)
  • Isotropic incident waves (diffuse acoustic field)

Reference:

  • R. K. Cook, R. V. Waterhouse, R. D. Berendt, S. Edelman, and M. C. Thompson Jr. Measurement of correlation coefficients in reverberant sound fields. The Journal of the Acoustical Society of America, 27(6):1072–1077, 1955
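A short sketch of the spatial correlation sinc(2πfD/c), assuming the unnormalized convention sinc(x) = sin(x)/x. NumPy's `np.sinc(u)` computes sin(πu)/(πu), so the argument is rescaled accordingly. The function name is an illustrative assumption.

```python
import numpy as np

def diffuse_field_corr(f, D, c=343.0):
    """Correlation sin(x)/x with x = 2*pi*f*D/c between two sensors in a diffuse field.

    np.sinc(u) is sin(pi*u)/(pi*u), hence u = x/pi = 2*f*D/c.
    """
    return np.sinc(2.0 * np.asarray(f, dtype=float) * D / c)

D = 0.2                      # microphone spacing of the Roomsimove setup (20 cm)
f_zero = 343.0 / (2.0 * D)   # first zero of the correlation: 2*pi*f*D/c = pi
corr_dc = diffuse_field_corr(0.0, D)
corr_zero = diffuse_field_corr(f_zero, D)
```

With D = 20 cm the first zero falls at c/(2D) ≈ 857.5 Hz: the sensors are nearly fully correlated at low frequencies and progressively decorrelate as the wavelength becomes comparable to the spacing.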


Validation of space model (Roomsimove)

[Figure: correlation between sensors of the late RFR; amplitude versus frequency (Hz).]


Time-frequency domain

Moorer (1979): the RIR at microphone $i$ is $h_i(t) = b_i(t)\, e^{-\alpha t}\, \mathbb{1}_{t \ge 0}$, where $b_i(t)$ is a centered white Gaussian process.

[Figure: spectrogram of $b_i(t)$ (C4DM database).]


Time-frequency domain

Moorer (1979): the RIR at microphone $i$ is $h_i(t) = b_i(t)\, e^{-\alpha t}\, \mathbb{1}_{t \ge 0}$

Polack (1988):

  • $b_i(t)$ is a centered stationary Gaussian process, whose power spectral density (PSD) $B(f)$ has slow variations
  • the Wigner distribution of the RIR is $W_{h_i,h_i}(t, f) = B(f)\, e^{-2\alpha t}\, \mathbb{1}_{t \ge 0}$

Wigner distribution of two 2nd order random processes $\psi_1$, $\psi_2$:

$$W_{\psi_1,\psi_2}(t, f) = \int_{\mathbb{R}} \mathrm{cov}\left[\psi_1\left(t + \tfrac{u}{2}\right), \psi_2\left(t - \tfrac{u}{2}\right)\right] e^{-2\imath\pi f u}\, du$$

Reference:

  • J. D. Polack. La transmission de l'énergie sonore dans les salles. PhD thesis, Université du Maine, Le Mans, France, 1988
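The link between Polack's model and its Wigner distribution can be sketched in a few lines. This is an informal derivation, not the one given in the talk: it assumes $R_b(u) = \mathrm{cov}[b_i(t + u/2), b_i(t - u/2)]$ is the autocovariance of the stationary process $b_i$, and that $R_b$ is concentrated on lags much shorter than $t$ (which is what slow variations of $B(f)$ amount to).

$$
\begin{aligned}
\mathrm{cov}\left[h_i\left(t + \tfrac{u}{2}\right), h_i\left(t - \tfrac{u}{2}\right)\right]
&= e^{-\alpha (t + u/2)}\, e^{-\alpha (t - u/2)}\, R_b(u) = e^{-2\alpha t}\, R_b(u), \qquad |u| \le 2t, \\
W_{h_i,h_i}(t, f)
&= e^{-2\alpha t} \int_{-2t}^{2t} R_b(u)\, e^{-2\imath\pi f u}\, du \;\approx\; B(f)\, e^{-2\alpha t}.
\end{aligned}
$$

The two exponential factors combine into a term that does not depend on the lag $u$, which is why the time decay and the spectral shape separate in the Wigner distribution.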


Validation of Polack’s model (Roomsimove)

[Figure: time-frequency distribution of the late RIR (dB), versus time (seconds) and frequency (Hz).]

Part IV Definition of the new stochastic model


Source image principle (shoebox room)

[Figure sequence: a shoebox room containing a source and a microphone, illustrating the source image principle in successive steps: direct path, 2 reflections, mirroring, any sensor, infinite space, no wall.]


Stochastic distribution of source images

Source image principle: specular reflections in a shoebox room ⇒ spatially uniform distribution of source images.

Proposed stochastic model: source images are spatially distributed according to a uniform Poisson distribution:

  • for any volume $V \subset \mathbb{R}^3$, $N(V) \sim \mathcal{P}(\lambda |V|)$ with $\lambda > 0$
  • independent of the microphone position and true source position
  • independent of the room geometry (Polack, 1993)
  • holds even more in a diffuse (spatially uniform) acoustic field

Assumption: microphone and source images are omnidirectional.

The attenuation of sound waves is exponential w.r.t. the distance, isotropic and independent of frequency.

Reference:

  • Jean-Dominique Polack. Playing billiards in the concert hall: The mathematical foundations of geometrical room acoustics. Applied Acoustics, 38(2):235–244, 1993
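The uniform Poisson assumption N(V) ∼ P(λ|V|) can be made concrete with a minimal sampling sketch: draw the number of points in a cube from a Poisson law of mean λ|V|, then place them uniformly. The function name, the cube geometry and the value of λ are illustrative assumptions (λ = 0.02 m⁻³ corresponds to one source image per 50 m³).

```python
import numpy as np

def sample_source_images(lam, half_side, rng):
    """Homogeneous Poisson point process of intensity lam (points per m^3)
    in the cube [-half_side, half_side]^3; returns an (n, 3) array of positions."""
    volume = (2.0 * half_side) ** 3
    n = rng.poisson(lam * volume)  # N(V) ~ Poisson(lam * |V|)
    return rng.uniform(-half_side, half_side, size=(n, 3))

rng = np.random.default_rng(0)
lam, half_side = 0.02, 10.0           # expected count: 0.02 * 20^3 = 160
points = sample_source_images(lam, half_side, rng)
counts = [len(sample_source_images(lam, half_side, rng)) for _ in range(2000)]
mean_count, var_count = np.mean(counts), np.var(counts)
```

For a Poisson law the mean and the variance of the count coincide, so both `mean_count` and `var_count` should be close to λ|V| = 160 over many draws.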


Unified stochastic reverberation model

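For a set of sensors at positions $\{x_i\}_i \in \mathbb{R}^3$, the RIRs are

$$h_i(t) = \int_{x \in \mathbb{R}^3} h\left(t, \|x - x_i\|_2\right) e^{-\frac{\alpha}{c} \|x - x_i\|_2}\, dN(x)$$

  • $x \in \mathbb{R}^3$ is a possible source image position
  • $dN(x) \sim \mathcal{P}(\lambda\, dx)$ are independent Poisson increments
  • $\alpha > 0$ is the attenuation coefficient (in Hz)
  • $c > 0$ is the speed of sound in the air (≈ 343 m/s)
  • $h(t, r)$ is a coherent sum of monochromatic spherical waves:

$$h(t, r) = \int_{f \in \mathbb{R}} A(f)\, \frac{e^{2\imath\pi f \left(t - \frac{r}{c}\right)}}{r}\, df$$

We get $h_i(t) = e^{-\alpha (t - T)}\, b_i(t)$, with

$$b_i(t) = \int_{x \in \mathbb{R}^3} \frac{g\left(t - T - \frac{\|x - x_i\|_2}{c}\right)}{\|x - x_i\|_2}\, dN(x)$$

and $g(t) \in L^2([-T, T])$ satisfying technical conditions.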
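A rough discrete-time sketch of this model follows, under simplifying assumptions that are not part of the talk: T = 0, and the function g is replaced by a one-sample unit pulse, so each source image at distance r contributes e^{-αr/c}/r at the delay r/c. Source images are drawn from a uniform Poisson process in a cube around the microphone and restricted to the ball reachable within the simulated duration. All names and numerical values are illustrative.

```python
import numpy as np

def poisson_image_rir(x_mic, lam, alpha, fs, duration, c=343.0, seed=0):
    """Sum of attenuated spherical-wave arrivals from Poisson-distributed source images."""
    rng = np.random.default_rng(seed)
    r_max = c * duration  # farthest image that arrives within `duration`
    n = rng.poisson(lam * (2.0 * r_max) ** 3)
    pts = x_mic + rng.uniform(-r_max, r_max, size=(n, 3))
    r = np.linalg.norm(pts - x_mic, axis=1)
    r = r[(r > 0.0) & (r < r_max)]  # keep images inside the ball
    h = np.zeros(int(round(duration * fs)))
    idx = np.floor(r / c * fs).astype(int)  # arrival sample of each image
    valid = idx < h.size
    # 1/r spherical spreading and e^{-alpha r / c} attenuation, accumulated per sample
    np.add.at(h, idx[valid], np.exp(-alpha * r[valid] / c) / r[valid])
    return h

alpha = 3.0 * np.log(10.0) / 0.1  # assumed: RT60 = 0.1 s
h = poisson_image_rir(np.zeros(3), lam=0.02, alpha=alpha, fs=16000, duration=0.1)
early_energy = np.sum(h[: h.size // 2] ** 2)
late_energy = np.sum(h[h.size // 2 :] ** 2)
```

The early part of `h` is sparse and impulsive because few images lie close to the microphone, while the late part sums many weak arrivals per sample. A band-limited pulse would be a closer stand-in for g than the unit pulse used here.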


Part V Statistical properties of the model


Space-time domain

Asymptotic normality: with $h_i(t) = e^{-\alpha (t - T)}\, b_i(t)$, when $t \to +\infty$, $b(t) = [b_i(t), b_j(t)]^\top$ converges to a stationary Gaussian process.

At one sensor: $\forall t \ge 2T$, $b_i(t)$ is a centered wide sense stationary (WSS) process, of PSD $B(f) = 4\pi\lambda c\, |F_g(f)|^2$

  • When $t \to +\infty$, we retrieve (Polack, 1988)
  • When $t \to +\infty$, if $b_i(t)$ is white, we retrieve (Moorer, 1979)

Between two sensors: $\forall t \ge 2T + \frac{D}{c}$, $b(t) = [b_i(t), b_j(t)]^\top$ is a centered WSS process, of cross-PSD $B_{i,j}(f) = B(f)\, \mathrm{sinc}\left(\frac{2\pi f D}{c}\right)$ (new)
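The expression of the PSD at one sensor can be motivated by a short informal computation. This is a sketch under assumptions, not the proof from the paper: it uses the second-order Campbell formula for Poisson shot noise, takes $g$ real-valued, and switches to the propagation delay $\tau = r/c$. The expected number of source images at distances between $r$ and $r + dr$ is $4\pi\lambda r^2\, dr = 4\pi\lambda c^3 \tau^2\, d\tau$, and each contributes with amplitude $g(\cdot)/(c\tau)$:

$$
\begin{aligned}
\mathrm{cov}\left[b_i\left(t + \tfrac{u}{2}\right), b_i\left(t - \tfrac{u}{2}\right)\right]
&= \int_0^{+\infty} \frac{g\left(t + \tfrac{u}{2} - T - \tau\right) g\left(t - \tfrac{u}{2} - T - \tau\right)}{c^2 \tau^2}\, 4\pi\lambda c^3 \tau^2\, d\tau \\
&= 4\pi\lambda c \int_{\mathbb{R}} g\left(s + \tfrac{u}{2}\right) g\left(s - \tfrac{u}{2}\right) ds \qquad (t \ge 2T), \\
B(f) &= 4\pi\lambda c\, |F_g(f)|^2 .
\end{aligned}
$$

The $1/r^2$ decay of the squared amplitude cancels the $r^2$ growth of the number of images, so the covariance no longer depends on $t$. The condition $t \ge 2T$ ensures that the support of $g$ does not reach the boundary $\tau = 0$.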


Space-frequency domain

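Between two sensors: $\forall f_1, f_2 \in \mathbb{R}$,

$$\mathrm{corr}\left[F_{h_i}(f_1), F_{h_j}(f_2)\right] = \frac{e^{-\frac{\alpha D}{c} - 2\imath\pi (f_1 - f_2)\left(T + \frac{D}{2c}\right)}\, \mathrm{sinc}\left(\frac{\pi (f_1 + f_2) D}{c}\right)}{1 + \imath\pi \frac{f_1 - f_2}{\alpha}} \quad \text{(new)}$$

At one sensor ($i = j$, $D = 0$) with $b_i(t)$ white ($T = 0$):

$$\mathrm{corr}\left[F_{h_i}(f_1), F_{h_i}(f_2)\right] = \frac{1}{1 + \imath\pi \frac{f_1 - f_2}{\alpha}} \quad \text{(Schroeder, 1962)}$$

At one frequency ($f_1 = f_2 = f$), with no attenuation ($\alpha = 0$):

$$\mathrm{corr}\left[F_{h_i}(f), F_{h_j}(f)\right] = \mathrm{sinc}\left(\frac{2\pi f D}{c}\right) \quad \text{(Cook et al., 1955)}$$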
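The two special cases can be checked numerically with a small sketch of the general space-frequency correlation. It assumes the unnormalized convention sinc(x) = sin(x)/x, and approximates the α = 0 case with a very small α to avoid dividing by zero. The function name and test values are illustrative assumptions.

```python
import numpy as np

def unified_corr(f1, f2, D, alpha, T=0.0, c=343.0):
    """Correlation between the RFR at sensor i, frequency f1, and sensor j, frequency f2."""
    df = f1 - f2
    exponent = -alpha * D / c - 2j * np.pi * df * (T + D / (2.0 * c))
    # np.sinc(u) = sin(pi u)/(pi u); here x = pi (f1 + f2) D / c, so u = (f1 + f2) D / c
    spatial = np.sinc((f1 + f2) * D / c)
    return np.exp(exponent) * spatial / (1.0 + 1j * np.pi * df / alpha)

alpha = 3.45
# One sensor, white b_i: D = 0 and T = 0 give Schroeder's formula.
one_sensor = unified_corr(1001.0, 1000.0, D=0.0, alpha=alpha)
schroeder = 1.0 / (1.0 + 1j * np.pi * 1.0 / alpha)
# One frequency, (almost) no attenuation: Cook's sinc(2 pi f D / c).
one_freq = unified_corr(500.0, 500.0, D=0.2, alpha=1e-9)
cook = np.sinc(2.0 * 500.0 * 0.2 / 343.0)
```

Both comparisons agree to numerical precision, which is the sense in which the general expression contains the two earlier results.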


Space-time-frequency domain

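Between two sensors: $\forall f \in \mathbb{R}$, $\forall t \ge 2T + \frac{D}{2c}$,

$$W_{h_i,h_j}(t, f) = B(f)\, e^{-2\alpha (t - T)}\, \mathrm{sinc}\left(\frac{2\pi f D}{c}\right) \quad \text{(new)}$$

At one sensor ($i = j$, $D = 0$): $\forall f \in \mathbb{R}$, $\forall t \ge 2T$,

$$W_{h_i,h_i}(t, f) = B(f)\, e^{-2\alpha (t - T)} \quad \text{(Polack, 1988)}$$

Time-frequency correlation: $\forall f \in \mathbb{R}$, $\forall t \ge 2T + \frac{D}{2c}$,

$$\frac{W_{h_i,h_j}(t, f)}{W_{h_i,h_i}(t, f)} = \mathrm{sinc}\left(\frac{2\pi f D}{c}\right) \quad \text{(new)}$$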


Part VI Experimental validation


Time-frequency correlation (Roomsimove)

[Figure: time-frequency correlation between sensors of the late RIR, versus time (seconds) and frequency (Hz).]


Projection on the frequency axis

[Figure: correlation between sensors of the late RFR; amplitude versus frequency (Hz).]


Part VII Conclusion


Conclusion

Summary

  • New reverberation model that unifies and generalizes known results
  • Also applicable before the transition time:
      – the Poisson distribution makes $h_i(t)$ impulsive in early reverberation

Perspectives

  • Acoustics
      – Directional sources, directional microphones
      – Non-perfectly diffuse acoustic fields
      – Frequency-dependent attenuation $\alpha$
  • Signal processing
      – Fast algorithm to estimate the model in discrete time
      – Applications: source separation, dereverberation, ...


Thank you!


Work in progress

Polack (1988): the RIR at microphone $i$ is $h_i(t) = b_i(t)\, e^{-\alpha t}\, \mathbb{1}_{t \ge 0}$, where $b_i(t)$ is a centered stationary Gaussian process, whose PSD $B(f)$ has slow variations.

Then the Wigner distribution of the RIR is $W_{h_i,h_i}(t, f) = B(f)\, e^{-2\alpha t}\, \mathbb{1}_{t \ge 0}$.

Reference:

  • J. D. Polack. La transmission de l'énergie sonore dans les salles. PhD thesis, Université du Maine, Le Mans, France, 1988


Work in progress (C4DM database)

[Figure: spectrogram of the RIR (dB), versus time (seconds) and frequency (Hz).]

Polack (1988): the attenuation actually depends on the frequency: $W_{h_i,h_i}(t, f) = B(f)\, e^{-2\alpha(f) t}\, \mathbb{1}_{t \ge 0}$
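One simple way to read a frequency-dependent attenuation off a time-frequency distribution is a per-frequency linear fit of the log-energy decay, since log W(t, f) = log B(f) − 2α(f)t. The sketch below is only an illustration of that idea on noise-free synthetic envelopes. It is not the estimation algorithm of the talk, and the function name, frequencies and decay rates are hypothetical.

```python
import numpy as np

def estimate_alpha(times, energy):
    """Least-squares fit of log E(t) = log B - 2 * alpha * t; returns alpha."""
    slope, _intercept = np.polyfit(times, np.log(energy), 1)
    return -slope / 2.0

times = np.linspace(0.0, 1.0, 50)
# Hypothetical decay rates (1/s) at two frequencies (Hz), for illustration only.
alphas_true = {500.0: 3.0, 4000.0: 6.0}
alphas_est = {
    f: estimate_alpha(times, 2.0 * np.exp(-2.0 * a * times))
    for f, a in alphas_true.items()
}
```

On measured spectrograms the energy is noisy and eventually reaches the noise floor, so the fit would have to be restricted to the late-reverberation range and to levels above that floor.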