Virtual Navigation of Ambisonics-Encoded Sound Fields Containing Near-Field Sources


SLIDE 1

Virtual Navigation of Ambisonics-Encoded Sound Fields Containing Near-Field Sources

Joseph G. Tylka
Final Public Oral Examination
Mechanical and Aerospace Engineering, Princeton University
Advisor: Edgar Y. Choueiri
9 May 2019

SLIDE 2

Virtual Navigation

SLIDE 5

Ambisonics Encoding & Binaural Rendering

Capture → Encode to Ambisonics → Decode to virtual speakers → Convolve with HRTFs → Binaural signals

Sources: https://developers.google.com/vr/concepts/spatial-audio; mh acoustics Eigenmike; Xie (2013), Head-Related Transfer Function and Virtual Auditory Display, Fig. 2.5; https://en.wikipedia.org/wiki/Ambisonics
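The rendering chain on this slide (decode the Ambisonics signals to virtual loudspeakers, then convolve each speaker feed with head-related impulse responses and sum) can be sketched as below. This is a generic first-order, pseudo-inverse (mode-matching) sketch, not the specific decoder used in the talk; the channel convention in `sh_encode` is an illustrative assumption.

```python
import numpy as np

def sh_encode(azim, elev):
    """First-order spherical-harmonic gains for a direction (illustrative
    W/Y/Z/X-style convention; real systems fix ordering/normalization)."""
    x = np.cos(elev) * np.cos(azim)
    y = np.cos(elev) * np.sin(azim)
    z = np.sin(elev)
    return np.array([1.0, np.sqrt(3) * y, np.sqrt(3) * z, np.sqrt(3) * x])

def render_binaural(amb, speaker_dirs, hrirs_l, hrirs_r):
    """Decode Ambisonics to virtual speakers, then convolve with HRIRs.

    amb          : (4, N) first-order Ambisonics signals
    speaker_dirs : list of (azimuth, elevation) per virtual speaker
    hrirs_l/r    : (S, M) left/right head-related impulse responses
    """
    # Decode matrix: pseudo-inverse of the encoding matrix (mode matching)
    Y = np.stack([sh_encode(az, el) for az, el in speaker_dirs])  # (S, 4)
    D = np.linalg.pinv(Y.T)                                       # (S, 4)
    feeds = D @ amb                                               # (S, N)
    # Convolve each speaker feed with its HRIR pair and sum the ears
    n_out = amb.shape[1] + hrirs_l.shape[1] - 1
    left, right = np.zeros(n_out), np.zeros(n_out)
    for s in range(len(speaker_dirs)):
        left += np.convolve(feeds[s], hrirs_l[s])
        right += np.convolve(feeds[s], hrirs_r[s])
    return left, right
```

A frontal source rendered through symmetric speakers and identical HRIRs yields identical ear signals, a quick sanity check on the decode step.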

SLIDE 6

Near-Field Sources

[Figure: an Ambisonics microphone with its region of validity of radius R, and a nearby sound source.]

SLIDE 7

Virtual Navigation of Ambisonics-Encoded Sound Fields Containing Near-Field Sources

  • Several navigational methods exist
  • Unclear how they perform or how to compare them
  • When should each method be used?
  • The region of validity is a well-known mathematical issue
  • Unclear how it manifests audibly

SLIDE 8

Outline

  • A. Metrics for evaluating navigational methods
      – Auditory models and objective metrics
      – Spectral coloration and perceived source localization
      – Subjective validation through listening experiments
  • B. Effect of microphone validity
      – Proposed method: valid microphone interpolation (VMI)
      – Simple benchmark: weighted-average interpolation (WAI)
  • C. Comparisons of navigational methods
      – State-of-the-art: time-frequency analysis (TFA)
      – Practical considerations
  • Summary and conclusions

SLIDE 9
A. Metrics for Evaluating Navigational Methods

SLIDE 10

Listening Experiments
SLIDE 11

Experiment 1: Coloration

SLIDE 12

Coloration Test

  • Collect subjective ratings of coloration and compute objective metric values
  • Perform multiple linear regression between ratings and metric values
  • MUSHRA: MUltiple Stimuli with Hidden Reference and Anchor (ITU-R BS.1534-3)
  • Reference: no navigation, pink noise
  • Anchor: 3.5 kHz low-passed version of the Reference
  • Test samples: vary navigational method and distance
  • Listener rates each sample from 0–100: 100 = Reference; 0 = Anchor
  • Coloration score = 100 − MUSHRA rating: 0 = Reference; 100 = Anchor
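The regression step described above, fitting the subjective coloration scores to objective metric values, can be sketched with ordinary least squares. The `metrics` and `scores` arrays here are made-up illustrative data, not the experiment's results.

```python
import numpy as np

# Hypothetical data: one row per test sample, one column per objective
# metric value; scores are averaged coloration scores (100 - MUSHRA rating).
metrics = np.array([[0.2, 1.1],
                    [0.8, 1.9],
                    [1.5, 2.4],
                    [2.9, 3.8],
                    [3.4, 4.1]])
scores = np.array([5.0, 22.0, 41.0, 78.0, 90.0])

# Multiple linear regression: scores ~ X @ beta, with an intercept column
X = np.column_stack([np.ones(len(scores)), metrics])
beta, *_ = np.linalg.lstsq(X, scores, rcond=None)
predicted = X @ beta

# Pearson correlation between measured and predicted scores
r = np.corrcoef(scores, predicted)[0, 1]
```

The resulting correlation coefficient r is the figure of merit reported for each candidate metric on the later slides.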

SLIDE 13

Coloration GUI

SLIDE 14

Coloration Metric

  • Averaged spectral distortions in narrow frequency bands (Schärer and Lindau, 2009)
  • Critical bands approximated by an ERB-spaced gammatone filter bank
  • More sophisticated coloration metrics exist (Boren et al., 2015)


Schärer and Lindau (2009). “Evaluation of Equalization Methods for Binaural Signals.” Boren et al. (2015). “Coloration metrics for headphone equalization.”

η(f_c) = 10 log₁₀ ( ∫ |H_Γ(f; f_c)| |A(f)|² df / ∫ |H_Γ(f; f_c)| |A₀(f)|² df )

ρ_η = max over f_c of η(f_c) − min over f_c of η(f_c)

where A(f) and A₀(f) are the test and reference magnitude spectra and H_Γ(f; f_c) is the gammatone filter centered at f_c.

[Figure: example spectra |A(f)|/|A₀(f)|, filter responses |H_Γ(f; f_c)|, and resulting η(f_c) vs. frequency (Hz); magnitudes in dB.]
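A minimal sketch of the band-distortion computation follows. Gaussian frequency windows stand in for the ERB-spaced gammatone magnitude responses |H_Γ(f; f_c)|; that substitution (and the constant relative bandwidth `bw`) is a simplifying assumption for illustration.

```python
import numpy as np

def coloration_rho(A, A0, freqs, centers, bw=0.25):
    """Band-averaged spectral distortion eta(fc) and its spread rho_eta.

    A, A0   : magnitude spectra of the test and reference signals
    freqs   : frequency axis (Hz)
    centers : band center frequencies fc
    Gaussian band windows of relative bandwidth bw approximate the
    gammatone magnitude responses |H_Gamma(f; fc)|.
    """
    etas = []
    for fc in centers:
        w = np.exp(-0.5 * ((freqs - fc) / (bw * fc)) ** 2)  # ~|H_Gamma|
        num = np.sum(w * np.abs(A) ** 2)    # band-weighted test energy
        den = np.sum(w * np.abs(A0) ** 2)   # band-weighted reference energy
        etas.append(10 * np.log10(num / den))
    etas = np.asarray(etas)
    return etas.max() - etas.min(), etas
```

For identical test and reference spectra every η(f_c) is 0 dB, and a flat broadband gain shifts all bands equally, so ρ_η stays near zero in both cases: the metric responds to spectral shape, not level.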

SLIDE 15

Regression Results

[Figure: average measured vs. predicted coloration scores for four metrics, with y = x and y = x ± 20 reference lines.]

  • Proposed metric: r = 0.84
  • Kates: r = 0.72
  • Pulkki et al.: r = 0.77
  • Wittek et al.: r = 0.77

Test details:
  • 27 test samples
  • 4 trained listeners
  • Pink noise signal

Tylka and Choueiri (2017). “Models for evaluating navigational techniques for higher-order ambisonics.”

SLIDE 16

Experiment 2: Localization

SLIDE 17

Localization Test

[Figure: localization test geometry with numbered source positions, angle θ, and dimensions of 5 cm, 10 cm, and 127 cm; recording/encoding and interpolation stages.]

SLIDE 18

Localization Test

SLIDE 19

Localization Metric

1. Transform to plane-wave impulse responses (IRs)
2. Split each IR into wavelets
3. Threshold to find onset times
4. Compute the average amplitude in each critical band
5. Compute a perceptually-weighted "energy vector" in each band
6. Compute the average vector over all bands

[Figure: processing chain - plane-wave IR → high-pass → find peaks → window → wavelets.]

Stitt et al. (2016). “Extended Energy Vector Prediction of Ambisonically Reproduced Image Direction at Off-Center Listening Positions.”
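The per-band vectors in steps 5-6 reduce to a Gerzon-style energy vector over the detected plane waves. The sketch below shows only that core computation (in 2-D); the wavelet onset detection and perceptual weighting of the full model of Stitt et al. are omitted.

```python
import numpy as np

def energy_vector(amplitudes, azimuths):
    """Gerzon-style energy vector for a set of plane waves (2-D sketch).

    amplitudes : per-direction band amplitudes g_i
    azimuths   : plane-wave arrival directions (radians)
    Returns the predicted localization angle and the vector magnitude
    (magnitude near 1 indicates a compact, well-localized image).
    """
    g2 = np.asarray(amplitudes, float) ** 2
    u = np.stack([np.cos(azimuths), np.sin(azimuths)])  # unit vectors
    rE = (u * g2).sum(axis=1) / g2.sum()                # energy-weighted mean
    return np.arctan2(rE[1], rE[0]), np.hypot(rE[0], rE[1])
```

A single plane wave returns its own direction with magnitude 1; two equal-amplitude waves at ±45° return a phantom image straight ahead with reduced magnitude.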

SLIDE 20

Localization Test Results

  • Pearson correlation coefficient: r = 0.85
  • Mean absolute error: ε = 3.67°

Test details:
  • 70 test samples
  • 4 trained listeners
  • Speech signal

[Figure: measured vs. predicted localization angles, with y = x and y = x ± 5° reference lines.]

Tylka and Choueiri (2017). “Models for evaluating navigational techniques for higher-order ambisonics.”

SLIDE 21

Summary - Part A.

  • Developed objective metrics for coloration and localization
  • Constructed an experimental setup for conducting listening tests
  • Conducted subjective listening tests to validate the metrics
  • Finding: the chosen coloration metric is a dominant factor in human perception of coloration
  • Finding: the structure of the localization metric is valid for giving reasonable predictions of perceived localization

SLIDE 22
B. Effect of Microphone Validity

SLIDE 23

Ambisonics Interpolation

[Figure: three Ambisonics microphones (1, 2, 3) and a listening position, with a nearby sound source; subsequent builds highlight each microphone's region of validity.]

SLIDE 28

Valid Microphone Interpolation

[Block diagram:]
1. Inputs: Ambisonics signals from microphones 1 through P, the microphone positions, and the listening position
2. Detect and locate near-field sources
3. Determine the valid microphones; discard signals from invalid microphones
4. Compute the interpolation filter matrix
5. Apply the filter matrix to obtain the interpolated Ambisonics signals

Tylka and Choueiri (2019). “A Parametric Method for Virtual Navigation Within an Array of Ambisonics Microphones.” Under Review.
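The "determine valid microphones" step can be sketched as follows, assuming a microphone is valid when the listening position lies inside its source-free sphere (its region of validity, bounded by the nearest detected source). Both this criterion and the inverse-distance weighting are illustrative assumptions, not the exact procedure of the paper.

```python
import numpy as np

def select_valid_mics(mic_pos, source_pos, listen_pos):
    """Flag microphones whose region of validity contains the listener.

    mic_pos    : (P, 3) microphone positions
    source_pos : (K, 3) detected near-field source positions
    listen_pos : (3,) desired listening position
    Returns a boolean mask of valid microphones.
    """
    mics = np.atleast_2d(mic_pos)
    srcs = np.atleast_2d(source_pos)
    # Radius of validity per mic: distance to the nearest source
    r_valid = np.min(np.linalg.norm(srcs[None, :, :] - mics[:, None, :],
                                    axis=-1), axis=1)
    # A mic is valid if the listener is closer to it than any source
    d_listen = np.linalg.norm(mics - listen_pos, axis=-1)
    return d_listen < r_valid

def interpolation_weights(mic_pos, listen_pos, valid):
    """Inverse-distance weights over the valid microphones only."""
    d = np.linalg.norm(np.atleast_2d(mic_pos) - listen_pos, axis=-1)
    w = np.where(valid, 1.0 / np.maximum(d, 1e-9), 0.0)
    return w / w.sum()
```

With a source sitting between the listener and one microphone, that microphone is flagged invalid and receives zero weight, which is exactly the signal-discarding behavior the block diagram describes.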

SLIDE 29

Numerical Simulations

[Figure: simulation geometry in the x-y plane, with source distances s₀, s₁, s₂, angles φ₀, φ₁, φ₂, and microphone spacing ∆; normalized source distance γ = s₀/(∆/2).]

Tylka and Choueiri (2019). “A Parametric Method for Virtual Navigation Within an Array of Ambisonics Microphones.” Under Review.

SLIDE 30

Results - Coloration

[Figure: coloration results for weighted-average interpolation (WAI) and valid microphone interpolation (VMI).]

Finding: excluding the invalid microphone improves coloration performance for interior sources.

Tylka and Choueiri (2019). “A Parametric Method for Virtual Navigation Within an Array of Ambisonics Microphones.” Under Review.

SLIDE 31

Results - Localization

[Figure: localization results for weighted-average interpolation (WAI) and valid microphone interpolation (VMI).]

Finding: excluding the invalid microphone improves localization performance for interior sources.

Tylka and Choueiri (2019). “A Parametric Method for Virtual Navigation Within an Array of Ambisonics Microphones.” Under Review.

SLIDE 32

Summary - Part B.

  • Developed the valid microphone interpolation (VMI) method, which excludes invalid microphones from the interpolation
  • Compared this method to a simple benchmark: weighted-average interpolation (WAI)
  • Finding: excluding the invalid microphone significantly improves coloration and localization performance for interior sources

SLIDE 33
C. Comparisons of Navigational Methods

SLIDE 34

Time-Frequency Analysis

[Figure: source triangulation and point-source modeling, with true source distance s₀ and estimates ŝ₁, ŝ₂.]

Thiergart et al. (2013). “Geometry-Based Spatial Sound Acquisition Using Distributed Microphone Arrays.”
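The source-triangulation step can be sketched as a least-squares intersection of the direction-of-arrival (DOA) rays estimated at each microphone. This is a generic geometric formulation, not necessarily the exact estimator of Thiergart et al.

```python
import numpy as np

def triangulate(origins, directions):
    """Least-squares intersection of DOA rays from several microphones.

    origins    : (P, 3) microphone positions
    directions : (P, 3) DOA vectors estimated at each microphone
    Solves for the point minimizing the summed squared distance to all
    rays; with exact, non-parallel DOAs this recovers the source position.
    """
    A = np.zeros((3, 3))
    b = np.zeros(3)
    for o, d in zip(np.atleast_2d(origins), np.atleast_2d(directions)):
        d = d / np.linalg.norm(d)
        # Projector onto the plane orthogonal to the ray direction
        M = np.eye(3) - np.outer(d, d)
        A += M
        b += M @ o
    return np.linalg.solve(A, b)
```

The estimated position then drives the point-source model (the ŝ₁, ŝ₂ distances in the figure) used to re-encode the source at the listening position.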

SLIDE 35

Results - Coloration

[Figure: coloration results for time-frequency analysis (TFA) and valid microphone interpolation (VMI).]

Finding: VMI achieves superior coloration performance for interior sources and/or large array spacings.

Tylka and Choueiri (2019). “Domains of Practical Applicability for Parametric Interpolation Methods for Virtual Sound Field Navigation.” Under Review.

SLIDE 36

Results - Localization

[Figure: localization results for valid microphone interpolation (VMI) and time-frequency analysis (TFA).]

Finding: TFA achieves superior localization performance only for interior sources with large array spacings.

Tylka and Choueiri (2019). “Domains of Practical Applicability for Parametric Interpolation Methods for Virtual Sound Field Navigation.” Under Review.

SLIDE 37

Domains of Practical Application

[Figure: domains of practical applicability maps for coloration and localization.]

Tylka and Choueiri (2019). “Domains of Practical Applicability for Parametric Interpolation Methods for Virtual Sound Field Navigation.” Under Review.

SLIDE 38

Summary - Part C.

  • Compared the performance of state-of-the-art methods across ranges of practically relevant conditions
  • Finding: for applications with intimate sources and sparsely distributed microphones, only VMI yields high tonal fidelity
  • Finding: in those applications, TFA yields accurate and superior localization
  • Finding: for applications with distant sources and sparse microphones, VMI and WAI achieve both accurate tonality and localization

SLIDE 39

Summary

  • Developed and experimentally validated objective metrics for evaluating navigational methods
  • Developed the valid microphone interpolation method, which circumvents the region-of-validity restriction
  • Characterized and compared this and other state-of-the-art navigational methods
  • Identified practical guidelines for choosing between them based on the intended application

SLIDE 40

Conclusions

I. The proposed metrics give usable predictions of perceived coloration and localization
II. Excluding invalid microphones improves coloration performance and localization accuracy
III. For applications with sparsely distributed microphones:
  • For distant sources, both VMI and WAI give high tonal fidelity and accurate localization performance
  • For intimate sources, only VMI gives high tonal fidelity and only TFA gives accurate localization performance

SLIDE 41

Acknowledgements


Local 315

SLIDE 42

Thank You
