Virtual Navigation of Ambisonics-Encoded Sound Fields Containing Near-Field Sources
Joseph G. Tylka Final Public Oral Examination Mechanical and Aerospace Engineering Princeton University Advisor: Edgar Y. Choueiri 9 May 2019
Signal chain: Capture → Encode to Ambisonics → Decode to virtual speakers → Convolve with HRTFs → Binaural signals
Image sources: https://developers.google.com/vr/concepts/spatial-audio; mh acoustics Eigenmike; Xie (2013), Head-Related Transfer Function and Virtual Auditory Display, Fig. 2.5; https://en.wikipedia.org/wiki/Ambisonics
Figure labels: Ambisonics mic.; sound source; R; region of validity.
Summary and conclusions
narrow frequency bands (Schärer and Lindau, 2009)
ERB-spaced gammatone filter bank
metrics exist (Boren et al., 2015)
Schärer and Lindau (2009). “Evaluation of Equalization Methods for Binaural Signals.”
Boren et al. (2015). “Coloration metrics for headphone equalization.”
\[ \eta(f_c) = 10 \log_{10} \left( \frac{\int |H_\Gamma(f; f_c)| \, |A(f)|^2 \, df}{\int |H_\Gamma(f; f_c)| \, |A_0(f)|^2 \, df} \right) \]
Figure: magnitude (dB) versus frequency (Hz), showing |A(f)|/|A0(f)|, η(fc), and |HΓ(f; fc)|.
\[ \rho_\eta = \max_c \eta(f_c) - \min_c \eta(f_c) \]
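As a rough illustration of how η(fc) and ρη could be evaluated numerically, the sketch below approximates the integrals by sums on a uniform frequency grid. Everything specific in it is an assumption rather than something stated on the slides: the function names are hypothetical, the gammatone magnitude uses a common fourth-order approximation, and the ERB formula is the standard Glasberg and Moore expression.

```python
import numpy as np

def erb(fc):
    """Equivalent rectangular bandwidth in Hz (Glasberg and Moore formula)."""
    return 24.7 * (4.37 * fc / 1000.0 + 1.0)

def erb_spaced_centers(f_lo, f_hi, n):
    """n center frequencies equally spaced on the ERB-number scale."""
    to_erbs = lambda f: 21.4 * np.log10(1.0 + 0.00437 * f)
    from_erbs = lambda e: (10.0 ** (e / 21.4) - 1.0) / 0.00437
    return from_erbs(np.linspace(to_erbs(f_lo), to_erbs(f_hi), n))

def gammatone_mag(f, fc, order=4):
    """Approximate gammatone magnitude response (assumed form, not from the slides)."""
    b = 1.019 * erb(fc)
    return (1.0 + ((f - fc) / b) ** 2) ** (-order / 2.0)

def band_errors(f, A, A0, centers):
    """eta(fc) in dB for each center frequency; f must be a uniform grid."""
    eta = []
    for fc in centers:
        w = gammatone_mag(f, fc)
        num = np.sum(w * np.abs(A) ** 2)   # grid spacing cancels in the ratio
        den = np.sum(w * np.abs(A0) ** 2)
        eta.append(10.0 * np.log10(num / den))
    return np.array(eta)

def spectral_range(eta):
    """rho_eta: largest minus smallest band error, in dB."""
    return float(np.max(eta) - np.min(eta))

# Illustrative input: a test spectrum that is a flat 2x gain over the reference.
f = np.linspace(20.0, 20000.0, 4000)
A0 = np.ones_like(f)
A = 2.0 * A0
eta = band_errors(f, A, A0, erb_spaced_centers(50.0, 16000.0, 30))
rho = spectral_range(eta)
```

With a flat gain difference, every band reports the same η (about 6.02 dB here), so ρη is zero: a pure level change registers no range across bands.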
Figure: predicted coloration score (axes 0 to 100), one panel per model: Proposed (r = 0.84); Kates (r = 0.72); Pulkki et al. (r = 0.77); Wittek et al. (r = 0.77).
Legend: data/model points; y = x; y = x ± 20.
Test details:
Tylka and Choueiri (2017). “Models for evaluating navigational techniques for higher-order ambisonics.”
Figure: recording/encoding and interpolation geometry (labels: 10 cm, 127 cm, 5 cm, θ; positions numbered 10 through 15, …).
1. Transform to plane-wave impulse responses (IRs)
2. Split each IR into wavelets
3. Threshold to find onset times
4. Compute average amplitude in each critical band
5. Compute perceptually-weighted “energy vector” in each band
6. Compute average vector over all bands
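Steps 5 and 6 above can be sketched as follows. This is only a minimal illustration: it uses the classic amplitude-squared energy vector, and the `weights` argument is a hypothetical placeholder for the perceptual weighting, whose actual form is not given on the slides. The function names are likewise assumptions.

```python
import numpy as np

def energy_vector(directions, amplitudes, weights=None):
    """Energy vector for one band.

    directions: (N, 3) unit vectors of the plane-wave arrival directions.
    amplitudes: (N,) average amplitudes in this band.
    weights:    (N,) optional perceptual weights (hypothetical placeholder).
    """
    directions = np.asarray(directions, dtype=float)
    energy = np.asarray(amplitudes, dtype=float) ** 2
    if weights is not None:
        energy = energy * np.asarray(weights, dtype=float)
    # Energy-weighted mean of the direction vectors.
    return (energy[:, None] * directions).sum(axis=0) / energy.sum()

def average_direction(band_vectors):
    """Average the per-band vectors and normalize to a unit direction."""
    mean_vec = np.mean(np.asarray(band_vectors, dtype=float), axis=0)
    return mean_vec / np.linalg.norm(mean_vec)

# Two equal-amplitude plane waves along +x and +y.
rE = energy_vector([[1, 0, 0], [0, 1, 0]], [1.0, 1.0])
direction = average_direction([[1, 0, 0], [0, 1, 0]])
```

Two equal-energy waves from orthogonal directions give a vector halfway between them with length below one, which is the usual sign of a spread-out image.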
Block diagram labels: Plane-wave IR; High-pass; Find peaks; Window; Wavelets.
Stitt et al. (2016). “Extended Energy Vector Prediction of Ambisonically Reproduced Image Direction at Off-Center Listening Positions.”
Pearson correlation coefficient: r = 0.85
Mean absolute error: ε = 3.67°
Test details:
Legend: data/model points; y = x; y = x ± 5°.
Tylka and Choueiri (2017). “Models for evaluating navigational techniques for higher-order ambisonics.”
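For reference, the two summary statistics reported above (Pearson correlation coefficient and mean absolute error) can be computed as in the sketch below. The sample values are made up purely for illustration and are not the listening-test data; the function names are hypothetical, and angle wrap-around is not handled.

```python
import numpy as np

def pearson_r(predicted, measured):
    """Pearson correlation coefficient between predictions and measurements."""
    return float(np.corrcoef(predicted, measured)[0, 1])

def mean_absolute_error(predicted, measured):
    """Mean absolute difference, in the same units as the inputs (e.g. degrees)."""
    diff = np.asarray(predicted, dtype=float) - np.asarray(measured, dtype=float)
    return float(np.mean(np.abs(diff)))

# Made-up example angles in degrees (illustrative only).
predicted = [0.0, 10.0, 20.0, 30.0]
measured = [2.0, 9.0, 23.0, 29.0]
r = pearson_r(predicted, measured)
mae = mean_absolute_error(predicted, measured)
```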
metrics
in human perception of coloration
giving reasonable predictions of perceived localization
Sound source
Listening position
Region of validity
Inputs: signals from mic. 1 through mic. P; microphone positions; listening position
Detect and locate near-field sources
Determine valid microphones
Discard signals from invalid microphones
Compute interpolation filter matrix
Apply interpolation filter matrix
Output: interpolated signals
Tylka and Choueiri (2019). “A Parametric Method for Virtual Navigation Within an Array of Ambisonics Microphones.” Under Review.
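The "determine valid microphones" step might look like the sketch below. It assumes that each microphone's region of validity is a sphere centered on the microphone and reaching out to its nearest source, and that a microphone is valid when the listening position falls inside that sphere. Both the criterion and the function name are assumptions made for illustration, not details taken from the slides.

```python
import numpy as np

def valid_microphones(mic_positions, source_positions, listener):
    """Boolean mask, one entry per microphone (True = valid).

    mic_positions:    (P, D) microphone coordinates.
    source_positions: (S, D) estimated near-field source coordinates.
    listener:         (D,) listening position.
    """
    mics = np.asarray(mic_positions, dtype=float)
    srcs = np.asarray(source_positions, dtype=float)
    listener = np.asarray(listener, dtype=float)
    # Assumed radius of validity: distance from each mic to its nearest source.
    radii = np.linalg.norm(mics[:, None, :] - srcs[None, :, :], axis=2).min(axis=1)
    dist_to_listener = np.linalg.norm(mics - listener, axis=1)
    return dist_to_listener < radii

# Two microphones 1 m apart, one source close to the first microphone.
mask = valid_microphones([[0.0, 0.0], [1.0, 0.0]], [[0.2, 0.0]], [0.5, 0.0])
```

In this example the first microphone is marked invalid (the source is nearer to it than the listener is) and the second is kept, so only the second microphone's signals would be passed on to the interpolation.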
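Figure: geometry in the x-y plane, with labels \( s_0, \vec{s}_1, \vec{s}_2 \) and \( \varphi_0, \varphi_1, \varphi_2 \).
\[ \gamma = \frac{s_0}{\Delta/2} \]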
Tylka and Choueiri (2019). “A Parametric Method for Virtual Navigation Within an Array of Ambisonics Microphones.” Under Review.
Figure panels: Weighted-Average Interpolation (WAI); Valid Microphone Interpolation (VMI)
Finding: excluding the invalid microphone improves coloration performance for interior sources
Tylka and Choueiri (2019). “A Parametric Method for Virtual Navigation Within an Array of Ambisonics Microphones.” Under Review.
Figure panels: Weighted-Average Interpolation (WAI); Valid Microphone Interpolation (VMI)
Finding: excluding the invalid microphone improves localization performance for interior sources
Tylka and Choueiri (2019). “A Parametric Method for Virtual Navigation Within an Array of Ambisonics Microphones.” Under Review.
method which excludes invalid microphones from the interpolation
weighted-average interpolation (WAI)
significantly improves coloration and localization performance for interior sources
Figure panels: Source Triangulation; Point-Source Modeling (labels: s0, ŝ1, ŝ2)
Thiergart et al. (2013). “Geometry-Based Spatial Sound Acquisition Using Distributed Microphone Arrays.”
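One generic way to triangulate a source from direction-of-arrival estimates at several microphones is to find the point closest, in the least-squares sense, to the lines through each microphone along its estimated direction. The sketch below shows that idea only; it is not claimed to be the procedure of Thiergart et al., and the function name is hypothetical.

```python
import numpy as np

def triangulate(mic_positions, doa_vectors):
    """Least-squares intersection of lines through each mic along its DOA.

    mic_positions: (P, D) microphone coordinates.
    doa_vectors:   (P, D) direction-of-arrival vectors pointing toward the source.
    """
    mics = np.asarray(mic_positions, dtype=float)
    dirs = np.asarray(doa_vectors, dtype=float)
    dirs = dirs / np.linalg.norm(dirs, axis=1, keepdims=True)
    dim = mics.shape[1]
    lhs = np.zeros((dim, dim))
    rhs = np.zeros(dim)
    for p, u in zip(mics, dirs):
        # Projector onto the subspace orthogonal to the line direction.
        proj = np.eye(dim) - np.outer(u, u)
        lhs += proj
        rhs += proj @ p
    # Fails (singular matrix) if all directions are parallel.
    return np.linalg.solve(lhs, rhs)

# Two microphones at (-1, 0) and (1, 0), both pointing at a source at (0, 1).
estimate = triangulate([[-1.0, 0.0], [1.0, 0.0]], [[1.0, 1.0], [-1.0, 1.0]])
```

Because the lines are treated as infinite in both directions, this formulation cannot tell a source in front of a microphone from one behind it; a practical system would need an extra check for that.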
Figure panels: Time-Frequency Analysis (TFA); Valid Microphone Interpolation (VMI)
Finding: VMI achieves superior coloration performance for interior sources and/or large array spacings
Tylka and Choueiri (2019). “Domains of Practical Applicability for Parametric Interpolation Methods for Virtual Sound Field Navigation.” Under Review.
Figure panels: Valid Microphone Interpolation (VMI); Time-Frequency Analysis (TFA)
Finding: TFA achieves superior localization performance only for interior sources with large array spacings
Tylka and Choueiri (2019). “Domains of Practical Applicability for Parametric Interpolation Methods for Virtual Sound Field Navigation.” Under Review.
Tylka and Choueiri (2019). “Domains of Practical Applicability for Parametric Interpolation Methods for Virtual Sound Field Navigation.” Under Review.
across ranges of practically-relevant conditions
sparsely distributed microphones, only VMI yields high tonal fidelity
superior localization
microphones, VMI and WAI achieve both accurate tonality and localization
metrics for evaluating navigational methods
which circumvents the region of validity restriction
the-art navigational methods
them based on intended application
I. The proposed metrics give usable predictions of perceived coloration and localization
performance and localization accuracy
fidelity and accurate localization performance
and only TFA gives accurate localization performance
Local 315