
Matching the Analysis Scheme to the Signal - Fritz Menzer



  1. Time-Frequency Analysis for Audio Workshop: Matching the Analysis Scheme to the Signal. Fritz Menzer (fritz.menzer@epfl.ch), Communication Systems, 5th year, Ecole Polytechnique Fédérale de Lausanne, 15th April, 2004

  2. Overview
     1 Introduction
     2 Perfect Reconstruction - who cares?
       2.1 Definition of perfect reconstruction
       2.2 Do we need perfect reconstruction?
     3 Harmonic Band Wavelet Transform
       3.1 Coefficient modeling
       3.2 Advantages / Drawbacks
     4 From HBWT to inharmonic sound modeling
       4.1 Taking filters from different PR filterbanks
       4.2 Why aliasing is not a problem
       4.3 Method Overview
       4.4 Sounds
     5 Time-Frequency Analysis and Granular Synthesis
       5.1 Time-domain effects
       5.2 Scale of all grains in a 1024-band full-tree wavelet decomposition
     A References

  3. 1 Introduction • If you know what you’re looking at, you can examine it more precisely.

  4. 2 Perfect Reconstruction - who cares? 2.1 Definition of perfect reconstruction • Definition: a Perfect Reconstruction (PR) method provides direct and inverse transforms $T$ and $T^{-1}$ such that, for any signal $s$, $T^{-1}(T(s)) = s$. • FFT-based methods, Cosine Modulated Filterbanks and Wavelet transforms are usually PR methods. • Simple operations like filtering or distortion do not necessarily allow PR (i.e. it may be impossible to find $T^{-1}$). Example: quantisation does not allow the original signal to be reconstructed perfectly.
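The PR definition can be illustrated with a minimal sketch. It assumes a one-level orthonormal Haar wavelet transform as a stand-in for the transform T (the slides do not prescribe a particular one), and the function names, the test signal and the quantisation step of 0.5 are all chosen here for illustration only.

```python
import math

SQRT_HALF = 1 / math.sqrt(2)

def haar_forward(signal):
    """One-level orthonormal Haar transform T (even-length input assumed)."""
    pairs = [(signal[i], signal[i + 1]) for i in range(0, len(signal), 2)]
    approx = [(a + b) * SQRT_HALF for a, b in pairs]
    detail = [(a - b) * SQRT_HALF for a, b in pairs]
    return approx, detail

def haar_inverse(approx, detail):
    """Inverse transform T^-1: rebuilds the sample pairs from the coefficients."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * SQRT_HALF)
        out.append((a - d) * SQRT_HALF)
    return out

def quantise(values, step):
    """Uniform quantiser: a simple operation that has no inverse."""
    return [step * round(v / step) for v in values]

# Illustrative signal (hypothetical values).
signal = [0.3, -1.2, 0.8, 0.5, -0.1, 0.9, 1.4, -0.7]

# PR round trip: T^-1(T(s)) equals s up to floating-point rounding.
approx, detail = haar_forward(signal)
reconstructed = haar_inverse(approx, detail)
pr_error = max(abs(a - b) for a, b in zip(signal, reconstructed))

# Quantising the coefficients before inverting breaks perfect reconstruction.
lossy = haar_inverse(quantise(approx, 0.5), quantise(detail, 0.5))
quant_error = max(abs(a - b) for a, b in zip(signal, lossy))

print("round-trip error:", pr_error)
print("error after quantisation:", quant_error)
```

The round trip leaves only rounding error, while the quantised path leaves a visible error: this is the sense in which a codec built on a PR transform lets every reconstruction error be attributed to the quantiser.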

  5. 2.2 Do we need perfect reconstruction? [Figure: two signals, "Noise" and "Noise, down- and upsampled by 4", each plotted against samples and against frequency [kHz]]
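The down- and upsampling by 4 named on this slide is an example of an operation that is not PR. The sketch below assumes plain decimation followed by zero insertion, with no anti-aliasing or interpolation filter; the slides do not say how the resampling was actually done, and the signal length and random seed are arbitrary choices.

```python
import random

FACTOR = 4

# White Gaussian noise as the test signal (seed fixed for repeatability).
random.seed(0)
noise = [random.gauss(0.0, 1.0) for _ in range(32)]

# Downsample: keep every 4th sample, discard the rest.
downsampled = noise[::FACTOR]

# Upsample: put each kept sample back in place, zeros in between.
upsampled = []
for value in downsampled:
    upsampled.append(value)
    upsampled.extend([0.0] * (FACTOR - 1))

# Count the positions where the original sample was lost.
lost = sum(1 for a, b in zip(noise, upsampled) if a != b)
print(f"{lost} of {len(noise)} samples differ from the original")
```

Three out of every four samples cannot be recovered, so no inverse transform exists for this operation.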

  6. Do we need perfect reconstruction? • Not needed for: – Modifying a signal – Handling noise – Signals whose nature is known • Why use PR methods for compression? – Generality (ideally any signal can be treated) – Localising the source of errors!

  7. 3 Harmonic Band Wavelet Transform (Polotti and Evangelista, 2000) [Block diagram: the input x(n) feeds P parallel channels with filters g_0(k), g_1(k), ..., g_{P−1}(k), each downsampled by P and followed by a cascade of φ(k) and ψ(k) filters with downsampling by 2. The output of channel 0 is labelled "DC Comp."; the outputs of the other channels are labelled "Sinusoidal Part".]

  8. [Figure: time-frequency plot, frequency [Hz] (0 to 10000) versus time [sec] (0 to 1)]

  9. [Figure: time-frequency plot, frequency [Hz] (0 to 10000) versus time [sec] (0 to 1)]

  10. 3.1 Coefficient modeling • Wavelet Transform • Model the scale residual sinusoidally • Model the wavelet coefficients using LPC [Figure: magnitude responses plotted against ω: the scale residual |Φ_{4,0}(ω)| (N = 4) and the wavelets |Ψ_{n,0}(ω)| for scales n = 1, 2, 3, 4]
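The LPC step can be sketched as follows. This is a generic autocorrelation-method linear predictor solved with the Levinson-Durbin recursion, not necessarily the exact procedure used in the HBWT work; the function names and the toy coefficient sequence are assumptions made for illustration.

```python
def autocorrelation(x, max_lag):
    """Autocorrelation r[0..max_lag] of the sequence x."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]

def levinson_durbin(r, order):
    """Solve the LPC normal equations for a predictor of the given order.

    Returns (a, error) where x[n] is predicted as
    a[0]*x[n-1] + a[1]*x[n-2] + ... and error is the residual energy.
    """
    a = [0.0] * (order + 1)          # a[0] is unused, a[1..order] are taps
    error = r[0]
    for i in range(1, order + 1):
        # Reflection coefficient for this order.
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= 1.0 - k * k
    return a[1:], error

# Toy stand-in for one band of wavelet coefficients (hypothetical data):
# a sequence obeying x[n] = 0.9 * x[n-1].
toy_coefficients = [0.9 ** n for n in range(200)]

r = autocorrelation(toy_coefficients, 2)
lpc, residual_energy = levinson_durbin(r, 2)
print("predictor taps:", lpc)
```

For this toy sequence the first tap comes out close to 0.9 and the second close to zero, so two numbers summarise 200 coefficients; the same idea lets a few LPC parameters per band stand in for the wavelet coefficients.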

  11. 3.2 Advantages / Drawbacks + Meaningful adaptation of frequency and time resolution ⇒ Visually better resolution + Reasonable model for the coefficients − Works only for monophonic, harmonic sounds − No model for the transients

  12. 4 From HBWT to inharmonic sound modeling [Figure: time-frequency plot, frequency [Hz] (200 to 1800) versus time [sec] (1 to 7)]

  13. 4.1 Taking filters from different PR filterbanks [Figure: two frequency axes (ω), each marking the 1st, 2nd, 3rd, ... partials]

  14. 4.2 Why aliasing is not a problem If a sinusoid of the form $\sin\left(\frac{\hat{k}\pi}{P}\,t + \varphi\right)$ is the input to a P-channel cosine modulated filterbank, only two bands will output nonzero coefficients: $\left|H_k\left(e^{j\hat{k}\pi/P}\right)\right| \neq 0 \Leftrightarrow k \in \{\hat{k}-1, \hat{k}\}$ [Figure: frequency axis ω with the partial's frequency marked] ⇒ there is no aliasing of the sinusoidal part, but only of the part that we model as noise!

  15. 4.3 Method Overview Analysis: analyse signal → find N partials → determine filterbank → calculate 2N sets of filterbank coefficients + residual → calculate wavelet transform (WT) of filterbank coefficients → model the WT coefficients sinusoidally and with LPC. Synthesis: reconstruct WT coefficients → perform inverse wavelet transform → get filterbank coefficients → inverse filterbank → add residual (or not).

  16. 4.4 Sounds • Original Gong • Reconstructed from the Filterbank Coefficients • Synthesized from model parameters • 1 octave pitch-shifted Gong • Time-stretched Gong • Sinusoidal-only Gong • First wavelet scale only • Harmonic Gong

  17. 5 Time-Frequency Analysis and Granular Synthesis • Any Time-Frequency Transform implements a sort of Granular Synthesis. • Each coefficient corresponds to a grain. • Grains are played at precise instants (instead of randomly). • To produce a grain, set all coefficients to zero except one, which is set to one, then perform the inverse transform.
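The grain recipe in the last bullet can be sketched directly. The example assumes a dyadic Haar wavelet decomposition as the transform, chosen only because its inverse fits in a few lines; the coefficient layout and the function names are assumptions made here, and the grains on the following slides come from other transforms.

```python
import math

SQRT_HALF = 1 / math.sqrt(2)

def inverse_haar_step(approx, detail):
    """Undo one Haar level: merge approximation and detail coefficients."""
    out = []
    for a, d in zip(approx, detail):
        out.append((a + d) * SQRT_HALF)
        out.append((a - d) * SQRT_HALF)
    return out

def inverse_haar(coeffs):
    """Inverse of a full dyadic Haar decomposition.

    Assumed layout: [approximation, coarsest detail, ..., finest details],
    with a power-of-two total length.
    """
    signal = coeffs[:1]
    position = 1
    while position < len(coeffs):
        detail = coeffs[position:position + len(signal)]
        position += len(signal)
        signal = inverse_haar_step(signal, detail)
    return signal

def grain(index, size):
    """Set one coefficient to one, all others to zero, and invert."""
    coeffs = [0.0] * size
    coeffs[index] = 1.0
    return inverse_haar(coeffs)

coarse_grain = grain(0, 8)   # approximation coefficient: long, flat grain
fine_grain = grain(7, 8)     # finest detail coefficient: short grain at the end
print(coarse_grain)
print(fine_grain)
```

The coarse grain spans the whole block while the fine grain occupies only the last two samples, which shows how the position of a coefficient fixes both the shape of its grain and the instant at which it is played.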

  18. Windowed FFT (STFT) grain [Figure: grain waveform, amplitude (×10⁻³) versus time [msec] (0 to 12)]

  19. Cosine Modulated Filterbank grain [Figure: grain waveform, amplitude versus time [msec] (0 to 5)]

  20. Full-tree wavelet “grain” [Figure: grain waveform, amplitude versus time [msec] (0 to 60)]

  21. HBWT grain (noise part) [Figure: grain waveform, amplitude versus time [msec] (0 to 16)]

  22. HBWT grain (sinusoidal part) [Figure: grain waveform, amplitude versus time [msec] (0 to 160)]

  23. 5.1 Time-domain effects [Figure: two waveforms, amplitude versus time [msec] (0 to 7): "Channel 8: one grain played continuously" and "Channel 9: one grain played continuously"]

  24. 5.2 Scale of all grains in a 1024-band full-tree wavelet decomposition [Figure: time-frequency plot, frequency [Hz] (up to 2.2 × 10⁴) versus time [sec] (0.5 to 5.5)]

  25. A References • Article on the Harmonic Band Wavelet Transform by Polotti and Evangelista: http://lcavwww.epfl.ch/publications/publications/2000/PolottiE00b.pdf • DAFx 2002 paper on adaptation to inharmonic sounds: http://lcavwww.epfl.ch/publications/publications/2002/PolottiE02.pdf • Some material (presentation slides, Matlab functions and Pure Data objects): http://www.xsmusic.ch/
