
CTP431- Music and Audio Computing Digital Audio Graduate School of - PowerPoint PPT Presentation



  1. CTP431 - Music and Audio Computing: Digital Audio. Graduate School of Culture Technology, KAIST. Juhan Nam

  2. Digital Representations [Figure: sound, image, and text each shown as a bit stream such as … 0 1 1 0 1 1 0 …]

  3. Digital Representations § Sampling and Quantization – Sound (samples) – Image (pixels) § Trade-off – Resolution (quality) and data size

  4. Digital Audio Chain [Figure: digital audio chain, with the signal carried as a bit stream … 0 0 1 0 1 0 …]

  5. Sampling • Convert a continuous-time signal to a discrete-time signal by periodically picking up the instantaneous values – Represented as a sequence of numbers; pulse code modulation (PCM) – Sampling period ($T_s$): the amount of time between samples – Sampling rate ($f_s = 1/T_s$) – Signal notation: $x(t) \rightarrow x(nT_s)$
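
The notation $x(t) \rightarrow x(nT_s)$ can be made concrete with a minimal sketch. The function name `sample_sine` and the 1 kHz / 8 kHz values are illustrative assumptions, not taken from the slides.

```python
import math

def sample_sine(freq_hz, fs_hz, n_samples, amplitude=1.0):
    """Return x(n*Ts) = A * sin(2*pi*f*n*Ts) for n = 0 .. n_samples-1."""
    ts = 1.0 / fs_hz  # sampling period Ts = 1 / fs
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * n * ts)
            for n in range(n_samples)]

# Hypothetical example: a 1 kHz sine sampled at 8 kHz gives 8 samples per cycle.
samples = sample_sine(freq_hz=1000.0, fs_hz=8000.0, n_samples=8)
print([round(s, 3) for s in samples])
```

The resulting list of numbers is the PCM-style representation the slide refers to, before any quantization is applied.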

  6. Sampling Theorem § What is an appropriate sampling rate? – Too high: increases the data rate – Too low: the original signal becomes hard to reconstruct § Sampling Theorem – In order for a band-limited signal to be reconstructed fully, the sampling rate must be greater than twice the maximum frequency in the signal: $f_s > 2 \cdot f_m$ – Half the sampling rate ($f_s/2$) is called the Nyquist frequency

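  7. Sampling in Frequency Domain § Sampling in time creates copies (images) of the original spectrum at every integer multiple of $f_s$ [Figure: spectrum from $-f_m$ to $f_m$ inside the audible range, with images around $\pm f_s$ spanning $f_s - f_m$ to $f_s + f_m$; the Nyquist frequency is marked] § Why? Sampling $x(t) = \sin(2\pi f_m t)$ at $t = nT_s$ gives $x(n) = \sin(2\pi f_m n T_s) = \sin(2\pi f_m n / f_s) = \sin(2\pi f_m n / f_s \pm 2\pi k n) = \sin(2\pi n (f_m \pm k f_s) / f_s)$ for any integer $k$, so the frequencies $f_m \pm k f_s$ all produce the same samples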
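
The identity behind the "Why?" step can be checked numerically. The following is a small sketch under assumed values (`fs = 8000`, `fm = 1000`, and the helper names are hypothetical): a sine at $f_m$ and sines at $f_m + k f_s$ should give the same samples up to floating-point error.

```python
import math

def sampled_sine(freq_hz, fs_hz, n_samples):
    """Samples of sin(2*pi*f*n/fs) for n = 0 .. n_samples-1."""
    return [math.sin(2.0 * math.pi * freq_hz * n / fs_hz)
            for n in range(n_samples)]

def max_image_difference(fm_hz, fs_hz, k, n_samples=32):
    """Largest sample-wise difference between fm and its image fm + k*fs."""
    base = sampled_sine(fm_hz, fs_hz, n_samples)
    image = sampled_sine(fm_hz + k * fs_hz, fs_hz, n_samples)
    return max(abs(a - b) for a, b in zip(base, image))

fs = 8000.0  # assumed sampling rate
fm = 1000.0  # assumed signal frequency
for k in (1, 2, -1):
    # Differences should be at the level of rounding error only.
    print(k, max_image_difference(fm, fs, k))
```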

  8. Aliasing § If the sampling rate is less than twice the maximum frequency, the high-frequency content is folded over to the lower frequency range [Figure: plot not preserved]

  9. Aliasing in Frequency Domain § Sampling in time creates copies (images) of the original spectrum at every integer multiple of $f_s$ [Figure: the image starting at $f_s - f_m$ overlaps the original spectrum, which extends to $f_m$, inside the audible range] § The frequency that we hear is $f_s - f_m$ § In order to avoid aliasing: $f_m < f_s - f_m$
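
As a rough illustration of the folding rule, the sketch below maps any input frequency to the frequency that is actually heard after sampling. The function name `alias_frequency` and the 30 kHz example are assumptions made for illustration; for $f_s/2 < f_m < f_s$ the result reduces to the slide's $f_s - f_m$.

```python
def alias_frequency(freq_hz, fs_hz):
    """Frequency in [0, fs/2] at which a sampled sinusoid of freq_hz appears."""
    folded = freq_hz % fs_hz      # spectral images repeat every fs
    if folded > fs_hz / 2.0:      # above the Nyquist frequency: fold back
        folded = fs_hz - folded
    return folded

# Hypothetical example: a 30 kHz tone sampled at 44.1 kHz.
print(alias_frequency(30000.0, 44100.0))  # 44100 - 30000 = 14100.0
print(alias_frequency(1000.0, 44100.0))   # below Nyquist, unchanged: 1000.0
```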

  10. Aliasing in Frequency Domain § For general signals, high-frequency content is folded over to the lower frequency range [Figure: overlapping spectral images around $f_s$, with the audible range marked]

  11. Avoid Aliasing § Increase the sampling rate: $f_s > 2 \cdot f_m$ § Use lowpass filters before sampling [Figure: a lowpass filter applied to the spectrum, with the band from $-f_s/2$ to $f_s/2$ marked]
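
The slide refers to a lowpass filter placed before sampling. The same idea applies digitally when reducing the sampling rate of an already sampled signal, which is what the sketch below shows. Everything in it is an assumption for illustration: the Hamming-windowed sinc design, the 101 taps, the 5 kHz cutoff, and the 48 kHz to 12 kHz rate change are not from the slides.

```python
import math

def lowpass_fir(cutoff_hz, fs_hz, num_taps=101):
    """Hamming-windowed sinc lowpass coefficients (num_taps should be odd)."""
    fc = cutoff_hz / fs_hz            # cutoff in cycles per sample
    mid = (num_taps - 1) // 2
    taps = []
    for n in range(num_taps):
        m = n - mid
        ideal = 2.0 * fc if m == 0 else math.sin(2.0 * math.pi * fc * m) / (math.pi * m)
        window = 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (num_taps - 1))
        taps.append(ideal * window)
    total = sum(taps)
    return [t / total for t in taps]  # normalize to unity gain at 0 Hz

def fir_filter(signal, taps):
    """Direct convolution with zero initial state; output has the input length."""
    out = []
    for n in range(len(signal)):
        acc = 0.0
        for k, h in enumerate(taps):
            if n - k >= 0:
                acc += h * signal[n - k]
        out.append(acc)
    return out

def rms(values):
    return math.sqrt(sum(v * v for v in values) / len(values))

fs = 48000.0
# A 9 kHz tone lies above the new Nyquist frequency (6 kHz) after keeping every 4th sample.
tone = [math.sin(2.0 * math.pi * 9000.0 * n / fs) for n in range(2000)]
taps = lowpass_fir(cutoff_hz=5000.0, fs_hz=fs)
naive = tone[::4]                     # no filter: the tone aliases at full level
safe = fir_filter(tone, taps)[::4]    # filtered first: the tone is strongly attenuated
print(rms(naive), rms(safe[50:]))     # skip the filter's start-up transient
```

Without the filter, the 9 kHz tone reappears at 3 kHz in the 12 kHz stream; with it, the tone is removed before it can fold over.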

  12. Example of Aliasing [Figure: magnitude (dB) versus frequency (kHz) for the trivial sawtooth wave spectrum and the bandlimited sawtooth wave spectrum; frequency (Hz) versus time (s) for a frequency sweep of the trivial sawtooth wave]
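
The difference between the two sawtooth waves can be sketched in code. A trivial (naive) sawtooth contains harmonics at every multiple of its fundamental, so the ones above $f_s/2$ alias; a bandlimited version can be built by summing only the harmonics below $f_s/2$. Additive synthesis is one possible way to do this and is an assumption here, as are the function names; the slides do not say how their bandlimited wave was generated.

```python
import math

def trivial_sawtooth(f0_hz, fs_hz, n_samples):
    """Naive sawtooth in [-1, 1): a wrapped ramp, not bandlimited."""
    return [2.0 * ((f0_hz * n / fs_hz + 0.5) % 1.0) - 1.0
            for n in range(n_samples)]

def count_harmonics(f0_hz, fs_hz):
    """Number of harmonics k with k * f0 strictly below the Nyquist frequency."""
    return math.ceil(fs_hz / (2.0 * f0_hz)) - 1

def bandlimited_sawtooth(f0_hz, fs_hz, n_samples):
    """Sum of the sawtooth's Fourier series terms below fs/2."""
    num = count_harmonics(f0_hz, fs_hz)
    out = []
    for n in range(n_samples):
        t = n / fs_hz
        acc = 0.0
        for k in range(1, num + 1):
            # Fourier series of a rising sawtooth: (2/pi) * sum (-1)^(k+1) sin(2 pi k f0 t) / k
            acc += (-1.0) ** (k + 1) * math.sin(2.0 * math.pi * k * f0_hz * t) / k
        out.append(2.0 / math.pi * acc)
    return out

print(count_harmonics(1000.0, 44100.0))  # harmonics of a 1 kHz sawtooth kept at 44.1 kHz
```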

  13. Example of Aliasing § Aliasing in video: https://www.youtube.com/watch?v=jHS9JGkEOmA

  14. Sampling Rates § Determined by the bandwidth of signals or hearing limits – Consumer audio products: 44.1 kHz (CD) – Professional audio gear: 48/96/192 kHz – Speech communication: 8/16 kHz

  15. Quantization § Discretizing the amplitude of real-valued signals – Round the amplitude to the nearest discrete step – The discrete steps are determined by the number of bits • Audio CD: 16 bits ($-2^{15}$ to $2^{15}-1$) • In general: $B$ bits ($-2^{B-1}$ to $2^{B-1}-1$)
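
A minimal sketch of the rounding step follows, assuming the input is a real value normalized to the range $[-1, 1)$ (the normalization and the name `quantize` are assumptions). Values outside the representable range are limited to the nearest end code.

```python
def quantize(x, bits=16):
    """Round x (nominally in [-1.0, 1.0)) to a signed integer code of `bits` bits."""
    min_code = -(2 ** (bits - 1))      # e.g. -32768 for 16 bits
    max_code = 2 ** (bits - 1) - 1     # e.g.  32767 for 16 bits
    code = round(x * 2 ** (bits - 1))  # nearest step (Python rounds exact ties to even)
    return max(min_code, min(max_code, code))

print(quantize(0.5))    # 16384
print(quantize(-1.0))   # -32768
print(quantize(1.0))    # 32767, since +32768 is not representable
```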

  16. Quantization Error § Quantization causes noise – Average power of quantization noise: obtained from the probability density function (PDF) of the error, $p(e)$, which is uniform over $[-1/2, 1/2]$ – Root mean square (RMS) of noise: $N_{rms} = \sqrt{\int_{-1/2}^{1/2} e^2 \, p(e) \, de} = \sqrt{1/12}$ § Signal to Noise Ratio (SNR) – Based on average power, where $S_{rms} = 2^{B-1}/\sqrt{2}$ is the RMS of a full-scale sine wave: $20\log_{10}\frac{S_{rms}}{N_{rms}} = 20\log_{10}\frac{2^{B-1}/\sqrt{2}}{\sqrt{1/12}} = 6.02B + 1.76$ dB (with 16 bits, SNR = 98.08 dB) – Based on the max levels: $20\log_{10}\frac{S_{max}}{N_{max}} = 20\log_{10}\frac{2^{B-1}}{1/2} = 6.02B$ dB (with 16 bits, SNR = 96.32 dB)
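
Both SNR expressions can be evaluated directly, as in the sketch below (the function names are hypothetical). Evaluating the logarithms without rounding gives values that agree with the slide's figures to within about 0.01 dB, because 6.02 and 1.76 are themselves rounded constants.

```python
import math

def snr_rms_db(bits):
    """SNR from RMS levels: full-scale sine wave over uniform quantization noise."""
    signal_rms = 2 ** (bits - 1) / math.sqrt(2.0)
    noise_rms = math.sqrt(1.0 / 12.0)
    return 20.0 * math.log10(signal_rms / noise_rms)

def snr_peak_db(bits):
    """SNR from max levels: peak 2^(B-1) over a maximum error of 1/2 step."""
    return 20.0 * math.log10(2 ** (bits - 1) / 0.5)

for b in (8, 16, 24):
    print(b, round(snr_rms_db(b), 2), round(snr_peak_db(b), 2))
```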

  17. Dynamic Range § Dynamic range – The ratio between the loudest and softest levels (again using the RMS of a full-scale sine wave, for both the loudest and the softest level): $20\log_{10}\frac{S_{rms,max}}{S_{rms,min}} = 20\log_{10}\frac{2^{B-1}/\sqrt{2}}{1/\sqrt{2}} = 6.02(B-1) \approx 6.02B - 6$ dB (with 16 bits, DR = 90.31 dB) § Human ear’s dynamic range – Depends on the frequency band [Figure: Equal Loudness Curve]
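
The dynamic range figure can be reproduced with a few lines. This is a sketch with a hypothetical function name, taking the softest signal to be a sine wave with a peak of one quantization step, as in the ratio on the slide.

```python
import math

def dynamic_range_db(bits):
    """Ratio of the loudest to the softest sine-wave RMS level, in dB."""
    loudest_rms = 2 ** (bits - 1) / math.sqrt(2.0)  # full-scale sine wave
    softest_rms = 1.0 / math.sqrt(2.0)              # sine wave with a one-step peak
    return 20.0 * math.log10(loudest_rms / softest_rms)

print(round(dynamic_range_db(16), 2))  # 90.31
```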

  18. Clipping and Headroom § Clipping – Non-linear distortion that occurs when a signal is above the max level § Headroom – Margin between the peak level and the max level § In digital audio, 0 dB is regarded as the maximum level [Figure: level diagram for $B = 16$ bits: clipping above the max level at 0 dB, headroom below it, min level at -90.31 dB, noise floor (from quantization) at -98.08 dB]
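
Clipping and headroom can be sketched as follows, assuming samples normalized so that the max level is 1.0 (that normalization and the function names are assumptions). Headroom is expressed in dB relative to the max level, matching the 0 dB convention above.

```python
import math

def hard_clip(x, max_level=1.0):
    """Limit a sample to [-max_level, max_level]; anything beyond is flattened."""
    return max(-max_level, min(max_level, x))

def headroom_db(peak, max_level=1.0):
    """Margin in dB between the signal's peak level and the max level."""
    return 20.0 * math.log10(max_level / peak)

signal = [0.2, 0.9, 1.4, -1.7, 0.5]
print([hard_clip(s) for s in signal])  # [0.2, 0.9, 1.0, -1.0, 0.5]
print(round(headroom_db(0.5), 2))      # a peak at half of full scale leaves 6.02 dB
```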
