
CTP431: Music and Audio Computing, Fundamentals of Sound and Digital Audio



  1. CTP431: Music and Audio Computing
     Fundamentals of Sound and Digital Audio
     Graduate School of Culture Technology, KAIST
     Juhan Nam

  2. Outlines
     § What is Sound?
     § Sound Properties
       – Loudness
       – Pitch
       – Timbre
     § Digital Representation of Sound
       – Sampling
       – Quantization

  3. What Is Sound?
     § Vibration of air that you can hear
       – Compression and rarefaction of air pressure
     [Diagram: Production (vibration on materials, e.g. string, pipe, membrane) → Propagation (traveling via the air) → Perception (sensation of the air vibration through ears). Production and propagation are physical; perception is psychological.]

  4. Physical Sound
     § Governed by "Newton's law" and "wave" properties
     § Sound production and propagation in musical instruments
       1. Drive force on a sound object
       2. Vibration by restoring force
       3. Propagation
       4. Reflection
       5. Superposition
       6. Standing wave (modes): generate a tone
       7. Radiation from the object
       8. Propagation through air
     § Demos
       – http://www.acs.psu.edu/drussell/demos.html
       – https://www.youtube.com/watch?v=_X72on6CSL0

  5. Psychological Sound
     § Governed by ears (physiological sense) and brain (cognitive sense): the human auditory system
     § Ears
       – A series of highly sensitive transducers (air → mechanical → fluid → electric)
       – Transform sound into subband signals
     § Brain (Cook, 1999)
       – Segregate and organize the auditory stimulus
       – Recognize loudness, pitch and timbre
     § Auditory transduction video: http://www.youtube.com/watch?v=PeTriGTENoc

  6. Sound Properties
     Physical → Psychological
     – Amplitude → Loudness
     – Frequency → Pitch
     – Waveshape, time envelope (ADSR), spectral envelope (modes), … → Timbre

  7. Loudness
     § Perceptual correlate of sound intensity
     § Sound Pressure Level (SPL)
       – Objective measure of sound intensity
       – Log scale: $20 \log_{10}(P / P_0)$ dB, where $P_0 = 20\,\mu\text{Pa}$ is the threshold of human hearing
       – Loudness increases with SPL, but the two are not exactly proportional
     § Equal-Loudness Curve (also called Fletcher-Munson Curve)
       – Most sensitive to 2-5 kHz tones
       – Threshold of hearing
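The SPL formula on this slide can be tried out numerically. The following is a minimal sketch, assuming pressure is given as an RMS value in pascals; the function name `spl_db` is hypothetical and not from the slides.

```python
import math

# Reference pressure from the slide: 20 micropascals (threshold of hearing).
P0 = 20e-6

def spl_db(pressure_pa):
    """Sound pressure level in dB for a pressure given in pascals (assumed RMS)."""
    return 20.0 * math.log10(pressure_pa / P0)

print(spl_db(20e-6))  # pressure at the threshold of hearing
print(spl_db(1.0))    # a pressure of 1 Pa
```

Because the scale is logarithmic, multiplying the pressure by 10 adds 20 dB.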

  8. Pitch
     § Perceptual correlate of fundamental frequency (F0)
     § Pitch Scale
       – Human ears are sensitive to frequency changes in a log scale
         • Ex) Piano note scale
     § Frequency Range of Hearing
       – 20 Hz to 20 kHz
     [Figures: chromatic scale of piano notes over time (seconds), plotted as frequency in Hz (linear frequency) and as MIDI note number (log frequency)]
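The log-frequency pitch scale can be illustrated with the usual mapping between MIDI note numbers and frequency. This is a sketch under the common convention that MIDI note 69 is A4 at 440 Hz in 12-tone equal temperament; the slide does not state the tuning reference, so that value and the function names are assumptions.

```python
import math

def midi_to_hz(note, a4_hz=440.0):
    """Frequency of a MIDI note number, assuming note 69 = A4 = a4_hz."""
    return a4_hz * 2.0 ** ((note - 69) / 12.0)

def hz_to_midi(freq_hz, a4_hz=440.0):
    """Fractional MIDI note number of a frequency (inverse of midi_to_hz)."""
    return 69.0 + 12.0 * math.log2(freq_hz / a4_hz)

# Each step of one note multiplies the frequency by 2**(1/12),
# so equal steps in note number are equal steps in log frequency.
for note in (57, 69, 81):
    print(note, midi_to_hz(note))
```

Moving up 12 note numbers doubles the frequency, which is why the chromatic scale looks like a straight line on the MIDI axis and an exponential curve on the linear Hz axis.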

  9. Timbre
     § Related to identifying a particular sound object
       – Musical instruments, human voices, …
     § Determined by multiple physical attributes
       – Time envelope (ADSR)
       – Spectral envelope
       – Changes of spectral envelope and fundamental frequency
       – Harmonicity: ratio between tonal and noise-like characteristics
       – The onset of a sound differing notably from the sustained vibration
     [Figures: ADSR; changes of spectral envelope]
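As an illustration of the time envelope mentioned above, here is a minimal sketch of an ADSR (attack, decay, sustain, release) envelope. It assumes straight-line segments and segment lengths given in samples; the slide does not specify the segment shapes, and the function name and parameters are hypothetical.

```python
def adsr_envelope(total, attack, decay, release, sustain_level):
    """Piecewise-linear ADSR envelope as a list of `total` gain values.

    attack, decay, release are lengths in samples (an assumption of this
    sketch); the sustain segment fills whatever remains.
    """
    sustain = total - attack - decay - release
    if sustain < 0:
        raise ValueError("attack + decay + release exceeds total length")
    env = []
    # Attack: rise from 0 towards 1.
    env += [i / attack for i in range(attack)]
    # Decay: fall from 1 towards the sustain level.
    env += [1.0 - (1.0 - sustain_level) * i / decay for i in range(decay)]
    # Sustain: hold the sustain level.
    env += [sustain_level] * sustain
    # Release: fall from the sustain level towards 0.
    env += [sustain_level * (1.0 - i / release) for i in range(release)]
    return env

env = adsr_envelope(total=100, attack=10, decay=20, release=30, sustain_level=0.5)
```

Multiplying a tone sample by sample with such an envelope changes its perceived timbre even when the spectrum of the underlying tone stays the same.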

  10. Timbre
      § Determined by multiple parameters
        – Perspective of sound synthesis
      Source: http://www.matrixsynth.com/2011/05/kid-with-buchla.html

  11. Digital Audio Chain
      [Figure: digital audio chain, with the digitized signal shown as a bit stream …0 0 1 0 1 0…]

  12. Microphones / Speakers
      § Microphones
        – Air vibration to electrical signal
        – Dynamic / condenser microphones
        – The signal is very weak: use of pre-amp
      § Speakers
        – Electrical signal to air vibration
        – Generate some distortion (by diaphragm)
        – Crossover networks: woofer / tweeter

  13. Sampling
      • Convert a continuous-time signal to a discrete-time signal by periodically picking up the instantaneous values
        – Represented as a sequence of numbers; pulse code modulation (PCM)
        – Sampling period ($T_s$): the amount of time between samples
        – Sampling rate ($f_s = 1/T_s$)
      • Signal notation: $x(t) \rightarrow x(nT_s)$
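The notation $x(t) \rightarrow x(nT_s)$ can be made concrete with a short sketch that evaluates a sine wave only at the sampling instants. The function name and the 1 kHz / 8 kHz values are illustrative assumptions, not taken from the slides.

```python
import math

def sample_sine(freq_hz, fs_hz, num_samples, amplitude=1.0):
    """Return x(n * Ts) for x(t) = A * sin(2 * pi * f * t), n = 0 .. num_samples - 1."""
    ts = 1.0 / fs_hz  # sampling period Ts = 1 / fs
    return [amplitude * math.sin(2.0 * math.pi * freq_hz * n * ts)
            for n in range(num_samples)]

# A 1 kHz sine sampled at 8 kHz gives 8 samples per period.
samples = sample_sine(1000.0, 8000.0, 8)
print(samples)
```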

  14. Sampling Theorem
      § What is an appropriate sampling rate?
        – Too high: increases the data rate
        – Too low: the original signal becomes hard to reconstruct
      § Sampling Theorem
        – In order for a band-limited signal to be reconstructed fully, the sampling rate must be greater than twice the maximum frequency in the signal: $f_s > 2 f_m$
        – Half the sampling rate ($f_s / 2$) is called the Nyquist frequency

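  15. Sampling in Frequency Domain
      § Sampling in time creates images (replicas) of the original spectrum at every multiple of $f_s$
        [Figure: original spectrum from $-f_m$ to $f_m$; after sampling, images around $\pm f_s$, e.g. from $f_s - f_m$ to $f_s + f_m$]
        – To avoid overlap: $f_m < f_s - f_m$
      § Why? Take $f_2 = f_1 \pm m f_s$ with integer $m$, and sample at $t = n / f_s$:
        $x_1(t) = A \sin(\omega_1 t) = A \sin(2\pi f_1 n / f_s)$
        $x_2(t) = A \sin(\omega_2 t) = A \sin(2\pi f_2 n / f_s)$
        $= A \sin(2\pi (f_1 \pm m f_s) n / f_s)$
        $= A \sin(2\pi f_1 n / f_s \pm 2\pi m n)$
        $= A \sin(2\pi f_1 n / f_s) = x_1(t)$
        The last step holds because $mn$ is an integer, so the extra phase is a whole number of cycles.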
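The identity derived on this slide can be checked numerically: two sines whose frequencies differ by a multiple of $f_s$ give the same sample values. This is a small sketch with illustrative values (8 kHz sampling rate, 1 kHz tone, $m = 1$) that are assumptions, not from the slides.

```python
import math

def sampled_sine(freq_hz, fs_hz, n):
    """Value of sin(2 * pi * f * t) at the sampling instant t = n / fs."""
    return math.sin(2.0 * math.pi * freq_hz * n / fs_hz)

fs = 8000.0
f1 = 1000.0
m = 1
f2 = f1 + m * fs  # 9 kHz, above the Nyquist frequency of 4 kHz

# Largest difference between the two sampled sequences over 64 samples.
max_diff = max(abs(sampled_sine(f1, fs, n) - sampled_sine(f2, fs, n))
               for n in range(64))
print(max_diff)  # only floating-point rounding error remains
```

After sampling, the 9 kHz tone cannot be told apart from the 1 kHz tone.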

  16. Aliasing
      § If the sampling rate is less than twice the maximum frequency, the high-frequency content is folded over to the lower frequency range
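The folding described here can be sketched for a single sinusoid: its apparent frequency after sampling is the distance to the nearest multiple of the sampling rate. The function name `aliased_frequency` is hypothetical, and the sketch ignores phase inversion.

```python
def aliased_frequency(freq_hz, fs_hz):
    """Apparent frequency (0 .. fs/2) of a sinusoid at freq_hz sampled at fs_hz."""
    f_mod = freq_hz % fs_hz            # position within one period of the spectrum
    return min(f_mod, fs_hz - f_mod)   # fold the upper half back below fs/2

print(aliased_frequency(1000, 44100))   # below Nyquist: unchanged
print(aliased_frequency(30000, 44100))  # above Nyquist: folded down
```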

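  17. Aliasing in Frequency Domain
      § The high-frequency content is folded over to the lower frequency range from the replicated images
        [Figure: spectrum whose images around $\pm f_s$ overlap the original band; marked frequencies $-f_s$, $-f_m$, $f_s - f_m$, $f_m$, $f_s$, $f_s + f_m$]
      § A low-pass filter is applied before sampling to avoid the aliasing noise
        [Figure: spectrum limited to the band from $-f_s/2$ to $f_s/2$, with images at $\pm f_s$]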

  18. Example of Aliasing
      [Figures: magnitude (dB) versus frequency (kHz) for the bandlimited sawtooth wave spectrum and the trivial sawtooth wave spectrum; frequency (Hz) versus time (s) for a frequency sweep of the trivial sawtooth wave]
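The two sawtooth variants named in the figure can be sketched as follows. This is one plausible reading, not necessarily how the slide's plots were produced: "trivial" is taken to mean a naive ramp computed directly in the time domain, and "bandlimited" is taken to mean a Fourier-series sum that stops below the Nyquist frequency. All function names are hypothetical.

```python
import math

def trivial_sawtooth(f0, fs, num_samples):
    """Naive ramp from -1 to 1 each period; harmonics above fs/2 alias."""
    return [2.0 * ((f0 * n / fs) % 1.0) - 1.0 for n in range(num_samples)]

def num_harmonics_below_nyquist(f0, fs):
    """Largest k such that k * f0 is strictly below fs / 2."""
    return math.ceil(fs / (2.0 * f0)) - 1

def bandlimited_sawtooth(f0, fs, num_samples):
    """Fourier series of the same ramp, truncated below the Nyquist frequency."""
    k_max = num_harmonics_below_nyquist(f0, fs)
    out = []
    for n in range(num_samples):
        theta = 2.0 * math.pi * f0 * n / fs
        partial_sum = sum(math.sin(k * theta) / k for k in range(1, k_max + 1))
        out.append(-2.0 / math.pi * partial_sum)
    return out

naive = trivial_sawtooth(1000.0, 44100.0, 64)
clean = bandlimited_sawtooth(1000.0, 44100.0, 64)
```

The naive ramp contains harmonics at every multiple of the fundamental, so those above $f_s/2$ fold back as inharmonic components; the truncated sum never generates them.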

  19. Example of Aliasing
      § Aliasing in Video
        – https://www.youtube.com/watch?v=QOqtdl2sJk0
        – https://www.youtube.com/watch?v=jHS9JGkEOmA
        (Note that the video frame rate corresponds to the sampling rate)

  20. Sampling Rates
      § Determined by the bandwidth of signals or hearing limits
        – Consumer audio products: 44.1 kHz (CD)
        – Professional audio gear: 48/96/192 kHz
        – Speech communication: 8/16 kHz

  21. Quantization
      § Discretizing the amplitude of real-valued signals
        – Round the amplitude to the nearest discrete step
        – The discrete steps are determined by the number of bits
          • B bits: $-2^{B-1}$ to $2^{B-1} - 1$
          • Audio CD: 16 bits ($-2^{15}$ to $2^{15} - 1$)
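The rounding step can be sketched as below. The sketch assumes the input is normalized to the range -1 to 1 and scaled so that full scale maps to $2^{B-1}$; values that would exceed the largest code are clipped. The function name and the normalization are assumptions, not from the slides.

```python
import math

def quantize(x, bits):
    """Round a value in [-1, 1] to the nearest B-bit signed integer code."""
    levels = 2 ** (bits - 1)
    # floor(v + 0.5) rounds halves upward (Python's round() rounds to even).
    code = math.floor(x * levels + 0.5)
    # Clip to the representable range -2^(B-1) .. 2^(B-1) - 1.
    return max(-levels, min(levels - 1, code))

print(quantize(0.5, 16))   # mid-scale positive value
print(quantize(1.0, 16))   # clipped to the largest 16-bit code
print(quantize(-1.0, 16))  # smallest 16-bit code
```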

  22. Quantization Error
      § Quantization causes noise
        – Average power of quantization noise: obtained from the probability density function (PDF) of the error, $p(e)$, which is uniform over $[-1/2, 1/2]$ in units of one quantization step
        – Root mean square (RMS) of noise: $N_{rms} = \sqrt{\int_{-1/2}^{1/2} e^2 \, p(e) \, de} = \sqrt{1/12}$
      § Signal to Noise Ratio (SNR)
        – Based on average power, with the RMS of a full-scale sine wave $S_{rms} = 2^{B-1} / \sqrt{2}$:
          $20 \log_{10} \frac{S_{rms}}{N_{rms}} = 20 \log_{10} \frac{2^{B-1} / \sqrt{2}}{\sqrt{1/12}} = 6.02B + 1.76$ dB (with 16 bits, SNR = 98.08 dB)
        – Based on the max levels:
          $20 \log_{10} \frac{S_{max}}{N_{max}} = 20 \log_{10} \frac{2^{B-1}}{1/2} = 6.02B$ dB (with 16 bits, SNR = 96.32 dB)
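The two SNR expressions can be evaluated directly. This is a small sketch with hypothetical function names; it uses the unrounded constants (about 6.0206 dB per bit and 1.7609 dB), so the results differ from the slide's rounded figures in the second decimal place.

```python
import math

def snr_rms_db(bits):
    """SNR from the RMS of a full-scale sine and the RMS of uniform quantization noise."""
    s_rms = 2.0 ** (bits - 1) / math.sqrt(2.0)  # full-scale sine wave
    n_rms = math.sqrt(1.0 / 12.0)               # uniform error over one step
    return 20.0 * math.log10(s_rms / n_rms)

def snr_max_db(bits):
    """SNR from the peak signal level and the peak error of half a step."""
    return 20.0 * math.log10(2.0 ** (bits - 1) / 0.5)

for bits in (8, 16, 24):
    print(bits, snr_rms_db(bits), snr_max_db(bits))
```

Each additional bit adds about 6 dB to either measure.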
