CTP431- Music and Audio Computing Audio Signal Processing (Part #2) - - PowerPoint PPT Presentation



slide-1
SLIDE 1

CTP431: Music and Audio Computing
Audio Signal Processing (Part #2)

Graduate School of Culture Technology, KAIST
Juhan Nam

slide-2
SLIDE 2

Types of Audio Signal Processing

§ Filter/EQ

§ Compressor

§ Delay-based Effects

– Delay, reverberation

§ Spatial Effect

– HRTF

§ Playback Rate Conversion

– Resampling

slide-3
SLIDE 3

Filters

§ Adjust the level of a certain frequency band

– Lowpass
– Highpass
– Bandpass
– Notch
– Resonant Filter
– Equalizer

§ Parameters

– Cut-off/Center Frequency
– Q: sharpness/resonance

slide-4
SLIDE 4

Low-pass Filter

§ Transfer Function

– fc : cut-off frequency, Q: resonance

$$H(z) = \frac{1-\cos\Theta}{2}\cdot\frac{1 + 2z^{-1} + z^{-2}}{(1+\alpha) - 2\cos\Theta\,z^{-1} + (1-\alpha)\,z^{-2}}, \qquad \alpha = \frac{\sin\Theta}{2Q}, \qquad \Theta = \frac{2\pi f_c}{f_s}$$

[Figure: Lowpass Filters, Gain (dB) vs. frequency (log10); one panel with curves for f = 400, 1000, 3000, 8000 and one panel with curves for Q = 0.5, 1, 2, 4]
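As a rough illustration of the low-pass transfer function, the sketch below computes its coefficients and evaluates the gain at a few frequencies. The function names and the 44.1 kHz sampling rate are assumptions made for the example, not part of the slides.

```python
import cmath
import math

def lowpass_coeffs(fc, fs, q):
    """Return (b, a) coefficient lists for the low-pass H(z) on this slide."""
    theta = 2.0 * math.pi * fc / fs
    alpha = math.sin(theta) / (2.0 * q)
    cos_t = math.cos(theta)
    b = [(1.0 - cos_t) / 2.0, 1.0 - cos_t, (1.0 - cos_t) / 2.0]
    a = [1.0 + alpha, -2.0 * cos_t, 1.0 - alpha]
    return b, a

def gain_db(b, a, f, fs):
    """Magnitude of H(z) in dB, evaluated on the unit circle at frequency f."""
    z1 = cmath.exp(-2j * math.pi * f / fs)  # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))

# Example values (assumed): fc = 1 kHz, Q = 2, fs = 44.1 kHz
b, a = lowpass_coeffs(fc=1000.0, fs=44100.0, q=2.0)
for f in (100.0, 1000.0, 8000.0):
    print(f, round(gain_db(b, a, f, 44100.0), 2))
```

With this form the gain at DC is 0 dB and the gain at fc equals Q (about 6 dB for Q = 2), which is why larger Q values produce a taller resonance peak.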

slide-5
SLIDE 5

High-pass Filter

§ Transfer Function

$$H(z) = \frac{1+\cos\Theta}{2}\cdot\frac{1 - 2z^{-1} + z^{-2}}{(1+\alpha) - 2\cos\Theta\,z^{-1} + (1-\alpha)\,z^{-2}}, \qquad \alpha = \frac{\sin\Theta}{2Q}, \qquad \Theta = \frac{2\pi f_c}{f_s}$$

[Figure: Highpass Filters, Gain (dB) vs. frequency (log10); one panel with curves for f = 400, 1000, 3000, 8000 and one panel with curves for Q = 0.5, 1, 2, 4]

slide-6
SLIDE 6

Band-pass filter

§ Transfer Function

$$H(z) = \frac{\sin\Theta}{2}\cdot\frac{1 - z^{-2}}{(1+\alpha) - 2\cos\Theta\,z^{-1} + (1-\alpha)\,z^{-2}}, \qquad \alpha = \frac{\sin\Theta}{2Q}, \qquad \Theta = \frac{2\pi f_c}{f_s}$$

[Figure: Bandpass Filters, Gain (dB) vs. frequency (log10); one panel with curves for f = 400, 1000, 3000, 8000 and one panel with curves for Q = 0.5, 1, 2, 4]

slide-7
SLIDE 7

Notch filter

§ Transfer Function

$$H(z) = \frac{1 - 2\cos\Theta\,z^{-1} + z^{-2}}{(1+\alpha) - 2\cos\Theta\,z^{-1} + (1-\alpha)\,z^{-2}}, \qquad \alpha = \frac{\sin\Theta}{2Q}, \qquad \Theta = \frac{2\pi f_c}{f_s}$$

[Figure: Notch Filters, Gain (dB) vs. frequency (log10); one panel with curves for f = 400, 1000, 3000, 8000 and one panel with curves for Q = 0.5, 1, 2, 4]
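To show how such a transfer function is applied to a signal, here is a minimal sketch that runs the notch filter as a direct-form I difference equation. The helper names, the 8 kHz sampling rate and the test tones are assumptions chosen for the example.

```python
import math

def notch_coeffs(fc, fs, q):
    """Return (b, a) for the notch H(z) on this slide."""
    theta = 2.0 * math.pi * fc / fs
    alpha = math.sin(theta) / (2.0 * q)
    b = [1.0, -2.0 * math.cos(theta), 1.0]
    a = [1.0 + alpha, -2.0 * math.cos(theta), 1.0 - alpha]
    return b, a

def biquad(x, b, a):
    """a0*y(n) = b0*x(n) + b1*x(n-1) + b2*x(n-2) - a1*y(n-1) - a2*y(n-2)."""
    y = []
    x1 = x2 = y1 = y2 = 0.0
    for xn in x:
        yn = (b[0] * xn + b[1] * x1 + b[2] * x2 - a[1] * y1 - a[2] * y2) / a[0]
        x2, x1 = x1, xn
        y2, y1 = y1, yn
        y.append(yn)
    return y

def rms(x):
    return math.sqrt(sum(v * v for v in x) / len(x))

fs = 8000.0
n = 4000
hum = [math.sin(2.0 * math.pi * 1000.0 * i / fs) for i in range(n)]  # tone at the notch frequency
tone = [math.sin(2.0 * math.pi * 200.0 * i / fs) for i in range(n)]  # tone far from the notch
b, a = notch_coeffs(fc=1000.0, fs=fs, q=4.0)
hum_out = biquad(hum, b, a)
tone_out = biquad(tone, b, a)
print(rms(hum_out[2000:]), rms(tone_out[2000:]))
```

Once the start-up transient has died away, the 1 kHz tone is removed almost entirely while the 200 Hz tone passes nearly unchanged. The same `biquad` routine works for the other filter types by swapping in their coefficients.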

slide-8
SLIDE 8

Equalizer

§ Transfer Function

$$H(z) = \frac{(1+\alpha A) - 2\cos\Theta\,z^{-1} + (1-\alpha A)\,z^{-2}}{(1+\alpha/A) - 2\cos\Theta\,z^{-1} + (1-\alpha/A)\,z^{-2}}, \qquad \alpha = \frac{\sin\Theta}{2Q}, \qquad \Theta = \frac{2\pi f_c}{f_s}$$

– A: linear gain factor set by the boost/cut AdB; in the cookbook cited on the References slide, A = 10^(AdB/40)

[Figure: EQ, Gain (dB) vs. frequency (log10); curves for AdB = −12, −6, 0, 6, 12; one panel with Q = 1 and one panel with Q = 4]
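The sketch below evaluates a peaking equalizer using the coefficients of the Bristow-Johnson cookbook cited on the References slide, in which the numerator's z^-2 coefficient is (1 − αA) and A = 10^(AdB/40). The function names and the numeric settings are assumptions for illustration.

```python
import cmath
import math

def peaking_eq_coeffs(fc, fs, q, gain_db):
    """Return (b, a) for a peaking EQ with boost/cut gain_db at fc (cookbook form)."""
    a_lin = 10.0 ** (gain_db / 40.0)
    theta = 2.0 * math.pi * fc / fs
    alpha = math.sin(theta) / (2.0 * q)
    cos_t = math.cos(theta)
    b = [1.0 + alpha * a_lin, -2.0 * cos_t, 1.0 - alpha * a_lin]
    a = [1.0 + alpha / a_lin, -2.0 * cos_t, 1.0 - alpha / a_lin]
    return b, a

def response_db(b, a, f, fs):
    """Magnitude of H(z) in dB on the unit circle at frequency f."""
    z1 = cmath.exp(-2j * math.pi * f / fs)  # z^-1
    num = b[0] + b[1] * z1 + b[2] * z1 * z1
    den = a[0] + a[1] * z1 + a[2] * z1 * z1
    return 20.0 * math.log10(abs(num / den))

# Example values (assumed): +6 dB boost at 1 kHz, Q = 1, fs = 44.1 kHz
b, a = peaking_eq_coeffs(fc=1000.0, fs=44100.0, q=1.0, gain_db=6.0)
print(round(response_db(b, a, 1000.0, 44100.0), 3))  # gain at the center frequency
print(round(response_db(b, a, 0.0, 44100.0), 3))     # gain at DC
```

The gain at fc comes out equal to AdB and the gain at DC stays at 0 dB, so the filter only reshapes the band around the center frequency.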

slide-9
SLIDE 9

References

§ Cookbook formulae for audio EQs based on biquad filter (R. Bristow-Johnson)

– http://www.musicdsp.org/files/Audio-EQ-Cookbook.txt

slide-10
SLIDE 10

Compressor

§ Audio effect unit for automatic gain control

– Boost the level for soft signals and suppress it for loud signals
– Typically used as a front-end processor in sound recording

§ Signal Processing Pipeline

[Diagram: Input → Envelope Detector → Gain Curve → gain multiplied (X) with the Input → Output]

slide-11
SLIDE 11

Envelope Detector

§ Detecting the level of signal

§ Different sensitivity for increasing (attack) and decreasing (release) levels

– During attack:

$$y(n) = y(n-1) + \left(1 - e^{-1/(\mathrm{attack\_time}\cdot f_s)}\right)\left(|x(n)| - y(n-1)\right)$$

– During release:

$$y(n) = y(n-1) + \left(1 - e^{-1/(\mathrm{release\_time}\cdot f_s)}\right)\left(|x(n)| - y(n-1)\right)$$

[Diagram: Input → Full-wave rectification → Leaky Integrator → envelope]
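The two update equations can be sketched as a single loop. This is a minimal version that assumes the attack coefficient applies whenever the rectified input |x(n)| exceeds the current envelope and the release coefficient applies otherwise; the function name and the test burst are made up for the example.

```python
import math

def envelope(x, fs, attack_time, release_time):
    """Full-wave rectification followed by a leaky integrator."""
    a_att = 1.0 - math.exp(-1.0 / (attack_time * fs))
    a_rel = 1.0 - math.exp(-1.0 / (release_time * fs))
    y = 0.0
    env = []
    for xn in x:
        level = abs(xn)                        # full-wave rectification
        coeff = a_att if level > y else a_rel  # assumed attack/release switch
        y = y + coeff * (level - y)            # leaky integrator update
        env.append(y)
    return env

fs = 1000.0
x = [1.0] * 500 + [0.0] * 500  # hypothetical burst: 0.5 s on, 0.5 s off
env = envelope(x, fs, attack_time=0.01, release_time=0.1)
print(round(env[9], 3), round(env[599], 3))
```

After one attack time (10 samples here) the envelope has covered about 63% of the step, and one release time after the burst ends it has fallen to about 37% of its peak. The time parameters therefore act as time constants of the integrator.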

slide-12
SLIDE 12

Gain Curve

§ Parameters

– Threshold: level
– Attack/Release: sensitivity
– Ratio: amount of compression
– Knee: smoothing

[Figure: Gain Curve, Output (dB) vs. Input (dB), showing the Threshold, the no-compression line, ratios 1:2, 1:4 and 1:10, and Soft Knee vs. Hard Knee]

[Figure: waveform before compression and after compression, amplitude (−1 to 1) vs. time in samples (0 to 2 × 10^4)]
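A hard-knee gain curve can be sketched in a few lines. The version below assumes that `ratio` means "input dB above threshold per output dB above threshold" (so `ratio=4` corresponds to the curve labelled 1:4), and it ignores the soft knee. The function names are illustrative only.

```python
import math

def gain_curve_db(level_db, threshold_db, ratio):
    """Static hard-knee curve: output level (dB) for a given input level (dB)."""
    if level_db <= threshold_db:
        return level_db  # below threshold: no compression
    return threshold_db + (level_db - threshold_db) / ratio

def compress_sample(x, env, threshold_db, ratio):
    """Scale one sample by the gain that the curve assigns to its envelope level."""
    level_db = 20.0 * math.log10(max(env, 1e-9))  # floor avoids log of zero
    gain_db = gain_curve_db(level_db, threshold_db, ratio) - level_db
    return x * 10.0 ** (gain_db / 20.0)

print(gain_curve_db(-30.0, -20.0, 4.0))  # below threshold, unchanged
print(gain_curve_db(-8.0, -20.0, 4.0))   # 12 dB over threshold becomes 3 dB over
print(compress_sample(1.0, 1.0, -20.0, 4.0))
```

In a complete compressor the `env` argument would come from the envelope detector, so the attack and release times decide how quickly the gain reacts.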

slide-13
SLIDE 13

Delay-based Audio Effects

§ Types of delay-based audio effects

– Delay
– Chorus
– Flanger
– Reverberation

slide-14
SLIDE 14

Delay

§ Delay effect

– Generates repeating echoes with a delay loop
– The feedback coefficient controls how much of the delayed signal is fed back into the delay line
– Can be extended to stereo signals such that the delay output is “ping-ponged” between the left and right channels
– The delay length is often synchronized with music tempo
– The delay line is implemented as a “circular buffer”

[Diagram: x(n) and the feedback path are summed (+) into the Delay Line; the Dry signal x(n) and the Wet delay-line output are summed (+) into y(n)]
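A feedback delay built on a circular buffer might look like the following sketch. The parameter names (`wet`, `dry`, `feedback`) mirror the labels in the diagram; the function itself and the tiny delay length are assumptions for the example.

```python
def feedback_delay(x, delay_samples, feedback, wet=0.5, dry=1.0):
    """Echo effect: the delay line is a fixed-size list reused in a circle."""
    buf = [0.0] * delay_samples  # circular buffer
    pos = 0                      # shared read/write position
    y = []
    for xn in x:
        delayed = buf[pos]                  # sample written delay_samples ago
        buf[pos] = xn + feedback * delayed  # write input plus fed-back echo
        pos = (pos + 1) % delay_samples     # wrap around at the buffer end
        y.append(dry * xn + wet * delayed)
    return y

impulse = [1.0] + [0.0] * 11
out = feedback_delay(impulse, delay_samples=4, feedback=0.5, wet=1.0, dry=1.0)
print(out)
```

An impulse produces the dry click followed by echoes every 4 samples, each scaled by the feedback coefficient relative to the previous one. For a tempo-synchronized delay, `delay_samples` would be derived from the beat duration and the sampling rate.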

slide-15
SLIDE 15

Chorus

§ Chorus effect

– Gives the illusion of multiple voices playing in unison
– By summing detuned copies of the input
– Low frequency oscillators (LFOs) are used to modulate the position of the output taps → this changes the pitch of the input (resampling!)

[Diagram: x(n) feeds the Delay Line; LFOs move the output taps; the Wet tap outputs and the Dry signal are summed (+) into y(n)]
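A single-voice chorus can be sketched as a delay line whose read tap is moved by a sine LFO. Because the tap position is fractional, the sketch reads between samples with linear interpolation, which is one simple choice among several. All names and default values are assumptions.

```python
import math

def chorus(x, fs, base_delay_ms=20.0, depth_ms=5.0, lfo_hz=0.5, wet=0.5, dry=0.5):
    """Mix the input with one copy read from an LFO-modulated delay tap."""
    size = int((base_delay_ms + depth_ms) * 1e-3 * fs) + 2
    buf = [0.0] * size  # circular delay line
    write = 0
    y = []
    for n, xn in enumerate(x):
        buf[write] = xn
        # The LFO sweeps the tap delay (in samples) around the base delay
        lfo = math.sin(2.0 * math.pi * lfo_hz * n / fs)
        delay = (base_delay_ms + depth_ms * lfo) * 1e-3 * fs
        read = (write - delay) % size  # fractional read position
        i0 = int(read)
        frac = read - i0
        i0 %= size
        i1 = (i0 + 1) % size
        delayed = (1.0 - frac) * buf[i0] + frac * buf[i1]  # linear interpolation
        y.append(dry * xn + wet * delayed)
        write = (write + 1) % size
    return y

fs = 8000.0
tone = [math.sin(2.0 * math.pi * 440.0 * n / fs) for n in range(4000)]
out = chorus(tone, fs)
print(len(out))
```

While the tap moves, the delayed copy is read slightly faster or slower than it was written, which is the resampling effect the slide refers to: the copy is detuned up and down around the original pitch.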

slide-16
SLIDE 16

Flanger

§ Flanger effect

– Originally generated by summing the output of two un-locked tape machines while varying their sync (used to be called “reel-flanging”)
– Emulated by summing one static tap and one variable tap in the delay line

  • Feed-forward comb filter whose harmonic notches sweep over frequency

– LFO is often synchronized with music tempo

[Diagram: x(n) feeds the Delay Line; a static tap and an LFO-driven variable tap are summed (+) into the Wet signal, which is summed (+) with the Dry signal into y(n)]
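To see where the notches of the feed-forward comb filter fall, the sketch below assumes the simplest case y(n) = x(n) + x(n − M): a static tap at zero delay and a variable tap at M samples with equal gains. The function names and numeric values are illustrative.

```python
import cmath
import math

def comb_gain(f, fs, delay_samples):
    """Magnitude of H(z) = 1 + z^-M at frequency f."""
    return abs(1.0 + cmath.exp(-2j * math.pi * f * delay_samples / fs))

def notch_frequencies(fs, delay_samples):
    """Notches sit at odd multiples of fs / (2 M), up to the Nyquist frequency."""
    freqs = []
    k = 0
    while (2 * k + 1) * fs / (2.0 * delay_samples) <= fs / 2.0:
        freqs.append((2 * k + 1) * fs / (2.0 * delay_samples))
        k += 1
    return freqs

fs = 48000.0
print(notch_frequencies(fs, 48)[:3])  # delay of 1 ms
print(notch_frequencies(fs, 24)[:3])  # delay of 0.5 ms: notches move up
print(comb_gain(500.0, fs, 48), comb_gain(1000.0, fs, 48))
```

Halving the delay doubles every notch frequency, so an LFO that sweeps the variable tap drags the whole series of notches up and down the spectrum.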

slide-17
SLIDE 17

Reverberation

§ Natural acoustic phenomenon that occurs when sound sources are played in a room

– Thousands of echoes are generated as sounds are reflected off the walls, ceiling and floor
– Reflected sounds are delayed, attenuated and low-pass filtered: high-frequency components decay faster
– The pattern of the myriad echoes is determined by the volume and geometry of the room and the materials on its surfaces

[Diagram: a Sound Source and a Listener in a room, connected by the direct sound path and by reflected sound paths]

slide-18
SLIDE 18

Reverberation

§ Room reverberation is characterized by its impulse response (IR)

– E.g. when a balloon pop is used as a sound source

§ The room IR is composed of three parts

– Direct path
– Early reflections
– Late-field reverberation: high echo density

§ RT60

– The time that it takes the reverberation to decay by 60 dB from its peak amplitude

[Figure: CCRMA Lobby Impulse Response, response amplitude vs. time (milliseconds, 0 to 100), annotated with the direct path, early reflections and late-field reverberation]
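One common way to estimate RT60 from an impulse response is Schroeder's backward integration of the squared response, which gives a smooth energy decay curve. That technique is not named on the slide and is used here as an assumption; the impulse response below is synthetic (a pure exponential decay with a known RT60), not a measured one.

```python
import math

def rt60_from_ir(h, fs):
    """Time at which the energy decay curve falls 60 dB below its start."""
    edc = [0.0] * len(h)
    total = 0.0
    for n in range(len(h) - 1, -1, -1):  # backward integration of h^2
        total += h[n] * h[n]
        edc[n] = total
    for n, e in enumerate(edc):
        if e <= 0.0 or 10.0 * math.log10(e / edc[0]) <= -60.0:
            return n / fs
    return None  # the response never decays by 60 dB within its length

fs = 8000.0
true_rt60 = 0.5  # seconds, chosen for the example
# Synthetic IR whose amplitude drops by 60 dB (a factor of 1000) after true_rt60
h = [10.0 ** (-3.0 * n / (true_rt60 * fs)) for n in range(8000)]
print(rt60_from_ir(h, fs))
```

On real recordings the noise floor usually prevents observing a full 60 dB of decay, so the slope is often fitted over a smaller range and extrapolated; that refinement is left out of this sketch.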

slide-19
SLIDE 19

Artificial Reverberation

§ Mechanical reverb

– Uses a metal plate or a spring
– Plate reverb: https://www.youtube.com/watch?v=XJ5OFpvX5Vs

§ Delayline-based reverb

– Early reflections: feed-forward delayline
– Late-field reverb: allpass/comb filter, feedback delay networks (FDN)
– “Programmable” reverberation

§ Convolution reverb

– Measure the impulse response of a room
– Convolve the input with the measured IR

slide-20
SLIDE 20

Delay-based Reverb

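[Diagram: AllPass filter / Comb filter (when one tap is absent): x(n) passes through a delay line z^{-M}, with feed-forward and feedback taps (one of them negated) summed (+) into y(n)]

  • A reverb is constructed by cascading multiple AP or FFCF units

[Diagram: Feedback Delay Networks: x(n) is summed (+) into the delay lines z^{-M1}, z^{-M2}, z^{-M3}; the delay-line outputs are summed (+) into y(n) and fed back through a mixing matrix with coefficients a11, a12, a13, ...]

  • The lengths of the delay lines are chosen such that their greatest common factor is small (e.g. prime numbers)

  • The mixing matrix is chosen to be unitary (orthonormal)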
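The allpass unit can be sketched with the commonly used Schroeder form y(n) = −g·x(n) + x(n − M) + g·y(n − M). This particular sign convention is an assumption, since the slide's diagram does not survive in the text, and the delay lengths and gains below are toy values.

```python
def allpass(x, delay_samples, g):
    """Schroeder allpass: y(n) = -g*x(n) + x(n-M) + g*y(n-M)."""
    m = delay_samples
    y = []
    for n, xn in enumerate(x):
        x_d = x[n - m] if n >= m else 0.0
        y_d = y[n - m] if n >= m else 0.0
        y.append(-g * xn + x_d + g * y_d)
    return y

def cascade(x, stages):
    """Run the signal through several allpass units in series."""
    for delay_samples, g in stages:
        x = allpass(x, delay_samples, g)
    return x

impulse = [1.0] + [0.0] * 599
h_single = allpass(impulse, 5, 0.5)
h_reverb = cascade(impulse, [(5, 0.5), (7, 0.5), (11, 0.5)])  # prime delay lengths
print(h_single[:11])
print(sum(v * v for v in h_reverb))  # total energy stays close to 1
```

Each unit spreads the impulse into a decaying train of echoes without changing the magnitude spectrum, and cascading units with mutually prime delays makes the echoes interleave rather than pile up at the same instants.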

slide-21
SLIDE 21

Convolution Reverb

§ Measuring impulse responses

– If the input is a unit impulse, SNR is low
– Instead, we use specially designed input signals

  • Golay code, allpass chirp or sine sweep: their magnitude responses are all flat but the signals are spread over time

– The impulse response is obtained using its inverse signal or inverse discrete Fourier transform

[Diagram: test sequence s(t) → LTI system h(t) → (+) measurement noise n(t) → measured response r(t)]

$$r(t) = s(t) * h(t) + n(t)$$
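As one concrete case, the sketch below measures a toy system with a Golay complementary pair. The two sequences have autocorrelations that sum to an impulse of height 2N, so correlating each measured response with its own test sequence and adding the results recovers h. The "room" is a made-up short impulse response, and the noise term n(t) is left out to keep the example exact.

```python
def golay_pair(order):
    """Complementary pair of length 2**order, built by the standard recursion."""
    a, b = [1.0], [1.0]
    for _ in range(order):
        a, b = a + b, a + [-v for v in b]
    return a, b

def convolve(x, h):
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def correlate(r, s, n_lags):
    """c(k) = sum over n of r(n + k) * s(n), for k = 0 .. n_lags - 1."""
    return [sum(r[n + k] * s[n] for n in range(len(s)) if n + k < len(r))
            for k in range(n_lags)]

def measure_ir(system, order, ir_len):
    """Play both Golay sequences through `system` and combine the correlations."""
    a, b = golay_pair(order)
    ca = correlate(system(a), a, ir_len)
    cb = correlate(system(b), b, ir_len)
    return [(u + v) / (2.0 * len(a)) for u, v in zip(ca, cb)]

true_h = [1.0, 0.0, 0.5, -0.25, 0.0, 0.125]  # hypothetical room response
h_est = measure_ir(lambda s: convolve(s, true_h), order=6, ir_len=len(true_h))
print(h_est)
```

Because the test energy is spread over 2N samples instead of one, uncorrelated measurement noise is averaged down relative to a single-impulse measurement. Convolving a dry signal with the estimated response is then the convolution reverb itself.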

slide-22
SLIDE 22

Convolution Reverb

[Figure (J. Abel): sine sweep s(t) (amplitude vs. time in milliseconds) and its spectrogram (frequency in kHz vs. time in milliseconds); sine sweep response r(t) and its spectrogram; measured impulse response ĥ(t) (amplitude vs. time in milliseconds)]

slide-23
SLIDE 23

Spatial Hearing

§ Sound from a source arrives at the two ears of a listener with differences in time and level

– The differences are the main cues to identify where the source is.
– We call them ITD (Inter-aural Time Difference) and IID (Inter-aural Intensity Difference)
– ITD and IID are functions of the arrival angle.

[Diagram: listener's head with left (L) and right (R) ears, showing the ITD and IID between them]

slide-24
SLIDE 24

Head-Related Transfer Function (HRTF)

§ A filter, measured as a frequency response, that characterizes how sound from a source arrives at the outer end of the ear canal

– Determined by reflections off the head, pinnae and other body parts
– Function of azimuth (horizontal angle) and elevation (vertical angle)

$$H_L(\omega, \phi, \theta), \qquad H_R(\omega, \phi, \theta)$$

[Diagram: listener's head with left (L) and right (R) ears]

slide-25
SLIDE 25

Measured Head-Related Impulse Responses

slide-26
SLIDE 26

Magnitude response of the HRIRs
slide-27
SLIDE 27

Binaural Synthesis

§ Rendering the spatial effect using the measured HRIRs as FIR filters

– HRIRs are typically several hundred samples long
– Convolution or modeling by IIR filters

§ Individualization of the HRTF is an issue

[Diagram: Input → h_L(t, φ, θ) → Left output; Input → h_R(t, φ, θ) → Right output]
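Binaural synthesis by direct convolution can be sketched as follows. The two HRIRs here are toy placeholders, a pure delay and attenuation standing in for the ITD and IID of a source on the listener's left; measured HRIRs would be far longer and would also carry the spectral shaping of the head and pinnae.

```python
def convolve(x, h):
    """Direct-form FIR filtering of x with impulse response h."""
    y = [0.0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            y[i + j] += xi * hj
    return y

def binaural(x, hrir_left, hrir_right):
    """Render a mono input to a (left, right) pair with one HRIR per ear."""
    return convolve(x, hrir_left), convolve(x, hrir_right)

# Toy HRIRs (assumed): the right ear hears the source 3 samples later and quieter
hrir_l = [1.0, 0.0, 0.0, 0.0]
hrir_r = [0.0, 0.0, 0.0, 0.6]
left, right = binaural([1.0, 0.5], hrir_l, hrir_r)
print(left)
print(right)
```

With HRIRs several hundred samples long, this direct convolution is usually replaced by FFT-based convolution or by low-order IIR models, as the slide notes.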

slide-28
SLIDE 28

Playback Rate Conversion

§ Adjusting the playback rate for a given sampling rate

– Analogous to running a tape over the magnetic head at a variable speed
– Slowing down: “monster-like”
– Speeding up: “chipmunk-like”

slide-29
SLIDE 29

Playback Rate Conversion

§ Change pitch, length and timbre


[The DaFX book]

slide-30
SLIDE 30

Resampling

§ Playback rate conversion is performed by resampling

– Interpolation on discrete samples
– Convolution with interpolation filters
– Aliasing must be avoided when down-sampling

  • Narrow the bandwidth of the lowpass filter

§ Two Types

– Down-sampling: pitch goes up and time shrinks
– Up-sampling: pitch goes down and time expands
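A bare-bones playback rate converter reads the input at positions 0, rate, 2·rate, ... and interpolates between neighboring samples. The sketch below uses linear interpolation and applies no anti-aliasing filter, so it is only a reasonable stand-in when `rate` is at most 1 or the input is already band-limited well below the new Nyquist frequency. The function name is an assumption.

```python
def change_rate(x, rate):
    """Resample x by reading it at positions k * rate with linear interpolation."""
    y = []
    k = 0
    while k * rate <= len(x) - 1:
        pos = k * rate
        i = int(pos)
        frac = pos - i
        nxt = x[i + 1] if i + 1 < len(x) else x[i]
        y.append((1.0 - frac) * x[i] + frac * nxt)
        k += 1
    return y

ramp = [0.0, 1.0, 2.0, 3.0, 4.0]
print(change_rate(ramp, 2.0))  # fewer samples: shorter and higher in pitch
print(change_rate(ramp, 0.5))  # more samples: longer and lower in pitch
```

Played back at the original sampling rate, the first result lasts half as long and sounds an octave higher, and the second lasts about twice as long and sounds an octave lower.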

slide-31
SLIDE 31

Interpolation Filters

[Figure: four interpolation kernels plotted against Sample Time (−5 to 5): Windowed Sinc, Linear, 3rd-order B-spline, 3rd-order Lagrange]

$$h(t) = w(t)\,\mathrm{sinc}(t) = w(t)\,\frac{\sin(\pi t)}{\pi t}$$

$$x(d) = \sum_{k=-(L-1)}^{L} x(k)\,h(d-k)$$

Delayed by d (0 < d < 1)
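The interpolation sum can be sketched with a Hann window as w(t). The window choice, the half-width L = 8 and the test signal are assumptions for the example; the slide does not specify them.

```python
import math

def kernel(t, half_width):
    """Hann-windowed sinc h(t) = w(t) * sinc(t), zero for |t| >= half_width."""
    if abs(t) >= half_width:
        return 0.0
    w = 0.5 * (1.0 + math.cos(math.pi * t / half_width))
    s = 1.0 if t == 0.0 else math.sin(math.pi * t) / (math.pi * t)
    return w * s

def interpolate(x, n, d, half_width=8):
    """Estimate x at position n + d (0 <= d < 1) from samples n-(L-1) .. n+L."""
    total = 0.0
    for k in range(-(half_width - 1), half_width + 1):
        total += x[n + k] * kernel(d - k, half_width)
    return total

# Low-frequency sine (0.05 cycles per sample) as a band-limited test signal
x = [math.sin(2.0 * math.pi * 0.05 * i) for i in range(64)]
estimate = interpolate(x, 30, 0.5)
exact = math.sin(2.0 * math.pi * 0.05 * 30.5)
print(estimate, exact)
```

The estimate lands close to the true value halfway between two samples, and with d = 0 the kernel reduces to picking out x(n) itself. Longer kernels trade computation for accuracy nearer to the Nyquist frequency.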