CTP431- Music and Audio Computing Sound Synthesis Graduate School - - PowerPoint PPT Presentation



slide-1
SLIDE 1

CTP431- Music and Audio Computing Sound Synthesis

Graduate School of Culture Technology KAIST Juhan Nam

1

slide-2
SLIDE 2

Musical Sound Synthesis

§ Modeling the patterns of musical tones and generating them

§ (Typical) musical tones

– Time-wise: amplitude envelope (ADSR) → Waveform
– Frequency-wise: harmonic distribution → Spectrogram

§ How are musical tones different from other sounds (e.g. speech)?

slide-3
SLIDE 3

Types of Tones

§ Musical tones have more diverse types

– Harmonic: guitar, flute, violin, organ, singing voice (vowel)
– Inharmonic: piano, vibraphone
– Non-harmonic: drum, percussion, singing voice (consonant)

[Figure: inharmonicity in piano and vibraphone, from Klapuri’s slides]

slide-4
SLIDE 4

Information in Tones

§ Musical tones have the main information in “pitch”

– Speech tones carry it mainly in the “formants” (i.e. the spectral envelope)

§ Examples

– Original music:
– Reconstruction from timbre features (MFCC), using white noise as the source:
– Reconstruction from tonal features (Chroma):

slide-5
SLIDE 5

Pitch Scale and Range

§ In music, pitch is arranged on a tuning system and the range is much wider than in speech

[Figure: spectrogram; x-axis: time (seconds), y-axis: frequency (Hz)]

slide-6
SLIDE 6

Control of Tones

§ Musical Instrument

– Note Number, Velocity, Duration (+ Expressions) → Music Synthesizer → Output

§ Speech

– Phonemes (+ Expressions) → Speech Synthesizer → Output

slide-7
SLIDE 7

Overview of Sound Synthesis Techniques

§ Signal model (analog / digital)

– Additive Synthesis
– Subtractive Synthesis
– Modulation Synthesis: ring modulation, frequency modulation
– Distortion Synthesis: non-linear

§ Sample model (digital)

– Sampling Synthesis
– Granular Synthesis
– Concatenative Synthesis

§ Physical model (digital)

– Digital Waveguide Model

slide-8
SLIDE 8

Theremin

§ A sinusoidal tone generator

§ Two antennas sense the positions of the player’s hands, without contact, to control pitch and volume

Theremin (by Léon Theremin, 1928)

slide-9
SLIDE 9

9

Theremin (Clara Rockmore)

https://www.youtube.com/watch?v=pSzTPGlNa5U

slide-10
SLIDE 10

Additive Synthesis

§ Synthesize sounds by adding multiple sine oscillators

– Also called Fourier synthesis

[Diagram: parallel sine oscillators (OSC), each scaled by its own amplitude envelope (Amp (Env)), summed (+) into the output]
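The oscillator bank above can be sketched in a few lines. This is a minimal illustration, not the course's implementation: the function name `additive_tone`, the harmonic (integer-multiple) partial frequencies, the 1/k amplitudes, and the use of static amplitudes instead of per-partial envelopes are all assumptions made for brevity.

```python
import math

def additive_tone(f0_hz, partial_amps, duration_s, sample_rate=8000):
    """Sum sine partials at integer multiples of f0_hz (static amplitudes)."""
    n_samples = round(duration_s * sample_rate)
    tone = []
    for i in range(n_samples):
        t = i / sample_rate
        # One sine oscillator per partial, scaled and summed.
        tone.append(sum(amp * math.sin(2 * math.pi * f0_hz * (k + 1) * t)
                        for k, amp in enumerate(partial_amps)))
    return tone

# Hypothetical drawbar-like setting: eight partials with 1/k amplitudes.
organ_like = additive_tone(220.0, [1.0 / k for k in range(1, 9)], 0.05)
```

Replacing each static amplitude with a time-varying envelope would give the full "OSC → Amp (Env)" structure of the diagram.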

slide-11
SLIDE 11

Hammond Organ

11

§ Drawbars

– Control the levels of individual tonewheels

slide-12
SLIDE 12

Hammond Organ

12

https://www.youtube.com/watch?v=2rqn4bYFUZU

slide-13
SLIDE 13

Sound Examples

§ Web Audio Demo

– http://femurdesign.com/theremin/
– http://www.venlabsla.com/x/additive/additive.html
– http://codepen.io/anon/pen/jPGJMK

§ Examples (instruments)

– Kurzweil K150
  • https://soundcloud.com/rosst/sets/kurzweil-k150-fs-additive
– Kawai K5, K5000

slide-14
SLIDE 14

Subtractive Synthesis

§ Synthesize sounds by filtering wide-band oscillators

– Source-Filter model
– Examples
  • Analog Synthesizers: oscillators + resonant lowpass filters
  • Voice Synthesizers: glottal pulse train + formant filters

[Figure: magnitude spectra (dB) versus frequency (kHz) of the Source, the Filter, and the Filtered Source]
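A rough sketch of the source-filter idea follows. It is a simplification under stated assumptions: the source is a naive (non-band-limited) sawtooth, and the filter is a plain one-pole lowpass standing in for the resonant lowpass filters of analog synthesizers. The function names are hypothetical.

```python
import math

def sawtooth(freq_hz, n_samples, sample_rate):
    """Naive sawtooth in [-1, 1): a wide-band source with many harmonics."""
    return [2.0 * ((freq_hz * i / sample_rate) % 1.0) - 1.0
            for i in range(n_samples)]

def one_pole_lowpass(signal, cutoff_hz, sample_rate):
    """y[n] = (1 - a) * x[n] + a * y[n-1]; a non-resonant stand-in filter."""
    a = math.exp(-2.0 * math.pi * cutoff_hz / sample_rate)
    y = 0.0
    out = []
    for x in signal:
        y = (1.0 - a) * x + a * y
        out.append(y)
    return out

source = sawtooth(220.0, 1000, 8000)
filtered = one_pole_lowpass(source, 500.0, 8000)
```

Lowering `cutoff_hz` removes more of the upper harmonics, which is the "subtractive" step.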

slide-15
SLIDE 15

Moog Synthesizers

[Block diagram]

Audio path: Oscillators → Filter → Amp

Soft control: Envelopes, LFO

Physical control: Keyboard, Wheels, Slides, Pedal

Each audio block (Oscillators, Filter, Amp) has a Parameter input:

Parameter = offset + depth*control (e.g. filter cut-off frequency), where offset is the static value and depth*control is the dynamic value
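The parameter rule can be illustrated with an LFO driving a filter cut-off. This is only a sketch: the sine-shaped LFO, the 1 kHz control rate, and the numeric offset and depth values are assumptions chosen for the example.

```python
import math

def modulated_parameter(offset, depth, control):
    """Parameter = offset + depth * control (static part + dynamic part)."""
    return offset + depth * control

def lfo(rate_hz, n_samples, sample_rate):
    """Sine low-frequency oscillator with output in [-1, 1]."""
    return [math.sin(2.0 * math.pi * rate_hz * i / sample_rate)
            for i in range(n_samples)]

# Hypothetical setting: 1000 Hz cut-off swept by +/- 500 Hz at 5 Hz.
cutoff_hz = [modulated_parameter(1000.0, 500.0, c)
             for c in lfo(5.0, 200, 1000)]
```

An envelope generator or a physical controller such as a wheel could be substituted for the LFO as the `control` signal without changing the rule.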

slide-16
SLIDE 16

Oscillators

§ Classic waveforms

§ Modulation

– Pulse width modulation
– Hard-sync
– Richer harmonics

[Figure: waveforms and magnitude spectra (dB versus frequency in kHz) of the sawtooth, square, and triangular waves; the sawtooth and square spectra roll off at −6 dB/oct, the triangular spectrum at −12 dB/oct]
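The roll-off slopes follow from the Fourier series of the classic waveforms: sawtooth and square harmonics fall as 1/k, triangular harmonics as 1/k². The sketch below checks this numerically; it uses magnitudes only and drops the constant scale factors and signs of the series, and the function names are hypothetical.

```python
import math

def harmonic_amplitude(waveform, k):
    """Relative magnitude of the k-th harmonic (scale factors omitted)."""
    if waveform == "sawtooth":
        return 1.0 / k
    if waveform == "square":
        return 1.0 / k if k % 2 == 1 else 0.0
    if waveform == "triangular":
        return 1.0 / k ** 2 if k % 2 == 1 else 0.0
    raise ValueError("unknown waveform: " + waveform)

def rolloff_db_per_octave(waveform, k=3):
    """Slope of the harmonic envelope between harmonic 1 and odd harmonic k."""
    ratio = harmonic_amplitude(waveform, k) / harmonic_amplitude(waveform, 1)
    return 20.0 * math.log10(ratio) / math.log2(k)

slopes = {w: rolloff_db_per_octave(w)
          for w in ("sawtooth", "square", "triangular")}
```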

slide-17
SLIDE 17

Amp Envelope Generator

§ Amplitude envelope generation

– ADSR curve: attack, decay, sustain and release
– Each state has a pair of time and target level

[Figure: ADSR curve, amplitude (dB) versus time; Attack and Decay follow Note On, Sustain holds until Note Off, Release follows Note Off]
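A piecewise-linear ADSR generator can be sketched as follows. The linear segments, the linear (not dB) amplitude scale, and the assumption that the note is held longer than attack plus decay are simplifications for illustration; the function and parameter names are hypothetical.

```python
def adsr(attack_s, decay_s, sustain_level, release_s, note_len_s,
         sample_rate=1000):
    """Linear ADSR envelope; note_len_s is the Note On to Note Off time."""
    def ramp(start, end, n):
        # n samples moving from start toward end (end itself excluded).
        return [start + (end - start) * i / n for i in range(n)]

    n_attack = round(attack_s * sample_rate)
    n_decay = round(decay_s * sample_rate)
    n_hold = max(0, round(note_len_s * sample_rate) - n_attack - n_decay)
    n_release = round(release_s * sample_rate)
    return (ramp(0.0, 1.0, n_attack)                # Note On: rise to peak
            + ramp(1.0, sustain_level, n_decay)     # fall to sustain level
            + [sustain_level] * n_hold              # hold until Note Off
            + ramp(sustain_level, 0.0, n_release))  # Note Off: fade out

envelope = adsr(0.1, 0.1, 0.5, 0.2, 0.5)
```

Multiplying an oscillator's output sample by sample with `envelope` applies the amplitude contour to the tone.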

slide-18
SLIDE 18

Examples

§ Web Audio Demos

– http://www.google.com/doodles/robert-moogs-78th-birthday
– http://webaudiodemos.appspot.com/midi-synth/index.html
– http://aikelab.net/websynth/
– http://nicroto.github.io/viktor/

§ Example Sounds

– SuperSaw
– Leads
– Pad
– MoogBass
– 8-Bit sounds: https://www.youtube.com/watch?v=tf0-Rrm9dI0
– TR-808: https://www.youtube.com/watch?v=YeZZk2czG1c

slide-19
SLIDE 19

Modulation Synthesis

§ Modulation is originally from communication theory

– Carrier: channel signal, e.g., radio or TV channel
– Modulator: information signal, e.g., voice, video

§ Lowering the carrier frequency into the hearing range lets modulation be used to synthesize sound

§ Types of modulation synthesis

– Amplitude modulation (or ring modulation)
– Frequency modulation

§ Modulation is non-linear processing

– Generates new sinusoidal components

slide-20
SLIDE 20

Ring Modulation / Amplitude Modulation

§ Change the amplitude of one source with another source

– Slow change: tremolo – Fast change: generate a new tone

[Diagram: a Carrier OSC and a Modulator OSC whose outputs are multiplied (×); amplitude modulation first adds 1 (+) to the modulator]

Amplitude Modulation: $(1 + a_m(t))\,A_c \cos(2\pi f_c t)$

Ring Modulation: $a_m(t)\,A_c \cos(2\pi f_c t)$

slide-21
SLIDE 21

Ring Modulation / Amplitude Modulation

§ Frequency domain

– Expressed in terms of its sideband frequencies
– The sum and difference of the two frequencies are obtained according to the trigonometric identity
– If the modulator is a non-sinusoidal tone, a mirrored spectrum with regard to the carrier frequency is obtained

Modulator: $a_m(t) = A_m \sin(2\pi f_m t)$

[Figure: spectrum with the carrier at $f_c$ and sidebands at $f_c - f_m$ and $f_c + f_m$]
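The sideband structure can be checked numerically. The sketch below is an illustration under assumed values ($f_c$ = 1000 Hz, $f_m$ = 100 Hz, $A_c = A_m = 1$, 8 kHz sample rate); `component_amplitude` is a hypothetical helper that measures one sinusoidal component by correlation.

```python
import math

def component_amplitude(signal, freq_hz, sample_rate):
    """Amplitude of the sinusoid at freq_hz (exact for whole cycles)."""
    n = len(signal)
    re = sum(x * math.cos(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    im = sum(x * math.sin(2 * math.pi * freq_hz * i / sample_rate)
             for i, x in enumerate(signal))
    return 2.0 * math.hypot(re, im) / n

SR, FC, FM = 8000, 1000.0, 100.0
times = [i / SR for i in range(800)]  # 0.1 s: whole cycles of every component
carrier = [math.cos(2 * math.pi * FC * t) for t in times]
modulator = [math.sin(2 * math.pi * FM * t) for t in times]

ring = [m * c for m, c in zip(modulator, carrier)]       # a_m(t) * carrier
am = [(1.0 + m) * c for m, c in zip(modulator, carrier)]  # (1 + a_m(t)) * carrier

ring_spectrum = [component_amplitude(ring, f, SR) for f in (FC - FM, FC, FC + FM)]
am_spectrum = [component_amplitude(am, f, SR) for f in (FC - FM, FC, FC + FM)]
```

Under these assumptions ring modulation leaves only the two sidebands, while amplitude modulation keeps the carrier as well.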

slide-22
SLIDE 22

Examples

§ Tone generation

– SawtoothOsc × SineOsc
– https://www.youtube.com/watch?v=yw7_WQmrzuk

§ Ring modulation is often used as an audio effect

– http://webaudio.prototyping.bbc.co.uk/ring-modulator/

22

slide-23
SLIDE 23

Frequency Modulation

§ Change the frequency of one source with another source

– Slow change: vibrato
– Fast change: generate a new (and rich) tone
– Invented by John Chowning in 1973 → Yamaha DX7

$y(t) = A_c \cos(2\pi f_c t + \beta \sin(2\pi f_m t))$

Index of modulation: $\beta = A_m / f_m$

[Diagram: the Modulator OSC drives the frequency input of the Carrier OSC]
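The FM formula translates directly into code. This is a minimal sketch with a single sinusoidal modulator; the function name `fm_tone`, the 8 kHz sample rate, and the example values of $f_c$, $f_m$ and $\beta$ are assumptions.

```python
import math

def fm_tone(fc_hz, fm_hz, beta, n_samples, sample_rate=8000, amp=1.0):
    """y(t) = amp * cos(2*pi*fc*t + beta * sin(2*pi*fm*t))."""
    out = []
    for i in range(n_samples):
        t = i / sample_rate
        phase = 2 * math.pi * fc_hz * t + beta * math.sin(2 * math.pi * fm_hz * t)
        out.append(amp * math.cos(phase))
    return out

plain = fm_tone(500.0, 50.0, 0.0, 400)   # beta = 0: the unmodulated carrier
bright = fm_tone(500.0, 50.0, 10.0, 400)  # larger beta: more sidebands
```

A small `fm_hz` (a few Hz) gives vibrato; an audio-rate `fm_hz` creates new spectral components.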

slide-24
SLIDE 24

Frequency Modulation

§ Frequency Domain

– Expressed in terms of its sideband frequencies
– Their amplitudes are determined by the Bessel function
– The sidebands below 0 Hz or above the Nyquist frequency are folded

$y(t) = A_c \sum_{k=-\infty}^{\infty} J_k(\beta) \cos(2\pi (f_c + k f_m) t)$

[Figure: spectrum with the carrier at $f_c$ and sideband pairs 1, 2, 3 at $f_c \pm f_m$, $f_c \pm 2f_m$, $f_c \pm 3f_m$]

slide-25
SLIDE 25

Bessel Function

$J_k(\beta) = \sum_{n=0}^{\infty} \frac{(-1)^n \left(\frac{\beta}{2}\right)^{k+2n}}{n!\,(n+k)!}$

[Figure: $J_k(\beta)$ versus beta for the Carrier ($k = 0$) and Sidebands 1 to 4]
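The series can be evaluated directly for small modulation indices. The sketch below truncates the sum at an assumed 40 terms and handles negative orders with the identity $J_{-k}(\beta) = (-1)^k J_k(\beta)$; for large $\beta$ the alternating series loses precision, so a library routine would be the safer choice there.

```python
import math

def bessel_j(k, beta, terms=40):
    """Truncated power series for the Bessel function of the first kind."""
    if k < 0:
        return (-1) ** (-k) * bessel_j(-k, beta, terms)
    return sum((-1) ** n * (beta / 2.0) ** (k + 2 * n)
               / (math.factorial(n) * math.factorial(n + k))
               for n in range(terms))

# Relative amplitudes of the carrier (k = 0) and first sidebands at beta = 1.
carrier_amp = bessel_j(0, 1.0)
sideband1_amp = bessel_j(1, 1.0)
# The squared amplitudes over all sidebands sum to 1 (power is preserved).
total_power = sum(bessel_j(k, 2.0) ** 2 for k in range(-10, 11))
```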

slide-26
SLIDE 26

Bessel Function

26

slide-27
SLIDE 27

The Effect of Modulation Index

[Figure: waveform (amplitude versus time in samples) and spectrum (magnitude in dB versus frequency in kHz) for Beta = 0, 1, 10, and 20, with fc = 500, fm = 50]

slide-28
SLIDE 28

28

Yamaha DX7 (1983)

slide-29
SLIDE 29

“Algorithms” in DX7

http://www.audiocentralmagazine.com/yamaha-dx-7-riparliamo-di-fm-e-non-solo-seconda-parte/yamaha-dx7-algorithms/

slide-30
SLIDE 30

Examples

§ Web Audio Demo

– http://www.taktech.org/takm/WebFMSynth/

§ Sound Examples

– Bell
– Wood
– Brass
– Electric Piano
– Vibraphone

30

slide-31
SLIDE 31

Non-linear Synthesis (wave-shaping)

§ Generate a rich sound spectrum from a sinusoid using non-linear transfer functions (also called “distortion synthesis”)

§ Examples of transfer function: $y = f(x)$

– $y = 1.5x' - 0.5x'^3$
– $y = x'/(1 + |x'|)$
– $y = \sin(x')$
– Chebyshev polynomial: $T_{k+1}(x) = 2xT_k(x) - T_{k-1}(x)$, with $T_0(x) = 1$, $T_1(x) = x$, $T_2(x) = 2x^2 - 1$, $T_3(x) = 4x^3 - 3x$

$x' = gx$: $g$ corresponds to the “gain knob” of the distortion

[Figure: waveform (amplitude versus time in samples) and spectrum (magnitude in dB versus frequency in kHz) panels]
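Two of the transfer functions can be sketched as follows. The Chebyshev case relies on the identity $T_k(\cos\theta) = \cos(k\theta)$: feeding a unit-amplitude cosine through $T_k$ yields exactly the k-th harmonic. The function names are hypothetical.

```python
import math

def chebyshev(k, x):
    """T_k(x) from the recurrence T_{k+1}(x) = 2*x*T_k(x) - T_{k-1}(x)."""
    if k == 0:
        return 1.0
    t_prev, t_cur = 1.0, x
    for _ in range(k - 1):
        t_prev, t_cur = t_cur, 2.0 * x * t_cur - t_prev
    return t_cur

def soft_clip(x, gain):
    """y = x' / (1 + |x'|) with x' = gain * x (the distortion gain knob)."""
    xg = gain * x
    return xg / (1.0 + abs(xg))

theta = [2 * math.pi * i / 64 for i in range(64)]
# A cosine input shaped by T_3 comes out as its third harmonic.
third_harmonic = [chebyshev(3, math.cos(th)) for th in theta]
# A high gain pushes the sinusoid toward a square-like, harmonically rich wave.
clipped = [soft_clip(math.sin(th), 10.0) for th in theta]
```

A weighted sum of several $T_k$ lets each harmonic level be set independently, as long as the input stays at unit amplitude.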

slide-32
SLIDE 32

Physical Modeling

§ Modeling Newton’s laws of motion (i.e. $F = ma$) on musical instruments

– Every instrument has a different model

§ The ideal string

– Wave equation: $F = ma$ → $K \frac{\partial^2 y}{\partial x^2} = \epsilon \frac{\partial^2 y}{\partial t^2}$ ($K$: tension, $\epsilon$: linear mass density)
– General solution: $y(t, x) = y_r(t - x/c) + y_l(t + x/c)$, with wave speed $c = \sqrt{K/\epsilon}$

→ Right-going traveling wave $y_r$ and left-going traveling wave $y_l$

slide-33
SLIDE 33

Physical Modeling

§ Waveguide Model

– With boundary condition (fixed ends)

§ The Karplus-Strong model

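[Diagram: a Noise Burst x(n) enters an adder (+) that produces y(n); y(n) is fed back to the adder through a Delay Line ($z^{-M}$) and a Lowpass Filter; the values 1.0 and 0.99 appear on the diagram]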
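A basic Karplus-Strong loop can be sketched with a circular buffer. This follows the commonly described form of the algorithm (two-point averaging as the lowpass filter); treating 0.99 as a loop gain is an assumption, as are the function name, the seeded noise source, and the 8 kHz sample rate.

```python
import random

def karplus_strong(freq_hz, n_samples, sample_rate=8000, loop_gain=0.99, seed=0):
    """Plucked-string tone: noise burst recirculated through delay + lowpass."""
    period = round(sample_rate / freq_hz)  # delay-line length M in samples
    rng = random.Random(seed)
    delay = [rng.uniform(-1.0, 1.0) for _ in range(period)]  # noise burst
    out = []
    for n in range(n_samples):
        idx = n % period
        nxt = (idx + 1) % period
        out.append(delay[idx])
        # Lowpass (average of two neighbours) and attenuate before feeding back.
        delay[idx] = loop_gain * 0.5 * (delay[idx] + delay[nxt])
    return out

pluck = karplus_strong(220.0, 8000)
```

The delay length sets the pitch (about `sample_rate / period` Hz), and the averaging makes high harmonics decay faster than low ones, as on a real string.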

slide-34
SLIDE 34

Physical Modeling

§ The Extended Karplus-Strong model

34

https://ccrma.stanford.edu/~jos/pasp/Extended_Karplus_Strong_Algorithm.html

slide-35
SLIDE 35

Sample-based Synthesis

[Images: Synthogy Ivory II Piano; Foley (filmmaking); Ringtones]

§ The majority of digital sound and music synthesis today is accomplished via the playback of stored waveforms

– Media production: sound effects, narration, prompts
– Digital devices: ringtone, sound alert
– Musical Instruments
  • Native Instruments Kontakt 5: 43+ GB (1000+ instruments)
  • Synthogy Ivory II Piano: 77+ GB (Steinway D Grand, …)

slide-36
SLIDE 36

Why Don’t We Just Use Samples?

§ Advantages

– Reproduce realistic sounds (needless to say)
– Less use of CPU

§ Limitations

– Not flexible: repeats the same sound every time, not expressive
– Can require a great deal of storage
– Need high-quality recording
– Limited to real-world sounds

§ Better ways

– Modify samples based on existing sound processing techniques
  • Much richer spectrum of sounds
– Trade-off: CPU, memory and programmability

slide-37
SLIDE 37

Sample-based Synthesis

[Block diagram]

Off-line: Recording → Off-Line Processing → Storing in table (with meta-data: pitch, loudness, action, text)

– Off-Line Processing
  • Sample editing
  • Pitch change
  • Filter, EQ, envelope
  • Normalization

On-line: User input → Sample Fetching (read samples from the table) → Online Processing → Audio Effect

– User input
  • Pitch, velocity, timbre
  • Speed, strength
  • Text
– Online Processing
  • Pitch change
  • Filter, EQ
  • Envelope
– Audio Effect
  • Delay-based effects (e.g. room effects)

slide-38
SLIDE 38

Wavetable Synthesis

§ Play back samples stored in tables

– Multi-sampling: choose different sample tables depending on input conditions such as pitch and loudness
  • Velocity switching

§ Reducing sample tables in musical synthesizers

– Sample looping: reduce the size of tables
– Pitch shifting by re-sampling: avoid sampling every single pitch
– Filtering: avoid sampling every single loudness
  • e.g. low-pass filtering for soft input
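Pitch shifting by re-sampling amounts to reading the table at a different rate. The sketch below uses linear interpolation between neighbouring samples, which is an assumed (and fairly crude) interpolator; the function names are hypothetical.

```python
def semitone_ratio(semitones):
    """Playback-rate ratio for a shift in equal-tempered semitones."""
    return 2.0 ** (semitones / 12.0)

def resample(table, ratio):
    """Read the table at `ratio` times normal speed (ratio > 1 raises pitch)."""
    last = len(table) - 1
    out = []
    n = 0
    while n * ratio <= last:
        pos = n * ratio
        i = int(pos)
        frac = pos - i
        nxt = table[i + 1] if i < last else table[i]
        # Linear interpolation between the two nearest stored samples.
        out.append((1.0 - frac) * table[i] + frac * nxt)
        n += 1
    return out

octave_up = resample([0, 1, 2, 3, 4], semitone_ratio(12))
octave_down = resample([0, 2, 4], semitone_ratio(-12))
```

Because the duration changes along with the pitch, one recorded sample is usually shifted only a few semitones before switching to a neighbouring sample table.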

38

slide-39
SLIDE 39

Sample Looping

§ Find a periodic segment and repeat it seamlessly during playback

– Particularly for instruments with forced oscillation (e.g. woodwind)
– Usually taken from the sustained part of a pitched musical note

§ It is not easy to find an exactly clean loop

– The amplitude envelope often decays or is modulated
  • e.g. piano, guitar, violin
– The period in samples is not an integer → non-integer-size sample table?

[Figure: playback using looping; the Attack segment is played once, then the Loop segment repeats]

slide-40
SLIDE 40

Sample Looping

§ Solutions

– Decaying amplitude: normalize the amplitude
  • Compute the envelope and multiply by its inverse
  • Then, multiply the envelope back later
– Non-integer period in samples
  • Use multiple periods for the loop such that the total length is close to an integer
  • e.g. Period = 100.2 samples → 5*Period = 501 samples
– Amplitude modulation
  • Crossfade where the end of the loop and the beginning of the loop meet

§ Automatic loop search

– Pitch detection and zero-crossing detection: cf. samplers
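Two of these solutions can be sketched in code: choosing how many periods to loop so the length is nearly an integer, and crossfading at the loop point. The function names, the 0.01-sample tolerance, and the linear fade are assumptions; the crossfade needs at least `fade_len` recorded samples after the loop end.

```python
def periods_for_integer_loop(period_samples, max_periods=20, tolerance=0.01):
    """Smallest number of periods whose total length is close to an integer."""
    for k in range(1, max_periods + 1):
        total = k * period_samples
        if abs(total - round(total)) < tolerance:
            return k, round(total)
    return None

def crossfaded_loop(samples, loop_start, loop_len, fade_len):
    """One loop cycle whose head fades in from the audio after the loop end."""
    loop = list(samples[loop_start:loop_start + loop_len])
    for i in range(fade_len):
        w = i / fade_len  # 0: continuation past the loop end, toward 1: loop start
        after_end = samples[loop_start + loop_len + i]
        loop[i] = w * loop[i] + (1.0 - w) * after_end
    return loop

n_periods, loop_len = periods_for_integer_loop(100.2)
smooth = crossfaded_loop(list(range(20)), 2, 10, 4)
```

With the crossfade, the wrap from the last loop sample to the first continues the original recording and then blends back into the loop body.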

40

slide-41
SLIDE 41

Concatenative Synthesis

§ Splicing sample segments based on input information

– Typically done in speech synthesis: unit selection

§ Sample size depends on applications

– ARS: limited expression and context-dependent

  • word or phrase level

– TTS: unlimited expression and context-independent

  • phone or di-phone (phone-to-phone transition) level

41

slide-42
SLIDE 42

Summary

| Method | Memory (Storage) | Programmability (by # of parameters) | Reproducibility of natural sounds | Interpretability of parameters | Computation power |
|---|---|---|---|---|---|
| Additive | ** | ***** | **** | **** | **** |
| Subtractive | * | *** | ** | *** | ** |
| Non-linear | * | *** | ** | ** | ** |
| Physical model | *** | ** | **** | ***** | *** ~ ***** |
| Sample-based | ***** | * | ***** | N/A | * ~ *** |