GCT535 - Sound Technology for Multimedia: Tonal Analysis
Graduate School of Culture Technology, KAIST
Juhan Nam
Outline
Pitch Perception
Perceptual Pitch Scale
Log-Scaled Spectrum
Tonal Analysis
Chroma Feature
Key
[Figure: spectrograms of Piano (Chromatic Scale) and Beatles “Hey Jude”; x-axis: time [second], y-axis: frequency [Hz]]
Response of the basilar membrane to a pair of tones
From CCRMA Music 150 slides (Thomas Rossing)
[Figure: normalized ERB, Mel, and Bark scales plotted against frequency (Hz), up to 2.5 × 10^4 Hz]
Using Matlab code from https://www.speech.kth.se/~giampi/auditoryscales/
Comparison of Pitch Scales
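The comparison can be reproduced roughly with closed-form approximations of the three scales. The sketch below is not the Matlab code referenced on the slide; it assumes commonly quoted formulas (a 2595·log10 Mel formula, the Zwicker and Terhardt arctangent approximation for Bark, and the Glasberg and Moore ERB-rate formula), and the function names are hypothetical.

```python
import math


def hz_to_mel(f):
    """Mel scale (assumed variant: 2595 * log10(1 + f / 700))."""
    return 2595.0 * math.log10(1.0 + f / 700.0)


def hz_to_bark(f):
    """Bark scale (assumed variant: Zwicker and Terhardt arctangent approximation)."""
    return 13.0 * math.atan(0.00076 * f) + 3.5 * math.atan((f / 7500.0) ** 2)


def hz_to_erb_rate(f):
    """ERB-rate scale (assumed variant: Glasberg and Moore)."""
    return 21.4 * math.log10(1.0 + 0.00437 * f)


def normalized(scale, freqs):
    """Evaluate a scale on freqs and divide by its maximum so curves are comparable."""
    values = [scale(f) for f in freqs]
    peak = max(values)
    return [v / peak for v in values]


# Frequencies from 0 Hz to 25 kHz in 100 Hz steps, matching the plotted range.
freqs = [100.0 * i for i in range(251)]
curves = {
    "ERB": normalized(hz_to_erb_rate, freqs),
    "Mel": normalized(hz_to_mel, freqs),
    "Bark": normalized(hz_to_bark, freqs),
}
```

All three curves rise steeply at low frequencies and flatten at high frequencies, which is the shared behavior the comparison is meant to show.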
$f = 440 \cdot 2^{(m-69)/12}$
https://newt.phys.unsw.edu.au/jw/notes.html
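A minimal sketch of the MIDI note number to frequency relation and its inverse, with m the MIDI note number and f the frequency in Hz. The function and constant names are hypothetical; 440 Hz at note 69 (A4) is the reference used in the formula.

```python
import math

A4_MIDI = 69    # MIDI note number of A4
A4_HZ = 440.0   # reference frequency of A4 in Hz


def midi_to_hz(m):
    """Frequency in Hz of MIDI note number m: f = 440 * 2^((m - 69) / 12)."""
    return A4_HZ * 2.0 ** ((m - A4_MIDI) / 12.0)


def hz_to_midi(f):
    """Inverse mapping: (possibly fractional) MIDI note number of frequency f in Hz."""
    return A4_MIDI + 12.0 * math.log2(f / A4_HZ)
```

For example, `midi_to_hz(60)` gives middle C at roughly 261.63 Hz, and adding 12 to the note number doubles the frequency.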
[Figure: two panels over time [second]; y-axes: MIDI note number and frequency [Hz]]
Center Frequency, Bandwidth
[Figure: Log-Frequency Spectrogram and Linear-Frequency Spectrogram]
[Figure: x-axis: time [second], y-axis: MIDI note number]
(M: mapping matrix, X: spectrogram, Y: scaled spectrogram)
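The notation suggests the scaled spectrogram is obtained as a matrix product, Y = M X. The sketch below is one possible construction, not necessarily the one used for the slide's figure: each linear-frequency FFT bin is assigned to its nearest MIDI note, so M has one row per note and one column per bin. The function names and the note range 21 to 108 are assumptions.

```python
import math


def hz_to_midi(f):
    """Fractional MIDI note number of frequency f in Hz (A4 = 440 Hz = note 69)."""
    return 69.0 + 12.0 * math.log2(f / 440.0)


def log_freq_mapping_matrix(n_fft, sample_rate, midi_lo=21, midi_hi=108):
    """Build M with shape (n_notes, n_bins): bin k contributes to its nearest note."""
    n_bins = n_fft // 2 + 1
    n_notes = midi_hi - midi_lo + 1
    M = [[0.0] * n_bins for _ in range(n_notes)]
    for k in range(1, n_bins):              # skip the DC bin (0 Hz has no pitch)
        f = k * sample_rate / n_fft         # center frequency of FFT bin k
        note = round(hz_to_midi(f))
        if midi_lo <= note <= midi_hi:      # bins outside the note range are dropped
            M[note - midi_lo][k] = 1.0
    return M


def apply_mapping(M, X):
    """Compute Y = M X, where X is a list of n_bins rows, each with n_frames values."""
    n_frames = len(X[0])
    Y = []
    for row in M:
        active = [k for k, w in enumerate(row) if w != 0.0]
        Y.append([sum((row[k] * X[k][t] for k in active), 0.0)
                  for t in range(n_frames)])
    return Y
```

With this construction the lowest notes may receive no FFT bin at all, because linearly spaced bins are too coarse at low frequencies, while high notes sum over many bins.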
[Figure: x-axis: time [second], y-axis: frequency [Hz]]
[Figure: Linear-Frequency Spectrogram (time [second] vs. frequency [Hz]) and Mel-Frequency Spectrogram (time [second] vs. Mel bin)]
[Figure: four time-frequency panels: Spectrogram (short window), Spectrogram (long window), Mel Spectrogram, Constant-Q transform; axes: time vs. frequency]
[Figure: Log-Frequency Spectrogram (mapping) and Log-Frequency Spectrogram (Constant-Q transform); x-axis: time [second]]
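In a constant-Q transform, center frequencies are spaced geometrically and the ratio Q of center frequency to bandwidth is the same for every bin, so low bins use long analysis windows and high bins use short ones. The sketch below only computes these per-bin parameters, following the standard constant-Q definition; it does not compute the transform itself, and the function name and defaults are hypothetical.

```python
import math


def constant_q_bins(f_min, n_bins, bins_per_octave, sample_rate):
    """Return Q and a list of (center_hz, bandwidth_hz, window_len) per bin."""
    # Q is fixed by the bin spacing: Q = f_k / (f_{k+1} - f_k).
    q = 1.0 / (2.0 ** (1.0 / bins_per_octave) - 1.0)
    bins = []
    for k in range(n_bins):
        center = f_min * 2.0 ** (k / bins_per_octave)   # geometric spacing
        bandwidth = center / q                          # grows with frequency
        window_len = math.ceil(q * sample_rate / center)  # shrinks with frequency
        bins.append((center, bandwidth, window_len))
    return q, bins


# Example: 88 semitone bins starting at 27.5 Hz (MIDI note 21), 22050 Hz sampling.
q, bins = constant_q_bins(27.5, 88, 12, 22050)
```

With 12 bins per octave, Q is about 16.8, and the window for the lowest bin is many times longer than for the highest one.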
Pitch Helix and Chroma (Shepard, 2001)
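On the pitch helix, a pitch is described by its chroma (position around the circle, i.e. pitch class) and its height (octave). A minimal sketch of that decomposition for MIDI note numbers, and of folding one frame of per-note energies into a 12-bin chroma vector, is shown below. The function names are hypothetical, and the octave numbering assumes the convention in which MIDI note 60 is C4.

```python
PITCH_CLASSES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pitch_class(midi_note):
    """Chroma index 0..11 of a MIDI note (0 = C)."""
    return midi_note % 12


def octave(midi_note):
    """Octave number, assuming the convention MIDI 60 = C4."""
    return midi_note // 12 - 1


def chroma_from_pitch_frame(pitch_energy, midi_lo=0):
    """Fold per-note energies (index i is MIDI note midi_lo + i) into 12 chroma bins."""
    chroma = [0.0] * 12
    for i, energy in enumerate(pitch_energy):
        chroma[(midi_lo + i) % 12] += energy   # notes an octave apart share a bin
    return chroma
```

For example, notes 57, 69, and 81 (A3, A4, A5) differ in height but all contribute to chroma bin 9 (A).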
Optical illusion stairs Shepard tone https://vimeo.com/34749558
(Muller, 2011)
(From Ellis’ slides)
Probe Tone Profile - Relative Pitch Ranking
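One common way to use probe tone profiles for key estimation is to correlate a 12-bin chroma vector with the profile rotated to each of the 12 tonics, for both major and minor, and pick the best match. The sketch below assumes the Krumhansl-Kessler probe tone ratings as the profile values; the slide does not state which values it uses, and the function names are hypothetical.

```python
import math

# Assumed profile values: Krumhansl-Kessler probe tone ratings, tonic at index 0.
MAJOR = [6.35, 2.23, 3.48, 2.33, 4.38, 4.09, 2.52, 5.19, 2.39, 3.66, 2.29, 2.88]
MINOR = [6.33, 2.68, 3.52, 5.38, 2.60, 3.53, 2.54, 4.75, 3.98, 2.69, 3.34, 3.17]
NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def pearson(x, y):
    """Pearson correlation of two equal-length sequences (0.0 if either is constant)."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0.0 or sy == 0.0:
        return 0.0
    return cov / (sx * sy)


def rotate(profile, tonic):
    """Shift a profile so that its tonic weight lands on pitch class `tonic`."""
    return [profile[(i - tonic) % 12] for i in range(12)]


def estimate_key(chroma):
    """Return (tonic_name, mode, correlation) of the best-matching key profile."""
    best = None
    for tonic in range(12):
        for mode, profile in (("major", MAJOR), ("minor", MINOR)):
            r = pearson(chroma, rotate(profile, tonic))
            if best is None or r > best[2]:
                best = (NAMES[tonic], mode, r)
    return best
```

In practice the chroma vector would be averaged over a whole excerpt before matching, since a single frame rarely contains enough pitch classes to identify a key.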
(from Bello’s Slides)