GCT535- Sound Technology for Multimedia Temporal Analysis
Graduate School of Culture Technology KAIST Juhan Nam
Outlines
Temporal Analysis
Introduction
Human perception of Tempo
Onset detection
Definition
Onset
[Wikipedia]
[From D. Ellis’ e4896 course slides]
[M. Müller]
[Figure: waveform of “Eat (꺼내먹어요)” by Zion.T; amplitude vs. time [sec]]
[Figure: ODF vs. time [sec] and waveform amplitude vs. time [sec]]
Waveform Onset Detection Function
𝑥(𝑛): window
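A minimal sketch of one common way to turn a waveform into an onset detection function, assuming a windowed short-time energy whose frame-to-frame increase is kept (half-wave rectification). The Hann window, the frame and hop sizes, and the names `hann` and `energy_odf` are illustrative assumptions, not the formula used on the slide.

```python
import math

def hann(size):
    """Hann window of the given length (assumed window type)."""
    return [0.5 - 0.5 * math.cos(2 * math.pi * i / (size - 1)) for i in range(size)]

def energy_odf(x, frame_size=1024, hop_size=512):
    """Half-wave rectified difference of windowed short-time energy."""
    w = hann(frame_size)
    energies = []
    for start in range(0, len(x) - frame_size + 1, hop_size):
        frame = x[start:start + frame_size]
        energies.append(sum((s * wi) ** 2 for s, wi in zip(frame, w)))
    if not energies:
        return []  # signal shorter than one frame
    odf = [0.0]  # no previous frame to compare the first frame against
    for prev, cur in zip(energies, energies[1:]):
        odf.append(max(0.0, cur - prev))  # keep energy increases only
    return odf
```

Keeping only positive differences means a decaying note contributes nothing, so the function responds to note starts rather than note ends.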
[Figure: three ODF plots; ODF vs. time [sec]]
[Figure: ODF vs. time [sec]]
[Figure: spectrogram (frequency vs. time [sec]) and ODF vs. time [sec]]
Deviation from the steady-state for all frequency bins [From D. Ellis’ e4896 course slides]
Phase continuation (e.g. during sustain of a single note)
$\sum_{k=1}^{N}$
Low-pass Filtering (Solid line)
(Tzanetakis, 2010)
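A minimal sketch of low-pass filtering an ODF before peak picking, assuming a one-pole (exponential) smoother. The filter type, the coefficient `alpha`, and the function name `lowpass` are illustrative assumptions, not the filter design from (Tzanetakis, 2010).

```python
def lowpass(odf, alpha=0.3):
    """One-pole smoother: y[n] = alpha * x[n] + (1 - alpha) * y[n-1], with y[-1] = 0."""
    smoothed = []
    prev = 0.0
    for value in odf:
        prev = alpha * value + (1.0 - alpha) * prev
        smoothed.append(prev)
    return smoothed
```

A smaller `alpha` smooths more strongly but also delays and flattens the peaks, so it trades fewer spurious onsets against timing precision.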
[Figure: ODF and ODF threshold vs. time [sec]]
Median with window size 5
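A minimal sketch of adaptive thresholding with a running median of window size 5, followed by simple peak picking. The `offset` and `scale` parameters, the edge handling (shorter windows at the boundaries), and the function names are illustrative assumptions.

```python
from statistics import median

def median_threshold(odf, window=5, offset=0.0, scale=1.0):
    """Running-median threshold; the window is truncated at the edges."""
    half = window // 2
    thresholds = []
    for n in range(len(odf)):
        lo = max(0, n - half)
        hi = min(len(odf), n + half + 1)
        thresholds.append(offset + scale * median(odf[lo:hi]))
    return thresholds

def pick_onsets(odf, window=5, offset=0.0, scale=1.0):
    """Indices of local maxima of the ODF that exceed the median threshold."""
    thr = median_threshold(odf, window, offset, scale)
    onsets = []
    for n in range(1, len(odf) - 1):
        is_peak = odf[n] > odf[n - 1] and odf[n] >= odf[n + 1]
        if is_peak and odf[n] > thr[n]:
            onsets.append(n)
    return onsets
```

The median follows the local level of the ODF without being pulled up by isolated peaks, which is why it is often preferred over a running mean for this step.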
[Figure: two ODF plots; ODF vs. time [sec]]
Onset Detection Function (spectral flux) Auto-Correlation
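A minimal sketch of how the auto-correlation of an ODF can suggest a tempo: the lag with the strongest auto-correlation inside a plausible beat-period range is converted to beats per minute. The ODF frame rate `frame_rate`, the 60 to 180 BPM search range, and the function names are illustrative assumptions.

```python
def autocorrelation(odf, max_lag):
    """Unnormalized auto-correlation of the mean-removed ODF for lags 0..max_lag."""
    mean = sum(odf) / len(odf)
    centered = [v - mean for v in odf]
    return [
        sum(a * b for a, b in zip(centered, centered[lag:]))
        for lag in range(max_lag + 1)
    ]

def estimate_tempo(odf, frame_rate, min_bpm=60.0, max_bpm=180.0):
    """Tempo in BPM from the strongest auto-correlation lag in the search range."""
    min_lag = max(1, int(round(60.0 * frame_rate / max_bpm)))
    max_lag = min(int(round(60.0 * frame_rate / min_bpm)), len(odf) - 1)
    acf = autocorrelation(odf, max_lag)
    best_lag = max(range(min_lag, max_lag + 1), key=lambda lag: acf[lag])
    return 60.0 * frame_rate / best_lag
```

For example, an ODF sampled at 100 frames per second with a peak every 50 frames gives a best lag of 50, i.e. 120 BPM. Multiples of the beat period also produce strong peaks, so the chosen search range decides which metrical level is reported.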
Histogram of beats from a dataset [From D. Ellis’ e4896 course slides] (Klapuri, 2003)
(Tzanetakis, 2002)
(Tzanetakis, 2002)
k∈R
(Foote, 2001)
i, j
(Foote, 2001)
(Grosche, 2009)
(Grosche, 2011)
(Klapuri, 2006)
(Scheirer, 1998)