GCT535: Sound Technology for Multimedia
Music and Audio Alignment
Graduate School of Culture Technology, KAIST
Juhan Nam
Outline
Musical Representations: Score, Audio, MIDI
Music and Audio Alignment: Synchronization Framework
[from M. Müller's book]
Compute local similarity
Find the best path
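The first step, computing local similarity, can be sketched as a pairwise cost matrix between two feature sequences. This is a minimal sketch, assuming each sequence is a NumPy array with one feature vector (for example a 12-dimensional chroma vector) per row and that cosine distance is the chosen local measure; the function name `cosine_cost_matrix` and the `eps` guard are hypothetical and not taken from the slides.

```python
import numpy as np

def cosine_cost_matrix(X, Y, eps=1e-12):
    """Return O with O[n, m] = 1 - cos(X[n], Y[m]).

    X has shape (N, D) and Y has shape (M, D): one feature vector per row.
    eps is an assumed guard against division by zero for silent frames.
    """
    X = np.asarray(X, dtype=float)
    Y = np.asarray(Y, dtype=float)
    # Scale every frame to unit Euclidean length.
    Xn = X / np.maximum(np.linalg.norm(X, axis=1, keepdims=True), eps)
    Yn = Y / np.maximum(np.linalg.norm(Y, axis=1, keepdims=True), eps)
    # The inner product of unit vectors is the cosine similarity.
    return 1.0 - Xn @ Yn.T

# Toy example with 2-dimensional features (illustrative values only).
X = np.array([[1.0, 0.0], [0.0, 1.0]])
Y = np.array([[1.0, 0.0], [1.0, 1.0]])
O = cosine_cost_matrix(X, Y)
```

Low values in `O` mark frame pairs that sound alike; the second step then searches this matrix for the cheapest path.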
CENS: Normalized Chroma Features (Müller, 2005)
[Figure panels: MIDI, Lisitsa]
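A CENS-style feature can be sketched from an existing chromagram as normalize, quantize, smooth, downsample, normalize again. This is a rough sketch, not the reference implementation: the function name `cens_like`, the quantization thresholds, the window length, and the hop size are illustrative assumptions, and the input is assumed to be a non-negative array of shape (12, T).

```python
import numpy as np

def cens_like(chroma, win_len=41, hop=10):
    """CENS-style features from a (12, T) chromagram (assumed parameters)."""
    chroma = np.asarray(chroma, dtype=float)
    T = chroma.shape[1]
    # 1) L1-normalize each frame so the 12 bands sum to 1.
    sums = chroma.sum(axis=0, keepdims=True)
    norm = np.divide(chroma, sums, out=np.zeros_like(chroma), where=sums > 0)
    # 2) Quantize: one point for every threshold a band exceeds (0 to 4).
    quant = np.zeros_like(norm)
    for thr in (0.05, 0.1, 0.2, 0.4):
        quant += norm > thr
    # 3) Smooth each band over time with a Hann window, keeping length T.
    win = np.hanning(win_len)
    off = (win_len - 1) // 2
    smooth = np.array(
        [np.convolve(row, win, mode="full")[off:off + T] for row in quant]
    )
    # 4) Downsample in time.
    down = smooth[:, ::hop]
    # 5) L2-normalize each remaining frame.
    l2 = np.linalg.norm(down, axis=0, keepdims=True)
    return np.divide(down, l2, out=np.zeros_like(down), where=l2 > 0)

# Usage on random data standing in for a real chromagram.
rng = np.random.default_rng(0)
features = cens_like(rng.random((12, 100)) + 0.01)
```

The smoothing and downsampling make the features less sensitive to local tempo and articulation differences, which is why they suit comparing a MIDI rendition with a recorded performance.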
[Figure: axes labeled Schumann−Traumerei−Lisitsa and Schumann−Traumerei−MIDI]
[Figure: weighted graph with nodes A to K and numeric edge costs]
$C_k(j) = \min_i \{ C_{k-1}(i) + c_{ij} \}$
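The recursion above, read as C_k(j) = min over i of {C_{k-1}(i) + c_ij}, can be sketched for a graph arranged in stages. This is a minimal sketch under stated assumptions: the function name `staged_shortest_path`, the stage-by-stage cost matrices, and the toy costs are hypothetical and do not reproduce the graph from the slides.

```python
import numpy as np

def staged_shortest_path(stage_costs):
    """Cheapest path through a staged graph.

    stage_costs[k][i, j] is the cost c_ij of the edge from node i in
    stage k to node j in stage k + 1 (an assumed input layout).
    Returns (total cost, list with the chosen node index per stage).
    """
    first = np.asarray(stage_costs[0], dtype=float)
    C = np.zeros(first.shape[0])      # cost of reaching each start node
    back = []                         # best predecessor per stage
    for c in stage_costs:
        c = np.asarray(c, dtype=float)
        total = C[:, None] + c        # total[i, j] = C_{k-1}(i) + c_ij
        back.append(total.argmin(axis=0))
        C = total.min(axis=0)         # C_k(j)
    # Backtrack from the cheapest node of the final stage.
    j = int(C.argmin())
    best = float(C[j])
    path = [j]
    for b in reversed(back):
        j = int(b[j])
        path.append(j)
    path.reverse()
    return best, path

# Toy graph: 1 node -> 2 nodes -> 2 nodes -> 1 node (illustrative costs).
costs = [
    np.array([[2.0, 4.0]]),
    np.array([[6.0, 3.0], [1.0, 5.0]]),
    np.array([[2.0], [1.0]]),
]
best_cost, best_path = staged_shortest_path(costs)
```

Each stage reuses the costs accumulated for the previous stage, so the search never enumerates every path explicitly.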
Initialization:
  C(n,1) = sum(O(1:n,1)), n = 1…N
  C(1,m) = sum(O(1,1:m)), m = 1…M
Recursion:
  For each m = 2…M
    For each n = 2…N
      C(n,m) = O(n,m) + min{ C(n-1,m), C(n,m-1), C(n-1,m-1) }
Termination:
  C(N,M) is the DTW distance
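The accumulated-cost computation above can be sketched in Python with NumPy. The function name `dtw` and the backtracking step that recovers the warping path are assumptions added for illustration; indices are 0-based here, unlike the 1-based notation above.

```python
import numpy as np

def dtw(O):
    """Return (distance, path) for a local cost matrix O of shape (N, M)."""
    O = np.asarray(O, dtype=float)
    N, M = O.shape
    C = np.zeros((N, M))
    # Initialization: first column and first row are cumulative sums.
    C[:, 0] = np.cumsum(O[:, 0])
    C[0, :] = np.cumsum(O[0, :])
    # Recursion: add the local cost to the cheapest of three predecessors.
    for n in range(1, N):
        for m in range(1, M):
            C[n, m] = O[n, m] + min(C[n - 1, m], C[n, m - 1], C[n - 1, m - 1])
    # Backtracking (assumed step): walk from (N-1, M-1) back to (0, 0).
    n, m = N - 1, M - 1
    path = [(n, m)]
    while n > 0 or m > 0:
        if n == 0:
            m -= 1
        elif m == 0:
            n -= 1
        else:
            steps = [
                (C[n - 1, m - 1], n - 1, m - 1),
                (C[n - 1, m], n - 1, m),
                (C[n, m - 1], n, m - 1),
            ]
            _, n, m = min(steps, key=lambda s: s[0])
        path.append((n, m))
    path.reverse()
    return C[-1, -1], path

# Toy example: the second sequence repeats one value of the first.
x = np.array([0.0, 1.0, 2.0])
y = np.array([0.0, 1.0, 1.0, 2.0])
O = np.abs(x[:, None] - y[None, :])
distance, path = dtw(O)
```

In this example the path stays on the repeated value for one extra step, which is the time stretch that makes the two sequences match.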
[Figure: local cost matrix O(i,j) and accumulated cost matrix C(i,j)]
Figure 2: An example of the on-line time warping algorithm with search window c = 4, showing the order of evaluation for a particular sequence of row and column increments. The axes represent the variables t and j (see Figure 1) respectively. All calculated cells are framed in bold, and the optimal path is coloured grey.
[Dixon, 2005]
[Ewert, 2009]
Score Following Results on the RWC dataset
Demo: https://www.audiolabs-erlangen.de/resources/MIR/SyncRWC60
$$C(\{t_i\}) = \sum_{i=1}^{N} O(t_i) + \alpha \sum_{i=2}^{N} F(t_i - t_{i-1}, \tau_p)$$

$$C^*(t) = O(t) + \max_{\tau} \{ \alpha F(t - \tau, \tau_p) + C^*(\tau) \}$$