Music Synchronization - Meinard Müller - PowerPoint PPT Presentation


slide-1
SLIDE 1

Lecture Music Processing

Music Synchronization

Meinard Müller
International Audio Laboratories Erlangen
meinard.mueller@audiolabs-erlangen.de

slide-2
SLIDE 2

Book: Fundamentals of Music Processing

Meinard Müller Fundamentals of Music Processing Audio, Analysis, Algorithms, Applications 483 p., 249 illus., hardcover ISBN: 978-3-319-21944-8 Springer, 2015 Accompanying website: www.music-processing.de

slide-5
SLIDE 5

Chapter 3: Music Synchronization

3.1 Audio Features 3.2 Dynamic Time Warping 3.3 Applications 3.4 Further Notes

As a first music processing task, we study in Chapter 3 the problem of music synchronization. The objective is to temporally align compatible representations of the same piece of music. Considering this scenario, we explain the need for musically informed audio features. In particular, we introduce the concept of chroma-based music features, which capture properties that are related to harmony and melody. Furthermore, we study an alignment technique known as dynamic time warping (DTW), a concept that is applicable for the analysis of general time series. For its efficient computation, we discuss an algorithm based on dynamic programming, a widely used method for solving a complex problem by breaking it down into a collection of simpler subproblems.

slide-6
SLIDE 6

Music Data

slide-7
SLIDE 7

Music Data

slide-8
SLIDE 8

Music Data

slide-9
SLIDE 9

Music Data

Various interpretations – Beethoven’s Fifth Bernstein Karajan Gould (piano) MIDI (piano)

slide-10
SLIDE 10

Music Synchronization: Audio-Audio

Given: Two different audio recordings of the same underlying piece of music. Goal: Find for each position in one audio recording the musically corresponding position in the other audio recording.

slide-11
SLIDE 11

Music Synchronization: Audio-Audio

Karajan Gould Beethoven’s Fifth

Time (seconds) Time (seconds)

slide-12
SLIDE 12

Music Synchronization: Audio-Audio

Karajan Gould Beethoven’s Fifth

Time (seconds) Time (seconds)

slide-13
SLIDE 13

Music Synchronization: Audio-Audio

Application: Interpretation Switcher

slide-14
SLIDE 14

Music Synchronization: Audio-Audio

Two main steps:

1.) Audio features

  • Robust but discriminative
  • Chroma features
  • Robust to variations in instrumentation, timbre, dynamics
  • Correlate to harmonic progression

2.) Alignment procedure

  • Deals with local and global tempo variations
  • Needs to be efficient

slide-15
SLIDE 15

Music Synchronization: Audio-Audio

Karajan Gould Beethoven’s Fifth

Time (seconds) Time (seconds)

slide-16
SLIDE 16

Music Synchronization: Audio-Audio

Karajan Gould Beethoven’s Fifth

Time (indices) Time (indices)

slide-17
SLIDE 17

Music Synchronization: Audio-Audio

Karajan Gould Beethoven’s Fifth

Time (indices) Time (indices)

slide-18
SLIDE 18

Music Synchronization: Audio-Audio

Karajan Gould Beethoven’s Fifth

Time (indices) Time (indices)

G G

slide-19
SLIDE 19

Music Synchronization: Audio-Audio

Karajan Gould Beethoven’s Fifth

Time (indices) Time (indices)

E

E

slide-20
SLIDE 20

Music Synchronization: Audio-Audio

Time (indices) Time (indices)

Karajan Gould

slide-21
SLIDE 21

Music Synchronization: Audio-Audio

Cost matrix

Time (indices) Time (indices)

Karajan Gould

slide-22
SLIDE 22

Music Synchronization: Audio-Audio

Cost matrix

Time (indices) Time (indices)

Karajan Gould

slide-23
SLIDE 23

Music Synchronization: Audio-Audio

Optimal alignment (cost-minimizing warping path)

Time (indices) Time (indices)

Karajan Gould

slide-24
SLIDE 24

Music Synchronization: Audio-Audio

Cost matrix

slide-25
SLIDE 25

Music Synchronization: Audio-Audio

Optimal alignment (cost-minimizing warping path)

slide-26
SLIDE 26

Music Synchronization: Audio-Audio

Karajan Gould Optimal alignment (cost-minimizing warping path)

Time (indices) Time (indices)

slide-27
SLIDE 27

Music Synchronization: Audio-Audio

How to compute the alignment?

  • Cost matrices
  • Dynamic programming
  • Dynamic Time Warping (DTW)

slide-28
SLIDE 28

Applications

Music Library

Freude, schoener Götterfunken, Tochter aus Elysium, Wir betreten feuertrunken, Himmlische dein Heiligtum. Deine Zauber binden wieder, Was die Mode streng geteilt; Alle Menschen werden Brueder, Wo dein sanfter Flügel weilt. Wem der grosse Wurf gelungen, Eines Freundes Freund zu sein, Wer ein holdes Weib errungen, Mische seine Jubel ein!

slide-29
SLIDE 29

Music Synchronization: MIDI-Audio

Time

slide-30
SLIDE 30

Music Synchronization: MIDI-Audio

MIDI = meta data
Automated annotation
Audio recording

Sonification of annotations

slide-31
SLIDE 31

Music Synchronization: MIDI-Audio

  • Automated audio annotation
  • Accurate audio access after MIDI-based retrieval
  • Automated tracking of MIDI note parameters during audio playback
  • Performance analysis

slide-32
SLIDE 32

Music Synchronization: MIDI-Audio

MIDI = reference (score)
Tempo information
Audio recording

slide-33
SLIDE 33

Performance Analysis: Tempo Curves

Figure: alignment between the reference version (time in beats) and the performed version (time in seconds), together with the local tempo (BPM) derived from it.

slide-34
SLIDE 34

Performance Analysis: Tempo Curves

Figure: alignment between the reference version (time in beats) and the performed version (time in seconds), together with the local tempo (BPM) derived from it.

1 beat lasting 2 seconds ≙ 30 BPM

slide-35
SLIDE 35

Performance Analysis: Tempo Curves

Figure: alignment between the reference version (time in beats) and the performed version (time in seconds), together with the local tempo (BPM) derived from it.

1 beat lasting 1 second ≙ 60 BPM

slide-36
SLIDE 36

Performance Analysis: Tempo Curves

Figure: alignment between the reference version (time in beats) and the performed version (time in seconds), together with the local tempo (BPM) derived from it.

1 beat lasting 0.4 seconds ≙ 150 BPM

slide-37
SLIDE 37

Performance Analysis: Tempo Curves

Figure: alignment between the reference version (time in beats) and the performed version (time in seconds), together with the resulting tempo curve (BPM). Marked tempo values: 30, 60, 150, 200 BPM.

Tempo curve is obtained by interpolation
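The relation between beat duration and tempo used on these slides (a beat lasting d seconds corresponds to 60/d BPM) can be sketched as follows. The function name and the beat positions are assumptions chosen to reproduce the tempo values named on the slides; they are not taken from the lecture material.

```python
def local_tempo_bpm(beat_times_sec):
    """Local tempo (BPM) for each interval between consecutive beat positions.

    beat_times_sec: positions (in seconds) in the performed version that an
    alignment assigns to consecutive beats of the reference version.
    """
    return [60.0 / (t1 - t0) for t0, t1 in zip(beat_times_sec, beat_times_sec[1:])]


# Hypothetical aligned positions of reference beats 1..5 in the performance.
beat_times = [0.0, 2.0, 3.0, 3.4, 3.7]
tempi = local_tempo_bpm(beat_times)
print([round(t, 1) for t in tempi])
```

A tempo curve would then be obtained by interpolating these per-beat values over time, as stated on the slide.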

slide-38
SLIDE 38

Schumann: Träumerei

Performance Analysis: Tempo Curves

Performance:

Figure: waveform of the recording (time in seconds).

slide-39
SLIDE 39

Schumann: Träumerei

Performance Analysis: Tempo Curves

Score (reference):

Figure: sheet music, measures 1 to 8.

Performance:

Figure: waveform of the recording (time in seconds).

slide-40
SLIDE 40

Schumann: Träumerei

Performance Analysis: Tempo Curves

Strategy: Compute score-audio synchronization and derive tempo curve

Score (reference):

Figure: sheet music, measures 1 to 8.

Performance:

Figure: waveform of the recording (time in seconds).

slide-41
SLIDE 41

Performance Analysis: Tempo Curves

Schumann: Träumerei

Score (reference):

Figure: sheet music, measures 1 to 8.

Tempo curve:

Figure: tempo (BPM, axis from 40 to 160) over time (measures 1 to 8).

slide-42
SLIDE 42

Performance Analysis: Tempo Curves

Schumann: Träumerei

Score (reference):

Figure: sheet music, measures 1 to 8.

Tempo curves:

Figure: tempo (BPM, axis from 40 to 160) over time (measures 1 to 8).

slide-43
SLIDE 43

Performance Analysis: Tempo Curves

Schumann: Träumerei

Score (reference):

Figure: sheet music, measures 1 to 8.

Tempo curves:

Figure: tempo (BPM, axis from 40 to 160) over time (measures 1 to 8).

slide-44
SLIDE 44

Performance Analysis: Tempo Curves

Schumann: Träumerei

Score (reference):

Figure: sheet music, measures 1 to 8.

Tempo curves:

Figure: tempo (BPM, axis from 40 to 160) over time (measures 1 to 8).

?

slide-45
SLIDE 45

Performance Analysis: Tempo Curves

Schumann: Träumerei

Tempo curves:

Figure: tempo (BPM, axis from 40 to 160) over time (measures 1 to 8).

What can be done if no reference is available?

slide-46
SLIDE 46

Performance Analysis: Tempo Curves

Schumann: Träumerei

Tempo curves:

Figure: tempo (BPM, axis from 40 to 160) over time (measures 1 to 8).

What can be done if no reference is available? → Tempo and Beat Tracking

slide-47
SLIDE 47

Music Synchronization: Image-Audio

Image Audio

slide-48
SLIDE 48

Music Synchronization: Image-Audio

Image Audio

slide-49
SLIDE 49

Music Synchronization: Image-Audio

Image Audio Convert data into common mid-level feature representation

slide-50
SLIDE 50

Music Synchronization: Image-Audio

Image Audio

Image Processing: Optical Music Recognition

Convert data into common mid-level feature representation

slide-51
SLIDE 51

Music Synchronization: Image-Audio

Image Audio

Image Processing: Optical Music Recognition
Audio Processing: Fourier Analysis

Convert data into common mid-level feature representation

slide-52
SLIDE 52

Music Synchronization: Image-Audio

Image Audio

Image Processing: Optical Music Recognition
Audio Processing: Fourier Analysis

slide-53
SLIDE 53

Application: Score Viewer

Music Synchronization: Image-Audio

slide-54
SLIDE 54

Music Synchronization: Lyrics-Audio

Ich träumte von bunten Blumen, so wie sie wohl blühen im Mai

slide-55
SLIDE 55

Music Synchronization: Lyrics-Audio

Ich träumte von bunten Blumen, so wie sie wohl blühen im Mai

Extremely difficult!

slide-56
SLIDE 56

Music Synchronization: Lyrics-Audio

Ich träumte von bunten Blumen, so wie sie wohl blühen im Mai

Lyrics-Audio → Lyrics-MIDI + MIDI-Audio

slide-57
SLIDE 57

Music Synchronization: Lyrics-Audio

Lyrics-Audio → Lyrics-MIDI + MIDI-Audio

Ich träumte von bunten Blumen, so wie sie wohl blühen im Mai

slide-58
SLIDE 58

Score-Informed Source Separation

slide-59
SLIDE 59

Score-Informed Source Separation

slide-60
SLIDE 60

Score-Informed Source Separation

slide-61
SLIDE 61

Score-Informed Source Separation

Experimental results for separating left and right hands for piano recordings:

Composer   Piece              Database           Results (L, R, Eq, Org)
Bach       BWV 875, Prelude   SMD
Chopin     Op. 28, No. 15     SMD
Chopin     Op. 64, No. 1      European Archive

slide-62
SLIDE 62

Score-Informed Source Separation

Audio editing

Figure: spectrogram excerpts before and after editing (frequency in Hertz, time in seconds); marked frequencies: 523 Hz and 554 Hz.

slide-63
SLIDE 63

Dynamic Time Warping

slide-64
SLIDE 64

Dynamic Time Warping

  • Well-known technique to find an optimal alignment between two given (time-dependent) sequences under certain restrictions.
  • Intuitively, sequences are warped in a non-linear fashion to match each other.
  • Originally used to compare different speech patterns in automatic speech recognition.

slide-65
SLIDE 65

Dynamic Time Warping

Sequence X Sequence Y x1 x2 x3 x4 x5 x6 x7 x8 x9 y1 y2 y3 y4 y5 y6 y7

slide-66
SLIDE 66

Dynamic Time Warping

Sequence X Sequence Y x1 x2 x3 x4 x5 x6 x7 x8 x9 y1 y2 y3 y4 y5 y6 y7

Time alignment of two time-dependent sequences, where the aligned points are indicated by the arrows.

slide-67
SLIDE 67

Dynamic Time Warping

Sequence X Sequence Y x1 x2 x3 x4 x5 x6 x7 x8 x9 y1 y2 y3 y4 y5 y6 y7

Time alignment of two time-dependent sequences, where the aligned points are indicated by the arrows.

1 2 3 4 5 6 7 1 2 3 4 5 6 7 8 9

Sequence Y Sequence X

slide-68
SLIDE 68

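Dynamic Time Warping

The objective of DTW is to compare two (time-dependent) sequences $X := (x_1, x_2, \ldots, x_N)$ of length $N \in \mathbb{N}$ and $Y := (y_1, y_2, \ldots, y_M)$ of length $M \in \mathbb{N}$. Here, $x_n, y_m \in \mathcal{F}$, $n \in [1:N]$, $m \in [1:M]$, are suitable features that are elements from a given feature space denoted by $\mathcal{F}$.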

slide-69
SLIDE 69

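Dynamic Time Warping

To compare two different features $x, y \in \mathcal{F}$ one needs a local cost measure which is defined to be a function $c : \mathcal{F} \times \mathcal{F} \to \mathbb{R}_{\geq 0}$. Typically, $c(x, y)$ is small (low cost) if $x$ and $y$ are similar to each other, and otherwise $c(x, y)$ is large (high cost).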

slide-70
SLIDE 70

Dynamic Time Warping

Evaluating the local cost measure for each pair of elements of the sequences $X$ and $Y$, one obtains the cost matrix $C \in \mathbb{R}^{N \times M}$ defined by $C(n, m) := c(x_n, y_m)$. Then the goal is to find an alignment between $X$ and $Y$ having minimal overall cost. Intuitively, such an optimal alignment runs along a “valley” of low cost within the cost matrix $C$.
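As a minimal sketch of this step, the cost matrix can be built by evaluating a local cost measure on every pair of elements. The function names, the absolute difference as local cost, and the example sequences are assumptions for illustration; for chroma features one would typically plug in a distance on vectors instead.

```python
def cost_matrix(X, Y, c):
    """Return C as a list of rows with C[n][m] = c(X[n], Y[m]) (0-based indices)."""
    return [[c(x, y) for y in Y] for x in X]


def abs_diff(x, y):
    """Illustrative local cost for scalar features: small if x and y are close."""
    return abs(x - y)


X = [1, 3, 3, 8, 1]
Y = [2, 0, 0, 8, 7, 2]
C = cost_matrix(X, Y, abs_diff)
for row in C:
    print(row)
```

The matrix has N = len(X) rows and M = len(Y) columns, so building it already takes O(NM) evaluations of the local cost.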

slide-71
SLIDE 71

Dynamic Time Warping

Time (indices) Time (indices)

Cost matrix C

slide-72
SLIDE 72

Dynamic Time Warping

Time (indices) Time (indices)

Cost matrix C C(5,6)

slide-73
SLIDE 73

Dynamic Time Warping

Cost matrix C

slide-74
SLIDE 74

Dynamic Time Warping

Cost matrix C C(5,6)

slide-75
SLIDE 75
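
Dynamic Time Warping

The next definition formalizes the notion of an alignment. A warping path is a sequence $P = (p_1, \ldots, p_L)$ with $p_\ell = (n_\ell, m_\ell) \in [1:N] \times [1:M]$ for $\ell \in [1:L]$ satisfying the following three conditions:

  • Boundary condition: $p_1 = (1, 1)$ and $p_L = (N, M)$
  • Monotonicity condition: $n_1 \leq n_2 \leq \ldots \leq n_L$ and $m_1 \leq m_2 \leq \ldots \leq m_L$
  • Step size condition: $p_{\ell+1} - p_\ell \in \{(1, 0), (0, 1), (1, 1)\}$ for $\ell \in [1:L-1]$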
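The three conditions can be checked mechanically. The following is a small sketch under the assumption that a path is given as a list of 1-based index pairs; the function name is hypothetical. Since every allowed step is non-negative in both coordinates, the step size condition already implies monotonicity, so only the boundary and step size conditions are tested explicitly.

```python
ALLOWED_STEPS = {(1, 0), (0, 1), (1, 1)}


def is_warping_path(path, N, M):
    """Check whether a list of 1-based cells (n, m) is an (N, M)-warping path."""
    # Boundary condition: start at (1, 1) and end at (N, M).
    if not path or path[0] != (1, 1) or path[-1] != (N, M):
        return False
    # Step size condition (implies the monotonicity condition).
    return all(
        (n2 - n1, m2 - m1) in ALLOWED_STEPS
        for (n1, m1), (n2, m2) in zip(path, path[1:])
    )


print(is_warping_path([(1, 1), (2, 2), (3, 3), (4, 4), (4, 5), (5, 6)], 5, 6))
print(is_warping_path([(1, 1), (3, 2), (4, 3), (5, 6)], 5, 6))
```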

slide-76
SLIDE 76

Dynamic Time Warping

Warping path

Figure: index grid with Sequence X (indices 1 to 9) on the vertical axis and Sequence Y (indices 1 to 7) on the horizontal axis.

Each matrix entry (cell) corresponds to a pair of indices, for example cell = (6,3).

Boundary cells: $p_1 = (1, 1)$, $p_L = (N, M) = (9, 7)$

slide-77
SLIDE 77

Dynamic Time Warping

Correct warping path

Figure: warping path in the index grid (Sequence X: indices 1 to 9, Sequence Y: indices 1 to 7) and the corresponding alignment between $x_1, \ldots, x_9$ and $y_1, \ldots, y_7$.

slide-78
SLIDE 78

Dynamic Time Warping

Violation of boundary condition

Figure: path in the index grid (Sequence X: indices 1 to 9, Sequence Y: indices 1 to 7) and the corresponding alignment between $x_1, \ldots, x_9$ and $y_1, \ldots, y_7$.

slide-79
SLIDE 79

Dynamic Time Warping

Violation of monotonicity condition

Figure: path in the index grid (Sequence X: indices 1 to 9, Sequence Y: indices 1 to 7) and the corresponding alignment between $x_1, \ldots, x_9$ and $y_1, \ldots, y_7$.

slide-80
SLIDE 80

Dynamic Time Warping

Violation of step size condition

Figure: path in the index grid (Sequence X: indices 1 to 9, Sequence Y: indices 1 to 7) and the corresponding alignment between $x_1, \ldots, x_9$ and $y_1, \ldots, y_7$.

slide-81
SLIDE 81

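Dynamic Time Warping

The total cost $c_P(X, Y)$ of a warping path $P$ between $X$ and $Y$ with respect to the local cost measure $c$ is defined as

$$c_P(X, Y) := \sum_{\ell=1}^{L} c(x_{n_\ell}, y_{m_\ell}).$$

Furthermore, an optimal warping path between $X$ and $Y$ is a warping path $P^*$ having minimal total cost among all possible warping paths. The DTW distance $\mathrm{DTW}(X, Y)$ between $X$ and $Y$ is then defined as the total cost of $P^*$:

$$\mathrm{DTW}(X, Y) := c_{P^*}(X, Y) = \min\{c_P(X, Y) \mid P \text{ is an } (N, M)\text{-warping path}\}.$$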
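The total cost of a given path is just the sum of local costs over its cells. The sketch below assumes 1-based cells, scalar features, and the absolute difference as local cost; all names and values are illustrative.

```python
def total_cost(path, X, Y, c):
    """Sum the local costs c(x_n, y_m) over all cells (n, m) of a path (1-based)."""
    return sum(c(X[n - 1], Y[m - 1]) for n, m in path)


def abs_diff(x, y):
    return abs(x - y)


X = [1, 3, 3, 8, 1]
Y = [2, 0, 0, 8, 7, 2]
path_a = [(1, 1), (2, 2), (3, 3), (4, 4), (4, 5), (5, 6)]
path_b = [(1, 1), (2, 2), (3, 3), (4, 4), (5, 5), (5, 6)]
cost_a = total_cost(path_a, X, Y, abs_diff)
cost_b = total_cost(path_b, X, Y, abs_diff)
print(cost_a, cost_b)
```

Both paths are valid warping paths, but they differ in total cost; the DTW distance is the minimum of this quantity over all warping paths.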

slide-82
SLIDE 82

Dynamic Time Warping

  • The warping path is not unique (in general).
  • DTW does (in general) not define a metric since it may not satisfy the triangle inequality.
  • There exist exponentially many warping paths.
  • How can $P^*$ be computed efficiently?
slide-83
SLIDE 83

Dynamic Time Warping

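Notation:

$X(1:n) := (x_1, \ldots, x_n)$ for $1 \leq n \leq N$
$Y(1:m) := (y_1, \ldots, y_m)$ for $1 \leq m \leq M$
$D(n, m) := \mathrm{DTW}(X(1:n), Y(1:m))$

The matrix $D$ is called the accumulated cost matrix. The entry $D(n, m)$ specifies the cost of an optimal warping path that aligns $X(1:n)$ with $Y(1:m)$.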

slide-84
SLIDE 84

Dynamic Time Warping

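Lemma:

(i) $D(N, M) = \mathrm{DTW}(X, Y)$
(ii) $D(1, 1) = C(1, 1)$
(iii) $D(n, 1) = D(n-1, 1) + C(n, 1)$ for $n > 1$ and $D(1, m) = D(1, m-1) + C(1, m)$ for $m > 1$
(iv) $D(n, m) = C(n, m) + \min\{D(n-1, m-1), D(n-1, m), D(n, m-1)\}$ for $n > 1$, $m > 1$

Proof: (i) – (iii) are clear by definition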

slide-85
SLIDE 85

Dynamic Time Warping

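Proof of (iv): Induction via $n$, $m$:

Let $n > 1$, $m > 1$ and $Q = (q_1, \ldots, q_{L-1}, q_L)$ be an optimal warping path for $X(1:n)$ and $Y(1:m)$. Then $q_L = (n, m)$ (boundary condition). Let $q_{L-1} = (a, b)$. The step size condition implies $(a, b) \in \{(n-1, m-1), (n-1, m), (n, m-1)\}$. The warping path $(q_1, \ldots, q_{L-1})$ must be optimal for $X(1:a)$, $Y(1:b)$. Thus, $D(n, m) = c_{(q_1, \ldots, q_{L-1})}(X(1:a), Y(1:b)) + C(n, m) = D(a, b) + C(n, m)$.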

slide-86
SLIDE 86

Dynamic Time Warping

Accumulated cost matrix

Given the two feature sequences $X$ and $Y$, the matrix $D$ is computed recursively.

  • Initialize $D$ using (ii) and (iii) of the lemma.
  • Compute $D(n, m)$ for $n > 1$, $m > 1$ using (iv).
  • $\mathrm{DTW}(X, Y) = D(N, M)$ using (i).

Note:

  • Complexity O(NM).
  • Dynamic programming: “overlapping-subproblem property”
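The recursive computation can be sketched as follows: the first row and column are filled by cumulative sums, and every other entry adds the local cost to the minimum of its three predecessors. The function name, the 0-based indexing, and the example sequences with absolute difference as local cost are assumptions for illustration.

```python
def accumulated_cost_matrix(C):
    """Compute the accumulated cost matrix D from a cost matrix C (list of rows)."""
    N, M = len(C), len(C[0])
    D = [[0] * M for _ in range(N)]
    D[0][0] = C[0][0]
    # Initialization: first column and first row are cumulative sums.
    for n in range(1, N):
        D[n][0] = D[n - 1][0] + C[n][0]
    for m in range(1, M):
        D[0][m] = D[0][m - 1] + C[0][m]
    # Recursion: local cost plus the cheapest of the three predecessors.
    for n in range(1, N):
        for m in range(1, M):
            D[n][m] = C[n][m] + min(D[n - 1][m - 1], D[n - 1][m], D[n][m - 1])
    return D


X = [1, 3, 3, 8, 1]
Y = [2, 0, 0, 8, 7, 2]
C = [[abs(x - y) for y in Y] for x in X]
D = accumulated_cost_matrix(C)
dtw_distance = D[-1][-1]  # D(N, M) = DTW(X, Y)
print(dtw_distance)
```

Each of the N·M entries is computed once from already known entries, which is where the O(NM) complexity comes from.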

slide-87
SLIDE 87

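Given to the algorithm is the accumulated cost matrix $D$. The optimal path $P^* = (p_1, \ldots, p_L)$ is computed in reverse order of the indices starting with $p_L = (N, M)$.

Suppose $p_\ell = (n, m)$ has been computed. In case $(n, m) = (1, 1)$, one must have $\ell = 1$ and we are done. Otherwise,

$$p_{\ell-1} := \begin{cases} (1, m-1), & \text{if } n = 1 \\ (n-1, 1), & \text{if } m = 1 \\ \operatorname{argmin}\{D(n-1, m-1), D(n-1, m), D(n, m-1)\}, & \text{otherwise,} \end{cases}$$

where we take the lexicographically smallest pair in case “argmin” is not unique.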

Dynamic Time Warping

Optimal warping path
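A minimal backtracking sketch, assuming the accumulated cost matrix is given as a list of rows (0-based internally, 1-based cells in the result). The function name and the example matrix are illustrative. Ties are broken by comparing (value, cell) tuples, which selects the lexicographically smallest cell among equal values.

```python
def optimal_warping_path(D):
    """Backtrack through the accumulated cost matrix D; return 1-based cells."""
    n, m = len(D) - 1, len(D[0]) - 1
    path = [(n, m)]
    while (n, m) != (0, 0):
        if n == 0:
            m -= 1  # only horizontal moves remain
        elif m == 0:
            n -= 1  # only vertical moves remain
        else:
            candidates = [(n - 1, m - 1), (n - 1, m), (n, m - 1)]
            # Smallest accumulated cost; ties go to the lexicographically smallest cell.
            n, m = min(candidates, key=lambda cell: (D[cell[0]][cell[1]], cell))
        path.append((n, m))
    path.reverse()
    return [(i + 1, j + 1) for i, j in path]


# Example accumulated cost matrix (5 x 6), used here only as input data.
D = [
    [1, 2, 3, 10, 16, 17],
    [2, 4, 5, 8, 12, 13],
    [3, 5, 7, 10, 12, 13],
    [9, 11, 13, 7, 8, 14],
    [10, 10, 11, 14, 13, 9],
]
P = optimal_warping_path(D)
print(P)
```

The loop visits at most N + M - 1 cells, so backtracking is cheap compared with filling the matrix.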

slide-88
SLIDE 88

Dynamic Time Warping

Summary

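Figure: accumulated cost matrix $D$. The entries $D(n, 1)$ and $D(1, m)$ are initialized first; each remaining entry $D(n, m)$ (illustrated for n = 6, m = 10) is computed from $D(n-1, m-1)$, $D(n-1, m)$, and $D(n, m-1)$; finally $D(N, M) = \mathrm{DTW}(X, Y)$.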

slide-89
SLIDE 89

Dynamic Time Warping

Summary

slide-90
SLIDE 90

Dynamic Time Warping

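Example

$X = (1, 3, 3, 8, 1)$, $Y = (2, 0, 0, 8, 7, 2)$, local cost $c(x, y) = |x - y|$

Cost matrix $C$ (rows n = 1, ..., 5; columns m = 1, ..., 6):

   1   1   1   7   6   1
   1   3   3   5   4   1
   1   3   3   5   4   1
   6   8   8   0   1   6
   1   1   1   7   6   1

Accumulated cost matrix $D$:

   1   2   3  10  16  17
   2   4   5   8  12  13
   3   5   7  10  12  13
   9  11  13   7   8  14
  10  10  11  14  13   9

Optimal warping path: $P^* = ((1,1), (2,2), (3,3), (4,4), (4,5), (5,6))$ with $\mathrm{DTW}(X, Y) = D(5, 6) = 9$

Alignment: $x_1$ with $y_1$, $x_2$ with $y_2$, $x_3$ with $y_3$, $x_4$ with $y_4$ and $y_5$, $x_5$ with $y_6$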

slide-91
SLIDE 91

Dynamic Time Warping

Step size conditions: $\Sigma = \{(1, 0), (0, 1), (1, 1)\}$

slide-92
SLIDE 92

Dynamic Time Warping

Step size conditions: $\Sigma = \{(2, 1), (1, 2), (1, 1)\}$
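One way to experiment with different step size conditions is to pass the step set as a parameter. The sketch below is an assumption about how such a variant could be written, not the lecture's implementation: each cell takes its local cost plus the cheapest reachable predecessor, and cells that cannot be reached stay at infinity. Note that with steps such as (2, 1), skipped elements do not contribute to the cost in this simple form.

```python
import math


def dtw_with_steps(C, steps):
    """DTW cost for a cost matrix C and a set of allowed steps (i, j)."""
    N, M = len(C), len(C[0])
    D = [[math.inf] * M for _ in range(N)]
    D[0][0] = C[0][0]
    for n in range(N):
        for m in range(M):
            if n == 0 and m == 0:
                continue
            # Cheapest predecessor reachable by one allowed step.
            best = min(
                (D[n - i][m - j] for i, j in steps if n - i >= 0 and m - j >= 0),
                default=math.inf,
            )
            D[n][m] = C[n][m] + best
    return D[N - 1][M - 1]


CLASSICAL = [(1, 0), (0, 1), (1, 1)]
SLOPE_CONSTRAINED = [(2, 1), (1, 2), (1, 1)]

X = [1, 3, 3, 8, 1]
Y = [2, 0, 0, 8, 7, 2]
C = [[abs(x - y) for y in Y] for x in X]
print(dtw_with_steps(C, CLASSICAL))
print(dtw_with_steps(C, SLOPE_CONSTRAINED))
```

With the second step set, the slope of a path is bounded, so sequences whose lengths differ too much cannot be aligned at all and the result is infinite.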

slide-93
SLIDE 93

Dynamic Time Warping

Step size conditions

slide-94
SLIDE 94

Dynamic Time Warping

  • Computation via dynamic programming
  • Memory requirements and running time: O(NM)
  • Problem: Infeasible for large N and M
  • Example: Feature resolution 10 Hz, pieces 15 min: N, M ~ 10,000 and N · M ~ 100,000,000
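The order of magnitude in the example can be checked with a few lines of arithmetic. The 8 bytes per matrix entry are an assumption (64-bit floating-point values); the slide itself only states the rounded sizes.

```python
feature_rate_hz = 10           # features per second
duration_sec = 15 * 60         # a 15-minute piece
n_frames = feature_rate_hz * duration_sec   # N (and M) for two such recordings
n_cells = n_frames * n_frames               # entries of the cost matrix
bytes_per_cell = 8             # assumption: 64-bit floats
memory_mb = n_cells * bytes_per_cell / 1e6

print(n_frames, n_cells, memory_mb)
```

So N and M are on the order of 10^4 and the matrix has on the order of 10^8 entries, which is several hundred megabytes under the stated assumption.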

slide-95
SLIDE 95

Dynamic Time Warping

Global constraints

  • Sakoe-Chiba band
  • Itakura parallelogram

slide-96
SLIDE 96

Dynamic Time Warping

Global constraints

  • Sakoe-Chiba band
  • Itakura parallelogram

Problem: Optimal warping path not in constraint region
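A simplified sketch of a band constraint illustrates both the saving and the problem. It assumes a band of fixed width around the main diagonal (cells with |n - m| ≤ width), which is one common simplification of the Sakoe-Chiba band; the function name and the example sequences are hypothetical.

```python
import math


def dtw_band(C, width):
    """DTW cost restricted to cells with |n - m| <= width (0-based indices)."""
    N, M = len(C), len(C[0])
    D = [[math.inf] * M for _ in range(N)]
    for n in range(N):
        # Only cells inside the band are evaluated.
        for m in range(max(0, n - width), min(M, n + width + 1)):
            if n == 0 and m == 0:
                D[0][0] = C[0][0]
                continue
            prev = min(
                D[n - 1][m - 1] if n > 0 and m > 0 else math.inf,
                D[n - 1][m] if n > 0 else math.inf,
                D[n][m - 1] if m > 0 else math.inf,
            )
            D[n][m] = C[n][m] + prev
    return D[N - 1][M - 1]


# The unconstrained optimal path of this example runs far from the diagonal.
X = [0, 0, 0, 0, 1]
Y = [0, 1, 1, 1, 1]
C = [[abs(x - y) for y in Y] for x in X]
unconstrained = dtw_band(C, 4)  # band covers the whole matrix
constrained = dtw_band(C, 1)    # narrow band misses the optimal path
print(unconstrained, constrained)
```

The narrow band evaluates fewer cells but returns a higher cost here, because the true optimal warping path leaves the constraint region.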

slide-97
SLIDE 97

Dynamic Time Warping

Multiscale approach

Compute optimal warping path on coarse level

slide-98
SLIDE 98

Dynamic Time Warping

Multiscale approach

Project on fine level

slide-99
SLIDE 99

Dynamic Time Warping

Multiscale approach

Specify constraint region

slide-100
SLIDE 100

Dynamic Time Warping

Multiscale approach

Compute constrained optimal warping path

slide-101
SLIDE 101

Dynamic Time Warping

Multiscale approach

Good trade-off between efficiency and robustness?

  • Suitable features?
  • Suitable resolution levels?
  • Size of constraint regions?

Suitable parameters depend very much on application!