Wayne Snyder Computer Science Department Boston University Today: - - PowerPoint PPT Presentation




slide-1
SLIDE 1

Computer Science

CS 591 S1 – Computational Audio

Today: Analyzing Rhythm
§ Analyzing rhythm: basic notions and motivations
§ Onset detection, beat tracking
§ Rhythm analysis
§ Tempo estimation
§ Time warping to account for variations in tempo

Wayne Snyder Computer Science Department Boston University

slide-2
SLIDE 2

What is Rhythm?

slide-3
SLIDE 3

Rhythm Analysis: Breakdown of Phases

  • 1. Onset Detection

§ Where precisely do notes start?

  • 2. Beat tracking

§ Given an audio recording of a piece of music, determine the periodic sequence of beat positions

  • 3. Tempo and Meter Estimation

§ Interpreting the periodicity in musical terms (BPM, meter)

  • 4. Analyzing Style and Musical Effects

§ Variations in tempo (intentional and unintentional)
§ Musical effects: anticipations, rubato, fermata, playing “behind the beat” (or ahead), swing.

Note: There is something of a “chicken or egg” problem with 2 and 3. We’ll fold these together for simplicity.

slide-4
SLIDE 4

Time (seconds)

Example: Queen – Another One Bites The Dust

Rhythm Analysis: Introduction

slide-5
SLIDE 5

Example: Queen – Another One Bites The Dust

Time (seconds)

Rhythm Analysis: Introduction

slide-6
SLIDE 6

Further Examples:
§ If I Had You (Benny Goodman)
§ Shakuhachi Flute
§ Liszt: Sonetto No. 104 Del Petrarca

Where is the beat? Can you tap your foot to it? What is the meter? How do we find the underlying regular beat that the composer and/or performer varies for expressive effect?

Rhythm Analysis: Introduction

slide-7
SLIDE 7

Rhythm Analysis: Introduction

Even when rhythm is regular, there is a complicated semantic problem: rhythm is hierarchical, consisting of many interrelated groupings: Pulse level: Measure

slide-8
SLIDE 8

Rhythm Analysis: Introduction

Pulse level: Tactus (beat)

slide-9
SLIDE 9

Rhythm Analysis: Introduction

Example: Happy Birthday to you Pulse level: Tatum (fastest unit of division)

Note: “Tatum” was named after Art Tatum, one of the greatest of all jazz pianists, who played a lot of fast notes!
slide-10
SLIDE 10

In a sophisticated piece of music, these various levels are exploited by the composer in complicated ways. How should it be notated and described precisely? What is the time signature? Example: Bach, WTC, Fugue #1 in C Major

Rhythm Analysis: Introduction

slide-11
SLIDE 11

Rhythm Analysis: Introduction

Challenges in beat tracking

§ Hierarchical levels often unclear
§ Global/slow tempo changes (all musicians do this!)
§ Local/sudden tempo changes (e.g., rubato)
§ Vague information (e.g., soft onsets, false positives)
§ Sparse information: not all beats occur! (often only note onsets are used)

slide-12
SLIDE 12

§ Onset detection § Beat tracking § Tempo estimation Tasks

Introduction

slide-13
SLIDE 13

§ Onset detection § Beat tracking § Tempo estimation Tasks

Tasks in Rhythm Analysis

slide-14
SLIDE 14

period phase

§ Onset detection § Beat tracking § Tempo estimation Tasks

Tasks in Rhythm Analysis

slide-15
SLIDE 15

Tasks
§ Onset detection
§ Beat tracking
§ Tempo estimation

Tempo := 60 / period, in beats per minute (BPM), where the period is the time between successive beats in seconds

period

Tasks in Rhythm Analysis
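A minimal sketch of the period/phase/tempo relationship from the last few slides. The function names `tempo_bpm` and `beat_times` are illustrative and not part of the lecture; the period and phase are assumed to be given in seconds.

```python
def tempo_bpm(period_s):
    """Tempo in beats per minute for a beat period given in seconds."""
    if period_s <= 0:
        raise ValueError("period must be positive")
    return 60.0 / period_s


def beat_times(period_s, phase_s, duration_s):
    """Beat positions phase, phase + period, ... that fall before duration_s."""
    times = []
    k = 0
    while phase_s + k * period_s < duration_s:
        times.append(phase_s + k * period_s)
        k += 1
    return times


print(tempo_bpm(0.5))              # 120.0
print(beat_times(0.5, 0.25, 2.0))  # [0.25, 0.75, 1.25, 1.75]
```

The period fixes how fast the beats come (the tempo), while the phase fixes where the first beat falls; beat tracking has to estimate both.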

slide-16
SLIDE 16

Onset Detection

§ Finding start times of perceptually relevant acoustic events in music signal
§ Onset is the time position where a note is played
§ Onset typically goes along with a change of the signal’s properties:
  – energy or loudness
  – pitch or harmony
  – timbre

slide-17
SLIDE 17

Onset Detection

[Bello et al., IEEE-TASLP 2005]


slide-18
SLIDE 18

Steps

Time (seconds)

Waveform

Onset Detection (Amplitude or Energy-Based)

slide-19
SLIDE 19

Time (seconds)

Squared waveform Steps 1. Amplitude squaring (full-wave rectification of power signal)

Onset Detection (Amplitude or Energy-Based)

slide-20
SLIDE 20

Onset Detection (Amplitude or Energy-Based)

Time (seconds)

Steps 1. Amplitude squaring (full-wave rectification of power signal) 2. Windowing (taking mean or max in each window): “energy envelope”

slide-21
SLIDE 21

Onset Detection (Energy-Based)

Time (seconds)

Steps 1. Amplitude squaring (full-wave rectification of power signal) 2. Windowing (taking mean or max in each window) : “energy envelope” 3. Difference Function (using appropriate Distance Function): captures changes in signal energy: “novelty curve.”

slide-22
SLIDE 22

Onset Detection (Energy-Based)

Time (seconds)

Steps
  • 1. Amplitude squaring (like full-wave rectification, this makes every sample non-negative; the result is the signal’s instantaneous power)
  • 2. Windowing (taking mean or max in each window): “energy envelope”
  • 3. Difference function (using an appropriate distance function): captures changes in signal energy, giving the “novelty curve”
  • 4. Half-wave rectification (negative samples => 0.0): note onsets are indicated by increases in energy only

slide-23
SLIDE 23

Onset Detection (Energy-Based)

Time (seconds)

Steps
  • 1. Amplitude squaring
  • 2. Windowing
  • 3. Differentiation
  • 4. Half-wave rectification
  • 5. Peak picking

Peak positions indicate note onset candidates
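The five steps can be sketched in a few lines of NumPy. This is an illustration under stated assumptions, not the lecture's own code: the window and hop sizes, the use of a plain first difference as the difference function, the fixed peak threshold, and the synthetic test signal (two decaying 440 Hz bursts) are all choices made for the example.

```python
import numpy as np


def energy_novelty(x, win=512, hop=256):
    """Steps 1-4: squaring, windowed mean, first difference, half-wave rectification."""
    power = np.asarray(x, dtype=float) ** 2                    # 1. amplitude squaring
    n_frames = 1 + (len(power) - win) // hop
    envelope = np.array([power[i * hop:i * hop + win].mean()   # 2. windowing (mean)
                         for i in range(n_frames)])
    diff = np.diff(envelope, prepend=envelope[0])              # 3. difference function
    return np.maximum(diff, 0.0)                               # 4. negative values -> 0.0


def pick_peaks(novelty, threshold):
    """Step 5: indices of local maxima that exceed a fixed threshold (assumed rule)."""
    n = np.asarray(novelty)
    is_peak = (n[1:-1] > n[:-2]) & (n[1:-1] >= n[2:]) & (n[1:-1] > threshold)
    return np.flatnonzero(is_peak) + 1


# Synthetic signal: silence with two decaying 440 Hz bursts starting at 0.5 s and 1.5 s.
sr = 8000
n = np.arange(2000)
burst = np.sin(2 * np.pi * 440 * n / sr) * np.exp(-n / 400.0)
x = np.zeros(2 * sr)
for onset_s in (0.5, 1.5):
    start = int(onset_s * sr)
    x[start:start + len(burst)] += burst

hop = 256
novelty = energy_novelty(x, win=512, hop=hop)
peaks = pick_peaks(novelty, threshold=0.3 * novelty.max())
onset_times = peaks * hop / sr   # frame start times, so only accurate to about one window
print(onset_times)
```

The reported times are frame positions, so they land within a window length of the true onsets rather than exactly on them.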

slide-24
SLIDE 24

Energy-based methods work well for percussive instruments, including piano.

Example: Bach, Well-Tempered Clavier, Book 1, Fugue #1 in C major (Glenn Gould)

slide-25
SLIDE 25

Onset Detection

§ Energy curves often only work for percussive music
§ Many instruments have weak note onsets: wind, strings, voice.
  – Example: Shakuhachi Flute
§ Biggest problem: pitch or timbre changes may not correlate with energy changes (e.g., a singer may change the pitch without changing loudness).
§ More refined methods needed that capture changes in spectrum

[Bello et al., IEEE-TASLP 2005]

slide-26
SLIDE 26
Onset Detection (Spectral-Based)

Steps:
  • 1. Spectrogram

Magnitude spectrogram $|X|$

[Figure: magnitude spectrogram; axes: Frequency (Hz) vs. Time (seconds)]

§ Aspects concerning pitch, harmony, or timbre are captured by spectrogram
§ Allows for detecting local energy changes in certain frequency ranges

slide-27
SLIDE 27

Onset Detection (Spectral-Based)

Steps:
  • 1. Spectrogram
  • 2. Logarithmic compression

Compressed spectrogram: $Y = \log(1 + C \cdot |X|)$

§ Accounts for the human logarithmic sensation of sound intensity
§ Dynamic range compression
§ Enhancement of low-intensity values
§ Often leading to enhancement of high-frequency spectrum

[Figure: compressed spectrogram Y; axes: Frequency (Hz) vs. Time (seconds)]

slide-28
SLIDE 28

Spectral difference

Onset Detection (Spectral-Based)

  • 1. Spectrogram
  • 2. Logarithmic compression
  • 3. Differentiation

Steps:

§ First-order temporal difference
§ Captures changes of the spectral content
§ Only positive intensity changes considered

Time (seconds) Frequency (Hz)

slide-29
SLIDE 29

Spectral difference

t

Novelty curve

Onset Detection (Spectral-Based)

  • 1. Spectrogram
  • 2. Logarithmic compression
  • 3. Differentiation
  • 4. Accumulation: spectral differences summarized by a number.

Steps:

§ Frame-wise accumulation of all positive intensity changes
§ Encodes changes of the spectral content

Frequency (Hz)

slide-30
SLIDE 30

Digression: Difference/Distance Metrics

One of the most important issues in analyzing data, especially multi-dimensional and/or time-series data, is understanding how similar two pieces of data are (each typically represented by a vector or multi-dimensional array). There are two principal methods for such comparisons:

Distance Metrics: Similar data vectors are regarded as closer in a geometrical sense; the range is [0 .. ∞), where distance = 0 means the vectors are identical.

D( a, b ) = “distance” between a and b

Dependence Metrics: Similar data vectors exhibit dependence: they “move together” in similar ways; the range of the coefficients is [-1 .. 1]:

-1 = inverse dependence, 0 = no dependence, 1 = strong dependence

slide-31
SLIDE 31


Distance Metrics

A Distance Metric obeys typical geometric laws: A set with an associated Distance Metric is called a Metric Space.
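For reference, the standard textbook axioms for a distance metric D on points a, b, c are given below. That these are the laws meant here is an assumption, although the later remarks on identity and the triangle inequality suggest it.

```latex
\begin{align*}
D(a, b) &\ge 0 && \text{(non-negativity)} \\
D(a, b) &= 0 \iff a = b && \text{(identity)} \\
D(a, b) &= D(b, a) && \text{(symmetry)} \\
D(a, c) &\le D(a, b) + D(b, c) && \text{(triangle inequality)}
\end{align*}
```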

slide-32
SLIDE 32

Distance Metrics

A variety of metrics have been developed, in fields as diverse as game playing and pattern recognition; the most important of these are as follows:

§ Sum of Absolute Difference (Manhattan Distance)
§ Sum of Squared Difference
§ Mean Absolute Error
§ Mean Squared Error
§ Euclidean Distance
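The five measures named above can be written compactly with NumPy. The formulas follow the usual textbook definitions, which is an assumption about what the slide showed; the function names and the two sample vectors are illustrative.

```python
import numpy as np


def sad(a, b):
    """Sum of absolute differences (Manhattan distance)."""
    return np.sum(np.abs(a - b))


def ssd(a, b):
    """Sum of squared differences."""
    return np.sum((a - b) ** 2)


def mae(a, b):
    """Mean absolute error: SAD divided by the number of elements."""
    return np.mean(np.abs(a - b))


def mse(a, b):
    """Mean squared error: SSD divided by the number of elements."""
    return np.mean((a - b) ** 2)


def euclidean(a, b):
    """Euclidean distance: square root of SSD."""
    return np.sqrt(ssd(a, b))


a = np.array([1.0, 2.0, 3.0, 4.0])
b = np.array([2.0, 2.0, 1.0, 4.0])   # differs from a by [-1, 0, 2, 0]

print(sad(a, b))        # 3.0
print(ssd(a, b))        # 5.0
print(mae(a, b))        # 0.75
print(mse(a, b))        # 1.25
print(euclidean(a, b))  # sqrt(5), about 2.236
```

The mean-based variants divide by the vector length, which makes values comparable across windows of different sizes.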

slide-33
SLIDE 33

Distance Metrics

These measures extend our common understanding of the notion of distance to complex mathematical domains (such as vector spaces) and give us tools to understand how similar or dissimilar two objects are.
slide-34
SLIDE 34

Dependence Metrics

Two common dependence metrics are as follows:

Correlation (Pearson’s Product-Moment Correlation Coefficient): Correlation measures the linear dependence of two vectors or random variables X and Y.

Cosine Similarity: Cosine similarity measures the cosine of the angle between two vectors of length N in N-dimensional space.

NOTE that these are similar calculations, except that correlation subtracts each vector’s mean from its points before comparing. For musical signals of any length, the mean will be very close to 0, and so these are effectively the same.
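A small sketch of the two dependence measures and of the NOTE about the mean, using the standard definitions. The test signals (two 440 Hz sinusoids with a phase offset, sampled at an assumed 8000 Hz) are made up for illustration.

```python
import numpy as np


def cosine_similarity(x, y):
    """Cosine of the angle between x and y."""
    return np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))


def pearson_correlation(x, y):
    """Pearson correlation: cosine similarity after removing each vector's mean."""
    return cosine_similarity(x - np.mean(x), y - np.mean(y))


# Two zero-mean "audio-like" signals: 0.1 s of a 440 Hz tone and a phase-shifted copy.
t = np.arange(800) / 8000.0
x = np.sin(2 * np.pi * 440 * t)
y = np.sin(2 * np.pi * 440 * t + 0.5)

cos_xy = cosine_similarity(x, y)
corr_xy = pearson_correlation(x, y)
print(cos_xy, corr_xy)   # nearly equal, because both signals have mean close to 0

# Adding a constant offset changes the cosine similarity but not the correlation.
cos_shifted = cosine_similarity(x + 1.0, y + 1.0)
corr_shifted = pearson_correlation(x + 1.0, y + 1.0)
print(cos_shifted, corr_shifted)
```

The second pair of values shows why the equivalence depends on the zero-mean property: once the signals carry an offset, only the correlation is unaffected.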

slide-35
SLIDE 35

Distance Metrics

Dependence metrics can be converted (almost) into distance metrics by the simple expedient of subtracting them from 1.0:

Cosine Distance = 1.0 - Cosine Similarity
Pearson’s Distance = 1.0 - Correlation Coefficient

Now these are in the range [0 .. 2], with 0 indicating the strongest possible dependence. These are not actually distance metrics: a value of 0 does not indicate identity, just the strongest possible linear dependence, and the cosine distance does not satisfy the triangle inequality. However, this does not prevent them from being extremely useful!
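Both caveats can be checked numerically with small hand-picked vectors. The vectors below are illustrative choices, not taken from the lecture.

```python
import numpy as np


def cosine_distance(x, y):
    """1.0 minus the cosine similarity of x and y; ranges over [0, 2]."""
    return 1.0 - np.dot(x, y) / (np.linalg.norm(x) * np.linalg.norm(y))


# Caveat 1: distance 0 does not imply identity (b is a scaled copy of a).
a = np.array([1.0, 2.0])
b = np.array([2.0, 4.0])
d_ab = cosine_distance(a, b)
print(d_ab)                      # 0 up to rounding, although a != b

# Caveat 2: the triangle inequality can fail.
u = np.array([1.0, 0.0])
v = np.array([1.0, 1.0])         # 45 degrees from both u and w
w = np.array([0.0, 1.0])
direct = cosine_distance(u, w)                          # 1.0 (orthogonal vectors)
detour = cosine_distance(u, v) + cosine_distance(v, w)  # about 0.586
print(direct, detour)            # direct > detour violates D(u,w) <= D(u,v) + D(v,w)

# Upper end of the range: opposite vectors give distance 2.
print(cosine_distance(u, -u))    # 2.0
```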

slide-36
SLIDE 36

Onset Detection (Spectral-Based)

  • 1. Spectrogram
  • 2. Logarithmic compression
  • 3. Differentiation
  • 4. Accumulation

Steps: Novelty curve

slide-37
SLIDE 37

Subtraction of local average

Onset Detection (Spectral-Based)

  • 1. Spectrogram
  • 2. Logarithmic compression
  • 3. Differentiation
  • 4. Accumulation
  • 5. Normalization

Steps: Novelty curve

slide-38
SLIDE 38

Onset Detection (Spectral-Based)

  • 1. Spectrogram
  • 2. Logarithmic compression
  • 3. Differentiation
  • 4. Accumulation
  • 5. Normalization

Steps: Normalized novelty curve

slide-39
SLIDE 39

Onset Detection (Spectral-Based)

  • 1. Spectrogram
  • 2. Logarithmic compression
  • 3. Differentiation
  • 4. Accumulation
  • 5. Normalization
  • 6. Peak picking

Steps: Normalized novelty curve
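Steps 1 to 5 of the spectral-based method can be sketched as follows. The parameter values (FFT size, hop, compression constant C, length of the local average) are assumptions chosen for the example, and step 5 is read here as "subtract a local moving average, then keep only positive values". The test signal is a constant-amplitude tone whose pitch jumps at 1.0 s, the kind of onset that an energy curve tends to miss.

```python
import numpy as np


def spectral_novelty(x, n_fft=1024, hop=256, C=1000.0, avg_frames=11):
    """Spectral-difference novelty curve, one value per STFT frame."""
    x = np.asarray(x, dtype=float)
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[i * hop:i * hop + n_fft] * window for i in range(n_frames)])
    X = np.abs(np.fft.rfft(frames, axis=1))        # 1. magnitude spectrogram |X|
    Y = np.log(1.0 + C * X)                        # 2. Y = log(1 + C * |X|)
    diff = np.maximum(np.diff(Y, axis=0), 0.0)     # 3. temporal difference, positive part
    novelty = np.concatenate(([0.0], diff.sum(axis=1)))  # 4. accumulate over frequency
    kernel = np.ones(avg_frames) / avg_frames
    local_avg = np.convolve(novelty, kernel, mode="same")
    return np.maximum(novelty - local_avg, 0.0)    # 5. subtract local average (assumed form)


# Constant-amplitude tone: 440 Hz for the first second, 660 Hz for the second.
sr = 8000
t = np.arange(2 * sr) / sr
x = np.where(t < 1.0, np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 660 * t))

n_fft, hop = 1024, 256
novelty = spectral_novelty(x, n_fft=n_fft, hop=hop)
frame_times = (np.arange(len(novelty)) * hop + n_fft / 2) / sr   # frame centers in seconds
t_peak = frame_times[np.argmax(novelty)]
print(t_peak)   # expected to lie close to the pitch change at 1.0 s
```

Peak picking (step 6) would then be applied to this normalized curve, for example with a local-maximum rule and a threshold.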

slide-40
SLIDE 40

Examples of Onset Detection:

§ WTC Fugue #1 (Bach)
§ A Smooth One (Benny Goodman)
§ Doc’s Guitar (Doc Watson)
§ WTC Prelude #5 (Bach)
§ Poulenc, Valse No.114
§ Faure, Op.15, No.1

slide-41
SLIDE 41

Beat and Tempo

What is a beat?

§ Steady pulse that drives music forward and provides the temporal framework of a piece of music
§ Sequence of perceived pulses that are equally spaced in time
§ The pulse a human taps along when listening to the music

The term tempo then refers to the speed of the pulse.

[Parncutt 1994] [Sethares 2007] [Large/Palmer 2002] [Lerdahl/Jackendoff 1983] [Fitch/Rosenfeld 2007]

slide-42
SLIDE 42

Beat and Tempo

Strategy

§ Analyze the novelty curve with respect to reoccurring or quasi-periodic patterns
§ Avoid the explicit determination of note onsets (no peak picking)

slide-43
SLIDE 43

Beat and Tempo

Strategy § Autocorrelation § Fourier transfrom Methods § Analyze the novelty curve with respect to reoccurring or quasi- periodic patterns—as if it were a musical signal and you are trying to find the component pitches (= periodic patterns of the novelty curve) § Avoid the explicit determination

  • f note onsets (no peak

picking)

slide-44
SLIDE 44

Tempogram

Definition: A tempogram is a time-tempo representation that encodes the local tempo of a music signal over time (= spectrogram of novelty curve!).

[Figure: tempogram; axes: Tempo (BPM) vs. Time (seconds); color: Intensity]

slide-45
SLIDE 45

Definition: A tempogram is a time-tempo representation that encodes the local tempo of a music signal over time (= spectrogram of novelty curve!).

Fourier-based method

§ Compute a spectrogram (STFT) of the novelty curve
§ Convert frequency axis (given in Hertz) into tempo axis (given in BPM)
§ Magnitude spectrogram indicates local tempo

Tempogram (Fourier)
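A minimal sketch of a Fourier tempogram: each tempo gets one Hann-windowed complex sinusoid, and the magnitude of its inner product with a local section of the novelty curve becomes the tempogram entry. The function name, the window and hop lengths, the tempo grid, and the synthetic novelty curve (a half-wave rectified 2 Hz cosine, i.e. one pulse every 0.5 s) are assumptions made for illustration.

```python
import numpy as np


def fourier_tempogram(novelty, feature_rate, tempi_bpm, win_len, hop):
    """Rows: tempi (BPM); columns: window positions; entries: |Fourier coefficient|."""
    novelty = np.asarray(novelty, dtype=float)
    window = np.hanning(win_len)
    t = np.arange(win_len) / feature_rate                    # seconds within one window
    freqs_hz = np.asarray(tempi_bpm, dtype=float) / 60.0     # BPM -> Hz
    # One Hann-windowed complex sinusoid ("kernel") per tempo.
    kernels = window * np.exp(-2j * np.pi * freqs_hz[:, None] * t[None, :])
    starts = np.arange(0, len(novelty) - win_len + 1, hop)
    segments = np.stack([novelty[s:s + win_len] for s in starts])
    return np.abs(kernels @ segments.T)


# Synthetic novelty curve sampled at 100 Hz: one smooth pulse every 0.5 s (120 BPM).
feature_rate = 100
t = np.arange(30 * feature_rate) / feature_rate
novelty = np.maximum(np.cos(2 * np.pi * 2.0 * t), 0.0)

tempi = np.arange(30, 301)                                   # 30..300 BPM in 1 BPM steps
T = fourier_tempogram(novelty, feature_rate, tempi, win_len=800, hop=100)
profile = T.mean(axis=1)                                     # average over time
best_tempo = tempi[np.argmax(profile)]
print(best_tempo)   # 120
```

In this example the profile also shows a smaller peak at 240 BPM, a tempo harmonic of the 120 BPM pulse.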

slide-46
SLIDE 46

Tempo (BPM) Time (seconds)

Tempogram (Fourier)

Novelty curve

slide-47
SLIDE 47

Tempo (BPM)

Tempogram (Fourier)

Novelty curve (local window)

Time (seconds)

slide-48
SLIDE 48

Tempo (BPM)

Hann-windowed sinusoidal

Tempogram (Fourier)

Time (seconds)

slide-49
SLIDE 49

Tempo (BPM)

Hann-windowed sinusoidal

Tempogram (Fourier)

Time (seconds)

slide-50
SLIDE 50

Tempo (BPM)

Tempogram (Fourier)

Hann-windowed sinusoidal

Time (seconds)

slide-51
SLIDE 51

Definition: A tempogram is a time-tempo representation that encodes the local tempo of a music signal over time (= spectrogram of novelty curve!).

Autocorrelation-based method (cf. pitch determination algorithm)

§ Compare novelty curve with time-lagged local sections of itself
§ Convert lag axis (given in seconds) into tempo axis (given in BPM)
§ Autocorrelogram indicates local tempo

Tempogram (Autocorrelation)
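A minimal sketch of the autocorrelation variant: each Hann-windowed section of the novelty curve is correlated with lagged copies of itself, and each lag is converted to a tempo via tempo = 60 / lag. The function name, window and hop lengths, lag range (0.2 s to 2 s, i.e. 300 down to 30 BPM), the normalization by the lag-0 value, and the synthetic novelty curve are assumptions made for illustration.

```python
import numpy as np


def autocorrelation_tempogram(novelty, feature_rate, win_len, hop, min_lag, max_lag):
    """Rows: lags (in frames); columns: window positions; entries: normalized autocorrelation."""
    novelty = np.asarray(novelty, dtype=float)
    window = np.hanning(win_len)
    lags = np.arange(min_lag, max_lag + 1)
    starts = np.arange(0, len(novelty) - win_len + 1, hop)
    A = np.zeros((len(lags), len(starts)))
    for j, s in enumerate(starts):
        seg = novelty[s:s + win_len] * window
        full = np.correlate(seg, seg, mode="full")   # lags -(win_len-1) .. (win_len-1)
        acf = full[win_len - 1:]                     # keep lags 0 .. win_len-1
        A[:, j] = acf[lags] / acf[0]                 # normalize by the lag-0 value
    lag_seconds = lags / feature_rate
    tempi_bpm = 60.0 / lag_seconds                   # lag axis (s) -> tempo axis (BPM)
    return A, lag_seconds, tempi_bpm


# Synthetic novelty curve sampled at 100 Hz: one smooth pulse every 0.5 s (120 BPM).
feature_rate = 100
t = np.arange(30 * feature_rate) / feature_rate
novelty = np.maximum(np.cos(2 * np.pi * 2.0 * t), 0.0)

A, lag_seconds, tempi_bpm = autocorrelation_tempogram(
    novelty, feature_rate, win_len=800, hop=100, min_lag=20, max_lag=200)
profile = A.mean(axis=1)                             # average over time
best = np.argmax(profile)
print(lag_seconds[best], tempi_bpm[best])            # 0.5 120.0
```

In this example the profile also has peaks at lags of 1.0 s and 1.5 s (60 and 40 BPM), which are tempo subharmonics of the 120 BPM pulse.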

slide-52
SLIDE 52

Tempogram (Autocorrelation)

Novelty curve (local window)

Lag (seconds) Time (seconds)

slide-53
SLIDE 53

Tempogram (Autocorrelation)

Windowed autocorrelation

Lag (seconds)

slide-54
SLIDE 54

Tempogram (Autocorrelation)

Lag = 0 (seconds)

Lag (seconds)

slide-55
SLIDE 55

Tempogram (Autocorrelation)

Lag = 0.26 (seconds)

Lag (seconds)

slide-56
SLIDE 56

Tempogram (Autocorrelation)

Lag = 0.52 (seconds)

Lag (seconds)

slide-57
SLIDE 57

Tempogram (Autocorrelation)

Lag = 0.78 (seconds)

Lag (seconds)

slide-58
SLIDE 58

Tempogram (Autocorrelation)

Lag = 1.56 (seconds)

Lag (seconds)

slide-59
SLIDE 59

Tempogram (Autocorrelation)

Time (seconds) Time (seconds) Lag (seconds)

slide-60
SLIDE 60

Tempogram (Autocorrelation)

[Figure: autocorrelation tempogram; axes: Tempo (BPM) vs. Time (seconds); tempo ticks: 30, 40, 60, 80, 120, 300]

slide-61
SLIDE 61

Tempogram (Autocorrelation)

[Figure: autocorrelation tempogram; axes: Tempo (BPM) vs. Time (seconds); tempo ticks: 100, 200, 300, 400, 500, 600]

slide-62
SLIDE 62

Time (seconds)

Tempogram

Fourier Autocorrelation

Time (seconds) Tempo (BPM)

slide-63
SLIDE 63

Tempogram

[Figure: Fourier and autocorrelation tempograms; axes: Tempo (BPM) vs. Time (seconds); marked tempi: 210 and 70 BPM]

Tempo@Tatum = 210 BPM
Tempo@Measure = 70 BPM

slide-64
SLIDE 64

Tempogram

[Figure: Fourier and autocorrelation tempograms; axes: Tempo (BPM) vs. Time (seconds)]

§ Fourier: emphasis of tempo harmonics (integer multiples)
§ Autocorrelation: emphasis of tempo subharmonics (integer fractions)

[Grosche et al., ICASSP 2010] [Peeters, JASP 2007]

slide-65
SLIDE 65

Tempogram (Summary)

Fourier
§ Novelty curve is compared with sinusoidal kernels each representing a specific tempo
§ Convert frequency (Hertz) into tempo (BPM)
§ Reveals novelty periodicities
§ Emphasizes harmonics
§ Granularity increases as tempo increases
§ Suitable to analyze tempo on tatum and tactus level

Autocorrelation
§ Novelty curve is compared with time-lagged local (windowed) sections of itself
§ Convert time-lag (seconds) into tempo (BPM)
§ Reveals novelty self-similarities
§ Emphasizes subharmonics
§ Granularity increases as tempo decreases
§ Suitable to analyze tempo on tatum and measure level