SLIDE 1 Computer Science
CS 591 S1 – Computational Audio
Today: Analyzing Rhythm
§ Analyzing rhythm: basic notions and motivations
§ Onset detection, beat tracking
§ Rhythm analysis
§ Tempo estimation
§ Time warping to account for variations in tempo
Wayne Snyder Computer Science Department Boston University
SLIDE 2
What is Rhythm?
SLIDE 3 Rhythm Analysis: Breakdown of Phases
- 1. Onset Detection
§ Where precisely do notes start?
- 2. Beat Tracking
§ Given an audio recording of a piece of music, determine the periodic sequence of beat positions
- 3. Tempo and Meter Estimation
§ Interpreting the periodicity in musical terms (BPM, meter)
- 4. Analyzing Style and Musical Effects
§ Variations in tempo (intentional and unintentional)
§ Musical effects: anticipations, rubato, fermata, playing “behind the beat” (or ahead), swing
Note: There is something of a “chicken or egg” problem with 2 and 3. We’ll fold these together for simplicity.
SLIDE 4
Rhythm Analysis: Introduction
Example: Queen – Another One Bites The Dust
[Figure; x-axis: Time (seconds)]
SLIDE 5
Rhythm Analysis: Introduction
Example: Queen – Another One Bites The Dust
[Figure; x-axis: Time (seconds)]
SLIDE 6
Rhythm Analysis: Introduction
Further Examples:
§ If I Had You (Benny Goodman)
§ Shakuhachi Flute
§ Liszt: Sonetto No. 104 Del Petrarca
Where is the beat? Can you tap your foot to it? What is the meter?
How can we find the underlying regular beat that is being varied by the composer and/or performer for expressive effect?
SLIDE 7
Rhythm Analysis: Introduction
Even when rhythm is regular, there is a complicated semantic problem: rhythm is hierarchical, consisting of many interrelated groupings.
Pulse level: Measure
SLIDE 8
Rhythm Analysis: Introduction
Pulse level: Tactus (beat)
SLIDE 9 Rhythm Analysis: Introduction
Example: Happy Birthday to You
Pulse level: Tatum (fastest unit of division)
Note: “Tatum” was named after Art Tatum, one of the greatest of all jazz pianists, who played a lot
SLIDE 10
Rhythm Analysis: Introduction
In a sophisticated piece of music, these various levels are exploited by the composer in complicated ways. How should it be notated and described precisely? What is the time signature?
Example: Bach, WTC, Fugue #1 in C Major
SLIDE 11
Rhythm Analysis: Introduction
Challenges in beat tracking
§ Hierarchical levels often unclear
§ Global/slow tempo changes (all musicians do this!)
§ Local/sudden tempo changes (e.g., rubato)
§ Vague information (e.g., soft onsets, false positives)
§ Sparse information: not all beats occur! (often only note onsets are used)
SLIDE 12
Introduction
Tasks
§ Onset detection
§ Beat tracking
§ Tempo estimation
SLIDE 13
Tasks in Rhythm Analysis
Tasks
§ Onset detection
§ Beat tracking
§ Tempo estimation
SLIDE 14
Tasks in Rhythm Analysis
Tasks
§ Onset detection
§ Beat tracking
§ Tempo estimation
[Figure labels: period, phase]
SLIDE 15
Tasks in Rhythm Analysis
Tasks
§ Onset detection
§ Beat tracking
§ Tempo estimation
Tempo := 60 / period (period in seconds; tempo in beats per minute, BPM)
[Figure label: period]
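A minimal sketch of how period, phase, and tempo relate, assuming the beat positions are already known. The beat times and the helper names are hypothetical, and taking the median inter-beat interval as the period is just one reasonable choice.

```python
from statistics import median

def estimate_period_and_phase(beat_times):
    """Return (period, phase) in seconds for a roughly regular beat sequence."""
    intervals = [b - a for a, b in zip(beat_times, beat_times[1:])]
    period = median(intervals)        # median is robust to a few irregular beats
    phase = beat_times[0] % period    # offset of the beat grid from time 0
    return period, phase

def tempo_bpm(period):
    """Tempo := 60 / period, with the period in seconds."""
    return 60.0 / period

beats = [0.25, 0.75, 1.25, 1.75, 2.25]   # hypothetical beat positions (seconds)
period, phase = estimate_period_and_phase(beats)
print(period, phase, tempo_bpm(period))  # → 0.5 0.25 120.0
```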
SLIDE 16
Onset Detection
§ Finding start times of perceptually relevant acoustic events in music signal
§ Onset is the time position where a note is played
§ Onset typically goes along with a change of the signal’s properties:
– energy or loudness
– pitch or harmony
– timbre
SLIDE 17
Onset Detection
§ Finding start times of perceptually relevant acoustic events in music signal
§ Onset is the time position where a note is played
§ Onset typically goes along with a change of the signal’s properties:
– energy or loudness
– pitch or harmony
– timbre
[Bello et al., IEEE-TASLP 2005]
SLIDE 18
Onset Detection (Amplitude or Energy-Based)
Steps
[Figure: waveform; x-axis: Time (seconds)]
SLIDE 19
Onset Detection (Amplitude or Energy-Based)
Steps
- 1. Amplitude squaring (full-wave rectification, yielding the power signal)
[Figure: squared waveform; x-axis: Time (seconds)]
SLIDE 20
Onset Detection (Amplitude or Energy-Based)
Steps
- 1. Amplitude squaring (full-wave rectification, yielding the power signal)
- 2. Windowing (taking mean or max in each window): “energy envelope”
[Figure; x-axis: Time (seconds)]
SLIDE 21
Onset Detection (Energy-Based)
Steps
- 1. Amplitude squaring (full-wave rectification, yielding the power signal)
- 2. Windowing (taking mean or max in each window): “energy envelope”
- 3. Difference Function (using appropriate Distance Function): captures changes in signal energy: “novelty curve”
[Figure; x-axis: Time (seconds)]
SLIDE 22
Onset Detection (Energy-Based)
Steps
- 1. Amplitude squaring (full-wave rectification, yielding the power signal)
- 2. Windowing (taking mean or max in each window): “energy envelope”
- 3. Difference Function (using appropriate Distance Function): captures changes in signal energy: “novelty curve”
- 4. Half-wave Rectification (negative samples => 0.0): note onsets are indicated by increases in energy only
[Figure; x-axis: Time (seconds)]
SLIDE 23
Onset Detection (Energy-Based)
Steps
- 1. Amplitude squaring
- 2. Windowing
- 3. Differentiation
- 4. Half-wave rectification
- 5. Peak picking
Peak positions indicate note onset candidates
[Figure; x-axis: Time (seconds)]
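The five energy-based steps can be sketched in a few lines of NumPy. This is an illustration under assumptions, not the code used for the figures: the window size, the relative threshold, the helper names, and the synthetic test signal (two decaying tones) are all made up for the example, and the mean is used for windowing.

```python
import numpy as np

def energy_novelty(x, win):
    """Steps 1-4 for a mono signal x; returns one novelty value per window."""
    power = np.asarray(x, dtype=float) ** 2                  # 1. amplitude squaring
    n_frames = len(power) // win
    envelope = power[: n_frames * win].reshape(n_frames, win).mean(axis=1)  # 2. energy envelope
    diff = np.diff(envelope, prepend=envelope[0])            # 3. difference function
    return np.maximum(diff, 0.0)                             # 4. half-wave rectification

def pick_peaks(novelty, threshold):
    """Step 5: indices of local maxima above a threshold (onset candidates)."""
    return [i for i in range(1, len(novelty) - 1)
            if novelty[i] > threshold
            and novelty[i] > novelty[i - 1]
            and novelty[i] >= novelty[i + 1]]

# Hypothetical test signal: two decaying 440 Hz tones starting at 0.5 s and 1.25 s.
sr, win = 8000, 400                                          # 50 ms windows
t = np.arange(int(0.3 * sr)) / sr
tone = np.sin(2 * np.pi * 440 * t) * np.exp(-t / 0.05)
x = np.zeros(2 * sr)
for start in (0.5, 1.25):
    i = int(start * sr)
    x[i : i + len(tone)] = tone

novelty = energy_novelty(x, win)
peaks = pick_peaks(novelty, threshold=0.5 * novelty.max())
onset_times = [p * win / sr for p in peaks]
print(onset_times)                                           # → [0.5, 1.25]
```

The time resolution of the detected onsets is limited by the window length (50 ms here).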
SLIDE 24
Energy-based methods work well for percussive instruments, including piano.
Example: Bach, Well-Tempered Clavier, Book 1, Fugue #1 in C major (Glenn Gould)
SLIDE 25
Onset Detection
§ Energy curves often only work for percussive music § Many instruments have weak note onsets: wind, strings, voice.
– Example: Shakuhachi Flute
§ Biggest problem: pitch or timbre changes may not correlate with energy changes (e.g., a singer may change the pitch without changing loudness).
§ More refined methods needed that capture changes in spectrum
[Bello et al., IEEE-TASLP 2005]
SLIDE 26
Onset Detection (Spectral-Based)
Steps:
- 1. Spectrogram
§ Aspects concerning pitch, harmony, or timbre are captured by spectrogram
§ Allows for detecting local energy changes in certain frequency ranges
[Figure: magnitude spectrogram |X|; axes: Time (seconds), Frequency (Hz)]
SLIDE 27
Onset Detection (Spectral-Based)
Steps:
- 1. Spectrogram
- 2. Logarithmic compression
Compressed spectrogram: $Y = \log(1 + C \cdot |X|)$
§ Accounts for the human logarithmic sensation of sound intensity
§ Dynamic range compression
§ Enhancement of low-intensity values
§ Often leading to enhancement of high-frequency spectrum
[Figure: compressed spectrogram Y; axes: Time (seconds), Frequency (Hz)]
SLIDE 28
Onset Detection (Spectral-Based)
Steps:
- 1. Spectrogram
- 2. Logarithmic compression
- 3. Differentiation
§ First-order temporal difference
§ Captures changes of the spectral content
§ Only positive intensity changes considered
[Figure: spectral difference; axes: Time (seconds), Frequency (Hz)]
SLIDE 29
Onset Detection (Spectral-Based)
Steps:
- 1. Spectrogram
- 2. Logarithmic compression
- 3. Differentiation
- 4. Accumulation: spectral differences summarized by a number.
§ Frame-wise accumulation of all positive intensity changes
§ Encodes changes of the spectral content
[Figure: spectral difference (axes: t, Frequency (Hz)) and the resulting novelty curve]
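Steps 1 to 4 of the spectral-based method can be sketched as follows. The frame size, hop size, compression constant C, and function name are assumptions chosen for illustration. The test signal changes pitch at 1.0 s while its amplitude stays constant, the kind of onset an energy curve tends to miss.

```python
import numpy as np

def spectral_novelty(x, n_fft=1024, hop=512, C=1000.0):
    """Novelty curve from positive changes of the log-compressed spectrogram."""
    window = np.hanning(n_fft)
    n_frames = 1 + (len(x) - n_fft) // hop
    frames = np.stack([x[m * hop : m * hop + n_fft] * window
                       for m in range(n_frames)])
    X = np.abs(np.fft.rfft(frames, axis=1))       # 1. magnitude spectrogram |X|
    Y = np.log(1.0 + C * X)                       # 2. Y = log(1 + C * |X|)
    diff = np.maximum(np.diff(Y, axis=0), 0.0)    # 3. temporal difference, positive part only
    novelty = diff.sum(axis=1)                    # 4. accumulate over frequency
    return np.concatenate(([0.0], novelty))       # one value per frame

# Hypothetical test signal: 440 Hz for 1 s, then 660 Hz for 1 s, same amplitude.
sr = 8000
t = np.arange(sr) / sr
x = np.concatenate([np.sin(2 * np.pi * 440 * t), np.sin(2 * np.pi * 660 * t)])

novelty = spectral_novelty(x)
change_frame = int(novelty.argmax())
print(change_frame * 512 / sr)                    # start time of the peak frame, close to 1.0 s
```

The peak lands in one of the frames that straddle the pitch change, so the time estimate is only accurate to within a frame length.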
SLIDE 30
Digression: Difference/Distance Metrics
One of the most important issues in analyzing data, especially multi-dimensional and/or time-series data, is understanding how similar two pieces of data are (each typically represented by a vector or multi-dimensional array). There are two principal methods for such comparisons:
Distance Metrics: Similar data vectors are regarded as closer in a geometrical sense; the range is [0 .. ∞), where distance = 0 means the vectors are identical.
[Figure: two vectors a and b; D(a, b) = “distance” between a and b]
Dependence Metrics: Similar data vectors exhibit dependence: they “move together” in similar ways; the range of the coefficients is [-1 .. 1].
[Figure labels: Inverse Dependence, No Dependence, Strong Dependence]
SLIDE 31
Distance Metrics
A Distance Metric obeys the typical geometric laws of distance. A set with an associated Distance Metric is called a Metric Space.
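The formulas for these laws did not survive on this slide. Presumably they are the standard metric axioms, which for all a, b, c read as follows (the labels are added for readability):

```latex
\begin{align*}
D(a,b) &\ge 0                       && \text{(non-negativity)} \\
D(a,b) &= 0 \iff a = b              && \text{(identity)} \\
D(a,b) &= D(b,a)                    && \text{(symmetry)} \\
D(a,c) &\le D(a,b) + D(b,c)         && \text{(triangle inequality)}
\end{align*}
```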
SLIDE 32
Distance Metrics
A variety of metrics have been developed, in fields as diverse as game playing and pattern recognition; the most important of these are as follows:
§ Sum of Absolute Difference (Manhattan Distance)
§ Sum of Squared Difference
§ Mean Absolute Error
§ Mean Squared Error
§ Euclidean Distance
SLIDE 33
Distance Metrics
These measures extend our common understanding of the notion of distance to complex mathematical domains (such as vector spaces) and give us tools to understand how similar or dissimilar two objects are.
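The formulas for the five distance measures named above are not legible in these slides. The sketch below uses their standard textbook definitions, which is an assumption about what the slide showed; the function names and the two example vectors are made up.

```python
import math

def sad(a, b):
    """Sum of Absolute Differences (Manhattan distance)."""
    return sum(abs(x - y) for x, y in zip(a, b))

def ssd(a, b):
    """Sum of Squared Differences."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

def mae(a, b):
    """Mean Absolute Error: SAD divided by the number of components."""
    return sad(a, b) / len(a)

def mse(a, b):
    """Mean Squared Error: SSD divided by the number of components."""
    return ssd(a, b) / len(a)

def euclidean(a, b):
    """Euclidean distance: square root of SSD."""
    return math.sqrt(ssd(a, b))

a = [1.0, 2.0, 3.0, 4.0]
b = [2.0, 2.0, 5.0, 0.0]
print(sad(a, b), ssd(a, b), mae(a, b), mse(a, b))  # → 7.0 21.0 1.75 5.25
print(euclidean(a, b))                             # square root of 21, about 4.58
```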
SLIDE 34
Dependence Metrics
Two common dependence metrics are as follows:
Correlation (Pearson’s Product-Moment Correlation Coefficient): Correlation measures the linear dependence of two vectors or random variables X and Y.
Cosine Similarity: Cosine similarity measures the cosine of the angle between two vectors of length N in N-dimensional space.
NOTE that these are similar calculations, except that correlation subtracts the mean from each point. For musical signals of any length, the mean will be very close to 0, and so these are effectively the same.
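A small sketch of the note above, using the standard definitions of the two measures (the slide's own formulas are missing, so this is an assumption). Pearson correlation is written as cosine similarity applied to mean-subtracted vectors. The sinusoidal test vectors and the added offset of 3 are hypothetical.

```python
import math

def cosine_similarity(x, y):
    """Cosine of the angle between x and y."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return dot / (norm_x * norm_y)

def pearson(x, y):
    """Pearson correlation: cosine similarity after subtracting each mean."""
    mean_x, mean_y = sum(x) / len(x), sum(y) / len(y)
    return cosine_similarity([a - mean_x for a in x], [b - mean_y for b in y])

# Two zero-mean "signals": sinusoids over four full periods, 0.5 rad apart.
x = [math.sin(2 * math.pi * k / 50) for k in range(200)]
y = [math.sin(2 * math.pi * k / 50 + 0.5) for k in range(200)]
print(cosine_similarity(x, y), pearson(x, y))   # both close to cos(0.5), about 0.878

# Adding a constant offset changes the cosine similarity but not the correlation.
x_offset = [v + 3.0 for v in x]
print(cosine_similarity(x_offset, y), pearson(x_offset, y))
```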
SLIDE 35
Distance Metrics
Dependence metrics can be converted (almost) into distance metrics by simply subtracting them from 1.0:
Cosine Distance = 1.0 - Cosine Similarity
Pearson’s Distance = 1.0 - Correlation Coefficient
These now lie in the range [0..2], with 0 indicating the strongest possible dependence. They are not actually distance metrics: a value of 0 does not indicate identity, just the strongest possible linear dependence, and the cosine distance does not satisfy the triangle inequality. However, this does not prevent them from being extremely useful!
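Both failures are easy to check on tiny vectors. This is a minimal sketch with hand-picked 2-D vectors (hypothetical values, standard definition of cosine similarity assumed).

```python
import math

def cosine_distance(x, y):
    """1.0 minus the cosine of the angle between x and y."""
    dot = sum(a * b for a, b in zip(x, y))
    norm_x = math.sqrt(sum(a * a for a in x))
    norm_y = math.sqrt(sum(b * b for b in y))
    return 1.0 - dot / (norm_x * norm_y)

a, b, c = [1.0, 0.0], [1.0, 1.0], [0.0, 1.0]

# Distance 0 without identity: a and 2a point in the same direction.
print(cosine_distance(a, [2.0, 0.0]))             # → 0.0

# Triangle inequality fails: going a -> b -> c is "shorter" than a -> c.
direct = cosine_distance(a, c)                    # 1.0 (orthogonal vectors)
detour = cosine_distance(a, b) + cosine_distance(b, c)
print(direct, detour)                             # detour is about 0.586

# Upper end of the range: opposite vectors.
print(cosine_distance(a, [-1.0, 0.0]))            # → 2.0
```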
SLIDE 36 Onset Detection (Spectral-Based)
Steps:
- 1. Spectrogram
- 2. Logarithmic compression
- 3. Differentiation
- 4. Accumulation
[Figure: novelty curve]
SLIDE 37 Onset Detection (Spectral-Based)
Steps:
- 1. Spectrogram
- 2. Logarithmic compression
- 3. Differentiation
- 4. Accumulation
- 5. Normalization
§ Subtraction of local average
[Figure: novelty curve]
SLIDE 38 Onset Detection (Spectral-Based)
Steps:
- 1. Spectrogram
- 2. Logarithmic compression
- 3. Differentiation
- 4. Accumulation
- 5. Normalization
[Figure: normalized novelty curve]
SLIDE 39 Onset Detection (Spectral-Based)
Steps:
- 1. Spectrogram
- 2. Logarithmic compression
- 3. Differentiation
- 4. Accumulation
- 5. Normalization
- 6. Peak picking
[Figure: normalized novelty curve]
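Steps 5 and 6 can be sketched as below, assuming normalization means subtracting a moving average and keeping the positive part. The averaging width, the threshold, the function names, and the synthetic novelty curve (a slowly rising baseline with three spikes) are illustrative assumptions.

```python
import numpy as np

def subtract_local_average(novelty, half_width):
    """Step 5: subtract a moving average, then keep only the positive part."""
    padded = np.pad(novelty, half_width, mode="edge")   # avoid edge artifacts
    kernel = np.ones(2 * half_width + 1) / (2 * half_width + 1)
    local_avg = np.convolve(padded, kernel, mode="valid")
    return np.maximum(novelty - local_avg, 0.0)

def pick_peaks(curve, threshold):
    """Step 6: indices of local maxima that exceed the threshold."""
    mid = curve[1:-1]
    is_peak = (mid > curve[:-2]) & (mid >= curve[2:]) & (mid > threshold)
    return np.flatnonzero(is_peak) + 1

# Hypothetical novelty curve: drifting baseline plus spikes at frames 20, 50, 80.
novelty = 0.5 + 0.005 * np.arange(100)
novelty[[20, 50, 80]] += 1.0

normalized = subtract_local_average(novelty, half_width=10)
peaks = pick_peaks(normalized, threshold=0.5)
print(peaks.tolist())                                   # → [20, 50, 80]
```

After normalization the drifting baseline is close to zero, so a single fixed threshold applies to the whole curve.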
SLIDE 40
Examples of Onset Detection:
§ WTC Fugue #1 (Bach)
§ A Smooth One (Benny Goodman)
§ Doc’s Guitar (Doc Watson)
§ WTC Prelude #5 (Bach)
§ Poulenc, Valse No.114
§ Faure, Op.15, No.1
SLIDE 41 Beat and Tempo
What is a beat?
§ Steady pulse that drives music forward and provides the temporal framework of a piece
§ Sequence of perceived pulses that are equally spaced in time
§ The pulse a human taps along when listening to the music
The term tempo then refers to the speed of the pulse.
[Parncutt 1994] [Sethares 2007] [Large/Palmer 2002] [Lerdahl/Jackendoff 1983] [Fitch/Rosenfeld 2007]
SLIDE 42 Beat and Tempo
Strategy
§ Analyze the novelty curve with respect to reoccurring or quasi-periodic patterns
§ Avoid the explicit determination of note onsets (no peak picking)
SLIDE 43 Beat and Tempo
Strategy
§ Analyze the novelty curve with respect to reoccurring or quasi-periodic patterns, as if it were a musical signal and you were trying to find the component pitches (= periodic patterns of the novelty curve)
§ Avoid the explicit determination of note onsets (no peak picking)
Methods
§ Autocorrelation
§ Fourier transform
SLIDE 44
Tempogram
Definition: A tempogram is a time-tempo representation that encodes the local tempo of a music signal over time (= spectrogram of the novelty curve!).
[Figure: tempogram; axes: Time (seconds), Tempo (BPM); color: Intensity]
SLIDE 45
Tempogram (Fourier)
Definition: A tempogram is a time-tempo representation that encodes the local tempo of a music signal over time (= spectrogram of the novelty curve!).
Fourier-based method
§ Compute a spectrogram (STFT) of the novelty curve
§ Convert frequency axis (given in Hertz) into tempo axis (given in BPM)
§ Magnitude spectrogram indicates local tempo
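A sketch of the Fourier-based method, assuming the novelty curve is sampled at a known feature rate. Instead of a full STFT followed by an axis conversion, each row correlates the novelty curve with a Hann-windowed complex sinusoid of frequency BPM / 60 Hz, which evaluates the transform directly at the tempi of interest. The window length, hop size, tempo range, function name, and impulse-train test curve are all assumptions.

```python
import numpy as np

def fourier_tempogram(novelty, feature_rate, tempi_bpm, win_sec=8.0, hop_sec=1.0):
    """Rows: tempi (BPM); columns: analysis frames; values: magnitude."""
    N = int(win_sec * feature_rate)
    H = int(hop_sec * feature_rate)
    window = np.hanning(N)
    t = np.arange(N) / feature_rate
    freqs_hz = np.asarray(tempi_bpm, dtype=float) / 60.0       # BPM -> Hz
    kernels = np.exp(-2j * np.pi * np.outer(freqs_hz, t)) * window
    n_frames = 1 + (len(novelty) - N) // H
    T = np.empty((len(freqs_hz), n_frames))
    for m in range(n_frames):
        segment = novelty[m * H : m * H + N]
        T[:, m] = np.abs(kernels @ segment)                    # one Fourier coefficient per tempo
    return T

# Hypothetical novelty curve: one impulse every 0.5 s (120 BPM), 20 s long.
feature_rate = 100                                             # novelty values per second
novelty = np.zeros(20 * feature_rate)
novelty[::50] = 1.0

tempi = np.arange(40, 201)                                     # 40..200 BPM
T = fourier_tempogram(novelty, feature_rate, tempi)
estimated = tempi[T.argmax(axis=0)]                            # strongest tempo per frame
print(sorted(set(estimated.tolist())))                         # → [120]
```

The tempo range stops at 200 BPM on purpose: for an impulse train the harmonic at 240 BPM is as strong as the fundamental, which illustrates why the Fourier tempogram emphasizes harmonics.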
SLIDE 46
Tempogram (Fourier)
[Figure: novelty curve and tempogram; axes: Time (seconds), Tempo (BPM)]
SLIDE 47
Tempogram (Fourier)
[Figure: novelty curve (local window); axes: Time (seconds), Tempo (BPM)]
SLIDE 48
Tempogram (Fourier)
[Figure: Hann-windowed sinusoidal; axes: Time (seconds), Tempo (BPM)]
SLIDE 49
Tempogram (Fourier)
[Figure: Hann-windowed sinusoidal; axes: Time (seconds), Tempo (BPM)]
SLIDE 50
Tempogram (Fourier)
[Figure: Hann-windowed sinusoidal; axes: Time (seconds), Tempo (BPM)]
SLIDE 51
Tempogram (Autocorrelation)
Definition: A tempogram is a time-tempo representation that encodes the local tempo of a music signal over time (= spectrogram of the novelty curve!).
Autocorrelation-based method (cf. pitch determination algorithm)
§ Compare novelty curve with time-lagged local sections of itself
§ Convert lag axis (given in seconds) into tempo axis (given in BPM)
§ Autocorrelogram indicates local tempo
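A sketch of the autocorrelation-based method under simple assumptions: a rectangular analysis window, an unnormalized autocorrelation, and the conversion tempo = 60 / lag with the lag in seconds. The window length, hop size, lag range, function name, and impulse-train test curve are hypothetical.

```python
import numpy as np

def autocorrelation_tempogram(novelty, feature_rate, max_lag_sec=2.0,
                              win_sec=8.0, hop_sec=1.0):
    """Rows: lags (in samples of the novelty curve); columns: analysis frames."""
    N = int(win_sec * feature_rate)
    H = int(hop_sec * feature_rate)
    L = int(max_lag_sec * feature_rate)
    n_frames = 1 + (len(novelty) - N) // H
    A = np.zeros((L + 1, n_frames))
    for m in range(n_frames):
        seg = novelty[m * H : m * H + N]
        for lag in range(L + 1):
            # compare the local section with a time-lagged copy of itself
            A[lag, m] = np.dot(seg[: N - lag], seg[lag:])
    lags_sec = np.arange(L + 1) / feature_rate
    return A, lags_sec

# Hypothetical novelty curve: one impulse every 0.5 s (120 BPM), 20 s long.
feature_rate = 100
novelty = np.zeros(20 * feature_rate)
novelty[::50] = 1.0

A, lags_sec = autocorrelation_tempogram(novelty, feature_rate)
min_lag = int(0.25 * feature_rate)                 # ignore lags below 0.25 s (above 240 BPM)
best_lag = A[min_lag:, :].argmax(axis=0) + min_lag
tempo_bpm = 60.0 / lags_sec[best_lag]              # convert lag (seconds) to tempo (BPM)
print(sorted(set(tempo_bpm.tolist())))             # → [120.0]
print(A[[50, 100, 150, 200], 0])                   # also strong at 60, 40 and 30 BPM
```

The nonzero values at lags of 1.0 s, 1.5 s, and 2.0 s are the tempo subharmonics (60, 40, 30 BPM) that this method emphasizes.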
SLIDE 52
Tempogram (Autocorrelation)
[Figure: novelty curve (local window); axes: Time (seconds), Lag (seconds)]
SLIDE 53
Tempogram (Autocorrelation)
[Figure: windowed autocorrelation; axis: Lag (seconds)]
SLIDE 54
Tempogram (Autocorrelation)
[Figure: Lag = 0 (seconds); axis: Lag (seconds)]
SLIDE 55
Tempogram (Autocorrelation)
[Figure: Lag = 0.26 (seconds); axis: Lag (seconds)]
SLIDE 56
Tempogram (Autocorrelation)
[Figure: Lag = 0.52 (seconds); axis: Lag (seconds)]
SLIDE 57
Tempogram (Autocorrelation)
[Figure: Lag = 0.78 (seconds); axis: Lag (seconds)]
SLIDE 58
Tempogram (Autocorrelation)
[Figure: Lag = 1.56 (seconds); axis: Lag (seconds)]
SLIDE 59
Tempogram (Autocorrelation)
[Figure; axes: Time (seconds), Lag (seconds)]
SLIDE 60
Tempogram (Autocorrelation)
[Figure; axes: Time (seconds), Tempo (BPM); tempo ticks: 30, 40, 60, 80, 120, 300]
SLIDE 61
Tempogram (Autocorrelation)
[Figure; axes: Time (seconds), Tempo (BPM); tempo ticks: 100, 200, 300, 400, 500, 600]
SLIDE 62
Tempogram
[Figure: Fourier and autocorrelation tempograms; axes: Time (seconds), Tempo (BPM)]
SLIDE 63
Tempogram
[Figure: Fourier and autocorrelation tempograms; axes: Time (seconds), Tempo (BPM); marked tempi: 210, 70]
Tempo@Tatum = 210 BPM
Tempo@Measure = 70 BPM
SLIDE 64
Tempogram
[Figure: Fourier and autocorrelation tempograms; axes: Time (seconds), Tempo (BPM)]
Fourier: Emphasis of tempo harmonics (integer multiples)
Autocorrelation: Emphasis of tempo subharmonics (integer fractions)
[Grosche et al., ICASSP 2010] [Peeters, JASP 2007]
SLIDE 65
Tempogram (Summary)
Fourier:
§ Novelty curve is compared with sinusoidal kernels each representing a specific tempo
§ Convert frequency (Hertz) into tempo (BPM)
§ Reveals novelty periodicities
§ Emphasizes harmonics
§ Granularity increases as tempo increases
§ Suitable to analyze tempo on tatum and tactus level
Autocorrelation:
§ Novelty curve is compared with time-lagged local (windowed) sections of itself
§ Convert time-lag (seconds) into tempo (BPM)
§ Reveals novelty self-similarities
§ Emphasizes subharmonics
§ Granularity increases as tempo decreases
§ Suitable to analyze tempo on tatum and measure level