
Tonal Analysis and Hidden Markov Model



  1. GCT634: Musical Applications of Machine Learning
     Tonal Analysis and Hidden Markov Model
     Graduate School of Culture Technology, KAIST
     Juhan Nam

  2. Outline
     • Introduction
       - Tonality
       - Perceptual Distance of Two Tones
       - Chords and Scales
     • Tonal Analysis
       - Key Estimation
       - Chord Recognition
     • Hidden Markov Model

  3. Introduction
     • Bach’s chorale harmonization
     • Jazz “Real Book”
     • Pop music

  4. Tonality
     • Tonal music has a tonal center called the key
       - 12 keys (C, C#, D, …, B)
     • Tonal music uses a major or minor scale built on the key, and the notes of the scale have different roles (example: C major scale)
     • Notes in tonal music are harmonized by chords

  5. Tonality
     • A sequence of notes or a chord progression provides a certain degree of stability or instability
       - E.g., cadence (V-I, IV-I), tension (sus2, sus4)
     • How is tonality formed?
       - In other words, how do we perceive different degrees of stability or tension from notes?

  6. Tonality
     • Consonance and Dissonance
       - If two sinusoidal tones are within 3 semitones (a minor 3rd) in frequency, they sound dissonant
       - They are most dissonant when they are about one quarter of a critical band apart
       - Critical bands become wider below 500 Hz, so two low notes can sound dissonant (e.g., two piano notes in the low register)
     • Consonance of two harmonic tones
       - Determined by the extent to which the two tones have closely located overtones within critical bands

  7. Consonance Rating of Intervals in Music
     • The perceptual distance between two notes differs from the semitone distance between them.

  8. Chords
     • The basic units of tonal harmony
       - Triads, 7th, 9th, 11th, …
     • Triads are formed by choosing three notes that make the most consonant (or “most harmonized”) sounds
       - This amounts to stacking major or minor 3rds
       - 7th and 9th chords are obtained by stacking further 3rds
     • The quality of consonance becomes more sophisticated as more notes are added
       - Music theory is basically about how to create tension and resolve it with different qualities of consonance

  9. Scales in Tonal Harmony
     • Major scale
       - Formed by spreading out the notes of three major chords
     • Minor scale
       - Formed by spreading out the notes of three minor chords (natural minor scale)
       - The harmonic or melodic minor scale can be formed by using both minor and major chords
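As a rough illustration of this scale construction, the Python sketch below collects the pitch classes of three triads and sorts them upward from the tonic. It assumes the three chords are the triads on the tonic, subdominant, and dominant (the slide does not name them). The helper names `triad` and `scale_from_triads` are made up for this example.

```python
NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def triad(root, quality):
    """Return the pitch classes of a triad built by stacking two 3rds."""
    third = 4 if quality == "major" else 3  # major 3rd = 4 semitones, minor 3rd = 3
    return [root % 12, (root + third) % 12, (root + 7) % 12]


def scale_from_triads(tonic, quality):
    """Merge the triads on degrees I, IV, and V (an assumption) into a scale."""
    roots = [tonic, tonic + 5, tonic + 7]
    pitch_classes = set()
    for root in roots:
        pitch_classes.update(triad(root, quality))
    # Order the notes upward, starting from the tonic.
    return sorted(pitch_classes, key=lambda pc: (pc - tonic) % 12)


c_major = [NOTE_NAMES[pc] for pc in scale_from_triads(0, "major")]
a_minor = [NOTE_NAMES[pc] for pc in scale_from_triads(9, "minor")]
print(c_major)  # ['C', 'D', 'E', 'F', 'G', 'A', 'B']
print(a_minor)  # ['A', 'B', 'C', 'D', 'E', 'F', 'G']
```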

  10. Automatic Chord Recognition
      • Identifying the chord progression of tonal music
      • It is a challenging task (even for humans)
        - Chords are not explicit in music
        - Non-chord notes or passing notes
        - Key changes and chromaticism: require in-depth knowledge of music theory
        - In audio, multiple musical instruments are mixed
          - Relevant: harmonically arranged notes
          - Irrelevant: percussive sounds (but they can help detect chord changes)
      • What kind of audio features can be extracted to recognize chords robustly?

  11. Chroma Features: FFT-based approach
      • Compute the spectrogram and a mapping matrix
        - Convert each frequency to the musical pitch scale and get its pitch class
        - Set the entry for the corresponding pitch class to one and all other entries to zero
        - Adjust the nonzero values so that low-frequency content receives more weight
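The mapping-matrix steps can be sketched roughly as follows in Python with NumPy. This is only one possible reading of the slide: the frequency range (`f_min`, `f_max`), the A4 = 440 Hz tuning, and the particular weighting (unit weight split among the bins of one note, so the sparser low-frequency bins each weigh more) are all assumptions. The function name `chroma_mapping_matrix` is hypothetical.

```python
import numpy as np


def chroma_mapping_matrix(n_fft, sr, f_min=55.0, f_max=2000.0, a4=440.0):
    """Build a (12, n_fft // 2 + 1) matrix that folds FFT bins into pitch classes."""
    freqs = np.arange(n_fft // 2 + 1) * sr / n_fft
    valid = (freqs >= f_min) & (freqs <= f_max)

    # Nearest MIDI note number of each usable bin (A4 = 440 Hz = MIDI 69).
    notes = np.zeros(freqs.size, dtype=int)
    notes[valid] = np.round(69 + 12 * np.log2(freqs[valid] / a4)).astype(int)

    mapping = np.zeros((12, freqs.size))
    for k in np.flatnonzero(valid):
        # Assumed weighting: split unit weight among the bins of the same note.
        # Low notes cover fewer bins, so each low-frequency bin weighs more.
        bins_in_note = np.count_nonzero(valid & (notes == notes[k]))
        mapping[notes[k] % 12, k] = 1.0 / bins_in_note
    return mapping, freqs


mapping, freqs = chroma_mapping_matrix(n_fft=4096, sr=22050)
# Given a magnitude spectrogram S of shape (n_fft // 2 + 1, n_frames),
# the chroma features would be: chroma = mapping @ S
```

Row 0 of `mapping` corresponds to pitch class C and row 9 to A, because MIDI note numbers are reduced modulo 12.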

  12. Chroma Features: Filter-bank approach
      • A filter bank can be used to get a log-scale time-frequency representation
        - Center frequencies are arranged over the 88 piano notes
        - Bandwidths are set to be constant-Q and robust to ±25 cents of detuning
      • The outputs that belong to the same pitch class are wrapped and summed (Müller, 2011)

  13. Beat-Synchronous Chroma Features
      • Make chroma features homogeneous within a beat (Bartsch and Wakefield, 2001)
      (From Ellis’ slides)
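One simple way to make chroma homogeneous within a beat is to average the frames that fall between consecutive beat positions. The sketch below assumes the beat positions are already available as frame indices (beat tracking itself is not shown) and that plain averaging is acceptable. The function name `beat_sync_chroma` is hypothetical.

```python
import numpy as np


def beat_sync_chroma(chroma, beat_frames):
    """Average a (12, n_frames) chroma matrix over the frames between beats."""
    n_frames = chroma.shape[1]
    # Segment boundaries: start of signal, every beat, end of signal.
    bounds = np.unique(np.concatenate(([0], beat_frames, [n_frames]))).astype(int)
    segments = [chroma[:, start:end].mean(axis=1)
                for start, end in zip(bounds[:-1], bounds[1:])]
    return np.stack(segments, axis=1)  # shape (12, n_segments)


# Toy input: 6 frames with beats at frames 2 and 4 give 3 beat segments.
toy_chroma = np.arange(72, dtype=float).reshape(12, 6)
synced = beat_sync_chroma(toy_chroma, [2, 4])
print(synced.shape)  # (12, 3)
```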

  14. Key Estimation Overview
      • Estimate the musical key from music data
        - One of 24 keys: 12 pitch classes (C, C#, D, …, B) × major/minor
      • General framework (Gomez, 2006)
        [Figure: block diagram with Chroma Features, Key Template, Similarity Measure, Average Key Strength, and the output Key (G major)]

  15. Key Template
      • Probe tone profile (Krumhansl and Kessler, 1982)
        - Relative stability or weight of tones
        - Listeners rated which tones best completed the first seven notes of a major scale
        - For example, in the key of C major: C, D, E, F, G, A, B, … what?
      [Figure: Probe Tone Profile - Relative Pitch Ranking]

  16. Key Estimation
      • Measure similarity by cross-correlating the chroma features with the key templates
      • Choose the key that produces the maximum correlation
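A minimal sketch of this template-correlation step, assuming the commonly cited Krumhansl-Kessler probe-tone ratings as the key templates and the Pearson correlation as the similarity measure. For brevity it averages the chroma over time before correlating, which is a simplification of the "average key strength" block in the framework. The function name `estimate_key` is hypothetical.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]

# Krumhansl-Kessler probe-tone ratings, indexed from the tonic upward.
KK_MAJOR = np.array([6.35, 2.23, 3.48, 2.33, 4.38, 4.09,
                     2.52, 5.19, 2.39, 3.66, 2.29, 2.88])
KK_MINOR = np.array([6.33, 2.68, 3.52, 5.38, 2.60, 3.53,
                     2.54, 4.75, 3.98, 2.69, 3.34, 3.17])


def estimate_key(chroma):
    """Return (tonic, mode, correlation) for a (12,) or (12, n_frames) chroma."""
    chroma = np.asarray(chroma, dtype=float)
    mean_chroma = chroma.mean(axis=1) if chroma.ndim == 2 else chroma
    best = (None, None, -np.inf)
    for mode, profile in (("major", KK_MAJOR), ("minor", KK_MINOR)):
        for tonic in range(12):
            # Rotate the profile so that its tonic weight sits at index `tonic`.
            template = np.roll(profile, tonic)
            corr = np.corrcoef(mean_chroma, template)[0, 1]
            if corr > best[2]:
                best = (NOTE_NAMES[tonic], mode, corr)
    return best


# Toy input: a chroma vector shaped exactly like the G major template.
toy_chroma = np.roll(KK_MAJOR, 7)
print(estimate_key(toy_chroma)[:2])  # ('G', 'major')
```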

  17. Chord Recognition
      • Estimate chords from music data
        - Typically, one of 24 chords: 12 pitch classes × major/minor
        - Often, diminished chords are added (36 chords)
      • General framework
        [Figure: block diagram with Audio, Transform, Chroma Features, Decision Making (Template Matching, HMM, SVM), Chord Template or Models, and the output Chords]

  18. Template-Based Approach
      • Use chord templates (Fujishima, 1999; Harte and Sandler, 2005) and find the best matches
      • Chord templates (from Bello’s slides)

  19. Template-Based Approach
      • Compute the cross-correlation between the chroma features and the chord templates, and select the chords with the maximum values (from Bello’s slides)
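The template-matching idea can be sketched as follows, assuming binary triad templates for the 24 major and minor chords and a normalized inner product (cosine similarity) as the correlation measure. The templates shown on the original slides may differ. The function names `chord_templates` and `recognize_chords` are hypothetical.

```python
import numpy as np

NOTE_NAMES = ["C", "C#", "D", "D#", "E", "F", "F#", "G", "G#", "A", "A#", "B"]


def chord_templates():
    """Return labels and binary (24, 12) templates for major and minor triads."""
    labels, templates = [], []
    for root in range(12):
        for suffix, third in (("", 4), ("m", 3)):
            template = np.zeros(12)
            template[[root, (root + third) % 12, (root + 7) % 12]] = 1.0
            labels.append(NOTE_NAMES[root] + suffix)
            templates.append(template)
    return labels, np.array(templates)


def recognize_chords(chroma):
    """Pick, for each frame of a (12, n_frames) chroma, the best-matching chord."""
    labels, templates = chord_templates()
    templates = templates / np.linalg.norm(templates, axis=1, keepdims=True)
    norms = np.linalg.norm(chroma, axis=0, keepdims=True) + 1e-12
    scores = templates @ (chroma / norms)  # shape (24, n_frames)
    return [labels[i] for i in scores.argmax(axis=0)]


# Toy input: three frames containing exactly the notes of C, Am, and G.
toy_chroma = np.zeros((12, 3))
toy_chroma[[0, 4, 7], 0] = 1.0
toy_chroma[[9, 0, 4], 1] = 1.0
toy_chroma[[7, 11, 2], 2] = 1.0
print(recognize_chords(toy_chroma))  # ['C', 'Am', 'G']
```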

  20. Review
      • The template approach is too simplistic
        - The binary templates are hard assignments
      • We can use a multi-class classifier
        - The output is one of the target chords
        - However, the local (frame-wise) estimates tend not to be temporally smooth
      • We need an algorithm that considers the temporal dependency between chords
        - Most tonal music follows certain types of chord progressions

  21. Hidden Markov Model (HMM)
      • A probabilistic model for time-series data
        - Speech, gesture, DNA sequences, financial data, weather data, …
      • Assumes that the time-series data are generated from hidden states and that the hidden states follow a Markov model
      • Learning-based approach
        - Needs training data annotated with labels
        - The labels usually correspond to the hidden states

  22. Markov Model
      • A random variable $q$ has $N$ states ($S_1, S_2, \ldots, S_N$) and, at each time step, one of the states is randomly chosen: $q_t \in \{S_1, S_2, \ldots, S_N\}$
      • The probability distribution of the current state is determined by the previous state(s)
        - First-order: $P(q_t \mid q_1, q_2, \ldots, q_{t-1}) = P(q_t \mid q_{t-1})$
        - Second-order: $P(q_t \mid q_1, q_2, \ldots, q_{t-1}) = P(q_t \mid q_{t-1}, q_{t-2})$
      • The first-order Markov model is widely used for simplicity

  23. Markov Model
      • Example: chord progression
        - $q_t \in \{C, F, G\}$
        - The 3-by-3 transition probability matrix:
          $P(q_t = C \mid q_{t-1} = C) = 0.7$, $P(q_t = F \mid q_{t-1} = C) = 0.1$, $P(q_t = G \mid q_{t-1} = C) = 0.2$
          $P(q_t = C \mid q_{t-1} = F) = 0.2$, $P(q_t = F \mid q_{t-1} = F) = 0.6$, $P(q_t = G \mid q_{t-1} = F) = 0.2$
          $P(q_t = C \mid q_{t-1} = G) = 0.3$, $P(q_t = F \mid q_{t-1} = G) = 0.1$, $P(q_t = G \mid q_{t-1} = G) = 0.6$
      [Figure: state transition diagram with nodes St, C, F, G, and End]

  24. Markov Model
      • The joint probability of a sequence of states is simple with the Markov model
        $$
        \begin{aligned}
        P(q_1, q_2, \ldots, q_t) &= P(q_1, q_2, \ldots, q_{t-1})\, P(q_t \mid q_1, q_2, \ldots, q_{t-1}) \\
        &= P(q_1, q_2, \ldots, q_{t-1})\, P(q_t \mid q_{t-1}) \\
        &= P(q_1, q_2, \ldots, q_{t-2})\, P(q_{t-1} \mid q_1, q_2, \ldots, q_{t-2})\, P(q_t \mid q_{t-1}) \\
        &= P(q_1, q_2, \ldots, q_{t-2})\, P(q_{t-1} \mid q_{t-2})\, P(q_t \mid q_{t-1}) \\
        &= P(q_1)\, P(q_2 \mid q_1) \cdots P(q_{t-1} \mid q_{t-2})\, P(q_t \mid q_{t-1})
        \end{aligned}
        $$

  25. What Can We Do with the Markov Model?
      • Generate a chord sequence
        - E.g., C-C-C-C-F-F-C-C-G-G-C-C-…
        - We can also generate a melody if we define a transition probability matrix among notes
      • Evaluate whether a specific chord progression is more likely than others
        - For example, C-G-C is more likely than C-F-C (assuming $P(q_1 = C) = 1$):
          $P(q = C, G, C) = P(q_1 = C)\, P(q_2 = G \mid q_1 = C)\, P(q_3 = C \mid q_2 = G) = 0.2 \times 0.3 = 0.06$
          $P(q = C, F, C) = P(q_1 = C)\, P(q_2 = F \mid q_1 = C)\, P(q_3 = C \mid q_2 = F) = 0.1 \times 0.2 = 0.02$
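Both uses can be tried out with a short NumPy sketch. It hard-codes the C/F/G transition probabilities from the chord-progression example and assumes, as in the comparison on this slide, that the sequence always starts on C. The function names `sequence_probability` and `sample_sequence` and the random seed are choices made for this illustration.

```python
import numpy as np

CHORDS = ["C", "F", "G"]
# A[i, j] = P(q_t = CHORDS[j] | q_{t-1} = CHORDS[i]); each row sums to 1.
A = np.array([[0.7, 0.1, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.1, 0.6]])
INITIAL = np.array([1.0, 0.0, 0.0])  # assumption: P(q_1 = C) = 1


def sequence_probability(seq, initial, A):
    """Joint probability of a chord sequence under a first-order Markov model."""
    idx = [CHORDS.index(chord) for chord in seq]
    prob = initial[idx[0]]
    for prev, cur in zip(idx[:-1], idx[1:]):
        prob *= A[prev, cur]
    return prob


def sample_sequence(length, initial, A, rng):
    """Draw a random chord sequence by following the transition probabilities."""
    state = rng.choice(len(CHORDS), p=initial)
    states = [state]
    for _ in range(length - 1):
        state = rng.choice(len(CHORDS), p=A[state])
        states.append(state)
    return [CHORDS[s] for s in states]


p_cgc = sequence_probability(["C", "G", "C"], INITIAL, A)  # 1 * 0.2 * 0.3
p_cfc = sequence_probability(["C", "F", "C"], INITIAL, A)  # 1 * 0.1 * 0.2
print(p_cgc > p_cfc)  # True

rng = np.random.default_rng(0)
print("-".join(sample_sequence(12, INITIAL, A, rng)))
```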

  26. What Can We Do with a Markov Model?
      • Compute the probability that the chord at time $T$ is C (or F or G)
        - Naïve method: enumerate all paths that have a C chord at time $T$: exponential!
        - Clever method: use recursive induction
          $P(q_T = C) = P(q_T = C \mid q_{T-1} = C)\, P(q_{T-1} = C) + P(q_T = C \mid q_{T-1} = F)\, P(q_{T-1} = F) + P(q_T = C \mid q_{T-1} = G)\, P(q_{T-1} = G)$
        - Repeat this for $P(q_i = C)$, $P(q_i = F)$, $P(q_i = G)$ for $i = T-1, T-2, \ldots, 1$
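The recursion can be sketched in a few lines of NumPy. Instead of unrolling backward from time $T$, the sketch starts from the initial distribution and applies the same sum forward one step at a time, which costs one vector-matrix product per step. The transition matrix repeats the C/F/G example, the start on C is an assumption, and the function name `state_marginals` is hypothetical.

```python
import numpy as np

CHORDS = ["C", "F", "G"]
# A[i, j] = P(q_t = CHORDS[j] | q_{t-1} = CHORDS[i])
A = np.array([[0.7, 0.1, 0.2],
              [0.2, 0.6, 0.2],
              [0.3, 0.1, 0.6]])


def state_marginals(initial, A, n_steps):
    """Return an (n_steps, n_states) array whose row t-1 holds P(q_t = each state)."""
    p = np.asarray(initial, dtype=float)
    rows = [p]
    for _ in range(n_steps - 1):
        # P(q_t = j) = sum_i P(q_t = j | q_{t-1} = i) * P(q_{t-1} = i)
        p = p @ A
        rows.append(p)
    return np.array(rows)


marginals = state_marginals([1.0, 0.0, 0.0], A, n_steps=3)  # assumed start on C
print(dict(zip(CHORDS, marginals[-1])))  # P(q_3 = C), P(q_3 = F), P(q_3 = G)
```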

  27. Chord Recognition from Audio
      • What we observe are not chords but audio features (e.g., chroma)
      • We want to infer a chord sequence from an audio feature sequence
        - Chord sequence: $q_1, q_2, \ldots, q_{t-1}$
        - Audio feature sequence: $O_1, O_2, \ldots, O_{t-1}$

  28. Hidden Markov Model (HMM)
      • The hidden states follow the Markov model
      • Given a state, the corresponding observation distribution is independent of previous states and observations
        - Each state has its own emission distribution: $P(O \mid q_t = C)$, $P(O \mid q_t = F)$, $P(O \mid q_t = G)$
      [Figure: state diagram over C, F, and G, with the hidden-state chain $\ldots, q_{t-1}, q_t, q_{t+1}$ emitting the observations $O_{t-1}, O_t, O_{t+1}$]
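Taken together, the two assumptions imply that the joint probability of a state sequence and its observations factorizes into transition terms and emission terms. The formula below is a sketch of that factorization in the notation of the earlier Markov model slides. Here $T$ denotes the sequence length and the sequence is indexed from $t = 1$, both of which are notational assumptions.

```latex
P(q_1, \ldots, q_T, O_1, \ldots, O_T)
  = P(q_1) \prod_{t=2}^{T} P(q_t \mid q_{t-1}) \prod_{t=1}^{T} P(O_t \mid q_t)
```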

  29. Hidden Markov Model (HMM)
      • Model parameters
        - Initial state probabilities: $P(q_0) \rightarrow \pi_i$
        - Transition probability matrix: $P(q_t \mid q_{t-1}) \rightarrow a_{ij}$
        - Observation distribution given a state: $P(O \mid q_j) \rightarrow b_j$ (e.g., Gaussian)
      • How can we learn the parameters from data?
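When every training frame carries a chord label, one simple option is to estimate the parameters by counting and averaging. The sketch below illustrates that supervised case only and is not necessarily the learning method the lecture goes on to present. It assumes integer state labels per frame, diagonal-covariance Gaussian emissions, and additive smoothing of the counts. The function name `fit_supervised_hmm` is hypothetical.

```python
import numpy as np


def fit_supervised_hmm(label_seqs, feature_seqs, n_states, smoothing=1.0):
    """Estimate pi, a_ij, and per-state Gaussian parameters from labeled frames."""
    n_dims = feature_seqs[0].shape[1]
    pi = np.full(n_states, smoothing, dtype=float)
    trans = np.full((n_states, n_states), smoothing, dtype=float)
    for labels in label_seqs:
        pi[labels[0]] += 1  # count which state each sequence starts in
        for prev, cur in zip(labels[:-1], labels[1:]):
            trans[prev, cur] += 1  # count state-to-state transitions
    pi /= pi.sum()
    trans /= trans.sum(axis=1, keepdims=True)

    all_labels = np.concatenate(label_seqs)
    all_feats = np.concatenate(feature_seqs)
    means = np.zeros((n_states, n_dims))
    variances = np.ones((n_states, n_dims))
    for j in range(n_states):
        frames = all_feats[all_labels == j]
        if len(frames) > 0:  # states never seen keep the default parameters
            means[j] = frames.mean(axis=0)
            variances[j] = frames.var(axis=0) + 1e-6
    return pi, trans, means, variances


# Toy data: one sequence of 5 frames, 2 states, 1-dimensional features.
toy_labels = [np.array([0, 0, 1, 1, 0])]
toy_feats = [np.array([[0.1], [-0.1], [1.1], [0.9], [0.0]])]
pi, trans, means, variances = fit_supervised_hmm(toy_labels, toy_feats, n_states=2)
print(trans)
```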
