
CTP431 - Music and Audio Computing: Music Information Retrieval


  1. CTP431 - Music and Audio Computing: Music Information Retrieval
     Juhan Nam, Graduate School of Culture Technology, KAIST

  2. Introduction
     - Instrument: Piano
     - Genre: Classical
     - Composer: Chopin
     - Key: E minor
     - Mood: Melancholy, Sad, …
     - Songs with a similar melody: ELO “After All”, Radiohead “Exit Music”
     - Can you transcribe the song into a music score?

  3. Information in Music
     - Factual information
       - Track, artist, years, composers
     - Musical information
       - Music score: instrument, notes, meter, expressions
       - Melody, rhythm, chords, structure
     - Semantic information
       - Genre, mood, text descriptions

  4. Music Understanding by Humans
     http://www.slideshare.net/Daritsetseg/brainstem-auditory-evoked-responses-baer-or-abr-45762118

  5. Music Understanding by Computer
     - Music Information Retrieval (MIR)
       - An area of research that aims to have computers infer various types of information from music

  6. Applications of MIR
     - Music listening
       - Music identification, search and recommendation
     - Music performance
       - Interactive music performance
       - Musical instrument learning
     - Music composition
       - Automatic composition and arrangement
     - Entertainment
       - Singing evaluation, games
     - Sound production
       - Sound sample search in sound libraries
       - Automatic segmentation and digital audio effects

  7. Background
     - Scale and diversity of music content
       - Commercial music tracks
         - Spotify: 30M+ songs (2015)
         - Bugs music: 10M+ songs (2017)
       - User content
         - YouTube: 300+ hours of video uploaded per minute (2015)
         - SoundCloud: 12+ hours of audio uploaded per minute (2014)
       - User data
         - Profile, play history, ratings, …
         - Spotify: 24M+ active users (as of Jan 2014)
         - YouTube: 1B+ unique user visits each month (as of Dec 2014)
     - All of this music content is readily accessible.
       - How can we find music that matches our taste?
       - Can we have a Google for music?

  8. Music Identification
     - Query by music
       - Search for the single unique song identified by the query
       - Audio fingerprinting
     - Examples: Shazam; Audio Fingerprinting (http://labrosa.ee.columbia.edu/matlab/fingerprint/)
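The slide names audio fingerprinting without describing an algorithm. As a rough illustration only, the sketch below follows the widely used landmark idea: pair nearby spectrogram peaks, hash each pair as (frequency 1, frequency 2, time gap), and let matching hashes vote for a time offset between query and reference. Peak picking is assumed to be done already; the function names, the `fan_out` and `max_dt` parameters, and the peak values are all hypothetical.

```python
from collections import Counter


def landmark_hashes(peaks, fan_out=3, max_dt=10):
    """Turn spectrogram peaks into (hash, anchor_frame) pairs.

    peaks: list of (frame, freq_bin) tuples. Each peak is paired with up to
    `fan_out` later peaks that lie within `max_dt` frames of it.
    """
    peaks = sorted(peaks)
    hashes = []
    for i, (t1, f1) in enumerate(peaks):
        paired = 0
        for t2, f2 in peaks[i + 1:]:
            dt = t2 - t1
            if dt > max_dt:
                break
            if dt <= 0:
                continue
            hashes.append(((f1, f2, dt), t1))
            paired += 1
            if paired == fan_out:
                break
    return hashes


def best_offset(query_hashes, reference_hashes):
    """Vote for the time offset (reference frame - query frame) that most hashes agree on."""
    index = {}
    for h, t in reference_hashes:
        index.setdefault(h, []).append(t)
    votes = Counter()
    for h, t_query in query_hashes:
        for t_ref in index.get(h, []):
            votes[t_ref - t_query] += 1
    return votes.most_common(1)[0] if votes else (None, 0)


# Toy data (made up): the query is the part of the reference starting at frame 21.
ref_peaks = [(0, 10), (2, 40), (5, 25), (21, 33), (23, 12), (26, 48), (30, 20), (34, 41)]
query_peaks = [(t - 21, f) for t, f in ref_peaks if t >= 21]

ref_hashes = landmark_hashes(ref_peaks)
query_hashes = landmark_hashes(query_peaks)
offset, votes = best_offset(query_hashes, ref_hashes)
print(offset, votes)
```

In a real system the reference index would cover many songs, and the song whose best offset collects the most votes would be returned as the match.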

  9. Music Identification
     - Query by humming
       - Hum a melody and find the closest matches
       - Melody-based matching
     - Figure labels: Melody Extraction; SoundHound

  10. Music Search and Recommendation
     - Music recommendation
       - Playlist generation: personalized internet radio
       - Matching songs to users
         - Song information: genre, years, artist, audio
         - User information: profile, play history, rating, context (places)
       - Music services in industry: Google, Apple, Pandora, Spotify, Melon, Bugs, …
     - Figure labels: iTunes Music; Pandora

  11. Current Approaches
     - Manual curation
     - Human expert analysis
     - Collaborative filtering
     - Content-based analysis (by computers)

  12. Manual Curation
     - Playlist generation by music experts (or users)
       - Traditional: AM/FM radio
       - The majority of current music services are based on this approach
     - Advantages
       - Effective for usage-based music services (workout, study, driving or prenatal education)
       - Good for music discovery
       - Often comes with storytelling
     - Limitations
       - No personalization
       - Not scalable
     - Example: www.soribada.com

  13. Human Expert Analysis
     - Pandora: Music Genome Project (1999)
       - Musicologists analyze each song for about 450 musical attributes in various categories
       - Big success as a music service
     - Advantages
       - High-quality analysis
       - Good for music discovery
     - Limitations
       - Expensive: it takes 20-30 minutes to analyze one song
       - Not scalable: only for commercial tracks?

  14. Collaborative Filtering (CF)
     - Basic idea
       - Person A: I like songs A, B, C and D.
       - Person B: I like songs A, B, C and E.
       - Person A: Really? You should check out song D.
       - Person B: Wow, you should also check out song E.
     - Formulation
       - Matrix factorization (or matrix completion) problem
       - Preference: $p_{us} = x_u^T y_s$
       - User similarity: $q_{u_1 u_2} = x_{u_1}^T x_{u_2}$
       - Song similarity: $r_{s_1 s_2} = y_{s_1}^T y_{s_2}$
       - Example: $x_u$ is Juhan’s latent vector and $y_s$ is Gangnam Style’s latent vector
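To make the factorization concrete, here is a minimal sketch that fits user vectors x_u and song vectors y_s by stochastic gradient descent so that their dot product approximates the observed preferences. The slide does not say which optimizer or loss is used, so the squared-error loss, the hyperparameters and the 4x4 toy preference table are all assumptions made for illustration.

```python
import random


def dot(a, b):
    return sum(p * q for p, q in zip(a, b))


def factorize(prefs, n_factors=2, n_epochs=1000, lr=0.05, reg=0.01, seed=0):
    """Fit latent vectors so that dot(X[u], Y[s]) approximates prefs[(u, s)].

    prefs maps (user_index, song_index) -> observed preference.
    Unobserved pairs are skipped during training (an assumption of this sketch).
    """
    rng = random.Random(seed)
    n_users = 1 + max(u for u, _ in prefs)
    n_songs = 1 + max(s for _, s in prefs)
    X = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_users)]
    Y = [[rng.uniform(-0.1, 0.1) for _ in range(n_factors)] for _ in range(n_songs)]
    for _ in range(n_epochs):
        for (u, s), p in prefs.items():
            err = p - dot(X[u], Y[s])
            for k in range(n_factors):
                xu, ys = X[u][k], Y[s][k]
                # Gradient step on squared error with L2 regularization.
                X[u][k] += lr * (err * ys - reg * xu)
                Y[s][k] += lr * (err * xu - reg * ys)
    return X, Y


# Toy data (made up): two taste groups, with two entries hidden from training.
likes = [
    [1, 1, 0, 0],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
]
held_out = {(0, 1), (3, 2)}
prefs = {(u, s): likes[u][s]
         for u in range(4) for s in range(4) if (u, s) not in held_out}

X, Y = factorize(prefs)
print("predicted preference of user 0 for song 1:", round(dot(X[0], Y[1]), 2))
print("similarity of users 0 and 1:", round(dot(X[0], X[1]), 2))
print("similarity of songs 0 and 1:", round(dot(Y[0], Y[1]), 2))
```

The hidden entries play the role of "songs you should check out": the model never sees them, yet the learned vectors predict a high preference because users with the same taste share nearly the same latent vector.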

  15. Collaborative Filtering
     - Advantages
       - Captures the semantics of music from a human perspective
       - Enables personalized recommendation (by nature)
     - Limitations
       - The cold-start problem: what if a song has never been played by anyone?
       - Popularity bias: likely to recommend (already) well-known songs, or songs from the same musician or album

  16. Collaborative Filtering
     - Bad examples: can you find songs similar to this musician? [Oord et al., 2013]

  17. Content-Based Analysis
     - An intelligent approach that makes computers listen to music and predict descriptive words from audio tracks
       - Tags: genre, mood, instrument, voice quality, usage
       - Features: spectrogram, MFCC, …
       - Algorithms: GMM, SVM, neural networks
     - Pipeline: audio files, then audio features, then algorithms

  18. Text-based Music Retrieval by Auto-tagging
     - Sort songs by the probability of the query tag and choose the top N
       - Like a text-based Google search
     - Example query word: “Female Lead Vocals”; top 5 ranked songs:
       - Norah Jones - Don’t Know Why
       - Dido - Here with Me
       - Sheryl Crow - I Shall Believe
       - No Doubt - Simple Kind of Life
       - Carpenters - Rainy Days and Mondays
     - We can also compute similarity between songs using the estimated tag probabilities
       - E.g. cosine distance between two tag probability vectors
       - Applicable to query by audio
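Both retrieval modes on this slide reduce to a few lines once each song has a vector of estimated tag probabilities. The sketch below assumes such vectors already exist (the auto-tagging model itself is out of scope); the song names and probability values are invented placeholders, and cosine similarity is used, which is one minus the cosine distance.

```python
import math


def top_n_by_tag(tag_probs, tag, n=5):
    """Rank songs by the estimated probability of the query tag."""
    ranked = sorted(tag_probs, key=lambda song: tag_probs[song].get(tag, 0.0), reverse=True)
    return ranked[:n]


def cosine_similarity(a, b):
    """Cosine similarity between two tag-probability dicts (1 - cosine distance)."""
    keys = set(a) | set(b)
    num = sum(a.get(k, 0.0) * b.get(k, 0.0) for k in keys)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return num / (norm_a * norm_b) if norm_a and norm_b else 0.0


def most_similar(tag_probs, query_song, n=5):
    """Query by audio: rank the other songs by similarity to the query song's tag vector."""
    others = [s for s in tag_probs if s != query_song]
    others.sort(key=lambda s: cosine_similarity(tag_probs[query_song], tag_probs[s]),
                reverse=True)
    return others[:n]


# Placeholder tag probabilities (made up for illustration).
tag_probs = {
    "song_a": {"female lead vocals": 0.91, "piano": 0.40, "rock": 0.05},
    "song_b": {"female lead vocals": 0.85, "piano": 0.35, "rock": 0.10},
    "song_c": {"female lead vocals": 0.08, "piano": 0.02, "rock": 0.95},
    "song_d": {"female lead vocals": 0.60, "piano": 0.10, "rock": 0.70},
}

print(top_n_by_tag(tag_probs, "female lead vocals", n=2))
print(most_similar(tag_probs, "song_a", n=1))
```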

  19. Demo: Music Galaxy Hitchhiker (b) Search by Song mode with highlighted search results

  20. Content-based Music Recommendation
     - Blending audio and user data
       - Replace the text-based tags with the latent vector of a song
     - Figure labels: “user” and “song” axes of the matrix factorization from collaborative filtering; “Gangnam Style”’s latent vector; audio track of “Gangnam Style” [Oord et al., 2013]

  21. Music Retrieval Results
     - Figure panels: Collaborative Filtering only; Collaborative Filtering + Audio Content [Oord et al., 2013]

  22. Content-Based Analysis
     - Advantages
       - Free of the cold-start problem and popularity bias
       - Highly scalable with high-performance computing
       - Works for music in other media or user content as well
       - Can be combined with other approaches
     - Limitations
       - Social context is also important: indie, idol, affiliation
       - Does not account for music quality (e.g. level of performance), especially for user content

  23. Automatic Music Transcription (AMT)
     - Predict score information from audio
       - Note information: note onset, duration, velocity
       - Rhythm: tempo, beat, downbeat
       - Chord
       - Structure

  24. Zenph’s Re-performance

  25. Zenph’s Re-performance

  26. Entertainment / Education: Yousician

  27. Score-Audio Alignment
     - Temporally align audio and score
       - Dynamic time warping, using AMT results as audio features
     - Applications
       - Score following
       - Automatic page turning
       - Auto-accompaniment
       - Performance analysis
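Dynamic time warping (DTW) finds the lowest-cost monotonic mapping between two sequences of different lengths. The sketch below is the textbook formulation, not necessarily the variant used in the systems shown later; it assumes, purely for illustration, that both the score and the audio have been reduced to one MIDI pitch per step or frame (a stand-in for the AMT-based features the slide mentions).

```python
def dtw(a, b, dist=lambda x, y: abs(x - y)):
    """Return (total_cost, path) aligning sequence a to sequence b.

    path is a list of (index_in_a, index_in_b) pairs from start to end.
    """
    n, m = len(a), len(b)
    inf = float("inf")
    # cost[i][j] = best cost of aligning a[:i] with b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = dist(a[i - 1], b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j - 1], cost[i - 1][j], cost[i][j - 1])
    # Backtrack from the end, always moving to the cheapest predecessor.
    path = []
    i, j = n, m
    while i > 0 and j > 0:
        path.append((i - 1, j - 1))
        _, i, j = min(
            (cost[i - 1][j - 1], i - 1, j - 1),
            (cost[i - 1][j], i - 1, j),
            (cost[i][j - 1], i, j - 1),
        )
    path.reverse()
    return cost[n][m], path


# Toy example (made up): four score notes, eight audio frames of a slower performance.
score_pitches = [60, 62, 64, 65]
audio_pitches = [60, 60, 62, 62, 62, 64, 65, 65]

total_cost, path = dtw(score_pitches, audio_pitches)
# For each audio frame, the score note it is aligned to (the basis of score following).
score_position = [i for i, _ in path]
print(total_cost, score_position)
```

Applications such as automatic page turning need an online variant that processes audio as it arrives; this offline version only illustrates the alignment itself.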

  28. Automatic Page Turner (JKU, Austria)

  29. The Piano Music Companion (JKU, Austria)

  30. Sonation’s Cadenza

  31. Music Production https://www.youtube.com/watch?v=RmT6MDOD3uc

  32. Music Production
     - Adaptive audio effects: automatic effect control
       - Loudness
         - Compressor
       - Pitch
         - Pitch correction (e.g. Auto-Tune)
         - Harmonizer
       - Timbre
         - Genre-based automatic EQ
     - Example: Antares Auto-Tune

  33. Music Production
     - Singing expression transfer
       - Given two renditions of the same piece of music
       - Transfer singing expressions from one voice to the other
       - Expressions transferred: note timing, pitch, dynamics

  34. Singing Expression Transfer
     [Block diagram: the source singing voice is modified toward the target singing voice. Stages: temporal alignment, pitch alignment, dynamics alignment. Blocks: feature extraction, DTW, HPSS, pitch detector, smoothing, envelope detector, time-scale modification, pitch shifting, gain. Signals: harmonic signal, stretching ratio, smoothed pitch ratio, gain ratio.]

  35. Singing Expression Transfer: Demo Examples
     - Clips: source, all-modified source, target
     - Songs: 벚꽃엔딩 (“Cherry Blossom Ending”), “Let It Go”, 취중진담 (“Drunken Truth”)

  36. Music Production
     - Sound sample search
       - Imagine Research’s MediaMind: search for sound effect samples for media production (e.g. film, drama)
       - iZotope’s BreakTweaker: search for drum sounds with similar timbre

  37. Automatic Music Composition
     - Algorithmic composition
       - An area of generative art
     - Types of algorithms
       - Generative grammar
       - Transition network
       - Markov model
       - Genetic algorithms
       - Neural networks
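Of the algorithm types listed, the Markov model is the simplest to illustrate. The sketch below learns first-order note-to-note transitions from one training melody and samples a new melody from them. It is a toy illustration under stated assumptions (note names as states, a made-up training melody, hypothetical function names), not a description of any system mentioned in these slides.

```python
import random
from collections import defaultdict


def train_markov(melody):
    """Collect, for every note, the list of notes that followed it in the melody."""
    transitions = defaultdict(list)
    for current, following in zip(melody, melody[1:]):
        transitions[current].append(following)
    return transitions


def generate(transitions, start, length, seed=0):
    """Sample a melody by repeatedly choosing a successor of the last note.

    Successors seen more often in training are proportionally more likely,
    because they appear more often in the transition lists.
    """
    rng = random.Random(seed)
    out = [start]
    while len(out) < length:
        choices = transitions.get(out[-1])
        if not choices:
            break  # dead end: the last note never had a successor in training
        out.append(rng.choice(choices))
    return out


# Toy training melody (made up); it ends on a note that has successors,
# so generation never hits a dead end.
melody = ["C", "D", "E", "C", "E", "G", "E", "D", "C"]
transitions = train_markov(melody)
generated = generate(transitions, start="C", length=16)
print(" ".join(generated))
```

Every step in the generated melody is a transition that occurred in the training melody, which is why a first-order model imitates local style but has no sense of longer phrases or overall structure.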

  38. Automatic Music Composition
     - David Cope’s EMI (Experiments in Musical Intelligence) (1980s)
       - Based on style imitation
       - Augmented Transition Networks

  39. Recent Work: Automatic Music Composition
     - Flow Machines
       - Style imitation based on a Markov model
       - http://www.flow-machines.com/
     - Magenta
       - Python library based on deep neural networks (TensorFlow)
       - https://magenta.tensorflow.org/welcome-to-magenta

  40. “Daddy’s car”: Sony CSL Lab’s Flow Machines

  41. Automatic Music Composition
     - Background music generation: www.jukedeck.com

  42. Automatic Music Arrangement 쿨잼 (Cool Jamm) – Hum On

  43. Musical “Process” and “Data”
     [Diagram labels: “Musical” Knowledge Base; “Physical” Knowledge Base; Process; Data; Composer; Performer; Listener; Perception; Cognition; Symbolic Representation; Temporal Control; Instrument; Source Sound; Room; Sound Field]
