CTP431 - Music and Audio Computing: Music Information Retrieval (PowerPoint PPT Presentation)


SLIDE 1

CTP431- Music and Audio Computing Music Information Retrieval

Juhan Nam, Graduate School of Culture Technology, KAIST

SLIDE 2

Introduction

✓ Instrument: Piano
✓ Genre: Classical
✓ Composer: Chopin
✓ Key: E-minor
✓ Mood: Melancholy, Sad, …
✓ Songs with similar melody:

  • ELO “After all”
  • Radiohead “Exit Music”

✓ Can you transcribe the song into a music score?

SLIDE 3

Information in Music

§ Factual Information

– track, artist, year, composer

§ Musical Information

– Music score: instrument, notes, meter, expressions
– Melody, rhythm, chords, structure

§ Semantic Information

– genre, mood, text descriptions

SLIDE 4

Music Understanding by Human

[Figure source: http://www.slideshare.net/Daritsetseg/brainstem-auditory-evoked-responses-baer-or-abr-45762118]

SLIDE 5

Music Understanding by Computer

§ Music Information Retrieval (MIR)

– An area of research that aims to infer various types of information from music using computers

SLIDE 6

Applications of MIR

§ Music listening

– Music identification, search and recommendation

§ Music Performance

– Interactive music performance
– Musical instrument learning

§ Music composition

– Automatic composition and arrangement

§ Entertainment

– Singing evaluation, game

§ Sound production

– Sound sample search in sound libraries
– Automatic segmentation and digital audio effects

SLIDE 7

Background

§ Scale and diversity of music contents

– Commercial music tracks

  • Spotify: 30M+ songs (2015)
  • Bugs music: 10M+ songs (2017)

– User contents

  • YouTube: 300h+ video uploaded per min (2015)
  • SoundCloud: 12h+ audio uploaded per minute (2014)

– User data

  • Profile, play history, ratings, …
  • Spotify: 24M+ active users (as of Jan 2014)
  • YouTube: 1B+ unique users visit each month (as of Dec 2014)

§ All this music content is readily accessible.

– How can we find music that matches my taste?
– Can we have a Google for music?

SLIDE 8

Music Identification

§ Query by music

– Search for the single unique song identified by the query
– Audio fingerprinting

Example: Shazam (audio fingerprinting; http://labrosa.ee.columbia.edu/matlab/fingerprint/)
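Shazam-style audio fingerprinting can be sketched in a few lines: keep the strongest spectral peaks per frame, hash pairs of nearby peaks, and count hash collisions between a query and the database. A toy numpy sketch; the frame sizes, fan-out, and test signals below are all invented for illustration:

```python
import numpy as np

def spectral_peaks(x, n_fft=256, hop=128, top_k=3):
    """Toy landmark extraction: keep the top_k strongest frequency bins per frame."""
    peaks = []
    for t, i in enumerate(range(0, len(x) - n_fft, hop)):
        mag = np.abs(np.fft.rfft(x[i:i + n_fft] * np.hanning(n_fft)))
        for f in np.argsort(mag)[-top_k:]:
            peaks.append((t, int(f)))
    return peaks

def hash_landmarks(peaks, fan_out=3):
    """Pair each peak with a few later peaks; (f1, f2, dt) is the fingerprint hash."""
    hashes = set()
    for i, (t1, f1) in enumerate(peaks):
        for t2, f2 in peaks[i + 1:i + 1 + fan_out]:
            hashes.add((f1, f2, t2 - t1))
    return hashes

# A noisy excerpt of a recording shares many hashes with it; an unrelated
# signal shares almost none. That collision count is the "match".
sr = 8000
t = np.arange(sr) / sr
song = np.sin(2 * np.pi * 440 * t) + 0.5 * np.sin(2 * np.pi * 880 * t)
query = song[2000:6000] + 0.01 * np.random.default_rng(0).standard_normal(4000)
other = np.sin(2 * np.pi * 523 * t)

h_song = hash_landmarks(spectral_peaks(song))
h_query = hash_landmarks(spectral_peaks(query))
h_other = hash_landmarks(spectral_peaks(other))
match = len(h_song & h_query)
nonmatch = len(h_song & h_other)
```

Real systems (e.g. the labrosa implementation linked above) quantize the peaks more carefully and verify time-offset consistency of the matched hashes before declaring a hit.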

SLIDE 9

Music Identification

§ Query by humming

– Hum the melody and find the closest matches
– Melody-based matching

Example: SoundHound (melody extraction)

SLIDE 10

Music Search and Recommendation

§ Music Recommendation

– Playlist generation: personalized internet radio
– Matching songs to users

  • Song information: genre, years, artist, audio
  • User information: profile, play history, rating, context (places)

– Commercial music services: Google, Apple, Pandora, Spotify, Melon, Bugs, …

Examples: Pandora, iTunes Music

SLIDE 11

Current Approaches

§ Manual Curation
§ Human Expert Analysis
§ Collaborative Filtering
§ Content-based Analysis (by computers)

SLIDE 12

Manual Curation

§ Playlist generation by music experts (or users)

– Traditional: AM/FM radio
– The majority of current music services are based on this approach

§ Advantages

– Effective for usage-based music services (workout, study, driving, or prenatal education)
– Good for music discovery
– Often with storytelling

§ Limitations

– No personalization
– Not scalable

[www.soribada.com]

SLIDE 13

Human Expert Analysis

§ Pandora: music genome project (1999)

– Musicologists analyze each song for about 450 musical attributes in various categories
– Big success as a music service

§ Advantages

– High-quality analysis – Good for music discovery

§ Limitations

– Expensive: it takes 20–30 minutes to analyze a song
– Not scalable: only covers commercial tracks

SLIDE 14

Collaborative Filtering (CF)

§ Basic idea

– Person A: I like songs A, B, C and D.
– Person B: I like songs A, B, C and E.
– Person A: Really? You should check out song D.
– Person B: Wow, you also should check out song E.

§ Formulation

– Matrix factorization (or matrix completion) problem

– Example: the user (“Juhan”) has a latent vector x_u; the song (“Gangnam Style”) has a latent vector y_s

– Song preference: p_us = x_u^T y_s
– User similarity: q_{u1,u2} = x_{u1}^T x_{u2}
– Song similarity: r_{s1,s2} = y_{s1}^T y_{s2}
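The formulation can be sketched with a truncated SVD of a toy play-count matrix. This is an illustration only: the matrix is invented, and production systems fit the factors with ALS or SGD over the observed entries rather than a full SVD.

```python
import numpy as np

# Hypothetical play-count matrix R (rows: users, columns: songs).
# Users 0 and 1 like the same songs; user 2 has a different taste.
R = np.array([[5., 4., 0., 1.],
              [4., 5., 1., 0.],
              [0., 1., 5., 4.]])

# Rank-2 factorization R ~ X @ Y.T via truncated SVD.
U, s, Vt = np.linalg.svd(R, full_matrices=False)
k = 2
X = U[:, :k] * np.sqrt(s[:k])       # user latent vectors x_u (one per row)
Y = Vt[:k, :].T * np.sqrt(s[:k])    # song latent vectors y_s (one per row)

p = X @ Y.T   # song preference  p_us      = x_u^T  y_s
q = X @ X.T   # user similarity  q_{u1,u2} = x_u1^T x_u2
r = Y @ Y.T   # song similarity  r_{s1,s2} = y_s1^T y_s2
```

Here q[0, 1] comes out larger than q[0, 2], matching the intuition that users 0 and 1 share a taste.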

SLIDE 15

Collaborative Filtering

§ Advantages

– Captures the semantics of music as perceived by humans
– Enables personalized recommendation (by nature)

§ Limitations

– The cold-start problem: what if a song has never been played by anyone?
– Popularity bias: likely to recommend already well-known songs, or songs from the same musician or album

SLIDE 16

Collaborative Filtering

§ Bad examples

Can you find songs similar to this musician? [Oord et al., 2013]

SLIDE 17

Content-Based Analysis

§ An intelligent approach that makes computers listen to music and predict descriptive words from audio tracks

– Tags: genre, mood, instrument, voice quality, usage
– Features: spectrogram, MFCC, …
– Algorithms: GMM, SVM, neural networks

[Pipeline: audio files → audio features → algorithms]
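As a minimal stand-in for the audio-files-to-audio-features step, here is a log-magnitude STFT extractor in plain numpy; a real front end would use a mel filterbank or MFCCs (e.g. via librosa, not used here to keep the sketch dependency-free):

```python
import numpy as np

def log_spectrogram(x, n_fft=512, hop=256):
    """Log-magnitude STFT: the simplest 'audio feature' matrix,
    shaped (n_frames, n_fft // 2 + 1)."""
    win = np.hanning(n_fft)
    frames = np.stack([x[i:i + n_fft] * win
                       for i in range(0, len(x) - n_fft, hop)])
    return np.log1p(np.abs(np.fft.rfft(frames, axis=1)))

# A 1 kHz test tone should light up FFT bin 1000 * n_fft / sr = 32.
sr = 16000
tone = np.sin(2 * np.pi * 1000 * np.arange(sr) / sr)
S = log_spectrogram(tone)
peak_bin = int(S.mean(axis=0).argmax())
```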

SLIDE 18

SLIDE 19

Text-based Music Retrieval by Auto-tagging

§ Sort songs by the probability of the query tag and choose the top-N

– Like text-based Google search

§ We also can compute similarity between songs using the estimated tag probabilities

– E.g. cosine distance between two tag probability vectors
– Applicable to query by audio

Query word: “Female Lead Vocals”

Top 5 ranked songs:
1. Norah Jones – Don’t Know Why
2. Dido – Here with Me
3. Sheryl Crow – I Shall Believe
4. No Doubt – Simple Kind of Life
5. Carpenters – Rainy Days and Mondays
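The similarity computation described above, comparing estimated tag-probability vectors, can be sketched as follows; the tag vocabulary and the probabilities are invented for illustration:

```python
import numpy as np

def cosine_similarity(a, b):
    """Cosine similarity between two tag-probability vectors (1 = same direction)."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical auto-tagger outputs over tags [rock, jazz, female vocal, piano].
song_a = np.array([0.1, 0.7, 0.1, 0.8])   # jazz piano
song_b = np.array([0.2, 0.6, 0.2, 0.7])   # also jazz piano
song_c = np.array([0.9, 0.1, 0.1, 0.1])   # rock

sim_ab = cosine_similarity(song_a, song_b)
sim_ac = cosine_similarity(song_a, song_c)
```

For query-by-text, one would instead sort all songs by the probability of the query tag and return the top N.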

SLIDE 20

Demo: Music Galaxy Hitchhiker

(b) Search by Song mode with highlighted search results

SLIDE 21

Content-based Music Recommendation

§ Blending audio and user data

– Replace the text-based tags with the latent vector of a song

[Diagram: the audio track of “Gangnam Style” is fed to a model that predicts the song’s latent vector, obtained from matrix factorization in collaborative filtering]

[Oord et al., 2013]
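The blending step can be viewed as regression from audio features to the CF latent vectors, so that unplayed songs (the cold-start case) also get a vector. Oord et al. use a deep convolutional network; the sketch below substitutes a linear least-squares fit on synthetic data just to show the shape of the problem (all sizes and data are made up):

```python
import numpy as np

rng = np.random.default_rng(0)
n_songs, n_feat, k = 200, 10, 4

# Synthetic stand-ins: audio features per song, and CF latent vectors that
# happen to depend (near-)linearly on them.
A = rng.standard_normal((n_songs, n_feat))
W_true = rng.standard_normal((n_feat, k))
Y = A @ W_true + 0.01 * rng.standard_normal((n_songs, k))

# Fit the audio-to-latent map (the paper's convnet, reduced to least squares).
W, *_ = np.linalg.lstsq(A, Y, rcond=None)

# A brand-new song nobody has played still gets a latent vector from its audio.
new_song_audio = rng.standard_normal(n_feat)
predicted_latent = new_song_audio @ W
err = np.linalg.norm(W - W_true) / np.linalg.norm(W_true)
```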

SLIDE 22

Music Retrieval Results

[Retrieval results: collaborative filtering only vs. collaborative filtering + audio content]

[Oord et al., 2013]

SLIDE 23

Content-Based Analysis

§ Advantages

– Free of cold-start and popularity bias
– Highly scalable using high-performance computing
– Also works for music in other media and for user content
– Can be combined with other approaches

§ Limitations

– Social context is also important: indie, idol, affiliation
– Does not account for musical quality (e.g. level of performance), especially for user content

SLIDE 24

Automatic Music Transcription (AMT)

§ Predict score information from audio

– Note information: note onset, duration, velocity
– Rhythm: tempo, beat, down-beat
– Chord
– Structure

SLIDE 25

Zenph’s Re-performance

SLIDE 26

Zenph’s Re-performance

SLIDE 27

Entertainment / Education


Yousician

SLIDE 28

Score-Audio Alignment

§ Temporally align audio and score

– Dynamic time warping, using AMT results as audio features

§ Applications

– Score following
– Automatic page turning
– Auto-accompaniment
– Performance analysis
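The alignment core is ordinary dynamic time warping. A minimal version that aligns a score-derived pitch sequence to a performance in which notes are held longer (the MIDI pitch sequences are invented):

```python
import numpy as np

def dtw(a, b):
    """Dynamic time warping: returns the total alignment cost and the
    warping path as (index in a, index in b) pairs."""
    n, m = len(a), len(b)
    D = np.full((n + 1, m + 1), np.inf)
    D[0, 0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            D[i, j] = abs(a[i - 1] - b[j - 1]) + min(
                D[i - 1, j - 1], D[i - 1, j], D[i, j - 1])
    # Backtrack the optimal path from (n, m) down to (0, 0).
    path, i, j = [], n, m
    while (i, j) != (0, 0):
        path.append((i - 1, j - 1))
        i, j = min([(i - 1, j - 1), (i - 1, j), (i, j - 1)],
                   key=lambda c: D[c])
    return float(D[n, m]), path[::-1]

# Score pitches vs. a performance that sustains some notes (MIDI numbers).
score = [60, 62, 64, 65]
performance = [60, 60, 62, 64, 64, 64, 65]
cost, path = dtw(score, performance)
```

For score following the same recursion is run incrementally (online DTW), since the end of the performance is not yet known.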

SLIDE 29

Automatic Page Turner (JKU, Austria)

SLIDE 30

The Piano Music Companion (JKU, Austria)

SLIDE 31

Sonation’s Cadenza

SLIDE 32

Music Production

https://www.youtube.com/watch?v=RmT6MDOD3uc

SLIDE 33

Music Production

§ Adaptive Audio Effects: automatic effect control

– Loudness

  • Compressor

– Pitch

  • Pitch correction (e.g. auto-tune)
  • Harmonizer

– Timbre

  • Genre-based automatic EQ

Antares Auto-tune
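For the loudness case, a compressor reduces gain once the signal level exceeds a threshold. The sketch below shows the static compressor curve only; a real compressor adds attack/release envelope smoothing, and the parameter values here are arbitrary:

```python
import numpy as np

def compress(x, threshold_db=-20.0, ratio=4.0):
    """Static compression: above the threshold, level growth in dB is
    divided by `ratio`; below it, the signal passes unchanged."""
    eps = 1e-12
    level_db = 20.0 * np.log10(np.abs(x) + eps)
    over_db = np.maximum(level_db - threshold_db, 0.0)
    gain_db = -over_db * (1.0 - 1.0 / ratio)
    return x * 10.0 ** (gain_db / 20.0)

loud = np.array([1.0])    # 0 dBFS: 20 dB over threshold -> 15 dB of gain reduction
quiet = np.array([0.05])  # about -26 dBFS: below threshold, left untouched
y_loud = compress(loud)
y_quiet = compress(quiet)
```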

SLIDE 34

Music Production

§ Singing Expression Transfer

– Given two renditions of the same piece of music
– Transfer singing expressions from one voice to the other

Note timing, Pitch, Dynamics

SLIDE 35

Singing Expression Transfer

[Pipeline diagram: the source and target singing voices go through feature extraction (DTW with smoothing, HPSS, an envelope detector, and a pitch detector) to estimate a stretching ratio, a pitch ratio, and a gain ratio; time-scale modification, pitch shifting, and gain are then applied to the source's harmonic signal to produce the modified singing voice, achieving temporal alignment, pitch alignment, and dynamics alignment]

SLIDE 36

Singing Expression Transfer: Demo Examples

[Audio demos (source, target, and modified source) for 벚꽃엔딩 (“Cherry Blossom Ending”), Let it go, and 취중진담 (“Drunken Truth”)]

SLIDE 37

Music Production

§ Sound Sample search

– Imagine Research’s MediaMind: search for sound-effect samples for media production (e.g. film, drama)
– iZotope’s BreakTweaker: search for drum sounds with similar timbre

SLIDE 38

Automatic Music Composition

§ Algorithmic Composition

– An area of generative art

§ Types of Algorithms

– Generative Grammar
– Transition Network
– Markov Model
– Genetic Algorithms
– Neural Networks
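The Markov-model approach, the simplest on this list, fits in a few lines: count note-to-note transitions in a corpus, then random-walk through the table. The training melodies below are made up:

```python
import random
from collections import defaultdict

def train_markov(melodies):
    """First-order Markov model: count note-to-note transitions."""
    counts = defaultdict(lambda: defaultdict(int))
    for melody in melodies:
        for a, b in zip(melody, melody[1:]):
            counts[a][b] += 1
    return counts

def generate(counts, start, length, rng):
    """Sample a melody by walking the transition table (style imitation in miniature)."""
    note, out = start, [start]
    for _ in range(length - 1):
        options, weights = zip(*counts[note].items())
        note = rng.choices(options, weights=weights)[0]
        out.append(note)
    return out

# Tiny hypothetical corpus of MIDI note numbers (C major noodling).
corpus = [[60, 62, 64, 62, 60], [64, 62, 60, 62, 64], [60, 64, 62, 60]]
model = train_markov(corpus)
melody = generate(model, start=60, length=16, rng=random.Random(7))
```

Every note in this corpus has at least one outgoing transition, so the walk never dead-ends; a real system would also handle terminal states and longer contexts.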

SLIDE 39

Automatic Music Composition

§ David Cope’s EMI (Experiments in Music Intelligence) (1980s)

– Based on Style Imitation

Augmented Transition Networks

SLIDE 40

Recent Work: Automatic Music Composition

§ Flow Machines

– Style Imitation based on Markov Model – http://www.flow-machines.com/

§ Magenta

– Python library based on deep neural networks (TensorFlow)

SLIDE 41

“Daddy’s car”: Sony CSL Lab’s Flow Machines

SLIDE 42

Automatic Music Composition

§ Background Music Generation: www.jukedeck.com

SLIDE 43

Automatic Music Arrangement

쿨잼(Cool Jamm) – Hum On

SLIDE 44

Musical “Process” and “Data”

[Diagram: the musical “process” (composer → performer → instrument → source sound → sound field/room → listener → perception/cognition) and its “data” (symbolic representation, temporal control, source sound), grounded in a “musical” and a “physical” knowledge base]

SLIDE 45

Music Technology: The Present

[Same “process” and “data” diagram as on slide 44]

SLIDE 46

Music Technology: The Future

[Same “process” and “data” diagram as on slide 44]