SLIDE 1
Music Information Retrieval and Music Emotion Recognition
Yi-Hsuan Yang Ph.D.
http://www.citi.sinica.edu.tw/pages/yang/ yang@citi.sinica.edu.tw 2014
Music & Audio Computing Lab,
Research Center for IT Innovation, Academia Sinica
SLIDE 2 About Me & CITI, AS
- Yi-Hsuan Yang, Ph.D., Assistant Research Fellow
- Education
- Ph.D. in GICE, National Taiwan University, 2006-2010
- B.S. in EE, National Taiwan University, 2002-2006
Music information retrieval, multimedia applications, and machine learning
- Research Center for IT Innovation, Academia Sinica
- Music and Audio Computing Lab
- Since 2011/09
- Research assistants
- PhD students
- Postdocs
- Industrial collaborations: KKBOX, HTC, iKala
SLIDE 3 Outline
- What is and why music information retrieval?
- Current projects
- Example project: music and emotion
SLIDE 4
Digital Music Industry
SLIDE 5 Proliferation of Mobile Devices
- 1.5 billion handsets were sold in 2011
- 1/3 of them are smartphones
- 6 billion mobile-cellular subscriptions
Chart: mobile behaviors related to multimedia (watched video, listened to music, social networking, recorded video, played games, took photos) in Japan, Europe, and the United States. Statistics from ITU.
SLIDE 6 Music Information Retrieval
- User need: find the “right” song
- For a specific listening context (in a car, before sleep)
- For a specific mood (feeling down, feeling angry)
- For a specific event (wedding, party)
- For accompanying a video (home video, movie)
- Current solution
- Manual
- Keyword search
- Social recommendation
SLIDE 7
“Smart” Content-Based Retrieval
Diagram: music audio → music content analysis (e.g., similarity estimation) → content-based retrieval; applications include recommendation and query by humming.
SLIDE 8
Demos
Pop Danthology 2012 – Mashup of 50+ Pop Songs
SLIDE 9 Scope of MIR
- Content-based music analysis
  - Timbre, rhythm, pitch, harmony, tonality
  - Melody transcription, audio-to-score alignment
  - Source separation
- Content-based music retrieval
  - Query by example / singing / humming / tapping
  - Fingerprinting and digital rights management
  - Recommendation, personalized playlist generation
  - Summarization, structure analysis
- Metadata-based retrieval
  - Genre, style, and mood analysis
SLIDE 10 Scope of MIR (Cont’d)
- By nature inter-disciplinary: machine learning, signal processing, computer science, information science, psychology, musicology, human-computer interaction
SLIDE 11 Current Projects 1/4: Music Emotion
- Music retrieval and organization by “emotion”
- Music is created to convey and modulate emotions
- The most important functions of music are social and psychological (Huron, 2000)
SLIDE 12 Current Projects 2/4: Listening Context
- On-device music feature extraction combined with mobile phone sensing
- Sensors: accelerometer, ambient light, compass, dual cameras, GPS, gyroscope, microphone, proximity, running apps, time, Wi-Fi
SLIDE 13 Current Projects 3/4: Singing Voice Separation
- Useful for modeling singing voice timbre, instrument identification, and melody transcription
SLIDE 14
Current Projects 4/4: Musical Timbre
SLIDE 15
Focus: Emotion-based Recognition & Retrieval
- Activation (arousal): energy or neurophysiological stimulation level
- Evaluation (valence): pleasantness; positive and negative affective states
[psp80]
SLIDE 16 Music Retrieval in the Emotion Space
- Automatic prediction of music emotion
  - No need of human labeling
  - Scalable
  - Easy to personalize/update
- Emotion-based music retrieval / recommendation
  - Content-based
  - Intuitive
  - Fun
Figure: songs plotted in the emotion plane of valence (positive or negative) vs. activation (energy level).
⊳ Demo
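Retrieval in this 2-D emotion space reduces to nearest-neighbor search over each song's predicted (valence, activation) coordinates. A minimal sketch, where the song names and coordinates are entirely hypothetical:

```python
import numpy as np

# Hypothetical catalog: each song mapped to a predicted
# (valence, activation) point in [-1, 1] x [-1, 1].
library = {
    "song_a": (0.8, 0.6),    # positive, high energy
    "song_b": (-0.7, 0.5),   # negative, high energy
    "song_c": (0.5, -0.6),   # positive, low energy
    "song_d": (-0.6, -0.5),  # negative, low energy
}

def retrieve(query_va, library, k=2):
    """Return the k songs whose emotion points lie closest to the query."""
    names = list(library)
    points = np.array([library[n] for n in names])
    dists = np.linalg.norm(points - np.asarray(query_va), axis=1)
    return [names[i] for i in np.argsort(dists)[:k]]

# A click near the "positive / energetic" corner of the plane:
print(retrieve((0.7, 0.5), library))
```

Because the query is just a point in the plane, personalization amounts to shifting or re-weighting distances rather than re-labeling songs.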
SLIDE 17 Learning to Predict Music Emotion
- Learn the mapping between ground truth and
feature using pattern recognition algorithms
Pipeline: training data (multimedia signal) → feature extraction (features) + manual annotation (ground truth) → model training → model; test data → feature extraction → automatic prediction → emotion estimate.
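The pipeline above can be sketched end to end with a toy regressor. This is not the model used in the slides; it is a plain ridge-regression stand-in, and all features and annotations below are synthetic:

```python
import numpy as np

rng = np.random.default_rng(0)

# Stand-ins for the training data: 100 clips, 20 audio features each,
# each clip manually annotated with a valence score (all synthetic).
X_train = rng.normal(size=(100, 20))
true_w = rng.normal(size=20)                      # hidden "ground truth" mapping
y_train = X_train @ true_w + 0.1 * rng.normal(size=100)

# Model training: ridge regression (closed form).
lam = 1.0
w = np.linalg.solve(X_train.T @ X_train + lam * np.eye(20),
                    X_train.T @ y_train)

# Automatic prediction on unseen test clips.
X_test = rng.normal(size=(10, 20))
y_pred = X_test @ w
```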
SLIDE 18 Audio Feature Analysis
SLIDE 19 Short-Time Fourier Transform and Spectrogram
- Time domain: energy, rhythm
- Frequency domain: pitch, harmonics, timbre
Figure: time-domain waveform and time-frequency spectrogram.
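The time-to-frequency step can be sketched with SciPy's STFT on a synthetic tone; the sample rate and window length here are arbitrary illustrative choices:

```python
import numpy as np
from scipy.signal import stft

fs = 22050                           # sample rate (Hz)
t = np.arange(fs) / fs               # one second of audio
# Toy signal: a 440 Hz tone, with an 880 Hz harmonic in the second half.
x = np.sin(2 * np.pi * 440 * t) + (t > 0.5) * 0.5 * np.sin(2 * np.pi * 880 * t)

# Short-time Fourier transform: window the signal and FFT each frame.
f, frame_times, Z = stft(x, fs=fs, nperseg=1024)
spectrogram = np.abs(Z) ** 2         # power per (frequency, time) bin

# Dominant frequency in an early frame lies near 440 Hz.
peak_hz = f[spectrogram[:, 2].argmax()]
```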
SLIDE 20 Timbre
- The perceptual attribute that makes two sounds with the same pitch and loudness sound different
- Temporal attack-decay envelope
- Spectral shape
Figure: (a) flute, (b) clarinet.
SLIDE 21 Spectral Timbre Features
- Widely used in all kinds of MIR tasks
- Spectral centroid (brightness)
- Spectral rolloff
  - The frequency below which 85% of the spectral power is concentrated
- Spectral flux
  - Amount of frame-to-frame spectral amplitude difference (local change)
- Spectral flatness
  - Whether the spectral power is concentrated in a few bins or spread out
- Mel-frequency cepstral coefficients (MFCC)
- Vibrato
Figure: Mel spectrum.
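These frame-wise descriptors fall out directly from a magnitude spectrogram. A minimal NumPy sketch (the 85% rolloff threshold follows the slide; the function name is my own):

```python
import numpy as np

def spectral_features(mag, freqs):
    """mag: (n_bins, n_frames) magnitude spectrogram; freqs: (n_bins,) in Hz."""
    power = mag ** 2
    total = power.sum(axis=0) + 1e-12          # avoid division by zero
    # Centroid: power-weighted mean frequency ("brightness").
    centroid = (freqs[:, None] * power).sum(axis=0) / total
    # Rolloff: lowest frequency below which 85% of the power is concentrated.
    cum = np.cumsum(power, axis=0) / total
    rolloff = freqs[(cum >= 0.85).argmax(axis=0)]
    # Flux: frame-to-frame change in spectral amplitude.
    flux = np.sqrt((np.diff(mag, axis=1) ** 2).sum(axis=0))
    return centroid, rolloff, flux
```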
SLIDE 22
Pitch
SLIDE 23
Extension 1: Time-varying Prediction
Application to video content understanding
SLIDE 24 Extension 2: Affect-Based MV Composition
- Audio
- Sound energy
- Tempo and beat strength
- Rhythm regularity
- Pitch
- Video
- Lighting key
- Shot change rate
- Motion Intensity
- Color (saturation, color energy)
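Two of the simpler descriptors above can be sketched in NumPy. Both helper names and the shot-cut threshold are my own illustrative choices, not the actual implementation behind the slides:

```python
import numpy as np

def sound_energy(x, frame_len=1024):
    """RMS energy per non-overlapping audio frame."""
    n = len(x) // frame_len
    frames = x[: n * frame_len].reshape(n, frame_len)
    return np.sqrt((frames ** 2).mean(axis=1))

def shot_change_rate(frames, threshold=30.0):
    """Fraction of consecutive video frames whose mean absolute
    pixel difference exceeds a cut threshold; frames: (n, h, w) grayscale."""
    diffs = np.abs(np.diff(frames.astype(float), axis=0)).mean(axis=(1, 2))
    return (diffs > threshold).mean()
```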
SLIDE 25 Demos
- ACM MM 2012 Multimedia Grand Challenge First Prize
- “The Acousticvisual Emotion Gaussians model for automatic generation of music video,” J.-C. Wang, Y.-H. Yang, I.-H. Jhuo, Y.-Y. Lin, and H.-M. Wang
- Music → video
- Video → music
SLIDE 26 Extension 3: User Mood & Music Emotion
- In addition to blog writing, users
- enter an emotion tag (user mood)
- enter a song title & artist name (music emotion)
SLIDE 27
Mood-Congruent or Mood-Incongruent
SLIDE 28
Pipeline: training data (multimedia signal) → feature extraction + manual annotation → model training → emotion value; test data → feature extraction → automatic prediction → emotion value → emotion-based recommendation; personalization via user feedback and human affect/activity detection (e.g., facial expression, speech intonation).
- Melody
- Timbre
- Dynamics
- Rhythm
- Lyrics
Emotion-Based Music Recommendation
SLIDE 29 Wrap-Up
- Introduction of the field ‘Music information retrieval’
- Music signal analysis
- Query by example (humming, similarity)
- Query by text (genre, emotion)
- Current projects at our lab
- Context & listening behavior
- Source separation
- Modeling musical timbre
- Music and emotion
  - 2-D visualization
  - Time-varying prediction
  - Emotion-based music video composition
  - Music emotion and user mood; emotion-based recommendation
SLIDE 30
- Int. Society for Music Information Retrieval (ISMIR)
- General chairs: Jyh-Shing Roger Jang (NTU) et al.
- Program chairs: Yi-Hsuan Yang (Academia Sinica) et al.
- Music chairs: Jeff Huang (Kainan University) et al.
ISMIR/WOCMAT 2014 Main Theme – “Oriental Thinking” (Due: June 1, 2014)
SLIDE 31 MIREX (MIR Evaluation eXchange)
- Audio Classification (Train/Test)
Tasks
- Audio K-POP Genre Classification
- Audio K-POP Mood Classification
- Audio Tag Classification
- Audio Music Similarity and Retrieval
- Symbolic Melodic Similarity
- Structural Segmentation
- Audio Tempo Estimation
- Audio Onset Detection
- Audio Beat Tracking
- Audio Key Detection
- Multiple Fundamental Frequency Estimation & Tracking
- Real-time Audio to Score Alignment (a.k.a. Score Following)
- Audio Cover Song Identification
- Discovery of Repeated Themes & Sections
- Audio Melody Extraction
- Query by Singing/Humming
- Query by Tapping
- Audio Chord Estimation