Sound Event Detection in Multisource Environments Using Source - PowerPoint PPT Presentation

1 Sound Event Detection in Multisource Environments Using Source Separation Toni Heittola 1 , Annamaria Mesaros 1 , Tuomas Virtanen 1 , Antti Eronen 2 1 Tampere University of Technology, Department of Signal Processing 2 Nokia Research Center Tampere 1st September, 2011

2 Sound event detection • Aims at detecting acoustic events in an audio signal • Predefined event classes = supervised classification • Estimate the start and end time of each event

3 Environmental audio data • Audio from everyday environments: street, office, grocery store, in a car, etc. • Application areas of environmental sound event detection: context-aware devices, automatic annotation of videos

4 Outline of the presentation • Sound event detection and environmental audio data • Monophonic detection system • Sound source separation based polyphonic detection system • Evaluation & demonstration

5 Monophonic event detection system A. Mesaros, T. Heittola, A. Eronen, T. Virtanen. Acoustic event detection in real life recordings. In proc. EUSIPCO 2010. • HMM classifier • 61 event classes: (e.g. speech, music, beep, car, car door, bird, dog barking, footsteps, keyboard, coughing…) • Each class modeled with a 3-state HMM (16 Gaussians per state, MFCC features). • Train model for each event class separately using audio segments that are annotated to include the event

6 Monophonic event detection system A. Mesaros, T. Heittola, A. Eronen, T. Virtanen. Acoustic event detection in real life recordings. In proc. EUSIPCO 2010. • To model the whole signal, any event is allowed to follow any event

7 Output of the monophonic system • The output is a sequence of non-overlapping events
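The decoding idea behind slides 5 to 7 can be sketched in a few lines. This is a simplified illustration, not the authors' implementation: each event class is reduced to a single state (the slides describe 3-state HMMs with 16 Gaussians per state), the per-frame log-likelihoods are taken as given, and the names `viterbi_events`, `to_segments` and the `p_stay` parameter are hypothetical. Because any event may follow any other, the transition model is fully connected, and the Viterbi path yields one label per frame, i.e. a non-overlapping event sequence.

```python
import math

def viterbi_events(frame_loglik, labels, p_stay=0.9):
    """Most likely label per frame; frame_loglik[t][k] is the
    log-likelihood of frame t under event class k (assumed given)."""
    n = len(labels)
    log_stay = math.log(p_stay)
    # Any event may follow any other: the switch mass is shared equally.
    log_switch = math.log((1.0 - p_stay) / (n - 1))
    score = list(frame_loglik[0])  # uniform prior, constant omitted
    back = []
    for obs in frame_loglik[1:]:
        new_score, pointers = [], []
        for k in range(n):
            best_j = max(
                range(n),
                key=lambda j: score[j] + (log_stay if j == k else log_switch),
            )
            trans = log_stay if best_j == k else log_switch
            new_score.append(score[best_j] + trans + obs[k])
            pointers.append(best_j)
        score = new_score
        back.append(pointers)
    # Backtrack from the best final state.
    state = max(range(n), key=lambda k: score[k])
    path = [state]
    for pointers in reversed(back):
        state = pointers[state]
        path.append(state)
    path.reverse()
    return [labels[k] for k in path]

def to_segments(path, hop=1.0):
    """Collapse a per-frame label path into (label, start, end) events."""
    segments, start = [], 0
    for t in range(1, len(path) + 1):
        if t == len(path) or path[t] != path[start]:
            segments.append((path[start], start * hop, t * hop))
            start = t
    return segments

labels = ["speech", "car"]
frame_loglik = [
    [-1.0, -5.0], [-1.0, -5.0], [-2.5, -2.0], [-1.0, -5.0],
    [-5.0, -1.0], [-5.0, -1.0], [-5.0, -1.0],
]
events = to_segments(viterbi_events(frame_loglik, labels))
```

In the toy input, the third frame slightly favours "car", but the cost of switching twice outweighs that gain, so the decoder keeps a single "speech" segment before moving to "car".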

8 Non-negative spectrogram factorization based signal separation • One-channel input signal is separated into multiple tracks • NMF-based separation: the magnitude spectrogram matrix is represented as a product of two non-negative matrices • Represents the signal as a sum of components, each having a fixed spectrum and a time-varying gain • Unsupervised separation: no prior knowledge about the sounds
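A minimal sketch of the factorization on slide 8, assuming NumPy is available. The slide does not state which cost function or update rule was used; the code below assumes the Euclidean cost with the standard multiplicative updates, and the function name `nmf` and the toy spectrogram are illustrative only. The columns of `W` are the fixed spectra and the rows of `H` are the time-varying gains.

```python
import numpy as np

def nmf(V, n_components, n_iter=200, eps=1e-9, seed=0):
    """Factorize a non-negative matrix V (freq x frames) as W @ H.

    Assumption: Euclidean cost with multiplicative updates; the slides
    do not specify the actual cost function.
    """
    rng = np.random.default_rng(seed)
    n_freq, n_frames = V.shape
    W = rng.random((n_freq, n_components)) + eps
    H = rng.random((n_components, n_frames)) + eps
    for _ in range(n_iter):
        # Multiplicative updates keep W and H non-negative.
        H *= (W.T @ V) / (W.T @ W @ H + eps)
        W *= (V @ H.T) / (W @ H @ H.T + eps)
    return W, H

# Toy "spectrogram": two sources with fixed spectra and varying gains.
spectra = np.array([[1.0, 0.0], [0.5, 0.0], [0.0, 1.0], [0.0, 0.8]])
gains = np.array([[1.0, 0.0, 2.0, 0.0, 1.0],
                  [0.0, 1.5, 0.0, 1.0, 1.0]])
V = spectra @ gains

W, H = nmf(V, n_components=2)
# Each component is one spectrum times its gain curve; the components
# sum to the model of the full spectrogram.
components = [np.outer(W[:, k], H[k]) for k in range(2)]
```

How components are grouped into the separated tracks is not described on the slide, so the sketch stops at the component level.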

9 Example of separated signals: kitchen

10 Example of separated signals: basketball game

11 Polyphonic event detection system • Separation is used as a preprocessing step • The monophonic recognizer is applied to each separated track independently -> events obtained from the tracks are combined • Training: all the tracks are pooled into the training data of an event
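One way the per-track outputs could be combined is sketched below. The slide only says that events from the tracks are combined; merging overlapping detections of the same class is an assumption made here for illustration, and `combine_track_events` is a hypothetical name. Unlike the monophonic output, the combined list may contain events of different classes that overlap in time.

```python
def combine_track_events(tracks):
    """Combine per-track event lists into one polyphonic event list.

    tracks: list of event lists, each event a (label, start, end) tuple.
    Assumption: overlapping detections of the same class are merged.
    """
    by_label = {}
    for events in tracks:
        for label, start, end in events:
            by_label.setdefault(label, []).append((start, end))

    combined = []
    for label, spans in by_label.items():
        spans.sort()
        cur_start, cur_end = spans[0]
        for start, end in spans[1:]:
            if start <= cur_end:
                # Same class, overlapping in time: extend the event.
                cur_end = max(cur_end, end)
            else:
                combined.append((label, cur_start, cur_end))
                cur_start, cur_end = start, end
        combined.append((label, cur_start, cur_end))
    return sorted(combined, key=lambda e: (e[1], e[2], e[0]))

tracks = [
    [("speech", 0.0, 2.0), ("car", 3.0, 5.0)],        # track 1 output
    [("footsteps", 1.0, 2.5), ("speech", 1.5, 4.0)],  # track 2 output
]
polyphonic = combine_track_events(tracks)
```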

12 Acoustic database • Material for the database was gathered from ten contexts • basketball game, beach, inside a bus, inside a car, hallway, office, restaurant, grocery store, street and stadium with track and field sports • Each context is represented by 8 to 14 recordings, for a total of 103 recordings in the database • In total ~19 hours of audio • In total ~10,000 annotated events

13 Annotations • Recordings were manually annotated indicating the start and end time of all clearly audible sound events • Annotated sound events present in the recordings were grouped into 61 event classes • Event classes include e.g. speech, laughter, applause, car door, road, dishes, door, chair, music, and footsteps

14 Demonstration

15 Evaluation metrics • Detected events are evaluated only at the block level, within 30-second blocks • Precision and recall are calculated inside the blocks and combined into an F-score • Data divided into 70% training / 30% testing sets, 5 folds
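The block-level metric on slide 15 can be illustrated as follows. This is a sketch under stated assumptions: an event counts as present in every 30-second block it touches, precision and recall are computed per block on the sets of class labels, and the per-block F-scores are averaged. The slide does not say how blocks are aggregated, and the function names are hypothetical.

```python
import math

def block_labels(events, block_len, n_blocks):
    """Set of event labels active in each block."""
    blocks = [set() for _ in range(n_blocks)]
    for label, start, end in events:
        first = int(start // block_len)
        last = int(math.ceil(end / block_len)) - 1
        for b in range(first, min(last, n_blocks - 1) + 1):
            blocks[b].add(label)
    return blocks

def block_f_score(reference, detected, block_len=30.0):
    """Average per-block F-score (aggregation is an assumption)."""
    total_len = max(end for _, _, end in reference + detected)
    n_blocks = max(1, math.ceil(total_len / block_len))
    ref_blocks = block_labels(reference, block_len, n_blocks)
    det_blocks = block_labels(detected, block_len, n_blocks)
    scores = []
    for ref, det in zip(ref_blocks, det_blocks):
        if not ref and not det:
            continue  # nothing annotated or detected in this block
        correct = len(ref & det)
        precision = correct / len(det) if det else 0.0
        recall = correct / len(ref) if ref else 0.0
        if precision + recall > 0:
            scores.append(2 * precision * recall / (precision + recall))
        else:
            scores.append(0.0)
    return sum(scores) / len(scores) if scores else 0.0

reference = [("speech", 0, 20), ("car", 10, 40), ("door", 45, 50)]
detected = [("speech", 5, 15), ("music", 20, 25), ("car", 35, 55)]
score = block_f_score(reference, detected)
```

In this toy case the first block has precision 1/2 and recall 1/2, the second has precision 1 and recall 1/2, so the averaged F-score is 7/12. Exact event boundaries inside a block do not affect the result, which is what makes the metric tolerant of timing errors.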

16 Event detection performance (average F-score %)

Context        Monophonic   Polyphonic
Overall           28.2         52.6
Basketball        30.3         68.2
Beach             23.0         38.7
Bus               24.4         57.6
Car               18.8         46.7
Hallway           37.0         51.1
Office            30.1         49.7
Restaurant        25.4         54.2
Shop              27.7         56.2
Street            26.4         50.1
Track&Field       41.7         57.4

17 Conclusions • NMF-based sound source separation can be used for polyphonic event detection • It significantly improves the performance of a monophonic event detection system • Prominent sound events can be detected, to some degree, even in diverse real-world environments
