Multiple birdsong tracking Representing fine modulations Machine listening for birds: analysis techniques matched to the characteristics of bird vocalisations Dan Stowell and Mark D Plumbley Centre for Digital Music School of Elec Eng & Computer Science Queen Mary, University of London June 2013, Listening in the Wild dan.stowell@eecs.qmul.ac.uk Analysis techniques matched to bird vocalisations 1

Motivation
“Cocktail party” problems...

Motivation
[Photo: Shutterstock / Romeo Mikulic]

Motivation
We often have audio with multiple birds, and would like to perform automatic tasks (recognition, tracking, counting...).
Existing computational methods don’t quite fit the characteristics of bird vocalisations:
1. Multiple “speakers”, and discontinuous utterances: problematic for methods adapted from speech recognition
2. Birds often use very rapid modulations, yet typical signal representations (spectrograms, MFCCs, LPC) do not capture them

Outline
1. Syllable-to-syllable tracking of multiple birds
2. Representing the fine detail of bird vocalisations

Multiple birdsong tracking
Chiffchaff (Phylloscopus collybita)

Automatic Speech Recognition
Hidden Markov Model:
[Figure: chain diagram with variables y_1 ... y_4 and x_1 ... x_4 at time steps t_1 ... t_4 along a time axis]

Intermittent polyphonic sources

Modelling an intermittent source
Markov renewal process (“MRP”):
$P(\tau_{n+1} \le t,\ X_{n+1} = j \mid (X_1, T_1), \ldots, (X_n = i, T_n)) = P(\tau_{n+1} \le t,\ X_{n+1} = j \mid X_n = i)$
where $\tau_{n+1}$ is the time difference $T_{n+1} - T_n$.
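The MRP property above can be illustrated with a small simulation. This is only a sketch under stated assumptions: the kernel `toy_kernel`, its frequency jitter and its gap distribution are hypothetical choices made for illustration, not parameters from the talk. The point is that the next state and the gap are drawn jointly, and depend on nothing but the current state.

```python
import random

def simulate_mrp(n_events, kernel, x0, t0=0.0, seed=0):
    """Draw n_events pairs (X_n, T_n) from a Markov renewal process.

    kernel(x, rng) returns (next_state, gap) with gap > 0. It only sees the
    current state x, never the earlier history: that is the MRP property.
    """
    rng = random.Random(seed)
    events = [(x0, t0)]
    for _ in range(n_events - 1):
        x, t = events[-1]
        x_next, tau = kernel(x, rng)        # joint draw of (X_{n+1}, tau_{n+1})
        events.append((x_next, t + tau))    # T_{n+1} = T_n + tau_{n+1}
    return events

def toy_kernel(x, rng):
    """Hypothetical kernel: state is a syllable frequency in Hz."""
    x_next = x + rng.gauss(0.0, 200.0)      # small frequency step (assumed)
    tau = 0.05 + rng.expovariate(5.0)       # gap of at least 50 ms (assumed)
    return x_next, tau

for state, time in simulate_mrp(5, toy_kernel, x0=4000.0):
    print(f"t = {time:6.3f} s   X = {state:7.1f} Hz")
```

Unlike the regularly clocked chain used in speech recognition, the event times here are irregular and form part of the model.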

Multiple MRPs
Problem sketch: assume multiple MRPs, plus potential “clutter”.
Given transition probabilities, find the most likely set of paths (max 1 path per node).

Flow networks, and minimum cost flow
[Figure: flow network with source s, sink t and nodes V_1, V_2, V_3; each node X_i carries arcs labelled a_b(X_i), a_d(X_i) and a_c(X_i), and the nodes are linked by arcs a_t(X_1, X_2, T_2 - T_1), a_t(X_1, X_3, T_3 - T_1) and a_t(X_2, X_3, T_3 - T_2)]
Convert likelihood expression to flow “costs”:
$a_b(X) = -\log p_b(X)$
$a_d(X) = -\log p_d(X)$
$a_t(X, X', \tau) = -\log f_X(X', \tau)$
$a_c(X) = \log p_c(X)$
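A short sketch of how these costs add up along one candidate path. It assumes, as a reading of the subscripts that the slide does not spell out, that b, d, t and c stand for birth, death, transition and clutter, and that a path pays one a_b, one a_d, one a_t per step and one a_c per node visited. The probability values are invented.

```python
import math

def path_cost(p_b, p_d, f_t, p_c):
    """Total flow cost of one path.

    p_b: birth probability of the first node
    p_d: death probability of the last node
    f_t: list of transition densities, one per step along the path
    p_c: list of clutter probabilities, one per node on the path
    """
    cost = -math.log(p_b) - math.log(p_d)
    cost += sum(-math.log(f) for f in f_t)   # a_t terms
    cost += sum(math.log(p) for p in p_c)    # a_c terms enter with a plus sign
    return cost

# Two-node path with invented numbers
cost = path_cost(p_b=0.1, p_d=0.1, f_t=[0.5], p_c=[0.01, 0.01])
ratio = (0.1 * 0.5 * 0.1) / (0.01 * 0.01)   # path likelihood over clutter likelihood
print(cost, -math.log(ratio))
```

Under these assumptions the cost equals minus the log of a likelihood ratio, so a negative cost marks a path that explains its nodes better than clutter does, which is why minimising total cost picks out the most likely set of paths.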

Minimum cost flow
Minimum cost flow algorithms can therefore solve this problem:
- Optimal minimum-cost flow: Edmonds-Karp algorithm, asymptotic time complexity $O(|V||A|^2)$.
- Or use inexact (greedy) algorithm: $O(|V||A|)$ or lower.
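A minimal sketch of the exact route on a toy problem, under several assumptions. It uses successive shortest paths with Bellman-Ford, which is a different algorithm from the one named on the slide. The network layout (each observation split into an in/out pair joined by its a_c arc, a_b arcs from s, a_d arcs to t, a_t arcs between observations, all with capacity 1) is our reading of the flow-network figure. The function name and all cost values are invented.

```python
def min_cost_paths(a_b, a_c, a_d, a_t):
    """Return the set of paths with minimum total cost.

    a_b, a_c, a_d: per-observation costs (lists of equal length).
    a_t: dict {(i, j): cost} for allowed transitions with i earlier than j.
    """
    n = len(a_b)
    S, T = 2 * n, 2 * n + 1
    edges = []                        # [from, to, capacity, cost]; edge k ^ 1 reverses k

    def add(u, v, cost):
        edges.append([u, v, 1, cost])
        edges.append([v, u, 0, -cost])

    for i in range(n):
        add(S, 2 * i, a_b[i])             # a path starts at observation i
        add(2 * i, 2 * i + 1, a_c[i])     # node split enforces max 1 path per node
        add(2 * i + 1, T, a_d[i])         # a path ends at observation i
    for (i, j), cost in a_t.items():
        add(2 * i + 1, 2 * j, cost)

    while True:
        # Bellman-Ford on the residual graph (costs may be negative)
        dist = [float("inf")] * (2 * n + 2)
        prev = [None] * (2 * n + 2)
        dist[S] = 0.0
        for _ in range(2 * n + 1):
            changed = False
            for k, (u, v, cap, cost) in enumerate(edges):
                if cap > 0 and dist[u] + cost < dist[v] - 1e-12:
                    dist[v], prev[v], changed = dist[u] + cost, k, True
            if not changed:
                break
        if dist[T] >= 0:                  # no remaining path lowers the total cost
            break
        v = T
        while v != S:                     # push one unit of flow along the path
            k = prev[v]
            edges[k][2] -= 1
            edges[k ^ 1][2] += 1
            v = edges[k][0]

    # Read the paths back from the saturated forward arcs
    starts, succ = [], {}
    for k in range(0, len(edges), 2):
        u, v, cap, _ = edges[k]
        if cap == 0:
            if u == S:
                starts.append(v // 2)
            elif v != T and u % 2 == 1:
                succ[u // 2] = v // 2
    paths = []
    for i in sorted(starts):
        path = [i]
        while path[-1] in succ:
            path.append(succ[path[-1]])
        paths.append(path)
    return paths

# Four observations; observation 2 looks like clutter (its a_c is close to zero).
paths = min_cost_paths(
    a_b=[1.0, 1.0, 1.0, 1.0],
    a_c=[-5.0, -5.0, -0.5, -5.0],
    a_d=[1.0, 1.0, 1.0, 1.0],
    a_t={(0, 1): 0.5, (1, 3): 0.5, (0, 2): 6.0, (1, 2): 6.0, (2, 3): 6.0, (0, 3): 4.0},
)
print(paths)
```

In this toy case the single path through observations 0, 1 and 3 has total cost -12, while taking observation 2 on its own would cost +1.5, so it is left unassigned as clutter.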

Synthetic example
[Figure: synthetic examples for three generators (locked, coherent, segregated), each shown in four panels: clean signal, signal in noise, inferred (coherent), inferred (segregated). LR values annotated on the panels: 6.33e+18, 1.45e+21, 1.42e+12, 4.55e+17, 3.11e+16]

Birdsong experiment
25 European recordings of Chiffchaff (source: Xeno Canto)
Mixtures of 2–5 recordings, 5-fold crossvalidation
Can it cluster the “syllables” in the same way as the source audio?

Data preparation
Syllables detected by spectrogram cross-correlation.
[Figure: spectrogram labelled XC25760-dn.xcor, Freq (Hz) against Time (s), with an inset panel “Template”, Freq (Hz) against Time (s)]
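Spectrogram cross-correlation can be sketched as sliding a syllable template along the time axis and scoring each offset by normalised correlation. The version below is an illustration only: the function name, the threshold of 0.8, the peak-picking rule and the toy data are all assumptions, and the detector actually used for the experiment may differ in every one of these details.

```python
import math

def xcorr_detect(spec, template, threshold=0.8):
    """Find frame offsets where the template matches the spectrogram.

    spec, template: lists of time frames, each frame a list of magnitudes
    over the same frequency bins. Returns (hits, scores).
    """
    def centred(frames):
        flat = [v for frame in frames for v in frame]
        mean = sum(flat) / len(flat)
        return [v - mean for v in flat]

    tpl = centred(template)
    tpl_norm = math.sqrt(sum(v * v for v in tpl))
    width = len(template)
    scores = []
    for start in range(len(spec) - width + 1):
        win = centred(spec[start:start + width])
        win_norm = math.sqrt(sum(v * v for v in win))
        if win_norm == 0 or tpl_norm == 0:
            scores.append(0.0)            # flat window: nothing to correlate
        else:
            dot = sum(a * b for a, b in zip(tpl, win))
            scores.append(dot / (tpl_norm * win_norm))
    # Keep local peaks that clear the threshold
    hits = [i for i, s in enumerate(scores)
            if s >= threshold
            and (i == 0 or s > scores[i - 1])
            and (i == len(scores) - 1 or s >= scores[i + 1])]
    return hits, scores

# Toy data: a 3-frame downward sweep over 4 frequency bins, placed at frames 3 and 12
template = [[0, 0, 0, 1], [0, 0, 1, 0], [0, 1, 0, 0]]
spec = [[0.0] * 4 for _ in range(20)]
for start in (3, 12):
    for k, frame in enumerate(template):
        spec[start + k] = [float(v) for v in frame]

hits, scores = xcorr_detect(spec, template)
print(hits)
```

On this toy input the detector reports the two frame offsets where the template was planted.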

Results
[Figure: Ftrans (0.0 to 1.0) against number of signals in mixture (1 to 5), for six conditions: Ideal recovery, trained on test data; Ideal recovery; Ideal recovery plus synthetic noise; Recovery from audio; Recovery from audio (greedy); Recovery from audio (baseline)]
Means and standard errors are shown (5-fold crossvalidation)

Representing fine modulations
Many (song)birds use very rapid frequency modulation (FM)
- Songbirds can perceive fine detail of FM (Dooling et al. 2002, Lohr et al. 2006)
- FM detail can affect behavioural responses (Trillo et al. 2005, de Kort et al. 2009)
Yet... standard representations assume local stationarity (i.e. signal parameters unchanging) at fine timescales.
- Fourier transform magnitudes (spectrograms, MFCCs)
- Linear prediction (LPC)
Detail at < 20 ms likely to be smeared or discarded.
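The smearing can be seen in a small numerical sketch. Everything specific here is an assumption made for illustration: the 16 kHz sample rate, the rectangular 20 ms frame, the 4 kHz tone and the 3 kHz to 5 kHz sweep are not figures from the talk. The code counts how many DFT bins are needed to hold 90% of the frame's energy, first for a steady tone and then for a chirp that moves within the frame.

```python
import cmath
import math

def occupied_bins(signal, energy_fraction=0.9):
    """Smallest number of DFT bins (0..N/2) that hold the given share of energy."""
    n = len(signal)
    power = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * i / n)
                  for i, x in enumerate(signal))
        power.append(abs(acc) ** 2)
    total = sum(power)
    running, count = 0.0, 0
    for p in sorted(power, reverse=True):
        running += p
        count += 1
        if running >= energy_fraction * total:
            break
    return count

FS = 16000                    # sample rate in Hz (assumed)
N = 320                       # one 20 ms frame at that rate
DURATION = N / FS

# Stationary tone at 4000 Hz
tone = [math.sin(2 * math.pi * 4000 * i / FS) for i in range(N)]

# Linear chirp from 3000 Hz to 5000 Hz inside the same frame:
# phase(t) = 2*pi*(f0*t + 0.5*rate*t^2)
rate = (5000 - 3000) / DURATION
chirp = [math.sin(2 * math.pi * (3000 * (i / FS) + 0.5 * rate * (i / FS) ** 2))
         for i in range(N)]

print("tone :", occupied_bins(tone))
print("chirp:", occupied_bins(chirp))
```

The tone sits in a single bin, while the chirp's energy is spread over many bins, and the magnitude spectrum alone does not say in which order those frequencies occurred within the frame.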
