Monitoring and Mining Animal Sounds in Visual Space
Yuan Hao
Dept. of Computer Science & Engineering
University of California, Riverside

Task
• Monitoring animals by examining the sounds they produce
• Build animal sound recognition/classification framework
[Figure: spectrogram of a Common Virtuoso Katydid (Amblycorypha longinicta), forty seconds, frequency axis 0 to 3 kHz]

Outline
• Motivation
• Our approach
• Experimental evaluation
• Conclusion & future work

Motivation: application
Monitoring animals:
Outdoors
• The density and variety of animal sounds can act as a measure of biodiversity
Laboratory setting
• Researchers create control groups of animals, expose them to different settings, and test for different outcomes
Commercial application: Acoustic animal detection can save money

Motivation: difficulties
Most current bioacoustic classification tools have significant limitations. They…
• require careful tuning of many parameters
• are too computationally expensive for sensors
• are not accurate enough
• are too specialized

Related Work
• Dietrich et al (MCS 01), several classification methods for insect sounds
  – Preprocessing and complicated feature extraction
  – Up to eighteen parameters
  – Learned on a data set containing just 108 exemplars
• Brown et al (J. Acoust. Soc 09), analyze Australian anurans (frogs and toads)
  – Identify the species of the frogs with an average accuracy of 98%
  – Requires extracting features from syllables
  – “Once the syllables have been properly segmented, a set of features can be calculated to represent each syllable”

Outline
• Motivation
• Our approach
  – Visual space: spectrogram
  – CK distance measure
  – Sound fingerprint searching
• Experimental evaluation
• Conclusion & future work

Intuition of our Approach
• Classify the animal sounds in the visual space by treating the texture of their spectrograms as an “acoustic fingerprint”, using a recently introduced parameter-free texture measure as a distance measure
[Figure: one-second subset of a common cricket's sound spectrogram; the highlighted region can be considered the “fingerprint” for this sound]

Our Approach
[Figure: overview with labels minLen, maxLen, P, U, and threshold T = 0.43]

Visual Space Spectrogram
• Algorithmic analysis needed instead of manual inspection
• Significant noise artifacts
• Avoid any type of data cleaning or explicit feature extraction, and use the raw spectrogram
[Figure: spectrogram of a Common Virtuoso Katydid (Amblycorypha longinicta), forty seconds, frequency axis 0 to 3 kHz]

CK Distance Measure
$$d_{CK}(x, y) = \frac{C(x|y) + C(y|x)}{C(x|x) + C(y|y)} - 1$$
• Distance measure of texture similarity
• Robustly extracting features from noisy field recordings is non-trivial
• Expands the scope of compression-based similarity measurements to real-valued images by exploiting the compression technique used by MPEG video encoding
• Effective on images as diverse as moths, nematodes, wood grains, tire tracks, etc. (SDM 10)
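
To make the shape of the formula concrete, here is a minimal sketch. It does not use MPEG encoding: as a loudly stated assumption, C(x|y) is approximated by the zlib-compressed size of y followed by x, and the inputs are raw bytes rather than spectrogram images. The names `c` and `ck_distance` are hypothetical.

```python
import zlib


def c(x: bytes, y: bytes) -> int:
    """Stand-in for C(x|y): size of x compressed with y as preceding context.

    Assumption: zlib on the concatenation y + x replaces the MPEG encoder
    that the CK measure is built on.
    """
    return len(zlib.compress(y + x, 9))


def ck_distance(x: bytes, y: bytes) -> float:
    """d_CK(x, y) = (C(x|y) + C(y|x)) / (C(x|x) + C(y|y)) - 1."""
    return (c(x, y) + c(y, x)) / (c(x, x) + c(y, y)) - 1.0


if __name__ == "__main__":
    a = b"chirp-chirp-" * 40
    b = bytes(range(256))
    print(ck_distance(a, a))  # identical inputs give exactly zero
    print(ck_distance(a, b))  # dissimilar inputs give a larger value
```

The numerator is symmetric in x and y, so the distance is symmetric by construction, and an object compared with itself scores exactly zero because numerator and denominator coincide.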

Sanity Check
CK as a tool for taxonomy
National Geographic article: “the sand field cricket (Gryllus firmus) and the southeastern field cricket (Gryllus rubens) look nearly identical and inhabit the same geographical areas”
[Figure: plot of Gryllus firmus and Gryllus rubens recordings (family Gryllidae)]

Outline
• Motivation
• Our approach
  – Visual space: spectrogram
  – CK distance measure
  – Sound fingerprint searching
• Experimental evaluation
• Conclusion & future work

Difficulties
• Do not have carefully extracted prototypes for each class
  – Only have a collection of sound files
• Do not know the call duration
• Do not know how many occurrences of it appear in each file
• May have mislabeled data
• Noisy: most of the recordings are made in the wild

Example: Discrete Text Strings
Assume three observations that correspond to a particular species:
P = {rrbbcxcfbb, rrbbfcxc, rrbbrrbbcxcbcxcf}
Given access to the universe of sounds that are known not to contain any example in P:
U = {rfcbc, crrbbrcb, rcbbxc, rbcxrf,..,rcc }
Our task is equivalent to asking: Is there a substring that appears only in P and not in U?
Candidates: T1 = rrbb, T2 = rrbbc, T3 = cxc
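
The three candidates can be checked mechanically. The sketch below is illustrative only: U is elided on the slide (",..,"), so the code uses just the five members that are listed, and the helper names are hypothetical. It reports whether a candidate occurs in every string of P and in no listed string of U.

```python
# Strings copied from the slide; U holds only the members that are listed.
P = ["rrbbcxcfbb", "rrbbfcxc", "rrbbrrbbcxcbcxcf"]
U_LISTED = ["rfcbc", "crrbbrcb", "rcbbxc", "rbcxrf", "rcc"]


def in_every(sub: str, strings: list[str]) -> bool:
    """True if sub occurs as a contiguous substring of every string."""
    return all(sub in s for s in strings)


def in_none(sub: str, strings: list[str]) -> bool:
    """True if sub occurs in none of the strings."""
    return not any(sub in s for s in strings)


def is_fingerprint(sub: str, p: list[str], u: list[str]) -> bool:
    """Exact-match version of the question: in all of P and nowhere in U."""
    return in_every(sub, p) and in_none(sub, u)


if __name__ == "__main__":
    for t in ("rrbb", "rrbbc", "cxc"):
        print(t, in_every(t, P), in_none(t, U_LISTED))
```

On the listed strings, rrbb also occurs inside crrbbrcb, and rrbbc is missing from rrbbfcxc, so only cxc passes both checks.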

Case Studies
Six pairs of recordings of various Orthoptera. Visually determined and extracted one-second similar regions.
One size does not fit all when it comes to the length of the sound sequence.
[Figure: twelve one-second snippets, numbered 1 to 12, grouped under Tettigonioidea and Grylloidea]

Sound Fingerprint
Given U and P:
• P: Contains examples only from the “positive” species class
• U: Non-target species sounds
The goal is to find a subsequence of one of the objects in P that is close to at least one subsequence in each element of P, but far from all subsequences in every element of U.
[Figure: potential sound fingerprint]

Example
[Figure: objects 1 to 5 and A to D placed on a line from 0 to 1 relative to the candidate being tested, with a split point (threshold)]
The goal is to find a subsequence of one of the objects in P that is close to at least one subsequence in each element of P, but far from all subsequences in every element of U.

How Hard is This?
[Figure: objects 1 to 5 and A to D placed on a line from 0 to 1 relative to the candidate being tested, with a split point (threshold)]
$$\sum_{l = L_{min}}^{L_{max}} \sum_{S_i \in \{P\}} (M_i - l + 1)$$
where l is a certain length of candidate, M_i is the length of any sound sequence S_i in P, and L_min and L_max are the possible user-defined minimum and maximum lengths of the sound fingerprint.
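
Reading the double sum as the number of candidate subsequences (an interpretation; the slide does not label it), a short sketch shows how quickly it grows. The function name `count_candidates` and the clamp to zero for sequences shorter than l are assumptions added for illustration.

```python
def count_candidates(lengths: list[int], l_min: int, l_max: int) -> int:
    """Sum of (M_i - l + 1) over l in [l_min, l_max] and every sequence length M_i.

    Sequences shorter than l contribute nothing (clamped at zero).
    """
    total = 0
    for l in range(l_min, l_max + 1):
        for m_i in lengths:
            total += max(0, m_i - l + 1)
    return total


if __name__ == "__main__":
    # Three sequences of lengths 10, 8, and 16, candidate lengths 3 and 4.
    print(count_candidates([10, 8, 16], 3, 4))
```

For lengths 10, 8, and 16 with l from 3 to 4, the terms are (8 + 6 + 14) + (7 + 5 + 13) = 53; every one of these candidates must then be compared against each object in P and U.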

Brute Force Search
Generate and Evaluate
Step 1: Given P and U, generate all possible subsequences of length m from the objects in P as the sound fingerprint candidates.
Step 2: Using a sliding window of the same size as the candidate, locate the minimum distance for each object in P and U.
Step 3: Apply the evaluation mechanism for splitting the dataset into two groups.
Step 4: Select the sound fingerprint with the best splitting point, that is, the one that produces the largest information gain in separating the two classes.
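
Steps 1 and 2 can be sketched on the text-string example from earlier in the deck. This is only an illustration under stated assumptions: a normalized Hamming distance stands in for the CK measure, strings stand in for spectrograms, U contains only the members listed on the slide, and all function names are hypothetical.

```python
from typing import Iterator

P = ["rrbbcxcfbb", "rrbbfcxc", "rrbbrrbbcxcbcxcf"]
U_LISTED = ["rfcbc", "crrbbrcb", "rcbbxc", "rbcxrf", "rcc"]


def candidates(p: list[str], m: int) -> Iterator[str]:
    """Step 1: every length-m subsequence of every object in P."""
    for obj in p:
        for start in range(len(obj) - m + 1):
            yield obj[start:start + m]


def hamming(a: str, b: str) -> float:
    """Fraction of mismatched positions (stand-in for the CK distance)."""
    return sum(ca != cb for ca, cb in zip(a, b)) / len(a)


def min_window_distance(candidate: str, obj: str) -> float:
    """Step 2: slide a window of the candidate's size over obj, keep the minimum."""
    m = len(candidate)
    if len(obj) < m:
        return float("inf")  # object too short to hold the candidate
    return min(hamming(candidate, obj[i:i + m]) for i in range(len(obj) - m + 1))


def distance_profile(candidate: str, p: list[str], u: list[str]):
    """Minimum distances from the candidate to each object in P and in U."""
    return ([min_window_distance(candidate, s) for s in p],
            [min_window_distance(candidate, s) for s in u])


if __name__ == "__main__":
    print(len(list(candidates(P, 3))))
    print(distance_profile("cxc", P, U_LISTED))
```

The two distance lists are what Steps 3 and 4 consume: a good candidate has small distances to every object in P and large distances to every object in U.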

Evaluation Mechanism
Step 3: Information gain to evaluate candidate splitting rules
$$E(D) = -p(X)\log(p(X)) - p(Y)\log(p(Y))$$
where X and Y are two classes in D.
$$Gain = E(D) - E'(D)$$
where E(D) and E'(D) are the entropy before and after partitioning D into D_1 and D_2, respectively.
$$E'(D) = f(D_1)E(D_1) + f(D_2)E(D_2)$$
where f(D_1) is the fraction of objects in D_1, and f(D_2) is the fraction of objects in D_2.
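
The three formulas translate directly into code. The sketch below assumes base-2 logarithms and treats 0·log(0) as 0; `best_split` additionally assumes that thresholds are tried at midpoints between consecutive sorted distances, which the slides do not specify. All names and the sample distances are hypothetical.

```python
import math


def entropy(n_p: int, n_u: int) -> float:
    """Two-class entropy E(D) in bits for n_p objects from P and n_u from U."""
    total = n_p + n_u
    e = 0.0
    for n in (n_p, n_u):
        if n:  # 0 * log(0) is taken as 0
            frac = n / total
            e -= frac * math.log2(frac)
    return e


def information_gain(left: tuple[int, int], right: tuple[int, int]) -> float:
    """Gain = E(D) - E'(D) for a split into left=(n_p, n_u) and right=(n_p, n_u)."""
    n_left, n_right = sum(left), sum(right)
    total = n_left + n_right
    before = entropy(left[0] + right[0], left[1] + right[1])
    after = (n_left / total) * entropy(*left) + (n_right / total) * entropy(*right)
    return before - after


def best_split(dists_p: list[float], dists_u: list[float]) -> tuple[float, float]:
    """Try a threshold between each pair of neighbouring distances; keep the best gain."""
    ordered = sorted(dists_p + dists_u)
    best_thr, best_gain = ordered[0], -1.0
    for d1, d2 in zip(ordered, ordered[1:]):
        if d1 == d2:
            continue
        thr = (d1 + d2) / 2
        left = (sum(d <= thr for d in dists_p), sum(d <= thr for d in dists_u))
        right = (len(dists_p) - left[0], len(dists_u) - left[1])
        gain = information_gain(left, right)
        if gain > best_gain:
            best_thr, best_gain = thr, gain
    return best_thr, best_gain


if __name__ == "__main__":
    # Made-up distances: four P objects close to the candidate, one far away.
    print(best_split([0.10, 0.15, 0.20, 0.25, 0.60], [0.40, 0.50, 0.70, 0.80]))
```

With the made-up distances in the usage lines, the best threshold falls between 0.25 and 0.40, leaving four P objects on the left and one P plus four U objects on the right.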

Example
A total of nine objects, five from P and four from U. This gives us the entropy for the unsorted data:
[-(5/9)log(5/9)-(4/9)log(4/9)] = 0.991
Four objects from P are the only four objects on the left side of the split point. Of the five objects to the right of the split point, we have four objects from U and just one from P:
(4/9)[-(4/4)log(4/4)]+(5/9)[-(4/5)log(4/5)-(1/5)log(1/5)] = 0.401
Information Gain = 0.991 - 0.401 = 0.590
[Figure: objects 1 to 5 and A to D placed on a line from 0 to 1 relative to the candidate being tested, with a split point (threshold)]
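
The quoted values are reproduced when the logarithm is taken base 2 (an inference from the numbers; the slide does not state the base). Writing D_1 for the left side of the split and D_2 for the right side, the arithmetic works out as:

```latex
\begin{align*}
E(D)   &= -\tfrac{5}{9}\log_2\tfrac{5}{9} - \tfrac{4}{9}\log_2\tfrac{4}{9}
        \approx 0.4711 + 0.5200 = 0.9911 \\
E(D_1) &= -\tfrac{4}{4}\log_2\tfrac{4}{4} = 0 \\
E(D_2) &= -\tfrac{4}{5}\log_2\tfrac{4}{5} - \tfrac{1}{5}\log_2\tfrac{1}{5}
        \approx 0.2575 + 0.4644 = 0.7219 \\
E'(D)  &= \tfrac{4}{9}\cdot 0 + \tfrac{5}{9}\cdot 0.7219 \approx 0.4011 \\
\text{Gain} &= 0.9911 - 0.4011 = 0.5900
\end{align*}
```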

Outline
• Motivation
• Our approach
  – Visual space: spectrogram
  – CK distance measure
  – Sound fingerprint searching
• Experimental evaluation
  – Brute force search evaluation
  – Speed up and efficiency
• Conclusion & future work

Example
[Figure: P and U, the distance ordering, and the sound fingerprint; distance value plotted along the recording with a recognition threshold]
A demonstration of the brute force search algorithm and the discrimination ability of the CK measure. One short template of insect sounds is scanned along a long sequence of sound, which contains one example of the target sound, plus three examples of commonly confused insect sounds.

P = Atlanticus dorsalis
[Figure: P and U, the distance ordering, and the sound fingerprint; information gain plotted over the course of the search, marking where brute-force search terminates]
Running time: 7.5 hours
