SLIDE 1
M.Sc. Thesis Presentation

Bird Call Recognition with Artificial Neural Networks, Support Vector Machines, and Kernel Density Estimation

Derek Ross

Department of Electrical and Computer Engineering
University of Manitoba
Winnipeg, Canada
March 20, 2006

Advisors:

  • Dr. H. Card, Electrical and Computer Eng.
  • Dr. D. McNeill, Electrical and Computer Eng.

Committee:

  • Dr. P. Yahampath, Electrical and Computer Eng.
  • Dr. J. Anderson, Computer Science
SLIDE 2

Bird Call Recognition with ANNs, SVMs and KDE

Outline

  • Goal.
  • Why bird calls?
  • Inspiration.
  • Pre-processing.
  • Classifiers.
  • Post-processing.
  • Results, analysis, conclusion.


SLIDE 3

Goal

  • Take a recording of a bird call, and automatically determine which of ten species it belongs to.

(Figure: a recorded call is labelled “ROBIN”.)

SLIDE 4

Applications

  • Aviation industry: bird strikes.
  • Electrical distribution.
  • Wind turbines.
  • Night-flight monitoring.
  • Entertainment: “birding.”

SLIDE 5

Previous Efforts

  • Not an active field.
  • Evans (2005), heuristic system.
  • Härmä and Somervuo (2004), sounds classified by harmonics.
  • McIlraith and Card (1997), ANNs, statistical methods.

SLIDE 6

Inspiration

  • Musical instrument recognition research.
  • Tonal qualities:

– Timbre;
– Purity;
– Dissonance;
– Harshness.

  • Can a bird be recognized by tonal qualities?

SLIDE 7

Bird Species

  • 1. ALFL alder flycatcher
  • 2. AMCR American crow
  • 3. AMGO American goldfinch
  • 4. AMRE American redstart
  • 5. AMRO American robin
  • 6. BAOR Baltimore oriole
  • 7. BCCH black-capped chickadee
  • 8. BCTI black-crested titmouse
  • 9. BDOW barred owl
  • 10. BLJA blue jay

SLIDE 8

Bird Species

SLIDE 9

Data Sets

  • 900 recordings, 10 species.

SLIDE 10

Implementation

SLIDE 11

Overall Recognition Process

SLIDE 12

Pre-processing

  • Audio data separated into frames of 512 samples each.
  • Spectral parameters extracted.
  • Cepstral parameters extracted.
  • Derivatives of spectral and cepstral features.
  • Short-term (1.5 sec) amplitude envelope frequency.
  • 20 features.
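
The framing step can be sketched as follows. This is a minimal illustration, not the thesis code: non-overlapping frames, the naive DFT, and the spectral centroid as an example "spectral parameter" are all assumptions, since the slides do not list the 20 features individually.

```python
import cmath
import math

FRAME_SIZE = 512  # samples per frame, as stated on the slide


def split_into_frames(samples, frame_size=FRAME_SIZE):
    """Split samples into non-overlapping frames (assumption), dropping a partial tail."""
    n_frames = len(samples) // frame_size
    return [samples[i * frame_size:(i + 1) * frame_size] for i in range(n_frames)]


def magnitude_spectrum(frame):
    """Naive O(n^2) DFT magnitudes for bins 0..n/2; fine for a demo, slow for real use."""
    n = len(frame)
    mags = []
    for k in range(n // 2 + 1):
        acc = sum(x * cmath.exp(-2j * math.pi * k * t / n) for t, x in enumerate(frame))
        mags.append(abs(acc))
    return mags


def spectral_centroid(mags):
    """Magnitude-weighted mean bin index: one hypothetical spectral feature."""
    total = sum(mags)
    return sum(k * m for k, m in enumerate(mags)) / total if total > 0 else 0.0


# Usage: a pure tone that completes 16 cycles per frame has its centroid at bin 16.
signal = [math.sin(2 * math.pi * 16 * t / FRAME_SIZE) for t in range(1100)]
frames = split_into_frames(signal)
centroid = spectral_centroid(magnitude_spectrum(frames[0]))
```

A real pre-processor would use an FFT and compute one feature vector per frame; the classifiers then see each frame independently.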

SLIDE 13

Pattern Recognition Techniques

  • Artificial neural networks.
  • Support vector machines.
  • Kernel density estimation.

SLIDE 14

Artificial Neural Network

  • Coded with GNU C++.
  • Plain back-propagation algorithm (delta rule).
  • Variable learning rate, decreased exponentially.
  • Logistic neurons for hidden layer.
  • Linear neurons for output layer.
  • Training set is shuffled between epochs.
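
The training loop described above can be sketched in a few lines. This is an illustrative Python sketch rather than the original GNU C++ code; the layer sizes, initial weight range, starting learning rate, and decay factor are assumed values, and the toy data is hypothetical.

```python
import math
import random


def logistic(x):
    return 1.0 / (1.0 + math.exp(-x))


def learning_rate(epoch, lr0=0.1, decay=0.99):
    """Exponentially decreasing learning rate (lr0 and decay are illustrative values)."""
    return lr0 * decay ** epoch


class TinyMLP:
    """One hidden layer: logistic hidden units, linear output units."""

    def __init__(self, n_in, n_hidden, n_out, rng):
        # Each row holds the weights of one unit; the last entry is its bias.
        self.w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_in + 1)]
                         for _ in range(n_hidden)]
        self.w_out = [[rng.uniform(-0.5, 0.5) for _ in range(n_hidden + 1)]
                      for _ in range(n_out)]

    def forward(self, x):
        xb = list(x) + [1.0]
        hb = [logistic(sum(w * v for w, v in zip(row, xb))) for row in self.w_hidden] + [1.0]
        y = [sum(w * v for w, v in zip(row, hb)) for row in self.w_out]
        return xb, hb, y

    def train_step(self, x, target, lr):
        xb, hb, y = self.forward(x)
        # Output deltas for squared error with linear outputs.
        d_out = [yi - ti for yi, ti in zip(y, target)]
        # Hidden deltas: back-propagate through the logistic derivative h * (1 - h).
        d_hid = [hb[j] * (1.0 - hb[j]) * sum(d_out[k] * self.w_out[k][j]
                                             for k in range(len(d_out)))
                 for j in range(len(self.w_hidden))]
        for k, row in enumerate(self.w_out):
            for j in range(len(row)):
                row[j] -= lr * d_out[k] * hb[j]
        for j, row in enumerate(self.w_hidden):
            for i in range(len(row)):
                row[i] -= lr * d_hid[j] * xb[i]


def mean_squared_error(net, data):
    return sum(sum((yi - ti) ** 2 for yi, ti in zip(net.forward(x)[2], t))
               for x, t in data) / len(data)


def train(net, data, epochs, rng):
    data = list(data)
    for epoch in range(epochs):
        rng.shuffle(data)  # training set is reshuffled between epochs
        for x, t in data:
            net.train_step(x, t, learning_rate(epoch))
```

In the thesis setting the input would be a 20-feature frame vector and the output one unit per species; the sketch keeps the sizes as parameters so it can be tried on toy data.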

SLIDE 15

Artificial Neural Network

SLIDE 16

Artificial Neural Network

  • Training parameters:

SLIDE 17

Support Vector Machine

  • Used LIBSVM library (based on SMO).
  • C-SVC: C-support vector classification.
  • Kernel is radial basis function, K(x, y) = exp(−γ‖x − y‖²).
  • Probability estimates enabled for ROC.
  • Internal cross-validation: four-fold.
  • Termination threshold: ε = 0.0001.
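
For reference, the radial basis function kernel in the form LIBSVM uses can be written out directly. The function name below is illustrative; LIBSVM evaluates the kernel internally, so this sketch only shows what the γ parameter controls.

```python
import math


def rbf_kernel(x, y, gamma):
    """K(x, y) = exp(-gamma * ||x - y||^2); larger gamma means a narrower kernel."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


# Identical points give 1; the value decays toward 0 as the points move apart.
same = rbf_kernel([1.0, 2.0], [1.0, 2.0], gamma=0.5)
apart = rbf_kernel([0.0, 0.0], [1.0, 0.0], gamma=1.0)
```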

SLIDE 18

Support Vector Machine

  • Grid search:
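
The grid actually searched is not preserved in this transcript. As a hedged sketch, a grid search over the C-SVC cost C and the kernel width γ might look like the following; the power-of-two ranges are the ones commonly suggested for LIBSVM, not values taken from the thesis, and `score` stands in for a cross-validated accuracy function.

```python
import itertools
import math


def power_of_two_grid(lo_exp, hi_exp, step=2):
    """Values 2^lo_exp, 2^(lo_exp+step), ... up to 2^hi_exp."""
    return [2.0 ** e for e in range(lo_exp, hi_exp + 1, step)]


def grid_search(c_values, gamma_values, score):
    """Return (best_score, C, gamma) for the pair with the highest score(C, gamma)."""
    best = None
    for c, gamma in itertools.product(c_values, gamma_values):
        s = score(c, gamma)
        if best is None or s > best[0]:
            best = (s, c, gamma)
    return best


def toy_score(c, gamma):
    """Hypothetical stand-in for cross-validation accuracy, peaking at C=2^3, gamma=2^-5."""
    return -((math.log2(c) - 3) ** 2 + (math.log2(gamma) + 5) ** 2)


best = grid_search(power_of_two_grid(-5, 15), power_of_two_grid(-15, 3), toy_score)
```

In practice `score` would train the SVM with four-fold cross-validation for each (C, γ) pair and return the mean accuracy.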

SLIDE 19

Support Vector Machine

  • Model parameters:

SLIDE 20

Kernel Density Estimation

  • Coded with GNU C++.
  • Kernel is radial basis function (multivariate normal).
  • “Bandwidth” selected with normal reference rule:
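
The bandwidth formula from the slide is not preserved here. The sketch below uses one common form of the multivariate normal reference rule, h_j = σ_j · (4 / ((d + 2) n))^(1/(d + 4)), which may differ from the exact variant used in the thesis. Per-dimension bandwidths, equal class priors, and the toy data are also assumptions.

```python
import math
import statistics


def normal_reference_bandwidths(points):
    """Per-dimension bandwidths from a common normal reference rule (assumed variant)."""
    n, d = len(points), len(points[0])
    factor = (4.0 / ((d + 2) * n)) ** (1.0 / (d + 4))
    return [factor * statistics.stdev([p[j] for p in points]) for j in range(d)]


def kde_density(x, points, bandwidths):
    """Density estimate at x using a product of normal kernels."""
    norm = 1.0
    for h in bandwidths:
        norm *= h * math.sqrt(2.0 * math.pi)
    total = 0.0
    for p in points:
        z2 = sum(((xj - pj) / h) ** 2 for xj, pj, h in zip(x, p, bandwidths))
        total += math.exp(-0.5 * z2)
    return total / (len(points) * norm)


def classify(x, class_points):
    """Pick the class whose density estimate at x is highest (equal priors assumed)."""
    return max(
        class_points,
        key=lambda label: kde_density(
            x, class_points[label], normal_reference_bandwidths(class_points[label])
        ),
    )


# Hypothetical two-dimensional toy data for two species codes.
toy = {
    "AMRO": [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0)],
    "BLJA": [(10.0, 10.0), (11.0, 10.0), (10.0, 11.0), (11.0, 11.0)],
}
```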

SLIDE 21

Post-Processing

  • Single frames
  • Entire calls

SLIDE 22

Frame Post-Processing

  • Recognizers will try to classify everything, even silence.
  • Setting a high output threshold will reject low-confidence frames.
  • Inspect ROC curve to find optimal threshold.
  • Tradeoff: accuracy vs. rejection rate.
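
A minimal sketch of threshold-based rejection follows. It assumes each classifier returns one score per species for a frame and that accuracy is computed over the frames that were kept. The function names and the toy scores are illustrative.

```python
def classify_frame(outputs, threshold):
    """Return the index of the strongest output, or None if it is below the threshold."""
    best = max(range(len(outputs)), key=lambda i: outputs[i])
    return best if outputs[best] >= threshold else None


def accuracy_and_rejection_rate(predictions, labels):
    """Accuracy over non-rejected frames, plus the fraction of frames rejected."""
    kept = [(p, t) for p, t in zip(predictions, labels) if p is not None]
    rejection_rate = (len(predictions) - len(kept)) / len(predictions)
    accuracy = sum(p == t for p, t in kept) / len(kept) if kept else 0.0
    return accuracy, rejection_rate


# Hypothetical two-species outputs for three frames; the last frame is low-confidence.
frame_outputs = [[0.9, 0.1], [0.4, 0.6], [0.2, 0.3]]
predictions = [classify_frame(o, threshold=0.5) for o in frame_outputs]
accuracy, rejection_rate = accuracy_and_rejection_rate(predictions, [0, 0, 1])
```

Raising the threshold rejects more frames; sweeping it and recording both numbers gives the accuracy vs. rejection tradeoff named on the slide.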

SLIDE 23

Call Post-Processing

  • Recognizers are trained to classify single frames.
  • How do you select a single species after classifying multiple frames?
  • Two techniques used here:

– Voting.
– Confusion matching: chi-square goodness-of-fit test.

SLIDE 24

Confusion Row

  • Sum of all output vectors for a call.
  • Multinomial distribution of species estimates.
  • Example (simplified): the species estimates for each frame of an unknown call are summed into the confusion row for the call.

ALFL  AMCR  AMGO  AMRE
   5    40    50     5

SLIDE 25

Post-processing: Voting

  • Winner is the species that was recognized in the most frames.
  • Highest value of confusion row.

Confusion row for call:

ALFL  AMCR  AMGO  AMRE
   5    40    50     5

Winner: AMGO (50).
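
Voting over a call can be sketched as below. The sketch assumes each frame yields a single hard species estimate, so the confusion row is a count per species; with soft outputs the row would be a sum of output vectors instead.

```python
from collections import Counter

SPECIES = ["ALFL", "AMCR", "AMGO", "AMRE"]  # the simplified four-species example


def confusion_row(frame_estimates, species=SPECIES):
    """Count how many frames of the call were assigned to each species."""
    counts = Counter(frame_estimates)
    return [counts[s] for s in species]


def vote(row, species=SPECIES):
    """Return the species with the highest count in the confusion row."""
    best = max(range(len(row)), key=lambda i: row[i])
    return species[best]


# Reproduces the slide's example row: 5, 40, 50, 5.
estimates = ["ALFL"] * 5 + ["AMCR"] * 40 + ["AMGO"] * 50 + ["AMRE"] * 5
row = confusion_row(estimates)
winner = vote(row)
```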

SLIDE 26

Post-processing: Chi-Test

  • Chi-square goodness-of-fit test can determine similarity of multinomial distributions.

Confusion matrix from training set:

      ALFL  AMCR  AMGO  AMRE
ALFL   100     0     0     0
AMCR     0    75    25     0
AMGO     0    50    50     0
AMRE     0     0     0   100

Confusion row:

ALFL  AMCR  AMGO  AMRE
   5    40    50     5

Scan through the confusion matrix (CM) and find the row that gives the lowest χ² statistic.
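
Confusion matching on the example above can be sketched as follows. The statistic is the usual chi-square goodness-of-fit sum of (observed − expected)² / expected, with the training row rescaled to the observed total. The additive smoothing `eps`, used to avoid dividing by zero for empty cells, is an assumption; the slides do not say how zero counts were handled.

```python
TRAINING_CM = {
    "ALFL": [100, 0, 0, 0],
    "AMCR": [0, 75, 25, 0],
    "AMGO": [0, 50, 50, 0],
    "AMRE": [0, 0, 0, 100],
}


def chi_square_statistic(observed, reference, eps=1.0):
    """Chi-square goodness-of-fit of observed counts against a reference row."""
    smoothed = [r + eps for r in reference]  # eps is an assumed smoothing constant
    scale = sum(observed) / sum(smoothed)
    return sum((o - scale * s) ** 2 / (scale * s) for o, s in zip(observed, smoothed))


def confusion_match(observed, training_cm, eps=1.0):
    """Return the species whose training row gives the lowest chi-square statistic."""
    return min(training_cm,
               key=lambda sp: chi_square_statistic(observed, training_cm[sp], eps))


best_species = confusion_match([5, 40, 50, 5], TRAINING_CM)
```

Unlike voting, this can select a species that did not win the most frames, as long as the call's pattern of confusions resembles that species' row from training.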

SLIDE 27

Trials Run

  • For frame recognition: 3 datasets × 7 classifiers.
  • For call recognition: 3 datasets × 7 classifiers × 2 post-processors.
  • Total: 84 trials.

SLIDE 28

Results

SLIDE 29

Results (Uninterpreted)

  • Frame Results
  • Call Results

(Need reduced subset of data.)

SLIDE 30

Results: Frame Accuracy

  • Best accuracy: 95%.
  • Accuracy floor: 36%.
  • Average accuracy: 67%.

SLIDE 31

Results: Frame Rejection

  • Rejections table has same format.
  • Rejection rate seemed uncorrelated with other performance measures.

SLIDE 32

Results: Call Accuracy

  • Best accuracy: 99%.
  • Accuracy floor: 0%.
  • Average accuracy: 76%.

SLIDE 33

Results: Condensed Format

  • Accuracy, accuracy floor, rejection rate kept.

SLIDE 34

Analysis of Results

SLIDE 35

Single Frame Results

SLIDE 36

Single Frame Accuracy

  • Best are NN-100, NN-500, SVMs.
  • NN-500 shows overtraining.
  • NN-20 shows high bias, low variance.
  • KDE is an example of a biased estimator.

SLIDE 37

Single Frame Rejections

  • High discrimination threshold intended to reject silence.
  • Accuracy calculations ignore rejected frames.
  • Not a strong correlation (0.51) between accuracy and rejection rate.

SLIDE 38

Frame Accuracy Floor

  • Similar to accuracy, but greater variation (floor is outlier).
  • NN-500 and KDE have a big gap between training set and CV set.
  • This hints at overtraining.
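
The slides do not define "accuracy floor" formally. The sketch below takes it to be the lowest per-species accuracy, a reading suggested by the later remark about a species scoring 0%. It computes the floor alongside overall accuracy from a confusion matrix whose rows are true species and whose columns are estimated species; the example matrix reuses the simplified one from the chi-test slide.

```python
def per_species_accuracy(cm):
    """cm[i][j] = number of items of species i classified as species j."""
    return [row[i] / sum(row) for i, row in enumerate(cm)]


def overall_accuracy(cm):
    correct = sum(cm[i][i] for i in range(len(cm)))
    return correct / sum(sum(row) for row in cm)


def accuracy_floor(cm):
    """Lowest per-species accuracy (assumed meaning of 'accuracy floor')."""
    return min(per_species_accuracy(cm))


example_cm = [
    [100, 0, 0, 0],
    [0, 75, 25, 0],
    [0, 50, 50, 0],
    [0, 0, 0, 100],
]
```

Here the overall accuracy is 81.25% while the floor is 50%, which shows how a good average can hide one poorly recognized species.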

SLIDE 39

Entire Call Results

SLIDE 40

Call Accuracy

  • Calls use two types of post-processors: voting, chi-test.
  • Concentrate on comparing performance w.r.t. post-processors.
  • (Training set performs worse than superset: it has fewer frames.)

SLIDE 41

Call Rejections

  • Calls have a much lower rejection rate.
  • All frames in a call have to be rejected for a call to be rejected.
  • This is unlikely.

SLIDE 42

Call Accuracy: Postproc. Effect

  • Chi-test gives only moderate improvement for weaker classifiers.

SLIDE 43

Note

  • I will usually refer to the cross-validation (CV) results from here on.

SLIDE 44

Call Accuracy Floor (Voting)

  • NN-20, KDE have one species with 0% score for all 3 data sets.
  • SVM-FAR: score of 7% is less than chance (10% for ten species).
  • Best is SVM-MID, accuracy is 40%.

SLIDE 45

Acc. Floor (Voting vs. Chi-Test)

  • Big improvement in accuracy floor.

(Figure panels: Voting, Chi-Test.)

SLIDE 46

Difference: Chi-Test, Voting

  • Chi-test post-processor raises accuracy floor by at least 16 points.
  • Up to 56 points improvement for SVM-FAR.
  • On average, floor raised by 33 points.

SLIDE 47

Accuracy Variance

  • If chi-test doesn't affect average, but raises floor, what happens to ceiling?
  • Chi-test raises floor but lowers ceiling.
  • Reduces standard deviation of accuracy from σ = 30 to σ = 15.
  • Accuracy becomes more predictable.

SLIDE 48

Meta-Analysis

SLIDE 49

Median Confusion Matrices

  • Combine confusion matrices of all seven classifiers.
  • Enables meta-analysis of performance.
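
One way to combine the matrices, assuming "median" means the element-wise median across classifiers, is sketched below. The three small matrices are hypothetical stand-ins for the seven real ones.

```python
import statistics


def median_confusion_matrix(matrices):
    """Element-wise median of equally sized confusion matrices."""
    n_rows, n_cols = len(matrices[0]), len(matrices[0][0])
    return [
        [statistics.median([m[i][j] for m in matrices]) for j in range(n_cols)]
        for i in range(n_rows)
    ]


# Hypothetical 2x2 confusion matrices from three classifiers.
per_classifier = [
    [[90, 10], [20, 80]],
    [[80, 20], [30, 70]],
    [[100, 0], [10, 90]],
]
combined = median_confusion_matrix(per_classifier)
```

Cells that stay large off the diagonal in the median matrix are confusions shared by most classifiers, not quirks of a single one.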

SLIDE 50

Confusion Matrices

  • Frames
  • Calls

Problem areas:

  • BCCH mistaken for ALFL
  • BAOR mistaken for AMRO
  • BLJA mistaken for AMRO, etc.

SLIDE 51

Confusion Matrix Errors

  • Some errors can be explained because calls are similar (BAOR, AMRO).
  • Other errors are harder to explain: the birds don't sound similar (BLJA, AMRO).

SLIDE 52

Cause of Errors

  • Inadequacies are common across classifiers.
  • Hints at deficiency of something outside of recognition algorithms.
  • Pre-processing stage (initial feature extraction) remains the same regardless of classifier.
  • Features being used are not useful for distinguishing BLJA and AMRO.
  • Feature selection dictates that most data be discarded.

SLIDE 53

Possible Solutions

  • Select different feature set (time consuming).
  • Add more features (and increase dimensionality).
  • Train additional classifier to distinguish between AMRO and BLJA (and increase complexity).

SLIDE 54

Final Analysis

SLIDE 55

Final Figure of Merit

  • Is there a way to condense results into a single figure of merit (FOM)?

SLIDE 56

FOM: Characteristics

  • Figure of merit should:

– Ignore the scores of the training sets; CV set is most indicative of generalization ability.
– Make use of accuracy and accuracy floor.
– Penalize very low accuracy or acc. floor values:

  • Addition no good: 100 + 0 = 50 + 50.
  • Multiplication better: 100 × 0 < 50 × 50.

SLIDE 57

FOM: Geometric Mean

  • FOM used is geometric mean of CV accuracy and CV accuracy floor:

FOM = √(CV accuracy × CV accuracy floor)
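
As a small worked example, the geometric mean of two percentages can be computed as below. The function name is illustrative. The check values come from the slides: 100 and 0 give zero, 50 and 50 give 50, and the chi-test SVM-FAR figures (accuracy 79%, floor 63%) give about 70.5, in line with the reported FOM of 70.

```python
import math


def figure_of_merit(cv_accuracy, cv_accuracy_floor):
    """Geometric mean of CV accuracy and CV accuracy floor, both in percent."""
    return math.sqrt(cv_accuracy * cv_accuracy_floor)


lopsided = figure_of_merit(100, 0)   # a zero floor zeroes the FOM
balanced = figure_of_merit(50, 50)   # a balanced result keeps its value
svm_far = figure_of_merit(79, 63)    # chi-test SVM-FAR figures from the slides
```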

SLIDE 58

FOM: Comparison

  • Voting results:

– SVM-MID (57), NN-100 (47) are best.
– NN-20, KDE score zero.

  • Chi-test always better.

SLIDE 59

FOM: Comparison

  • Chi-test results:

– SVM better.
– SVM-FAR (70) is best: acc. = 79%, floor = 63%. Not best acc., but highest floor.
– NN-20, KDE worse, but avg. acc. still 7 times chance.

SLIDE 60

Conclusions

  • Calls can be recognized based on short-term features. Global features and order can be ignored.
  • Voting, chi-test (confusion matching) can be used to convert a collection of frames into a species estimate.
  • Chi-test can raise accuracy floor, reduce accuracy variance.
  • SVMs slightly outperform ANNs.
  • KDE and NN-20 have similar performance.

SLIDE 61

Future Directions

  • More species.
  • Other features: global, structural.
  • Improve pre-processor robustness.
  • Apply technique to musical instruments, other animals and sounds.
  • Continuous, as opposed to file-based, processing.
  • KDE: further investigation.

SLIDE 62

Thank you!

SLIDE 63

Overall Recognition Process

SLIDE 64

Neural Network: Overtraining

  • With mean squared error, NN-500 exhibits overtraining.
  • Accuracy calculation “hides” overtraining.