SLIDE 1

Tonic Identification System for Hindustani and Carnatic Music

Sankalp Gulati, Justin Salamon and Xavier Serra
Music Technology Group, Universitat Pompeu Fabra
{sankalp.gulati, justin.salamon, xavier.serra}@upf.edu

SLIDE 2

Introduction: Tonic in Indian art music

- The base pitch chosen by a performer, allowing a comfortable exploration of the full pitch range [1]
- Anchored as the 'Sa' swar in a performance (mostly)
- All other notes used in the raga exposition derive their meaning in relation to this pitch value
- All accompanying instruments are tuned using this pitch as a reference

[Figure: pitch contour over time, with the tonic pitch marked]

2nd CompMusic Workshop, Istanbul, 2012 7/23/12

SLIDE 3

Role of Drone Instrument

- The performer and the audience need to hear this pitch throughout the concert
- It reinforces the tonic and establishes all harmonic and melodic relationships

Examples: Tanpura, Sitar, Surpeti (Shrutibox), Electronic Tanpura

SLIDE 4

Introduction: Tonal structure of Tanpura

- Four strings
- Tunings:
  - Sa-Sa'-Sa'-Pa
  - Sa-Sa'-Sa'-Ma
  - Sa-Sa'-Sa'-Ni
- Special bridge with a thread inserted (Jvari)
  - Violates the Helmholtz law [2]
  - Rich overtones [1]

[Figure: tanpura bridge]

SLIDE 5

Introduction: Goals and Motivation

- Automatic labeling of the tonic in large databases of Indian art music
- Devise a system to identify:
  - the tonic pitch for vocal excerpts
  - the tonic pitch class profile for instrumental excerpts
- Use all the available data (audio + metadata) to achieve maximum accuracy
- Provide a confidence measure for each output of the system

SLIDE 6

Introduction: Goals and Motivation

- Tonic is fundamental information
- Tonic identification is a crucial input for:
  - intonation analysis
  - raga recognition
  - melodic motivic analysis

SLIDE 7

Relevant work: Tonic Identification

- Very little work done in the past
- Based on melody [4, 5]
- Ranjani et al. take advantage of melodic characteristics of Carnatic music [5]

SLIDE 8

Relevant work: Summary

- Utilized only the melodic aspects
- Used monophonic pitch trackers on heterophonic data
- Limited diversity in the databases: special raga categories, aalap sections, solo vocal recordings
- Unexplored aspects:
  - utilizing the background audio content comprising the drone sound
  - taking advantage of the different types of available data, like audio and metadata
  - evaluation on a diverse database

SLIDE 9

Methodology: System Overview

[Flowchart: audio and metadata pass through a sequence of yes/no decision stages; the system outputs the tonic, falling back to manual annotation when needed]

SLIDE 10

Methodology: System Overview

- Culture-specific characteristics for tonic identification:
  - presence of the drone*
  - culture-specific melodic characteristics
  - raga knowledge
  - melodic motifs
- Use a variable amount of data, sufficient to identify the tonic with maximum confidence:
  - audio data
  - metadata (male/female, Hindustani/Carnatic, raga, etc.)

SLIDE 11

Methodology: Tonic Identification

- Audio example: utilizing the drone sound
- Chroma or multi-pitch analysis

[Figure: multi-pitch analysis [7]]

SLIDE 12

Tonic Identification: Signal Processing

Pipeline: Audio → Sinusoid extraction (sinusoids) → Pitch salience computation (time-frequency salience) → Tonic candidate generation (tonic candidates)

SLIDE 13

Tonic Identification: Signal Processing

- STFT:
  - hop size: 11 ms
  - window length: 46 ms
  - window type: Hamming
  - FFT size: 8192 points
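Assuming a 44.1 kHz sample rate (the slides do not state it), these parameters translate into SciPy roughly as follows; this is a sketch, not the authors' code:

```python
import numpy as np
from scipy.signal import stft

fs = 44100                  # assumed sample rate; not given on the slide
hop = round(0.011 * fs)     # 11 ms hop  -> 485 samples
win = round(0.046 * fs)     # 46 ms window -> 2029 samples
nfft = 8192                 # zero-padded FFT size

x = np.zeros(fs)            # 1 s of placeholder audio
f, t, Z = stft(x, fs=fs, window='hamming', nperseg=win,
               noverlap=win - hop, nfft=nfft)
# f spans nfft // 2 + 1 = 4097 bins, ~5.4 Hz apart
```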

SLIDE 14

Tonic Identification: Signal Processing

- Spectral peak picking:
  - absolute threshold: -60 dB
  - relative threshold: -40 dB
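A minimal sketch of how the two thresholds combine (the exact peak-picking routine of [7] may differ): a local maximum is kept only if it is above -60 dB absolutely and within 40 dB of the frame's strongest peak.

```python
import numpy as np

def pick_peaks(mag_db, abs_thresh=-60.0, rel_thresh=-40.0):
    """Indices of local spectral maxima passing both dB thresholds."""
    floor = max(abs_thresh, mag_db.max() + rel_thresh)
    return [i for i in range(1, len(mag_db) - 1)
            if mag_db[i - 1] < mag_db[i] >= mag_db[i + 1]
            and mag_db[i] >= floor]
```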

SLIDE 15

Tonic Identification: Signal Processing

- Frequency/amplitude correction:
  - parabolic interpolation
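The standard three-point parabolic refinement (a generic sketch, not the exact code from [7]): fit a parabola through the peak bin and its two neighbours to recover a fractional bin position and corrected amplitude.

```python
def parabolic_refine(mag, i):
    """Refine the peak at bin i via a parabola through bins i-1, i, i+1.

    Returns (fractional bin position, interpolated peak height)."""
    a, b, c = mag[i - 1], mag[i], mag[i + 1]
    delta = 0.5 * (a - c) / (a - 2 * b + c)   # offset in (-0.5, 0.5) bins
    return i + delta, b - 0.25 * (a - c) * delta
```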

SLIDE 16

Tonic Identification: Signal Processing

- Harmonic summation [7]:
  - spectrum considered: 55-7200 Hz
  - frequency range: 55-1760 Hz
  - base frequency: 55 Hz
  - bin resolution: 10 cents per bin (120 per octave)
  - number of octaves: 5
  - maximum harmonics: 20
  - alpha: 1
  - beta: 0.8
  - squared-cosine window across 50 cents
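With a 55 Hz base and 10-cent bins, the frequency-to-bin mapping underlying the salience histogram can be sketched as below; the harmonic weighting is summarized in the comment (a reading of the listed parameters, not the authors' code):

```python
import math

F_REF = 55.0          # base frequency (Hz)
CENTS_PER_BIN = 10    # 120 bins per octave

def freq_to_bin(f):
    """Map a frequency in Hz to its salience-histogram bin."""
    return round(1200 * math.log2(f / F_REF) / CENTS_PER_BIN)

# Each spectral peak at frequency f contributes to the bins of its
# subharmonics f/h for h = 1..20, attenuated by beta**(h-1) with
# beta = 0.8, spread over a +/- 50-cent squared-cosine window.
```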

SLIDE 17

Tonic Identification: Signal Processing

 Tonic candidate generation

 Number of salience peaks per frame: 5  Frequency range: 110-550 Hz  After candidate selection salience is no longer considered!!!!
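Per frame, candidate generation then reduces to keeping the five strongest salience peaks inside 110-550 Hz. A sketch, where `salience_peaks` is a hypothetical list of (frequency, salience) pairs:

```python
def tonic_candidates(salience_peaks, lo=110.0, hi=550.0, n=5):
    """Top-n salience peaks within [lo, hi] Hz. Only the frequencies are
    returned, since salience is discarded after candidate selection."""
    in_range = [(f, s) for f, s in salience_peaks if lo <= f <= hi]
    in_range.sort(key=lambda fs: fs[1], reverse=True)
    return [f for f, _ in in_range[:n]]
```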

SLIDE 18

Tonic Identification : Two sub-tasks

- Caters to both vocal and instrumental excerpts:
  - identify the tonic pitch class (PC) using a multi-pitch histogram
  - estimate the correct octave using the predominant melody
- Uses the predominant melody extraction approach proposed by Salamon et al. [6]
- Tonic PCP: peak picking + machine learning
- Tonic octave estimation: rule-based method + classification-based approach
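One way to read the octave step (a hypothetical sketch, not the paper's actual rules): once the tonic pitch class is known, compare the predominant-melody histogram mass around each octave transposition of that class and keep the strongest.

```python
import numpy as np

BINS_PER_OCTAVE = 120  # 10-cent bins

def pick_octave(pc_bin, melody_hist, width=5):
    """Among octave transpositions of pitch-class bin pc_bin, return the
    absolute bin whose neighbourhood carries the most melody salience."""
    best_bin, best_mass = None, -1.0
    for b in range(pc_bin, len(melody_hist), BINS_PER_OCTAVE):
        mass = melody_hist[max(0, b - width): b + width + 1].sum()
        if mass > best_mass:
            best_bin, best_mass = b, mass
    return best_bin
```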

SLIDE 19

Tonic Identification : PC identification

- Classification-based template learning
- Two kinds of class mappings:
  - rank of the highest tonic PC
  - highest peak as tonic or non-tonic
- Features extracted: 20 (f1-f10, a1-a10)

[Figure: multi-pitch histogram, normalized salience vs. frequency bins (1 bin = 10 cents, ref: 55 Hz), with peaks f2-f5 marked]

SLIDE 20

Tonic Identification : PC identification

 Decision Tree:

[Figure: learned decision tree branching on peak frequency and salience thresholds, with example multi-pitch histograms showing Sa and Pa peaks]

SLIDE 21

Tonic Identification : Octave Identification

- Tonic octave:
  - rule-based method
  - classification-based approach
- 25 features: a1-a25

[Figure: predominant melody histogram, normalized salience vs. frequency bins (1 bin = 10 cents, ref: 55 Hz)]

SLIDE 22

Evaluation: Database

- Subset of the CompMusic database (>300 CDs) [3]
- Approach 2: 540 three-minute excerpts (PCP) + 238 full recordings (octave)

SLIDE 23

Evaluation: Database

- Tonic distribution
- Statistics (364 vocal excerpts): male (80%), female (20%), Hindustani (38%), Carnatic (62%), 36 unique artists
- Statistics (540 vocal and instrumental excerpts): Hindustani (36%), Carnatic (64%), 55 unique artists

[Figure: histogram of tonic frequency (Hz) vs. number of instances, for female and male singers]
SLIDE 24

Evaluation: Annotations

- Annotations done by the author
- Extracted 5 tonic candidates from multi-pitch histograms between 110-370 Hz
- A Matlab GUI was used to speed up the annotation procedure

SLIDE 25

Evaluation: Accuracy measures

- Output is correct if within 50 cents of the ground truth
- 10-fold cross-validation + rule-based classification
- Weka data mining tool
- Feature selection: CfsSubsetEval (features selected in >80% of folds)
- Classifier: J48 decision tree; performs better than:
  - SVM with polynomial kernel (6% difference in accuracy)
  - K* classifier (5% difference in accuracy)
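The 50-cent accuracy criterion amounts to checking the log-frequency distance between the estimate and the ground truth:

```python
import math

def correct_within_50_cents(f_est, f_gt):
    """True if the estimate is within half a semitone of the ground truth."""
    return abs(1200 * math.log2(f_est / f_gt)) <= 50.0
```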

SLIDE 26

Results

Approach   Map  #folds  Class EQ  #Features  Tonic pitch/PCP (%)  5th   4th   Other
AP1_EXP1   -    -       -         -          85                   10.7  0.93  3.3
AP1_EXP2   M1   1       no        1 (S2)     93.7                 1.48  8.9   0.9
AP1_EXP3   M1   10      no        4 (S3)     92.9                 1.9   3.5   1.7
AP1_EXP4   M1   10      yes       4 (S4)     74.2                 11    7.6   6.7
AP1_EXP5   M2   1       no        1 (S2)     91                   3.3   3     2.7
AP1_EXP6   M2   10      no        2 (S5)     91.8                 2.2   3     3
AP1_EXP7   M2   10      yes       2 (S5)     87.8                 4.2   4     3.9

- M1: tonic PCP rank; M2: highest peak tonic or non-tonic
- S1: [f2, f3, f5]; S2: [f2]; S3: [f2, f4, f6, a5]; S4: [f2, f3, a3, a5]; S5: [f2, f3]

SLIDE 27

Results

- Approach 2, octave identification:
  - rule-based approach: 99%
  - classification-based approach: 100%

SLIDE 28

Discussion: PCP Identification

- AP-1: performance for male singers (95%) vs. female singers (88%)
- Error cases: mostly Ma-tuning songs; more frequent for female singers
- Sensitive to the frequency range selected for tonic candidates; a range of 110-370 Hz works best

[Figure: example multi-pitch histograms with Sa and Pa peaks, and an error case with Sa and Ma peaks]

SLIDE 29

Discussion : Octave Identification

- Challenges faced by the rule-based approach:
  - Hindustani musicians go roughly 500 cents below the tonic
  - Carnatic musicians generally do not go that far below the tonic
  - melody estimation errors at low frequencies
  - the concept of Madhyam shruti

[Figure: predominant melody histogram, normalized salience vs. frequency bins (1 bin = 10 cents, ref: 55 Hz)]

SLIDE 30

Conclusions and Future Work

- The drone sound in the background provides an important cue for tonic identification and can be used to perform this task automatically
- The system should be fed more information to differentiate between 'Pa' and 'Ma' tunings
- Future work:
  - exploring melodic characteristics for tonic identification
  - deeper analysis of the confidence measure concept
  - studying the influence of cultural background on human performance in this task

SLIDE 31

REFERENCES

1. B. C. Deva. The Music of India: A Scientific Study. Munshiram Manoharlal Publishers, Delhi, 1980.
2. C. V. Raman. On some Indian stringed instruments. In Indian Association for the Cultivation of Science, volume 33, pages 29-33, 1921.
3. X. Serra. A multicultural approach in music information research. In 12th Int. Soc. for Music Info. Retrieval Conf., Miami, USA, Oct. 2011.
4. R. Sengupta, N. Dey, D. Nag, A. K. Datta, and A. Mukerjee. Automatic tonic (SA) detection algorithm in Indian classical vocal music. In National Symposium on Acoustics, pages 1-5, 2005.
5. H. G. Ranjani, S. Arthi, and T. V. Sreenivas. Carnatic music analysis: Shadja, swara identification and rAga verification in AlApana using stochastic models. In IEEE Workshop on Applications of Signal Processing to Audio and Acoustics (WASPAA), pages 29-32, 2011.
6. J. Salamon and E. Gómez. Melody extraction from polyphonic music signals using pitch contour characteristics. IEEE Transactions on Audio, Speech, and Language Processing, 20(6):1759-1770, Aug. 2012.
7. J. Salamon, E. Gómez, and J. Bonada. Sinusoid extraction and salience function design for predominant melody estimation. In Proc. 14th Int. Conf. on Digital Audio Effects (DAFx-11), pages 73-80, Paris, France, Sep. 2011.

SLIDE 32

Thank you Questions?
