
Applause Identification and its Relevance to Archival of Carnatic Music
Padi Sarala, Vignesh Ishwar, Ashwin Bellur, Hema A. Murthy
Computer Science Dept, IIT Madras, India
July 6, 2012, 2nd CompMusic Workshop

Outline of the presentation
- Introduction to Carnatic music concert
- Problem definition
- Feature extraction
  - Spectral flux
  - Spectral entropy
- Characterising the applause using cumulative sum (CUSUM)
- Highlights detection using CUSUM
- Results

Carnatic music concert (1)
- A Carnatic music concert can be 2 to 3 hours long.
- A concert consists of various pieces.
- A concert consists of compositions interlaced with improvisational aspects such as Raga Alapana, Nereval, Kalpanaswara, Thanam, Sloka and Thani Avarthanam.

Carnatic music concert (2)
- In a concert, the audience applauds the artist at the end of a piece.
- Sometimes the audience also applauds in between improvisational aspects such as Raga vocal, Raga violin, After song, Kalpana swara, Thanam and Thani Avarthanam.
- Most Carnatic music recordings archived today are either:
  - manually segmented into pieces, or
  - stored whole as a single recording.

Applications of Applause Identification
Existing work on applause identification:
- Manoj et al. (2011) discuss how applause is detected in continuous speech meetings and how it can be used as a key indicator of highlights in a meeting.
- Lie Lu et al. (2001) discuss techniques for classifying and segmenting an audio signal into speech, music, silence and environmental sounds such as applause and laughter; these segments can be used as an index for audio retrieval.
- Z. Xiong et al. (2003) discuss how applause is detected to determine the highlights of a game.

Problem Definition
- Identify the applauses in a given Carnatic music concert using spectral domain features.
  - The concert can then be automatically segmented into individual pieces for archival purposes.
- Find the duration and strength of each applause using the CUSUM technique.
  - This allows the highlights of the concert to be determined.

Characteristics of Applause and Music
[Figure: Typical sequence of applause and music segments (time domain). Axes: amplitude against time in seconds.]
In the time domain, an applause segment is not rhythmically structured, whereas the corresponding music segment is more structured.

Characteristics of Applause and Music
[Figure: Typical sequence of applause and music segments (spectral domain). Axes: log magnitude (dB) against frequency in Hz.]
The power spectrum of applause is flat, whereas the spectrum of music is structured.

Feature Extraction
- Selecting a good feature for classification or segmentation is a crucial task.
- The spectral properties of most audio signals change slowly over time.
- To discriminate between music and applause, the following features are used:
  - Spectral flux
  - Spectral entropy

Spectral flux (1)
Spectral flux (SF), also called spectral variation, characterises the change in spectra between two adjacent frames of an audio signal. It measures how quickly the power spectrum changes.

$$SF[n] = \int_{\omega} \left( |X_n(\omega)| - |X_{n+1}(\omega)| \right)^2 \, d\omega \qquad (1)$$

where $X_n(\omega)$ is the magnitude spectrum of the $n$th frame of the audio signal.
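A minimal sketch of the spectral flux definition above in discrete form, where the integral over frequency becomes a sum over frequency bins. The function name `spectral_flux` and the toy spectra are illustrative assumptions, not taken from the presentation.

```python
def spectral_flux(mag_frames):
    """Discrete spectral flux between each pair of adjacent frames.

    mag_frames: list of magnitude spectra, one list of bin magnitudes per frame.
    Returns one flux value per adjacent pair, so len(mag_frames) - 1 values.
    """
    flux = []
    for cur, nxt in zip(mag_frames, mag_frames[1:]):
        # Sum over frequency bins of the squared magnitude difference.
        flux.append(sum((a - b) ** 2 for a, b in zip(cur, nxt)))
    return flux


# Toy magnitude spectra (hypothetical values): two identical frames, then a change.
frames = [[1.0, 2.0, 3.0], [1.0, 2.0, 3.0], [2.0, 2.0, 1.0]]
print(spectral_flux(frames))
```

Identical adjacent frames give a flux of zero, and the value grows with the squared change in each bin.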

Spectral flux (2)
Different normalisations of spectral flux are:
1. Spectral flux with no normalisation.
2. Power spectral density normalisation: in this approach $XNorm_n(\omega)$ is defined as

$$XNorm_n(\omega) = \frac{X_n(\omega)}{\int_{\omega} X_n(\omega) \, d\omega} \qquad (2)$$

3. Peak normalisation: in this approach $XNorm_n(\omega)$ is defined as

$$XNorm_n(\omega) = \frac{X_n(\omega)}{\max_{\omega} \left( X_n(\omega) \right)} \qquad (3)$$
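The power spectral density and peak normalisations described above could be sketched as follows, again with sums over bins standing in for the integrals. The function names and the handling of an all-zero spectrum (returned unchanged) are assumptions made for illustration.

```python
def psd_normalise(mag):
    """Divide each bin by the sum over all bins (discrete form of Eq. 2)."""
    total = sum(mag)
    # Assumption: an all-zero frame is returned unchanged to avoid division by zero.
    return [m / total for m in mag] if total > 0 else list(mag)


def peak_normalise(mag):
    """Divide each bin by the largest bin value (Eq. 3)."""
    peak = max(mag)
    # Assumption: an all-zero frame is returned unchanged to avoid division by zero.
    return [m / peak for m in mag] if peak > 0 else list(mag)


spectrum = [1.0, 3.0, 4.0]  # hypothetical magnitude spectrum
print(psd_normalise(spectrum))   # bins now sum to 1
print(peak_normalise(spectrum))  # largest bin is now 1
```

Either normalisation removes the overall level of a frame, so a flux computed on the normalised spectra reflects changes in spectral shape rather than changes in loudness.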

Spectral flux (3)
[Figure: Different normalisations of spectral flux. Three panels plotted against time in seconds, each with music and applause segments marked: spectral flux of unnormalised spectra, spectral flux with power spectral density normalisation, and spectral flux of peak normalised spectra.]

Spectral Entropy (1)
Entropy is a measure of the randomness of a system; spectral entropy (SE) applies it to the normalised power spectrum of a frame.
Shannon's entropy of a discrete stochastic variable $X$ with probability mass function $p(x)$ is given by

$$H(X) = - \sum_{i=1}^{N} p(x_i) \log_2 \left[ p(x_i) \right] \qquad (4)$$

The power spectrum of the $n$th frame is normalised so that it behaves like a probability distribution:

$$PSD_n(\omega) = \frac{|X_n(\omega)|^2}{\int_{\omega} |X_n(\omega)|^2 \, d\omega}$$

$$SE[n] = - \int_{\omega} PSD_n(\omega) \log PSD_n(\omega) \, d\omega \qquad (5)$$
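A minimal sketch of the spectral entropy computation above for a single frame, with a sum over bins in place of the integral. The base-2 logarithm follows Eq. (4) and is an assumption for Eq. (5), which does not state a base; the function name and the toy spectra are illustrative.

```python
import math


def spectral_entropy(mag):
    """Entropy (in bits) of the normalised power spectrum of one frame.

    mag: magnitude spectrum as a list of non-negative bin magnitudes.
    """
    power = [m * m for m in mag]
    total = sum(power)
    if total == 0:
        return 0.0  # assumption: a silent frame is given zero entropy
    psd = [p / total for p in power]
    # Bins with zero probability contribute nothing (0 * log 0 is taken as 0).
    return sum(-p * math.log2(p) for p in psd if p > 0)


flat = [1.0] * 8            # flat spectrum, the shape described for applause
peaky = [1.0] + [0.0] * 7   # all energy in a single bin
print(spectral_entropy(flat), spectral_entropy(peaky))
```

A flat spectrum reaches the maximum entropy for its number of bins, while a spectrum concentrated in one bin has zero entropy, which is why the feature separates flat applause spectra from structured music spectra.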

Spectral Entropy (2)
[Figure: Spectral entropy of music signal. Axes: spectral entropy against time in seconds, with music and applause segments marked.]

Database Used
- 19 concerts by male and female singers are used for the experiments.
- All concerts are vocal, that is, the lead musician is a singer.
- Each concert has 15-20 applauses, resulting in a total of 343 applauses.

Experimental analysis
- For the 19 concerts, spectral flux and spectral entropy features are extracted for frames of 0.25 s duration with an overlap of 0.01 s, at a sampling frequency of 44.1 kHz.
- The extracted features are smoothed by a rectangular moving average filter of length 15.
- For all concerts, applause locations and types of applause are marked manually by a musician.
- Based on this ground truth, the DET curve and Equal Error Rates (EER) are calculated for all the extracted features.
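The smoothing step could look like the sketch below: a rectangular (equal-weight) moving average of length 15 applied to a per-frame feature sequence. Centring the window and truncating it at the edges are assumptions, since the presentation does not specify edge handling; the function name is illustrative.

```python
def moving_average(values, length=15):
    """Rectangular moving average, centred on each sample.

    Near the edges the window is truncated to the samples that exist
    (an assumption; other edge treatments are equally plausible).
    """
    half = length // 2
    smoothed = []
    for i in range(len(values)):
        window = values[max(0, i - half): i + half + 1]
        smoothed.append(sum(window) / len(window))
    return smoothed


# A single spike is spread over neighbouring frames (short window for readability).
print(moving_average([0.0, 0.0, 3.0, 0.0, 0.0], length=3))
```

Smoothing suppresses isolated frame-level spikes in the feature, so that a threshold responds to sustained events such as applause rather than to single noisy frames.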

Experimental Analysis
- The DET curve is plotted for applause detection at various thresholds.
- The Equal Error Rates (EER) are given in the table.

[Figure: DET curve for applause detection. Axes: miss probability (in %) against false alarm probability (in %). Curves: Entropy, fluxnonorm, fluxnorm, with EER values marked.]

Table: EER for applause detection
Method                    EER
Spectral flux (no norm)   44.55%
Spectral flux             23.33%
Spectral entropy          17.33%
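For readers unfamiliar with the metric: a DET curve plots the miss rate against the false alarm rate as the decision threshold varies, and the EER is the point where the two rates are equal. The sketch below approximates it by sweeping the threshold over the observed feature values. It assumes that higher feature values indicate applause and that both classes are present; the function name and toy data are hypothetical, and this is not the authors' evaluation code.

```python
def equal_error_rate(scores, labels):
    """Approximate EER by sweeping a threshold over the observed scores.

    scores: one feature value per frame (higher is assumed to mean applause).
    labels: True for applause frames, False for music frames (ground truth).
    Returns (eer, threshold) at the threshold where the miss rate and the
    false alarm rate are closest; eer is their average at that threshold.
    """
    n_pos = sum(1 for is_app in labels if is_app)
    n_neg = len(labels) - n_pos
    best = None
    for thr in sorted(set(scores)):
        # Miss: an applause frame scored below the threshold.
        miss = sum(1 for s, is_app in zip(scores, labels) if is_app and s < thr) / n_pos
        # False alarm: a music frame scored at or above the threshold.
        fa = sum(1 for s, is_app in zip(scores, labels) if not is_app and s >= thr) / n_neg
        gap = abs(miss - fa)
        if best is None or gap < best[0]:
            best = (gap, (miss + fa) / 2, thr)
    return best[1], best[2]


# Hypothetical frame scores and ground truth labels.
scores = [0.1, 0.2, 0.3, 0.4, 0.6, 0.7, 0.8, 0.9]
labels = [False, False, False, True, False, True, True, True]
eer, threshold = equal_error_rate(scores, labels)
print(eer, threshold)
```

A lower EER means the feature separates the two classes better at the balanced operating point, which is how the three rows of the table can be compared.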

Introduction to the cumulative sum (CUSUM) method
- With spectral flux and spectral entropy, applause locations are identified based on a threshold. This may not be sufficient to determine the duration and strength of an applause.
- CUSUM is a non-parametric approach that can be used to identify statistical inhomogeneity in a given signal.
- CUSUM is estimated as follows. Let $X[n]$ be the value of the feature extracted at time $n$. Then

$$Y[n] = X[n] - a$$

$$\mathrm{Cusum}[n] = \begin{cases} \mathrm{Cusum}[n-1] + Y[n], & Y[n] > 0 \\ 0, & \text{otherwise} \end{cases}$$

- If $\mathrm{Cusum}[n] > \Theta$, it suggests that there is a significant structural shift in the series.
- The values of $a$ and $\Theta$ have to be estimated empirically and may vary across data sets.
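The CUSUM recursion above can be sketched directly in code. The `cusum` function follows the definition as stated; the second function, `applause_runs`, is a hypothetical illustration of how duration and strength might be read off the result (run length and peak value), which is an assumption and not something this part of the presentation specifies. The toy feature values, `a` and `theta` are made up.

```python
def cusum(feature, a):
    """One-sided CUSUM: accumulate X[n] - a while it is positive, else reset to 0.

    feature: sequence X[n] of feature values.
    a: offset subtracted from each value (to be tuned empirically).
    """
    out = []
    prev = 0.0
    for x in feature:
        y = x - a
        prev = prev + y if y > 0 else 0.0
        out.append(prev)
    return out


def applause_runs(cusum_values, theta):
    """Hypothetical post-processing: return (start, end, peak) for each run of
    positive CUSUM values whose peak exceeds theta."""
    runs, start = [], None
    # A trailing 0.0 sentinel closes a run that reaches the end of the sequence.
    for i, c in enumerate(list(cusum_values) + [0.0]):
        if c > 0 and start is None:
            start = i
        elif c == 0 and start is not None:
            peak = max(cusum_values[start:i])
            if peak > theta:
                runs.append((start, i - 1, peak))
            start = None
    return runs


feature = [0.0, 0.25, 1.0, 1.5, 1.0, 0.25, 0.0]  # toy feature sequence
c = cusum(feature, a=0.5)
print(c)
print(applause_runs(c, theta=1.0))
```

Under this reading, the run length gives a duration in frames and the peak gives a strength, so a longer or more intense applause produces a larger accumulated value than a single threshold crossing would show.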
