 
The Audio Degradation Toolbox and its Application to Robustness Evaluation
Sebastian Ewert and Matthias Mauch
http://code.soundsoftware.ac.uk/projects/audio-degradation-toolbox/
Friday, 1 November 13
Reverb (photo by steveleenow)
Lossy compression (photo by dan taylor)
Bad analog-to-digital conversion (photo by emilio di fabio)
Low-quality microphone (photo by JeffaCubed)
Environmental noise
...and many other things degrade audio:
- irregular tape playback
- dynamic range compression in radio and TV broadcasts
- audio speedup on the radio
- noise
- clipping and other distortion
- ...and yet more.
Audio Collection Quality
Most audio collections:
- contain some audio of low quality
- contain recordings of different qualities
- contain recordings of unknown quality
Impact on Music Informatics
Methods are usually tested on only one (or a few) audio collections, hence:
- feature extractors (etc.) might fail in the real world (affects MIR researchers’ work)
- if feature extractors work, it is not clear whether they correlate with content or with audio quality (affects ‘digital musicologists’ and industry)
Audio Degradation Toolbox
- most comprehensive collection of Matlab code for audio degradation
- designed to make it easy to degrade audio in many different ways
- aim: encourage MIR researchers to test their algorithms under many different conditions
- GPL open source on SoundSoftware
Degradation Units
- Add Noise
- Add Sound (sounds included: pub sound environment, vinyl crackle)
- Attenuation
- Aliasing
- Clipping
- Delay
- Dynamic Range Compression
- Apply Impulse Response (room, microphone, speaker and vinyl player IRs included)
- High-pass filter
- Low-pass filter
- MP3 Compression
- Saturation
- Speedup
- Wow Resampling
Degradation Unit Example
    parameter.noiseColor = 'brown';
    [audio_out, timestamps_out] = degradationUnit_addNoise(audio, samplingFreq, timestamps, parameter);
- example sound before / after
- why “timestamps”? We’ll see later.
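To make the shape of such a unit concrete, here is a minimal Python sketch of a noise unit. It is not the toolbox's Matlab code: `add_noise`, `brown_noise`, `rms` and the `snr_db` argument are hypothetical names chosen for illustration, and brown noise is approximated as a running sum of white noise. The point is the interface: audio and timestamps go in, degraded audio and (here unchanged) timestamps come out.

```python
import math
import random


def rms(signal):
    """Root-mean-square level of a sequence of samples."""
    return math.sqrt(sum(x * x for x in signal) / len(signal))


def brown_noise(n, rng):
    """Approximate brown noise as a running sum of white Gaussian noise."""
    out, level = [], 0.0
    for _ in range(n):
        level += rng.gauss(0.0, 1.0)
        out.append(level)
    mean = sum(out) / n
    return [x - mean for x in out]  # remove the DC offset


def add_noise(audio, timestamps, snr_db, seed=0):
    """Hypothetical noise unit: returns (audio_out, timestamps_out)."""
    rng = random.Random(seed)
    noise = brown_noise(len(audio), rng)
    # Scale the noise so that 20*log10(rms(audio) / rms(scaled noise)) == snr_db.
    gain = rms(audio) / (rms(noise) * 10 ** (snr_db / 20.0))
    audio_out = [a + gain * n for a, n in zip(audio, noise)]
    # Adding noise does not move events in time, so timestamps pass through.
    return audio_out, list(timestamps)


# Usage: one second of a 440 Hz sine at 8 kHz, degraded at 20 dB SNR.
sr = 8000
audio = [math.sin(2 * math.pi * 440 * i / sr) for i in range(sr)]
audio_out, timestamps_out = add_noise(audio, [0.25, 0.5], snr_db=20)
```

Returning timestamps even from a unit that does not change timing keeps every unit's signature identical, which is what makes units chainable.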
Degradations
- to make complex “Degradations” we can make chains from degradation units
- ... like audio effects!
- Example: Radio Broadcast Degradation = Dynamic Range Compr. → Speedup
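The chaining idea can be sketched in a few lines of Python. Everything here is an illustrative assumption rather than the toolbox's implementation: `compress_dynamic_range` and `speed_up` are toy stand-ins for the real units, their default parameters are arbitrary, and `chain` is a hypothetical helper.

```python
import math


def compress_dynamic_range(audio, timestamps, threshold=0.5, ratio=4.0):
    """Toy compressor: sample magnitudes above threshold are reduced by ratio."""
    out = []
    for x in audio:
        mag = abs(x)
        if mag > threshold:
            mag = threshold + (mag - threshold) / ratio
        out.append(math.copysign(mag, x))
    return out, list(timestamps)  # timing is unchanged


def speed_up(audio, timestamps, factor=1.25):
    """Toy speedup by linear-interpolation resampling."""
    n_out = int(len(audio) / factor)
    out = []
    for i in range(n_out):
        pos = i * factor
        lo = int(pos)
        hi = min(lo + 1, len(audio) - 1)
        frac = pos - lo
        out.append((1 - frac) * audio[lo] + frac * audio[hi])
    # Events occur earlier in the faster audio, so timestamps shrink too.
    return out, [t / factor for t in timestamps]


def chain(*units):
    """Compose units that share the (audio, timestamps) -> (audio, timestamps) shape."""
    def degradation(audio, timestamps):
        for unit in units:
            audio, timestamps = unit(audio, timestamps)
        return audio, timestamps
    return degradation


radio_broadcast = chain(compress_dynamic_range, speed_up)
degraded, new_times = radio_broadcast([1.0] * 100, [1.0, 2.0])
```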
Degradations: examples
- Lots of audio examples (file://localhost/Users/matthiasm/code/audio-degradation-toolbox/html/audio_examples.html)
- Examples with spectrogram:
  - Wow resampling on cello (file6)
  - Live Recording on file1
Comparing to Ground Truth
- one main purpose: evaluate methods under different degradations
- problem: we have time-distorting degradations
- solution: every degradation can also transform ground truth to the time line of the degraded audio
- example: beat tracking ground truth after “Speedup” degradation
[Figure: original ground truth and transformed ground truth shown on a time axis]
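One way to picture the ground-truth transform is as a mapping from the original time line to the degraded one. The Python sketch below assumes that mapping is given as matching anchor times and is linear between them; `warp_timestamps` and the anchor representation are hypothetical and not taken from the toolbox. A uniform speedup needs only two anchors, while a non-uniform warp would use more.

```python
from bisect import bisect_right


def warp_timestamps(timestamps, anchors_orig, anchors_degraded):
    """Map times from the original to the degraded time line.

    anchors_orig must be strictly increasing; the mapping is assumed to be
    linear between consecutive anchors (and extrapolated beyond them).
    """
    mapped = []
    for t in timestamps:
        # Pick the segment containing t, clamped to the first/last segment.
        i = bisect_right(anchors_orig, t) - 1
        i = max(0, min(i, len(anchors_orig) - 2))
        t0, t1 = anchors_orig[i], anchors_orig[i + 1]
        d0, d1 = anchors_degraded[i], anchors_degraded[i + 1]
        mapped.append(d0 + (t - t0) * (d1 - d0) / (t1 - t0))
    return mapped


beats = [0.5, 1.0, 1.5, 2.0]
# Uniform speedup: a 10 s recording becomes 8 s long.
sped_up_beats = warp_timestamps(beats, [0.0, 10.0], [0.0, 8.0])
# Non-uniform warp: the first second is played at double speed,
# the second second at two-thirds speed.
warped_beats = warp_timestamps(beats, [0.0, 1.0, 2.0], [0.0, 0.5, 2.0])
```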
Revisit Example: Radio Broadcast degradation
[Diagram: audio and ground truth timestamps pass through the degradation unit Dynamic Range Compr. and then the degradation unit Speedup, yielding degraded audio and transformed ground truth]
Experiments on ‘Real-World’ Degradations
- Live Recording
- Radio Broadcast
- Smartphone Playback
- Smartphone Recording
- Strong MP3 Compression
- Vinyl Recording
Results I: Audio ID
- audio ID fails for most “Real-World” degradations, not for MP3
- robustness to pink noise is ok

             correct   incorrect   not identified
Original       100         0             0
Live             0         0           100
Radio            3         3            94
PhonePlay        0         1            99
PhoneRec         5         7            88
MP3            100         0             0
Vinyl            4         0            96

[Figure: correct identifications (0 to 100) against pink-noise level in dB SNR: orig, 40, 30, 20, 10, 5, 0]
Results II: Score-to-audio alignment
- pretty much falls over for “Live” and “Phone Playback” degradations
- explanations: onset duplication; bass harmony missing
[Figure: percentage in 50 ms window (30 to 100) for Original, Live, Radio, PhonePlay, PhoneRec, MP3, Vinyl]
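The “percentage in 50 ms window” measure can be sketched as follows, assuming the alignment yields exactly one estimated time per reference note, in the same order. The function name and the pairing assumption are illustrative; only the 50 ms window comes from the slide.

```python
def percent_within_window(reference, estimated, window=0.05):
    """Percentage of paired times that lie within `window` seconds of the reference."""
    if len(reference) != len(estimated):
        raise ValueError("expected one estimated time per reference time")
    if not reference:
        raise ValueError("need at least one reference time")
    hits = sum(1 for r, e in zip(reference, estimated) if abs(r - e) <= window)
    return 100.0 * hits / len(reference)


# Three of the four aligned notes fall within 50 ms of the ground truth.
score = percent_within_window([1.0, 2.0, 3.0, 4.0], [1.01, 2.2, 2.97, 4.04])
```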
Results III: Beat-tracking
- compare two methods: BeatRoot, Davies
- very similar, but Davies more robust to “Live” degradation
[Figure: F measure (0.0 to 1.0) for BeatRoot and Davies on Original, Live, Radio, PhonePlay, PhoneRec, MP3, Vinyl]
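For readers unfamiliar with the beat-tracking F measure, here is a minimal Python sketch. The slides do not state the tolerance or the matching procedure used in the experiments; the ±70 ms tolerance and the greedy one-to-one matching below are assumptions made purely for illustration.

```python
def beat_f_measure(reference, estimated, tolerance=0.07):
    """F measure of estimated beats against reference beats.

    Each reference beat can be matched by at most one estimated beat that
    lies within `tolerance` seconds (assumed value, not from the slides).
    """
    ref, est = sorted(reference), sorted(estimated)
    i = j = hits = 0
    while i < len(ref) and j < len(est):
        diff = est[j] - ref[i]
        if abs(diff) <= tolerance:
            hits += 1
            i += 1
            j += 1
        elif diff < 0:
            j += 1  # estimated beat is too early to match any later reference
        else:
            i += 1  # reference beat was missed
    if hits == 0:
        return 0.0
    precision = hits / len(est)
    recall = hits / len(ref)
    return 2 * precision * recall / (precision + recall)


f = beat_f_measure([0.5, 1.0, 1.5, 2.0], [0.52, 1.0, 1.75, 2.03, 2.5])
```

In this example three beats match, giving precision 3/5 and recall 3/4.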
Results IV: Chord recognition
- compare two methods: Chordino, HPA
- HPA usually better, Chordino more robust on “Phone Play”
[Figure: relative correct overlap (0.0 to 1.0) for Chordino and HPA on Original, Live, Radio, PhonePlay, PhoneRec, MP3, Vinyl]
[Figure: relative correct overlap (0.0 to 1.0) for Chordino and HPA on Original, HP 50, HP 100, HP 200, HP 400, HP 800]
[Figure: two panels, “original” and “400Hz High-pass”, showing pitch classes C to B over time]
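“Relative correct overlap” can be read as the fraction of the annotated duration during which the estimated chord label equals the reference label. The sketch below assumes that reading, exact label matching, and non-overlapping `(start, end, label)` segments in each list; the function name and data layout are hypothetical.

```python
def relative_correct_overlap(reference, estimated):
    """Fraction of reference duration where the estimated label matches.

    Both arguments are lists of (start, end, label) tuples in seconds,
    assumed to be non-overlapping within each list.
    """
    total = sum(end - start for start, end, _ in reference)
    if total <= 0:
        raise ValueError("reference must cover a positive duration")
    correct = 0.0
    for r_start, r_end, r_label in reference:
        for e_start, e_end, e_label in estimated:
            if r_label != e_label:
                continue
            overlap = min(r_end, e_end) - max(r_start, e_start)
            if overlap > 0:
                correct += overlap
    return correct / total


reference = [(0.0, 2.0, "C"), (2.0, 4.0, "G")]
estimated = [(0.0, 1.0, "C"), (1.0, 3.0, "Am"), (3.0, 4.0, "G")]
rco = relative_correct_overlap(reference, estimated)
```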
Summary
Audio Degradation Toolbox offers:
- easy-to-use degradations
- more comprehensive than other existing toolboxes
- ground truth time-line transform to evaluate on time-warping degradations
Results show: ADT is useful to detect strengths and weaknesses of MIR methods
For paper, audio examples, source code: http://code.soundsoftware.ac.uk/projects/audio-degradation-toolbox
What’s up next?
- convince everyone to use the ADT :)
- work with it ourselves...
  - degraded audio as additional training data
  - effect of degradation on human ground truth labelling