Acoustic Fingerprinting
Soundz
Jake Runzer
June 28, 2018

Outline
1. What is Acoustic Fingerprinting
2. Fingerprinting for Music Identification
3. Spectrograms
4. History
5. My Implementation
6. Demo
7. References

Overview of Acoustic Fingerprints
An audio fingerprint is a compact signature that summarizes an audio signal.

Requirements for Acoustic Fingerprints
A fingerprint should have the following properties:
- Is unique to that specific audio signal
- Does not depend on the binary representation of the audio
- Represents how humans hear the audio

Overview of Music Identification (I)
Use a database of fingerprints belonging to known sources to identify a fingerprint belonging to an unknown source.

Overview of Music Identification (II)

Examples
You have probably used, or know of, apps that use audio fingerprinting for music identification:
- Shazam
- SoundHound

Requirements for Music Identification
Music identification is often done on mobile devices in noisy environments, so:
- Size of data generated is as small as possible
- Low computational footprint
- Length of audio required to get a match is short (<10 sec)
- Noise/distortion agnostic

Typical Pipeline
1. Capture audio on mobile device
2. Create fingerprint on device and send to matching server
3. Database is normally an inverted index of fingerprint -> song
4. Approximate nearest neighbour search is performed to find best candidates
5. Temporal alignment step is applied to the most similar matches
6. Return best matched song to mobile device

Why Spectrograms?
Almost all fingerprinting techniques rely on audio spectrograms.
- A spectrogram represents how humans hear audio more closely than the raw binary representation does
- Time and frequency resolution can be adjusted to make the algorithm more robust to noise

What are Spectrograms?
A visual representation of the spectrum of frequencies of a sound as they vary with time.

Short-Time Fourier Transform (STFT)
Used to determine the frequency content of local sections of a signal as it changes over time. A window is moved over the audio in overlapping steps. At each step the Fourier transform is computed using the FFT.
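
The windowing described on this slide can be sketched in a few lines of NumPy. This is a minimal illustration, not the code from the talk: the function name `stft_magnitude`, the 8 kHz sample rate, and the synthetic 1 kHz test tone are all assumptions made for the example.

```python
import numpy as np

def stft_magnitude(signal, window_length=1024, hop=512):
    """Slide a Hamming window over the signal and FFT each frame."""
    window = np.hamming(window_length)
    n_frames = 1 + (len(signal) - window_length) // hop
    frames = np.stack([
        signal[i * hop : i * hop + window_length] * window
        for i in range(n_frames)
    ])
    # rfft keeps only the non-negative frequencies of a real signal.
    return np.abs(np.fft.rfft(frames, axis=1)).T  # shape: (freq_bins, frames)

sample_rate = 8000  # assumed sample rate for the synthetic tone
t = np.arange(sample_rate) / sample_rate
tone = np.sin(2 * np.pi * 1000 * t)  # one second of a 1 kHz sine

spec = stft_magnitude(tone)
peak_bin = int(spec[:, 0].argmax())          # strongest bin in the first frame
peak_hz = peak_bin * sample_rate / 1024      # convert bin index to Hz
```

With a hop of half the window length, consecutive frames overlap by 50%, and the strongest bin of each frame lands on the tone's frequency.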

STFT Parameters
These values can be configured to change how the spectrogram is generated:
- Window length
- FFT length
- Overlap amount

Approaches
A few common approaches that have been used. All rely on audio spectrograms.
- Computer vision based
- Wavelet based
- Peak based

Computer Vision for Music Identification
Intuition: 1D audio signals can be processed as conventional images when viewed in the time-frequency spectrogram representation.
- Spectrogram is treated as a set of overlapping images
- Train AdaBoost classifiers on box-filters
- Output of each classifier is a binary value representing the difference between values aggregated in two sub-rectangular regions
- Use the concatenated output of the classifiers as the fingerprint
https://ieeexplore.ieee.org/document/1467322/

Wavelet-Based
- Compute overlapping spectrogram images
- Decompose images using multi-resolution Haar wavelets
- Retain only the top-t wavelets, where t is much smaller than the size of the spectrogram
- Only keep sign information
- Compare two spectrograms by computing the byte-wise Hamming distance
https://www.sciencedirect.com/science/article/pii/S0031320308001702
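
The last three steps (top-t selection, sign-only storage, Hamming comparison) can be illustrated as below. This is a simplified sketch that assumes the Haar wavelet coefficients have already been computed. The helper names `top_t_signs` and `hamming` and the tiny 2x2 coefficient arrays are hypothetical, and the distance is taken element-wise over the sign vector rather than over the packed byte encoding.

```python
import numpy as np

def top_t_signs(coeffs, t):
    """Keep only the signs of the t largest-magnitude coefficients; zero the rest."""
    flat = coeffs.ravel()
    keep = np.argsort(np.abs(flat))[-t:]
    signs = np.zeros(flat.shape, dtype=np.int8)
    signs[keep] = np.sign(flat[keep]).astype(np.int8)
    return signs

def hamming(a, b):
    """Number of positions where two sign vectors differ."""
    return int(np.count_nonzero(a != b))

# Hypothetical coefficient arrays standing in for real wavelet decompositions.
original = top_t_signs(np.array([[5.0, -0.1], [0.2, -7.0]]), t=2)
noisy = top_t_signs(np.array([[4.0, 0.3], [-0.2, -6.0]]), t=2)
different = top_t_signs(np.array([[-5.0, 0.1], [0.2, 7.0]]), t=2)

same_distance = hamming(original, noisy)
other_distance = hamming(original, different)
```

Small perturbations that leave the dominant coefficients and their signs unchanged produce a distance of zero, which is the property that makes this representation robust to noise.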

Peak-Pair Hashing
The original Shazam algorithm.
- Look only at spectrogram peaks
- Peaks are more likely to survive ambient noise
- A peak analysis of music and noise together will contain spectral peaks due to the music and the noise as if they were analyzed separately
- Look at pairs of peaks and create lots of fingerprints per audio sample
https://www.ee.columbia.edu/~dpwe/papers/Wang03-shazam.pdf

Peak-Pair Improvements
Improvement on Wang’s algorithm (peak-pair hashing):
"Fingerprints are generated using a modulated complex lapped transform-based non-repeating foreground audio extraction and an adaptive threshold method for prominent peak detection."

My Implementation
I implemented music identification using peak-pair hashing (the original Shazam algorithm) in Python using the NumPy and SciPy libraries.

Architecture

Working Example
Throughout the next slides we will look at the song "Kids" by MGMT.
https://www.youtube.com/watch?v=aBd46BbdTfs

Spectrogram Creation
I use the following parameters to create the spectrogram:
- Window: Hamming
- Window size: 1024
- Overlap: 0.5
- FFT size: 1024
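
These parameters map directly onto `scipy.signal.spectrogram`. The sketch below is one possible way to apply them, not necessarily how the project does it. The 44.1 kHz sample rate and the synthetic 440 Hz tone standing in for decoded song samples are assumptions, since the slides do not state them.

```python
import numpy as np
from scipy import signal

sample_rate = 44100  # assumption: the slides do not state the sample rate
t = np.arange(sample_rate) / sample_rate
audio = np.sin(2 * np.pi * 440 * t)  # placeholder for decoded song samples

freqs, times, sxx = signal.spectrogram(
    audio,
    fs=sample_rate,
    window="hamming",
    nperseg=1024,   # window size
    noverlap=512,   # 0.5 overlap of a 1024-sample window
    nfft=1024,      # FFT size
)

# Convert power to decibels; the small epsilon avoids log(0).
log_sxx = 10 * np.log10(sxx + 1e-10)
```

An FFT size of 1024 yields 513 frequency bins for a real-valued signal, so each column of `sxx` describes one 1024-sample window of audio.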

Constellations
Time-frequency peaks are found using an image local maxima filter with a neighbourhood of 15 pixels (along both the frequency and time axes). For "Kids", there are 14425 peaks.
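
A local maxima filter of this kind can be sketched with `scipy.ndimage.maximum_filter`. The example reads "neighbourhood of 15 pixels" as a 15x15 square window, which is an assumption. The `min_value` threshold is also an addition for this sketch, used to keep flat silent regions from counting as peaks.

```python
import numpy as np
from scipy.ndimage import maximum_filter

def find_peaks(spec, neighbourhood=15, min_value=0.0):
    """Return (freq_bin, time_bin) points that are the maximum of their neighbourhood."""
    # A point is a peak if it equals the maximum of the window centred on it.
    local_max = maximum_filter(spec, size=neighbourhood) == spec
    # Assumed threshold: ignore flat background regions such as silence.
    local_max &= spec > min_value
    freq_idx, time_idx = np.nonzero(local_max)
    return list(zip(freq_idx.tolist(), time_idx.tolist()))

# Tiny synthetic spectrogram with two strong points and one weaker neighbour.
spec = np.zeros((40, 40))
spec[10, 5] = 3.0
spec[12, 7] = 1.0   # inside the neighbourhood of (10, 5), so it is suppressed
spec[30, 25] = 5.0

peaks = find_peaks(spec)
```

The weaker point is dropped because a stronger value lies within its window, which is what thins a dense spectrogram down to a sparse constellation.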

Finding Pairs
Each peak is paired with its closest 15 neighbouring peaks within 200 seconds. For "Kids", there are 8514 fingerprints.
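
One way to read this pairing rule is sketched below. It assumes that "closest" means closest in time and that each peak is paired only with peaks that come after it. Neither detail is stated on the slide, and `pair_peaks` is a hypothetical helper.

```python
def pair_peaks(peaks, fan_out=15, max_dt=200.0):
    """Pair each peak with up to `fan_out` later peaks no more than `max_dt` away.

    `peaks` is a list of (time, frequency) tuples.
    """
    ordered = sorted(peaks)  # sort by time, then frequency
    pairs = []
    for i, (t1, f1) in enumerate(ordered):
        # The next `fan_out` entries are the closest later peaks in time.
        for t2, f2 in ordered[i + 1 : i + 1 + fan_out]:
            if t2 - t1 > max_dt:
                break  # everything further along is even later
            pairs.append(((t1, f1), (t2, f2)))
    return pairs

# Toy constellation: three nearby peaks and one far outside the time limit.
toy_peaks = [(0.0, 100), (1.0, 200), (2.0, 300), (500.0, 400)]
toy_pairs = pair_peaks(toy_peaks, fan_out=2, max_dt=200.0)
```

Limiting both the number of partners and the time span bounds how many pairs each peak can produce, which keeps the fingerprint count manageable.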

Creating Hashes (I)
A hash is created for each pair (not a cryptographic hash). Each hash is composed of
- the frequency of point 1
- the frequency of point 2
- the difference in their times
The hash is combined with the time offset of the first point, which is needed for matching, to create a fingerprint.
fingerprint = hash:time = [f1, f2, t2 - t1]:t1
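
The `[f1, f2, t2 - t1]:t1` layout can be packed into a single integer, as sketched here. The 10-bit field widths are an assumption for this example (513 frequency bins from a 1024-point FFT fit in 10 bits), and `make_fingerprint` is a hypothetical name, not the project's function.

```python
BITS = 10  # assumed field width: values must be below 2**10 = 1024

def make_fingerprint(peak1, peak2):
    """Return (hash, time_offset) for a pair of (time_bin, freq_bin) peaks."""
    (t1, f1), (t2, f2) = peak1, peak2
    dt = t2 - t1
    for value in (f1, f2, dt):
        if not 0 <= value < (1 << BITS):
            raise ValueError("value does not fit in the assumed bit width")
    # Pack f1, f2 and the time difference into one integer.
    packed = (f1 << (2 * BITS)) | (f2 << BITS) | dt
    return packed, t1

# The same pair of peaks at two different positions in a recording.
early = make_fingerprint((10, 100), (25, 300))
late = make_fingerprint((110, 100), (125, 300))
```

Because the hash stores only the time difference, `early` and `late` share the same hash and differ only in their time offsets. That is what lets a short clip match a song regardless of where in the song it was recorded.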

Creating Hashes (II)

Database
Information about the source songs and each fingerprint is stored in a PostgreSQL database.
Song:
- Id
- Artist
- Album
- Title
- Track
- Year
- Duration
Fingerprint:
- Id
- Hash
- Time offset
- Song Id
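
The two tables can be sketched as follows. To keep the example self-contained it uses Python's built-in `sqlite3` with an in-memory database instead of PostgreSQL. The column types, the index, and the sample rows are all assumptions, and only the field names come from the slide.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # stand-in for the PostgreSQL database
conn.executescript("""
CREATE TABLE song (
    id INTEGER PRIMARY KEY,
    artist TEXT, album TEXT, title TEXT,
    track INTEGER, year INTEGER, duration REAL
);
CREATE TABLE fingerprint (
    id INTEGER PRIMARY KEY,
    hash INTEGER NOT NULL,
    time_offset INTEGER NOT NULL,
    song_id INTEGER NOT NULL REFERENCES song(id)
);
-- Assumed index: lookups during identification are by hash.
CREATE INDEX fingerprint_hash_idx ON fingerprint(hash);
""")

conn.execute("INSERT INTO song (id, artist, title) VALUES (1, 'MGMT', 'Kids')")
conn.executemany(
    "INSERT INTO fingerprint (hash, time_offset, song_id) VALUES (?, ?, ?)",
    [(111, 0, 1), (222, 4, 1), (333, 9, 1)],  # made-up hash values
)

# Retrieve stored fingerprints whose hash matches one from an unknown sample.
rows = conn.execute(
    "SELECT song_id, time_offset FROM fingerprint "
    "WHERE hash IN (?, ?) ORDER BY time_offset",
    (222, 333),
).fetchall()
```

Indexing the hash column makes the fingerprint table behave like an inverted index from hash to song, so each lookup returns the candidate songs and the offsets needed for alignment.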

Identification
When an unknown audio sample needs to be identified:
1. Fingerprints are created
2. Matching fingerprints are retrieved from the database
3. Fingerprints are aligned
4. The song associated with the best matched set of fingerprints is returned

Fingerprint Aligning (I)
- We cannot know the time offset the unknown audio was recorded at
- We can find matched fingerprints that occur successively after each other
- The time offsets from the unknown fingerprints are subtracted from the time offsets of the matched fingerprints

Fingerprint Aligning (II)
A diagonal is present where matched fingerprints occur successively after each other.
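
The subtraction step can be sketched as a vote over offset differences. Matches that lie on a diagonal all share the same difference, so they pile up in one bin. The helper `best_match` and the match tuples below are hypothetical, and the project's own scoring may differ.

```python
from collections import Counter

def best_match(matches):
    """Pick the (song, offset difference) with the most agreeing fingerprints.

    `matches` holds (song_id, db_offset, query_offset) for every hash hit.
    """
    votes = Counter(
        (song_id, db_offset - query_offset)
        for song_id, db_offset, query_offset in matches
    )
    (song_id, delta), count = votes.most_common(1)[0]
    return song_id, delta, count

matches = [
    (1, 100, 0), (1, 104, 4), (1, 109, 9),  # consistent difference of 100
    (2, 50, 0), (2, 7, 4), (1, 30, 9),      # scattered coincidental hits
]
song_id, delta, count = best_match(matches)
```

Here song 1 wins with three fingerprints agreeing on a difference of 100, which also estimates where in the song the unknown sample starts.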

A Match!

Source
The source code can be found on GitHub.
github.com/coffee-cup/soundz

Demo
And now... a demo!

Thanks
Thanks for listening!

References
- A review of audio fingerprinting
- Computer Vision for Music Identification
- Waveprint: Efficient wavelet-based audio fingerprinting
- A Review of algorithms for audio fingerprinting
- Survey and evaluation of audio fingerprinting schemes for mobile query-by-example applications
- Landmark-based music recognition systems optimisation using genetic algorithms
- An Industrial Strength Audio Search Algorithm
- Robust audio fingerprinting using peak-pair-based hash of non-repeating foreground audio in a real environment
