The Extraction of Structure from a Musical Piece Kasper.Souren @ - - PowerPoint PPT Presentation



slide-1
SLIDE 1

The Extraction of Structure from a Musical Piece

Kasper.Souren@ircam.fr http://www.ircam.fr/anasyn/souren/

slide-2
SLIDE 2

Musical Structure

• related to human perception
• seen from the listener's standpoint rather than from the composer's standpoint

slide-3
SLIDE 3

Finding structure

• audio, no MIDI or symbolic information
• audio descriptors
• not (yet) limited to one style
• looking for similarity and borders

slide-4
SLIDE 4

Most significant spectrum variations

[Flow diagram: musical piece (ogg, wav, mp3) → raw audio (11025 Hz) → log FFT → power spectrogram → log FFT → spectrum band variation information → EOF → principal components → most significant spectrum band variations]

slide-5
SLIDE 5

Most significant spectrum variations

PC of "log FFT" of frames from every band

[Figure: spectrogram (time vs. frequency), 100 feature vectors per second → most significant spectrum band variations, about 1 feature vector per second]
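The first stage of this pipeline, from raw audio to a log power spectrogram at roughly 100 feature vectors per second, can be sketched as follows. This is only an illustration: the window length of 256 samples, the Hann window, and the function name `log_power_spectrogram` are assumptions, not details taken from the slides.

```python
import numpy as np

def log_power_spectrogram(signal, sample_rate=11025, frames_per_second=100,
                          window_size=256):
    """Return an array of shape (n_frames, window_size // 2 + 1).

    Assumes len(signal) >= window_size. The window size is a hypothetical
    choice; the slides only state the sample rate and the frame rate.
    """
    hop = sample_rate // frames_per_second        # 110 samples at 11025 Hz
    n_frames = 1 + (len(signal) - window_size) // hop
    window = np.hanning(window_size)
    frames = np.stack([signal[i * hop:i * hop + window_size] * window
                       for i in range(n_frames)])
    power = np.abs(np.fft.rfft(frames, axis=1)) ** 2
    return np.log(power + 1e-10)                  # small offset avoids log(0)

# One second of a 1000 Hz test tone at the sample rate named on the slide.
t = np.arange(11025) / 11025
tone = np.sin(2 * np.pi * 1000 * t)
spec = log_power_spectrogram(tone)
peak_bin = int(np.argmax(spec.mean(axis=0)))
```

The second "log FFT" stage (per-band variation over longer frames) is not sketched here, because the slides do not give enough detail to reconstruct it.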

slide-6
SLIDE 6

EOF based on SVD

• Empirical Orthogonal Functions, based on Singular Value Decomposition
• popular in climate research
• type of Principal Component Analysis
• useful for reducing the number of dimensions while explaining a large part of the variance
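A minimal sketch of this kind of dimension reduction with NumPy is shown below. The function name `eof_reduce` and the synthetic test data are illustrative assumptions; the slides do not show the actual implementation.

```python
import numpy as np

def eof_reduce(features, n_components):
    """Project feature vectors onto their leading EOFs (principal components).

    features: array of shape (n_frames, n_dims), one feature vector per row.
    Returns (scores, eofs, explained): the reduced features, the EOFs as
    rows, and the fraction of variance explained by each kept component.
    """
    centered = features - features.mean(axis=0)
    # Thin SVD: centered = u @ diag(s) @ vt; the rows of vt are the EOFs.
    u, s, vt = np.linalg.svd(centered, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    scores = u[:, :n_components] * s[:n_components]
    return scores, vt[:n_components], explained[:n_components]

# Synthetic example: 10 dimensions driven by only 2 underlying factors.
rng = np.random.default_rng(0)
latent = rng.normal(size=(200, 2))
mixing = rng.normal(size=(2, 10))
data = latent @ mixing + 0.01 * rng.normal(size=(200, 10))
scores, eofs, explained = eof_reduce(data, n_components=2)
```

In this synthetic case two components capture almost all of the variance, which is the property the slide refers to.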

slide-7
SLIDE 7

Similarity matrix

• J. Foote, 1999

1) the audio descriptors are points in an N-dimensional space
2) calculate mutual distances: distance matrix
3) rescale: similarity matrix
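The three steps can be sketched as below. The choice of Euclidean distance and of a linear rescaling to the range [0, 1] are assumptions made for illustration; the slides do not say which distance measure or rescaling was used.

```python
import numpy as np

def similarity_matrix(features):
    """features: (n_frames, n_dims). Returns an (n_frames, n_frames) matrix in [0, 1]."""
    # Step 2: mutual Euclidean distances between all pairs of feature vectors.
    diff = features[:, None, :] - features[None, :, :]
    dist = np.sqrt(np.sum(diff ** 2, axis=-1))
    # Step 3: rescale so that identical frames get 1 and the most distant pair gets 0.
    max_dist = dist.max()
    if max_dist == 0:
        return np.ones_like(dist)
    return 1.0 - dist / max_dist

feats = np.array([[0.0, 0.0], [3.0, 4.0], [0.0, 0.0]])
sim = similarity_matrix(feats)
```

Frames 0 and 2 are identical here, so they are fully similar to each other and maximally dissimilar to frame 1.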

slide-8
SLIDE 8

Similarity matrix

[Figure: most significant spectrum variations (over time) and the resulting similarity matrix (time vs. time) for "Chardonnay Says" by Nood/Banana]

slide-9
SLIDE 9

Finding similar parts

step 1: calculate lag matrix

[Figure: similarity matrix (time vs. time) → lag matrix (time vs. time delay)]
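One common way to build a lag matrix is to move each sub-diagonal of the similarity matrix into its own column, so that a repetition at a fixed delay becomes a straight line. The sketch below assumes rows index time and columns index delay; the slides do not specify the layout.

```python
import numpy as np

def lag_matrix(sim):
    """lag[t, d] = sim[t, t - d] for d <= t, and 0 elsewhere."""
    n = sim.shape[0]
    lag = np.zeros_like(sim)
    for d in range(n):
        # The d-th sub-diagonal of sim becomes column d of the lag matrix.
        lag[d:, d] = np.diagonal(sim, offset=-d)
    return lag

sim_demo = np.arange(9.0).reshape(3, 3)
lag_demo = lag_matrix(sim_demo)
```

With this layout, column 0 holds the main diagonal of the similarity matrix.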

slide-10
SLIDE 10

Finding similar parts

step 2: apply 2D FIR filter to blur

[Figure: lag matrix → blurred lag matrix (both time vs. time delay)]
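A 2D FIR filter is a weighted sum over a small neighbourhood of each matrix entry. The sketch below uses a uniform (box) kernel with zero padding; the kernel shape and size are assumptions, since the slides do not state which coefficients were used.

```python
import numpy as np

def fir_filter_2d(matrix, kernel):
    """Correlate matrix with kernel, zero-padded so the output keeps the input shape."""
    kh, kw = kernel.shape
    rows, cols = matrix.shape
    padded = np.pad(matrix, ((kh // 2, kh - 1 - kh // 2),
                             (kw // 2, kw - 1 - kw // 2)))
    out = np.zeros(matrix.shape, dtype=float)
    for i in range(kh):
        for j in range(kw):
            out += kernel[i, j] * padded[i:i + rows, j:j + cols]
    return out

def blur(matrix, time_len=5, delay_len=3):
    """Box blur; time_len and delay_len are hypothetical kernel dimensions."""
    kernel = np.ones((time_len, delay_len)) / (time_len * delay_len)
    return fir_filter_2d(matrix, kernel)

impulse = np.zeros((7, 7))
impulse[3, 3] = 1.0
blurred_demo = blur(impulse, time_len=3, delay_len=3)
```

Blurring a single impulse spreads its value evenly over the 3 x 3 neighbourhood, which shows the kernel directly.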

slide-11
SLIDE 11

Finding similar parts

step 3: find vertical local maxima

[Figure: blurred lag matrix → its local maxima, with values taken from the non-blurred matrix (both time vs. time delay)]
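The sketch below shows one possible reading of this step: positions are selected where the blurred matrix has a strict local maximum across neighbouring delays, and the values at those positions are copied from the non-blurred matrix. Which array axis corresponds to "vertical" depends on how the figure was plotted, so the axis choice here (rows are time, columns are delay) is an assumption.

```python
import numpy as np

def ridge_maxima(blurred, original):
    """Keep original values where blurred has a strict local maximum along axis 1."""
    mid = blurred[:, 1:-1]
    is_max = np.zeros(blurred.shape, dtype=bool)
    is_max[:, 1:-1] = (mid > blurred[:, :-2]) & (mid > blurred[:, 2:])
    return np.where(is_max, original, 0.0)

blurred_in = np.array([[0.0, 1.0, 3.0, 1.0, 0.0],
                       [0.0, 2.0, 1.0, 2.0, 0.0]])
original_in = np.arange(10.0).reshape(2, 5) + 10.0
maxima_demo = ridge_maxima(blurred_in, original_in)
```

Taking positions from the blurred matrix makes the selection robust to noise, while keeping the original values preserves the unsmoothed similarity scores.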

slide-12
SLIDE 12

Finding similar parts

step 4: post-processing

0) forget first column (diagonal of similarity matrix)
1) localize sufficiently long contiguous parts
2) remove overlaps
3) remove diagonal parts

[Figure: local maxima → similar parts]
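Sub-steps 0 and 1 can be sketched as a search for runs of consecutive non-zero entries along time at each delay. The function name `long_runs`, the `min_length` parameter, and the layout (rows are time, columns are delay) are assumptions; overlap removal and removal of diagonal parts are not sketched because the slides give no criteria for them.

```python
import numpy as np

def long_runs(maxima, min_length):
    """Return (delay, start, end) for runs of non-zero entries along time; end is exclusive."""
    runs = []
    for delay in range(1, maxima.shape[1]):       # sub-step 0: skip delay 0
        active = maxima[:, delay] > 0
        # Pad with False so every run has a detectable start and end.
        padded = np.concatenate(([False], active, [False])).astype(int)
        edges = np.diff(padded)
        starts = np.flatnonzero(edges == 1)
        ends = np.flatnonzero(edges == -1)
        for start, end in zip(starts, ends):
            if end - start >= min_length:         # sub-step 1: long enough
                runs.append((delay, int(start), int(end)))
    return runs

maxima_in = np.zeros((6, 3))
maxima_in[:, 0] = 1.0        # delay 0: the diagonal, ignored
maxima_in[1:4, 1] = 0.5      # a run of length 3 at delay 1
maxima_in[5, 1] = 0.5        # an isolated point, too short
maxima_in[0:2, 2] = 0.9      # a run of length 2 at delay 2, too short
runs_demo = long_runs(maxima_in, min_length=3)
```

Each returned run says that the frames from `start` to `end` resemble the frames `delay` steps earlier.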

slide-13
SLIDE 13

Finding borders

step 1: convolution, kernels of different sizes

[Figure: similarity matrix → filtered matrices]

slide-14
SLIDE 14

Finding borders

step 2: diagonals => columns

[Figure: filtered matrices → diagonals of filtered matrices (time vs. kernel size)]
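Steps 1 and 2 can be sketched together, because only the diagonal of each filtered matrix is kept. The kernel below is a checkerboard kernel in the style of Foote's novelty measure; that choice, the zero padding, and the names `checkerboard` and `novelty_by_kernel_size` are assumptions, since the slides do not say which kernels were used.

```python
import numpy as np

def checkerboard(half):
    """(2*half, 2*half) kernel: +1 in within-segment quadrants, -1 in cross quadrants."""
    sign = np.concatenate((-np.ones(half), np.ones(half)))
    return np.outer(sign, sign)

def novelty_by_kernel_size(sim, halves):
    """Return an array of shape (n_frames, len(halves)), one column per kernel size."""
    n = sim.shape[0]
    out = np.zeros((n, len(halves)))
    for col, half in enumerate(halves):
        kernel = checkerboard(half)
        padded = np.pad(sim, half)                # zero padding at the edges
        for t in range(n):
            # Window centred on the diagonal point (t, t) of the similarity matrix.
            window = padded[t:t + 2 * half, t:t + 2 * half]
            out[t, col] = np.sum(window * kernel)
    return out

# Two homogeneous segments of 4 frames each: the border lies at frame 4.
block_sim = np.zeros((8, 8))
block_sim[:4, :4] = 1.0
block_sim[4:, 4:] = 1.0
curves_demo = novelty_by_kernel_size(block_sim, halves=[1, 2])
```

Evaluating the kernel only at diagonal points gives the same values as filtering the whole matrix and then reading off its diagonal, at lower cost.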

slide-15
SLIDE 15

Finding borders

step 3: find local maxima in columns

[Figure: diagonals of filtered matrices → local maxima (over time)]
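A minimal sketch of this step, assuming each column holds the diagonal for one kernel size and rows index time, so that maxima are searched along time within each column. The strict comparison with both neighbours is an illustrative choice.

```python
import numpy as np

def column_maxima(curves):
    """Keep values that are strict local maxima along time (axis 0); zero elsewhere."""
    peaks = np.zeros(curves.shape, dtype=float)
    mid = curves[1:-1]
    is_max = (mid > curves[:-2]) & (mid > curves[2:])
    # peaks[1:-1] is a view, so this assignment writes into peaks itself.
    peaks[1:-1][is_max] = mid[is_max]
    return peaks

curves_in = np.array([[0.0, 0.0],
                      [2.0, 1.0],
                      [1.0, 3.0],
                      [4.0, 1.0],
                      [0.0, 0.0]])
peaks_demo = column_maxima(curves_in)
```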

slide-16
SLIDE 16

Finding borders

step 4: post-processing

1) localize contiguous parts
2) sum their values
3) throw away positions with too low values
4) refine the positions using the spectrogram
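Sub-steps 1 to 3 might look like the sketch below. It assumes that peak values are first combined across kernel sizes for each time position, and that the reported position is the strength-weighted centre of each contiguous part; neither detail is given on the slides. The threshold `min_score` is a hypothetical parameter, and the refinement against the spectrogram (sub-step 4) is not sketched.

```python
import numpy as np

def border_candidates(peaks, min_score):
    """peaks: (n_frames, n_kernel_sizes) local-maxima values.

    Returns a list of (position, score) tuples.
    """
    strength = peaks.sum(axis=1)                  # combine kernel sizes per frame
    active = strength > 0
    padded = np.concatenate(([False], active, [False])).astype(int)
    edges = np.diff(padded)
    starts = np.flatnonzero(edges == 1)
    ends = np.flatnonzero(edges == -1)
    borders = []
    for start, end in zip(starts, ends):          # sub-step 1: contiguous parts
        score = float(strength[start:end].sum())  # sub-step 2: sum their values
        if score >= min_score:                    # sub-step 3: drop weak candidates
            times = np.arange(start, end)
            position = float(np.sum(times * strength[start:end]) / score)
            borders.append((position, score))
    return borders

peaks_in = np.zeros((10, 2))
peaks_in[3, 0] = 2.0
peaks_in[4, 1] = 2.0
peaks_in[8, 0] = 0.5
borders_demo = border_candidates(peaks_in, min_score=1.0)
```

The two adjacent peaks at frames 3 and 4 merge into a single candidate, while the weak isolated peak at frame 8 is discarded.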

slide-17
SLIDE 17

Structural Information Theory

• formal calculus for Gestalt laws
• focus on visual patterns
• experimented with Genetic Programming
• problem: needs a much higher-level description (musical objects), and thus source separation, classification, ...

slide-18
SLIDE 18

Framework for Audio Analysis

• functionality interesting for audio and music research
• integrating research could be fruitful
  • finding musical structure
  • audio signal separation
  • sound classification
  • ...

slide-19
SLIDE 19

Python

• scripting language, interpreted
• object-oriented
• flexible, extensible, easy to embed
• modular
• free software (BSD style license)

slide-20
SLIDE 20

FfAA modes

• Scientific analysis environment
  • stand-alone application: QtFfAA
  • GUI + command line, object viewer, visualisation
• Embeddable in free audio software
  • for audio editors and recorders
  • for music players, DJ tools

slide-21
SLIDE 21

FfAA right now

• versatile interface
  • MDI GUI (PyQt)
  • command line (IPython)
• load and analyse sound files
• database
• visualisation
• easily extensible

slide-22
SLIDE 22

The Extraction of Structure from a Musical Piece

Kasper.Souren@ircam.fr http://www.ircam.fr/anasyn/souren/