SLIDE 1

Introduction Non-negative Matrix Factorization Parametric spectrogram model Score informed source separation Conclusion

Score informed audio source separation using a parametric model of non-negative spectrogram

Romain Hennequin, Roland Badeau and Bertrand David

Telecom ParisTech <forename>.<surname>@telecom-paristech.fr

May 24, 2011

Romain Hennequin, Roland Badeau and Bertrand David Score informed source separation - slide 1/27

SLIDE 2

Introduction

Monaural source separation in a musical signal: separation of the signal of each instrument.

SLIDE 3

Introduction

The information in the score of the piece is used to guide the separation. Here the score is a MIDI file aligned with the signal (this paper does not deal with alignment). Only harmonic instruments are modeled.

SLIDE 4

Introduction

Overview
- A parametric spectrogram model derived from non-negative matrix factorization (NMF) is used to decompose the mixture spectrogram.
- A parametric time/frequency mask is computed for each instrument.
- Masks are initialized (and constrained) from the score and then finely estimated to fit the mixture spectrogram.
- Masks are used to separate the instruments (Wiener filtering).

SLIDE 5

Introduction

Why use the score?
- MIDI files are widely available.
- The score is a very compact description of the audio.
- Underdetermined blind separation remains a very difficult problem.
- Sometimes blind separation is hopeless (e.g. separation of several voices played by the same instrument).

SLIDE 6

Outline

1. Non-negative Matrix Factorization
   - Principle
   - Features
2. Parametric spectrogram model
   - Source parametric spectrogram
   - Example
   - Mixture model
3. Score informed source separation
   - Separation process
   - Results

SLIDE 7

Outline

1. Non-negative Matrix Factorization
   - Principle
   - Features
2. Parametric spectrogram model
3. Score informed source separation

SLIDE 8

Principle of NMF

Low-rank approximation:

$$V_{ft} \approx \hat{V}_{ft} = \sum_{r=1}^{R} W_{fr} H_{rt} \quad \text{with } W \geq 0,\ H \geq 0,\ R \ll \min(F, T)$$
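As an illustration of the low-rank approximation V ≈ WH with W ≥ 0 and H ≥ 0, here is a minimal sketch of NMF using the classic multiplicative updates for the Euclidean cost. The slides do not state which cost or update rule is used at this point, so the update rule, the random initialization and the helper names (`nmf`, `matmul`, `transpose`, `frobenius_error`) are assumptions made for the example. Matrices are plain lists of rows so that the sketch has no dependencies.

```python
import random

def matmul(A, B):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def transpose(A):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*A)]

def nmf(V, R, n_iter=200, eps=1e-12, seed=0):
    """Approximate a non-negative F x T matrix V by W (F x R) times H (R x T)."""
    rng = random.Random(seed)
    F, T = len(V), len(V[0])
    # Strictly positive random start (an assumption; any positive init works).
    W = [[rng.random() + 0.1 for _ in range(R)] for _ in range(F)]
    H = [[rng.random() + 0.1 for _ in range(T)] for _ in range(R)]
    for _ in range(n_iter):
        # H <- H * (W^T V) / (W^T W H), element-wise
        Wt = transpose(W)
        num, den = matmul(Wt, V), matmul(matmul(Wt, W), H)
        H = [[H[r][t] * num[r][t] / (den[r][t] + eps) for t in range(T)]
             for r in range(R)]
        # W <- W * (V H^T) / (W H H^T), element-wise
        Ht = transpose(H)
        num, den = matmul(V, Ht), matmul(W, matmul(H, Ht))
        W = [[W[f][r] * num[f][r] / (den[f][r] + eps) for r in range(R)]
             for f in range(F)]
    return W, H

def frobenius_error(V, V_hat):
    """Sum of squared element-wise differences between V and V_hat."""
    return sum((v - w) ** 2 for rv, rw in zip(V, V_hat) for v, w in zip(rv, rw))
```

Because every update multiplies an entry by a non-negative ratio, W and H stay non-negative throughout, which is the property the slides rely on.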

SLIDE 9

Principle of NMF

Features
- Extracts redundant patterns from the data.
- Fundamental property: non-negativity constraint.
  - Atoms lie in the same space as the data.
  - Only positive combinations (no black energy).
  - Perceptually relevant description: decomposition of musical spectrograms on a basis of notes.
- Applications in automatic transcription, source separation, audio inpainting...

SLIDE 10

Principle of NMF

Limitations
- Standard NMF cannot handle time-frequency variations (e.g. vibrato).
- A representation linked with the parameters of interest (e.g. fundamental frequency) is needed.

SLIDE 11

Outline

1. Non-negative Matrix Factorization
2. Parametric spectrogram model
   - Source parametric spectrogram
   - Example
   - Mixture model
3. Score informed source separation

SLIDE 12

Parametric spectrogram of a single instrument [Hennequin et al., DAFx 2010]

What does an atom look like in a musical spectrogram?
- In a musical spectrogram, most of the (non-percussive) elements are instrument notes, which are generally harmonic tones.
- The parameters of interest are generally the fundamental frequency of these tones and the shape of the amplitudes of the harmonics.
- Proposed method: a parametric spectrogram model with harmonic atoms.

SLIDE 13

Parametric spectrogram of a single instrument

Time-varying atoms in NMF:

$$\hat{V}_{ft} = \sum_{r=1}^{R} W_{fr} H_{rt} \quad \rightarrow \quad \hat{V}_{ft} = \sum_{r=1}^{R} W_{fr}^{f_0^{rt}} H_{rt}$$

$f_0^{rt}$ is the time-varying fundamental frequency associated with each atom.

SLIDE 14

Parametric atoms

Parametric harmonic atom construction:

$$W_{fr}^{f_0^{rt}} = \sum_{k=1}^{n_h(f_0^{rt})} a_k \, g(f - k f_0^{rt})$$
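The atom construction, a sum of lobes g centred on the multiples k·f0 and weighted by the harmonic amplitudes a_k, can be sketched as follows. The slides do not define g or the rule behind n_h(f0), so two assumptions are made here and flagged in the comments: g is a Gaussian lobe of hypothetical width `lobe_width`, and n_h(f0) keeps the harmonics that fall below a Nyquist limit. The function name `harmonic_atom` is likewise illustrative.

```python
import math

def harmonic_atom(f0, amplitudes, freqs, lobe_width=20.0, nyquist=None):
    """Return W[f] = sum_{k=1..n_h} a_k * g(f - k * f0), sampled on `freqs` (Hz)."""
    if nyquist is None:
        nyquist = max(freqs)
    # Assumed rule for n_h(f0): keep only harmonics at or below the Nyquist limit.
    n_h = min(len(amplitudes), int(nyquist // f0))

    def g(df):
        # Hypothetical lobe shape (Gaussian); the slides leave g unspecified.
        return math.exp(-0.5 * (df / lobe_width) ** 2)

    return [sum(amplitudes[k - 1] * g(f - k * f0) for k in range(1, n_h + 1))
            for f in freqs]

# Usage: a 440 Hz atom with three decaying harmonics on a 10 Hz frequency grid.
freqs = [10.0 * i for i in range(201)]
atom = harmonic_atom(440.0, [1.0, 0.5, 0.25], freqs)
```

Changing `f0` from frame to frame moves every lobe together, which is what lets such an atom follow vibrato where a fixed NMF template cannot.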

SLIDE 15

Algorithm

Parametric spectrogram (single instrument):

$$\hat{V}_{ft} = \sum_{r=1}^{R} \underbrace{\sum_{k=1}^{n_h} a_k \, g(f - k f_0^{rt})}_{W_{fr}^{f_0^{rt}}} h_{rt}$$

Minimization
- Global optimization w.r.t. $f_0^{rt}$ is impossible (numerous local minima in the cost function).
- ⇒ One atom is introduced for each MIDI note. Optimization thus becomes local (fine estimate of $f_0^{rt}$).
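To make the "one atom per MIDI note" idea concrete, the sketch below gives each atom its nominal equal-tempered frequency as a starting point and restricts the fine estimate of f0 to a band around it. The MIDI-to-frequency formula (A4 = MIDI note 69 = 440 Hz) is the standard one. The band of ±50 cents, the piano range 21 to 108 and the function names are assumptions for illustration, not values taken from the slides.

```python
def midi_to_hz(note, a4=440.0):
    """Nominal equal-tempered frequency of a MIDI note number (69 -> A4)."""
    return a4 * 2.0 ** ((note - 69) / 12.0)

def initial_f0_grid(lowest=21, highest=108):
    """One starting fundamental frequency per MIDI note (one atom per note)."""
    return {note: midi_to_hz(note) for note in range(lowest, highest + 1)}

def local_search_band(note, cents=50.0):
    """Hypothetical bounds (Hz) for the fine estimate of f0 around a note."""
    center = midi_to_hz(note)
    ratio = 2.0 ** (cents / 1200.0)
    return center / ratio, center * ratio

# Usage: the atom for A4 starts at 440 Hz and may move within about +/- 50 cents.
low, high = local_search_band(69)
```

With the band kept narrower than a semitone, neighbouring atoms cannot swap roles, so each local optimum stays attached to its own note.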

SLIDE 16

Decomposition of a synthetic spectrogram

[Figure: original power spectrogram; axes: time (frame) vs. frequency (kHz).]

Spectrogram of the first bars of Bach’s first prelude played by a synthesizer.

SLIDE 17

Obtained decomposition

[Figure: obtained activations; axes: frames vs. semitones.]

Activations $h_{rt}$ for each MIDI note.

SLIDE 18

Mixture spectrogram model

Mixture model

The mixture is made up of K sources indexed by k. Source k is modeled with the spectrogram $\hat{V}^k$:

$$\hat{V}^k_{ft} = \sum_{r=1}^{R} \underbrace{\sum_{p=1}^{n_h} a_{kp} \, g(f - p f_0^{krt})}_{W_{kfr}^{f_0^{krt}}} h_{krt}$$

The mixture spectrogram is then:

$$\hat{V}^{\mathrm{mix}} = \sum_{k=1}^{K} \hat{V}^k$$
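The mixture model can be sketched in a few lines: each source spectrogram is a sum over atoms of a harmonic template evaluated at its own time-varying fundamental frequency, scaled by its activation, and the mixture is the sum of the source spectrograms. As in any illustration of this model, some choices are assumptions rather than the authors' settings: g is taken to be a Gaussian lobe, the number of harmonics is fixed by the length of the amplitude list `a`, and the names `source_spectrogram` and `mixture_spectrogram` are hypothetical.

```python
import math

def source_spectrogram(freqs, f0, h, a, lobe_width=5.0):
    """Model of one source: V[f][t] = sum_r (sum_p a_p g(f - p f0[r][t])) h[r][t].

    freqs: frequency grid (Hz); f0, h: R x T lists; a: harmonic amplitudes.
    """
    def g(df):
        # Hypothetical Gaussian lobe standing in for the unspecified g.
        return math.exp(-0.5 * (df / lobe_width) ** 2)

    R, T = len(h), len(h[0])
    V = [[0.0] * T for _ in freqs]
    for r in range(R):
        for t in range(T):
            if h[r][t] == 0.0:
                continue  # inactive atom: contributes nothing to this frame
            for i, f in enumerate(freqs):
                atom = sum(a_p * g(f - (p + 1) * f0[r][t]) for p, a_p in enumerate(a))
                V[i][t] += atom * h[r][t]
    return V

def mixture_spectrogram(sources):
    """Element-wise sum of K source spectrograms (each F x T)."""
    return [[sum(vals) for vals in zip(*rows)] for rows in zip(*sources)]

# Usage: two single-atom sources on a coarse four-bin frequency grid.
freqs = [100.0, 200.0, 300.0, 400.0]
src_a = source_spectrogram(freqs, [[100.0, 100.0]], [[1.0, 0.0]], [1.0, 0.5])
src_b = source_spectrogram(freqs, [[200.0, 200.0]], [[0.0, 2.0]], [1.0])
mix = mixture_spectrogram([src_a, src_b])
```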

SLIDE 19

Mixture spectrogram model

Mixture model

Parameters to be estimated for each source k:
- Fundamental frequency of each atom r at each time t: $f_0^{krt}$
- Amplitudes of harmonics: $a_{kp}$
- Activations of each note r at each time t: $h_{krt}$

The decomposition is obtained with a multiplicative algorithm that minimizes a β-divergence between $V^{\mathrm{mix}}$ and $\hat{V}^{\mathrm{mix}}$.
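For reference, the β-divergence used as the cost can be written out explicitly. The sketch below uses the standard element-wise definition, which reduces to the Itakura-Saito divergence for β = 0, the generalized Kullback-Leibler divergence for β = 1 and half the squared Euclidean distance for β = 2. The slides do not say which value of β is used, and the function names are illustrative.

```python
import math

def beta_divergence(x, y, beta):
    """Element-wise beta-divergence d_beta(x | y) for x, y > 0."""
    if beta == 0:
        # Itakura-Saito divergence
        return x / y - math.log(x / y) - 1.0
    if beta == 1:
        # Generalized Kullback-Leibler divergence
        return x * math.log(x / y) - x + y
    # General case; beta = 2 gives 0.5 * (x - y) ** 2
    return (x ** beta + (beta - 1) * y ** beta - beta * x * y ** (beta - 1)) / (
        beta * (beta - 1)
    )

def spectrogram_cost(V, V_hat, beta):
    """Total divergence between an observed and a modeled spectrogram."""
    return sum(
        beta_divergence(v, w, beta)
        for row_v, row_w in zip(V, V_hat)
        for v, w in zip(row_v, row_w)
    )
```

The cost is zero only when the model matches the observation in every time-frequency bin, so it can be monitored across iterations to check that the multiplicative updates are making progress.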

SLIDE 20

Outline

1. Non-negative Matrix Factorization
2. Parametric spectrogram model
3. Score informed source separation
   - Separation process
   - Results

SLIDE 21

Score informed source separation

SLIDE 22

Score informed source separation

- MIDI file: note positions and durations for each instrument.
- A piano-roll is built for each instrument.
- Piano-rolls are used to initialize and constrain the activations $h_{krt}$.
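A piano-roll for one instrument can be sketched as a binary note-by-frame matrix built from the note list of the aligned MIDI file. In the sketch below, the note format `(pitch, onset_seconds, offset_seconds)`, the optional `margin_frames` tolerance and both function names are assumptions for illustration. Using the roll as an element-wise mask is one plausible way to "initialize and constrain" activations, since multiplicative updates keep zero entries at zero; the slides do not detail the authors' exact procedure.

```python
import math

def piano_roll(notes, n_frames, hop_seconds, lowest=21, highest=108, margin_frames=0):
    """Binary (pitch x frame) matrix: 1.0 where the score says a note sounds.

    notes: iterable of (pitch, onset_seconds, offset_seconds) tuples.
    """
    roll = [[0.0] * n_frames for _ in range(highest - lowest + 1)]
    for pitch, onset, offset in notes:
        if not lowest <= pitch <= highest:
            continue  # ignore notes outside the modeled pitch range
        start = max(0, int(onset / hop_seconds) - margin_frames)
        stop = min(n_frames, int(math.ceil(offset / hop_seconds)) + margin_frames)
        for t in range(start, stop):
            roll[pitch - lowest][t] = 1.0
    return roll

def apply_score_constraint(h, roll):
    """Zero the activations wherever the score has no note (element-wise mask)."""
    return [[a * m for a, m in zip(h_row, m_row)] for h_row, m_row in zip(h, roll)]

# Usage: two notes (C4 and E4) on a grid of 8 frames, 0.5 s per frame.
roll = piano_roll([(60, 0.0, 2.0), (64, 1.0, 3.0)], n_frames=8, hop_seconds=0.5)
h0 = apply_score_constraint([[0.5] * 8 for _ in range(88)], roll)
```

A small positive `margin_frames` would leave room for slight misalignment between score and audio, at the price of a looser constraint.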

SLIDE 23

Score informed source separation

- The $\hat{V}^k$ are finely estimated from the actual mixture spectrogram.
- The $\hat{V}^k$ are used as time-frequency masks to separate the tracks (Wiener filtering).
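The masking step can be sketched as follows: each source model is divided by the sum of all source models to give a soft mask between 0 and 1, and the mask is applied to the complex short-time Fourier transform of the mixture. This is the usual Wiener-style gain when the models are power spectrograms. The function names, the `eps` guard against empty bins and the list-of-rows layout are assumptions for the example, and the inverse transform back to audio is left out.

```python
def wiener_masks(source_models, eps=1e-12):
    """Soft masks: mask_k[f][t] = V_k[f][t] / sum_j V_j[f][t]."""
    F, T = len(source_models[0]), len(source_models[0][0])
    # eps avoids a division by zero where every model is silent.
    total = [[sum(m[f][t] for m in source_models) + eps for t in range(T)]
             for f in range(F)]
    return [[[m[f][t] / total[f][t] for t in range(T)] for f in range(F)]
            for m in source_models]

def apply_mask(mask, stft):
    """Multiply a real-valued mask with the complex mixture STFT, bin by bin."""
    return [[g * x for g, x in zip(mask_row, stft_row)]
            for mask_row, stft_row in zip(mask, stft)]

# Usage: two sources, one frequency bin, two frames.
models = [[[3.0, 0.0]], [[1.0, 2.0]]]
mixture_stft = [[4 + 0j, 2j]]
estimates = [apply_mask(m, mixture_stft) for m in wiener_masks(models)]
```

Since the masks sum to one in every bin where some model is active, the separated tracks add back up to the mixture.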

SLIDE 24

Sound example

SLIDE 25

Experiment

- Comparison with a PLCA-based algorithm [Ganseman et al., ICMC 2010].
  - Needs to synthesize MIDI tracks.
- Two datasets, M1 and M2: same MIDI, different soundbanks.

SIR: Source to Interference Ratio; SAR: Source to Artifacts Ratio; SDR: Source to Distortion Ratio.

SLIDE 26

Conclusion

Synthesis
- Efficient method of score informed source separation.
- Parametric model: allows fine handling of the sound.

Perspectives
- Model of percussive instruments.
- Including other timbral parameters in the spectrogram model.
- Make the model more robust.
- Supervised learning of harmonic templates.

SLIDE 27

Conclusion

Questions?

SLIDE 28

Ganseman, J., Scheunders, P., Mysore, G. J., and Abel, J. S. (2010). Source separation by score synthesis. In International Computer Music Conference, New York, NY, USA.

Hennequin, R., Badeau, R., and David, B. (2010). Time-dependent parametric and harmonic templates in non-negative matrix factorization. In International Conference On Digital Audio Effects, pages 246–253, Graz, Austria.