An Efficient Posterior Regularized Latent Variable Model for Interactive Sound Source Separation
Nicholas J. Bryan, Stanford University
Gautham J. Mysore, Adobe Research
ICML 2013
Sound Check
§ Real world sounds are mixtures of many individual sounds
§ Non-negative matrix factorization (NMF)
[Figure labels: P(f|z), P(z), P(t|z)]
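The factors P(f|z), P(z), and P(t|z) labeled on these slides can be estimated with expectation-maximization. The following is a minimal sketch, not the authors' implementation: it assumes the input is a non-negative magnitude spectrogram `V` of shape (frequencies, frames), and the function name `plca` and its parameters are hypothetical.

```python
import numpy as np

def plca(V, n_components, n_iter=100, seed=0):
    """Fit V / V.sum() ~= sum_z P(z) P(f|z) P(t|z) by EM (illustrative sketch)."""
    rng = np.random.default_rng(seed)
    F, T = V.shape
    eps = 1e-12
    # Random non-negative initialization, normalized into valid distributions.
    Pf_z = rng.random((F, n_components))
    Pf_z /= Pf_z.sum(axis=0, keepdims=True)
    Pt_z = rng.random((T, n_components))
    Pt_z /= Pt_z.sum(axis=0, keepdims=True)
    Pz = np.full(n_components, 1.0 / n_components)
    for _ in range(n_iter):
        # E step: posterior P(z|f,t), shape (F, T, Z).
        joint = Pz[None, None, :] * Pf_z[:, None, :] * Pt_z[None, :, :]
        post = joint / (joint.sum(axis=2, keepdims=True) + eps)
        # M step: weight the posterior by the observed magnitudes, then renormalize.
        W = V[:, :, None] * post
        Pf_z = W.sum(axis=1)      # (F, Z)
        Pt_z = W.sum(axis=0)      # (T, Z)
        Pz = W.sum(axis=(0, 1))   # (Z,)
        Pf_z /= Pf_z.sum(axis=0, keepdims=True) + eps
        Pt_z /= Pt_z.sum(axis=0, keepdims=True) + eps
        Pz /= Pz.sum() + eps
    return Pf_z, Pt_z, Pz
```

Each column of `Pf_z` acts as a spectral basis vector, each column of `Pt_z` as its activation over time, and `Pz` as the relative weight of each component.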
§ Requires isolated training data (supervised/semi-supervised)
§ One-shot process, cannot correct for poor results
§ Very difficult, underdetermined problem
§ Eliminate the need for explicit training data
§ Method of user feedback to guide separation
§ Algorithm to incorporate the user feedback
[Figure labels: looping playback; p(f|z), p(t|z), p(z)]
§ Incorporate painting annotations into the model
§ Constraints typically encoded as:
  § Prior probabilities on model parameters
  § Direct observations
§ Complementary method that allows time-frequency constraints
§ Iterative optimization procedure for each E step
§ Well suited for our problem
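One way to picture a time-frequency constraint on the E step is sketched below. It assumes the painting annotations act as linear penalties on the posterior, in which case the constrained posterior reduces to an exponential reweighting of the usual one. That penalty form, the function name `regularized_posterior`, and the array layouts of `source_of_z` and `penalties` are assumptions made for illustration and are not taken from the slides.

```python
import numpy as np

def regularized_posterior(Pf_z, Pt_z, Pz, source_of_z, penalties):
    """Posterior over z with per-source time-frequency penalties (illustrative sketch).

    Pf_z:        (F, Z) array, columns are P(f|z).
    Pt_z:        (T, Z) array, columns are P(t|z).
    Pz:          (Z,) array, P(z).
    source_of_z: (Z,) int array mapping each component to a source (hypothetical layout).
    penalties:   (S, F, T) non-negative array, one painted mask per source (hypothetical layout).
    """
    # Unconstrained joint P(z) P(f|z) P(t|z), shape (F, T, Z).
    joint = Pz[None, None, :] * Pf_z[:, None, :] * Pt_z[None, :, :]
    # Each component inherits the penalty mask of its source: (Z, F, T) -> (F, T, Z).
    lam = np.moveaxis(penalties[source_of_z], 0, -1)
    # Assumed linear penalty: down-weight penalized components, then renormalize over z.
    weighted = joint * np.exp(-lam)
    return weighted / (weighted.sum(axis=2, keepdims=True) + 1e-12)
```

With all penalties set to zero this returns the ordinary posterior, so painting over a region only changes the bins the user actually marked.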
[Equation residue; surviving symbols: Θ, Q ∈ Q, q]
§ Perceptual domain, objective evaluation is difficult
§ Human evaluation within the learning process
§ Sound source separation algorithm
§ Time-frequency constraints via posterior regularization
§ No explicit training data
§ Efficient, interactive algorithm with closed-form update equations
§ Improved separation quality over prior work
§ Open source software
§ Poster ID: 348
§ Demos at ccrma.stanford.edu/~njb/research/iss