Audio Declipping Using Sparse Multiscale Representations


SLIDE 1

Audio Declipping Using Sparse Multiscale Representations

Boris Mailhé

Queen Mary University of London Center for Digital Music Boris.Mailhe@eecs.qmul.ac.uk

November 4, 2012

◮ SMALL project: Adler, Emiya, Jafari et al.
◮ Later works: Jafari, Clifford et al.

SLIDE 2

◮ Audio clipping
◮ Declipping with sparse representations
◮ Multiscale undersampling for heavily clipped signals
◮ Experimental results
◮ Conclusion

SLIDE 3

Audio clipping

◮ Audio recording devices (microphones, amplifiers, ADCs, ...) have a maximum input level.
◮ If the input signal in any component exceeds this level, the output is clipped.
◮ Typical situations:
  ◮ the sound is louder than expected,
  ◮ the source is closer to the microphone than expected,
  ◮ the recording chain was not properly set up,
  ◮ .wav encoding, ...
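
Hard clipping as described above can be simulated in a few lines. This is a minimal sketch: the test tone, the sampling rate and the clipping level `theta` are illustrative assumptions, not values taken from the slides.

```python
import numpy as np

def hard_clip(s, theta):
    """Saturate every sample of s to the range [-theta, theta]."""
    return np.clip(s, -theta, theta)

# Hypothetical test signal: a 440 Hz sine sampled at 8 kHz with peak amplitude 1.
fs = 8000
t = np.arange(fs) / fs
s = np.sin(2 * np.pi * 440 * t)

theta = 0.6                      # assumed clipping level
y = hard_clip(s, theta)

clipped = np.abs(y) >= theta     # samples stuck at +theta or -theta
print(f"{clipped.mean():.1%} of the samples are clipped")
```

Samples outside the mask `clipped` are left untouched, which is what lets the later slides treat them as reliable observations.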

SLIDE 4

Effect in time domain

[Figure: two time-domain plots, amplitude versus samples]

SLIDE 5

Audio inpainting [Adler et al.]

◮ Decompose the 8 kHz signal into 50% overlapping frames of length 512, restore each frame, then reconstruct.
◮ A clean audio frame $s$ of length $N$ is sparse on a Gabor dictionary $D$:
  $s \approx Dx, \quad \|x\|_0 \ll N$
◮ The clipped samples are considered missing. The observed signal $y$ is split into a reliable part and a clipped part:
  $y_r = M_r y = M_r s \approx M_r D x$
  $y_c = M_c y = \operatorname{sign}(M_c s)\,\theta$
◮ Find a sparse representation $\tilde{x}$ of $y_r = s_r$ over $\Phi = M_r D$, then reconstruct the clipped part:
  $\tilde{x} = \arg\min_x \|x\|_0 \quad \text{s.t.} \quad \|y_r - M_r D x\|_2 \le \epsilon$
  $\tilde{s}_c = M_c D \tilde{x}$
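
The framing step and the split into reliable and clipped samples can be sketched as follows. The boolean masks play the role of $M_r$ and $M_c$. The function names, the Hann-weighted overlap-add and the test signal are assumptions made for illustration; the slides do not specify the synthesis window.

```python
import numpy as np

def split_frames(y, frame_len=512, hop=256):
    """Cut y into frames of frame_len samples; hop = frame_len / 2 gives 50% overlap."""
    n_frames = 1 + (len(y) - frame_len) // hop
    return np.stack([y[i * hop : i * hop + frame_len] for i in range(n_frames)])

def overlap_add(frames, hop=256):
    """Rebuild the signal as a window-weighted average of the overlapping frames."""
    n_frames, frame_len = frames.shape
    window = np.hanning(frame_len + 2)[1:-1]      # strictly positive Hann taper
    out = np.zeros(hop * (n_frames - 1) + frame_len)
    weight = np.zeros_like(out)
    for i, frame in enumerate(frames):
        out[i * hop : i * hop + frame_len] += window * frame
        weight[i * hop : i * hop + frame_len] += window
    return out / weight

def split_reliable_clipped(frame, theta):
    """Return boolean masks playing the role of M_r (reliable) and M_c (clipped)."""
    clipped = np.abs(frame) >= theta
    return ~clipped, clipped

# Hypothetical input: a 200 Hz sine at 8 kHz, clipped at an assumed level theta.
fs, theta = 8000, 0.5
s = 0.9 * np.sin(2 * np.pi * 200 * np.arange(2048) / fs)
y = np.clip(s, -theta, theta)

frames = split_frames(y)
reliable, clipped = split_reliable_clipped(frames[0], theta)
y_r = frames[0][reliable]                 # y_r = M_r y
y_c = frames[0][clipped]                  # y_c = M_c y = sign(M_c s) * theta
rebuilt = overlap_add(frames)             # unmodified frames give back y
```

In a full pipeline each row of `frames` would be restored before calling `overlap_add`; here the frames are passed through unchanged to show that the framing itself is lossless.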

SLIDE 6

Orthogonal Matching Pursuit (OMP)

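◮ Select the atoms one by one.
◮ Initially, $\Phi_0 = \emptyset$, $r_0 = y_r$.
◮ Atom selection:
  $\Phi_{i+1} = \Phi_i \cup \{\arg\max_{\varphi \in \Phi} |\langle \varphi, r_i \rangle|\}$
◮ Coefficient and residual update:
  $x_{i+1} = \arg\min_x \|y_r - \Phi_{i+1} x\|_2^2$
  $r_{i+1} = y_r - \Phi_{i+1} x_{i+1}$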
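
The OMP iteration above can be sketched with NumPy as follows. Two points are assumptions rather than content from the slides: correlations are divided by the atom norms (the masked atoms of $M_r D$ no longer have unit norm), and an overcomplete DCT dictionary stands in for the Gabor dictionary in the demo.

```python
import numpy as np

def omp(Phi, y_r, n_atoms, tol=1e-6):
    """Greedy sparse approximation of y_r over the columns of Phi."""
    norms = np.maximum(np.linalg.norm(Phi, axis=0), 1e-12)
    support = []
    coeffs = np.zeros(0)
    residual = y_r.copy()
    for _ in range(n_atoms):
        if np.linalg.norm(residual) <= tol:
            break
        # Atom selection: largest normalised correlation with the residual.
        corr = np.abs(Phi.T @ residual) / norms
        if support:
            corr[support] = 0.0           # never pick the same atom twice
        support.append(int(np.argmax(corr)))
        # Coefficient update: least squares restricted to the selected atoms.
        coeffs, *_ = np.linalg.lstsq(Phi[:, support], y_r, rcond=None)
        # Residual update.
        residual = y_r - Phi[:, support] @ coeffs
    x = np.zeros(Phi.shape[1])
    if support:
        x[support] = coeffs
    return x, support

# Hypothetical demo: sizes, sparsity and clipping level are assumptions.
N, K = 64, 128
D = np.cos(np.pi * (np.arange(N)[:, None] + 0.5) * np.arange(K)[None, :] / K)
D /= np.linalg.norm(D, axis=0)

x_true = np.zeros(K)
x_true[[5, 20, 33]] = [1.0, -0.7, 0.5]
s = D @ x_true
theta = 0.6 * np.max(np.abs(s))
reliable = np.abs(s) < theta              # rows kept by M_r

x_hat, support = omp(D[reliable], s[reliable], n_atoms=10)
s_hat = D @ x_hat                         # full-frame estimate, clipped part included
```

Exact recovery of `x_true` is not guaranteed on a coherent dictionary; the sketch only shows the selection and update steps.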

SLIDE 7

Additional declipping constraints

◮ Not all information is lost on the clipped samples:
  ◮ the signs are known,
  ◮ the magnitudes of the clipped samples are at least the clipping level $\theta$.
◮ Modified OMP solver:
  ◮ select the most correlated atom, as in OMP:
    $\Phi_{i+1} = \Phi_i \cup \{\arg\max_{\varphi \in \Phi} |\langle \varphi, r_i \rangle|\}$
  ◮ add the constraints when updating the coefficients:
    $x_{i+1} = \arg\min_x \|y_r - \Phi_{i+1} x\|_2^2$
    s.t. $y[n] = \theta \Rightarrow (D_{i+1} x)[n] \ge \theta$ and $y[n] = -\theta \Rightarrow (D_{i+1} x)[n] \le -\theta$
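
The constrained coefficient update is a least-squares problem with linear inequality constraints. The slides do not say which solver is used; the sketch below swaps in SciPy's general-purpose SLSQP routine, and the function name `constrained_update` and the toy numbers are hypothetical.

```python
import numpy as np
from scipy.optimize import minimize

def constrained_update(D_sel, y, theta):
    """Least squares on the reliable samples, with clipping constraints on the others.

    D_sel holds the selected atoms (unmasked, one per column); y is the clipped frame.
    """
    pos = y >= theta                      # samples clipped at +theta
    neg = y <= -theta                     # samples clipped at -theta
    reliable = ~(pos | neg)
    Phi, y_r = D_sel[reliable], y[reliable]

    # Unconstrained least squares, used as the starting point.
    x0, *_ = np.linalg.lstsq(Phi, y_r, rcond=None)

    # Both constraint families written as G @ x >= theta.
    G = np.vstack([D_sel[pos], -D_sel[neg]])
    if G.shape[0] == 0:
        return x0                         # nothing is clipped in this frame

    result = minimize(
        fun=lambda x: np.sum((y_r - Phi @ x) ** 2),
        x0=x0,
        jac=lambda x: 2 * Phi.T @ (Phi @ x - y_r),
        constraints=[{"type": "ineq",
                      "fun": lambda x: G @ x - theta,
                      "jac": lambda x: G}],
        method="SLSQP",
    )
    return result.x

# Hypothetical toy frame: one selected atom, two samples clipped at theta = 1.
D_sel = np.array([[1.0], [1.5], [1.5], [1.0]])
y = np.array([0.5, 1.0, 1.0, 0.5])
x = constrained_update(D_sel, y, theta=1.0)
```

In this toy case the unconstrained fit gives a coefficient of 0.5, which would reconstruct the clipped samples at 0.75, below the clipping level. The constraint pushes the coefficient up to 2/3 so that the reconstruction reaches $\theta$.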

SLIDE 8

Declipping intervals

◮ The most challenging inpainting tasks are the ones where long intervals are missing.
◮ In clipped audio, the missing samples are always clustered.
◮ Idea: decimate the signal by a factor of 2 to reduce the length of the clipped intervals:
  $y \to (y[2n])_{n \le N/2}$
◮ First declip the decimated signal.
◮ Same window length (512):
  ◮ longer in time,
  ◮ undersampled in frequency.
◮ Then declip the whole signal, with only the odd-indexed clipped samples left to estimate.
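
The two-stage scheme can be sketched as below. The argument `declip` is a hypothetical single-scale solver; in the demo it is replaced by plain linear interpolation, which is only a placeholder and not the constrained OMP of the previous slides. The test tone and clipping level are also assumptions.

```python
import numpy as np

def longest_clipped_run(y, theta):
    """Length of the longest run of consecutive clipped samples."""
    longest = current = 0
    for is_clipped in np.abs(y) >= theta:
        current = current + 1 if is_clipped else 0
        longest = max(longest, current)
    return longest

def multiscale_declip(y, theta, declip):
    """Two-stage declipping: even-indexed samples first, then the full signal.

    declip(signal, missing) must return `signal` with the samples flagged in
    the boolean array `missing` re-estimated and all others unchanged.
    """
    missing = np.abs(y) >= theta
    # Stage 1: declip the decimated signal y[2n].
    even_estimate = declip(y[::2], missing[::2])
    # Stage 2: the even samples now count as known, so only the
    # odd-indexed clipped samples remain to be estimated.
    partial = y.copy()
    partial[::2] = even_estimate
    still_missing = missing.copy()
    still_missing[::2] = False
    return declip(partial, still_missing)

def interpolate_missing(signal, missing):
    """Placeholder solver: linear interpolation across the missing samples."""
    idx = np.arange(len(signal))
    out = signal.copy()
    out[missing] = np.interp(idx[missing], idx[~missing], signal[~missing])
    return out

# Hypothetical input: a 100 Hz sine at 8 kHz, clipped at an assumed level.
theta = 0.6
s = np.sin(2 * np.pi * 100 * np.arange(800) / 8000)
y = np.clip(s, -theta, theta)

run_full = longest_clipped_run(y, theta)
run_decimated = longest_clipped_run(y[::2], theta)
restored = multiscale_declip(y, theta, interpolate_missing)
```

Comparing `run_full` with `run_decimated` shows the point of the decimation: each clipped interval covers about half as many samples in the decimated signal, so a frame of the same length sees shorter gaps.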

SLIDE 9

On simulated clippings

[Figure: average SNR (dB) versus clipping level for Janssen, dual-constraint OMP and msC-OMP. Panels: (a) Music, (b) Speech]
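
The plots report an average SNR in dB. The slides do not state whether it is computed over the whole signal or over the clipped samples only, so the sketch below uses the generic definition on whatever pair of arrays it is given; the function name `snr_db` and the toy values are assumptions.

```python
import numpy as np

def snr_db(reference, estimate):
    """Signal-to-noise ratio of `estimate` against the clean `reference`, in dB."""
    noise = reference - estimate
    return 10 * np.log10(np.sum(reference ** 2) / np.sum(noise ** 2))

# Toy check: a 10% amplitude error corresponds to about 20 dB.
reference = np.array([1.0, -1.0, 1.0, -1.0])
estimate = 0.9 * reference
value = snr_db(reference, estimate)
```

To restrict the measure to the restored samples, the same function can be called on `reference[clipped]` and `estimate[clipped]` for a boolean mask `clipped`.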

SLIDE 10

On recorded clipped data

[Figure: average SNR (dB) for Janssen, dual-constraint OMP and msC-OMP on recorded clipped data. Panels: (c) Music, (d) Speech]

SLIDE 11

Conclusion

Contributions:

◮ Audio clipping can be modelled as an inpainting problem with additional constraints.
◮ Sparse representations can solve that problem.
◮ Downsampling the signal improves the quality on real recorded signals.

Future work:

◮ investigate the influence of the parameters: window length, frequency sampling, overlap, ...
◮ extend the method to soft clipping
◮ refine the evaluation on recorded data