Sparse signal representation and the tunable Q-factor wavelet - - PowerPoint PPT Presentation

sparse signal representation and the tunable q factor
SMART_READER_LITE
LIVE PREVIEW

Sparse signal representation and the tunable Q-factor wavelet - - PowerPoint PPT Presentation

Sparse signal representation and the tunable Q-factor wavelet transform Ivan Selesnick Polytechnic Institute of New York University Brooklyn, New York 1 Introduction Problem: Decomposition of a signal into the sum of two components: 1.


slide-1
SLIDE 1

Sparse signal representation and the tunable Q-factor wavelet transform

Ivan Selesnick Polytechnic Institute of New York University Brooklyn, New York

1

slide-2
SLIDE 2

Introduction Problem: Decomposition of a signal into the sum of two components:

  • 1. Oscillatory (rhythmic, tonal) component
  • 2. Transient (non-oscillatory) component

Outline

  • 1. Signal resonance and Q-factors
  • 2. Morphological component analysis (MCA)
  • 3. Tunable Q-factor wavelet transform (TQWT)
  • 4. Split augmented Lagrangian shrinkage algorithm (SALSA)
  • 5. Examples

References (MDCT, etc)

  • S. N. Levine and J. O. Smith III. A sines+transients+noise audio representation for data compression and time/pitch scale
  • modications. (1998)
  • L. Daudet and B. Torr´
  • esani. Hybrid representations for audiophonic signal encoding. (2002)
  • S. Molla and B. Torr´
  • esani. An hybrid audio coding scheme using hidden Markov models of waveforms. (2005)
  • M. E. Davies and L. Daudet. Sparse audio representations using the MCLT. (2006)

2

slide-3
SLIDE 3

Oscillatory (rhythmic) and Transient Components in EEG Many measured signals have both an oscillatory and a non-oscillatory component.

1 2 3 4 5 6 7 8 9 10 −40 −20 20 40 EEG SIGNAL TIME (SECONDS)

Rhythms of the EEG: Delta 0 - 3 Hz Theta 4 - 7 Hz Alpha 8 - 12 Hz Beta 12 - 30 Hz Gamma 26 - 100 Hz Transients in EEG due to: 1) unwanted measurement artifacts 2) non-rhythmic brain activity (spikes, spindles, and vertex waves)

3

slide-4
SLIDE 4

Signal resonance and Q-factor

100 200 300 400 500 −1 1 PULSE 1 100 200 300 400 500 −1 1 PULSE 2 100 200 300 400 500 −1 1 PULSE 3 100 200 300 400 500 −1 1 PULSE 4 TIME (SAMPLES) 0.02 0.04 0.06 0.08 0.1 0.12 5 10 SPECTRUM

Q−factor = 1.15

0.02 0.04 0.06 0.08 0.1 0.12 20 40

Q−factor = 4.6

0.02 0.04 0.06 0.08 0.1 0.12 10 20

Q−factor = 1.15

0.02 0.04 0.06 0.08 0.1 0.12 50 100

Q−factor = 4.6

FREQUENCY (CYCLES/SAMPLE)

(a) Signals. (b) Spectra.

Figure 1: The resonance of an isolated pulse can be quantified by its Q-factor, defined as the ratio of its center frequency to its bandwidth. Pulses 1 and 3, essentially a single cycle in duration, are low-resonance pulses. Pulses 2 and 4, whose oscillations are more sustained, are high-resonance pulses. 4

slide-5
SLIDE 5

Resonance-based signal decomposition

100 200 300 400 500 −2 −1 1 2 TEST SIGNAL 100 200 300 400 500 −2 −1 1 2 HIGH−RESONANCE COMPONENT 100 200 300 400 500 −2 −1 1 2 LOW−RESONANCE COMPONENT 100 200 300 400 500 −2 −1 1 2 RESIDUAL TIME (SAMPLES) 100 200 300 400 500 −2 −1 1 2 TEST SIGNAL 100 200 300 400 500 −2 −1 1 2 LOW−PASS FILTERED 100 200 300 400 500 −2 −1 1 2 BAND−PASS FILTERED 100 200 300 400 500 −2 −1 1 2 HIGH−PASS FILTERED TIME (SAMPLES)

(a) Resonance-based decomposition. (b) Frequency-based filtering. Figure 2: Resonance- and frequency-based filtering. (a) Decomposition of a test signal into high- and low-resonance components. The high-resonance signal component is sparsely represented using a high Q-factor WT. Similarly, the low-resonance signal component is sparsely represented using a low Q-factor WT. (b) Decomposition of a test signal into low, mid, and high frequency components using LTI discrete-time filters. 5

slide-6
SLIDE 6

Resonance-based signal decomposition must be nonlinear

SIGNAL LOW−RESONANCE COMPONENT

= + = + = + = + = + = + = +

HIGH−RESONANCE COMPONENT Figure 3: Resonance-based signal decomposition must be nonlinear: The signal in the bottom left panel is the sum of the signals above it; however, the low-resonance component of a sum is not the sum of the low-resonance components. The same is true for the high-resonance component. Neither the low- nor high-resonance components satisfy the superposition property.

F(s1 + · · · + s6) 6= F(s1) + · · · + F(s6)

6

slide-7
SLIDE 7

Rational-dilation wavelet transform (RADWT) Prior work on rational-dilation wavelet transforms addresses the critically-sampled case.

  • 1. K. Nayebi, T. P. Barnwell III and M. J. T. Smith (1991)
  • 2. P. Auscher (1992)
  • 3. J. Kovacevic and M. Vetterli (1993)
  • 4. T. Blu (1993, 1996, 1998)
  • 5. A. Baussard, F. Nicolier and F. Truchetet (2004)
  • 6. G. F. Choueiter and J. R. Glass (2007)

RADWT (2009) gives a solution for the overcomplete case.

Reference: Bayram, Selesnick. Frequency-domain design of overcomplete rational-dilation wavelet transforms. IEEE Trans. on Signal Processing, 57, August 2009.

New: Tunable Q-factor wavelet transform — dilation need not be rational.

7

slide-8
SLIDE 8

Low-pass Scaling x(n) LPS ↵ y(n)

−π −απ απ π 0.5 1 FREQUENCY (ω) X(ω) −π π 0.5 1 FREQUENCY (ω) Y(ω) −π π 0.5 1 FREQUENCY (ω) X(ω) −π −π/α π/α π 0.5 1 FREQUENCY (ω) Y(ω)

Low-pass scaling with ↵ < 1 Low-pass scaling with ↵ > 1

8

slide-9
SLIDE 9

High-pass Scaling x(n) HPS y(n)

−π −(1−β)π (1−β)π π 0.5 1 FREQUENCY (ω) X(ω) −π π 0.5 1 FREQUENCY (ω) Y(ω) −π π 0.5 1 FREQUENCY (ω) X(ω) −π −(1−1/β)π (1−1/β)π π 0.5 1 FREQUENCY (ω) Y(ω)

High-pass scaling with < 1 High-pass scaling with > 1

9

slide-10
SLIDE 10

Tunable Q-factor wavelet transform (TQWT) x(n) H0(!) LPS ↵ v0(n) H1(!) HPS v1(n) LPS 1/↵ H⇤

0(!)

+

y(n) y0(n) HPS 1/ H⇤

1(!)

y1(n) 0 <  1, 0 < ↵ < 1, ↵ + > 1 Y0(!) = 8 > < > : |H0(!)|2 X(!) |!|  ↵ ⇡ ↵ ⇡ < |!|  ⇡ Y1(!) = 8 > < > : |!| < (1 ) ⇡ |H1(!)|2 X(!) (1 ) ⇡  |!|  ⇡ Y (!) = 8 > > > > > > < > > > > > > : |H0(!)|2 X(!) |!| < (1 ) ⇡ (|H0(!)|2 + |H1(!)|2) X(!) (1 ) ⇡  |!| < ↵ ⇡ |H1(!)|2 X(!) ↵ ⇡  |!|  ⇡

10

slide-11
SLIDE 11

For perfect reconstruction, the filters should satisfy |H0(!)| = 1, H1(!) = 0, |!|  (1 ) ⇡ H0(!) = 0, |H1(!)| = 1, ↵ ⇡  |!|  ⇡ The transition bands of H0(!) and of H1(!) must be chosen so that |H0(!)|2 + |H1(!)|2 = 1 (1 ) ⇡ < |!| < ↵ ⇡. (1)

−π −απ −(1−β)π (1−β)π απ π 0.5 1 FREQUENCY (ω) H0(ω) −π −απ −(1−β)π (1−β)π απ π 0.5 1 FREQUENCY (ω) H1(ω)

11

slide-12
SLIDE 12

The transition band of H0(!) and H1(!) can be constructed using any power-complementary function, ✓(!), ✓2(!) + ✓2(⇡ !) = 1, (2) We use the Daubechies filter frequency response with two vanishing moments, ✓(!) = 1 2 (1 + cos(!)) p 2 cos(!), |!|  ⇡. (3) Scale and dilate ✓(!) to obtain transition bands for H0(!) and H1(!).

12

slide-13
SLIDE 13

−π −απ −(1−β)π (1−β)π απ π 0.5 1 ω X(ω)

(a) Fourier transform of input signal, X(ω).

−π −απ −(1−β)π (1−β)π απ π 0.5 1 FREQUENCY (ω) H0(ω) −π −απ −(1−β)π (1−β)π απ π 0.5 1 FREQUENCY (ω) H1(ω)

(b) Frequency responses H0(ω) and H1(ω).

−π −απ −(1−β)π (1−β)π απ π 0.5 1 ω H0(ω) X(ω) −π −απ −(1−β)π (1−β)π απ π 0.5 1 ω H1(ω) X(ω)

(c) Fourier transforms of input signal after filtering.

−π −0.5π 0.5π π 0.5 1 ω V0(ω) −π −0.5π 0.5π π 0.5 1 ω V1(ω)

(d) Fourier transforms after scaling. 13

slide-14
SLIDE 14

Iterated Filters x(n) stage 1 stage 2 stage 3 c(n) d1(n) d2(n) d3(n) Redundancy (total oversampling rate) for many stages: r =

  • 1 ↵

When ↵  1,  1, we have the system equivalence:

x(n) H0(!) LPS ↵ · · · H0(!) LPS ↵ H1(!) HPS ⌘ H(j)

1 (!)

LPS ↵j1 HPS dj(n) j 1 stages

14

slide-15
SLIDE 15

The equivalent filter is H(j)

1 (!) :=

8 > > > > > > < > > > > > > : |!| < (1 ) ↵j1 ⇡ H1(!/↵j1)

j2

Y

m=0

H0(!/↵m) (1 ) ↵j1 ⇡  |!|  ↵j1 ⇡ ↵j1 ⇡ < |!|  ⇡. (4)

! (1 ) ↵j1⇡ ↵j1 ⇡ ⇡ Hj

1(!)

Q-factor: Q = !c BW = 2

  • Scaling factors ↵, :

= 2 Q + 1, ↵ = 1 r

15

slide-16
SLIDE 16

LOW Q-FACTOR WT HIGH Q-FACTOR WT

π/4 π/2 3π/4 π 0.5 1 FREQUENCY RESPONSES − 4 LEVELS alpha = 0.67, beta = 1.00, Q = 1.00, R = 3.00 FREQUENCY (ω) 50 100 150 200 250 300 −0.1 −0.05 0.05 0.1 WAVELET TIME (SAMPLES) π/4 π/2 3π/4 π 0.5 1 FREQUENCY RESPONSES − 13 LEVELS alpha = 0.87, beta = 0.40, Q = 4.00, R = 3.00 FREQUENCY (ω) 50 100 150 200 250 300 −0.1 −0.05 0.05 0.1 WAVELET TIME (SAMPLES)

Q-factor = 1 Q-factor = 4 Redundancy = 3 Redundancy = 3

16

slide-17
SLIDE 17

Finite-length Signals... The TWQT can be implemented for finite-length signals using the DFT (FFT).... x(n) H0(k) LPS N :N0 v0(n) H1(k) HPS N :N1 v1(n) LPS N0 :N H⇤

0(k)

+

y(n) y0(n) HPS N1:N H⇤

1(k)

y1(n)

17

slide-18
SLIDE 18

Low-pass scaling N :N0 x(n) LPS N :N0 y(n)

N/2 N − 1 X(k) N0/2 N0 − 1 Y (k) N/2 N − 1 X(k) N0/2 N0 − 1 Y (k)

Low-pass scaling with N0 < N Low-pass scaling with N0 > N.

18

slide-19
SLIDE 19

High-pass scaling N :N1 x(n) HPS N :N1 y(n)

N/2 N − 1 X(k) N1/2 N1 − 1 Y (k) N/2 N − 1 X(k) N1/2 N1 − 1 Y (k)

High-pass scaling with N1 < N High-pass scaling with N1 > N

19

slide-20
SLIDE 20

N/2 N − 1 X(k)

(a) DFT of input signal, X(k).

N/2 N − 1 H0(k)

T T

N/2 N − 1 H1(k)

T T

(b) Filters H0(k) and H1(k). The transitions-bands are indicated by ‘T’.

N/2 N − 1 H0(k)X(k) N/2 N − 1 H1(k)X(k)

(c) DFT of input signal after filtering.

N0/2 N0 − 1 V0(k) N1/2 N1 − 1 V1(k)

(d) DFT after scaling. 20

slide-21
SLIDE 21

Example: Low Q-factor vs High Q-factor WT

100 200 300 400 500 9 8 7 6 5 4 3 2 1

1.89% 6.14% 11.72% 16.33% 22.82% 26.80% 13.92% 0.38% 0.00%

SUBBANDS (LOW−Q RADWT) TIME (SAMPLES) SUBBAND

P = 2, Q = 3, S = 1, Levels = 10 Dilation = 1.50, Redundancy = 3.00

100 200 300 400 500 17 16 15 14 13 12 11 10 9 8 7

2.43% 7.97% 3.52% 0.54% 7.30% 16.03% 4.72% 1.74% 20.12% 29.69% 5.75%

SUBBANDS (HIGH−Q RADWT) TIME (SAMPLES) SUBBAND

P = 5, Q = 6, S = 2, Levels = 20 Dilation = 1.20, Redundancy = 3.00

21

slide-22
SLIDE 22

Low Q-factor vs High Q-factor WT after sparsification

100 200 300 400 500 9 8 7 6 5 4 3 2 1

0.00% 2.07% 8.78% 34.39% 0.79% 53.29% 0.68% 0.00% 0.00%

SUBBANDS (LOW−Q RADWT) TIME (SAMPLES) SUBBAND

P = 2, Q = 3, S = 1, Levels = 10 Dilation = 1.50, Redundancy = 3.00

100 200 300 400 500 17 16 15 14 13 12 11 10 9 8 7

0.01% 15.64% 0.11% 0.00% 0.02% 38.66% 0.02% 0.02% 6.26% 39.25% 0.00%

SUBBANDS (HIGH−Q RADWT) TIME (SAMPLES) SUBBAND

P = 5, Q = 6, S = 2, Levels = 20 Dilation = 1.20, Redundancy = 3.00

22

slide-23
SLIDE 23

Low Q-factor vs High Q-factor WT

100 200 300 400 500 9 8 7 6 5 4 3 2 1

2.58% 6.31% 10.70% 16.11% 21.44% 22.50% 13.51% 4.87% 1.42%

SUBBANDS (LOW−Q RADWT) TIME (SAMPLES) SUBBAND

P = 2, Q = 3, S = 1, Levels = 10 Dilation = 1.50, Redundancy = 3.00

100 200 300 400 500 19 18 17 16 15 14 13 12 11 10 9 8 7 6

1.82% 2.75% 3.02% 3.82% 5.65% 6.82% 6.95% 8.72% 12.37% 13.98% 12.04% 8.44% 5.41% 3.07%

SUBBANDS (HIGH−Q RADWT) TIME (SAMPLES) SUBBAND

P = 5, Q = 6, S = 2, Levels = 20 Dilation = 1.20, Redundancy = 3.00

23

slide-24
SLIDE 24

Low Q-factor vs High Q-factor WT after sparsification

100 200 300 400 500 9 8 7 6 5 4 3 2 1

0.00% 3.12% 10.99% 27.98% 10.60% 40.59% 3.53% 2.64% 0.23%

SUBBANDS (LOW−Q RADWT) TIME (SAMPLES) SUBBAND

P = 2, Q = 3, S = 1, Levels = 10 Dilation = 1.50, Redundancy = 3.00

100 200 300 400 500 19 18 17 16 15 14 13 12 11 10 9 8 7 6

2.43% 3.78% 2.35% 2.16% 4.12% 5.87% 6.25% 4.01% 18.71% 12.72% 9.69% 19.51% 1.36% 3.11%

SUBBANDS (HIGH−Q RADWT) TIME (SAMPLES) SUBBAND

P = 5, Q = 6, S = 2, Levels = 20 Dilation = 1.20, Redundancy = 3.00

24

slide-25
SLIDE 25

Constant-Q vs Constant-BW Constant-Q Constant-BW

0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2 0.5 1 FREQUENCY (CYCLES/SAMPLE) FREQUENCY RESPONSES 50 100 150 200 250 300 −1 1 ANALYSIS FUNCTION TIME (SAMPLES) 50 100 150 200 250 300 −1 1 ANALYSIS FUNCTION TIME (SAMPLES) 0.02 0.04 0.06 0.08 0.1 0.12 0.14 0.16 0.18 0.2 0.5 1 FREQUENCY (CYCLES/SAMPLE) FREQUENCY RESPONSES 50 100 150 200 250 300 −1 1 TIME (SAMPLES) ANALYSIS FUNCTION 50 100 150 200 250 300 −1 1 TIME (SAMPLES) ANALYSIS FUNCTION

fixed ‘resonance’ frequency-dependent ‘resonance’ frequency-dependent temporal duration fixed temporal duration

25

slide-26
SLIDE 26

Tunable Q-factor wavelet transform (TQWT) — Run Times Total execution time for forward/inverse N-point TQWT. N time (ms) N time (ms) 32 0.010 8192 4.938 64 0.025 16384 10.367 128 0.056 32768 21.935 256 0.120 65536 46.280 512 0.260 131072 96.659 1024 0.537 262144 203.580 2048 1.118 524288 448.498 4096 2.350 1048576 1014.719

  • Run times measured on a 2010 base-model Apple MacBook Pro (2.4 GHz Intel Core 2 Duo).
  • C implementation.

26

slide-27
SLIDE 27

Tunable Q-factor wavelet transform (TQWT) Summary:

  • 1. Fully-discrete, modestly overcomplete
  • 2. Exact perfect reconstruction (‘self-inverting’)
  • 3. Adjustable Q-factor:

Can attain higher Q-factors than (or same low Q-factor of) the dyadic WT. = ) Can achieve higher-frequency resolution needed for oscillatory signals.

  • 4. Samples the time-frequency plane more densely in both time and frequency.

= ) Exactly invertible, fully-discrete approximation of the continuous WT.

  • 5. FFT-based implementation

27

slide-28
SLIDE 28

Morphological Component Analysis (MCA) Given an observed signal x = x1 + x2, with x, x1, x2 2 RN, the goal of MCA is to estimate/determine x1 and x2 individually. Assuming that x1 and x2 can be sparsely represented in bases (or frames) Φ1 and Φ2 respectively, they can be estimated by minimizing the

  • bjective function,

J(w1, w2) = 1kw1k1 + 2kw2k1 with respect to w1 and w2, subject to the constraint: Φ1w1 + Φ2w2 = x. Then MCA provides the estimates ˆ x1 = Φ1w1 and ˆ x2 = Φ2w2.

Reference: Starck, Elad, Donoho. Image Decomposition via the Combination of Sparse Representations and a Variational Approach, IEEE Trans. on Image Processing, Oct 2005.

28

slide-29
SLIDE 29

Why not a quadratic cost function? If a quadratic cost function is minimized, J(w1, w2) = 1kw1k2

2 + 2kw2k2 2

subject to Φ1w1 + Φ2w2 = x, then, using Φ1Φt

1 = Φ2Φt 2 = I, the w1 and w2 can be found in closed form:

w1 = 2

2

2

1 + 2 2

Φt

1 x,

w2 = 2

1

2

1 + 2 2

Φt

2 x

and the estimated components, ˆ x1 = Φ1w1 and ˆ x2 = Φ2w2, are given by ˆ x1 = 2

2

2

1 + 2 2

x, ˆ x2 = 2

1

2

1 + 2 2

x Both ˆ x1 and ˆ x2 are just scaled versions of x. = ) No separation at all!

29

slide-30
SLIDE 30

MCA as a linear inverse problem The constrained optimization problem min

w1,w2

1kw1k1 + 2kw2k1 subject to Φ1w1 + Φ2w2 = x can be written as min

w

kλ wk1 subject to Hw = x where H = h Φ1 Φ2 i , w = 2 4w1 w2 3 5 and denotes point-by-point multiplication. This is ‘basis pursuit’, an `1-regularized linear inverse problem . . .

  • Non-differentiable
  • Convex

= ) We use a variant of SALSA (split augmented Lagrangian shrinkage algorithm).

Reference: Afonso, Bioucas-Dias, Figueiredo. Fast Image Recovery Using Variable Splitting and Constrained Optimization. IEEE Trans. on Image Processing, 2010.

30

slide-31
SLIDE 31

SALSA for MCA (bais pursuit form) Applying SALSA to the MCA problem yields the iterative algorithm: initialize: µ > 0, di (5) ui soft(wi + di, 0.5λi/µ) di, i = 1, 2 (6) c x Φ1 u1 Φ2 u2 (7) di 1 2Φt

i c

i = 1, 2 (8) wi di + ui i = 1, 2 (9) repeat (10) where soft(x, T) is the soft-threshold rule with threshold T, soft(x, T) = x max(0, 1 T/|x|). Note: no matrix inverses; only forward and inverse transforms.

31

slide-32
SLIDE 32

10 20 30 40 50 60 70 80 90 100 5.5 6 6.5 7 7.5 8 8.5 9 9.5 ITERATION OBJECTIVE FUNCTION ISTA SALSA

Figure 4: Reduction of objective function during the first 100 iterations. SALSA converges faster than ISTA. 32

slide-33
SLIDE 33

Example: Resonance-selective nonlinear band-pass filtering

100 200 300 400 −2 −1 1 2 (a) TEST SIGNAL TIME (SAMPLES) 0.1 0.2 0.3 0.4 0.5 0.5 1 FREQUENCY (CYCLES/SAMPLE) (b) TWO BAND−PASS FILTERS BAND−PASS FILTER 1 BAND−PASS FILTER 2 100 200 300 400 −2 −1 1 2 (c) OUTPUT OF BAND−PASS FILTER 1 TIME (SAMPLES) 100 200 300 400 −2 −1 1 2 (d) OUTPUT OF BAND−PASS FILTER 2 TIME (SAMPLES)

Figure 5: LTI band-pass filtering. The test signal (a) consists of a sinusoidal pulse of frequency 0.1 cycles/sample and a transient. Band-pass filters 1 and 2 in (b) are tuned to the frequencies 0.07 and 0.10 cycles/second respectively. The output signals, obtained by filtering the test signal with each of the two band-pass filters, are shown in (c) and (d). The output of band-pass filter 1, illustrated in (c), contains oscillations due to the transient in the test signal. Moreover, the transient oscillations in (c) have a frequency of 0.07 Hz even though the test signal (a) contains no sustained oscillatory behavior at this frequency. 33

slide-34
SLIDE 34

100 200 300 400 −2 −1 1 2 (a) HIGH−RESONANCE COMPONENT TIME (SAMPLES) 100 200 300 400 −2 −1 1 2 (b) LOW−RESONANCE COMPONENT TIME (SAMPLES) 100 200 300 400 −2 −1 1 2 (c) OUTPUT OF BAND−PASS FILTER 1 TIME (SAMPLES) 100 200 300 400 −2 −1 1 2 (d) OUTPUT OF BAND−PASS FILTER 2 TIME (SAMPLES)

Figure 6: Resonance-based decomposition and band-pass filtering. When resonance-based analysis method is applied to the test signal in Fig. 5a, it yields the high- and low-resonance components illustrated in (a) and (b). The output signals, obtained by filtering the high-resonance component (a) with each of the two band-pass filters shown in Fig. 5b, are illustrated in (c) and (d). The transient oscillations in (c) are substantially reduced compared to Fig. 5c. 34

slide-35
SLIDE 35

Example: Resonance-based decomposition of speech

0.05 0.1 0.15 −0.4 −0.2 0.2 0.4 (a) SPEECH SIGNAL [Fs = 16,000 SAMPLES/SECOND] 0.05 0.1 0.15 −0.4 −0.2 0.2 0.4 (b) HIGH−RESONANCE COMPONENT 0.05 0.1 0.15 −0.4 −0.2 0.2 0.4 (c) LOW−RESONANCE COMPONENT TIME (SECONDS) Figure 7: Decomposition of a speech signal (“I’m”) into high- and low-resonance components. The high-resonance component (b) contains the sustained oscillations present in the speech signal, while the low-resonance component (c) contains non-oscillatory transients. (The residual is not shown.) 35

slide-36
SLIDE 36

500 1000 1500 2000 2500 3000 0.005 0.01 (a) ORIGINAL SPEECH 500 1000 1500 2000 2500 3000 0.005 0.01 (b) HIGH−RESONANCE COMPONENT 500 1000 1500 2000 2500 3000 0.005 0.01 (c) LOW−RESONANCE COMPONENT FREQUENCY (Hz)

Figure 8: Frequency spectra of the speech signal in Fig. 7 and of the extracted high- and low-resonance components. The spectra are computed using the 50 msec segment from 0.05 to 0.10 seconds. The energy of each resonance component is widely distributed in frequency and their frequency-spectra overlap. 36

slide-37
SLIDE 37

0.05 0.1 0.15 −0.2 0.2 (a) RECONSTRUCTION FROM SUBBANDS 9, 10 0.05 0.1 0.15 −0.2 0.2 (b) RECONSTRUCTION FROM SUBBANDS 18, 19 0.05 0.1 0.15 −0.2 0.2 (c) RECONSTRUCTION FROM SUBBANDS 21−24 0.05 0.1 0.15 −0.2 0.2 (D) SUM OF ABOVE 3 SIGNALS TIME (SECONDS)

Figure 9: Frequency decomposition of high-resonance component in Fig. 7. Reconstructing the high-resonance component from a few subbands of the high Q-factor WT at a time, yields an efficient AM/FM decomposition. 37

slide-38
SLIDE 38

Constant bandwidth + Constant Q-factor

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.5 1 CONSTANT BANDWITH FREQUENCY 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.5 1 CONSTANT Q−FACTOR FREQUENCY

A constant bandwidth and a constant Q-factor decomposition can have high coherence due to some analysis functions, from each decomposition, having similar frequency support. This can degrade the results of MCA in principle.

38

slide-39
SLIDE 39

Constant Q-factor: High Q-factor + Low Q-factor

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.5 1 CONSTANT Q−FACTOR (HIGH Q−FACTOR) FREQUENCY 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.5 1 CONSTANT Q−FACTOR (LOW Q−FACTOR) FREQUENCY

Two constant Q-factor decompositions with markedly different Q-factors will have low coherence because no analysis functions from the two decompositions will have similar frequency support. This is beneficial for the operation of MCA.

39

slide-40
SLIDE 40

Small coherence between low and high Q-factor WTs f1 p Q1/f1 f1/Q1 Ψ1(f) f2 p Q2/f2 f2/Q2 Ψ2(f) f

Figure 10: For reliable resonance-based decomposition, the inner product between the low-Q and high-Q wavelets should be small for all dilations and translations. The computation of the maximum inner product is simplified by assuming the wavelets are ideal band-pass functions and expressing the inner product in the frequency domain.

The inner products can be defined in the frequency domain, ⇢(f1, f2) := Z Ψ1(f)Ψ2(f) d f, as a function of their center frequencies (equivalently, dilation). The maximum value of the inner product, ⇢(f1, f2), occurs when f2 = f1(2 + 1/Q1)/(2 + 1/Q2) and is given by ⇢max = s Q1 + 1/2 Q2 + 1/2, Q2 > Q1. (11)

40

slide-41
SLIDE 41

Constant Bandwidth: Narrow-band + Wide-band

0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.5 1 CONSTANT BANDWITH (NARROW−BAND) FREQUENCY 0.1 0.2 0.3 0.4 0.5 0.6 0.7 0.8 0.9 1 0.5 1 CONSTANT BANDWITH (WIDE−BAND) FREQUENCY

Two constant bandwidth decompositions with markedly different bandwidths will also have low coherence and are therefore also suitable transform for MCA-based signal decomposition. This gives a bandwidth- based decomposition, rather than a resonance-based decomposition.

41

slide-42
SLIDE 42

Conclusion: Resonance-based signal decomposition Low Q-factor WT used for sparse representation of the transient component. High Q-factor WT used for sparse representation of the oscillatory (rhythmic) component. Morphological component analysis (MCA) used to separate the two signal components.

  • Oscillatory component not necessarily high-pass — contains both low and high frequencies.
  • Transient component not necessarily a low-pass signal — contains sharp bumps and jumps.
  • Software available (Matlab software on web, C code by request).
  • http://eeweb.poly.edu/iselesni/TQWT/

42

slide-43
SLIDE 43

Graphical User Interface: Facilitates interactive selection and tuning of parameters.

43

slide-44
SLIDE 44

Group Sparsity with Fully Overlapping Groups

Po-Yu Chen

Basis pursuit denoising (no group structure): argmin

c

ky Φck2

2 + λkck1

Generalized basis pursuit denoising (overlapping group sparsity): argmin

c

ky Φck2

2 + λ N−3

X

i=0

p |c(i)|2 + |c(i + 1)|2 + |c(i + 2)|2 (group size 3)

1

slide-45
SLIDE 45

Example: Speech Enhancement

TIME (SECONDS) FREQUENCY 0.5 1 1.5 2 2.5 3 1000 2000 3000 4000 5000 6000 7000 −50 −45 −40 −35 −30 −25 −20 −15 −10 −5 TIME (SECONDS) FREQUENCY 0.5 1 1.5 2 2.5 3 1000 2000 3000 4000 5000 6000 7000 −50 −45 −40 −35 −30 −25 −20 −15 −10 −5

Noise-free signal spectrogram Noisy signal spectrogram

36

slide-46
SLIDE 46

Example: Speech Enhancement

(K1, K2) = (1, 1) Basis pursuit denoising using `1 norm, i.e. no group structure.

TIME (SECONDS) FREQUENCY 0.5 1 1.5 2 2.5 3 1000 2000 3000 4000 5000 6000 7000 −50 −45 −40 −35 −30 −25 −20 −15 −10 −5 TIME (SECONDS) FREQUENCY 0.7 0.8 0.9 1 1.1 1.2 1.3 1.4 1.5 1.6 4000 4500 5000 5500 6000 6500 7000 −50 −45 −40 −35 −30 −25 −20 −15 −10 −5

The spectrogram exhibits many spurious noise spikes which produces ‘musical noise’

37

slide-47
SLIDE 47

Example: Speech Enhancement

(K1, K2) = (2, 8)

TIME (SECONDS) FREQUENCY 0.5 1 1.5 2 2.5 3 1000 2000 3000 4000 5000 6000 7000 −50 −45 −40 −35 −30 −25 −20 −15 −10 −5

TIME (SECONDS) FREQUENCY 0.7 0.8 0.9 1 1.1 1.2 1.3 1.4 1.5 1.6 4000 4500 5000 5500 6000 6500 7000 −50 −45 −40 −35 −30 −25 −20 −15 −10 −5

The spectrogram does not exhibit spurious noise spikes. The denoised signal is free of ‘musical noise’ artifact

38

slide-48
SLIDE 48

Example: Speech Enhancement

(K1, K2) = (8, 2)

TIME (SECONDS) FREQUENCY 0.5 1 1.5 2 2.5 3 1000 2000 3000 4000 5000 6000 7000 −50 −45 −40 −35 −30 −25 −20 −15 −10 −5

TIME (SECONDS) FREQUENCY 0.7 0.8 0.9 1 1.1 1.2 1.3 1.4 1.5 1.6 4000 4500 5000 5500 6000 6500 7000 −50 −45 −40 −35 −30 −25 −20 −15 −10 −5

The spectrogram does not exhibit spurious noise spikes. The denoised signal is free of ‘musical noise’ artifact

39

slide-49
SLIDE 49

Quasi-Polynomial Approximation via Sparse Derivatives

Xiaoran Ning

y = x + noise Total variation denoising: argmin

x

ky xk2

2 + λkDxk1

Dual-component polynomial denoising: argmin

x1,x2

ky x1 x2k2

2 + λ1kDx1k1 + λ2kD2x2k1

2

First order difference (derivative) Second order derivative

slide-50
SLIDE 50

100 200 300 400 500 600 700 800 900 −0.5 0.5 1 1.5 The clean signal 100 200 300 400 500 600 700 800 900 −0.5 0.5 1 1.5 Y: Noisy signal, SNR = 15 dB 100 200 300 400 500 600 700 800 900 −0.5 0.5 1 1.5 The recovered signal: SNR = 29.1205 dB

Total Variation Denoising Stair case artifact

slide-51
SLIDE 51

100 200 300 400 500 600 700 800 900 100 200 300 400 500 600 700 800 900 −0.5 0.5 1 1.5 Y: Noisy signal, SNR = 15 dB 100 200 300 400 500 600 700 800 900 −1 1 R1: Estimate of X1 100 200 300 400 500 600 700 800 900 0.5 1 R2: Estimate of X2 1.5 The recovered signal = R1 + R2, SNR = 31.2347 dB

Quasi-piecewise constant componet Quasi-piecewise linear component

slide-52
SLIDE 52

To compare the two methods

100 200 300 400 500 600 700 800 900 −0.5 0.5 1 1.5 The clean signal 100 200 300 400 500 600 700 800 900 −0.5 0.5 1 1.5 Sparse Derivative Denoising: SNR = 31.2347 dB 100 200 300 400 500 600 700 800 900 −0.5 0.5 1 1.5 Total Variation Filtering: SNR = 29.1205 dB

Figure: Sparse Derivative Denoising and TV Filtering.

24

slide-53
SLIDE 53

. More slides...

44

slide-54
SLIDE 54

Comparison: EMD & TQWT We compare

  • Empirical Mode Decomposition (EMD)
  • Sparse TQWT Representation

A goal of EMD is to decompose a multicomponent signal into several narrow-band components (intrinsic mode functions). For some real signals, a sparse TQWT representation leads to a more reasonable decomposition than EMD (next two slides).

45

slide-55
SLIDE 55

0.02 0.04 0.06 0.08 0.1 0.12 1 2 3 4 5 6 7 Empirical mode decomposition (EMD) Time (seconds) IMF index

46

slide-56
SLIDE 56

0.02 0.04 0.06 0.08 0.1 0.12 recon residual band 4 band 3 band 2 band 1 Time (seconds) Sparse TQWT Representation

47

slide-57
SLIDE 57

Morphological Component Analysis (MCA) — Noisy signal case In the noisy case, we should not ask for exact equality. Given an observed signal x = x1 + x2 + n, with x, x1, x2 2 RN, where n is noise, the components x1 and x2 can be estimated by minimizing the objective function, J(w1, w2) = kx Φ1w1 Φ2w2k2

2 + 1kw1k1 + 2kw2k1

with respect to w1 and w2. Then MCA provides the estimates ˆ x1 = Φ1w1 and ˆ x2 = Φ2w2.

Reference: Starck, Elad, Donoho. Image Decomposition via the Combination of Sparse Representations and a Variational Approach, IEEE Trans. on Image Processing, Oct 2005.

48

slide-58
SLIDE 58

Why not a quadratic cost function? If the `2-norm is used for the penalty term, J(w1, w2) = kx Φ1w1 Φ2w2k2

2 + 1kw1k2 2 + 2kw2k2 2,

then, using Φ1Φt

1 = Φ2Φt 2 = I, the minimizing w1 and w2 can be found in closed form:

w1 = 1 1 + 2 + 1 2 Φt

1 x

w2 = 2 1 + 2 + 1 2 Φt

2 x

and the estimated components, ˆ x1 = Φ1w1 and ˆ x2 = Φ2w2, are given by ˆ x1 = 1 1 + 2 + 1 2 x ˆ x2 = 2 1 + 2 + 1 2 x Both ˆ x1 and ˆ x2 are just scaled versions of x. = ) No separation at all!

49

slide-59
SLIDE 59

MCA as a linear inverse problem The objective function J(w1, w2) = kx Φ1w1 Φ2w2k2

2 + 1kw1k1 + 2kw2k1

can be written as J(w) = kx Hwk2

2 + kλ wk1

where H = h Φ1 Φ2 i , w = 2 4w1 w2 3 5 and denotes point-by-point multiplication. An `1-regularized linear inverse problem . . .

  • Non-differentiable
  • Convex

= ) Use Iterative Soft Thresholding Algorithm (ISTA) or another algorithm to minimize J(w). Other algorithms include SparSA, TwIST, FISTA, SALSA, etc.

50

slide-60
SLIDE 60

Split augmented Lagrangian shrinkage algorithm (SALSA) SALSA is an algorithm for minimizing J(w) = kx Hwk2

2 + kwk1

SALSA is based on the minimization of min

u

f1(u) + f2(u) (12) by the alternating split augmented Lagrangian algorithm: u(k+1) = arg min

u f1(u) + µku v(k) d(k)k2 2

(13) v(k+1) = arg min

v f2(v) + µku(k+1) v d(k)k2 2

(14) d(k+1) = d(k) u(k+1) + v(k+1) (15)

Reference: Afonso, Bioucas-Dias, Figueiredo. Fast Image Recovery Using Variable Splitting and Constrained Optimization. IEEE Trans. on Image Processing, 2010.

51

slide-61
SLIDE 61

SALSA Applying SALSA to the MCA problem yields the iterative algorithm: initialize: µ > 0, d (16) ui soft(wi + di, 0.5λi/µ) di, i = 1, 2 (17) c x Φ1u1 Φ2u2 (18) di 1 µ + 2Φt

i c

i = 1, 2 (19) wi di + ui, i = 1, 2 (20) repeat (21) where soft(x, T) is the soft-threshold rule with threshold T, soft(x, T) = x max(0, 1 T/|x|). Note: no matrix inverses; only forward and inverse transforms.

52

slide-62
SLIDE 62

0.02 0.04 0.06 0.08 0.1 0.12 −0.2 −0.1 0.1 0.2 NOISY SPEECH WAVEFORM 0.02 0.04 0.06 0.08 0.1 0.12 −0.2 −0.1 0.1 0.2 HIGH Q−FACTOR COMPONENT 0.02 0.04 0.06 0.08 0.1 0.12 −0.2 −0.1 0.1 0.2 LOW Q−FACTOR COMPONENT 0.02 0.04 0.06 0.08 0.1 0.12 −0.2 −0.1 0.1 0.2 RESIDUAL TIME (SECONDS)

53