Fast L0-based Sparse Signal Recovery - Nick Kingsbury and Yingsong Zhang


  1. Fast L0-based Sparse Signal Recovery
     Nick Kingsbury and Yingsong Zhang
     Signal Processing Group, Department of Engineering, University of Cambridge
     ACVT seminar, Adelaide – May 2011

  2. Outline
     1 Introduction
     2 L0 reweighted-L2 Minimization: IRLS, Basic Model, Basic Algorithm, Fast Algorithm L0RL2
     3 Continuation Strategy: Geometry of the penalty function, L0RL2 with continuation
     4 Numerical Results
     5 Super-resolution
     6 Experimental results
     7 Conclusion

  3. Introduction
     The L0 minimisation problem: find x which gives
     $\min \|x\|_0$ subject to $\|y - \Phi x\|_2 \le \xi$   (1)
     where Φ is known and of size M × N with M < N.
     - L0: NP-hard combinatorial search for the exact solution.
     - L1: convex problem, but poor computation speed for large problems; greedy algorithms can be used, e.g. CoSaMP, NESTA.
     - Lp: iterative reweighted techniques, e.g. IRL1, IRLS.
     - L0: greedy algorithms usually used, e.g. IHT, ℓ0-AP, SL0.
     We propose a highly efficient form of IRLS which approximates L0 minimisation, allows N ~ 10^7 or 10^8, and has good reconstruction performance.
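To make the later algorithm slides concrete, here is a minimal Python setup (not from the talk; the sizes, sparsity level and noise level are illustrative) for the measurement model y = Φx + n behind problem (1):

```python
import numpy as np

rng = np.random.default_rng(0)
M, N, K = 64, 256, 8                                  # illustrative sizes, M < N
Phi = rng.standard_normal((M, N)) / np.sqrt(M)        # known M x N measurement matrix
x_true = np.zeros(N)
x_true[rng.choice(N, size=K, replace=False)] = rng.standard_normal(K)  # K-sparse signal
nu = 0.01                                             # measurement noise std (nu below)
y = Phi @ x_true + nu * rng.standard_normal(M)        # noisy observations y = Phi x + n
```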

  4. IRLS
     Iterative reweighted least-squares minimization (IRLS):
     $\min_x \, x^T S x$ subject to $\|y - \Phi x\|_2 = 0$,
     where S is a diagonal weight matrix. The solution of each step is the solution of
     $\min \|v\|_2$ subject to $\|y - \Phi x\|_2 = 0$ and $x = S^{-1/2} v$,
     which is unique, and the reweighted iterative rule is
     $x = S^{-1/2} (\Phi S^{-1/2})^{\dagger} y$   (2)
     where $\dagger$ denotes the pseudo-inverse (i.e. $H^{\dagger} = (H^T H)^{-1} H^T$). The inverse of $(H^T H)$ is difficult for large problems.
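A minimal sketch of one such reweighted step (2), assuming Φ is a dense matrix small enough for an explicit pseudo-inverse; the helper name is ours, not from the talk:

```python
import numpy as np

def irls_step(Phi, y, S_diag):
    """One classical IRLS update: x = S^{-1/2} (Phi S^{-1/2})^+ y, eq. (2)."""
    S_inv_half = 1.0 / np.sqrt(S_diag)      # diagonal of S^{-1/2}
    H = Phi * S_inv_half[None, :]           # Phi S^{-1/2} (column scaling)
    v = np.linalg.pinv(H) @ y               # minimum-norm v with H v = y
    return S_inv_half * v                   # map back: x = S^{-1/2} v
```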

  5. IRLS
     Iterative reweighted least-squares minimization (IRLS): choices of the weights $S_j$ for $0 \le p \le 1$:
     - Gorodnitsky & Rao (1997): $S_j = 1 / (|x_j|^2)^{1 - p/2}$
     - Chartrand & Yin (2008): $S_j = 1 / (|x_j|^2 + \epsilon^2)^{1 - p/2}$
     - Daubechies et al. (2009):
       Theoretical analysis shows that the local convergence rate of the algorithm is superlinear; smaller p values result in a faster convergence rate.
       Experiments show that the Lp norm with p → 0 can help to achieve a higher success rate in exact signal recovery.
     However, IRLS in this form strictly only models noise-free observations.
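For illustration, the two reweighting rules quoted above could be coded as below (hypothetical helper names). Note that the Gorodnitsky & Rao weights become infinite at exact zeros, which is the numerical instability the ε² term of Chartrand & Yin avoids:

```python
import numpy as np

def focuss_weights(x, p=1.0):
    """Gorodnitsky & Rao: S_j = (|x_j|^2)^-(1 - p/2), 0 <= p <= 1."""
    return (np.abs(x) ** 2) ** -(1.0 - p / 2.0)

def chartrand_yin_weights(x, eps, p=0.0):
    """Chartrand & Yin: S_j = (|x_j|^2 + eps^2)^-(1 - p/2), 0 <= p <= 1."""
    return (np.abs(x) ** 2 + eps ** 2) ** -(1.0 - p / 2.0)
```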

  6. Basic Model – Gaussian measurement noise
     Consider the noisy system
     $y = \Phi x + n$   (3)
     where we assume: $n \sim \mathcal{N}(0, \nu^2)$ is noise; Φ is a known M × N matrix; and the prior of x is a zero-mean scaled adaptive Gaussian model such that
     $p(x) \propto \sqrt{|S|} \, \exp(-\tfrac{1}{2} x^T S x)$,
     where S is a diagonal matrix whose jth diagonal entry is $S_j = 1/\sigma_j^2$. This is well suited to modelling wavelet coefficients.

  7. Basic Model – Prior for σj
     We further assume an independent prior $\propto \exp(-\tfrac{1}{2} \epsilon^2/\sigma_j^2)$ for each $\sigma_j$. This prior can be regarded as an approximation to the lower-bounded uniform prior $U(\epsilon, +\infty)$: it is approximately uniformly distributed in the region $(3\epsilon, +\infty)$, while it tends to 0 as $\sigma_j$ approaches 0, so as to prevent $\sigma_j^2$ getting too small and hence avoid numerical instability in the model.
     [Figure: Illustration of the prior for $\sigma_j$, plotting $\exp(-\epsilon^2/x^2)$ against the uniform prior $U(\epsilon, +\infty)$.]

  8. Basic Algorithm – Log MAP function and basic algorithm
     The prior on $\sigma_j$ gives the following negative log MAP function:
     $J(x, S) = \nu^2 \left[ x^T S x - \ln|S| + \epsilon^2 \sum_j S_j \right] + \|y - \Phi x\|^2$   (4)
     Minimizing J(x, S) results in the following iteration rules (basic algorithm):
     $x = \arg\min_x J(x, S) = (\Phi^T \Phi + \nu^2 S)^{-1} \Phi^T y$
     $\sigma_j^2 = \arg\min_{\sigma_j^2} J(x, S) = |x_j|^2 + \epsilon^2$, i.e. $S_j = 1/\sigma_j^2 = 1/(|x_j|^2 + \epsilon^2)$ for all j.   (5)
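A minimal sketch of the alternating updates in (5), assuming N is small enough that the explicit (Φ^TΦ + ν²S) solve is affordable; the fast algorithm on the next slide removes exactly this solve:

```python
import numpy as np

def basic_l0rl2(Phi, y, nu, eps, n_iters=50):
    """Alternate the x- and S-updates of (5); uses an explicit N x N solve."""
    N = Phi.shape[1]
    S = np.ones(N)                                           # diagonal of S
    PtP, Pty = Phi.T @ Phi, Phi.T @ y
    for _ in range(n_iters):
        x = np.linalg.solve(PtP + nu**2 * np.diag(S), Pty)   # x-update of (5)
        S = 1.0 / (np.abs(x)**2 + eps**2)                    # S-update of (5)
    return x
```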

  9. Fast Algorithm L0RL2 – to avoid $(\Phi^T \Phi + \nu^2 S)^{-1}$
     For wavelet-like signal spaces, we introduce the vector $\alpha = [\alpha_1 \ldots \alpha_K]$ and the diagonal operator $\Lambda_\alpha$ that multiplies the kth subspace/subband by $\alpha_k$:
     $(\Lambda_\alpha x)_k = \alpha_k x_k$ for $k = 1, \ldots, K$,
     where $x_k$ is a masked version of x with non-zeros only in subband k, and where $\alpha_k$ may be optimized independently for each subband to be the minimum $\alpha_k$ such that $\alpha_k x_k^T x_k \ge \|\Phi x_k\|^2$ for any k and x. Then, using majorisation minimization (MM) [3], we have the following new auxiliary function:
     $J_\alpha(x, S, z) = J(x, S) + (x - z)^T \Lambda_\alpha (x - z) - \|\Phi(x - z)\|^2$   (6)
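The talk does not spell out how the α_k are computed, but since the smallest valid α_k is the squared spectral norm of Φ restricted to subband k, one plausible sketch is power iteration on that restriction; subband_mask below is an assumed 0/1 indicator of subband k:

```python
import numpy as np

def subband_alpha(Phi, subband_mask, n_iters=100, seed=0):
    """Estimate the smallest alpha_k with alpha_k ||x_k||^2 >= ||Phi x_k||^2
    for all x_k supported on subband k (power iteration on Phi_k^T Phi_k)."""
    rng = np.random.default_rng(seed)
    v = rng.standard_normal(Phi.shape[1]) * subband_mask   # start inside subband k
    v /= np.linalg.norm(v)
    for _ in range(n_iters):
        w = (Phi.T @ (Phi @ v)) * subband_mask             # apply Phi_k^T Phi_k
        v = w / np.linalg.norm(w)
    return float(v @ (Phi.T @ (Phi @ v)))                  # Rayleigh quotient = alpha_k
```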

  10. Fast Algorithm L0RL2
     The new auxiliary function eliminates the difficult $x^T \Phi^T \Phi x$ term from J(x, S) and results in the following iteration rules (L0RL2):
     $x_{n+1} = (\Lambda_\alpha + \nu^2 S_n)^{-1} \left[ (\Lambda_\alpha - \Phi^T \Phi) z_n + \Phi^T y \right]$,  $z_{n+1} = \arg\min_z J_\alpha(x_{n+1}, S_n, z) = x_{n+1}$   (7a)
     $S_{n+1} = \mathrm{diag}\left( (|x_{j,n+1}|^2 + \epsilon^2)^{-1} \right)_{j=1,\ldots,N}$   (7b)
     Note that $(\Lambda_\alpha + \nu^2 S_n)$ is now a diagonal matrix and hence is easy to invert!
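A minimal single-subband sketch of iteration (7a)-(7b), in which Λ_α reduces to αI with α ≥ ‖Φ‖²; every update is a matrix-vector product or an elementwise operation, which is what makes very large N feasible when Φ has a fast implementation:

```python
import numpy as np

def fast_l0rl2(Phi, y, nu, eps, n_iters=200):
    """Iterate (7a)-(7b) with a single subband, i.e. Lambda_alpha = alpha * I."""
    N = Phi.shape[1]
    alpha = np.linalg.norm(Phi, 2) ** 2        # alpha = ||Phi||_2^2 majorises Phi^T Phi
    x = np.zeros(N)
    S = np.ones(N)
    Pty = Phi.T @ y
    for _ in range(n_iters):
        z = x                                          # z_{n+1} = x_{n+1}, from (7a)
        rhs = alpha * z - Phi.T @ (Phi @ z) + Pty      # (Lambda_alpha - Phi^T Phi) z + Phi^T y
        x = rhs / (alpha + nu**2 * S)                  # diagonal solve in (7a)
        S = 1.0 / (np.abs(x)**2 + eps**2)              # S-update (7b)
    return x
```

With the setup from the first sketch, a typical call would be x_hat = fast_l0rl2(Phi, y, nu=0.01, eps=1e-3); in practice the fixed (ν, ε) would be reached via the continuation schedule of the later slides.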

  11. Fast Algorithm L0RL2
     Recall the negative log MAP function (4) and the L0RL2 updates (7a)-(7b):
     $J(x, S) = \nu^2 \left[ x^T S x - \ln|S| + \epsilon^2 \sum_j S_j \right] + \|y - \Phi x\|^2$   (4)
     $x_{n+1} = (\Lambda_\alpha + \nu^2 S_n)^{-1} \left[ (\Lambda_\alpha - \Phi^T \Phi) z_n + \Phi^T y \right]$,  $z_{n+1} = \arg\min_z J_\alpha(x_{n+1}, S_n, z) = x_{n+1}$   (7a)
     $S_{n+1} = \mathrm{diag}\left( (|x_{j,n+1}|^2 + \epsilon^2)^{-1} \right)_{j=1,\ldots,N}$   (7b)
     Substituting eq. (7b) into the log MAP eq. (4) gives the cost function
     $J(x, S) = \mathrm{const} + \nu^2 \sum_j \ln\frac{x_j^2 + \epsilon^2}{\epsilon} + \|y - \Phi x\|^2$   (9)

  13. Continuation Strategy – Geometry of the penalty function
     Geometry of the log-sum penalty function $f_{\ln,\epsilon}(x) = C \ln\frac{x^2 + \epsilon^2}{\epsilon^2}$ with $\epsilon = 0.1$, compared to $\|x\|_1$ and a thresholded $\|x\|_0$.
     [Figure: plots of $f_{\ln,\epsilon}$, the L1 norm and the thresholded L0 penalty, over a wide range and zoomed in around $\pm\epsilon$. Compared to the L1 norm, the geometry of the log-sum penalty function lends itself well to detecting sparsity as $\epsilon \to 0$.]
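The penalties compared in the figure are straightforward to reproduce; a short sketch, where C is only a plotting scale chosen arbitrarily here:

```python
import numpy as np

eps, C = 0.1, 0.5                              # eps = 0.1 as in the figure; C is illustrative
x = np.linspace(-2.0, 2.0, 401)
f_log = C * np.log((x**2 + eps**2) / eps**2)   # log-sum penalty f_{ln,eps}
f_l1 = np.abs(x)                               # L1 penalty
f_l0 = (np.abs(x) > eps).astype(float)         # thresholded L0 penalty
```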

  14. Continuation Strategy – Geometry of the penalty function
     Geometry: unit ball of $f_{\ln,\epsilon}(x)$.
     [Figure: unit balls of $f_{\ln,\epsilon}$ for $\epsilon^2$ ranging from 10 down to 0.0001, together with the L2 ball. The effect of $\epsilon$ on the geometry of the unit ball of $f_{\ln,\epsilon}(x)$: when $\epsilon$ is large, the geometry of $f_{\ln,\epsilon}$ approximates the L2 ball; when $\epsilon$ becomes small, the geometry of $f_{\ln,\epsilon}$ approaches that of the L0 norm.]

  15. Continuation Strategy – L0RL2 with continuation
     Acceleration and parameter selection. Now we consider the minimization of our problem:
     $F_{\epsilon,\nu}(x) = \nu^2 \sum_j \ln\frac{|x_j|^2 + \epsilon^2}{\epsilon} + \|y - \Phi x\|^2$   (10)
     where there are two parameters which decide the solution path: ν balances the fidelity and sparsity, and ε decides the geometry of the penalty function. We formulate a sequence of such problems
     $F_{\epsilon_n,\nu_n}(x) = \nu_n^2 \sum_j \ln\frac{|x_j|^2 + \epsilon_n^2}{\epsilon_n} + \|y - \Phi x\|^2$   (11)
     starting from large $\nu_0$ and $\epsilon_0$, then simultaneously reducing $\nu_n$ and $\epsilon_n$ until $\nu_n = \nu$ and $\epsilon_n = \epsilon$. It is easy to see that $F_{\epsilon_n,\nu_n}(x)$ continuously deforms to $F_{\epsilon,\nu}(x)$. We try to ensure that the path of the global minima of $F_{\epsilon_n,\nu_n}(x)$ leads to the global minimum of $F_{\epsilon,\nu}(x)$.
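A hypothetical sketch of this continuation strategy wrapped around the single-subband L0RL2 updates; the starting values, decay factor and inner iteration count are illustrative choices, not taken from the talk:

```python
import numpy as np

def l0rl2_continuation(Phi, y, nu, eps, nu0=1.0, eps0=1.0, decay=0.7, inner_iters=20):
    """Approximately minimise F_{eps_n, nu_n} of (11) for a decreasing schedule of
    (nu_n, eps_n), warm-starting each stage from the previous solution."""
    N = Phi.shape[1]
    alpha = np.linalg.norm(Phi, 2) ** 2          # single-subband majorisation constant
    x = np.zeros(N)
    S = np.ones(N)
    Pty = Phi.T @ y
    nu_n, eps_n = max(nu0, nu), max(eps0, eps)   # start from large nu_0, eps_0
    while True:
        for _ in range(inner_iters):             # a few L0RL2 iterations at (nu_n, eps_n)
            rhs = alpha * x - Phi.T @ (Phi @ x) + Pty
            x = rhs / (alpha + nu_n**2 * S)
            S = 1.0 / (np.abs(x)**2 + eps_n**2)
        if nu_n == nu and eps_n == eps:          # target parameters reached
            break
        nu_n = max(decay * nu_n, nu)             # simultaneously reduce nu_n ...
        eps_n = max(decay * eps_n, eps)          # ... and eps_n towards (nu, eps)
    return x
```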
