

SLIDE 1

Iteratively Reweighted ℓ1 Approaches to Sparse Composite Regularization Phil Schniter

Joint work with Prof. Rizwan Ahmad (OSU)

Supported in part by NSF grant CCF-1018368.

MATHEON Conf. on Compressed Sensing and its Applications TU-Berlin — Dec 11, 2015

SLIDE 2

Introduction and Motivation for Composite Penalties

Outline

1. Introduction and Motivation for Composite Penalties
2. Co-L1 and its Interpretations
3. Co-IRW-L1 and its Interpretations
4. Numerical Experiments

Phil Schniter (Ohio State) Composite ℓ1 Regularization MATHEON — Dec’15 2 / 29

SLIDE 3

Introduction and Motivation for Composite Penalties

Introduction

Goal: Recover the signal x ∈ C^N from noisy linear measurements y = Φx + w ∈ C^M, where usually M ≪ N.

Approach: Solve the optimization problem

    x̂ = arg min_x γ‖y − Φx‖₂² + R(x),

with γ > 0 controlling the measurement fidelity.

Question: How should we choose the penalty/regularization R(x)?
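As an aside, a minimal numerical sketch (ours, not from the talk) of solving this problem for the special case R(x) = λ‖x‖₁ via proximal gradient (ISTA); the function name and parameters are illustrative assumptions:

```python
import numpy as np

def ista(Phi, y, lam, gamma=1.0, n_iters=3000):
    """Minimize gamma*||y - Phi x||_2^2 + lam*||x||_1 via ISTA.

    A sketch for the identity analysis operator (Psi = I); the talk's
    generalized-analysis case needs ADMM or a similar solver instead.
    """
    M, N = Phi.shape
    L = 2 * gamma * np.linalg.norm(Phi, 2) ** 2      # Lipschitz constant of the smooth term
    x = np.zeros(N)
    for _ in range(n_iters):
        grad = 2 * gamma * Phi.T @ (Phi @ x - y)     # gradient of gamma*||y - Phi x||_2^2
        z = x - grad / L                             # gradient step
        x = np.sign(z) * np.maximum(np.abs(z) - lam / L, 0.0)  # soft-threshold (prox of L1)
    return x
```

Any of the faster solvers named on the next slides (MFISTA, NESTA-UP, ...) plays the same role; ISTA is shown only because it fits in a few lines.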

SLIDE 4

Introduction and Motivation for Composite Penalties

Typical Choices of Penalty

Say Ψx is (approximately) sparse for an "analysis operator" Ψ ∈ C^{L×N}.

ℓ0 penalty: R(x) = ‖Ψx‖₀
    Impractical: the optimization problem is NP-hard.

ℓ1 penalty (generalized LASSO): R(x) = ‖Ψx‖₁
    Tightest convex relaxation of the ℓ0 penalty.
    Fast algorithms: ADMM, MFISTA, NESTA-UP, grAMPa, ...

Non-convex penalties:
    R(x) = ‖Ψx‖_p^p for p ∈ (0, 1) (via IRW-L2)
    R(x) = Σ_{l=1}^L log(ǫ + |ψᵀ_l x|) with ǫ ≥ 0 (via IRW-L1)

many others...

SLIDE 5

Introduction and Motivation for Composite Penalties

Choice of Analysis Operator

How to choose Ψ in practice?
Maybe a wavelet transform? Which one?
Maybe a concatenation of several transforms, Ψ = [Ψ₁; ...; Ψ_D] (e.g., SARA¹)?
What if the signal is more sparse in one dictionary than another? Can we compensate for this? Can we exploit this?

¹ Carrillo, McEwen, Van De Ville, Thiran, Wiaux, "Sparsity averaged reweighted analysis," IEEE SPL, 2013.

SLIDE 6

Introduction and Motivation for Composite Penalties

Example: Undecimated Wavelet Transform of MRI Cine

Note the different sparsity rate in each subband of a 1-level UWT:

SLIDE 7

Introduction and Motivation for Composite Penalties

Composite ℓ1 Penalties

We propose to use composite ℓ1 (Co-L1) penalties of the form

    R(x; λ) = Σ_{d=1}^D λ_d ‖Ψ_d x‖₁,   λ_d ≥ 0,

where the Ψ_d ∈ C^{L_d×N} have unit-norm rows. The Ψ_d could be chosen, for example, as
    different DWTs (i.e., db1, db2, db3, ..., db10),
    different subbands of a given DWT,
    row-subsets of I (i.e., group/hierarchical sparsity),
    or all of the above.

We then aim to simultaneously tune the weights {λd} and recover the signal x.
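The penalty itself is simple to evaluate; a NumPy sketch (the helper name is ours):

```python
import numpy as np

def co_l1_penalty(x, Psis, lams):
    """Composite L1 penalty R(x; lam) = sum_d lam_d * ||Psi_d x||_1.

    Psis: list of (L_d x N) analysis operators with unit-norm rows.
    lams: list of per-dictionary weights lam_d >= 0.
    """
    return sum(lam * np.abs(Psi @ x).sum() for lam, Psi in zip(lams, Psis))
```

With a single identity dictionary and unit weight, this reduces to the ordinary ℓ1 norm, as expected.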

SLIDE 8

Introduction and Motivation for Composite Penalties

The Co-L1 Algorithm

1: input: {Ψ_d}_{d=1}^D, Φ, y, γ > 0, ǫ ≥ 0
2: if Ψ_d x ∈ R^{L_d} then C_d = 1; elseif Ψ_d x ∈ C^{L_d} then C_d = 2
3: initialization: λ_d^{(1)} = 1 ∀d
4: for t = 1, 2, 3, ...
5:     x^{(t)} ← arg min_x γ‖y − Φx‖₂² + Σ_{d=1}^D λ_d^{(t)} ‖Ψ_d x‖₁
6:     λ_d^{(t+1)} ← C_d L_d / (ǫ + ‖Ψ_d x^{(t)}‖₁),  d = 1, ..., D
7: end
8: output: x^{(t)}

The algorithm
    leverages existing ℓ1 solvers (e.g., ADMM, MFISTA, NESTA-UP, grAMPa),
    reduces to the IRW-L1 algorithm [Figueiredo, Nowak '07] when L_d = 1 ∀d (single-atom dictionaries),
    applies to both real- and complex-valued cases.
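The outer loop can be sketched as follows, with any weighted-ℓ1 solver plugged in for step 5; the function names and the real-valued restriction (C_d = 1) are our assumptions:

```python
import numpy as np

def co_l1(Phi, y, Psis, gamma, eps=1e-3, n_outer=10, l1_solver=None):
    """Sketch of the Co-L1 iteration (real-valued case, C_d = 1).

    l1_solver(Phi, y, Psis, lams, gamma) must return
    arg min_x gamma*||y - Phi x||_2^2 + sum_d lams[d]*||Psi_d x||_1;
    any weighted-L1 solver (ADMM, MFISTA, ...) can be plugged in.
    """
    D = len(Psis)
    lams = np.ones(D)                                # step 3: lam_d^(1) = 1
    for _ in range(n_outer):                         # step 4
        x = l1_solver(Phi, y, Psis, lams, gamma)     # step 5: weighted-L1 subproblem
        for d, Psi in enumerate(Psis):               # step 6: reweighting
            lams[d] = Psi.shape[0] / (eps + np.abs(Psi @ x).sum())
    return x, lams
```

Note how step 6 automatically upweights the dictionaries in which the current iterate is sparser, which is precisely the "learn the weights" behavior motivated earlier.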

SLIDE 9

Introduction and Motivation for Composite Penalties

The Co-IRW-L1 Algorithm

1: input: {Ψ_d}_{d=1}^D, Φ, y, γ > 0
2: initialization: λ_d^{(1)} = 1 ∀d, W_d^{(1)} = I ∀d
3: for t = 1, 2, 3, ...
4:     x^{(t)} ← arg min_x γ‖y − Φx‖₂² + Σ_{d=1}^D λ_d^{(t)} ‖W_d^{(t)} Ψ_d x‖₁
5:     (λ_d^{(t+1)}, ǫ_d^{(t+1)}) ← arg max_{λ_d∈Λ, ǫ_d>0} log p(x^{(t)}; λ, ǫ),  d = 1, ..., D
6:     W_d^{(t+1)} ← diag( 1/(ǫ_d^{(t+1)} + |ψᵀ_{d,1} x^{(t)}|), ..., 1/(ǫ_d^{(t+1)} + |ψᵀ_{d,L_d} x^{(t)}|) ),  d = 1, ..., D
7: end
8: output: x^{(t)}

The algorithm tunes both λ_d and the diagonal W_d for all d: hierarchical weighting.
It also tunes the regularization parameters ǫ_d for all d.

SLIDE 10

Introduction and Motivation for Composite Penalties

Understanding Co-L1 and Co-IRW-L1

In the sequel, we provide four interpretations of each algorithm:

1. majorization-minimization (MM) for a particular non-convex penalty,
2. a particular approximation of ℓ0 minimization,
3. Bayesian estimation according to a particular hierarchical prior,
4. a variational EM algorithm under a particular prior.

SLIDE 11

Co-L1 and its Interpretations

Outline

1. Introduction and Motivation for Composite Penalties
2. Co-L1 and its Interpretations
3. Co-IRW-L1 and its Interpretations
4. Numerical Experiments

SLIDE 12

Co-L1 and its Interpretations

Optimization Interpretations of Co-L1

Co-L1 is an MM approach to the weighted log-sum optimization problem

    arg min_x γ‖y − Φx‖₂² + Σ_{d=1}^D L_d log(ǫ + ‖Ψ_d x‖₁)

and, as ǫ → 0, Co-L1 aims to solve the weighted ℓ1,0 problem

    arg min_x γ‖y − Φx‖₂² + Σ_{d=1}^D L_d 1_{‖Ψ_d x‖₁ > 0}

Note: L_d is the number of atoms in dictionary Ψ_d, and 1 is the indicator function.

Phil Schniter (Ohio State) Composite ℓ1 Regularization MATHEON — Dec’15 12 / 29

slide-13
SLIDE 13

Co-L1 and its Interpretations

Approximate-ℓ0 Interpretation of Log-Sum Penalty

(1/log(1/ǫ)) Σ_{n=1}^N log(ǫ + |x_n|)
    = (1/log(1/ǫ)) [ Σ_{n: x_n=0} log(ǫ) + Σ_{n: x_n≠0} log(ǫ + |x_n|) ]
    = ‖x‖₀ − N + Σ_{n: x_n≠0} log(ǫ + |x_n|) / log(1/ǫ)

[Figure: scaled log-sum penalty vs. |x_n| for ǫ ∈ {0.1, 0.001, 1e-13}, compared with the ℓ1 and ℓ0 penalties.]

As ǫ→0, the log-sum penalty becomes a scaled and shifted version of the ℓ0 penalty.
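A quick numerical check (ours, not from the talk) that the scaled log-sum approaches ‖x‖₀ − N:

```python
import numpy as np

def scaled_log_sum(x, eps):
    """The slide's scaled log-sum: (1/log(1/eps)) * sum_n log(eps + |x_n|)."""
    return np.sum(np.log(eps + np.abs(x))) / np.log(1.0 / eps)

x = np.array([0.0, 0.0, 0.7, 1.3])   # ||x||_0 = 2, N = 4
for eps in [1e-2, 1e-6, 1e-12]:
    # approaches ||x||_0 - N = -2 as eps -> 0
    print(eps, scaled_log_sum(x, eps))
```

The zero entries dominate (each contributes log(ǫ)/log(1/ǫ) = −1), while each nonzero entry's contribution vanishes at rate 1/log(1/ǫ).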

SLIDE 14

Co-L1 and its Interpretations

Bayesian Interpretations of Co-L1

Co-L1 is an MM approach to Bayesian MAP estimation under an AWGN likelihood and the hierarchical prior

    p(x|λ) = Π_{d=1}^D (λ_d/2)^{L_d} exp(−λ_d ‖Ψ_d x‖₁)   [i.i.d. Laplacian]
    p(λ) = Π_{d=1}^D Γ(0, 1/ǫ)   [i.i.d. Gamma (i.i.d. Jeffreys as ǫ → 0)]

and, as ǫ → 0, Co-L1 is a variational EM approach to estimating the (deterministic) λ under an AWGN likelihood and the prior

    p(x; λ) = Π_{d=1}^D (λ_d/2)^{L_d} exp(−λ_d (‖Ψ_d x‖₁ + ǫ))   [i.i.d. Laplacian as ǫ → 0]

SLIDE 15

Co-IRW-L1 and its Interpretations

Outline

1. Introduction and Motivation for Composite Penalties
2. Co-L1 and its Interpretations
3. Co-IRW-L1 and its Interpretations
4. Numerical Experiments

SLIDE 16

Co-IRW-L1 and its Interpretations

A Simplified Version of Co-IRW-L1

Consider the real-valued, fixed-ǫ_d variant of Co-IRW-L1.

1: input: {Ψ_d}_{d=1}^D, Φ, y, γ > 0, ǫ_d > 0 ∀d
2: initialization: λ_d^{(1)} = 1 ∀d, W_d^{(1)} = I ∀d
3: for t = 1, 2, 3, ...
4:     x^{(t)} ← arg min_x γ‖y − Φx‖₂² + Σ_{d=1}^D λ_d^{(t)} ‖W_d^{(t)} Ψ_d x‖₁
5:     λ_d^{(t+1)} ← [ (1/L_d) Σ_{l=1}^{L_d} log(1 + |ψᵀ_{d,l} x^{(t)}| / ǫ_d) ]⁻¹ + 1,  d = 1, ..., D
6:     W_d^{(t+1)} ← diag( 1/(ǫ_d + |ψᵀ_{d,1} x^{(t)}|), ..., 1/(ǫ_d + |ψᵀ_{d,L_d} x^{(t)}|) ),  d = 1, ..., D
7: end
8: output: x^{(t)}
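Steps 5 and 6 of this simplified variant are cheap elementwise updates; a NumPy sketch (the helper name is ours):

```python
import numpy as np

def coirw_updates(x, Psis, eps_d):
    """Steps 5-6 of the simplified (real-valued, fixed-eps) Co-IRW-L1:
    per-dictionary weight lam_d and diagonal reweighting matrix W_d."""
    lams, Ws = [], []
    for Psi, eps in zip(Psis, eps_d):
        v = np.abs(Psi @ x)                            # |psi_{d,l}^T x|, l = 1..L_d
        lam = 1.0 / np.mean(np.log1p(v / eps)) + 1.0   # step 5
        W = np.diag(1.0 / (eps + v))                   # step 6
        lams.append(lam)
        Ws.append(W)
    return lams, Ws
```

The resulting λ_d always exceeds 1, consistent with the real-valued domain Λ = (1, ∞) used by the full algorithm on slide 19.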

SLIDE 17

Co-IRW-L1 and its Interpretations

Optimization Interpretations of real-Co-IRW-L1-ǫ

Real-Co-IRW-L1-ǫ is an MM approach to the non-convex optimization problem

    arg min_x γ‖y − Φx‖₂² + Σ_{d=1}^D Σ_{l=1}^{L_d} log[ (ǫ_d + |ψᵀ_{d,l} x|) Σ_{i=1}^{L_d} log(1 + |ψᵀ_{d,i} x| / ǫ_d) ]

and, as ǫ_d → 0, Real-Co-IRW-L1-ǫ aims to solve the ℓ0 + weighted ℓ0,0 problem

    arg min_x γ‖y − Φx‖₂² + ‖Ψx‖₀ + Σ_{d=1}^D L_d 1_{‖Ψ_d x‖₀ > 0}

Note: L_d is the size of dictionary Ψ_d, and 1 is the indicator function.

SLIDE 18

Co-IRW-L1 and its Interpretations

Bayesian Interpretations of real-Co-IRW-L1-ǫ

Real-Co-IRW-L1 is an MM approach to Bayesian MAP estimation under an AWGN likelihood and the hierarchical prior

    p(x|λ) = Π_{d=1}^D Π_{l=1}^{L_d} (λ_d / (2ǫ_d)) (1 + |ψᵀ_{d,l} x| / ǫ_d)^{−(λ_d+1)}   [i.i.d. generalized Pareto]
    p(λ) = Π_{d=1}^D p(λ_d),  p(λ_d) ∝ 1/λ_d for λ_d > 0, else 0   [Jeffreys non-informative]

and Real-Co-IRW-L1 is a variational EM approach to estimating the (deterministic) λ under an AWGN likelihood and the prior

    p(x; λ) = Π_{d=1}^D Π_{l=1}^{L_d} ((λ_d − 1) / (2ǫ_d)) (1 + |ψᵀ_{d,l} x| / ǫ_d)^{−λ_d}   [i.i.d. generalized Pareto]

SLIDE 19

Co-IRW-L1 and its Interpretations

The Co-IRW-L1 Algorithm

Finally, we self-tune ǫd ∀d and allow for real or complex quantities:

1: input: {Ψ_d}_{d=1}^D, Φ, y, γ > 0
2: if Ψx ∈ R^L, use Λ = (1, ∞) and the real version of log p(x; λ, ǫ);
   elseif Ψx ∈ C^L, use Λ = (2, ∞) and the complex version of log p(x; λ, ǫ)
3: initialization: λ_d^{(1)} = 1 ∀d, W_d^{(1)} = I ∀d
4: for t = 1, 2, 3, ...
5:     x^{(t)} ← arg min_x γ‖y − Φx‖₂² + Σ_{d=1}^D λ_d^{(t)} ‖W_d^{(t)} Ψ_d x‖₁
6:     (λ_d^{(t+1)}, ǫ_d^{(t+1)}) ← arg max_{λ_d∈Λ, ǫ_d>0} log p(x^{(t)}; λ, ǫ),  d = 1, ..., D
7:     W_d^{(t+1)} ← diag( 1/(ǫ_d^{(t+1)} + |ψᵀ_{d,1} x^{(t)}|), ..., 1/(ǫ_d^{(t+1)} + |ψᵀ_{d,L_d} x^{(t)}|) ),  d = 1, ..., D
8: end
9: output: x^{(t)}

SLIDE 20

Numerical Experiments

Outline

1. Introduction and Motivation for Composite Penalties
2. Co-L1 and its Interpretations
3. Co-IRW-L1 and its Interpretations
4. Numerical Experiments

SLIDE 21

Numerical Experiments

Experiment: Synthetic finite difference image

[Figure: synthetic test images for α = 1 and α = 27.]

48×48 image with a total of 28 horizontal & vertical transitions; α = (# vertical transitions) / (# horizontal transitions).
Ψ₁ = vertical finite difference, Ψ₂ = horizontal finite difference.
"Spread-spectrum" Φ, sampling ratio M/N = 0.3.
AWGN @ 30 dB SNR.

[Figure: median recovery SNR [dB] vs. transition ratio α for L1, Co-L1, IRW-L1, and Co-IRW-L1.]

⇒ The composite algorithms significantly outperform the non-composite ones ⇒ Performance improves as sparsities become more disparate!
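For reference, vertical/horizontal finite-difference operators like Ψ₁ and Ψ₂ can be built with Kronecker products; a sketch under the assumption of simple forward differences without wraparound (the experiment's exact boundary handling may differ):

```python
import numpy as np

def finite_diff_operators(h, w):
    """Vertical and horizontal finite-difference analysis operators
    acting on a vectorized (column-major) h x w image.

    Rows are scaled by 1/sqrt(2) to satisfy the unit-norm-row
    requirement on the Psi_d (slide 7)."""
    I_h, I_w = np.eye(h), np.eye(w)
    Dh = (np.eye(h - 1, h, 1) - np.eye(h - 1, h)) / np.sqrt(2)  # (h-1) x h forward difference
    Dw = (np.eye(w - 1, w, 1) - np.eye(w - 1, w)) / np.sqrt(2)
    Psi_v = np.kron(I_w, Dh)    # vertical differences within each column
    Psi_h = np.kron(Dw, I_h)    # horizontal differences across columns
    return Psi_v, Psi_h
```

An image that varies only vertically then has ‖Ψ₂x‖₁ = 0 while ‖Ψ₁x‖₁ > 0, which is exactly the sparsity disparity (large α) that the composite algorithms exploit.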

SLIDE 22

Numerical Experiments

Experiment: Shepp-Logan Phantom

96 × 96 image; Ψ ∈ R^{7N×N} = 2D UWT-db1 with Ψ_d ∈ R^{N×N} ∀d; "spread-spectrum" Φ; AWGN @ 30 dB SNR.

[Figure: median recovery SNR [dB] vs. sampling ratio M/N for L1, Co-L1, IRW-L1, and Co-IRW-L1.]

⇒ The composite algorithms significantly outperform the non-composite ones ⇒ Performance gap is larger for small M/N

SLIDE 23

Numerical Experiments

Experiment: Cameraman

96 × 104 image; Ψ ∈ R^{7N×N} = 2D UWT-db1 with Ψ_d ∈ R^{N×N} ∀d; "spread-spectrum" Φ; AWGN @ 40 dB SNR.

[Figure: median recovery SNR [dB] vs. sampling ratio M/N for L1, Co-L1, IRW-L1, and Co-IRW-L1.]

⇒ The composite algorithms significantly outperform the non-composite ones ⇒ Performance gap is larger for small M/N

SLIDE 24

Numerical Experiments

Experiment: 1D Dynamic MRI

[Figure: x-y profile, x-t profile, and k-t sampling pattern.]

144 × 48 spatiotemporal profile extracted from an MRI cine; Ψ ∈ R^{3N×N}: [db1; db2; db3] 2D DWT; Φ: variable-density random Fourier; AWGN @ 30 dB SNR.

SLIDE 25

Numerical Experiments

Experiment: 1D Dynamic MRI (cont.)

[Figure: recoveries at sampling ratio M/N = 0.3 for L1, Co-L1, IRW-L1, and Co-IRW-L1.]

[Figure: median recovery SNR [dB] vs. sampling ratio M/N for L1, Co-L1, IRW-L1, and Co-IRW-L1.]

The composite algorithms significantly outperform the non-composite ones at small measurement ratios M/N.
Little advantage to Co-IRW-L1 over Co-L1 in this experiment.

SLIDE 26

Numerical Experiments

Average Runtimes for Previous Experiments

            Shepp-Logan   Cameraman   dMRI
L1             8.12 s       9.88 s    22.0 s
Co-L1          8.83 s      12.8 s     21.7 s
IRW-L1         7.95 s      12.7 s     24.1 s
Co-IRW-L1      9.29 s      16.9 s     29.6 s

The composite algorithms run only about 1.3× longer than the non-composite ones.

SLIDE 27

Numerical Experiments

Open Questions

Performance guarantees?
Convergence guarantees? (So far we have only established an asymptotic stationary-point condition using an MM analysis of Julien Mairal.²)
Design of the dictionaries {Ψ_d}?
Extension to matrix compressive sensing (e.g., low-rank, row-sparse, column-sparse, etc.)?

² J. Mairal, "Optimization with first-order surrogate functions," ICML, 2013.

SLIDE 28

Numerical Experiments

Conclusions

We proposed a new "composite-L1" approach to ℓ2-penalized signal reconstruction that learns and exploits differences in sparsity across sub-dictionaries.

Relative to standard ℓ1 methods, our composite ℓ1 methods give significant improvements in reconstruction SNR at low sampling rates, at the cost of a very mild complexity increase.

Our algorithms can be interpreted as MM approaches to non-convex optimization, approximate ℓ0 methods, Bayesian methods, and variational Bayesian methods.

SLIDE 29

Numerical Experiments

References

Thanks!

1. R. Ahmad and P. Schniter, "Iteratively Reweighted L1 Approaches to Sparse Composite Regularization," IEEE Transactions on Computational Imaging, to appear. (See also http://arxiv.org/abs/1504.05110v4)
2. R. Ahmad and P. Schniter, "Iteratively Reweighted L1 Approaches to L2-Constrained Sparse Composite Regularization." (See http://arxiv.org/abs/1504.05110v2)
