Toward Fast Transform Learning


  1. Toward Fast Transform Learning. Olivier Chabiron¹, François Malgouyres², Jean-Yves Tourneret¹, Nicolas Dobigeon¹. ¹Institut de Recherche en Informatique de Toulouse (IRIT), ²Institut de Mathématiques de Toulouse (IMT). This work is supported by the CIMI Excellence Laboratory. Curves and Surfaces, 2014.

  2. Outline: 1. Introduction; 2. Problem studied; 3. ALS Algorithm; 4. Approximation experiments; 5. Convergence experiments.

  3. Introduction: Introduction to sparse representation. Notation: objects $u$ live in $\mathbb{R}^P$, where $P$ is a set of pixels (such as $\{1,\dots,N\}^2$). In image processing, many problems are underdetermined. For example, in sparse representation we want to solve $\min_\alpha \|\alpha\|_*$ subject to $\|D\alpha - u\|_2 \le \tau$. Principle of sparse representation/approximation: for many applications, $\|\cdot\|_*$ should be $\|\cdot\|_0$, with $\|\alpha\|_0 = \#\{j : \alpha_j \neq 0\}$. Issue: the sparse representation problem is (in general) NP-hard, but successful algorithms exist when the columns of $D$ are almost orthogonal.
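The slides do not name a specific solver, so as an illustration here is a minimal NumPy sketch of one classical greedy algorithm for the $\|\cdot\|_0$ problem, orthogonal matching pursuit; the sizes and the planted support are assumptions chosen for the demo.

    import numpy as np

    def omp(D, u, k):
        """Greedy orthogonal matching pursuit: select k columns of D to explain u."""
        residual, support = u.copy(), []
        for _ in range(k):
            j = int(np.argmax(np.abs(D.T @ residual)))   # most correlated atom
            support.append(j)
            coef, *_ = np.linalg.lstsq(D[:, support], u, rcond=None)
            residual = u - D[:, support] @ coef          # re-fit, update residual
        alpha = np.zeros(D.shape[1])
        alpha[support] = coef
        return alpha

    rng = np.random.default_rng(0)
    D = rng.standard_normal((64, 256))
    D /= np.linalg.norm(D, axis=0)                       # unit-norm atoms
    alpha_true = np.zeros(256)
    alpha_true[[3, 40, 200]] = [1.0, -2.0, 0.5]
    u = D @ alpha_true
    print(np.nonzero(omp(D, u, k=3))[0])                 # typically recovers {3, 40, 200}

This greedy selection is exactly the regime the slide describes: it succeeds when the atoms are nearly orthogonal and can fail when they are strongly correlated.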

  4. Introduction: Dictionary learning. Two options: choose a fixed dictionary (cosines, wavelets, curvelets, ...): + fast transform, − limited sparsity; or learn the dictionary from the data: + better sparsity, − no fast transform. The DL problem: learn an efficient representation frame for an image class by solving $\operatorname{argmin}_{D,\alpha} \sum_u \mu \|D\alpha - u\|_2^2 + \|\alpha\|_*$. DL problems are often solved by alternating two steps (sketched below): $\operatorname{argmin}$ over $\alpha$ → the sparse coding stage; $\operatorname{argmin}$ over $D$ → the dictionary update stage.
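A compact sketch of that alternation, reusing omp from above; the MOD-style least-squares dictionary update is a stand-in for illustration, not the update studied in this talk.

    import numpy as np

    def dictionary_learning(U, n_atoms, k, n_iter=20, seed=0):
        """Alternate sparse coding (OMP) and a MOD-style dictionary update.
        U: (dim, n_signals) training signals, one per column."""
        rng = np.random.default_rng(seed)
        D = rng.standard_normal((U.shape[0], n_atoms))
        D /= np.linalg.norm(D, axis=0)
        for _ in range(n_iter):
            # sparse coding stage: argmin over the codes, D fixed
            A = np.column_stack([omp(D, U[:, i], k) for i in range(U.shape[1])])
            # dictionary update stage: argmin over D, codes fixed
            # (MOD: D = U A^T (A A^T)^{-1}, here via a pseudo-inverse)
            D = U @ np.linalg.pinv(A)
            D /= np.linalg.norm(D, axis=0) + 1e-12   # rescale atoms; codes reabsorb it
        return D, A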

  5. Introduction: Motivations (1). [Diagram: the product $u = D\alpha$, with $D$ a $\#P \times \#D$ matrix, where $\#P$ is the image size and $\#D$ the number of atoms.] Usually $\#D \gg \#P$. Computing $D\alpha$ costs $O(\#D\,\#P) > O(\#P^2)$ operations. Computing sparse codes is very expensive. Storing $D$ is very expensive.
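To make these orders of magnitude concrete (the $512 \times 512$ image and fourfold redundancy are illustrative assumptions, not figures from the talk): $\#P = 512^2 \approx 2.6 \times 10^5$ and $\#D = 4\,\#P \approx 1.0 \times 10^6$, so one product $D\alpha$ already costs $\#D\,\#P \approx 2.7 \times 10^{11}$ operations, and storing $D$ in double precision takes about 2 TB.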

  6. Introduction: Motivations (2). Our objectives: define a fast transform to compute $D\alpha$, and ensure a fast dictionary update so that larger atoms can be learned.

  7. Introduction: Model. Model for a dictionary update with a single atom $H \in \mathbb{R}^P$. How to include every possible translation of $H$? Via the convolution $\sum_{p' \in P} \alpha_{p'} H_{p - p'} = (\alpha * H)_p$. Model: the image is a sum of weighted translations of one atom, $u = \alpha * H + b$, (1) where $u \in \mathbb{R}^P$ is the image data, $\alpha \in \mathbb{R}^P$ is the code, $H \in \mathbb{R}^P$ is the target atom, and $b$ is noise.
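A minimal sketch of synthesizing data from model (1); the image size, the atom, the code density, and the noise level are placeholder assumptions.

    import numpy as np
    from scipy.signal import fftconvolve

    rng = np.random.default_rng(0)
    N = 128
    alpha = rng.standard_normal((N, N)) * (rng.random((N, N)) < 0.02)  # sparse code
    H = np.outer(np.hanning(17), np.hanning(17))                       # one smooth atom
    b = 0.01 * rng.standard_normal((N, N))                             # noise
    u = fftconvolve(alpha, H, mode="same") + b                         # u = alpha * H + b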

  8. Introduction: Fast transform, how? Atoms are computed as a composition of $K$ convolutions, $H \approx h^1 * h^2 * \cdots * h^K$. The kernels $(h^k)_{1 \le k \le K}$ have constrained supports defined by mappings $S^k$: $\forall k \in \{1,\dots,K\}$, $\operatorname{supp}(h^k) \subset \operatorname{rg}(S^k) = \{S^k(1),\dots,S^k(S)\}$, where $\operatorname{rg}(S^k)$ contains all possible locations of the non-zero elements of $h^k$. Figure: tree structure for a dictionary. Notation: $h = (h^k)_{1 \le k \le K} \in (\mathbb{R}^P)^K$.
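The point of the factorization is associativity: applying $K$ small kernels to $\alpha$ costs $O(K S\,\#P)$, versus one convolution with the large composed atom. A sketch (the depth and kernel sizes are illustrative):

    import numpy as np
    from scipy.signal import fftconvolve

    rng = np.random.default_rng(1)
    kernels = [rng.standard_normal((3, 3)) for _ in range(4)]  # K = 4 small kernels

    H = kernels[0]                       # the equivalent big atom h1 * ... * hK
    for hk in kernels[1:]:
        H = fftconvolve(H, hk)           # support grows: 3x3 -> ... -> 9x9

    alpha = rng.standard_normal((128, 128))
    y_fast = alpha
    for hk in kernels:                   # K cheap convolutions ...
        y_fast = fftconvolve(y_fast, hk)
    y_slow = fftconvolve(alpha, H)       # ... equal one convolution with H
    print(np.allclose(y_fast, y_slow))   # True, by associativity of *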

  9. Introduction: Example of support mapping. Figure: supports $(S^k)_{1 \le k \le 4}$ of size $S = 3 \times 3$, upsampled by a factor $k$.
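One plausible reading of this construction, as $3 \times 3$ grids of offsets dilated by the factor $k$ (an assumption on my part; the transcript only keeps the figure caption):

    # Hypothetical support mapping: a 3x3 offset grid upsampled by factor k.
    def support(k, size=3):
        return [(k * i, k * j) for i in range(size) for j in range(size)]

    # support(1) is a dense 3x3 patch; support(4) spreads 9 taps over a 9x9 area,
    # so composed kernels can reach large supports with few non-zeros.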

  10. Problem studied: Outline: 1. Introduction; 2. Problem studied; 3. ALS Algorithm; 4. Approximation experiments; 5. Convergence experiments.

  11. Problem studied: ($P_0$), first formulation. $(P_0)$: $\operatorname{argmin}_{(h^k)_{1 \le k \le K} \in (\mathbb{R}^P)^K} \|\alpha * h^1 * \cdots * h^K - u\|_2^2$ s.t. $\operatorname{supp}(h^k) \subset \operatorname{rg}(S^k)$. Energy gradient: $\frac{\partial E_0}{\partial h^k}(h) = 2\,\widetilde{H^k} * (\alpha * h^1 * \cdots * h^K - u)$, (2) where $H^k = \alpha * h^1 * \cdots * h^{k-1} * h^{k+1} * \cdots * h^K$, (3) and where the $\tilde{\cdot}$ operator is defined for any $h \in \mathbb{R}^P$ by $\tilde{h}_p = h_{-p}$, $\forall p \in P$. (4)
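A finite-difference check of formulas (2)–(4), in 1D with circular convolutions for simplicity (an assumption; the slides work on a pixel grid $P$):

    import numpy as np

    rng = np.random.default_rng(0)
    N, K, k = 32, 3, 1                    # signal length, depth, tested kernel index
    cconv = lambda a, b: np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))
    flip = lambda x: np.roll(x[::-1], 1)  # (flip x)_p = x_{-p mod N}, as in (4)

    alpha, u = rng.standard_normal(N), rng.standard_normal(N)
    h = [rng.standard_normal(N) / np.sqrt(N) for _ in range(K)]

    def E0(h):
        r = alpha
        for hk in h:
            r = cconv(r, hk)
        return np.sum((r - u) ** 2)

    Hk = alpha                            # H^k: alpha convolved with all kernels but h^k, (3)
    for j, hj in enumerate(h):
        if j != k:
            Hk = cconv(Hk, hj)
    grad = 2 * cconv(flip(Hk), cconv(Hk, h[k]) - u)   # formula (2)

    eps, g_fd = 1e-6, np.zeros(N)
    for p in range(N):
        hp = [x.copy() for x in h]
        hp[k][p] += eps
        g_fd[p] = (E0(hp) - E0(h)) / eps
    print(np.max(np.abs(grad - g_fd)))    # tiny: the gradient formula checks out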

  12. Problem studied: ($P_0$), shortcoming. If $h^1 = h^2 = 0$, then $\nabla E_0(h) = 0$, although $h$ is not a global minimum. Another view: $\forall (\mu_k)_{1 \le k \le K} \in \mathbb{R}^K$ such that $\prod_{k=1}^K \mu_k = 1$, we have $E_0\big((\mu_k h^k)_{1 \le k \le K}\big) = E_0(h)$, while for any $k \in \{1,\dots,K\}$, $\frac{\partial E_0}{\partial h^k}\big((\mu_k h^k)_{1 \le k \le K}\big) = \frac{1}{\mu_k}\,\frac{\partial E_0}{\partial h^k}(h)$. The gradient therefore depends on quantities that are irrelevant to the value of the objective function.

  13. Problem studied: New formulation, problem ($P_1$). Second formulation: $(P_1)$: $\operatorname{argmin}_{\lambda \ge 0,\; h \in \mathcal{D}} \|\lambda\,\alpha * h^1 * \cdots * h^K - u\|_2^2$, with $\mathcal{D} = \big\{ h \in (\mathbb{R}^P)^K \mid \forall k \in \{1,\dots,K\},\ \|h^k\|_2 = 1 \text{ and } \operatorname{supp}(h^k) \subset \operatorname{rg}(S^k) \big\}$. Reminder: $h = (h^k)_{1 \le k \le K} \in (\mathbb{R}^P)^K$. See L. De Lathauwer, B. De Moor, J. Vandewalle, "On the best rank-1 and rank-$(R_1, R_2, \dots, R_N)$ approximation of higher-order tensors", SIAM Journal on Matrix Analysis and Applications, 21(4):1324–1342, 2000.

  14. Problem studied: Existence of a solution of ($P_1$). Proposition [existence of a solution]: for any $(u, \alpha, (S^k)_{1 \le k \le K}) \in \mathbb{R}^P \times \mathbb{R}^P \times (P^S)^K$, if $\alpha * h^1 * \cdots * h^K \neq 0$ for all $h \in \mathcal{D}$, (5) then problem ($P_1$) has a minimizer. Proof idea: use the compactness of $\mathcal{D}$ and the coercivity in $\lambda$ of the objective function.

  15. Problem studied: Link between ($P_0$) and ($P_1$). Proposition [($P_1$) is equivalent to ($P_0$)]: let $(u, \alpha, (S^k)_{1 \le k \le K}) \in \mathbb{R}^P \times \mathbb{R}^P \times (P^S)^K$ be such that (5) holds. For any $(\lambda, h) \in \mathbb{R} \times (\mathbb{R}^P)^K$, consider the kernels $g = (g^k)_{1 \le k \le K} \in (\mathbb{R}^P)^K$ defined by $g^1 = \lambda h^1$ and $g^k = h^k$, $\forall k \in \{2,\dots,K\}$. (6) The following statements hold: 1. if $(\lambda, h)$ is a stationary point of ($P_1$) and $\lambda > 0$, then $g$ is a stationary point of ($P_0$); 2. if $(\lambda, h)$ is a global minimizer of ($P_1$), then $g$ is a global minimizer of ($P_0$).

  16. ALS Algorithm: Outline: 1. Introduction; 2. Problem studied; 3. ALS Algorithm (principle of the algorithm; computations; initialization and restart); 4. Approximation experiments; 5. Convergence experiments.

  17. ALS Algorithm: Principle of the algorithm, block formulation of ($P_1$). Problem ($P_k$): $\operatorname{argmin}_{\lambda \ge 0,\; h \in \mathbb{R}^P} \|\lambda\,\alpha * h^1 * \cdots * h^{k-1} * h * h^{k+1} * \cdots * h^K - u\|_2^2$, s.t. $\operatorname{supp}(h) \subset \operatorname{rg}(S^k)$ and $\|h\|_2 = 1$, where the kernels $h^{k'}$ are held fixed for all $k' \neq k$.

  18. ALS Algorithm: Principle of the algorithm, algorithm overview (a Python skeleton follows below).
Algorithm 1: overview of the ALS algorithm.
Input: $u$: target measurements; $\alpha$: known coefficients; $(S^k)_{1 \le k \le K}$: supports of the kernels $(h^k)_{1 \le k \le K}$.
Output: $\lambda$ and kernels $(h^k)_{1 \le k \le K}$ such that $\lambda\, h^1 * \cdots * h^K \approx H$.
begin
    Initialize the kernels $(h^k)_{1 \le k \le K}$;
    while not converged do
        for $k = 1, \dots, K$ do
            update $\lambda$ and $h^k$ with a minimizer of ($P_k$).
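The same loop as a self-contained 1D skeleton (circular convolutions, random unit-norm initialization, and an objective-stall stopping rule are my assumptions); solve_Pk stands for the per-kernel minimizer of ($P_k$) made explicit on the next two slides.

    import numpy as np

    def als(u, alpha, supports, solve_Pk, n_outer=50, tol=1e-8):
        """ALS skeleton: cycle over kernels, solving (P_k) one h^k at a time.
        solve_Pk(u, alpha, h, k) must return (lam, h_k) with h^{k'} fixed, k' != k."""
        N, K = len(u), len(supports)
        cconv = lambda a, b: np.real(np.fft.ifft(np.fft.fft(a) * np.fft.fft(b)))
        rng = np.random.default_rng(0)
        h = []
        for S in supports:                    # random unit-norm taps on supp S^k
            hk = np.zeros(N)
            hk[list(S)] = rng.standard_normal(len(S))
            h.append(hk / np.linalg.norm(hk))
        lam, prev = 1.0, np.inf
        for _ in range(n_outer):
            for k in range(K):
                lam, h[k] = solve_Pk(u, alpha, h, k)
            r = alpha
            for hk in h:
                r = cconv(r, hk)
            err = np.sum((lam * r - u) ** 2)  # ||lam alpha*h1*...*hK - u||_2^2
            if prev - err < tol:              # stop when the objective stalls
                break
            prev = err
        return lam, h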

  19. ALS Algorithm: Computations, matrix formulation of ($P_k$). $(P_k)$: $\operatorname{argmin}_{\lambda \ge 0,\; h \in \mathbb{R}^S} \|\lambda\, C_k h - u\|_2^2$ s.t. $\|h\|_2 = 1$. Alternative, ($P'_k$): $\operatorname{argmin}_{h \in \mathbb{R}^S} \|C_k h - u\|_2^2$. ($P'_k$) has a minimizer $h^* \in \mathbb{R}^S$, and computing a stationary point yields $h^* = (C_k^T C_k)^{-1} C_k^T u$. (7)

  20. ALS Algorithm: Computations, update rule. Find $h^*$, the solution of ($P'_k$), then update $\lambda = \|h^*\|_2$ and $h^k = h^* / \|h^*\|_2$ if $\|h^*\|_2 \neq 0$, and $h^k = \frac{1}{\sqrt{S}}\,\mathbb{1}_{\{1,\dots,S\}}$ otherwise. (8)
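Slides 19 and 20 combined as one update, assuming $C_k$ has been assembled explicitly (the function name is mine):

    import numpy as np

    def update_kernel(Ck, u):
        """Solve (P'_k) by the normal equations (7), then renormalize as in (8)."""
        h_star = np.linalg.solve(Ck.T @ Ck, Ck.T @ u)  # h* = (C^T C)^{-1} C^T u
        norm = np.linalg.norm(h_star)
        if norm != 0:
            return norm, h_star / norm                 # lambda = ||h*||, h^k = h*/||h*||
        S = Ck.shape[1]
        return 0.0, np.ones(S) / np.sqrt(S)            # degenerate case: constant kernel

This is the body a solve_Pk callback for the ALS skeleton above would wrap, after building $C_k$ from $\alpha$ and the fixed kernels.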

  21. ALS Algorithm: Computations, the matrix $C_k$. $C_k h$ is the $\#P$-vector with entries $(C_k h)_p = \sum_{s=1}^S h_s\, H^k_{p - S^k(s)}$, so $C_k$ is the $\#P \times S$ matrix whose columns are shifted copies of $H^k$. $C_k^T u$ is the $S \times 1$ vector with entries $\sum_{p \in P} H^k_{p - S^k(s)}\, u_p$, computed with complexity $O(S\,\#P)$. $C_k^T C_k$ is the $S \times S$ matrix with entries $\sum_{p \in P} H^k_{p - S^k(s)}\, H^k_{p - S^k(s')}$, computed with complexity $O(S^2\,\#P)$. (A sketch of these shift-based sums follows below.)
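A sketch of those sums in 1D with circular shifts for simplicity (an assumption; the talk uses 2D supports), where Hk is the fixed-kernel product $H^k$ of slide 11:

    import numpy as np

    def normal_equation_pieces(Hk, u, offsets):
        """C_k^T u and C_k^T C_k from shifted copies of H^k.
        offsets: the support mapping S^k, here 1D circular shifts."""
        # shifted[s] is the #P-vector p -> H^k_{p - S^k(s)}
        shifted = np.stack([np.roll(Hk, off) for off in offsets])  # (S, #P)
        CtU = shifted @ u            # S entries, O(S #P) operations
        CtC = shifted @ shifted.T    # S x S entries, O(S^2 #P) operations
        return CtU, CtC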
