Dictionary learning - fast and dirty
Karin Schnass, Department of Mathematics, University of Innsbruck


  1. Dictionary learning - fast and dirty. Karin Schnass, Department of Mathematics, University of Innsbruck, karin.schnass@uibk.ac.at. FWF - Der Wissenschaftsfonds. Dagstuhl, August 31.

  2. why do we care about sparsity again? A sparse representation of the data is the basis for
     - efficient data processing, e.g. denoising, compressed sensing, inpainting (example: inpainting; J. Mairal, F. Bach, J. Ponce, G. Sapiro, "Online learning for matrix factorization and sparse coding"),
     - efficient data analysis, e.g. source separation, anomaly detection, sparse components (example: sparse components; D.J. Field, B.A. Olshausen, "Emergence of simple-cell receptive field properties by learning a sparse code for natural images").
     In all examples: the sparser, the more efficient.

  3. why do we care about dictionary learning? data: Y = (y_1, ..., y_N), N vectors y_n ∈ R^d, with d and N large. Compared to hand-crafting a dictionary: no need for intuition, and it takes days rather than years.

  4. Let's do it. We have: data Y, and a model (Y is S-sparse in a d × K dictionary Φ). We want: an algorithm (fast, cheap), and guarantees that the algorithm will find Φ. Promising directions:
     - graph clustering algorithms (not so cheap)
     - tensor methods (not so cheap) - later today!
     - (alternating) optimisation (not so many guarantees)
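To make the model concrete, here is a minimal Python sketch of the data model above (the function name, the random Gaussian dictionary, and the uniformly random supports are my illustrative choices, not part of the talk):

```python
import numpy as np

def make_sparse_data(d=64, K=128, S=4, N=10000, seed=0):
    """Draw N training signals that are exactly S-sparse in a
    random d x K dictionary Phi with unit-norm columns."""
    rng = np.random.default_rng(seed)
    Phi = rng.standard_normal((d, K))
    Phi /= np.linalg.norm(Phi, axis=0)           # normalised columns
    X = np.zeros((K, N))
    for n in range(N):
        support = rng.choice(K, size=S, replace=False)
        X[support, n] = rng.standard_normal(S)   # S-sparse coefficients
    return Phi @ X, Phi, X                       # data Y = Phi X
```

Feeding such a Y, together with a perturbed copy of Φ as input dictionary, reproduces the local-convergence setting of the following slides.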

  5. warm up - a bit of K-SVD. Since our signals are S-sparse, let's minimise

     $$\min_{\Psi \in \mathcal{D},\, X \in \mathcal{X}_S} \| Y - \Psi X \|_F^2$$

     (Ψ ∈ D has normalised columns, X ∈ X_S has S-sparse columns), or equivalently maximise

     $$\max_{\Psi \in \mathcal{D}} \sum_n \max_{|I| \leq S} \| \Psi_I \Psi_I^\dagger y_n \|_2^2, \qquad (1)$$

     but this leads to K-SVD, which is slow. So let's modify the optimisation programme in stages: replace the projection energy by the best squared inner product, $\max_{\Psi \in \mathcal{D}} \sum_n \max_i |\langle \psi_i, y_n \rangle|^2$; drop the square, $\max_{\Psi \in \mathcal{D}} \sum_n \max_i |\langle \psi_i, y_n \rangle|$; and finally collect the S largest inner products,

     $$\max_{\Psi \in \mathcal{D}} \sum_n \max_{|I| \leq S} \| \Psi_I^\star y_n \|_1. \qquad (2)$$
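The point of the final surrogate (2) is computational: for each signal the maximising I simply collects the S largest inner products in magnitude, so evaluating the whole objective costs one matrix product, with no pseudo-inverse or SVD per support. A sketch (function name mine):

```python
import numpy as np

def surrogate_objective(Psi, Y, S):
    """sum_n max_{|I|<=S} ||Psi_I^* y_n||_1: per signal, add up the
    S largest |<psi_i, y_n>| -- one matrix product does it all."""
    corr = np.abs(Psi.T @ Y)                 # K x N, entries |<psi_i, y_n>|
    return np.sort(corr, axis=0)[-S:, :].sum()
```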

  6. Iterative Thresholding and K signal means (ITKsM). To optimise

     $$\max_{\Psi \in \mathcal{D}} \sum_n \max_{|I| = S} \| \Psi_I^\star y_n \|_1 \qquad (3)$$

     Algorithm (ITKsM, one iteration). Given an input dictionary Ψ and N training signals y_n, do:
     - For all n find $I_{\Psi,n}^t = \arg\max_{I : |I| = S} \| \Psi_I^\star y_n \|_1$.
     - For all k calculate
       $$\bar\psi_k = \frac{1}{N} \sum_n y_n \cdot \operatorname{sign}(\langle \psi_k, y_n \rangle) \cdot \chi(I_{\Psi,n}^t, k). \qquad (4)$$
     - Output $\bar\Psi = (\bar\psi_1 / \|\bar\psi_1\|_2, \ldots, \bar\psi_K / \|\bar\psi_K\|_2)$.
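A direct NumPy transcription of the iteration above (a sketch under my own conventions; in particular there is no safeguard for atoms that are never selected):

```python
import numpy as np

def itksm_iteration(Psi, Y, S):
    """One ITKsM iteration: thresholding + K signal means.
    Psi: d x K dictionary estimate, Y: d x N training signals."""
    N = Y.shape[1]
    inner = Psi.T @ Y                                  # <psi_k, y_n>
    # thresholding: I_n collects the S largest |<psi_k, y_n>|
    supports = np.argsort(np.abs(inner), axis=0)[-S:, :]
    chi = np.zeros_like(inner)
    np.put_along_axis(chi, supports, 1.0, axis=0)      # chi(I_n, k)
    # signal means, eq. (4): psi_k = 1/N sum_n y_n sign(<psi_k,y_n>) chi
    Psi_new = Y @ (np.sign(inner) * chi).T / N         # d x K
    return Psi_new / np.linalg.norm(Psi_new, axis=0)   # renormalise columns
```

The two matrix products make the O(dKN) cost claimed on the next slide visible, and the sum over n splits trivially over batches, hence the parallel and online versions.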

  7. ITKsM is
     - ridiculously cheap: $O(dKN)$ (parallelisable, online version),
     - robust to noise and to signals that are not exactly sparse, up to sparsity levels $S = O(\mu^{-2})$,
     - locally convergent (convergence radius $1/\sqrt{\log K}$) for sparsity $S = O(\mu^{-2})$, needing only $O(K \log K \, \varepsilon^{-2})$ samples,
     - but not globally convergent.
     Replacing the signal means by residual means yields ITKrM.
     Algorithm (ITKrM, one iteration). Given an input dictionary Ψ and N training signals y_n, do:
     - For all n find $I_{\Psi,n}^t = \arg\max_{I : |I| = S} \| \Psi_I^\star y_n \|_1$.
     - For all k calculate
       $$\bar\psi_k = \sum_{n : k \in I_{\Psi,n}^t} \operatorname{sign}(\langle \psi_k, y_n \rangle) \cdot \big( \mathbb{I} - P(\Psi_{I_{\Psi,n}^t}) + P(\psi_k) \big) y_n.$$
     - Output $\bar\Psi = (\bar\psi_1 / \|\bar\psi_1\|_2, \ldots, \bar\psi_K / \|\bar\psi_K\|_2)$.
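For comparison, a sketch of the residual-means update (again my own transcription: the projection P(Ψ_I) is computed via least squares, and the per-signal loop is kept for readability rather than speed):

```python
import numpy as np

def itkrm_iteration(Psi, Y, S):
    """One ITKrM iteration: like ITKsM, but each selected signal is
    replaced by (I - P(Psi_I) + P(psi_k)) y_n before averaging."""
    d, K = Psi.shape
    inner = Psi.T @ Y                                  # <psi_k, y_n>
    supports = np.argsort(np.abs(inner), axis=0)[-S:, :]
    Psi_new = np.zeros((d, K))
    for n in range(Y.shape[1]):
        I = supports[:, n]
        Psi_I = Psi[:, I]
        # P(Psi_I) y_n: orthogonal projection onto span(Psi_I)
        Py = Psi_I @ np.linalg.lstsq(Psi_I, Y[:, n], rcond=None)[0]
        for k in I:
            # P(psi_k) y_n = psi_k <psi_k, y_n> (atoms have unit norm)
            r = Y[:, n] - Py + Psi[:, k] * inner[k, n]
            Psi_new[:, k] += np.sign(inner[k, n]) * r
    return Psi_new / np.linalg.norm(Psi_new, axis=0)   # renormalise columns
```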

  8. intermediate quiz. Are we going to recover the dictionary? No, no, no!! We need a sparse model.
