Test of Time Award: Online Dictionary Learning for Sparse Coding

SLIDE 1

Test of Time Award Online Dictionary Learning for Sparse Coding

Julien Mairal, Francis Bach, Jean Ponce, Guillermo Sapiro
International Conference on Machine Learning, 2019

SLIDE 2

Test of Time Award Online Learning for Matrix Factorization and Sparse Coding

Julien Mairal, Francis Bach, Jean Ponce, Guillermo Sapiro
International Conference on Machine Learning, 2019

SLIDE 3

[Photos of the co-authors: Francis Bach, Jean Ponce, Guillermo Sapiro]

SLIDE 4

What are these papers about?

They are dealing with matrix factorization

X (m×n) ≈ D (m×p) × A (p×n)

SLIDE 5

What are these papers about?

They are dealing with matrix factorization

X (m×n) ≈ D (m×p) × A (p×n)

when a factor is sparse.

SLIDE 6

What are these papers about?

They are dealing with matrix factorization

X (m×n) ≈ D (m×p) × A (p×n)

... or the other one.

SLIDE 7

What are these papers about?

They are dealing with matrix factorization

X (m×n) ≈ D (m×p) × A (p×n)

... or both.

SLIDE 8

What are these papers about?

They are dealing with matrix factorization

X (m×n) ≈ D (m×p) × A (p×n)

... or not only is one factor sparse, but it also admits a particular structure.

SLIDE 9

What are these papers about?

They are dealing with matrix factorization

X (m×n) ≈ D (m×p) × A (p×n)

... or one factor admits a particular structure (e.g., piecewise constant), but it is not sparse.

SLIDE 10

What are these papers about?

In these papers, data matrices have many columns,

X (m×n) ≈ D (m×p) × A (p×n),   with n → +∞

... or an infinite number of columns, or columns that are streamed online.
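
For concreteness, here is a minimal NumPy sketch of this setting (the sizes and the random data are illustrative choices, not anything from the papers): the dictionary D has a fixed size, the coefficient matrix A has one column per sample, and the columns of X can just as well arrive one at a time.

import numpy as np

rng = np.random.default_rng(0)
m, p, n = 64, 32, 1000             # illustrative sizes: signal dim, dictionary size, number of samples
D = rng.standard_normal((m, p))    # dictionary, m x p (its size does not depend on n)
A = rng.standard_normal((p, n))    # coefficients, p x n (one column per sample)
X = D @ A                          # data matrix, m x n, so that X = D A exactly in this toy example

def column_stream(X):
    """Columns x_i delivered one at a time, as in the online / n -> +infinity setting."""
    for i in range(X.shape[1]):
        yield X[:, i]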

SLIDE 11

Formulation(s)

X = [x1, x2, . . . , xn] is a data matrix. We may call D = [d1, . . . , dp] a dictionary. A = [α1, . . . , αn] carries the decomposition coefficients of X onto D.

SLIDE 12

Formulation(s)

X = [x1, x2, . . . , xn] is a data matrix. We may call D = [d1, . . . , dp] a dictionary. A = [α1, . . . , αn] carries the decomposition coefficients of X onto D.

Interpretation as signal/data decomposition

X ≈ DA  ⇔  ∀ i, xi ≈ Dαi = Σ_{j=1}^p αi[j] dj.

SLIDE 13

Formulation(s)

X = [x1, x2, . . . , xn] is a data matrix. We may call D = [d1, . . . , dp] a dictionary. A = [α1, . . . , αn] carries the decomposition coefficients of X onto D.

Interpretation as signal/data decomposition

X ≈ DA  ⇔  ∀ i, xi ≈ Dαi = Σ_{j=1}^p αi[j] dj.

Generic formulation

min_{D∈D} (1/n) Σ_{i=1}^n L(xi, D)   with   L(x, D) ≜ min_{α∈A} (1/2)‖x − Dα‖₂² + λψ(α).
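
To make the inner problem concrete: with ψ the ℓ1 norm and A = R^p, L(x, D) is a Lasso in α, which can be solved, for instance, by proximal gradient (ISTA) iterations. The sketch below is only an illustration; the routine names, step size, and iteration count are my own choices, not anything prescribed by the papers.

import numpy as np

def soft_threshold(z, t):
    """Proximal operator of t * ||.||_1 (soft-thresholding)."""
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def sparse_code(x, D, lam, n_iter=200):
    """Approximately solve L(x, D) = min_a 0.5*||x - D a||^2 + lam*||a||_1 with ISTA."""
    a = np.zeros(D.shape[1])
    step = 1.0 / np.linalg.norm(D, 2) ** 2      # 1 / Lipschitz constant of the smooth part
    for _ in range(n_iter):
        grad = D.T @ (D @ a - x)                # gradient of 0.5*||x - D a||^2
        a = soft_threshold(a - step * grad, step * lam)
    return a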

SLIDE 14

Formulation(s)

X = [x1, x2, . . . , xn] is a data matrix. We may call D = [d1, . . . , dp] a dictionary. A = [α1, . . . , αn] carries the decomposition coefficients of X onto D.

Interpretation as signal/data decomposition

X ≈ DA  ⇔  ∀ i, xi ≈ Dαi = Σ_{j=1}^p αi[j] dj.

Generic formulation / stochastic case

min_{D∈D} E_x[L(x, D)]   with   L(x, D) ≜ min_{α∈A} (1/2)‖x − Dα‖₂² + λψ(α).

SLIDE 15

Formulation(s)

min_{D∈D} E_x[L(x, D)]   with   L(x, D) ≜ min_{α∈A} (1/2)‖x − Dα‖₂² + λψ(α).

Which formulations does it cover?

non-negative matrix factorization:  D = R_+^{m×p},  A = R_+^p,  ψ = 0

[Paatero and Tapper, ’94]

SLIDE 16

Formulation(s)

min_{D∈D} E_x[L(x, D)]   with   L(x, D) ≜ min_{α∈A} (1/2)‖x − Dα‖₂² + λψ(α).

Which formulations does it cover?

non-negative matrix factorization:  D = R_+^{m×p},  A = R_+^p,  ψ = 0
sparse coding:  D = {D : ∀ j, ‖dj‖₂ ≤ 1},  A = R^p,  ψ = ‖·‖₁

[Paatero and Tapper, ’94], [Olshausen and Field, ’96]

SLIDE 17

Formulation(s)

min_{D∈D} E_x[L(x, D)]   with   L(x, D) ≜ min_{α∈A} (1/2)‖x − Dα‖₂² + λψ(α).

Which formulations does it cover?

non-negative matrix factorization:  D = R_+^{m×p},  A = R_+^p,  ψ = 0
sparse coding:  D = {D : ∀ j, ‖dj‖₂ ≤ 1},  A = R^p,  ψ = ‖·‖₁
non-negative sparse coding:  D = {D : ∀ j, ‖dj‖₂ ≤ 1},  A = R_+^p,  ψ = ‖·‖₁

[Paatero and Tapper, ’94], [Olshausen and Field, ’96], [Hoyer, 2002]

SLIDE 18

Formulation(s)

min_{D∈D} E_x[L(x, D)]   with   L(x, D) ≜ min_{α∈A} (1/2)‖x − Dα‖₂² + λψ(α).

Which formulations does it cover?

non-negative matrix factorization:  D = R_+^{m×p},  A = R_+^p,  ψ = 0
sparse coding:  D = {D : ∀ j, ‖dj‖₂ ≤ 1},  A = R^p,  ψ = ‖·‖₁
non-negative sparse coding:  D = {D : ∀ j, ‖dj‖₂ ≤ 1},  A = R_+^p,  ψ = ‖·‖₁
structured sparse coding:  D = {D : ∀ j, ‖dj‖₂ ≤ 1},  A = R^p,  ψ = ‖·‖₁ + Ω(·)

[Paatero and Tapper, ’94], [Olshausen and Field, ’96], [Hoyer, 2002], [Mairal et al., 2011]

SLIDE 19

Formulation(s)

min_{D∈D} E_x[L(x, D)]   with   L(x, D) ≜ min_{α∈A} (1/2)‖x − Dα‖₂² + λψ(α).

Which formulations does it cover?

non-negative matrix factorization:  D = R_+^{m×p},  A = R_+^p,  ψ = 0
sparse coding:  D = {D : ∀ j, ‖dj‖₂ ≤ 1},  A = R^p,  ψ = ‖·‖₁
non-negative sparse coding:  D = {D : ∀ j, ‖dj‖₂ ≤ 1},  A = R_+^p,  ψ = ‖·‖₁
structured sparse coding:  D = {D : ∀ j, ‖dj‖₂ ≤ 1},  A = R^p,  ψ = ‖·‖₁ + Ω(·)
≈ sparse PCA:  D = {D : ∀ j, ‖dj‖₂² + ‖dj‖₁ ≤ 1},  A = R^p,  ψ = ‖·‖₁

[Paatero and Tapper, ’94], [Olshausen and Field, ’96], [Hoyer, 2002], [Mairal et al., 2011], [Zou et al., 2004].

SLIDE 20

Formulation(s)

min_{D∈D} E_x[L(x, D)]   with   L(x, D) ≜ min_{α∈A} (1/2)‖x − Dα‖₂² + λψ(α).

Which formulations does it cover?

non-negative matrix factorization:  D = R_+^{m×p},  A = R_+^p,  ψ = 0
sparse coding:  D = {D : ∀ j, ‖dj‖₂ ≤ 1},  A = R^p,  ψ = ‖·‖₁
non-negative sparse coding:  D = {D : ∀ j, ‖dj‖₂ ≤ 1},  A = R_+^p,  ψ = ‖·‖₁
structured sparse coding:  D = {D : ∀ j, ‖dj‖₂ ≤ 1},  A = R^p,  ψ = ‖·‖₁ + Ω(·)
≈ sparse PCA:  D = {D : ∀ j, ‖dj‖₂² + ‖dj‖₁ ≤ 1},  A = R^p,  ψ = ‖·‖₁
. . .

[Paatero and Tapper, ’94], [Olshausen and Field, ’96], [Hoyer, 2002], [Mairal et al., 2011], [Zou et al., 2004].

SLIDE 21

The sparse coding context

Sparse coding was introduced by Olshausen and Field, '96. It was the first time (together with ICA, see [Bell and Sejnowski, '97]) that a simple unsupervised learning principle led to various sorts of “Gabor-like” filters when trained on natural image patches.

SLIDE 22

The sparse coding context

Remember that we can play with various structured sparsity-inducing penalties:

[Jenatton et al., 2010], [Kavukcuoglu et al., 2009], [Mairal et al., 2011], [Hyvärinen and Hoyer, 2001].

SLIDE 23

Sparsity and simplicity principles

1921: Wrinch and Jeffreys' simplicity principle.

SLIDE 24

Sparsity and simplicity principles

1921: Wrinch and Jeffreys' simplicity principle.
1952: Markowitz's portfolio selection.

SLIDE 25

Sparsity and simplicity principles

1921: Wrinch and Jeffreys' simplicity principle.
1952: Markowitz's portfolio selection.
1960s and 70s: best subset selection in statistics.

SLIDE 26

Sparsity and simplicity principles

1921: Wrinch and Jeffreys' simplicity principle.
1952: Markowitz's portfolio selection.
1960s and 70s: best subset selection in statistics.
1990s: the wavelet era in signal processing.

SLIDE 27

Sparsity and simplicity principles

1921: Wrinch and Jeffreys' simplicity principle.
1952: Markowitz's portfolio selection.
1960s and 70s: best subset selection in statistics.
1990s: the wavelet era in signal processing.
1996: Olshausen and Field's dictionary learning method.
1994–1996: the Lasso (Tibshirani) and Basis pursuit (Chen and Donoho).

SLIDE 28

Sparsity and simplicity principles

1921: Wrinch and Jeffreys' simplicity principle.
1952: Markowitz's portfolio selection.
1960s and 70s: best subset selection in statistics.
1990s: the wavelet era in signal processing.
1996: Olshausen and Field's dictionary learning method.
1994–1996: the Lasso (Tibshirani) and Basis pursuit (Chen and Donoho).
2004: compressed sensing (Candès, Romberg and Tao).
2006: Elad and Aharon's image denoising method.

SLIDE 29

Context of 2009

SLIDE 30

Context of 2009

Many success stories of dictionary learning in image processing

image denoising, inpainting, demosaicing, super-resolution . . .

[Elad and Aharon, 2006], [Mairal et al., 2008], [Yang et al., 2008] . . .

SLIDE 32

Context of 2009

Many success stories of dictionary learning in image processing

image denoising, inpainting, demosaicing, super-resolution . . .

Also success stories in computer vision for modeling local features

Dictionary learning on top of SIFT wins the PASCAL VOC'09 challenge. Another variant wins the ImageNet 2010 challenge.

[Yang et al., 2009], [Lin et al., 2010] . . .

SLIDE 33

Context of 2009

Many success stories of dictionary learning in image processing

image denoising, inpainting, demosaicing, super-resolution . . .

Also success stories in computer vision for modeling local features

Dictionary learning on top of SIFT wins the PASCAL VOC'09 challenge. Another variant wins the ImageNet 2010 challenge.

Matrix factorization becomes a key technique for unsupervised data modeling

recommender systems (Netflix prize) and social networks, document clustering, genomic pattern discovery, . . .

[Koren et al., 2009b], [Ma et al., 2008], [Xu et al., 2003], [Brunet et al., 2004] . . .

SLIDE 34

Context of 2009

Classical approach for matrix factorization: alternating minimization

min_{D∈D, A∈A} (1/2)‖X − DA‖_F² + λψ(A),

which requires loading all data at every iteration (batch optimization).
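
A minimal sketch of this batch alternating scheme, assuming ψ = ‖·‖₁ and unit-norm-ball constraints on the dictionary columns (the specific update rules below are common heuristics chosen for brevity, not the papers' exact steps):

import numpy as np

def soft_threshold(z, t):
    return np.sign(z) * np.maximum(np.abs(z) - t, 0.0)

def batch_dictionary_learning(X, p, lam, n_outer=20, n_inner=50, seed=0):
    """Alternating minimization of 0.5*||X - D A||_F^2 + lam*||A||_1 (illustrative sketch)."""
    m, n = X.shape
    rng = np.random.default_rng(seed)
    D = rng.standard_normal((m, p))
    D /= np.linalg.norm(D, axis=0)                          # start from unit-norm columns
    A = np.zeros((p, n))
    for _ in range(n_outer):                                # every pass touches the whole matrix X
        # (1) sparse coding: Lasso in A with D fixed (a few ISTA iterations)
        step = 1.0 / np.linalg.norm(D, 2) ** 2
        for _ in range(n_inner):
            A = soft_threshold(A - step * (D.T @ (D @ A - X)), step * lam)
        # (2) dictionary update: least squares in D with A fixed, then project columns onto the unit ball
        D = X @ A.T @ np.linalg.pinv(A @ A.T)
        D /= np.maximum(np.linalg.norm(D, axis=0), 1.0)
    return D, A

Every outer iteration above reads the full data matrix X, which is exactly what becomes impractical when n is huge or the columns are streamed.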

SLIDE 35

Context of 2009

Classical approach for matrix factorization: alternating minimization

min_{D∈D, A∈A} (1/2)‖X − DA‖_F² + λψ(A),

which requires loading all data at every iteration (batch optimization). Meanwhile, Léon Bottou is advocating stochastic optimization for machine learning

see Léon's tutorial at NIPS'07, or NeurIPS'18 test of time award [Bottou and Bousquet, 2008].

SLIDE 36

Context of 2009

Classical approach for matrix factorization: alternating minimization

min_{D∈D, A∈A} (1/2)‖X − DA‖_F² + λψ(A),

which requires loading all data at every iteration (batch optimization). Meanwhile, Léon Bottou is advocating stochastic optimization for machine learning, which makes the risk minimization point of view relevant:

min_{D∈D} (1/n) Σ_{i=1}^n L(xi, D)   with   L(x, D) ≜ min_{α∈A} (1/2)‖x − Dα‖₂² + λψ(α).

see Léon's tutorial at NIPS'07, or NeurIPS'18 test of time award [Bottou and Bousquet, 2008].

SLIDE 37

What we did

We started experimenting with SGD

SLIDE 38

What we did

We started experimenting with SGD, and tuning the step-size turned out to be painful.

SLIDE 39

What we did

We started experimenting with SGD, and tuning the step-size turned out to be painful.

Can we then design an algorithm that would be as fast as SGD, but more practical?

SLIDE 40

What we did

We started experimenting with SGD, and tuning the step-size turned out to be painful.

Can we then design an algorithm that would be as fast as SGD, but more practical?

Idea 1: If we knew optimal codes αi⋆ for all xi's in advance, then the problem becomes

min_{D∈D} (1/2) trace(D⊤DB) − trace(D⊤C)   with   B = (1/n) Σ_{i=1}^n αi⋆αi⋆⊤   and   C = (1/n) Σ_{i=1}^n xiαi⋆⊤,

which yields parameter-free block coordinate descent rules for updating D.

SLIDE 41

What we did

We started experimenting with SGD, and tuning the step-size turned out to be painful.

Can we then design an algorithm that would be as fast as SGD, but more practical?

Idea 1: If we knew optimal codes αi⋆ for all xi's in advance, then the problem becomes

min_{D∈D} (1/2) trace(D⊤DB) − trace(D⊤C)   with   B = (1/n) Σ_{i=1}^n αi⋆αi⋆⊤   and   C = (1/n) Σ_{i=1}^n xiαi⋆⊤,

which yields parameter-free block coordinate descent rules for updating D.

Idea 2: Build appropriate matrices B and C in an online fashion.

[Neal and Hinton, ’98]

SLIDE 42

What we did

We started experimenting with SGD, and tuning the step-size turned out to be painful.

Can we then design an algorithm that would be as fast as SGD, but more practical?

Idea 1: If we knew optimal codes αi⋆ for all xi's in advance, then the problem becomes

min_{D∈D} (1/2) trace(D⊤DB) − trace(D⊤C)   with   B = (1/n) Σ_{i=1}^n αi⋆αi⋆⊤   and   C = (1/n) Σ_{i=1}^n xiαi⋆⊤,

which yields parameter-free block coordinate descent rules for updating D.

Idea 2: Build appropriate matrices B and C in an online fashion.

What about theory?

We could provide guarantees of convergence to stationary points, even though the problem is non-convex, stochastic, constrained, and non-smooth.

[Neal and Hinton, ’98], [Mairal, 2013], [Mensch, 2018].

SLIDE 43

Reasons for impact: How did it help other fields?

SLIDE 44

Reasons for impact: How did it help other fields?

A timely context ( ≈ luck)

Datasets were becoming larger and larger, and there was suddenly a need for more scalable matrix factorization methods.

SLIDE 45

Reasons for impact: How did it help other fields?

A timely context ( ≈ luck)

Datasets were becoming larger and larger, and there was suddenly a need for more scalable matrix factorization methods.

A combination of mathematics and engineering?

An efficient software package: the SPAMS toolbox (try it with pip install spams in Python, or download the R/Matlab packages).
Robustness to hyper-parameters: a default setting that works (many times) in practice.

SLIDE 46

Reasons for impact: How did it help other fields?

A timely context ( ≈ luck)

Datasets were becoming larger and larger, and there was suddenly a need for more scalable matrix factorization methods.

A combination of mathematics and engineering?

An efficient software package: the SPAMS toolbox (try it with pip install spams in Python, or download the R/Matlab packages).
Robustness to hyper-parameters: a default setting that works (many times) in practice.

Flexibility in the constraints/penalty design

allowing the method to be used in unexpected contexts.

SLIDE 51

Connection with neural networks

SLIDE 52

Connection with neural networks

A cheap way to obtain a sparse code β from x and D is β = relu(D⊤x − λ), versus α ∈ argmin_{α∈A} (1/2)‖x − Dα‖₂² + λ‖α‖₁.
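
The contrast can be seen in a few lines of NumPy (an illustrative sketch: the Lasso solver is plain ISTA, and note that the relu code is non-negative by construction while the Lasso code may take both signs):

import numpy as np

def relu_code(x, D, lam):
    """One-shot encoder: rectified correlations with the dictionary, beta = relu(D^T x - lam)."""
    return np.maximum(D.T @ x - lam, 0.0)

def lasso_code(x, D, lam, n_iter=200):
    """ISTA iterations for min_a 0.5*||x - D a||^2 + lam*||a||_1."""
    a = np.zeros(D.shape[1])
    step = 1.0 / np.linalg.norm(D, 2) ** 2
    for _ in range(n_iter):
        z = a - step * (D.T @ (D @ a - x))
        a = np.sign(z) * np.maximum(np.abs(z) - step * lam, 0.0)
    return a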

SLIDE 53

Connection with neural networks

A cheap way to obtain a sparse code β from x and D is β = relu(D⊤x − λ), versus α ∈ argmin_{α∈A} (1/2)‖x − Dα‖₂² + λ‖α‖₁.

Then, not surprisingly, for dictionary learning,

end-to-end feature learning is feasible.

[Mairal et al., 2012]

SLIDE 54

Connection with neural networks

A cheap way to obtain a sparse code β from x and D is β = relu(D⊤x − λ), versus α ∈ argmin_{α∈A} (1/2)‖x − Dα‖₂² + λ‖α‖₁.

Then, not surprisingly, for dictionary learning,

end-to-end feature learning is feasible.

one can design convolutional and multilayer models.

[Mairal et al., 2012], [Zeiler and Fergus, 2010]

SLIDE 55

Connection with neural networks

A cheap way to obtain a sparse code β from x and D is β = relu(D⊤x − λ), versus α ∈ argmin_{α∈A} (1/2)‖x − Dα‖₂² + λ‖α‖₁.

Then, not surprisingly, for dictionary learning,

end-to-end feature learning is feasible.

one can design convolutional and multilayer models.

sparse decomposition algorithms perform neural network-like operations (LISTA).

[Mairal et al., 2012], [Zeiler and Fergus, 2010], [Gregor and LeCun, 2010].

SLIDE 56

Thoughts

Is Wrinch and Jeffrey’s simplicity principle still relevant?

Simplicity is a key to interpretability and to model/hypothesis selection. Next form will probably be different than ℓ1. Which one? Simplicity is not enough. Various forms and robustness and stability are also needed.

[Yu and Kumbier, 2019].
