


  1. STRUCTURED LOW-RANK MATRIX FACTORIZATION: GLOBAL OPTIMALITY, ALGORITHMS, AND APPLICATIONS
     Article by Benjamin D. Haeffele and René Vidal (2017)
     CMAP Machine Learning Journal Club, speaker: Imke Mayer, December 13th 2018

  2. OUTLINE: Structured Matrix Factorization
     I. Context and definition
        i. Special case 1: Sparse dictionary learning (SDL)
        ii. Special case 2: Subspace clustering (SC)
     II. Global optimality for structured matrix factorization: main theorem
        i. Polar problem
     III. Application: SDL global optimality
     IV. Extension to tensor factorization and deep learning

  3. STRUCTURED MATRIX FACTORIZATION: CONTEXT
     (Large) high-dimensional datasets (images, videos, user ratings, etc.):
     - difficult to process (computational issues, memory complexity),
     - but the relevant information often lies in a low-dimensional structure.
     Goal: recover this underlying low-dimensional structure of the given (large-scale) data X.
     Examples: motion segmentation, face clustering [12].
     [12] Vidal, R., Ma, Y., and Sastry, S. S. Generalized Principal Component Analysis, vol. 5. Springer, 2016.

  4. STRUCTURED MATRIX FACTORIZATION: CONTEXT
     Model assumption: linear subspace model, i.e. the data can be approximated by one or more low-dimensional subspace(s):
     $X \approx UV^T$,
     where U is a basis of the linear low-dimensional structure and V is the low-dimensional representation of the data.

  5. STRUCTURED MATRIX FACTORIZATION: CONTEXT
     $X \approx UV^T$ (U: basis of the linear low-dimensional structure, V: low-dimensional data representation)
     - Issue: without any assumptions there are infinitely many choices of U and V such that $X \approx UV^T$.
     - Solution: constrain the factors to satisfy certain properties:
       $\min_{U,V}\ \ell(X, UV^T) + \lambda\,\Theta(U, V)$   (1)
       where the loss $\ell$ measures the approximation quality and the regularizer $\Theta$ imposes restrictions on the factors.
     - Problem (1) is non-convex, but structured factors give more modeling flexibility and an explicit representation.

  6. STRUCTURED MATRIX FACTORIZATION: SPECIAL CASE 1: SPARSE DICTIONARY LEARNING
     Given a set of signals, find a set of dictionary atoms and sparse codes that approximate the signals [9].
     Applications: denoising, inpainting, classification. Each signal (e.g. a noisy image patch) is approximated by a sparse linear combination of dictionary atoms.
     $\min_{U,V}\ \|X - UV^T\|_F^2 + \lambda \|V\|_1 \quad \text{subject to } \|U_i\|_2 \le 1$   (3)
     where the columns of X are the signals, the columns of U are the dictionary atoms, and V contains the sparse codes.
     [9] Olshausen, B. A., and Field, D. J. Sparse coding with an overcomplete basis set: A strategy employed by V1? Vision Research 37, 23 (1997), 3311-3325.
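As an illustration of how problem (3) is typically attacked, here is a small alternating proximal-gradient sketch in Python. It minimizes the scaled objective $\tfrac12\|X - UV^T\|_F^2 + \lambda\|V\|_1$ subject to $\|U_i\|_2 \le 1$; the update rules, step sizes, and iteration count are my own simple choices, not the authors' algorithm:

```python
import numpy as np

def soft_threshold(Z, t):
    """Proximal operator of t * ||.||_1 (elementwise shrinkage)."""
    return np.sign(Z) * np.maximum(np.abs(Z) - t, 0.0)

def sparse_dictionary_learning(X, r, lam=0.1, n_iter=300, seed=0):
    """Alternate ISTA steps on V (lasso subproblem) and projected gradient
    steps on U (unit-norm columns). X is d x n; U is d x r, V is n x r,
    so X is approximated by U @ V.T."""
    rng = np.random.default_rng(seed)
    d, n = X.shape
    U = rng.standard_normal((d, r))
    U /= np.linalg.norm(U, axis=0)          # start with ||U_i||_2 = 1
    V = np.zeros((n, r))
    for _ in range(n_iter):
        # V-step: one ISTA step; step size 1/L with L = sigma_max(U)^2.
        L = np.linalg.norm(U, 2) ** 2
        G = (U @ V.T - X).T @ U             # gradient of 0.5||X - UV^T||_F^2 in V
        V = soft_threshold(V - G / L, lam / L)
        # U-step: gradient step, then project columns back onto the unit ball.
        LU = np.linalg.norm(V, 2) ** 2 + 1e-12
        U = U - ((U @ V.T - X) @ V) / LU
        U /= np.maximum(np.linalg.norm(U, axis=0), 1.0)
    return U, V
```

On synthetic data generated from a ground-truth dictionary this typically recovers a good sparse approximation. Note that r is still fixed a priori, which is exactly the issue raised on the next slide.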

  7. STRUCTURED MATRIX FACTORIZATION: SPECIAL CASE 1: SPARSE DICTIONARY LEARNING
     Challenges with problem (3):
     - the usual optimization strategies come without global convergence guarantees;
     - which size for U and V? The number of columns r must be picked a priori.
     A formulation in which the size r is optimized as well:
     $\min_{U,V,r}\ \|X - UV^T\|_F^2 + \lambda \sum_{i=1}^{r} \big( \gamma \|V_i\|_1 + (1-\gamma) \|V_i\|_2 \big) \quad \text{subject to } \|U_i\|_2 \le 1$   (4)

  8. STRUCTURED MATRIX FACTORIZATION: SPECIAL CASE 2: SUBSPACE CLUSTERING
     Given data X coming from a union of subspaces, find these underlying subspaces and separate the data according to them.
     Applications: clustering, recovery of low-dimensional structures.

  9. STRUCTURED MATRIX FACTORIZATION: SPECIAL CASE 2: SUBSPACE CLUSTERING
     - The subspaces $S_1, \dots, S_n$ are characterized by bases (→ U): recover the number and the dimensions of the subspaces.
     - The segmentation is obtained by finding a subspace-preserving representation (→ V): recover the segmentation of the data.
     Challenges:
     - model selection: how many subspaces, and of which dimension each?
     - potentially difficult subspace configurations.

  10. STRUCTURED MATRIX FACTORIZATION: SPECIAL CASE 2: SUBSPACE CLUSTERING
     One way to do subspace clustering: Sparse Subspace Clustering [4].
     - Self-expressive dictionary: fix the dictionary as U ← X.
     - Find a sparse representation over U that allows segmenting the data.
     But the optimality of the dictionary is not addressed. Idea: sparse dictionary learning on a union-of-subspaces model is suited to recover a more compact factorization with subspace-sparse codes [1].
     [1] Adler, A., Elad, M., and Hel-Or, Y. Linear-time subspace clustering via bipartite graph modeling. IEEE Transactions on Neural Networks and Learning Systems 26, 10 (2015), 2234-2246.
     [4] Elhamifar, E., and Vidal, R. Sparse subspace clustering: Algorithm, theory, and applications. IEEE Transactions on Pattern Analysis and Machine Intelligence 35, 11 (2013), 2765-2781.
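A minimal Python sketch of the self-expressive step (my illustration of the idea in [4], not the authors' code): each data point is written as a sparse combination of the *other* points, i.e. the dictionary is fixed as U ← X, via proximal gradient on a lasso objective with a zero-diagonal constraint:

```python
import numpy as np

def ssc_coefficients(X, lam=0.1, n_iter=300):
    """Sparse self-expressive coefficients: minimize
    0.5 ||X - X C||_F^2 + lam ||C||_1  subject to diag(C) = 0,
    so each column of X is coded over the remaining columns."""
    d, n = X.shape
    C = np.zeros((n, n))
    L = np.linalg.norm(X, 2) ** 2          # Lipschitz constant of the gradient
    for _ in range(n_iter):
        G = X.T @ (X @ C - X)              # gradient of 0.5||X - XC||_F^2
        A = C - G / L
        C = np.sign(A) * np.maximum(np.abs(A) - lam / L, 0.0)  # soft threshold
        np.fill_diagonal(C, 0.0)           # enforce c_jj = 0
    return C
```

The affinity matrix |C| + |C|ᵀ would then be fed to spectral clustering to obtain the segmentation. Since the soft threshold and the zero-diagonal projection act entrywise, the combined step is the exact proximal operator of the regularizer plus constraint.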

  11. STRUCTURED MATRIX FACTORIZATION: THEORY FOR GLOBAL OPTIMALITY
     Matrix factorization vs. matrix approximation:
     (1) $\min_{U,V}\ \ell(X, UV^T) + \lambda\,\Theta(U, V)$: non-convex; small problem size; structured factors (more modeling flexibility); explicit representation.
     (2) $\min_{Y}\ \ell(X, Y) + \lambda\,\Omega_\Theta(Y)$: convex; large problem size; unstructured.
     Example: low-rank matrix factorization, $\min_{U,V}\ \ell(X, UV^T)$ subject to U, V having at most r columns, vs. low-rank matrix approximation, $\min_Y\ \ell(X, Y) + \lambda \|Y\|_*$.
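The low-rank example makes the coupling between (1) and (2) easy to see numerically. The sketch below (Python, my illustration) solves the convex problem $\min_Y \tfrac12\|X - Y\|_F^2 + \lambda\|Y\|_*$ in closed form by singular value thresholding, then reads off balanced factors U, V attaining the variational form of the nuclear norm, $\|Y\|_* = \min_{UV^T = Y} \tfrac12(\|U\|_F^2 + \|V\|_F^2)$:

```python
import numpy as np

rng = np.random.default_rng(1)
X = rng.standard_normal((40, 4)) @ rng.standard_normal((4, 60))  # rank-4 data
lam = 5.0

# Convex side: min_Y 0.5||X - Y||_F^2 + lam ||Y||_* is solved in closed form
# by shrinking each singular value of X by lam (singular value thresholding).
P, s, Qt = np.linalg.svd(X, full_matrices=False)
s_shr = np.maximum(s - lam, 0.0)
Y = (P * s_shr) @ Qt

# Factorized side: balanced factors with U V^T = Y attain the variational
# characterization of the nuclear norm.
U = P * np.sqrt(s_shr)
V = Qt.T * np.sqrt(s_shr)

nuclear = s_shr.sum()
variational = 0.5 * (np.linalg.norm(U) ** 2 + np.linalg.norm(V) ** 2)
print(nuclear, variational)  # the two values coincide
```

This is the template the general theory follows: the convex problem (2) over Y lower-bounds the factorized problem (1), and at the right factorization size the bound is tight.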

  12. STRUCTURED MATRIX FACTORIZATION: THEORY FOR GLOBAL OPTIMALITY
     Ideas:
     - find a convex relaxation $\Omega_\Theta$ for a general regularization function $\Theta$ to couple the two problems (1) and (2);
     - allow the number of columns of U and V to change in (1).
     Results:
     - problem (2) gives a global lower bound for problem (1);
     - this convex lower bound allows analyzing global optimality for problem (1).

  13. GLOBAL OPTIMALITY OF STRUCTURED MATRIX FACTORIZATION AT A LOCAL MINIMUM
     $\min_{U,V}\ f(U, V) = \ell(X, UV^T) + \lambda\,\Theta(U, V)$
     Assumptions:
     - the factorization size r is allowed to change;
     - the loss $\ell(X, Y)$ is convex and once differentiable w.r.t. Y;
     - $\Theta$ is a sum of positively homogeneous functions of degree 2:
       $\Theta(U, V) = \sum_{i=1}^r \theta(U_i, V_i)$, with $\theta(\alpha u, \alpha v) = \alpha^2\,\theta(u, v)$ for all $\alpha \ge 0$.
     THEOREM [6]: A local minimum $(\tilde U, \tilde V)$ of $f(U, V)$ is globally optimal if $(\tilde U_i, \tilde V_i) = (0, 0)$ for some $i \in [r]$. In particular, all local minima of $f(U, V)$ of sufficient size are global minima.
     [6] Haeffele, B. D., and Vidal, R. Structured low-rank matrix factorization: Global optimality, algorithms, and applications. arXiv preprint arXiv:1708.07850 (2017).
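The degree-2 positive homogeneity assumption is easy to check numerically. The sketch below (Python, my own example) uses a product of norms, $\theta(u, v) = \|u\|_2 \|v\|_1$, the kind of $\theta$ that induces $\ell_1$-type regularization on V in the SDL setting:

```python
import numpy as np

def theta(u, v):
    # A product of column norms: positively homogeneous of degree 2,
    # since scaling u and v by alpha scales each factor by alpha.
    return np.linalg.norm(u, 2) * np.linalg.norm(v, 1)

rng = np.random.default_rng(2)
u, v = rng.standard_normal(5), rng.standard_normal(7)
for alpha in [0.0, 0.5, 2.0, 10.0]:
    assert np.isclose(theta(alpha * u, alpha * v), alpha ** 2 * theta(u, v))
```

Any product of two norms works the same way; the theorem only needs this scaling property of $\theta$, not a specific choice of norms.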

  14. GLOBAL OPTIMALITY OF STRUCTURED MATRIX FACTORIZATION AT ANY POINT
     $\min_{U,V}\ f(U, V) = \ell(X, UV^T) + \lambda\,\Theta(U, V)$
     (Same assumptions as above: r allowed to change, $\ell$ convex and once differentiable in Y, $\Theta$ a sum of positively homogeneous functions of degree 2.)
     COROLLARY [6]: A point $(\tilde U, \tilde V)$ is a global optimum of $f(U, V)$ if it satisfies:
     1) $\tilde U_i^T \left( -\tfrac{1}{\lambda} \nabla_Y \ell(X, \tilde U \tilde V^T) \right) \tilde V_i = \theta(\tilde U_i, \tilde V_i)$ for all $i \in [r]$
        (for many choices of $\theta$, condition 1 is satisfied by first-order optimal points);
     2) $u^T \left( -\tfrac{1}{\lambda} \nabla_Y \ell(X, \tilde U \tilde V^T) \right) v \le \theta(u, v)$ for all $(u, v)$.

  15. GLOBAL OPTIMALITY OF STRUCTURED MATRIX FACTORIZATION AT ANY POINT
     COROLLARY [6]: Given a point $(\tilde U, \tilde V)$, we can test whether it is a local minimum of sufficient size (and hence a global optimum) by checking
     $u^T \left( -\tfrac{1}{\lambda} \nabla_Y \ell(X, \tilde U \tilde V^T) \right) v \le \theta(u, v)$ for all $(u, v)$.   (5)
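In the special case $\theta(u, v) = \|u\|_2 \|v\|_2$ (which induces the nuclear norm) and $\ell(X, Y) = \tfrac12\|X - Y\|_F^2$, test (5) reduces to a spectral-norm bound: the supremum of $u^T Z v / (\|u\|_2 \|v\|_2)$ is $\sigma_{\max}(Z)$, with $Z = -\tfrac{1}{\lambda}\nabla_Y \ell = (X - \tilde U \tilde V^T)/\lambda$. The Python sketch below (my illustration of this standard reduction, not code from the paper) builds the known global optimum by singular value thresholding and verifies both optimality conditions:

```python
import numpy as np

rng = np.random.default_rng(3)
X = rng.standard_normal((30, 40))
lam = 10.0

# Known global optimum of min 0.5||X - UV^T||_F^2 + lam * sum_i ||U_i|| ||V_i||:
# singular value thresholding, split into balanced factors.
P, s, Qt = np.linalg.svd(X, full_matrices=False)
s_shr = np.maximum(s - lam, 0.0)
U = P * np.sqrt(s_shr)
V = Qt.T * np.sqrt(s_shr)

Z = (X - U @ V.T) / lam            # -(1/lam) * gradient of the loss

# Condition (5): u^T Z v <= ||u|| ||v|| for all (u, v), i.e. sigma_max(Z) <= 1.
sigma = np.linalg.svd(Z, compute_uv=False)[0]
assert sigma <= 1.0 + 1e-8

# Condition 1: U_i^T Z V_i = ||U_i|| ||V_i|| on every factor.
for i in range(U.shape[1]):
    lhs = U[:, i] @ Z @ V[:, i]
    rhs = np.linalg.norm(U[:, i]) * np.linalg.norm(V[:, i])
    assert abs(lhs - rhs) < 1e-6
```

For a general structured $\theta$ the supremum in (5), the polar problem, is usually the hard part; this nuclear-norm case is the one where it collapses to a single SVD.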
