
Matrix Factorization with Binary Components: Uniqueness in a Randomized Model

Matrix Factorization with Binary Components: Uniqueness in a randomized model. Felix Krahmer, TU München. Joint work with Matthias Hein (Saarland University) and David James (University of Göttingen).


  1. Matrix Factorization with Binary Components: Uniqueness in a randomized model. Felix Krahmer, TU München. Joint work with: Matthias Hein, Saarland University; David James, University of Göttingen

  2. Matrix Factorization
     - given data matrix $D \in \mathbb{R}^{m \times n}$, $n$ number of data points, $m$ number of features
     - find matrices $T \in \mathbb{R}^{m \times r}$, $A \in \mathbb{R}^{r \times n}$ such that $D = TA$ (exact case) or $\min_{T \in \mathbb{R}^{m \times r},\, A \in \mathbb{R}^{r \times n}} \|D - TA\|_F^2$ (approximate case), where $r$ is typically small
     Globally optimal solution:
     - Singular Value Decomposition (SVD) $D = U \Sigma V^T$ ⟹ $T = U \Sigma$, $A = V^T$
     - best rank $r$ approximation obtained by taking top $r$ singular values
     Problem: Factors often lack interpretation
     Felix Krahmer, TUM Matrix Factorization with Binary Components 2 of 23
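
The SVD-based solution on this slide can be sketched in a few lines of NumPy. This is only an illustrative sketch: the function name `truncated_svd_factors` is an assumption made here, not something from the talk.

```python
import numpy as np

def truncated_svd_factors(D, r):
    """Return T (m x r) and A (r x n) such that T @ A is the best rank-r
    approximation of D in Frobenius norm (hypothetical helper name)."""
    U, s, Vt = np.linalg.svd(D, full_matrices=False)
    T = U[:, :r] * s[:r]   # T = U * Sigma, restricted to the top r singular values
    A = Vt[:r, :]          # A = V^T, restricted to the top r right singular vectors
    return T, A

if __name__ == "__main__":
    rng = np.random.default_rng(0)
    D = rng.random((6, 2)) @ rng.random((2, 5))   # an exactly rank-2 data matrix
    T, A = truncated_svd_factors(D, 2)
    print(np.allclose(T @ A, D))
```

The factors returned this way generally contain negative entries, which illustrates why they are hard to interpret.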

  3. Nonnegative Matrix Factorization (NMF)
     - given data matrix $D \in \mathbb{R}^{m \times n}$,
     - find matrices $T \in \mathbb{R}_+^{m \times r}$, $A \in \mathbb{R}_+^{r \times n}$ such that $D = TA$ or $\min_{T \in \mathbb{R}_+^{m \times r},\, A \in \mathbb{R}_+^{r \times n}} \|D - TA\|_F^2$.
     [Figure taken from Lee, Seung: Learning the parts of objects by NMF, Nature (1999)]

  4. Nonnegative Matrix Factorization (NMF)
     - given data matrix $D \in \mathbb{R}^{m \times n}$,
     - find matrices $T \in \mathbb{R}_+^{m \times r}$, $A \in \mathbb{R}_+^{r \times n}$ such that $D = TA$ or $\min_{T \in \mathbb{R}_+^{m \times r},\, A \in \mathbb{R}_+^{r \times n}} \|D - TA\|_F^2$.
     Prior work:
     - used for finding latent factors/components $T$
     - solved via alternating least squares, but convergence can only be proven to a critical point ⟹ no guarantee to find the global optimum
     - In 2012 Arora, Ge, Kannan, Moitra propose an algorithm for exact NMF with runtime $O((nm)^{r^2})$.
     - In the case where $T$ is separable, the algorithm runs in polynomial time (improved by Bittorf et al. (2013))
     Goal: extend conditions on NMF for which a solution can be found efficiently
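
To make the alternating least squares idea concrete, here is a minimal sketch of one common variant (unconstrained least squares followed by clipping to the nonnegative orthant). The function name `nmf_als`, the random initialisation and the iteration count are assumptions for illustration; the talk does not specify an implementation, and this heuristic carries no global optimality guarantee.

```python
import numpy as np

def nmf_als(D, r, iters=200, seed=0):
    """Projected alternating least squares for D ~ T @ A with T, A >= 0.
    Hypothetical sketch: each half-step solves an unconstrained least
    squares problem and then clips negative entries to zero."""
    rng = np.random.default_rng(seed)
    m, n = D.shape
    T = rng.random((m, r))
    A = np.zeros((r, n))
    for _ in range(iters):
        # Fix T, solve min_A ||D - T A||_F, then project onto A >= 0.
        A = np.clip(np.linalg.lstsq(T, D, rcond=None)[0], 0.0, None)
        # Fix A, solve min_T ||D^T - A^T T^T||_F, then project onto T >= 0.
        T = np.clip(np.linalg.lstsq(A.T, D.T, rcond=None)[0].T, 0.0, None)
    return T, A

if __name__ == "__main__":
    rng = np.random.default_rng(1)
    D = rng.random((8, 3)) @ rng.random((3, 10))
    T, A = nmf_als(D, 3)
    print(np.linalg.norm(D - T @ A))
```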

  5. Gene expression data analysis
     - Gene expression is the process by which information from a gene is used in the synthesis of a functional gene product.
     [Illustration: a matrix equation with binary entries; labels: gene product, gene expression, genes]
     Goal: Decompose gene expression data into functional processes

  6. Matrix Factorization with Binary Components
     Our model: [Illustration: example matrices] $D = TA$ with $D \in \mathbb{R}^{m \times n}$, $T \in \{0,1\}^{m \times r}$, $A \in \mathbb{R}^{r \times n}$
     Our Goal: factor $D = TA$

  8. Matrix Factorization with Binary Components
     Our model: [Illustration: example matrices] $D = TA$ with $D \in \mathbb{R}^{m \times n}$, $T \in \{0,1\}^{m \times r}$, $A \in \mathbb{R}^{r \times n}$
     Our Goal: factor $D = TA$
     Assumptions:
     - $\operatorname{rank}(D) = r \ll m$, $\operatorname{rank}(A) = r$, $\mathbf{1}_r^T A = \mathbf{1}_n^T$,
     - the columns of $T$ are affinely independent, i.e. for all $\lambda \in \mathbb{R}^r$ with $\lambda^T \mathbf{1}_r = 0$: $T\lambda = 0$ ⟹ $\lambda = 0$
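
The two assumptions can be checked numerically. The sketch below uses the observation that the affine independence condition holds exactly when the matrix obtained by stacking $T$ on top of a row of ones has trivial kernel, i.e. rank $r$. The helper names are hypothetical.

```python
import numpy as np

def affinely_independent(T):
    """True if the columns of T are affinely independent.
    T lam = 0 together with 1^T lam = 0 forces lam = 0 exactly when
    the stacked matrix [T; 1^T] has full column rank r."""
    r = T.shape[1]
    stacked = np.vstack([T, np.ones((1, r))])
    return bool(np.linalg.matrix_rank(stacked) == r)

def columns_sum_to_one(A, tol=1e-9):
    """True if 1^T A = 1^T, i.e. every column of A sums to one."""
    return bool(np.allclose(A.sum(axis=0), 1.0, atol=tol))

if __name__ == "__main__":
    T = np.array([[0.0, 1.0, 0.0],
                  [0.0, 0.0, 1.0]])   # columns (0,0), (1,0), (0,1)
    print(affinely_independent(T))
```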

  9. Key idea
     Lemma: The affine hulls of $T$ and $D$ agree, $\operatorname{aff}(D) = \operatorname{aff}(T)$.
     [Illustration for $m = 3$; note that $\operatorname{aff}(D) \cap \{0,1\}^m = T$, i.e. the binary points in the affine hull are the columns of $T$]
     Theorem (Slawski, Hein, Lutsik (NIPS 2013)): Some exact factorization can be computed in $O(r m 2^r)$ by computing $\operatorname{aff}(T) \cap \{0,1\}^m = \operatorname{aff}(D) \cap \{0,1\}^m$.
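
One way to compute $\operatorname{aff}(D) \cap \{0,1\}^m$ without testing all $2^m$ binary vectors is sketched below. It is written in the spirit of the theorem but is not taken from the cited paper: it parametrises the affine hull, picks as many linearly independent coordinates as the hull has dimensions (that is $r - 1$ under the assumptions above), and enumerates the $2^{r-1}$ binary assignments of those coordinates, each of which determines at most one candidate point. The function name, the tolerances and the greedy row selection are assumptions, and at least two columns in $D$ are assumed.

```python
import itertools
import numpy as np

def binary_points_in_affine_hull(D, tol=1e-9):
    """Return all vectors of {0,1}^m lying in aff(D), as sorted tuples.
    Hypothetical sketch; assumes D has at least two columns."""
    m, n = D.shape
    p = D[:, 0]                               # base point of the affine hull
    C = D[:, 1:] - p[:, None]                 # directions spanning aff(D) - p
    U, s, _ = np.linalg.svd(C, full_matrices=False)
    k = int(np.sum(s > tol))                  # dimension of the affine hull
    B = U[:, :k]                              # orthonormal basis of directions
    rows = []                                 # k linearly independent rows of B
    for i in range(m):
        if np.linalg.matrix_rank(B[rows + [i], :], tol=tol) == len(rows) + 1:
            rows.append(i)
        if len(rows) == k:
            break
    B_inv = np.linalg.inv(B[rows, :])
    found = set()
    for bits in itertools.product([0.0, 1.0], repeat=k):
        # The chosen coordinates fix the point of the hull uniquely.
        mu = B_inv @ (np.array(bits) - p[rows])
        x = p + B @ mu
        x_rounded = np.round(x)
        is_integral = np.all(np.abs(x - x_rounded) < 1e-6)
        is_binary = np.all((x_rounded == 0) | (x_rounded == 1))
        if is_integral and is_binary:
            found.add(tuple(int(v) for v in x_rounded))
    return sorted(found)

if __name__ == "__main__":
    T = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [1, 1, 0]], dtype=float)
    A = np.array([[0.5, 0.2, 0.1, 0.3],
                  [0.3, 0.5, 0.2, 0.3],
                  [0.2, 0.3, 0.7, 0.4]])      # columns sum to one
    print(binary_points_in_affine_hull(T @ A))
```

In this small example the binary points found coincide with the columns of $T$, so $T$ can be read off from $D$ alone.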

  10. Uniqueness of the Factorization
     - Solutions are not guaranteed to be nonnegative ⟹ if two solutions exist, we may find one which is not nonnegative
     - Uniqueness is crucial for the interpretability of the factors!
     [Illustration: one matrix $D$ with two candidate factorizations, $D = TA$ and $D = T'A'$]
     Factorization is unique if $\operatorname{aff}(T) \cap \{0,1\}^m = \{T_{:,1}, \dots, T_{:,r}\}$

  11. Matrix Factorization with Random Binary Components
     Our model: $D = TA$ with $D \in \mathbb{R}^{m \times n}$, random matrix $T = (t_{ij}) \in \{0,1\}^{m \times r}$, and $A \in \mathbb{R}^{r \times n}$, $\mathbf{1}_r^T A = \mathbf{1}_n^T$
     - $t_{ij}$ are drawn independently from $\{0,1\}$ with probabilities $P[t_{ij} = 0] = p$ and $P[t_{ij} = 1] = 1 - p$,
     - choose $p$ big to simulate sparse binary components
     - task: bound the probability that $\operatorname{aff}(T) \cap \{0,1\}^m \neq \{T_{:,1}, \dots, T_{:,r}\}$
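
For very small $m$ the failure probability in this model can be estimated by simulation. The sketch below is a brute-force Monte Carlo experiment, not a method from the talk: it tests every vector of $\{0,1\}^m$ for membership in $\operatorname{aff}(T)$ via a least squares solve, so it is only feasible for small $m$. All function names and parameter choices are assumptions.

```python
import itertools
import numpy as np

def extra_binary_points(T, tol=1e-8):
    """Binary vectors in aff(T) that are not columns of T (brute force)."""
    m, r = T.shape
    M = np.vstack([T, np.ones((1, r))])       # encodes T lam = x and 1^T lam = 1
    cols = {tuple(int(v) for v in T[:, j]) for j in range(r)}
    extra = []
    for x in itertools.product([0, 1], repeat=m):
        if x in cols:
            continue
        b = np.append(np.array(x, dtype=float), 1.0)
        lam = np.linalg.lstsq(M, b, rcond=None)[0]
        if np.linalg.norm(M @ lam - b) < tol:  # x is an affine combination
            extra.append(x)
    return extra

def estimate_failure_probability(m, r, p, trials, seed=0):
    """Monte Carlo estimate of P[aff(T) ∩ {0,1}^m != columns of T]."""
    rng = np.random.default_rng(seed)
    failures = 0
    for _ in range(trials):
        # rng.random() >= p happens with probability 1 - p, so P[t_ij = 0] = p.
        T = (rng.random((m, r)) >= p).astype(float)
        if extra_binary_points(T):
            failures += 1
    return failures / trials

if __name__ == "__main__":
    for p in (0.5, 0.8):
        print(p, estimate_failure_probability(m=8, r=3, p=p, trials=200))
```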

  14. Idea
     - Replace $T$ with $M$ taking values in $\{-1,+1\}$ with the same probability distribution:
       $P[\operatorname{aff}(T) \cap \{0,1\}^m \neq \{T_{:,1}, \dots, T_{:,r}\}] = P[\operatorname{aff}(M) \cap \{-1,+1\}^m \neq \{M_{:,1}, \dots, M_{:,r}\}]$
     - Define
       $R_s = P[\exists x \in \mathbb{R}^r,\ |\operatorname{supp}(x)| = s,\ Mx \in \{-1,+1\}^m]$,
       $P_s = P[\exists x \in \mathbb{R}^r,\ \operatorname{supp}(x) = \{1, \dots, s\} : Mx \in \{-1,+1\}^m]$,
       then
       $P[\operatorname{aff}(M) \cap \{-1,+1\}^m \neq \{M_{:,1}, \dots, M_{:,r}\}] \le \sum_{s=2}^{r} R_s \le \sum_{s=2}^{r} \binom{r}{s} P_s$
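
The step from $R_s$ to $\binom{r}{s} P_s$ is a union bound over the possible supports. A short sketch of that step, assuming the columns of $M$ are independent and identically distributed so that the probability depends only on the size of the support (the symbol $S$ for a support set is introduced here for illustration):

```latex
\begin{align*}
R_s &= P\Big[\bigcup_{S \subseteq \{1,\dots,r\},\ |S| = s}
        \big\{\exists x \in \mathbb{R}^r,\ \operatorname{supp}(x) = S,\ Mx \in \{-1,+1\}^m\big\}\Big] \\
    &\le \sum_{S \subseteq \{1,\dots,r\},\ |S| = s}
        P\big[\exists x \in \mathbb{R}^r,\ \operatorname{supp}(x) = S,\ Mx \in \{-1,+1\}^m\big] \\
    &= \binom{r}{s}\, P_s .
\end{align*}
```

The last equality uses that permuting the columns of $M$ does not change its distribution, so every support of size $s$ has the same probability as $\{1, \dots, s\}$.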

  15. Odlyzko 1988
     Theorem (Odlyzko 1988): Let $M$ be a random $m \times r$ matrix whose entries are drawn independently from $\{-1,+1\}$ with equal probabilities ($p = 1/2$). If
       $r \le m \left(1 - \frac{10}{\log(m)}\right)$,
     then
       $P[\operatorname{aff}(M) \cap \{-1,+1\}^m \neq \{M_{:,1}, \dots, M_{:,r}\}] \le \binom{r}{3} P_3 + O\left(\left(\frac{7}{10}\right)^m\right)$
     with $P_3 = 4 \left(\frac{3}{4}\right)^m$, as $m$ tends to infinity.
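
To get a feel for the theorem, the sketch below evaluates the rank condition and the leading term $\binom{r}{3} P_3$. It assumes that $\log$ denotes the natural logarithm, which the slide does not state, and the function names are hypothetical. Under that assumption the condition only admits a positive $r$ once $\log(m) > 10$, i.e. for $m$ above roughly $e^{10} \approx 22026$.

```python
import math

def odlyzko_max_rank(m):
    """Largest integer r with r <= m * (1 - 10 / log(m)), natural log assumed."""
    return math.floor(m * (1 - 10 / math.log(m)))

def odlyzko_leading_term(m, r):
    """Leading term C(r, 3) * P_3 with P_3 = 4 * (3/4)^m (the case p = 1/2)."""
    return math.comb(r, 3) * 4 * 0.75 ** m

if __name__ == "__main__":
    for m in (10**5, 10**6):
        print(m, odlyzko_max_rank(m))
    print(odlyzko_leading_term(100, 10))
```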

  17. Conjecture - Uniqueness under Random Sampling
     Conjecture: Let $M$ be a random $m \times r$ matrix whose entries are drawn independently from $\{-1,+1\}$ with probabilities $P[m_{ij} = -1] = p$ and $P[m_{ij} = 1] = 1 - p$. If there is some fixed $\varepsilon > 0$ such that $r < m(1 - \varepsilon)$, then
       $P[\operatorname{aff}(M) \cap \{-1,+1\}^m \neq \{M_{:,1}, \dots, M_{:,r}\}] \le \binom{r}{3} P_3 + o(P_3)$
     with $P_3 = 4 (1 - p(1 - p))^m$, as $m$ tends to infinity.
     - $(1 - p(1 - p)) < 1$ for $p \in (0, 1)$
     - $(1 - \frac{1}{2}(1 - \frac{1}{2})) = \frac{3}{4}$
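
The effect of $p$ on the conjectured leading term can be explored numerically. The following sketch (hypothetical function names, linear search chosen only for simplicity) finds the smallest $m$ for which $\binom{r}{3} P_3$ drops below a target value. Since the base $1 - p(1 - p)$ moves towards 1 as $p$ moves away from $1/2$, sparser components need more rows for the same leading term.

```python
import math

def p3(m, p):
    """P_3 = 4 * (1 - p * (1 - p))^m as stated in the conjecture."""
    return 4.0 * (1.0 - p * (1.0 - p)) ** m

def smallest_m_below(r, p, target):
    """Smallest m >= 1 with C(r, 3) * P_3 <= target, for p in (0, 1)."""
    m = 1
    while math.comb(r, 3) * p3(m, p) > target:
        m += 1
    return m

if __name__ == "__main__":
    for p in (0.5, 0.7, 0.9):
        print(p, smallest_m_below(r=10, p=p, target=1e-3))
```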

  18. Partial result
     Theorem (almost/work in progress): Let $M$ be a random $m \times r$ matrix whose entries are drawn independently from $\{-1,+1\}$ with probabilities $P[m_{ij} = -1] = p$ and $P[m_{ij} = 1] = 1 - p$. If there is some fixed $\varepsilon > 0$ such that $r \le 32$, then
       $P[\operatorname{aff}(M) \cap \{-1,+1\}^m \neq \{M_{:,1}, \dots, M_{:,r}\}] \le \binom{r}{3} P_3 + o(P_3)$
     with $P_3 = 4 (1 - p(1 - p))^m$, as $m$ tends to infinity.
     - $(1 - p(1 - p)) < 1$ for $p \in (0, 1)$
     - $(1 - \frac{1}{2}(1 - \frac{1}{2})) = \frac{3}{4}$

  19. Sperner family and Sperner's Lemma
     Definition (Sperner (1928)): A family of sets that does not include two sets $X$ and $Y$ for which $X \subset Y$ is called a Sperner family.
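
The definition translates directly into a pairwise check. A minimal sketch, with a hypothetical function name, using Python's strict subset operator on frozensets:

```python
from itertools import combinations

def is_sperner_family(sets):
    """True if no member of the family is a proper subset of another member."""
    family = [frozenset(s) for s in sets]
    return not any(a < b or b < a for a, b in combinations(family, 2))

if __name__ == "__main__":
    two_subsets = [{1, 2}, {1, 3}, {2, 3}]   # all 2-element subsets of {1, 2, 3}
    print(is_sperner_family(two_subsets))
    print(is_sperner_family([{1}, {1, 2}]))
```

All subsets of one fixed size always form a Sperner family, since two distinct sets of equal size cannot contain one another.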
