
A Mathematical Theory of Dimensionality Reduction Abbas Kazemipour - PowerPoint PPT Presentation



  1. A Mathematical Theory of Dimensionality Reduction Abbas Kazemipour Druckmann Lab Meeting October 26, 2018

  2. Introduction: Decoding and Dimensionality Reduction Autoencoders perform well in most applications, but not always! § Don't generalize easily § Hard to understand theoretically [Figure: encoder/decoder network diagram]

  3. Introduction: Decoding and Dimensionality Reduction Our focus today: the decoder side. § Observed data: $y_t \in \mathbb{R}^n$, $t = 1, 2, \dots, T$ § Common latents $x_t \in \mathbb{R}^d$ generate the data in an unknown nonlinear fashion: $y_{it} = f_i(x_t) + \epsilon_{it}$ [Figure: decoder mapping latents $x_t$ to observations $y_t$]

  4. Introduction: Role of Dynamics Dynamics play an important role in solving the inverse problem: $x_t = \phi(x_{t-1})$, $y_{it} = f_i(x_t) + \epsilon_{it}$ § The inverse problem is still ill-posed! › Latents can only be identified up to an isomorphism [Figure: decoder with latent dynamics]

  5. Koopman Theory Resolves this Ambiguity Generalizes eigenfunctions/eigenvalues to nonlinear dynamics $x_t = \phi(x_{t-1})$ § Koopman operator: linear, infinite-dimensional: $(\mathcal{K}g)(x) = g(\phi(x))$ › Linearizes the dynamics § Eigenfunctions/eigenvalues: $\mathcal{K}\psi_j = \lambda_j \psi_j$, i.e. $\psi_j(\phi(x)) = \lambda_j \psi_j(x)$, so $y_t = \sum_{j=1}^{\infty} c_j \lambda_j \psi_j(x_{t-1})$ § Goal: a basis $\{\psi_j\}$ that interacts nicely with $\phi$
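A minimal numeric check of the eigenfunction relation $\psi(\phi(x)) = \lambda\,\psi(x)$, using an illustrative nonlinear map and eigenfunction chosen for this sketch (they do not appear in the slides): for $\phi(x) = x^2$ on $(0,1)$, $\psi(x) = \log x$ is a Koopman eigenfunction with eigenvalue 2.

```python
import numpy as np

# Illustrative nonlinear map on (0, 1): phi(x) = x**2.
# Candidate Koopman eigenfunction: psi(x) = log(x), since
# (K psi)(x) = psi(phi(x)) = log(x**2) = 2*log(x) = 2*psi(x),
# i.e. psi is an eigenfunction with eigenvalue 2.
phi = lambda x: x ** 2
psi = np.log

x = np.linspace(0.05, 0.95, 10)
lhs = psi(phi(x))        # (K psi)(x)
rhs = 2.0 * psi(x)       # lambda * psi(x)
print(np.allclose(lhs, rhs))  # True
```

Along a trajectory this is exactly the "linearized dynamics" statement: $\psi(x_t) = \lambda\,\psi(x_{t-1})$ even though $\phi$ itself is nonlinear.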

  6. Polynomials are Eigenfunctions of Linear Dynamics! Linear dynamical system: $x_t = A x_{t-1}$ § For left eigenpairs $(\lambda_i, w_i)$ of $A$ (i.e. $w_i^\top A = \lambda_i w_i^\top$), the polynomial $\psi(x) = (w_{i_1}^\top x)^{m_1}(w_{i_2}^\top x)^{m_2}\cdots(w_{i_k}^\top x)^{m_k}$ is an eigenfunction with eigenvalue $\lambda_{i_1}^{m_1}\lambda_{i_2}^{m_2}\cdots\lambda_{i_k}^{m_k}$ § Variants with the same property: posynomials; complex $\lambda_i$'s for conjugate pairs give periodic combinations; ReLU, etc.
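The claim above can be verified numerically. This sketch (matrix size, seed, and degree are illustrative choices) builds a degree-3 product of left-eigenvector coordinates and checks the eigenvalue relation under one step of the linear dynamics:

```python
import numpy as np

rng = np.random.default_rng(0)

# Linear dynamics x_t = A x_{t-1}.
A = rng.standard_normal((3, 3))

# Left eigenvectors of A are eigenvectors of A^T: w_i^T A = lambda_i w_i^T,
# so psi_i(x) = w_i^T x satisfies psi_i(A x) = lambda_i * psi_i(x),
# and products of such factors are polynomial eigenfunctions.
lams, W = np.linalg.eig(A.T)      # columns of W: left eigenvectors of A
w1, w2 = W[:, 0], W[:, 1]
l1, l2 = lams[0], lams[1]

psi = lambda x: (w1 @ x) ** 2 * (w2 @ x)   # degree-3 polynomial eigenfunction

x = rng.standard_normal(3)
lhs = psi(A @ x)                  # psi evaluated after one dynamics step
rhs = l1 ** 2 * l2 * psi(x)       # eigenvalue lambda_1^2 * lambda_2
print(np.allclose(lhs, rhs))  # True
```

Note the eigenvalues may be complex (conjugate pairs), which is exactly the "periodic combinations" case on the slide; `np.allclose` handles complex values directly.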

  7. Polynomials are Eigenfunctions of Linear Dynamics! $x_t = A x_{t-1}$ § Good news: the polynomial eigenfunctions also form a basis! § Nonlinear dimensionality reduction for dynamical systems ≡ low-rank harmonic analysis

  8. Polynomial Principal Component Analysis (Poly-PCA) § Replace the deterministic linear dynamics with an AR model: $x_t = A x_{t-1} + w_t$ § Model observations as polynomials of degree ≤ $d$ in the latents: $y_{it} = \langle F_i, \tilde{x}_t^{\otimes d} \rangle + \epsilon_{it}$ › $F_i$: symmetric tensor of polynomial coefficients › $\tilde{x}_t$: latents augmented with 1 § $\displaystyle \min_{F, x}\; \sum_{i,t} \big(y_{it} - \langle F_i, \tilde{x}_t^{\otimes d} \rangle\big)^2 + \lambda \sum_t \|x_t - A x_{t-1}\|^2$
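A toy sketch of the Poly-PCA objective for a 1-D latent and degree-2 observations; the dimensions, AR coefficient, and penalty weight are illustrative, not values from the talk. It generates noiseless observations from a known $(F, x)$ pair and evaluates the objective there, where the data-fit term is exactly zero:

```python
import numpy as np

rng = np.random.default_rng(1)

# Toy setup: 1-D latent x_t with AR(1) dynamics, n channels of
# degree-2 observations y_{it} = <F_i, x~_t^{(x)2}>, x~_t = [1, x_t].
T, n, lam, a = 200, 5, 1.0, 0.95
x = np.zeros(T)
for t in range(1, T):
    x[t] = a * x[t - 1] + 0.1 * rng.standard_normal()
F = rng.standard_normal((n, 2, 2))
F = (F + F.transpose(0, 2, 1)) / 2          # symmetric coefficient tensors

xt = np.stack([np.ones(T), x])              # augmented latents, shape (2, T)
Y = np.einsum('ijk,jt,kt->it', F, xt, xt)   # y_{it} = x~_t^T F_i x~_t

def poly_pca_loss(F, x, Y, lam, a):
    """Data-fit term plus AR(1) dynamics penalty from the slide."""
    xt = np.stack([np.ones(len(x)), x])
    fit = Y - np.einsum('ijk,jt,kt->it', F, xt, xt)
    dyn = x[1:] - a * x[:-1]
    return np.sum(fit ** 2) + lam * np.sum(dyn ** 2)

# At the generating (F, x) the fit term vanishes; only the dynamics
# penalty (the innovations w_t) remains.
print(poly_pca_loss(F, x, Y, lam, a))
```

In practice one would alternate minimization over $F$ (a linear least-squares problem for fixed $x$) and over $x$ (nonlinear), which is the ALS view used on the theory slides below.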

  9. Example: Van der Pol Oscillator with Quadratic Measurements § A 2-dimensional nonlinear oscillator: $\dot{x}_1 = x_2$, $\dot{x}_2 = \mu(1 - x_1^2)x_2 - x_1$ § Quadratic measurements: $y_{it} = x_t^\top Q_i x_t + \epsilon_{it}$ § For known rank-1 $Q_i$ ≡ phase retrieval [Figure: latent trajectories and measurements over 10 s]
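A simulation sketch of this example. The integrator (forward Euler), $\mu$, step size, noise level, and the randomly drawn measurement vectors are all assumptions for illustration; the slide only fixes the model class:

```python
import numpy as np

rng = np.random.default_rng(2)

# Van der Pol oscillator, integrated with forward Euler (illustrative
# choices of mu, dt, and horizon).
mu, dt, T = 1.0, 0.01, 1000
x = np.zeros((T, 2))
x[0] = [0.5, 0.0]
for t in range(T - 1):
    x1, x2 = x[t]
    x[t + 1] = x[t] + dt * np.array([x2, mu * (1 - x1 ** 2) * x2 - x1])

# Rank-1 quadratic measurements y_{it} = (q_i^T x_t)^2 + noise;
# with known q_i this is exactly a phase-retrieval problem.
q = rng.standard_normal((4, 2))                 # 4 measurement vectors
Y = (x @ q.T) ** 2 + 0.01 * rng.standard_normal((T, 4))
print(Y.shape)  # (1000, 4)
```

The squared measurements discard the sign of $q_i^\top x_t$, which is the topology-changing nonlinearity the next slide blames for the failure of linear methods.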

  10. Why does linear dimensionality reduction fail? § Nonlinearity changes topology [Figure: PCA, ICA, LLE, and t-SNE embeddings of the quadratic measurements fail to recover the limit cycle]

  11. Poly-PCA Recovers True Latents [Figure: recovered latents match the ground truth up to a non-singular linear map]

  12. Axioms of Dimensionality Reduction § Nonsingular linear transformations of the latents should also be a solution § Nonsingular, stable linear transformations of the measurements should result in the same latents › Gives stability and robustness to outliers § Stable reconstruction is possible if › $y_t = f(x_t)$ with a Lipschitz condition: far-away latents do not map to close observations § Poly-PCA is compatible with these axioms!

  13. Some Poly-PCA Theory § Poly-PCA ≡ constrained PCA › ALS has no local minima; local minima appear from the polynomial constraints § Experimental observation: Poly-PCA has few local minima (compared to the count suggested by Bézout's theorem) [Figure: least-squares error vs. Poly-PCA error along the feasible manifold; the unconstrained LS minimum is unique] § This also gives a good intuitive initialization

  14. Some Poly-PCA Theory § The linear ambiguity can be handled by small penalization: $\displaystyle \min_{F, x}\; \sum_{i,t} \big(y_{it} - \langle F_i, \tilde{x}_t^{\otimes d} \rangle\big)^2 + \lambda_1 \sum_t \|x_t\|^2 + \lambda_2 \sum_i \|F_i\|^2$ [Figure: after small regularization, the manifold of equivalent local minima collapses to one global minimum among all minima of Poly-PCA]
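The "manifold of equivalent minima" is easy to exhibit for degree 2: the fit $y_{it} = \tilde{x}_t^\top F_i \tilde{x}_t$ is unchanged if the latents pass through any nonsingular $L$ and the coefficients through its inverse, $\tilde{x} \to L\tilde{x}$, $F_i \to L^{-\top} F_i L^{-1}$. A numeric check (sizes and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(3)

# Degree-2 observations y_{it} = x_t^T F_i x_t for generic latents.
n, T = 4, 50
X = rng.standard_normal((3, T))                 # latent coordinates, (3, T)
F = rng.standard_normal((n, 3, 3))
F = (F + F.transpose(0, 2, 1)) / 2              # symmetric tensors

# Any nonsingular L gives an equivalent solution:
#   x -> L x,  F_i -> L^{-T} F_i L^{-1}.
L = rng.standard_normal((3, 3))                 # nonsingular w.h.p.
Linv = np.linalg.inv(L)

Y1 = np.einsum('ijk,jt,kt->it', F, X, X)
F2 = np.einsum('ji,ajk,kl->ail', Linv, F, Linv)  # L^{-T} F_a L^{-1}
Y2 = np.einsum('ijk,jt,kt->it', F2, L @ X, L @ X)
print(np.allclose(Y1, Y2))  # True
```

The small ridge penalties $\lambda_1\sum_t\|x_t\|^2 + \lambda_2\sum_i\|F_i\|^2$ break this continuum by preferring one scaling/rotation among the equivalent pairs.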

  15. Some Poly-PCA Theory § Local minima are unique up to a linear transformation, i.e. for $x_t$ and $z_t$ in general position and $T$ large enough: $\langle F_i, x_t^{\otimes d} \rangle = \langle B_i, z_t^{\otimes d} \rangle$ for all $i, t$ ⇒ $x_t = L z_t$ › A generalization of linear preserver theory § Conjecture: the minimum required number of samples is $T > \binom{n+d}{d} + 1$

  16. Equivalence to a 1-layer Decoder with Polynomial Activation § $y_{it} = \langle F_i, \tilde{x}_t^{\otimes d} \rangle$ with $F_i = \sum_{k=1}^{p} a_{ik}\, w_k^{\otimes d}$ ⇒ $y_{it} = \sum_{k=1}^{p} a_{ik} (w_k^\top \tilde{x}_t)^d$: a one-layer network with polynomial activation § Not easy to train (Mondelli, Montanari, 2018) › Better to train directly on $F_i$ § Universal approximation theorem ≡ Taylor approximation theorem [Figure: Poly-PCA decoder drawn as a one-layer network with $p$ neurons]
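The equivalence rests on the identity $(w^\top x)^d = \langle w^{\otimes d}, x^{\otimes d} \rangle$, so a sum of $p$ monomial neurons equals a contraction against the rank-$p$ symmetric tensor $\sum_k a_k w_k^{\otimes d}$. A check for $d = 3$ (sizes and seed are illustrative):

```python
import numpy as np

rng = np.random.default_rng(4)

# One output channel: p monomial neurons of degree d = 3 vs. the
# equivalent symmetric-tensor contraction.
d, p, n = 3, 5, 4
W = rng.standard_normal((p, n))   # neuron weights w_k
a = rng.standard_normal(p)        # output weights a_k
x = rng.standard_normal(n)

# Network view: one layer with polynomial activation t -> t**d.
y_net = np.sum(a * (W @ x) ** d)

# Tensor view: F = sum_k a_k w_k^{(x)3}, then y = <F, x^{(x)3}>.
F = np.einsum('k,ki,kj,kl->ijl', a, W, W, W)
y_tensor = np.einsum('ijl,i,j,l->', F, x, x, x)
print(np.allclose(y_net, y_tensor))  # True
```

Training the $(a_k, w_k)$ factorization directly is the hard nonconvex problem referenced above; the tensor $F$ is the object Poly-PCA fits instead.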

  17. Poly-PCA Initialization § Strategy 1: use PCA › Beats PCA › Needs a larger Lipschitz constant § Strategy 2: data embedding + PCA › Use Takens' embedding theorem on delayed copies $y_t^{(1)}, y_{t+\tau}^{(1)}, y_{t+2\tau}^{(1)}, \dots$ [Figure: ground-truth vs. embedded latents]
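Strategy 2 can be sketched as follows: build a Takens-style delay embedding of one observed channel, then project the embedded trajectory onto its top principal directions as the initial latent estimate. The test signal, embedding dimension, and delay are illustrative assumptions:

```python
import numpy as np

rng = np.random.default_rng(5)

# One observed channel y_t^(1) (illustrative noisy sinusoid).
t = np.linspace(0, 20, 2000)
y = np.sin(t) + 0.05 * rng.standard_normal(t.size)

def delay_embed(y, dim, tau):
    """Rows are delay vectors [y_t, y_{t+tau}, ..., y_{t+(dim-1)tau}]."""
    T = len(y) - (dim - 1) * tau
    return np.stack([y[i * tau : i * tau + T] for i in range(dim)], axis=1)

E = delay_embed(y, dim=5, tau=10)        # embedded trajectory, (1960, 5)
E = E - E.mean(axis=0)                   # center before PCA
_, _, Vt = np.linalg.svd(E, full_matrices=False)
x0 = E @ Vt[:2].T                        # 2-D PCA projection as latent init
print(x0.shape)  # (1960, 2)
```

The embedding unfolds dynamics that a direct PCA of the raw measurements would collapse, which is why this initialization can succeed where Strategy 1 needs a larger Lipschitz constant.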
