A Mathematical Theory of Dimensionality Reduction
Abbas Kazemipour
Druckmann Lab Meeting, October 26, 2018
Introduction: Decoding and Dimensionality Reduction
Autoencoders
§ Perform well in most applications, but not always!
§ Don't generalize easily
§ Hard to understand theoretically
[Figure: autoencoder schematic; an encoder maps the input to a latent code and a decoder reconstructs the input from it]
Introduction: Decoding and Dimensionality Reduction
Our focus today: the decoder side
§ Observed data: $y_t \in \mathbb{R}^p$, $t = 1, 2, \dots, T$
§ Common latents $x_t \in \mathbb{R}^d$ generate the data in an unknown nonlinear fashion:
[Figure: decoder schematic mapping latents $x_t$ through $f$ to observations]

$$y_{it} = f_i(x_t) + \epsilon_{it}$$
Introduction: Role of Dynamics
Dynamics play an important role in solving the inverse problem
§ The inverse problem is still ill-posed: latents can only be identified up to an isomorphism
[Figure: decoder schematic with latent dynamics]

$$y_{it} = f_i(x_t) + \epsilon_{it}, \qquad x_t = g(x_{t-1})$$
Koopman Theory Resolves this Ambiguity
§ Generalizes eigenfunctions/eigenvalues to nonlinear dynamics: $x_t = g(x_{t-1})$
§ Koopman operator $\mathcal{K}$: linear, infinite-dimensional
› Linearizes the dynamics
§ Eigenfunctions/eigenvalues:

$$(\mathcal{K}\varphi)(x) = \varphi(g(x)), \qquad \mathcal{K}\varphi = \lambda\varphi \;\Leftrightarrow\; \varphi(g(x)) = \lambda\varphi(x)$$

so the state expands in Koopman modes:

$$x_t = \sum_{j=1}^{\infty} c_j \lambda_j \varphi_j(x_{t-1})$$

§ $\varphi$ interacts nicely with $g$. Goal: find the eigenfunctions.
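The eigenfunction relation $\varphi(g(x)) = \lambda\varphi(x)$ can be checked on a toy nonlinear map. A minimal NumPy sketch, assuming the example dynamics $g(x) = x^2$ on $x > 0$ (an illustrative choice, not from the talk), for which $\varphi(x) = \log x$ is a Koopman eigenfunction with eigenvalue 2:

```python
import numpy as np

# Toy nonlinear dynamics g(x) = x^2 on x > 0 (assumed example).
# phi(x) = log(x) satisfies phi(g(x)) = log(x^2) = 2*log(x) = 2*phi(x),
# so phi is a Koopman eigenfunction with eigenvalue 2.
g = lambda x: x ** 2
phi = np.log

x = 1.7
assert np.isclose(phi(g(x)), 2 * phi(x))
```

The operator acts linearly on functions even though $g$ itself is nonlinear, which is the sense in which Koopman theory "linearizes" the dynamics.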
Polynomials are Eigenfunctions of Linear Dynamics!
For a linear dynamical system $x_t = A x_{t-1}$, let $(\lambda_i, v_i)$ be eigenpairs of $A^\top$. Then:
§ Products of projections are eigenfunctions: $\varphi(x) = v_{i_1}^\top x \cdot v_{i_2}^\top x \cdots v_{i_k}^\top x$ with eigenvalue $\lambda = \lambda_{i_1}\lambda_{i_2}\cdots\lambda_{i_k}$
§ Polynomials: $\varphi(x) = (v_{i_1}^\top x)^{a_1}(v_{i_2}^\top x)^{a_2}\cdots(v_{i_k}^\top x)^{a_k}$ with eigenvalue $\lambda = \lambda_{i_1}^{a_1}\lambda_{i_2}^{a_2}\cdots\lambda_{i_k}^{a_k}$
§ Extensions: posynomials, periodic combinations with the same $\lambda$, complex $a_i$'s for conjugate pairs, ReLU, etc.
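The polynomial-eigenfunction claim is easy to verify numerically: for $x_t = A x_{t-1}$, products of powers of left-eigenvector projections satisfy $\varphi(Ax) = \lambda\varphi(x)$ with $\lambda$ the matching product of eigenvalue powers. A minimal NumPy sketch (the matrix `A`, exponents `a`, and test point `x` are illustrative assumptions):

```python
import numpy as np

# Linear dynamics x_{t+1} = A x_t; eigenpairs of A^T give the projections.
A = np.array([[0.9, 0.2],
              [0.0, 0.5]])
lams, V = np.linalg.eig(A.T)          # columns of V are eigenvectors of A^T
a = np.array([2, 3])                  # exponents (example choice)

def phi(x):
    # phi(x) = (v_1^T x)^{a_1} * (v_2^T x)^{a_2}
    return np.prod((V.T @ x) ** a)

lam = np.prod(lams ** a)              # eigenvalue lam_1^{a_1} * lam_2^{a_2}

x = np.array([0.7, -1.3])
assert np.isclose(phi(A @ x), lam * phi(x))
```

The check works because $v_i^\top (Ax) = (A^\top v_i)^\top x = \lambda_i v_i^\top x$, so each factor picks up its own eigenvalue.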
Polynomials are Eigenfunctions of Linear Dynamics!
$x_t = A x_{t-1}$: the monomials $(v_{i_1}^\top x)^{a_1}(v_{i_2}^\top x)^{a_2}\cdots(v_{i_k}^\top x)^{a_k}$

Good news: they also form a basis!

$$f(x) = \sum_{a} c_a \,(v_{i_1}^\top x)^{a_1}(v_{i_2}^\top x)^{a_2}\cdots(v_{i_k}^\top x)^{a_k}$$
Nonlinear dimensionality reduction for dynamical systems ≡ Low-rank harmonic analysis
Polynomial Principal Component Analysis (Poly-PCA)
§ Replace the deterministic linear dynamics with an AR model
§ Model observations as polynomials of degree $\le k$ in the latents:

$$y_{it} = \langle A_i, x_t^{\otimes k}\rangle + \epsilon_{it}, \qquad x_t = D x_{t-1} + w_t$$

$A_i$: symmetric tensor of polynomial coefficients; $x_t$: latents augmented with 1

$$\underset{A_i,\, x_t}{\text{minimize}} \;\; \sum_{i,t}\left(y_{it} - \langle A_i, x_t^{\otimes k}\rangle\right)^2 + \lambda \sum_t \left\|x_t - D x_{t-1}\right\|_2^2$$
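The Poly-PCA objective above can be written compactly with tensor contractions. A minimal NumPy sketch for degree $k = 2$, assuming illustrative dimensions and a function name `poly_pca_loss` that is not from the talk's code:

```python
import numpy as np

# Sketch of the Poly-PCA objective for degree k = 2 (all constants assumed).
rng = np.random.default_rng(0)
p, d, T = 5, 2, 100
lam = 0.1                                   # dynamics-penalty weight (assumed)

x = rng.standard_normal((T, d + 1))
x[:, -1] = 1.0                              # latents augmented with 1
A = rng.standard_normal((p, d + 1, d + 1))
A = (A + A.transpose(0, 2, 1)) / 2          # symmetric coefficient tensors
D = rng.standard_normal((d + 1, d + 1))

# Noiseless observations y_it = <A_i, x_t (x) x_t>
y = np.einsum('ijk,tj,tk->ti', A, x, x)

def poly_pca_loss(A, x, y, D, lam):
    fit = y - np.einsum('ijk,tj,tk->ti', A, x, x)   # measurement residuals
    dyn = x[1:] - x[:-1] @ D.T                      # x_t - D x_{t-1}
    return (fit ** 2).sum() + lam * (dyn ** 2).sum()

# With y generated exactly from (A, x), the fit term vanishes and only the
# dynamics penalty contributes.
```

In practice the objective would be minimized by alternating between the $A_i$ (a linear least-squares problem for fixed latents) and the $x_t$.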
Example: Van der Pol Oscillator with Quadratic Measurements
§ A 2-dimensional nonlinear oscillator with quadratic measurements:

$$y_{it} = x_t^\top Q_i x_t + \epsilon_{it}$$

$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = \mu\left(1 - x_1^2\right) x_2 - x_1$$

[Figure: latent trajectories and measurements plotted over time (s)]

§ For known rank-1 $Q_i$ ≡ phase retrieval
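The example above is straightforward to reproduce. A minimal NumPy sketch that integrates the Van der Pol dynamics with forward Euler and forms quadratic measurements; the step size, horizon, $\mu$, noise level, and the random $Q_i$ are all illustrative assumptions:

```python
import numpy as np

# Simulate the Van der Pol oscillator and form quadratic measurements
# y_it = x_t^T Q_i x_t + noise (constants and Q_i are assumed here).
mu, dt, T = 1.0, 0.01, 2000
x = np.zeros((T, 2))
x[0] = [2.0, 0.0]
for t in range(T - 1):                      # forward-Euler integration
    x1, x2 = x[t]
    x[t + 1] = [x1 + dt * x2,
                x2 + dt * (mu * (1 - x1 ** 2) * x2 - x1)]

rng = np.random.default_rng(0)
Q = rng.standard_normal((5, 2, 2))
Q = (Q + Q.transpose(0, 2, 1)) / 2          # symmetric quadratic forms
y = np.einsum('ijk,tj,tk->ti', Q, x, x) + 0.01 * rng.standard_normal((T, 5))
```

Forward Euler is the simplest choice for a sketch; a stiff-aware integrator would be preferable for large $\mu$.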
Why does linear dimensionality reduction fail?
§ Nonlinearity changes topology

[Figure: latents recovered by ICA, PCA, t-SNE, and LLE, plotted over time (s)]
Poly-PCA Recovers True Latents
[Figure: ground-truth vs. Poly-PCA-recovered latents over time (s), matching up to a non-singular linear map]
Axioms of Dimensionality Reduction
§ Nonsingular linear transformations of the latents should also be a solution
§ Nonsingular, stable linear transformations of the measurements should result in the same latents
› Gives stability and robustness to outliers
§ Stable reconstruction is possible if
› $y_t = f(x_t)$ is Lipschitz: far-away latents do not map to close observations
Poly-PCA is compatible with these Axioms!
Some Poly-PCA Theory
§ Poly-PCA ≡ constrained PCA
› ALS for PCA has no local minima; local minima appear from the polynomial constraints
§ Experimental observation: Poly-PCA has few local minima (compared to the count suggested by Bézout's theorem)

[Figure: least-squares error surface vs. Poly-PCA error along the (unique) feasible manifold, with a local minimum and a local maximum marked]

§ The least-squares solution also gives a good intuitive initialization
Some Poly-PCA Theory
§ The linear ambiguity can be handled by a small penalization:

$$\underset{A_i,\, x_t}{\text{minimize}} \;\; \sum_{i,t}\left(y_{it} - \langle A_i, x_t^{\otimes k}\rangle\right)^2 + \lambda \sum_t \|x_t\|_2^2 + \gamma \sum_i \|A_i\|_2^2$$

[Figure: convex vs. nonconvex error landscapes, each with one global minimum; for Poly-PCA, a manifold of equivalent local minima collapses to an optimal solution after small regularization]
Some Poly-PCA Theory
§ Local minima are unique up to a linear transformation, i.e., for $A_i$ and $x_t$ in general position and $T > \binom{d+k}{k}\,d$:

$$\langle A_i, x_t^{\otimes k}\rangle = \langle \mathcal{B}_i, z_t^{\otimes k}\rangle \;\;\forall\, i, t \;\Rightarrow\; x_t = L z_t$$

› Generalization of linear preserver theory
§ Conjecture: the minimum required number of samples is $T > \binom{d+k}{k} + d + 1$
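The linear ambiguity itself is easy to exhibit numerically: for degree $k = 2$, pushing each coefficient tensor through a nonsingular $L$ while transforming the latents by $L^{-1}$ leaves every measurement unchanged. A minimal NumPy sketch (the dimension, seed, and matrices are illustrative):

```python
import numpy as np

# Degree-2 linear ambiguity: with x = L z and B = L^T A L,
#   z^T B z = (L z)^T A (L z) = x^T A x,
# so (A, x) and (B, z) explain the data equally well.
rng = np.random.default_rng(1)
d = 3
L = rng.standard_normal((d, d))             # generic, hence nonsingular, map
A = rng.standard_normal((d, d))
A = (A + A.T) / 2                           # symmetric coefficient tensor

x = rng.standard_normal(d)
z = np.linalg.solve(L, x)                   # z = L^{-1} x
B = L.T @ A @ L

assert np.isclose(x @ A @ x, z @ B @ z)
```

This is exactly why the theorem above can identify the latents only up to the linear map $L$.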
Poly-PCA Decoder
[Figure: Poly-PCA decoder network schematic, mapping the latents through polynomial features to outputs for neuron 1 through neuron p]