A Mathematical Theory of Dimensionality Reduction - Abbas Kazemipour (PowerPoint presentation)



SLIDE 1

A Mathematical Theory of Dimensionality Reduction

Abbas Kazemipour

Druckmann Lab Meeting October 26, 2018

SLIDE 2

Introduction: Decoding and Dimensionality Reduction

Autoencoders

§ Perform well in most applications, but not always!
§ Don’t generalize easily
§ Hard to understand theoretically

[Figure: autoencoder schematic with encoder and decoder networks]

SLIDE 3

Introduction: Decoding and Dimensionality Reduction

Our focus today: the decoder side

§ Observed data: $y_t \in \mathbb{R}^p$, $t = 1, 2, \dots, T$
§ Common latents $x_t \in \mathbb{R}^n$ generate the data in an unknown nonlinear fashion:

[Figure: decoder network mapping latents $x_t$ to observations $y_t$]

$$y_{it} = f_i(x_t) + \epsilon_{it}$$

SLIDE 4

Introduction: Role of Dynamics

Dynamics play an important role in solving the inverse problem

§ The inverse problem is still ill-posed!
› Latents can only be identified up to an isomorphism

[Figure: decoder network mapping latents $x_t$ to observations $y_t$]

$$y_{it} = f_i(x_t) + \epsilon_{it}, \qquad x_t = g(x_{t-1})$$
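As a minimal sketch of this generative model (NumPy, with hypothetical choices of the dynamics $g$ and the observation maps $f_i$, which the model treats as unknown), latents evolve deterministically and each neuron observes a noisy nonlinear function of them:

```python
import numpy as np

rng = np.random.default_rng(0)
n, p, T = 2, 10, 500                  # latent dim, observed dim, time steps

def g(x):
    # hypothetical nonlinear latent dynamics: mild rotation + saturation
    R = np.array([[np.cos(0.1), -np.sin(0.1)],
                  [np.sin(0.1),  np.cos(0.1)]])
    return np.tanh(R @ x)

W = rng.standard_normal((p, n))        # hypothetical per-neuron readout weights

def f(x):
    # unknown nonlinear observation map, applied to all neurons i at once
    return np.tanh(W @ x)

x = np.zeros((T, n))
x[0] = rng.standard_normal(n)
for t in range(1, T):
    x[t] = g(x[t - 1])                 # x_t = g(x_{t-1})

# y_{it} = f_i(x_t) + eps_{it}
y = np.array([f(x[t]) for t in range(T)]) + 0.01 * rng.standard_normal((T, p))
print(y.shape)  # (500, 10)
```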

SLIDE 5

Koopman Theory Resolves this Ambiguity

§ Generalizes eigenfunctions/eigenvalues to nonlinear dynamics: $x_t = g(x_{t-1})$
§ Koopman operator: linear, infinite-dimensional
› Linearizes the dynamics
§ Eigenfunctions/eigenvalues:

$$(\mathcal{K}\phi)(x) = \phi(g(x)), \qquad \mathcal{K}\phi_j(x) = \lambda_j \phi_j(x)$$

$\phi$ interacts nicely with $g$: $\phi(g(x)) = \lambda \phi(x)$. Goal: expand the observations in Koopman eigenfunctions,

$$y_t = \sum_{j=1}^{\infty} c_j \lambda_j \phi_j(x_{t-1})$$

SLIDE 6

Polynomials are Eigenfunctions of Linear Dynamics!

Linear dynamical system:

$$x_t = A x_{t-1}$$

For eigenpairs $(\lambda_i, u_i)$ of $A^T$ (i.e. left eigenvectors of $A$), products of powers of the projections $u_i^T x$ are Koopman eigenfunctions:

$$\phi(x) = (u_{i_1}^T x)^{m_1} (u_{i_2}^T x)^{m_2} \cdots (u_{i_k}^T x)^{m_k}, \qquad \lambda = \lambda_{i_1}^{m_1} \lambda_{i_2}^{m_2} \cdots \lambda_{i_k}^{m_k}$$

§ Polynomials: integer powers $m_j$
§ Posynomials: real powers
§ Periodic functions: complex $\lambda_i$'s for conjugate pairs
§ ReLU, etc.: combinations with the same $\lambda$
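This claim is easy to verify numerically; a minimal sketch (NumPy, with an arbitrarily chosen matrix $A$ that has real eigenvalues) checks that a product of powers of left-eigenvector projections satisfies $\phi(Ax) = \lambda_1^{m_1}\lambda_2^{m_2}\,\phi(x)$:

```python
import numpy as np

A = np.array([[0.9, 0.1],
              [0.2, 0.7]])             # arbitrary linear dynamics x_t = A x_{t-1}

# left eigenvectors of A are eigenvectors of A^T
lams, U = np.linalg.eig(A.T)
u1, u2 = U[:, 0], U[:, 1]
m1, m2 = 3, 2                           # monomial powers

def phi(x):
    # candidate Koopman eigenfunction: product of powers of projections
    return (u1 @ x) ** m1 * (u2 @ x) ** m2

x = np.array([1.3, -0.4])
lhs = phi(A @ x)                        # phi after one step of the dynamics
rhs = lams[0] ** m1 * lams[1] ** m2 * phi(x)
print(np.isclose(lhs, rhs))             # True: phi is an eigenfunction
```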

SLIDE 7

Polynomials are Eigenfunctions of Linear Dynamics!

$$x_t = A x_{t-1}$$

[Figure: the polynomial eigenfunctions $\phi_{i_1}(x), \phi_{i_2}(x), \dots, \phi_{i_k}(x)$ evolving under the dynamics]

Good news: they also form a basis!

$$f(x) = \sum_j c_j \phi_j(x)$$

Nonlinear dimensionality reduction for dynamical systems ≡ low-rank harmonic analysis

SLIDE 8

Polynomial Principal Component Analysis (Poly-PCA)

§ Replace deterministic linear dynamics with an AR model
§ Model observations as polynomials of degree ≤ $k$ in the latents

$$y_{it} = \langle A_i, x_t^{\otimes k} \rangle + \epsilon_{it}, \qquad x_t = D x_{t-1} + e_t$$

$A_i$: symmetric tensor of polynomial coefficients
$x_t$: latents augmented with 1

$$\underset{A_i,\; x_t}{\text{minimize}} \;\; \sum_{i,t} \big( y_{it} - \langle A_i, x_t^{\otimes k} \rangle \big)^2 + \lambda \sum_t \lVert x_t - D x_{t-1} \rVert_2^2$$
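For degree $k = 2$ the inner product $\langle A_i, x_t^{\otimes 2} \rangle$ is just $x_t^T A_i x_t$, so the Poly-PCA objective can be written down directly. A minimal sketch (NumPy, hypothetical random data, not the authors' implementation):

```python
import numpy as np

rng = np.random.default_rng(1)
n, p, T, lam = 3, 5, 100, 0.1          # latent dim, neurons, time steps, penalty

Y = rng.standard_normal((p, T))         # hypothetical observations y_{it}
X = rng.standard_normal((T, n))         # current latent estimates x_t
A = rng.standard_normal((p, n, n))
A = (A + A.transpose(0, 2, 1)) / 2      # symmetric coefficient tensors A_i
D = 0.9 * np.eye(n)                     # AR(1) dynamics matrix

def poly_pca_loss(A, X, Y, D, lam):
    # <A_i, x_t (x) x_t> = x_t^T A_i x_t, computed for all i, t at once
    pred = np.einsum('tm,imn,tn->it', X, A, X)
    fit = np.sum((Y - pred) ** 2)       # squared reconstruction error
    dyn = np.sum((X[1:] - X[:-1] @ D.T) ** 2)  # AR dynamics penalty
    return fit + lam * dyn

print(poly_pca_loss(A, X, Y, D, lam) >= 0)   # True
```

In practice this objective would be minimized by alternating over $A_i$ (a linear least-squares step) and $x_t$; the sketch only evaluates it.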

SLIDE 9

Example: Van der Pol Oscillator with Quadratic Measurements

§ A 2-dimensional nonlinear oscillator

§ Quadratic measurements:

$$y_{it} = x_t^T Q_i x_t + \epsilon_{it}$$

§ Van der Pol dynamics:

$$\dot{x}_1 = x_2, \qquad \dot{x}_2 = \mu (1 - x_1^2) x_2 - x_1$$

[Figure: Van der Pol latent trajectories and observed time series over 10 s]

For known rank-1 $Q_i$ ≡ phase retrieval
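The example above can be reproduced with a forward-Euler simulation of the Van der Pol system followed by random symmetric quadratic measurements; a minimal sketch (NumPy, with hypothetical parameter values for $\mu$, the step size, and the number of measurements):

```python
import numpy as np

rng = np.random.default_rng(2)
mu, dt, T = 2.0, 0.01, 2000             # oscillator parameter, Euler step, steps
p = 6                                    # number of quadratic measurements

# forward-Euler integration of the Van der Pol dynamics
x = np.zeros((T, 2))
x[0] = [0.5, 0.0]
for t in range(1, T):
    x1, x2 = x[t - 1]
    x[t] = [x1 + dt * x2,
            x2 + dt * (mu * (1 - x1 ** 2) * x2 - x1)]

# quadratic measurements y_{it} = x_t^T Q_i x_t + noise, random symmetric Q_i
Q = rng.standard_normal((p, 2, 2))
Q = (Q + Q.transpose(0, 2, 1)) / 2
y = np.einsum('tm,imn,tn->it', x, Q, x) + 0.01 * rng.standard_normal((p, T))
print(y.shape)  # (6, 2000)
```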

SLIDE 10

Why does linear dimensionality reduction fail?

§ Nonlinearity changes topology

[Figure: latents recovered by ICA, PCA, t-SNE, and LLE over 10 s of data]

SLIDE 11

Poly-PCA Recovers True Latents

[Figure: ground-truth vs. recovered latent trajectories over time; they agree up to a non-singular linear map]

SLIDE 12

Axioms of Dimensionality Reduction

§ Nonsingular linear transformations of the latents should also be a solution
§ Nonsingular and stable linear transformations of the measurements should result in the same latents
› Gives stability and robustness to outliers
§ Stable reconstruction is possible if
› $y_t = f(x_t)$ is Lipschitz: far-away latents do not map to close observations

Poly-PCA is compatible with these axioms!

slide-13
SLIDE 13

Some Poly-PCA Theory

§ Poly-PCA ≡ constrained PCA
› Alternating least squares (ALS) has no local minima; local minima appear from the polynomial constraints
§ Experimental observation: Poly-PCA has few local minima (compared to the count from Bézout’s theorem)

[Figure: least-squares error vs. Poly-PCA error along the (unique) feasible manifold, with a local minimum and a local maximum marked]

Also gives a good intuitive initialization

SLIDE 14

Some Poly-PCA Theory

§ Linear ambiguity can be handled by small penalization

[Figure: convex case with one global minimum vs. nonconvex case; all minima of Poly-PCA lie on a manifold of equivalent local minima, and a small regularization selects one optimal solution]

$$\underset{A_i,\; x_t}{\text{minimize}} \;\; \sum_{i,t} \big( y_{it} - \langle A_i, x_t^{\otimes k} \rangle \big)^2 + \lambda \sum_t \lVert x_t \rVert_2^2 + \gamma \sum_i \lVert A_i \rVert_2^2$$

SLIDE 15

Some Poly-PCA Theory

§ Local minima are unique up to a linear transformation, i.e., for $x_t$ and $A_i$ in general position and $T > \binom{n+k}{k}$:

$$\langle A_i, x_t^{\otimes k} \rangle = \langle \mathcal{B}_i, z_t^{\otimes k} \rangle \;\Rightarrow\; x_t = L z_t$$

› Generalization of linear preserver theory
§ Conjecture: the minimum required number of samples is $T > \binom{n+k}{k}/p + 1$

SLIDE 16

Poly-PCA Decoder

[Figure: one-layer decoder network mapping latents to neurons 1 through $p$]

$$A_i = \sum_{j=1}^{r} a_{ij}^{\otimes k} \quad\Rightarrow\quad \langle A_i, x_t^{\otimes k} \rangle = \sum_{j=1}^{r} \big( a_{ij}^T x_t \big)^k$$

Equivalence to a 1-layer Decoder with Polynomial Activation

§ Not easy to train (Mondelli, Montanari, 2018)
§ Better to train directly on $y_t$
§ Universal approximation theorem ≡ Taylor approximation theorem
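For $k = 2$ the equivalence is easy to verify numerically: if $A = \sum_j a_j a_j^T$, then $x^T A x = \sum_j (a_j^T x)^2$, i.e. a one-layer network with activation $u \mapsto u^2$. A minimal check (NumPy, arbitrary random vectors):

```python
import numpy as np

rng = np.random.default_rng(3)
n, r = 4, 3
a = rng.standard_normal((r, n))         # "hidden unit" weight vectors a_j

# symmetric tensor (for k = 2, a matrix) built from its rank-1 components
A = sum(np.outer(a_j, a_j) for a_j in a)

x = rng.standard_normal(n)
tensor_form = x @ A @ x                 # <A, x (x) x>
network_form = np.sum((a @ x) ** 2)     # one-layer net, activation u -> u^2
print(np.isclose(tensor_form, network_form))  # True
```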

SLIDE 17

Poly-PCA Initialization

§ Strategy 1: Use PCA
› Beats PCA
› Needs a larger Lipschitz constant
§ Strategy 2: Data embedding + PCA
› Use Takens’ embedding theorem: embed $\big( y_t(1),\; y_{t-\tau}(1),\; y_{t-2\tau}(1) \big)$

[Figure: ground-truth vs. delay-embedded trajectories]
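A delay embedding of a single observed coordinate, as in Takens' theorem, can be built with plain array slicing; a minimal sketch (NumPy, with a hypothetical scalar signal and delay $\tau$):

```python
import numpy as np

def delay_embed(y, tau, dim):
    """Stack rows (y[t], y[t - tau], ..., y[t - (dim - 1) * tau])."""
    T = len(y) - (dim - 1) * tau
    return np.column_stack([y[(dim - 1 - j) * tau : (dim - 1 - j) * tau + T]
                            for j in range(dim)])

# hypothetical scalar observation: one coordinate of some trajectory
t = np.linspace(0, 20, 1000)
y1 = np.sin(t) + 0.5 * np.sin(3 * t)

E = delay_embed(y1, tau=10, dim=3)      # columns: y_t, y_{t-tau}, y_{t-2*tau}
print(E.shape)  # (980, 3)
```

The embedded trajectory `E` would then be fed to PCA to initialize the latents.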