From Fourier to Koopman: Spectral Methods for Long-term Time Series Prediction



slide-1
SLIDE 1

From Fourier to Koopman

Spectral Methods for Long-term Time Series Prediction

Henning Lange, Steven L. Brunton, J. Nathan Kutz

arXiv:2004.00574

slide-2
SLIDE 2

Objective

> Given data snapshots $x_t$ from $t = 1$ to $t = T$
> Predict temporal snapshots $x_{T+h}$
  > $h$ in the order of 10,000
> Assumption: $x_t$ is produced by a quasi-periodic system

slide-3
SLIDE 3

Spatio-Temporal Systems

slide-4
SLIDE 4

Outline

> Fourier Forecast
  > Similar to Fourier Transform
  > No implicit periodicity assumption
> Koopman Forecast
  > Based on Koopman theory
  > Fourier Transform in non-linear basis

slide-5
SLIDE 5

Outline

> Fourier Forecast
  > Non-convex objective
> Koopman Forecast
  > Non-linear and non-convex objective
> FFT allows for obtaining global optima

slide-6
SLIDE 6

Solution strategy

> Both learning objectives contain easy-to-optimize and hard-to-optimize parameters
> For both algorithms, a strategy is introduced for obtaining the global optimum of a single value of the hard-to-optimize parameters
> Apply coordinate descent
  > Alternately optimize hard and easy quantities

slide-7
SLIDE 7

Fourier Forecast

slide-8
SLIDE 8

Objective

> Goal: Fit linear dynamical system $y_t$ to data $x_t$

minimize $E(A, B) = \sum_{t=1}^{T} (x_t - A y_t)^2$
subject to $y_t = B y_{t-1}$ and $\mathrm{Re}[\mathrm{eig}(B)] = 0$

slide-9
SLIDE 9

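Objective

> Goal: Fit linear dynamical system $y_t$ to data $x_t$

$$E(A, \omega) = \sum_{t=1}^{T} \left( x_t - A \begin{bmatrix} \sin(\omega_1 t) \\ \vdots \\ \sin(\omega_N t) \\ \cos(\omega_1 t) \\ \vdots \\ \cos(\omega_N t) \end{bmatrix} \right)^2$$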

slide-10
SLIDE 10

Objective

> Goal: Fit linear dynamical system $y_t$ to data $x_t$

$$E(A, \omega) = \sum_{t=1}^{T} \left( x_t - A\,\Omega(\omega t) \right)^2$$
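Since the model $A\,\Omega(\omega t)$ is linear in $A$, the best $A$ for a fixed frequency vector $\omega$ is an ordinary least-squares solution. The following is a minimal NumPy sketch of that step, not the authors' implementation. The helper names `omega_basis` and `fit_A` and the toy data are assumptions made for illustration.

```python
import numpy as np

def omega_basis(omega, t):
    """Stack sin(omega_i * t) and cos(omega_i * t) into a (2N, T) matrix."""
    phase = np.outer(omega, t)                      # shape (N, T)
    return np.vstack([np.sin(phase), np.cos(phase)])

def fit_A(x, omega, t):
    """Least-squares A minimizing sum_t (x_t - A Omega(omega t))^2 for fixed omega."""
    basis = omega_basis(omega, t)                   # shape (2N, T)
    # Solve basis.T @ A.T ~= x.T in the least-squares sense.
    A_T, *_ = np.linalg.lstsq(basis.T, x.T, rcond=None)
    return A_T.T                                    # shape (dim_x, 2N)

# Toy data (hypothetical): two known frequencies, one observed dimension.
t = np.arange(200)
omega = np.array([2 * np.pi / 24, 2 * np.pi / 7.5])
x = 1.5 * np.sin(omega[0] * t) - 0.5 * np.cos(omega[1] * t)

A = fit_A(x[None, :], omega, t)                     # columns: sin(w1), sin(w2), cos(w1), cos(w2)
residual = x - (A @ omega_basis(omega, t))[0]
```

In this sketch, $A$ is the "easy" quantity: it has a closed-form optimum once the frequencies are fixed, which leaves the frequencies as the hard part of the objective.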

slide-11
SLIDE 11

Objective

> Goal: Fit linear dynamical system $y_t$ to data $x_t$
> Because of linearity of $A$ and $\Omega$
  > Analytic solution for $\omega_i$
  > Symmetry relationship to Fourier Transform

$$E(A, \omega) = \sum_{t=1}^{T} \left( x_t - A\,\Omega(\omega t) \right)^2$$

slide-12
SLIDE 12

Symmetry

$$E(A, \omega) = \sum_{t=1}^{T} \left( x_t - A\,\Omega(\omega t) \right)^2$$

Jaynes, E. T. "Bayesian spectrum and chirp analysis." Maximum-Entropy and Bayesian Spectral Analysis and Estimation Problems. Springer, Dordrecht, 1987. 1-37.

slide-13
SLIDE 13

Spectral leakage

> For quasi-periodic systems, the Fourier Transform (and with it the error surface) is a superposition of sinc functions
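Spectral leakage can be seen numerically by comparing a sinusoid whose period divides the window length with one whose period does not. The sketch below is illustrative only. The window length, the two periods, and the helper names `spectrum` and `bins_above` are assumptions.

```python
import numpy as np

T = 240
t = np.arange(T)

def spectrum(period):
    """Magnitude spectrum of a unit sinusoid, scaled so an on-grid peak equals 1."""
    return np.abs(np.fft.rfft(np.sin(2 * np.pi * t / period))) / (T / 2)

def bins_above(spec, thresh=0.01):
    """Number of FFT bins whose magnitude exceeds the threshold."""
    return int(np.sum(spec > thresh))

on_grid = spectrum(24.0)    # 10 full periods fit in T: the frequency sits on an FFT bin
off_grid = spectrum(25.0)   # 9.6 periods fit in T: the frequency falls between bins

print(bins_above(on_grid), bins_above(off_grid))
```

For the on-grid signal all energy sits in a single bin. For the off-grid signal the peak is lower and energy spreads over many neighbouring bins following a sinc-shaped envelope.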

slide-14
SLIDE 14

Combining FFT and GD

> Fast Fourier Transform
  > evaluates the Fourier Transform only at frequencies that are periodic in $T$
  > harmful for forecasting
> Gradient Descent
  > because of non-convexity, will get stuck in a bad local minimum

slide-15
SLIDE 15

Combining FFT and GD

> Use Fast Fourier Transform
  > to locate the global valley of the error surface
> Use Gradient Descent
  > to improve the initial guess of the FFT and break the implicit periodicity assumption
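The two-stage idea can be sketched for a single sinusoid: take the largest FFT bin as the initial frequency, then alternate a least-squares fit of the amplitudes with a gradient step on the frequency. This is a simplified illustration, not the authors' code. The function names, the step-size heuristic, and the noise-free toy signal are all assumptions.

```python
import numpy as np

def fft_frequency(x):
    """Frequency (radians per sample) of the largest non-DC FFT bin."""
    spec = np.abs(np.fft.rfft(x))
    k = 1 + np.argmax(spec[1:])
    return 2 * np.pi * k / len(x)

def refine_frequency(x, w0, steps=200):
    """Alternate least squares on the amplitudes with gradient steps on the frequency."""
    T = len(x)
    t = np.arange(T)
    # Heuristic step size (assumption): the curvature of the error valley
    # scales roughly with T^3 times the signal power.
    lr = 3.0 / (T**3 * np.mean(x**2))
    w = w0
    for _ in range(steps):
        basis = np.stack([np.sin(w * t), np.cos(w * t)], axis=1)   # shape (T, 2)
        (a, b), *_ = np.linalg.lstsq(basis, x, rcond=None)          # easy step
        resid = x - basis @ np.array([a, b])
        # dE/dw for E = sum_t (x_t - a sin(w t) - b cos(w t))^2
        grad = -2.0 * np.sum(resid * t * (a * np.cos(w * t) - b * np.sin(w * t)))
        w -= lr * grad                                              # hard step
    return w

t = np.arange(500)
w_true = 2 * np.pi / 24            # 500 / 24 is not an integer, so w_true is off the FFT grid
x = np.sin(w_true * t)

w_fft = fft_frequency(x)           # nearest FFT bin: inside the right valley, but biased
w_gd = refine_frequency(x, w_fft)  # refined estimate without the periodic-in-T restriction
print(w_true, w_fft, w_gd)
```

The FFT estimate is restricted to multiples of $2\pi/T$ but lands in the correct valley of the error surface. Gradient descent started from there is free to move off the grid.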

slide-16
SLIDE 16

Combining FFT and GD

slide-17
SLIDE 17

Koopman Forecast

slide-18
SLIDE 18

Spatio-Temporal Systems

slide-19
SLIDE 19

Koopman Theory

> Koopman showed in 1931:
  > any non-linear dynamical system can be lifted by a non-linear but time-invariant function into a space where time evolution is linear
> Analogous to Cover's theorem (1965)
  > Theoretical underpinning of kernel methods and deep learning

Cover, T. M. "Geometrical and statistical properties of systems of linear inequalities with applications in pattern recognition." IEEE Transactions on Electronic Computers EC-14.3 (1965): 326-334.
Koopman, Bernard O. "Hamiltonian systems and transformation in Hilbert space." Proceedings of the National Academy of Sciences of the United States of America 17.5 (1931): 315.
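A small worked example may help make the lifting idea concrete. The toy system below is a common textbook-style illustration and does not come from the slides. The constants `a`, `b`, `c` and the function names are assumptions. The state update is non-linear in $(x_1, x_2)$, but after adding the observable $x_1^2$ the evolution becomes exactly linear.

```python
import numpy as np

a, b, c = 0.9, 0.5, 1.0   # hypothetical constants of the toy system

def step_nonlinear(x):
    """Non-linear update: x1 <- a*x1, x2 <- b*x2 + c*x1^2."""
    x1, x2 = x
    return np.array([a * x1, b * x2 + c * x1**2])

def lift(x):
    """Time-invariant non-linear lifting into the observables (x1, x2, x1^2)."""
    x1, x2 = x
    return np.array([x1, x2, x1**2])

# Linear evolution in the lifted space: z_{t+1} = K z_t
K = np.array([[a,   0.0, 0.0],
              [0.0, b,   c],
              [0.0, 0.0, a**2]])

x = np.array([1.0, -0.5])
z = lift(x)
for _ in range(20):
    x = step_nonlinear(x)   # evolve the original non-linear system
    z = K @ z               # evolve the lifted linear system

print(np.allclose(z, lift(x)))
```

Here a finite-dimensional lifting exists by construction. In general the lifted space may need to be much larger.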

slide-20
SLIDE 20

Koopman Theory

Koopman: Cover:

f

slide-21
SLIDE 21

Objective: Koopman

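> Recap: Stable Linear Dynamical System

$$\Omega(\omega t) = \begin{bmatrix} \sin(\omega_1 t) \\ \vdots \\ \sin(\omega_N t) \\ \cos(\omega_1 t) \\ \vdots \\ \cos(\omega_N t) \end{bmatrix}$$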

slide-22
SLIDE 22

Objectives

Koopman:

$$E(\Theta, \omega) = \sum_{t=1}^{T} \left( x_t - f_\Theta(\Omega(\omega t)) \right)^2$$

Fourier:

$$E(A, \omega) = \sum_{t=1}^{T} \left( x_t - A\,\Omega(\omega t) \right)^2$$

slide-23
SLIDE 23

Objectives

Koopman:

$$E(\Theta, \omega) = \sum_{t=1}^{T} \left( x_t - f_\Theta(\Omega(\omega t)) \right)^2$$

slide-24
SLIDE 24

Objective: Koopman

Koopman:

$$E(\Theta, \omega) = \sum_{t=1}^{T} \left( x_t - f_\Theta(\Omega(\omega t)) \right)^2$$

$f_\Theta$: Neural Network parameterized by $\Theta$
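The Koopman objective can be written down directly once an architecture for $f_\Theta$ is chosen. The sketch below uses a one-hidden-layer tanh network in plain NumPy purely as a stand-in. The architecture, the parameter layout `theta = (W1, b1, W2, b2)`, the function names, and the toy data are all assumptions rather than details from the slides.

```python
import numpy as np

def omega_features(omega, t):
    """Omega(omega t): sin and cos of every frequency, shape (T, 2N)."""
    phase = np.outer(t, omega)                        # shape (T, N)
    return np.concatenate([np.sin(phase), np.cos(phase)], axis=1)

def f_theta(features, theta):
    """Hypothetical one-hidden-layer network standing in for f_Theta."""
    W1, b1, W2, b2 = theta
    hidden = np.tanh(features @ W1 + b1)
    return hidden @ W2 + b2

def koopman_loss(x, theta, omega, t):
    """E(Theta, omega) = sum_t (x_t - f_Theta(Omega(omega t)))^2."""
    pred = f_theta(omega_features(omega, t), theta)
    return np.sum((x - pred) ** 2)

rng = np.random.default_rng(0)
N, hidden_dim, dim_x = 1, 16, 1
theta = (rng.normal(size=(2 * N, hidden_dim)), np.zeros(hidden_dim),
         0.1 * rng.normal(size=(hidden_dim, dim_x)), np.zeros(dim_x))

t = np.arange(1, 101)
x = (np.sin(2 * np.pi * t / 24) ** 17)[:, None]       # toy data, shape (T, 1)
omega = np.array([2 * np.pi / 24])

E = koopman_loss(x, theta, omega, t)                  # loss of the untrained network
pred = f_theta(omega_features(omega, t), theta)
```

Whatever the weights are, the prediction is a function of $\sin(\omega t)$ and $\cos(\omega t)$ only, so with a single frequency it stays periodic in $t$ with period $2\pi/\omega$.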

slide-25
SLIDE 25

Objective: Koopman

Koopman:

$$E(\Theta, \omega) = \sum_{t=1}^{T} \left( x_t - f_\Theta(\Omega(\omega t)) \right)^2$$

Because of non-linearity, no analytical solution for $\omega_i$

slide-26
SLIDE 26

Objective: Koopman

Koopman:

$$E(\Theta, \omega) = \sum_{t=1}^{T} \left( x_t - f_\Theta(\Omega(\omega t)) \right)^2$$

However, in spite of non-linearity and non-convexity, computing global optima in the direction of $\omega_i$ is possible!

slide-27
SLIDE 27

Objective: Koopman

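Koopman:

$$E(\Theta, \omega) = \sum_{t=1}^{T} \left( x_t - f_\Theta(\Omega(\omega t)) \right)^2 = \sum_{t=1}^{T} L(\Theta, \omega, t)$$

$$L(\Theta, \omega, t) = \left( x_t - f_\Theta(\Omega(\omega t)) \right)^2$$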

slide-28
SLIDE 28

Periodicity in loss

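$$L\left(\Theta, \omega + \tfrac{2\pi}{t}, t\right) = \left( x_t - f_\Theta\left(\Omega\left(\left(\omega + \tfrac{2\pi}{t}\right) t\right)\right) \right)^2$$

$$= \left( x_t - f_\Theta(\Omega(\omega t)) \right)^2$$

$$= L(\Theta, \omega, t)$$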

slide-29
SLIDE 29

Periodicity in loss

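$$L(\Theta, \omega, t) = L\left(\Theta, \omega + \tfrac{2\pi}{t}, t\right)$$

$$\sin\left(\left(\omega + \tfrac{2\pi}{t}\right) t\right) = \sin(\omega t + 2\pi) = \sin(\omega t)$$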

slide-30
SLIDE 30

Periodicity in loss

$$L(\Theta, \omega, t) = L\left(\Theta, \omega + \tfrac{2\pi}{t}, t\right)$$
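The periodicity holds for any $f_\Theta$, because the frequency only enters through $\sin(\omega t)$ and $\cos(\omega t)$. A quick numerical check is sketched below. The non-linear function `f` is an arbitrary stand-in for $f_\Theta$, and the values of `x_t` and `omega` are made up for illustration.

```python
import numpy as np

def local_loss(x_t, omega, t, f):
    """Temporally local loss L(Theta, omega, t) for a scalar snapshot x_t."""
    features = np.array([np.sin(omega * t), np.cos(omega * t)])
    return (x_t - f(features)) ** 2

def f(z):
    """Arbitrary non-linear stand-in for f_Theta (an assumption, not a trained network)."""
    return np.tanh(2.0 * z[0] - z[1]) + z[0] * z[1]

x_t, omega = 0.3, 0.41
# Largest gap between L(omega, t) and L(omega + 2*pi/t, t) over a range of t.
max_gap = max(
    abs(local_loss(x_t, omega, t, f) - local_loss(x_t, omega + 2 * np.pi / t, t, f))
    for t in range(1, 51)
)
print(max_gap)
```

The gap is zero up to floating-point rounding, which is why each temporally local loss only needs to be evaluated on one period of length $2\pi/t$.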

slide-31
SLIDE 31

Computing the loss

For all $t$, compute the loss within one period $\frac{2\pi}{t}$

slide-32
SLIDE 32

Computing the loss

For all $t$, repeat the computed loss $t$ times

slide-33
SLIDE 33

Computing the loss

For all $t$, resample the loss

slide-34
SLIDE 34

Computing the loss

Sum all ‘temporally local’ losses

slide-35
SLIDE 35

Computing the loss

[Figure: the temporally local losses add up to the global loss]

slide-36
SLIDE 36

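Computing the loss

Easy and efficient to implement in the frequency domain!

    # Pseudocode; assumes L[t] is the loss of snapshot t sampled at K frequencies within one period 2*pi/t
    for t in range(T):
        E_ft[np.arange(K) * t] += np.fft.fft(L[t])  # harmonic k of L[t] becomes harmonic k*t globally
    E = np.fft.ifft(E_ft)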
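A runnable variant of the frequency-domain summation is sketched below. It is one possible reading of the slide, not the authors' implementation. It uses the real FFT, lets $t$ run from 1 to $T$, puts the global grid at $N = K \cdot T$ frequencies, and adds an explicit $N/K$ scaling. The function name `global_loss_surface`, these choices, and the toy loss with a fixed $f_\Theta$ are assumptions.

```python
import numpy as np

def global_loss_surface(local_loss, T, K=8):
    """Sum temporally local losses over t = 1..T on a grid of N = K*T frequencies.

    local_loss(omega, t) must be 2*pi/t-periodic in omega and is assumed to be
    band-limited to fewer than K/2 harmonics of t*omega.
    """
    N = K * T
    E_ft = np.zeros(N // 2 + 1, dtype=complex)
    k = np.arange(K // 2 + 1)
    for t in range(1, T + 1):
        omega_local = 2 * np.pi * np.arange(K) / (K * t)    # K samples of one period
        coeffs = np.fft.rfft(local_loss(omega_local, t))
        # Repeating the period t times moves local harmonic k to global harmonic k*t.
        E_ft[k * t] += (N / K) * coeffs
    omegas = 2 * np.pi * np.arange(N) / N
    return omegas, np.fft.irfft(E_ft, n=N)

# Toy check: f_Theta is fixed to "take the sine component", so the loss is band-limited.
T = 48
x = np.sin(2 * np.pi * np.arange(1, T + 1) / 24)

def loss(omega, t):
    return (x[t - 1] - np.sin(omega * t)) ** 2

omegas, E = global_loss_surface(loss, T)
E_direct = sum(loss(omegas, t) for t in range(1, T + 1))    # brute-force reference
best = omegas[np.argmin(E)]
print(np.allclose(E, E_direct), best)
```

For a loss with more harmonics, such as one produced by a neural network, `K` would need to be larger, and the result is an interpolation of the error surface rather than an exact evaluation.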

slide-37
SLIDE 37

Results

slide-38
SLIDE 38

Results: Theoretical

> Fourier algorithm has universal approximation properties on finite datasets
  > Sines and cosines form an orthogonal basis
  > which is periodic in $T$
  > Analogous to Cover's theorem, requires an $N$-dimensional space

slide-39
SLIDE 39

Results: Theoretical

> For infinite data, the Koopman algorithm is more expressive than its Fourier counterpart

slide-40
SLIDE 40

Results: Theoretical

> Close relationship to Bayesian spectral analysis
  > Error grows linearly in time and with noise variance
  > But shrinks superlinearly with amount of data

Jaynes, E. T. "Bayesian spectrum and chirp analysis." Maximum-Entropy and Bayesian Spectral Analysis and Estimation Problems. Springer, Dordrecht, 1987. 1-37.
Bretthorst, G. Larry. Bayesian spectrum analysis and parameter estimation. Vol. 48. Springer Science & Business Media, 2013.

$$\left| \hat{x}_t(\omega) - \hat{x}_t(\omega^*) \right| \in \mathcal{O}\left( \frac{t}{T^3} \sum_i \frac{\sigma^2}{A_i} \right)$$

slide-41
SLIDE 41

Results: Practical

$$x_t = \sin\left(\frac{2\pi}{24} t\right)^{17} + \epsilon_t$$
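The toy signal is generated by a single oscillator with period 24, but the 17th power spreads its energy over many harmonics. In continuous time these are the odd harmonics up to the 17th. With 24 samples per period, those above the 12th alias onto lower odd harmonics. The sketch below generates the signal and inspects its spectrum. The helper name `toy_signal`, the window length, and the noise-free setting are assumptions.

```python
import numpy as np

def toy_signal(T, noise_std=0.0, seed=0):
    """x_t = sin(2*pi*t/24)^17 + eps_t with Gaussian noise of the given standard deviation."""
    rng = np.random.default_rng(seed)
    t = np.arange(T)
    return np.sin(2 * np.pi * t / 24) ** 17 + noise_std * rng.normal(size=T)

T = 240                                    # 10 full periods, so harmonics land on FFT bins
amp = np.abs(np.fft.rfft(toy_signal(T))) / (T / 2)

# Bin 10*h corresponds to the h-th harmonic of the base frequency 2*pi/24.
harmonics = {h: amp[10 * h] for h in range(1, 13)}
for h, value in harmonics.items():
    print(h, round(float(value), 6))
```

A linear model in sines and cosines needs several frequencies to represent this signal, whereas a non-linear function of one oscillator can in principle represent it with a single frequency.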

slide-42
SLIDE 42

Results: Practical

slide-43
SLIDE 43

Results: Practical

slide-44
SLIDE 44

Results: Practical

slide-45
SLIDE 45

Results: Practical

slide-46
SLIDE 46

Spatio-Temporal Systems

slide-47
SLIDE 47

Summary

> Fit linear and non-linear oscillators to data
  > non-convex and non-linear objective
> Many real world phenomena are quasi-periodic
  > gait, (space) weather, fluid flows, epidemiological data, power systems, sales, room occupancy, …
> Code is available:
  > https://github.com/helange23/from_fourier_to_koopman