

SLIDE 1

Low Rank Approximation Lecture 10

Daniel Kressner
Chair for Numerical Algorithms and HPC
Institute of Mathematics, EPFL
daniel.kressner@epfl.ch


SLIDE 2

Time-dependent low-rank approximation

Goal: Approximate a given matrix A(t) at every time t ∈ [0, T] by a rank-r matrix. Motivation for taking time-dependence into account:

◮ If σ_r(t) > σ_{r+1}(t) for all t ∈ [0, T], then T_r(A(t)) inherits the smoothness of A with respect to t [Baumann/Helmke'2003; Mehrmann/Rath'1993].

◮ A low-rank approximation at t_i may yield valuable information for the low-rank approximation at a nearby t_{i+1}.

◮ Allows us to work with the tangent space (linear) instead of the manifold (nonlinear). Much easier to impose additional structure.

More general setting: Approximate the solution of the ODE

  Ȧ(t) = F(A(t)),  A(0) = A_0,

by a trajectory X(t) in M_r. Main application for low-rank tensors: discretized time-dependent high-dimensional PDEs.
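Here T_r denotes the best rank-r approximation, obtained from the truncated SVD (Eckart–Young). A minimal sketch (the function name is ours, not from the lecture):

```python
import numpy as np

def truncated_svd(A, r):
    """Best rank-r approximation T_r(A) of A via the truncated SVD."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    # keep the r dominant singular triplets
    return (U[:, :r] * s[:r]) @ Vt[:r, :]
```

In the time-dependent setting this would be applied at each t separately; the following slides aim at cheaper updates that reuse information across nearby times.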


SLIDE 3

Dynamical low-rank approximation

Recall the definition of tangent vectors: if X : [0, T] → R^{m×n} is a smooth curve in M_r, then Ẋ(t) ∈ T_{X(t)}M_r for all t ∈ [0, T].

Dynamical low-rank approximation: Given A : [0, T] → R^{m×n}, for every t construct X(t) ∈ M_r satisfying Ẋ(t) ∈ T_{X(t)}M_r such that

  ‖Ẋ(t) − Ȧ(t)‖ = min!

◮ The differential equation needs to be supplemented by an initial value, say X(0) = T_r(A(0)), which is in M_r unless A(0) has rank less than r.

◮ Hope: X stays close to A if A admits a good rank-r approximation for all t.

◮ For efficiency, aim at finding an equivalent system of ≈ dim M_r differential equations.
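As a toy illustration of this setup (our own sketch, not part of the lecture), one can integrate the projected ODE Ẋ = P_X(Ȧ) with forward Euler, recomputing the tangent-space projection from an SVD of the current iterate. For the benign example A(t) = e^t·C with C of rank r, the trajectory X(t) should track e^t·C up to the Euler error:

```python
import numpy as np

def tangent_project(X, Z, r):
    """Project Z onto the tangent space of the rank-r manifold at X
    (assumes X has rank r; U, V taken from the SVD of X)."""
    U, s, Vt = np.linalg.svd(X, full_matrices=False)
    U, V = U[:, :r], Vt[:r, :].T
    ZV = Z @ V
    return ZV @ V.T + U @ (U.T @ Z) - U @ (U.T @ ZV) @ V.T

def euler_dlra(C, r, h, n_steps):
    """Forward Euler on X' = P_X(A') for A(t) = exp(t) * C, X(0) = C."""
    X = C.copy()
    for k in range(n_steps):
        Adot = np.exp(k * h) * C   # time derivative of A at t_k
        X = X + h * tangent_project(X, Adot, r)
    return X
```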


SLIDE 4

Dynamical low-rank approximation

Counterexample to this hope:

  A(t) = diag(1, 10^{-5}e^{t−1}, 10^{-5}e^{1−t}),  t ∈ [0, 10].

For r = 2, dynamical low-rank approximation yields

  X(t) = diag(1, 0, 10^{-5}e^{1−t}).

One obtains ‖A(T) − X(T)‖_F = 10^{4} but ‖A(T) − T_r(A(T))‖_F = 10^{−14}. Dynamical low-rank approximation is "blind" to the evolution of 10^{-5}e^{t−1}. One can only hope for a good approximation on short time intervals and if σ_r(t) > σ_{r+1}(t).


SLIDE 5

Dynamical low-rank approximation

The condition Ẋ(t) ∈ T_{X(t)}M_r such that ‖Ẋ(t) − Ȧ(t)‖ = min! is equivalent to the differential equation

  Ẋ(t) = P_{X(t)}(Ȧ(t)),    (1)

where P_{X(t)} is the orthogonal projection onto T_{X(t)}M_r. We now omit the dependence on time. Given X = USV^T with S ∈ R^{r×r} (not necessarily diagonal), U ∈ St(m, r), and V ∈ St(n, r), we have

  P_X(Ȧ) = P_U Ȧ P_V + P_U^⊥ Ȧ P_V + P_U Ȧ P_V^⊥
          = U Ṡ V^T + U̇ S V^T + U S V̇^T,

where

  Ṡ = U^T Ȧ V,
  U̇ = P_U^⊥ Ȧ V S^{−1},
  V̇ = P_V^⊥ Ȧ^T U S^{−T}.
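These factor formulas are easy to check numerically. A sketch with our own naming, which also confirms that the three terms sum to the tangent-space projection P_X(Ȧ) = ȦVV^T + UU^TȦ − UU^TȦVV^T:

```python
import numpy as np

def factor_rhs(U, S, V, Adot):
    """Right-hand sides of the factored ODE system:
    S' = U^T A' V,  U' = (I - UU^T) A' V S^{-1},  V' = (I - VV^T) A'^T U S^{-T}."""
    Sdot = U.T @ Adot @ V
    Udot = (Adot @ V - U @ Sdot) @ np.linalg.inv(S)
    Vdot = (Adot.T @ U - V @ Sdot.T) @ np.linalg.inv(S).T
    return Sdot, Udot, Vdot
```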


SLIDE 6

Dynamical low-rank approximation

Dynamical low-rank approximation is equivalent to the system of differential equations

  Ṡ = U^T Ȧ V,
  U̇ = P_U^⊥ Ȧ V S^{−1},
  V̇ = P_V^⊥ Ȧ^T U S^{−T},

with initial values S(0) = S_0 ∈ R^{r×r}, U(0) = U_0 ∈ St(m, r), V(0) = V_0 ∈ St(n, r).

Remarks:

◮ U̇ ∈ T_U St(m, r) and thus U stays in St(m, r); V̇ ∈ T_V St(n, r) and thus V stays in St(n, r). This can be preserved numerically using geometric integration [Hairer/Lubich/Wanner'2006] or by combining standard integrators with retractions.

◮ Step size control should aim at controlling the error for USV^T and not for the individual factors.

◮ The presence of S^{−1} makes the differential equations very stiff for ill-conditioned S (σ_r close to zero); this results in small step sizes for explicit integrators.


SLIDE 7

Dynamical low-rank approximation

Error bound from Theorem 5.1 in [Koch/Lubich'2007].

Theorem. Suppose that for all t ∈ [0, T] we have

◮ σ_r(A(t)) > σ_{r+1}(A(t));
◮ ‖A(t) − T_r(A(t))‖_F ≤ ρ/16 with ρ = min_{t∈[0,T]} σ_r(A(t)).

Then, with X(0) = T_r(A(0)), we have

  ‖X(t) − T_r(A(t))‖_F ≤ 2β e^{βt} ∫_0^t ‖A(s) − T_r(A(s))‖_F ds,

where β = 8µ/ρ and µ is a bound on ‖Ȧ(t)‖_F on [0, T].


SLIDE 8

Dynamical low-rank approximation

Extension to a dynamical system Ȧ = F(A) with F : R^{m×n} → R^{m×n}.

Dynamical low-rank approximation: Construct X(t) ∈ M_r satisfying Ẋ(t) ∈ T_{X(t)}M_r such that ‖Ẋ(t) − F(X(t))‖ = min! This is equivalent to the dynamical system Ẋ(t) = P_{X(t)}(F(X(t))), i.e., to

  Ṡ = U^T F(X(t)) V,
  U̇ = P_U^⊥ F(X(t)) V S^{−1},
  V̇ = P_V^⊥ F(X(t))^T U S^{−T}.

EFY. Consider the linear matrix differential equation Ȧ = LA + AR for L(t) ∈ R^{m×m} and R(t) ∈ R^{n×n}. Show that A(t) ∈ M_r for all t > 0 if A(0) ∈ M_r. Write down the differential equations for U, S, V.

EFY. Consider the nonlinear matrix differential equation Ȧ = F(A) := LA + AR + A ◦ A, where ◦ denotes the elementwise product. Develop an efficient method for evaluating F(A(t)) for A(t) ∈ M_r when r ≪ m, n.
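For the second EFY, one possible approach (a sketch with our own naming, not the lecture's model solution): for A = UV^T, both LA and AR keep rank ≤ r, while the Hadamard square A ◦ A has row-wise Khatri–Rao factors and rank ≤ r², so F(A) can be kept in factored form of rank ≤ 2r + r² without ever forming the m×n matrix A:

```python
import numpy as np

def F_factored(L, R, U, V):
    """Return (W, Z) with W Z^T = L A + A R + A∘A for A = U V^T.
    The Hadamard square uses row-wise Khatri-Rao factors of U and V."""
    m, r = U.shape
    n = V.shape[0]
    UU = np.einsum('ij,ik->ijk', U, U).reshape(m, r * r)  # row a: kron(U[a], U[a])
    VV = np.einsum('ij,ik->ijk', V, V).reshape(n, r * r)
    W = np.hstack([L @ U, U, UU])
    Z = np.hstack([V, R.T @ V, VV])
    return W, Z
```

The factors cost O((m + n)r²) to form, plus the products with L and R.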

SLIDE 9

Dynamical low-rank approximation

For the error analysis, assume for Y(t) := T_r(A(t)) that

◮ ‖F(Y(t))‖_F ≤ µ and ‖F(X(t))‖_F ≤ µ for t ∈ [0, T];
◮ ⟨F(Z_1) − F(Z_2), Z_1 − Z_2⟩ ≤ λ ‖Z_1 − Z_2‖_F^2 for all Z_1, Z_2 ∈ M_r and some fixed λ ∈ R;
◮ ‖F(Y(t)) − F(A(t))‖_F ≤ L ‖Y(t) − A(t)‖_F for t ∈ [0, T].

Theorem 6.1 in [Koch/Lubich'2007]:

Theorem. Under the assumptions above, suppose that for all t ∈ [0, T] we have

◮ σ_r(A(t)) > σ_{r+1}(A(t));
◮ ‖A(t) − T_r(A(t))‖_F ≤ ρ/16 with ρ = min_{t∈[0,T]} σ_r(A(t)).

Then, with X(0) = T_r(A(0)), we have

  ‖X(t) − T_r(A(t))‖_F ≤ (2β + L) e^{(2β+λ)t} ∫_0^t ‖A(s) − T_r(A(s))‖_F ds.


SLIDE 10

Dynamical low-rank approximation

Numerical integrators face severe problems as σ_r → 0. Unfortunately, σ_r ≈ 0 is a very likely situation, as problems of interest typically feature quick singular value decay. Regularization (= increasing small singular values of S by ε > 0) introduces additional errors whose effect is poorly understood.

Fundamental underlying problem: U, V are ill-conditioned in the presence of small σ_r.

Idea by [Lubich/Oseledets'2014]: a splitting integrator that is insensitive to σ_r → 0.


SLIDE 11

Projector-splitting integrator

Recall that Ẋ(t) = P_{X(t)}(Ȧ(t)), with

  P_X(Z) = P_U Z P_V + P_U^⊥ Z P_V + P_U Z P_V^⊥
         = Z V V^T − U U^T Z V V^T + U U^T Z
         = Z P_{range(X^T)} − P_{range(X)} Z P_{range(X^T)} + P_{range(X)} Z.

One step of the Lie–Trotter splitting integrator from t_0 to t_1 = t_0 + h applied to this decomposition:

1. Solve Ẋ_I = Ȧ P_{range(X_I^T)} on [t_0, t_1] with initial value X_I(t_0) = X_0 ∈ M_r.
2. Solve Ẋ_II = −P_{range(X_II)} Ȧ P_{range(X_II^T)} on [t_0, t_1] with initial value X_II(t_0) = X_I(t_1).
3. Solve Ẋ_III = P_{range(X_III)} Ȧ on [t_0, t_1] with initial value X_III(t_0) = X_II(t_1).

The approximation returned is X_1 := X_III(t_1). Standard theory [Hairer/Lubich/Wanner'2006]: the Lie–Trotter splitting integrator has convergence order one.


SLIDE 12

Projector-splitting integrator

Consider the first step: the right-hand side satisfies Ȧ P_{range(X_I^T)} ∈ T_{X_I}M_r and hence X_I ∈ M_r for t ∈ [t_0, t_1]. Consider a factorization X_I = U_I S_I V_I^T. Differentiation gives

  ∂/∂t (U_I S_I) V_I^T + U_I S_I V̇_I^T = Ẋ_I = Ȧ V_I V_I^T.

This is satisfied if

  ∂/∂t (U_I S_I) = Ȧ V_I,  V̇_I = 0.

In particular, a primitive of Ȧ V_I is A V_I. In turn, this equation is solved by

  U_I(t) S_I(t) = U_I(t_0) S_I(t_0) + (A(t) − A(t_0)) V_I(t_0).

Similar considerations apply to the second and third step.


SLIDE 13

Projector-splitting integrator

The solutions of the three split equations are given by

1. U_I(t) S_I(t) = U_I(t_0) S_I(t_0) + (A(t) − A(t_0)) V_I(t_0),
2. S_II(t) = S_II(t_0) − U_II(t_0)^T (A(t) − A(t_0)) V_II(t_0),
3. V_III(t) S_III(t)^T = V_III(t_0) S_III(t_0)^T + (A(t) − A(t_0))^T U_III(t_0),

where the quantities without a differential equation (V_I, U_II, V_II, U_III) remain constant.

Practical algorithm: Given X_0 = U_0 S_0 V_0^T and with ∆A := A(t_1) − A(t_0), compute

1. a QR factorization K_I = U_I S_I for K_I = U_0 S_0 + ∆A V_0;
2. S_II = S_I − U_I^T ∆A V_0;
3. a QR factorization L_III = V_III S_III^T for L_III = V_0 S_II^T + ∆A^T U_I.

This returns X_1 = U_1 S_1 V_1^T = U_I S_III V_III^T.
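The practical algorithm above translates directly into code; a minimal sketch (our implementation, assuming the factors of X_0 and the increment ∆A are given as dense arrays):

```python
import numpy as np

def ksl_step(U0, S0, V0, dA):
    """One projector-splitting (KSL) step: X_0 = U0 S0 V0^T, dA = A(t1) - A(t0).
    Returns the factors U1, S1, V1 of X_1."""
    # K-step: QR factorization K_I = U_I S_I
    K = U0 @ S0 + dA @ V0
    U1, S_hat = np.linalg.qr(K)
    # S-step: note the minus sign
    S_tilde = S_hat - U1.T @ (dA @ V0)
    # L-step: QR factorization L_III = V_III S_III^T
    Lmat = V0 @ S_tilde.T + dA.T @ U1
    V1, S1T = np.linalg.qr(Lmat)
    return U1, S1T.T, V1
```

The exactness property stated on the next slide makes a convenient unit test: for A(t) = e^t·C with C of rank r, one step reproduces A(t_1) exactly.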


SLIDE 14

Projector-splitting integrator

Remarks:

◮ The KSL splitting is exact if A(t) has rank at most r (Theorem 4.1 by Lubich/Oseledets); robustness in the presence of a small singular value σ_r is shown in [Kieri/Lubich/Wallach'2016].

◮ Symmetrization (KSLSK) leads to second order.

◮ The splitting extends to Ȧ = F(A) by setting ∆A := h F(Y_0) (but the exactness result does not hold then).


SLIDE 15

Further literature

◮ Extensions to Tucker [Koch/Lubich'2010, Lubich/Vandereycken/Wallach'2018] and tensor train [Lubich/Oseledets/Vandereycken'2015].

◮ Applications to 3D nonlinear PDEs [Nonnenmacher/Lubich'2009], matrix differential equations [Mena et al.'2018], the Vlasov–Poisson equation [Einkemmer/Lubich'2018], time-dependent PDEs with uncertainties [Musharbash/Nobile/Zhou'2014], quantum chemistry [Kloss/Burghardt/Lubich'2017], and physics [Haegeman et al.'2016].

◮ [Kieri/Vandereycken'2017]: a conceptually simpler approach combining standard integrators with low-rank truncation.
