

  1. Low Rank Approximation, Lecture 10. Daniel Kressner, Chair for Numerical Algorithms and HPC, Institute of Mathematics, EPFL. daniel.kressner@epfl.ch

  2. Time-dependent low-rank approximation. Goal: Approximate a given matrix A(t) at every time t ∈ [0, T] by a rank-r matrix. Motivation for taking time-dependence into account:
◮ If σ_r(t) > σ_{r+1}(t) for all t ∈ [0, T], then T_r(A(t)) inherits the smoothness of A with respect to t [Baumann/Helmke'2003; Mehrmann/Rath'1993].
◮ A low-rank approximation at t_i may yield valuable information for the low-rank approximation at a nearby t_{i+1}.
◮ Allows us to work with the tangent space (linear) instead of the manifold (nonlinear). Much easier to impose additional structure.
More general setting: Approximate the solution of the ODE Ȧ(t) = F(A(t)), A(0) = A_0, by a trajectory X(t) in M_r. Main application for low-rank tensors: discretized time-dependent high-dimensional PDEs.
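The truncation operator T_r used throughout is the best rank-r approximation in the Frobenius norm, obtained from a truncated SVD (Eckart-Young). A minimal NumPy sketch (the function name `truncate` is ours, not from the lecture):

```python
import numpy as np

def truncate(A, r):
    """Best rank-r approximation T_r(A) in the Frobenius norm,
    computed from the truncated SVD (Eckart-Young)."""
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    return U[:, :r] @ np.diag(s[:r]) @ Vt[:r, :]

# Approximate a matrix at one time instant.
rng = np.random.default_rng(0)
A = rng.standard_normal((6, 5))
X = truncate(A, 2)
assert np.linalg.matrix_rank(X) <= 2
# Eckart-Young: the error equals the norm of the discarded singular values.
s = np.linalg.svd(A, compute_uv=False)
assert np.isclose(np.linalg.norm(A - X, 'fro'), np.sqrt(np.sum(s[2:]**2)))
```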

  3. Dynamical low-rank approximation. Recall the definition of tangent vectors: if X : [0, T] → R^{m×n} is a smooth curve in M_r, then Ẋ(t) ∈ T_{X(t)} M_r for all t ∈ [0, T]. Dynamical low-rank approximation: Given A : [0, T] → R^{m×n}, for every t construct X(t) ∈ M_r satisfying
  Ẋ(t) ∈ T_{X(t)} M_r such that ‖Ẋ(t) − Ȧ(t)‖ = min!
◮ The differential equation needs to be supplemented by an initial value, say X(0) = T_r(A(0)), which is in M_r unless A(0) has rank less than r.
◮ Hope: X stays close to A if A admits a good rank-r approximation for all t.
◮ For efficiency, aim at finding an equivalent system of ≈ dim M_r differential equations.

  4. Dynamical low-rank approximation. Counterexample to this hope:
  A(t) = diag( 1, 10^{-5} e^{t-1}, 10^{-5} e^{1-t} ),  t ∈ [0, 10].
For r = 2, dynamical low-rank approximation yields
  X(t) = diag( 1, 0, 10^{-5} e^{1-t} ),
but ‖A(T) − X(T)‖_F = 10^4 whereas ‖A(T) − T_r(A(T))‖_F = 10^{-14}. Dynamical low-rank approximation is "blind" to the evolution of the entry 10^{-5} e^{t-1}. Can only hope for a good approximation on short time intervals and if σ_r(t) > σ_{r+1}(t).

  5. Dynamical low-rank approximation. The condition
  Ẋ(t) ∈ T_{X(t)} M_r such that ‖Ẋ(t) − Ȧ(t)‖ = min!
is equivalent to the differential equation
  Ẋ(t) = P_{X(t)}( Ȧ(t) ),   (1)
where P_{X(t)} is the orthogonal projection onto T_{X(t)} M_r. Will now omit dependence on time. Given X = U S V^T with S ∈ R^{r×r} (not necessarily diagonal), U ∈ St(m, r) and V ∈ St(n, r), we have
  P_X(Ȧ) = P_U Ȧ P_V + P_U^⊥ Ȧ P_V + P_U Ȧ P_V^⊥ = U Ṡ V^T + U̇ S V^T + U S V̇^T,
where
  Ṡ = U^T Ȧ V,
  U̇ = P_U^⊥ Ȧ V S^{-1},
  V̇ = P_V^⊥ Ȧ^T U S^{-T}.
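The two expressions for P_X(Ȧ) can be checked against each other numerically. A small sketch (all variable names are ours), assuming a well-conditioned core factor S:

```python
import numpy as np

rng = np.random.default_rng(1)
m, n, r = 7, 6, 3

# A point X = U S V^T on the rank-r manifold (S invertible, not nec. diagonal).
U, _ = np.linalg.qr(rng.standard_normal((m, r)))
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
S = rng.standard_normal((r, r)) + 3 * np.eye(r)   # keep S well-conditioned
Z = rng.standard_normal((m, n))                    # direction, plays role of dA/dt

# Factored form of the tangent-space projection.
S_dot = U.T @ Z @ V
U_dot = (Z @ V - U @ (U.T @ Z @ V)) @ np.linalg.inv(S)       # P_U^perp Z V S^{-1}
V_dot = (Z.T @ U - V @ (V.T @ Z.T @ U)) @ np.linalg.inv(S).T # P_V^perp Z^T U S^{-T}
P_factored = U @ S_dot @ V.T + U_dot @ S @ V.T + U @ S @ V_dot.T

# Projection written directly with the orthogonal projectors P_U, P_V.
Pu, Pv = U @ U.T, V @ V.T
P_direct = Z @ Pv + Pu @ Z - Pu @ Z @ Pv

assert np.allclose(P_factored, P_direct)
```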

  6. Dynamical low-rank approximation. Dynamical low-rank approximation is equivalent to the system of differential equations
  Ṡ = U^T Ȧ V,
  U̇ = P_U^⊥ Ȧ V S^{-1},
  V̇ = P_V^⊥ Ȧ^T U S^{-T},
with initial values S(0) = S_0 ∈ R^{r×r}, U(0) = U_0 ∈ St(m, r), V(0) = V_0 ∈ St(n, r). Remarks:
◮ U̇ ∈ T_U St(m, r) and thus U stays in St(m, r); V̇ ∈ T_V St(n, r) and thus V stays in St(n, r). This can be preserved using geometric integration [Hairer/Lubich/Wanner'2006] or by combining standard integrators with retractions.
◮ Step size control should aim at controlling the error for U S V^T and not for the individual factors.
◮ The presence of S^{-1} makes the differential equations very stiff for ill-conditioned S (σ_r close to zero); this results in small step sizes for explicit integrators.
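The retraction remark can be illustrated in a few lines (a sketch with our variable names, not a full integrator): since U^T U̇ = 0, an explicit Euler step leaves the Stiefel manifold only at O(h²), and a QR factorization retracts back:

```python
import numpy as np

rng = np.random.default_rng(2)
m, n, r, h = 8, 7, 2, 0.1

U, _ = np.linalg.qr(rng.standard_normal((m, r)))
V, _ = np.linalg.qr(rng.standard_normal((n, r)))
S = np.diag([2.0, 1.0])                  # current (invertible) core factor
A_dot = rng.standard_normal((m, n))      # stand-in for the data velocity dA/dt

# Right-hand side of the factored ODE for U; note the S^{-1}.
U_dot = (np.eye(m) - U @ U.T) @ A_dot @ V @ np.linalg.inv(S)

# Tangency: U^T U_dot = 0, so U + h*U_dot is orthonormal only up to O(h^2).
assert np.allclose(U.T @ U_dot, 0)

# Explicit Euler step followed by a QR-based retraction back onto St(m, r).
U1, R = np.linalg.qr(U + h * U_dot)
assert np.allclose(U1.T @ U1, np.eye(r))
```

A full method would propagate S and V analogously; the point here is only that the QR retraction restores exact orthonormality after each step.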

  7. Dynamical low-rank approximation. Error bound from Theorem 5.1 in [Koch/Lubich'2007]. Theorem. Suppose that ‖Ȧ(t)‖_F ≤ µ and that for t ∈ [0, T] we have
◮ σ_r(A(t)) > σ_{r+1}(A(t));
◮ ‖A(t) − T_r(A(t))‖_F ≤ ρ/16 with ρ = min_{t ∈ [0,T]} σ_r(A(t)).
Then, with X(0) = T_r(A(0)), we have
  ‖X(t) − T_r(A(t))‖_F ≤ 2β e^{βt} ∫_0^t ‖A(s) − T_r(A(s))‖_F ds,
where β = 8µ/ρ.

  8. Dynamical low-rank approximation. Extension to a dynamical system Ȧ = F(A) with F : R^{m×n} → R^{m×n}. Dynamical low-rank approximation: Construct X(t) ∈ M_r satisfying
  Ẋ(t) ∈ T_{X(t)} M_r such that ‖Ẋ(t) − F(X(t))‖ = min!
Equivalent to the dynamical system Ẋ(t) = P_{X(t)}( F(X(t)) ). Equivalent to
  Ṡ = U^T F(X(t)) V,
  U̇ = P_U^⊥ F(X(t)) V S^{-1},
  V̇ = P_V^⊥ F(X(t))^T U S^{-T}.
EFY. Consider the linear matrix differential equation Ȧ = LA + AR for L(t) ∈ R^{m×m} and R(t) ∈ R^{n×n}. Show that A(t) ∈ M_r for all t > 0 if A(0) ∈ M_r. Write down the differential equations for U, S, V.
EFY. Consider the nonlinear matrix differential equation Ȧ = F(A) := LA + AR + A ∘ A, where ∘ denotes the elementwise product. Develop an efficient method for evaluating F(A(t)) for A(t) ∈ M_r when r ≪ m, n.
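One possible approach to the second EFY (a sketch in our notation; the helper names are ours): with A = U V^T, both L A and A R stay rank r, and the entries of A ∘ A expand as sums over pairs of columns, so its factors are row-wise Khatri-Rao products. F(A) can thus be kept in factored form of rank at most 2r + r² without ever forming an m × n matrix (in practice one would retruncate afterwards):

```python
import numpy as np

def khatri_rao_rows(U1, U2):
    """Row-wise Khatri-Rao product: row i is kron(U1[i], U2[i])."""
    m, r1 = U1.shape
    _, r2 = U2.shape
    return (U1[:, :, None] * U2[:, None, :]).reshape(m, r1 * r2)

def F_lowrank(L, R, U, V):
    """Factors (W, Z) with W Z^T = L A + A R + A ∘ A for A = U V^T.
    The inner dimension is 2r + r^2; no m-by-n matrix is formed."""
    W = np.hstack([L @ U, U, khatri_rao_rows(U, U)])
    Z = np.hstack([V, R.T @ V, khatri_rao_rows(V, V)])
    return W, Z

rng = np.random.default_rng(3)
m, n, r = 6, 5, 2
L = rng.standard_normal((m, m))
R = rng.standard_normal((n, n))
U = rng.standard_normal((m, r))
V = rng.standard_normal((n, r))

A = U @ V.T
W, Z = F_lowrank(L, R, U, V)
assert np.allclose(W @ Z.T, L @ A + A @ R + A * A)
```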

  9. Dynamical low-rank approximation. For the error analysis, assume for Y(t) := T_r(A(t)) that
◮ ‖F(Y(t))‖_F ≤ µ and ‖F(X(t))‖_F ≤ µ for t ∈ [0, T];
◮ ⟨F(Z_1) − F(Z_2), Z_1 − Z_2⟩ ≤ λ ‖Z_1 − Z_2‖_F^2 for all Z_1, Z_2 ∈ M_r and some fixed λ ∈ R;
◮ ‖F(Y(t)) − F(A(t))‖_F ≤ L ‖Y(t) − A(t)‖_F for t ∈ [0, T].
Theorem 6.1 in [Koch/Lubich'2007]: Theorem. Under the assumptions above, suppose that for t ∈ [0, T] we have
◮ σ_r(A(t)) > σ_{r+1}(A(t));
◮ ‖A(t) − T_r(A(t))‖_F ≤ ρ/16 with ρ = min_{t ∈ [0,T]} σ_r(A(t)).
Then, with X(0) = T_r(A(0)), we have
  ‖X(t) − T_r(A(t))‖_F ≤ (2β + L) e^{(2β + λ)t} ∫_0^t ‖A(s) − T_r(A(s))‖_F ds.

  10. Dynamical low-rank approximation. Numerical integrators face severe problems as σ_r → 0. Unfortunately, σ_r ≈ 0 is a very likely situation, as problems of interest typically feature rapid singular value decay. Regularization (= increasing small singular values of S by ε > 0) introduces additional errors whose effect is poorly understood. Fundamental underlying problem: U and V are ill-conditioned in the presence of small σ_r. Idea by [Lubich/Oseledets'2014]: a splitting integrator that is insensitive to σ_r → 0.

  11. Projector-splitting integrator. Recall that Ẋ(t) = P_{X(t)}( Ȧ(t) ), with
  P_X(Z) = P_U Z P_V + P_U^⊥ Z P_V + P_U Z P_V^⊥
         = Z V V^T − U U^T Z V V^T + U U^T Z
         = Z P_{range(X^T)} − P_{range(X)} Z P_{range(X^T)} + P_{range(X)} Z.
One step of the Lie-Trotter splitting integrator from t_0 to t_1 = t_0 + h applied to this decomposition:
1. Solve Ẋ_I = Ȧ P_{range(X_I^T)} on [t_0, t_1] with i.v. X_I(t_0) = X_0 ∈ M_r.
2. Solve Ẋ_II = − P_{range(X_II)} Ȧ P_{range(X_II^T)} on [t_0, t_1] with i.v. X_II(t_0) = X_I(t_1).
3. Solve Ẋ_III = P_{range(X_III)} Ȧ on [t_0, t_1] with i.v. X_III(t_0) = X_II(t_1).
Approximation returned: X_1 := X_III(t_1). Standard theory [Hairer/Lubich/Wanner'2006]: the Lie-Trotter splitting integrator has convergence order one.

  12. Projector-splitting integrator. Consider the first step: the right-hand side Ȧ P_{range(X_I^T)} ∈ T_{X_I} M_r and hence X_I ∈ M_r for t ∈ [t_0, t_1] → factorization X_I = U_I S_I V_I^T. Differentiation gives
  Ẋ_I = ∂_t(U_I S_I) V_I^T + U_I S_I V̇_I^T,
which must equal Ȧ V_I V_I^T. This is satisfied if
  ∂_t(U_I S_I) = Ȧ V_I,   V̇_I = 0.
In particular, since V_I is constant, a primitive of Ȧ V_I is A V_I. In turn, this equation is solved by
  U_I(t) S_I(t) = U_I(t_0) S_I(t_0) + ( A(t) − A(t_0) ) V_I(t_0).
Similar considerations apply to the second and third steps.

  13. Projector-splitting integrator. Solution of the three split equations:
1. U_I(t) S_I(t) = U_I(t_0) S_I(t_0) + ( A(t) − A(t_0) ) V_I(t_0),
2. S_II(t) = S_II(t_0) − U_II(t_0)^T ( A(t) − A(t_0) ) V_II(t_0),
3. V_III(t) S_III(t)^T = V_III(t_0) S_III(t_0)^T + ( A(t) − A(t_0) )^T U_III(t_0).
Unassigned quantities (V_I, U_II, V_II, U_III) remain constant. Practical algorithm: Given X_0 = U_0 S_0 V_0^T and with ΔA := A(t_1) − A(t_0), compute
1. QR factorization K_I = U_I S_I for K_I = U_0 S_0 + ΔA V_0.
2. S_II = S_I − U_I^T ΔA V_0.
3. QR factorization L_III = V_III S_III^T for L_III = V_0 S_II^T + ΔA^T U_I.
Returns U_1 S_1 V_1^T = U_I S_III V_III^T.
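The practical algorithm fits in a few lines of NumPy (a sketch; function and variable names are ours). The check below also illustrates the exactness property stated in the remarks: if A(t_0) and A(t_1) both have rank r and X_0 = A(t_0), one step generically reproduces A(t_1) exactly:

```python
import numpy as np

def ksl_step(U0, S0, V0, dA):
    """One projector-splitting (KSL) step; dA = A(t1) - A(t0)."""
    # K-step: U_I S_I = U0 S0 + dA V0 via a QR factorization.
    K = U0 @ S0 + dA @ V0
    Ui, Si = np.linalg.qr(K)
    # S-step (note the minus sign): S_II = S_I - U_I^T dA V0.
    Sii = Si - Ui.T @ dA @ V0
    # L-step: V_III S_III^T = V0 S_II^T + dA^T U_I via a QR factorization.
    Lm = V0 @ Sii.T + dA.T @ Ui
    Viii, SiiiT = np.linalg.qr(Lm)
    return Ui, SiiiT.T, Viii

rng = np.random.default_rng(4)
m, n, r = 9, 8, 3
A0 = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank r
A1 = rng.standard_normal((m, r)) @ rng.standard_normal((r, n))  # rank r

# Start from X_0 = A(t_0), factored as U0 S0 V0^T via the SVD.
Uf, sf, Vtf = np.linalg.svd(A0, full_matrices=False)
U0, S0, V0 = Uf[:, :r], np.diag(sf[:r]), Vtf[:r, :].T

U1, S1, V1 = ksl_step(U0, S0, V0, A1 - A0)
assert np.allclose(U1 @ S1 @ V1.T, A1)   # exact reproduction of A(t1)
```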

  14. Projector-splitting integrator. Remarks:
◮ The KSL splitting is exact if A(t) has rank at most r (Theorem 4.1 by Lubich/Oseledets) → robustness in the presence of a small singular value σ_r [Kieri/Lubich/Walach'2016].
◮ Symmetrization (KSLSK) leads to second order.
◮ The splitting extends to Ȧ = F(A) by setting ΔA := h F(Y_0) (but the exactness result does not hold).

  15. Further literature.
◮ Extension to Tucker [Koch/Lubich'2010; Lubich/Vandereycken/Walach'2018] and tensor train [Lubich/Oseledets/Vandereycken'2015] formats.
◮ Applications to 3D nonlinear PDEs [Nonnenmacher/Lubich'2009], matrix differential equations [Mena et al.'2018], the Vlasov-Poisson equation [Einkemmer/Lubich'2018], time-dependent PDEs with uncertainties [Musharbash/Nobile/Zhou'2014], quantum chemistry [Kloss/Burghardt/Lubich'2017], and physics [Haegeman et al.'2016].
◮ [Kieri/Vandereycken'2017]: a conceptually simpler approach combining standard integrators with low-rank truncation.
