Low Rank Approximation Lecture 9
Daniel Kressner


  1. Low Rank Approximation
     Lecture 9
     Daniel Kressner
     Chair for Numerical Algorithms and HPC
     Institute of Mathematics, EPFL
     daniel.kressner@epfl.ch

  2. Manifold optimization
     General setting: Aim at solving the optimization problem
       $\min_{X \in \mathcal{M}_r} f(X)$,
     where $\mathcal{M}_r$ is a manifold of rank-$r$ matrices or tensors.
     Goal: Modify classical optimization algorithms (line search, Newton, quasi-Newton, ...) to produce iterates that stay on $\mathcal{M}_r$.
     Advantages over ALS:
     ◮ No need to solve subproblems, at least for first-order methods;
     ◮ Can draw on concepts from classical smooth optimization (line search strategies, convergence analysis, ...).
     Two valuable resources:
     ◮ Absil/Mahony/Sepulchre'2008: Optimization Algorithms on Matrix Manifolds. Princeton University Press, 2008. Available from https://press.princeton.edu/absil
     ◮ Manopt, a Matlab toolbox for optimization on manifolds. Available from https://manopt.org/

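  3. Manifolds
     For open sets $U \subset \mathcal{M}$, $V \subset \mathbb{R}^d$, a chart is a bijective function $\varphi: U \to V$.
     An atlas of $\mathcal{M}$ into $\mathbb{R}^d$ is a collection of charts $(U_\alpha, \varphi_\alpha)$ such that:
     ◮ $\bigcup_\alpha U_\alpha = \mathcal{M}$;
     ◮ for any $\alpha, \beta$ with $U_\alpha \cap U_\beta \neq \emptyset$, the change of coordinates
         $\varphi_\beta \circ \varphi_\alpha^{-1}: \mathbb{R}^d \to \mathbb{R}^d$
       is smooth ($C^\infty$) on its domain $\varphi_\alpha(U_\alpha \cap U_\beta)$.
     [Illustration taken from Wikipedia.]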

  4. Manifolds
     In the following, we assume that the atlas is maximal. A proper definition of a smooth manifold $\mathcal{M}$ needs further properties (the topology induced by the maximal atlas is Hausdorff and second-countable). See [Lee'2003] and [Absil et al.'2008].
     Properties of $\mathcal{M}$:
     ◮ finite-dimensional vector spaces are always manifolds;
     ◮ $d$ = dimension of $\mathcal{M}$;
     ◮ $\mathcal{M}$ does not need to be connected (in the context of smooth optimization it makes sense to consider connected manifolds only);
     ◮ a function $f: \mathcal{M} \to \mathbb{R}$ is differentiable at a point $x \in \mathcal{M}$ if and only if
         $f \circ \varphi^{-1}: \varphi(U) \subset \mathbb{R}^d \to \mathbb{R}$
       is differentiable at $\varphi(x)$ for some chart $(U, \varphi)$ with $x \in U$.

  5. Manifolds: First examples
     Lemma. Let $\mathcal{M}$ be a smooth manifold and $\mathcal{N} \subset \mathcal{M}$ an open subset. Then $\mathcal{N}$ is a smooth manifold (of equal dimension).
     Proof: Given an atlas for $\mathcal{M}$, obtain an atlas for $\mathcal{N}$ by selecting the charts $(U, \varphi)$ with $U \subset \mathcal{N}$.
     Example: $GL(n, \mathbb{R})$, the set of real invertible $n \times n$ matrices, is a smooth manifold.
     EFY. Show that $\mathbb{R}^{m \times n}_*$, the set of real $m \times n$ matrices of full rank $\min\{m, n\}$, is a smooth manifold.
     EFY. Show that the set of $n \times n$ symmetric positive definite matrices is a smooth manifold.
     Two main classes of matrix manifolds:
     ◮ embedded submanifolds of $\mathbb{R}^{m \times n}$; Example: Stiefel manifold of orthonormal bases.
     ◮ quotient manifolds; Example: Grassmann manifold $\mathbb{R}^{m \times n}_* / GL(n, \mathbb{R})$.
     Will focus on embedded submanifolds (much easier to work with).

  6. Immersions and submersions
     Let $\mathcal{M}_1, \mathcal{M}_2$ be smooth manifolds of dimensions $d_1, d_2$ and $F: \mathcal{M}_1 \to \mathcal{M}_2$. Let $x \in \mathcal{M}_1$ and $y = F(x) \in \mathcal{M}_2$. Choose charts $\varphi_1, \varphi_2$ around $x, y$. Then the coordinate representation of $F$ is given by
       $\hat F := \varphi_2 \circ F \circ \varphi_1^{-1}: \mathbb{R}^{d_1} \to \mathbb{R}^{d_2}$.
     ◮ $F$ is called smooth if $\hat F$ is smooth (that is, $C^\infty$).
     ◮ The rank of $F$ at $x \in \mathcal{M}_1$ is defined as the rank of $D\hat F(\varphi_1(x))$ (the Jacobian of $\hat F$ at $\varphi_1(x)$).
     ◮ $F$ is called an immersion if its rank equals $d_1$ at every $x \in \mathcal{M}_1$.
     ◮ $F$ is called a submersion if its rank equals $d_2$ at every $x \in \mathcal{M}_1$.

  7. Embedded submanifolds
     A subset $\mathcal{N} \subset \mathcal{M}$ is called an embedded submanifold of dimension $k$ in $\mathcal{M}$ if for each point $p \in \mathcal{N}$ there is a chart $(U, \varphi)$ in $\mathcal{M}$ such that all elements of $U \cap \mathcal{N}$ are obtained by varying the first $k$ coordinates only. (See Chapter 5 of [Lee'2003] for more details.)
     Theorem. Let $\mathcal{M}, \mathcal{N}$ be smooth manifolds and let $F: \mathcal{M} \to \mathcal{N}$ be a smooth map with constant rank $\ell$. Then each level set
       $F^{-1}(y) := \{ x \in \mathcal{M} : F(x) = y \}$
     is a closed embedded submanifold of codimension $\ell$ in $\mathcal{M}$.
     Corollaries:
     ◮ If $F: \mathcal{M} \to \mathcal{N}$ is a submersion, then each level set is a closed embedded submanifold of codimension equal to the dimension of $\mathcal{N}$.
     ◮ In fact, by the open submanifold lemma, one only needs to check the full rank condition of a submersion for points in the level set (replace $\mathcal{M}$ by the open set on which $F$ has full rank).

  8. The Stiefel manifold
     For $m \ge n$, consider the set of all $m \times n$ matrices with orthonormal columns:
       $\mathrm{St}(m, n) := \{ X \in \mathbb{R}^{m \times n} : X^T X = I_n \}$.
     Corollary. $\mathrm{St}(m, n)$ is an embedded submanifold of $\mathbb{R}^{m \times n}$.
     Proof: Define $F: \mathbb{R}^{m \times n} \to \mathrm{symm}(n)$ as $F: X \mapsto X^T X$, where $\mathrm{symm}(n)$ denotes the set of $n \times n$ symmetric matrices. At $X \in \mathrm{St}(m, n)$, consider the Jacobian
       $DF(X): H \mapsto X^T H + H^T X$.
     Given a symmetric $Y \in \mathbb{R}^{n \times n}$, set $H = XY/2$. Then $DF(X)[H] = Y$; thus $DF(X)$ is surjective. Hence $F$ has full rank at every point of the level set $\mathrm{St}(m, n) = F^{-1}(I_n)$, and the corollaries on submersions apply.
     EFY. What is the dimension of the Stiefel manifold?
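
As a quick numerical sanity check of the surjectivity argument above, the following NumPy sketch draws a random point on St(m, n) and verifies that H = XY/2 is mapped to Y by DF(X). The helper names `stiefel_point` and `DF`, the sizes, and the random seed are illustrative assumptions and not part of the lecture.

```python
import numpy as np

def stiefel_point(m, n, rng):
    """Random m x n matrix with orthonormal columns (reduced QR of a Gaussian matrix)."""
    Q, _ = np.linalg.qr(rng.standard_normal((m, n)))
    return Q

def DF(X, H):
    """Directional derivative of F(X) = X^T X at X in direction H."""
    return X.T @ H + H.T @ X

rng = np.random.default_rng(0)
m, n = 7, 3                      # hypothetical sizes with m >= n
X = stiefel_point(m, n, rng)

S = rng.standard_normal((n, n))
Y = S + S.T                      # arbitrary symmetric target
H = X @ Y / 2                    # preimage proposed in the proof

residual = np.linalg.norm(DF(X, H) - Y)   # should be at rounding level
print(residual)
```

Because Y was an arbitrary symmetric matrix, a residual at rounding level is consistent with DF(X) mapping onto symm(n).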

  9. The manifold of rank-$k$ matrices
     Locality of the definition of embedded submanifolds implies the following lemma (Lemma 5.5 in [Lee'2003]).
     Lemma. Let $\mathcal{N}$ be a subset of a smooth manifold $\mathcal{M}$. Suppose every point $p \in \mathcal{N}$ has a neighborhood $U \subset \mathcal{M}$ such that $U \cap \mathcal{N}$ is an embedded submanifold of $U$. Then $\mathcal{N}$ is an embedded submanifold of $\mathcal{M}$.
     Theorem. Given $m \ge n$, the set
       $\mathcal{M}_k = \{ A \in \mathbb{R}^{m \times n} : \mathrm{rank}(A) = k \}$
     is an embedded submanifold of $\mathbb{R}^{m \times n}$ for every $0 \le k \le n$.

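  10. The manifold of rank-$k$ matrices
     Choose an arbitrary $A_0 \in \mathcal{M}_k$. After a suitable permutation, we may assume w.l.o.g. that
       $A_0 = \begin{pmatrix} A_{11} & A_{12} \\ A_{21} & A_{22} \end{pmatrix}$, where $A_{11} \in \mathbb{R}^{k \times k}$ is invertible.
     This property remains true in an open neighborhood $U \subset \mathbb{R}^{m \times n}$ of $A_0$. Factorize $A \in U$ as
       $A = \begin{pmatrix} I & 0 \\ A_{21} A_{11}^{-1} & I \end{pmatrix} \begin{pmatrix} A_{11} & 0 \\ 0 & A_{22} - A_{21} A_{11}^{-1} A_{12} \end{pmatrix} \begin{pmatrix} I & A_{11}^{-1} A_{12} \\ 0 & I \end{pmatrix}$.
     Define $F: U \to \mathbb{R}^{(m-k) \times (n-k)}$ as
       $F: A \mapsto A_{22} - A_{21} A_{11}^{-1} A_{12}$.
     Then $F^{-1}(0) = U \cap \mathcal{M}_k$.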

  11. The manifold of rank-$k$ matrices
     For arbitrary $Y \in \mathbb{R}^{(m-k) \times (n-k)}$, we obtain that
       $DF(A)\left[ \begin{pmatrix} 0 & 0 \\ 0 & Y \end{pmatrix} \right] = Y$.
     Thus, $F$ is a submersion. In turn, $U \cap \mathcal{M}_k$ is an embedded submanifold of $U$. By the lemma, $\mathcal{M}_k$ is an embedded submanifold of $\mathbb{R}^{m \times n}$.
     EFY. What is the dimension of $\mathcal{M}_k$?
     EFY. Is $\mathcal{M}_k$ connected?
     EFY. Prove that the set of symmetric rank-$k$ matrices is an embedded submanifold of $\mathbb{R}^{n \times n}$. Is this manifold connected?
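
The local defining function F used for the rank-k manifold can be tried out numerically. The sketch below uses NumPy; the function name `schur_complement`, the sizes, and the way the rank-k test matrix is built (leading block set to the identity so that it is certainly invertible) are assumptions made for illustration. It checks that F vanishes on a rank-k matrix and that adding Y to the trailing block shifts F by exactly Y, which is the surjectivity of DF(A) used above.

```python
import numpy as np

def schur_complement(A, k):
    """F(A) = A22 - A21 A11^{-1} A12 for the leading k x k block A11."""
    A11, A12 = A[:k, :k], A[:k, k:]
    A21, A22 = A[k:, :k], A[k:, k:]
    return A22 - A21 @ np.linalg.solve(A11, A12)

rng = np.random.default_rng(1)
m, n, k = 6, 5, 2                # hypothetical sizes

# Rank-k matrix A = L R whose leading k x k block equals the identity.
L = rng.standard_normal((m, k))
L[:k, :] = np.eye(k)
R = rng.standard_normal((k, n))
R[:, :k] = np.eye(k)
A = L @ R

F_A = schur_complement(A, k)     # expected to vanish up to rounding

# Perturb only the trailing block: F is affine in A22, so F shifts by Y.
Y = rng.standard_normal((m - k, n - k))
B = A.copy()
B[k:, k:] += Y
F_B = schur_complement(B, k)

print(np.linalg.norm(F_A), np.linalg.norm(F_B - Y))
```

Since F depends on the block A22 only through the additive term, the direction with Y in the trailing block and zeros elsewhere is mapped to Y without any linearization error.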

  12. Tangent space
     In the following, much of the discussion is restricted to submanifolds $\mathcal{M}$ embedded in a vector space $V$ with inner product $\langle \cdot, \cdot \rangle$ and induced norm $\| \cdot \|$.
     Given a smooth curve $\gamma: \mathbb{R} \to \mathcal{M}$ with $x = \gamma(0)$, we call $\gamma'(0) \in V$ a tangent vector at $x$. The tangent space $T_x\mathcal{M} \subset V$ is the set of all tangent vectors at $x$.
     Lemma. $T_x\mathcal{M}$ is a subspace of $V$.
     Proof. If $v_1, v_2$ are tangent vectors, then there are smooth curves $\gamma_1, \gamma_2$ such that $x = \gamma_1(0) = \gamma_2(0)$ and $\gamma_1'(0) = v_1$, $\gamma_2'(0) = v_2$. To show that $\alpha v_1 + \beta v_2$ for $\alpha, \beta \in \mathbb{R}$ is again a tangent vector, consider a chart $(U, \varphi)$ around $x$ such that $\varphi(x) = 0$. Define
       $\gamma(t) = \varphi^{-1}\big( \alpha \varphi(\gamma_1(t)) + \beta \varphi(\gamma_2(t)) \big)$
     for $t$ sufficiently close to $0$. Then $\gamma(0) = x$ and $\gamma'(0) = \alpha v_1 + \beta v_2$.
     EFY. Prove that the dimension of $T_x\mathcal{M}$ equals the dimension of $\mathcal{M}$ using a coordinate chart.

  13. Tangent space
     Application of the definition to the Stiefel manifold. Let $\gamma(t) = X + tY + O(t^2)$ be a smooth curve with $X \in \mathrm{St}(m, n)$. To ensure that $\gamma(t) \in \mathrm{St}(m, n)$, we require
       $I_n = \gamma(t)^T \gamma(t) = (X + tY)^T (X + tY) + O(t^2) = I_n + t (X^T Y + Y^T X) + O(t^2)$.
     Thus, $X^T Y + Y^T X = 0$ characterizes the tangent space:
       $T_X \mathrm{St}(m, n) = \{ Y \in \mathbb{R}^{m \times n} : X^T Y = -Y^T X \}$
       $\phantom{T_X \mathrm{St}(m, n)} = \{ XW + X_\perp W_\perp : W \in \mathbb{R}^{n \times n},\ W = -W^T,\ W_\perp \in \mathbb{R}^{(m-n) \times n} \}$,
     where the columns of $X_\perp$ form a basis of $\mathrm{span}(X)^\perp$.
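
The parametrization of the Stiefel tangent space can be checked with a few lines of NumPy. This is a minimal sketch under assumed sizes and seed; the variable names and the helper `constraint_defect` are hypothetical. It builds Y = X W + X_perp W_perp with skew-symmetric W, confirms X^T Y + Y^T X = 0, and confirms that the straight line X + tY violates the constraint only at order t^2.

```python
import numpy as np

rng = np.random.default_rng(2)
m, n = 7, 3                      # hypothetical sizes with m >= n

# Orthonormal basis [X, X_perp] of R^m from the QR factorization of a square matrix.
Q, _ = np.linalg.qr(rng.standard_normal((m, m)))
X, X_perp = Q[:, :n], Q[:, n:]

S = rng.standard_normal((n, n))
W = S - S.T                      # skew-symmetric part
W_perp = rng.standard_normal((m - n, n))
Y = X @ W + X_perp @ W_perp      # candidate tangent vector

tangency = np.linalg.norm(X.T @ Y + Y.T @ X)   # should be at rounding level

def constraint_defect(t):
    """Frobenius norm of (X + tY)^T (X + tY) - I_n."""
    G = X + t * Y
    return np.linalg.norm(G.T @ G - np.eye(n))

t = 1e-3
defect = constraint_defect(t)
second_order = t**2 * np.linalg.norm(Y.T @ Y)  # the t^2 term in the expansion
print(tangency, defect, second_order)
```

The first-order term t (X^T Y + Y^T X) drops out for tangent Y, so the defect should agree with the second-order term t^2 Y^T Y up to rounding.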

  14. Tangent space
     When $\mathcal{M}$ is defined (at least locally) as a level set of a constant rank function $F: V \to \mathbb{R}^N$, we have
       $T_x\mathcal{M} = \ker(DF(x))$.
     Proof. Let $v \in T_x\mathcal{M}$, that is, there is a curve $\gamma: \mathbb{R} \to \mathcal{M}$ such that $\gamma(0) = x$ and $\gamma'(0) = v$. Then, by the chain rule,
       $DF(x)[v] = DF(x)[\gamma'(0)] = \frac{\partial}{\partial t} F(\gamma(t)) \Big|_{t=0} = 0$,
     because $F$ is constant on $\mathcal{M}$. Thus, $T_x\mathcal{M} \subset \ker(DF(x))$, which completes the proof by counting dimensions: both spaces have dimension $\dim V - \mathrm{rank}(DF(x))$.

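  15. Tangent space of $\mathcal{M}_k$
     Recall that $\mathcal{M}_k$ was obtained as a level set of the local submersion
       $F: A \mapsto A_{22} - A_{21} A_{11}^{-1} A_{12}$.
     Given $A \in \mathcal{M}_k$, consider the SVD
       $A = \begin{pmatrix} U & U_\perp \end{pmatrix} \begin{pmatrix} \Sigma & 0 \\ 0 & 0 \end{pmatrix} \begin{pmatrix} V & V_\perp \end{pmatrix}^T$.
     We have
       $DF\left( \begin{pmatrix} \Sigma & 0 \\ 0 & 0 \end{pmatrix} \right)[H] = H_{22}$.
     Thus, $H$ is in the kernel if and only if $H_{22} = 0$. In terms of $A$ this implies
       $T_A\mathcal{M}_k = \ker(DF(A)) = \begin{pmatrix} U & U_\perp \end{pmatrix} \begin{pmatrix} \mathbb{R}^{k \times k} & \mathbb{R}^{k \times (n-k)} \\ \mathbb{R}^{(m-k) \times k} & 0 \end{pmatrix} \begin{pmatrix} V & V_\perp \end{pmatrix}^T$
       $\phantom{T_A\mathcal{M}_k} = \{ U M V^T + U_p V^T + U V_p^T : M \in \mathbb{R}^{k \times k},\ U_p^T U = V_p^T V = 0 \}$.
     EFY. Compute the tangent space for the embedded submanifold of rank-$k$ symmetric matrices.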
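
To see the tangent space of the rank-k manifold in action, the following NumPy sketch builds a tangent vector H = U M V^T + U_p V^T + U V_p^T and compares it with a generic perturbation. The sizes, seed, singular values, and the helper `sigma_k_plus_1` are assumptions chosen for illustration. The (k+1)-th singular value of A + tH measures the distance to the set of matrices of rank at most k; it should be of order t^2 for tangent H and of order t for a generic direction.

```python
import numpy as np

rng = np.random.default_rng(3)
m, n, k = 6, 5, 2                # hypothetical sizes

# Orthonormal bases [U, U_perp] and [V, V_perp]; A = U Sigma V^T has rank k.
QU, _ = np.linalg.qr(rng.standard_normal((m, m)))
QV, _ = np.linalg.qr(rng.standard_normal((n, n)))
U, U_perp = QU[:, :k], QU[:, k:]
V, V_perp = QV[:, :k], QV[:, k:]
Sigma = np.diag([3.0, 2.0])      # assumed singular values (k = 2)
A = U @ Sigma @ V.T

# Tangent vector with U_p^T U = 0 and V_p^T V = 0.
M = rng.standard_normal((k, k))
U_p = U_perp @ rng.standard_normal((m - k, k))
V_p = V_perp @ rng.standard_normal((n - k, k))
H = U @ M @ V.T + U_p @ V.T + U @ V_p.T

# The (2,2) block of H in the SVD basis vanishes.
block_22 = np.linalg.norm(U_perp.T @ H @ V_perp)

def sigma_k_plus_1(B):
    """(k+1)-th singular value of B."""
    return np.linalg.svd(B, compute_uv=False)[k]

t = 1e-3
dist_tangent = sigma_k_plus_1(A + t * H)
dist_generic = sigma_k_plus_1(A + t * rng.standard_normal((m, n)))
print(block_22, dist_tangent, dist_generic)
```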

  16. Riemannian manifold and gradient
     For a submanifold $\mathcal{M}$ embedded in a vector space $V$: The inner product $\langle \cdot, \cdot \rangle$ on $V$ induces an inner product on $T_x\mathcal{M}$. This turns $\mathcal{M}$ into a Riemannian manifold.¹
     The (Riemannian) gradient of a smooth $f: \mathcal{M} \to \mathbb{R}$ at $x \in \mathcal{M}$ is defined as the unique element $\mathrm{grad} f(x) \in T_x\mathcal{M}$ that satisfies
       $\langle \mathrm{grad} f(x), \xi \rangle = Df(x)[\xi], \quad \forall \xi \in T_x\mathcal{M}$.
     EFY. Prove that the Riemannian gradient satisfies the steepest ascent property
       $\frac{\mathrm{grad} f(x)}{\| \mathrm{grad} f(x) \|_2} = \arg\max_{\xi \in T_x\mathcal{M},\ \|\xi\| = 1} Df(x)[\xi]$.
     ¹ In general, for a Riemannian manifold one needs to have an inner product on $T_x\mathcal{M}$ that varies smoothly with respect to $x$.
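
The defining property of the Riemannian gradient can be illustrated on the simplest Stiefel manifold, the unit sphere St(m, 1). The sketch below is an assumption-laden example rather than material from the slides: it takes f(x) = x^T A x with a random symmetric A, and it uses the standard fact (not stated above) that for an embedded submanifold with the induced inner product the Riemannian gradient is the orthogonal projection of the Euclidean gradient onto the tangent space. The names `f` and `riemannian_grad` are hypothetical.

```python
import numpy as np

rng = np.random.default_rng(4)
m = 5                            # hypothetical dimension
S = rng.standard_normal((m, m))
A = (S + S.T) / 2                # symmetric matrix defining f

def f(x):
    return x @ A @ x

def riemannian_grad(x):
    """Project the Euclidean gradient 2 A x onto T_x = {xi : x^T xi = 0}."""
    g = 2 * A @ x
    return g - (x @ g) * x

x = rng.standard_normal(m)
x /= np.linalg.norm(x)           # point on the unit sphere
g = riemannian_grad(x)

xi = rng.standard_normal(m)
xi -= (x @ xi) * x               # random tangent vector at x

lhs = g @ xi                     # <grad f(x), xi>
rhs = 2 * x @ A @ xi             # Df(x)[xi]

# Finite difference of f along the curve gamma(t) = (x + t xi) / ||x + t xi||.
t = 1e-6
gamma = (x + t * xi) / np.linalg.norm(x + t * xi)
fd = (f(gamma) - f(x)) / t
print(lhs, rhs, fd)
```

The curve gamma stays on the sphere and has gamma'(0) = xi, so the finite difference approximates the same directional derivative that the inner product with the gradient reproduces.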
