Optimizing a homogeneous polynomial on the unit sphere


1. Optimizing a homogeneous polynomial on the unit sphere. Rima Khouja. Advisors: Bernard Mourrain, Houssam Khalil. JNCF 2020.

2. We consider $K = \mathbb{C}$ or $\mathbb{R}$. Let $P$ be a homogeneous polynomial in $K[x_1, \ldots, x_n]_d := K[x]_d$. The spectral norm of $P$, denoted by $\|P\|_{\sigma,K}$, is by definition

$$\|P\|_{\sigma,K} = \max_{v \in S^{n-1}} |P(v)|, \qquad (1)$$

where $S^{n-1}$ is the unit sphere in $K^n$. For $K = \mathbb{C}$, (1) is also equivalent to optimizing $\operatorname{Re}(P(v))$ on the unit sphere.
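
For $K = \mathbb{R}$, a minimal numerical sketch of (1) is multi-start projected gradient ascent on the sphere (illustrative Python, not from the talk; the polynomial and its gradient are supplied as callables):

```python
import numpy as np

def spectral_norm_estimate(P, gradP, n, n_starts=50, steps=200, lr=0.1, seed=0):
    """Estimate max_{||v||=1} |P(v)| by multi-start projected gradient ascent."""
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(n_starts):
        v = rng.standard_normal(n)
        v /= np.linalg.norm(v)                        # start on S^{n-1}
        for _ in range(steps):
            v = v + lr * np.sign(P(v)) * gradP(v)     # ascend |P|
            v /= np.linalg.norm(v)                    # project back onto the sphere
        best = max(best, abs(P(v)))
    return best

# P(x, y) = x^2 y: the true maximum of |P| on the circle is 2/(3*sqrt(3)) ≈ 0.385.
P = lambda v: v[0]**2 * v[1]
gradP = lambda v: np.array([2*v[0]*v[1], v[0]**2])
print(spectral_norm_estimate(P, gradP, n=2))
```

Each restart only finds a local maximizer, so the multi-start maximum is a lower bound on $\|P\|_{\sigma,\mathbb{R}}$.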

3. Approximating $P$ by a $d$-th power of a linear form $(v_1 x_1 + \cdots + v_n x_n)^d := (v^t x)^d$ amounts to solving the following minimization problem:

$$\mathrm{dist}_1(P) := \min_{(w,v) \in K \times S^{n-1}} \|P - w\,(v^t x)^d\|_d, \qquad (2)$$

where $\|\cdot\|_d$ is the apolar norm on $K[x]_d$: for $P = \sum_{|\alpha|=d} \binom{d}{\alpha} p_\alpha x^\alpha$,

$$\|P\|_d = \sqrt{\langle P, P \rangle_d} = \Big( \sum_{|\alpha|=d} \binom{d}{\alpha}\, p_\alpha \bar{p}_\alpha \Big)^{1/2}.$$

The two problems (1) and (2) are equivalent since $\mathrm{dist}_1(P)^2 = \|P\|_d^2 - \|P\|_{\sigma,K}^2$.
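
A short sketch of the apolar norm (illustrative; it assumes $P$ is stored by its ordinary monomial coefficients $c_\alpha$, so that $p_\alpha = c_\alpha / \binom{d}{\alpha}$ and $\|P\|_d^2 = \sum_{|\alpha|=d} |c_\alpha|^2 / \binom{d}{\alpha}$):

```python
from math import factorial, prod

def multinomial(d, alpha):
    """binom(d, alpha) = d! / (alpha_1! * ... * alpha_n!)."""
    return factorial(d) // prod(factorial(a) for a in alpha)

def apolar_norm(coeffs, d):
    """coeffs: dict mapping exponent tuples alpha, |alpha| = d, to c_alpha."""
    return sum(abs(c)**2 / multinomial(d, a) for a, c in coeffs.items()) ** 0.5

# Sanity check: (x + y)^2 = x^2 + 2xy + y^2 is (v^t x)^d with v = (1, 1),
# and indeed ||(v^t x)^d||_d = ||v||^d = 2.
print(apolar_norm({(2, 0): 1, (1, 1): 2, (0, 2): 1}, 2))  # 2.0
```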

4. The symmetric tensor decomposition

$\mathcal{P}$ is a symmetric tensor of order $d$ and dimension $n$, $\mathcal{P} \in S^d(\mathbb{C}^n)$: $\mathcal{P} = [a_{i_1 \ldots i_d}] \in \mathbb{C}^{n \times \cdots \times n}$ ($d$ times) such that $a_{i_1 \ldots i_d} = a_{i_{\sigma(1)} \ldots i_{\sigma(d)}}$ for all $\sigma \in \mathfrak{S}_d$. The symmetric tensor decomposition of $\mathcal{P}$ consists in writing it as a sum of rank-one symmetric tensors:

$$\mathcal{P} = \sum_{i=1}^r \lambda_i\, \underbrace{a_i \otimes \cdots \otimes a_i}_{d \text{ times}}, \qquad \lambda_i \in \mathbb{C},\ a_i \in \mathbb{C}^n. \qquad (3)$$

By the correspondence of $S^d(\mathbb{C}^n)$ with $\mathbb{C}[x]_d$, (3) is equivalent to writing the homogeneous polynomial $P$ (associated to $\mathcal{P}$) as a sum of $d$-th powers of linear forms: $P = \sum_{i=1}^r \lambda_i (a_{i,1} x_1 + \cdots + a_{i,n} x_n)^d := \sum_{i=1}^r \lambda_i (a_i^t x)^d$.
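
A quick numerical illustration of (3) (a sketch, not the talk's code): assemble $\mathcal{P}$ from rank-one terms with NumPy and check the index symmetry.

```python
import numpy as np
from itertools import permutations

def rank_one_sym(a, d):
    """The d-fold tensor power a ⊗ ... ⊗ a."""
    t = a
    for _ in range(d - 1):
        t = np.multiply.outer(t, a)
    return t

n, d, r = 3, 4, 2
rng = np.random.default_rng(1)
lam = rng.standard_normal(r) + 1j * rng.standard_normal(r)
vecs = rng.standard_normal((r, n)) + 1j * rng.standard_normal((r, n))
P = sum(l * rank_one_sym(a, d) for l, a in zip(lam, vecs))

# P is symmetric: invariant under every permutation of its d indices.
assert all(np.allclose(P, np.transpose(P, s)) for s in permutations(range(d)))
```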

5. Symmetric rank and applications

The symmetric rank of $\mathcal{P}$, denoted by $\mathrm{rank}_s(\mathcal{P})$, is the smallest $r$ such that the decomposition (3) exists. Applications of this decomposition are numerous and of practical importance [1-6].

[1] M. Janzamin, R. Ge, J. Kossaifi, and A. Anandkumar, "Spectral learning on matrices and tensors," Foundations and Trends in Machine Learning, 12 (2019), pp. 393-536.
[2] P. Comon, "Tensor decompositions," in Mathematics in Signal Processing V, J. G. McWhirter and I. K. Proudler, Eds., pp. 1-24, Clarendon Press, Oxford, UK, 2002.
[3] P. Comon and M. Rajih, "Blind identification of under-determined mixtures based on the characteristic function," Signal Process., 86 (2006), no. 9, pp. 2271-2281.
[4] L. De Lathauwer, B. De Moor, and J. Vandewalle, "A multilinear singular value decomposition," SIAM J. Matrix Anal. Appl., 21 (2000), no. 4, pp. 1253-1278.
[5] N. D. Sidiropoulos, R. Bro, and G. B. Giannakis, "Parallel factor analysis in sensor array processing," IEEE Trans. Signal Process., 48 (2000), no. 8, pp. 2377-2388.
[6] A. Smilde, R. Bro, and P. Geladi, Multi-way Analysis, John Wiley, West Sussex, UK, 2004.

6. The symmetric tensor approximation problem (STA)

In practice the input tensor $\mathcal{P}$ is known with some errors on its coefficients. For this reason, computing an approximate decomposition of low symmetric rank is usually more interesting than computing the exact symmetric decomposition of $P$:

$$(\mathrm{STA}) \qquad \min_{Q \in \sigma_r} \frac{1}{2}\, \|Q - P\|_d^2, \qquad (4)$$

where $\sigma_r = \{ Q \in \mathbb{C}[x]_d \mid \mathrm{rank}_s(Q) \le r \}$. We develop a Riemannian Newton iteration (RNS) to solve (4) locally when $r$ is strictly lower than the generic symmetric rank given by the Alexander-Hirschowitz theorem [7].

[7] J. Alexander and A. Hirschowitz, "Polynomial interpolation in several variables," J. Algebraic Geom., 4 (1995), no. 2, pp. 201-222.

7. The Riemannian Newton algorithm

Let $f : \mathcal{M} \to \mathbb{R}_+$, $p \mapsto \frac{1}{2}\|F(p)\|^2$ with $F : \mathcal{M} \to \mathcal{E}$, where $(\mathcal{M}, \langle \cdot, \cdot \rangle)$ is a Riemannian manifold and $\mathcal{E}$ is a Euclidean space such that $\dim(\mathcal{M}) \le \dim(\mathcal{E})$. Newton's method for a Riemannian least squares problem consists in generating, from an initial point $x_0$, a sequence of iterates $x_1, x_2, \ldots$ which converges to a local minimum. We can summarize the method in two steps (see the sketch after this list):

1. Solve the Newton equation $\operatorname{Hess} f(x_k)[\eta_k] = -\operatorname{grad} f(x_k)$ for the unknown $\eta_k \in T_{x_k}\mathcal{M}$, where $T_{x_k}\mathcal{M}$ is the tangent space of $\mathcal{M}$ at the point $x_k$;
2. Apply a retraction operator $R_{x_k} : T_{x_k}\mathcal{M} \to \mathcal{M}$ to define the new point $x_{k+1}$ on $\mathcal{M}$ by $x_{k+1} := R_{x_k}(\eta_k)$.
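
In code, one iteration is a linear solve followed by a retraction. A generic sketch (the callables grad_f, hess_f, retract are assumed user-supplied coordinate representations; the names are illustrative, not from the talk):

```python
import numpy as np

def riemannian_newton(x0, grad_f, hess_f, retract, max_iter=50, tol=1e-10):
    """grad_f(x): Riemannian gradient in coordinates of T_x M;
    hess_f(x): matrix of the Riemannian Hessian in the same coordinates;
    retract(x, eta): the retraction R_x(eta)."""
    x = x0
    for _ in range(max_iter):
        g = grad_f(x)
        if np.linalg.norm(g) < tol:
            break
        eta = np.linalg.solve(hess_f(x), -g)  # step 1: Newton equation
        x = retract(x, eta)                   # step 2: back to the manifold
    return x
```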

8. Retraction operator

Definition. Let $\mathcal{M}$ be a manifold and $p \in \mathcal{M}$. A retraction $R_p$ is a map $T_p\mathcal{M} \to \mathcal{M}$ that satisfies the following properties (a concrete example follows the list):

1. $R_p(0_p) = p$;
2. there exists an open neighborhood $U \subset T_p\mathcal{M}$ of $0_p$ such that the restriction of $R_p$ to $U$ is well defined and smooth;
3. $R_p$ satisfies the local rigidity condition $DR_p(0_p) = \operatorname{id}_{T_p\mathcal{M}}$, where $\operatorname{id}_{T_p\mathcal{M}}$ denotes the identity mapping on $T_p\mathcal{M}$.
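
A standard example (not the retraction used later for the Veronese manifold) is the metric projection onto the unit sphere, $R_p(\eta) = (p + \eta)/\|p + \eta\|$; a few lines of NumPy check properties 1 and 3 numerically:

```python
import numpy as np

def sphere_retract(p, eta):
    return (p + eta) / np.linalg.norm(p + eta)

p = np.array([1.0, 0.0, 0.0])
u = np.array([0.0, 1.0, 0.0])              # unit tangent direction, u ⟂ p
t = 1e-8

print(sphere_retract(p, np.zeros(3)))       # property 1: R_p(0_p) = p
print((sphere_retract(p, t * u) - p) / t)   # property 3: ≈ u, so DR_p(0_p) = id
```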

9. Formulation of the STA problem as a Riemannian optimization problem

Definition. The Veronese manifold in $\mathbb{C}[x]_d$, denoted by $V_n^d$, is the set of $d$-th powers of linear forms in $\mathbb{C}[x]_d - \{0\}$. Let

$$\Sigma : (V_n^d)^{\times r} := V_n^d \times \cdots \times V_n^d \longrightarrow \mathbb{C}[x]_d, \qquad (w_i (v_i^t x)^d)_{1 \le i \le r} \longmapsto \sum_{i=1}^r w_i (v_i^t x)^d.$$

Therefore, we formulate the Riemannian least squares problem for the symmetric tensor approximation problem on the Cartesian product of Veronese manifolds as follows:

$$(\mathrm{STA}^*) \qquad \min_{y \in \mathcal{M}} f(y),$$

where $\mathcal{M} = (V_n^d)^{\times r}$, $y = (w_i (v_i^t x)^d)_{1 \le i \le r}$, $f(y) = \frac{1}{2}\|F(y)\|_d^2$ and $F(y) = \Sigma(y) - P$.
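
At the symmetric-tensor level this objective is easy to write down (a sketch assuming the point $y$ is stored as weights $w$ and unit columns of $V$; recall that the Frobenius norm of a symmetric tensor coincides with the apolar norm of the associated polynomial):

```python
import numpy as np

def rank_one(v, d):
    t = v
    for _ in range(d - 1):
        t = np.multiply.outer(t, v)
    return t

def Sigma(w, V, d):
    """Sigma(y) = sum_i w_i (v_i^t x)^d as a symmetric tensor; w has shape (r,),
    V has shape (n, r) with unit columns v_i."""
    return sum(w[i] * rank_one(V[:, i], d) for i in range(len(w)))

def f(w, V, d, P):
    """f(y) = 1/2 ||F(y)||^2 with F(y) = Sigma(y) - P."""
    return 0.5 * np.linalg.norm(Sigma(w, V, d) - P) ** 2
```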

10. Explicit formula of the Riemannian Newton iteration for the STA problem

We call the vector $(w_i)_{1 \le i \le r} \in \mathbb{C}^r$ the weight vector, and we denote it by $W$. We represent the polynomials $(v_i^t x)^d$ by the matrix $V = [v_i]_{1 \le i \le r} \in \mathbb{C}^{n \times r}$.

Parameterization: we consider $f$ as a function of

$$\mathcal{N} = \big\{ (W, \operatorname{Re}(V), \operatorname{Im}(V)) \;\big|\; W \in (\mathbb{R}_+^*)^r,\ V \in \mathbb{C}^{n \times r},\ \|v_i\|_2 = 1 \big\}.$$

Let $p \in \mathcal{N}$, and let $B$ be an orthogonal projection onto $T_p\mathcal{N}$. We denote by $G_p$ (resp. $H_p$) the gradient (resp. the Hessian) of $f$ at $p$. Then the Newton equation is $H_p[\eta] = -G_p$, with $G_p = B^T G_p^R$ and $H_p = B^T H_p^R B$.

11. Explicit formula of the Riemannian Newton iteration for the STA problem

$G_p^R$ is given by the vector

$$G_p^R = \begin{pmatrix} \operatorname{Re}(G_1) \\ 2\operatorname{Re}(G_2) \\ -2\operatorname{Im}(G_2) \end{pmatrix},$$

such that:

$$G_1 = \Big( \sum_{i=1}^r w_i\, (v_j^* v_i)^d - P(\bar{v}_j) \Big)_{1 \le j \le r},$$
$$G_2 = \frac{1}{2} \Big( d \sum_{i=1}^r w_i w_j\, (v_i^* v_j)^{d-1}\, \bar{v}_i - w_j\, \nabla \bar{P}(v_j) \Big)_{1 \le j \le r}.$$
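
As a sanity check, $G_1$ is simple to evaluate numerically when the target is itself given as $P = \sum_k c_k (b_k^t x)^d$, since then $P(\bar{v}_j) = \sum_k c_k (b_k^t \bar{v}_j)^d$; at an exact decomposition ($w = c$, $V = B$) the vector $G_1$ vanishes. An illustrative sketch (names are hypothetical):

```python
import numpy as np

def G1(w, V, c, B, d):
    """G_1[j] = sum_i w_i (v_j^* v_i)^d - P(conj(v_j)), where
    P = sum_k c_k (b_k^t x)^d and np.vdot(a, b) computes a^* b."""
    P_at = lambda z: sum(c[k] * (B[:, k] @ z) ** d for k in range(len(c)))
    return np.array([
        sum(w[i] * np.vdot(V[:, j], V[:, i]) ** d for i in range(len(w)))
        - P_at(np.conj(V[:, j]))
        for j in range(len(w))
    ])
```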

12. Explicit formula of the Riemannian Newton iteration for the STA problem

$H_p^R$ is given by the following block matrix:

$$H_p^R = \begin{pmatrix} \operatorname{Re}(A) & 2\operatorname{Re}(B)^T & -2\operatorname{Im}(B)^T \\ 2\operatorname{Re}(B) & 2(\operatorname{Re}(C) + \operatorname{Re}(D)) & -2(\operatorname{Im}(C) + \operatorname{Im}(D)) \\ -2\operatorname{Im}(B) & 2(\operatorname{Im}(D) - \operatorname{Im}(C)) & 2(\operatorname{Re}(D) - \operatorname{Re}(C)) \end{pmatrix},$$

such that:

$$A = \big[ (v_i^* v_j)^d \big]_{1 \le i,j \le r},$$
$$B = \frac{1}{2} \Big[ d\,w_i (v_j^* v_i)^{d-1} \bar{v}_j + \delta_{i,j} \Big( d \sum_{l=1}^r w_l (v_l^* v_i)^{d-1} \bar{v}_l - \nabla \bar{P}(v_j) \Big) \Big]_{1 \le i,j \le r},$$
$$C = \frac{1}{2}\, \operatorname{diag} \Big[ d(d-1) \Big[ \sum_{i=1}^r w_i w_j\, v_{i,k} v_{i,l}\, (v_i^* v_j)^{d-2} \Big]_{1 \le k,l \le n} - w_j\, \Delta \bar{P}(v_j) \Big]_{1 \le j \le r},$$
$$D = \frac{1}{2} \Big[ d\,w_i w_j\, (v_i^* v_j)^{d-2} \big( (v_i^* v_j)\, I_n + (d-1)\, v_j v_i^* \big) \Big]_{1 \le i,j \le r}.$$
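
Assembling the real block matrix from the complex blocks is mechanical; a sketch (assuming $A$ is $r \times r$, $B$ is $nr \times r$, and $C$, $D$ are $nr \times nr$, stored as dense NumPy arrays):

```python
import numpy as np

def real_hessian(A, B, C, D):
    """H_p^R built from the complex blocks, following the structure above."""
    return np.block([
        [np.real(A),      2 * np.real(B).T,               -2 * np.imag(B).T],
        [2 * np.real(B),  2 * (np.real(C) + np.real(D)),  -2 * (np.imag(C) + np.imag(D))],
        [-2 * np.imag(B), 2 * (np.imag(D) - np.imag(C)),   2 * (np.real(D) - np.real(C))],
    ])
```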

13. Retraction on the Veronese manifold

The Hankel matrix of degree $(k, d-k)$ associated to a polynomial $P$ in $\mathbb{C}[x]_d$ is given by:

$$H_{k,d-k}^P = \big( \langle P, x^{\alpha+\beta} \rangle_d \big)_{|\alpha|=k,\, |\beta|=d-k}.$$

Let

$$\Pi : \mathbb{C}[x]_d \to V_n^d, \qquad P \mapsto \Pi(P) = P(\bar{q})\,(q^t x)^d,$$

where $q$ is the first left singular vector from the SVD of $H_{1,d-1}^P$. Using the map $\Pi$, we find that the map $R_P : T_P V_n^d \to V_n^d$, $Q \mapsto R_P(Q) = \Pi(P+Q)$, satisfies the properties of a retraction operator. We have the following property:

$$\|P - \Pi(P)\|_d \le \sqrt{d}\, \|P - P_{\mathrm{best}}\|_d,$$

where $P_{\mathrm{best}} \in V_n^d$ is any best approximation of $P$ in the norm $\|\cdot\|_d$.
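
A sketch of $\Pi$ at the symmetric-tensor level (an assumption of this sketch: $P$ is stored as a full symmetric tensor $T$, so the entry $\langle P, x^{\alpha+\beta} \rangle_d$ is the tensor entry at any index tuple of type $\alpha + \beta$):

```python
import numpy as np
from itertools import combinations_with_replacement

def rank_one(v, d):
    t = v
    for _ in range(d - 1):
        t = np.multiply.outer(t, v)
    return t

def Pi(T, n, d):
    """Pi(P) = P(conj(q)) (q^t x)^d, with q the first left singular vector
    of the Hankel matrix H^P_{1,d-1} read off the symmetric tensor T."""
    cols = list(combinations_with_replacement(range(n), d - 1))
    H = np.array([[T[(k,) + beta] for beta in cols] for k in range(n)])
    q = np.linalg.svd(H)[0][:, 0]
    val = T
    for _ in range(d):              # P(conj(q)): contract every mode with conj(q)
        val = val @ np.conj(q)
    return val * rank_one(q, d)

def retract(P_tensor, Q_tangent, n, d):
    """R_P(Q) = Pi(P + Q), as on the slide."""
    return Pi(P_tensor + Q_tangent, n, d)
```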

14. Examples

Comparison with some existing state-of-the-art methods.

Example 1: Let $\mathcal{P} \in S^4(\mathbb{C}^7)$ with entries $(\mathcal{P})_{i_1,i_2,i_3,i_4} = (i_1)^{i_1} + \cdots + (i_4)^{i_4}$.

Notation: $d_{rns}$ (resp. $d_{ccpd}$, $d_{sdf}$) denotes the distance between $\mathcal{P}$ and the approximation given by RNS [8] (resp. CCPD-NLS, SDF-NLS [9]); $N_{rns}$ (resp. $N_{ccpd}$, $N_{sdf}$) denotes the number of iterations of RNS (resp. CCPD-NLS, SDF-NLS); $t_{rns}$ (resp. $t_{ccpd}$, $t_{sdf}$) denotes the time in seconds consumed by RNS (resp. CCPD-NLS, SDF-NLS).

[8] https://gitlab.inria.fr/AlgebraicGeometricModeling/TensorDec.jl
[9] N. Vervliet, O. Debals, L. Sorber, M. Van Barel, and L. De Lathauwer, Tensorlab 3.0, Mar. 2016, https://www.tensorlab.net.

15. Examples

Table 1: Computational results for Example 1 with a uniformly distributed random initial point.

            r = 3        r = 5        r = 10
d_0         211.7385     487.5316     301.8507
d_rns       0.09780803   8.3285e-5    2.5257e-5
d_ccpd      0.8689672    0.66874246   2.444e8
d_sdf       2.4439e8     0.21156654   0.03242244
N_rns       130          40           89
N_ccpd      500          466          200
N_sdf       64           13           200
t_rns (s)   3.420527     1.882588     9.254812
t_ccpd (s)  1.766338     2.209960     1.319475
t_sdf (s)   1.986222     0.777425     4.819483

In the next two examples we choose the initial point for the RNS by the structured low-rank decomposition of multivariate Hankel matrices algorithm [HKM17], denoted SHD.
