
A Positive BB-Like Stepsize and An Extension for Symmetric Linear Systems

A Positive BB-Like Stepsize and An Extension for Symmetric Linear Systems. Yu-Hong Dai, Academy of Mathematics and Systems Science, Chinese Academy of Sciences. Joint with M. Al-Baali and Xiaoqi Yang. Peking University, 20140903.


  1. A Positive BB-Like Stepsize and An Extension for Symmetric Linear Systems Yu-Hong Dai Academy of Mathematics and Systems Science, Chinese Academy of Sciences Joint with M. Al-Baali and Xiaoqi Yang Peking University, 20140903 Yu-Hong Dai (AMSS, CAS) A Positive BB-like Stepsize 20140903 1 / 39

  2. Outline
     1. Introduction
     2. A Positive BB-Like Stepsize
     3. Analysis of The New Method
     4. An Extension for Symmetric Linear Systems
     5. Some Discussions

  3. Section I. Introduction

  4. Problems considered
     Unconstrained Optimization: $\min_{x \in \mathbb{R}^n} f(x)$
     Convex Quadratic Minimization: $\min_{x \in \mathbb{R}^n} Q(x) := \frac{1}{2} x^T A x - b^T x$
     Linear System: $A x = b$, $x \in \mathbb{R}^n$

  5. Steepest Descent Method (Cauchy 1847)
     $$x_{k+1} = x_k - \alpha_k g_k, \qquad \alpha_k = \arg\min_{\alpha \ge 0} f(x_k - \alpha g_k)$$
     Fast during the first several iterations.
     Linear convergence: $\|g_k\|_2 \approx \left(\frac{\kappa - 1}{\kappa + 1}\right)^k$, where $\kappa = \mathrm{cond}(\nabla^2 f(x^*))$.
     Zigzagging.
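
The linear rate and the zigzag pattern on this slide can be reproduced on a small example. The following is a minimal sketch, assuming a diagonal matrix $A = \mathrm{diag}(1, \kappa)$, $b = 0$, and the classical worst-case starting point $x_1 = (\kappa, 1)$; the function name `steepest_descent_quadratic` is hypothetical and not from the talk. For a quadratic, the exact line search has the closed form $\alpha_k = g_k^T g_k / g_k^T A g_k$.

```python
import math

def steepest_descent_quadratic(diag, x0, iters):
    """Exact-line-search steepest descent on Q(x) = 0.5 * x^T A x, A = diag(diag), b = 0.

    Returns the gradient norms at the starting point and after each of `iters` steps.
    """
    x = list(x0)
    grad_norms = []
    for _ in range(iters + 1):
        g = [d * xi for d, xi in zip(diag, x)]            # gradient g = A x
        gg = sum(gi * gi for gi in g)
        grad_norms.append(math.sqrt(gg))
        gAg = sum(d * gi * gi for d, gi in zip(diag, g))
        if gAg == 0.0:                                    # already at the minimizer
            break
        alpha = gg / gAg                                  # exact minimizer along -g
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return grad_norms

kappa = 10.0
norms = steepest_descent_quadratic([1.0, kappa], [kappa, 1.0], 20)
rate = (kappa - 1.0) / (kappa + 1.0)
for k in (1, 5, 20):
    print(k, norms[k] / norms[0], rate ** k)
```

From this particular starting point the gradient norm shrinks by exactly $(\kappa-1)/(\kappa+1)$ per step, and the second coordinate of the iterate changes sign at every step, which is the zigzag behaviour.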

  6. Barzilai-Borwein (1988)
     $$x_{k+1} = x_k - \alpha_k g_k = x_k - D_k^{-1} g_k$$
     $$D_k = \arg\min_{D = \alpha^{-1} I} \|D s_{k-1} - y_{k-1}\|_2^2 \qquad (s_{k-1} = x_k - x_{k-1},\ y_{k-1} = g_k - g_{k-1})$$
     $$\Rightarrow\ \alpha_k^{BB1} = \frac{s_{k-1}^T s_{k-1}}{s_{k-1}^T y_{k-1}}$$
     Similarly,
     $$\alpha_k^{BB2} = \frac{s_{k-1}^T y_{k-1}}{y_{k-1}^T y_{k-1}}$$
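
The BB iteration can be sketched in a few lines for a convex quadratic. This is a minimal illustration, not the speaker's code: the function name `bb_quadratic`, the choice of an exact steepest descent step for the very first iteration (the BB formulas need a previous pair $s_{k-1}, y_{k-1}$), and the small test matrix are all assumptions.

```python
import math

def matvec(A, v):
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

def bb_quadratic(A, b, x0, variant="BB1", tol=1e-10, max_iter=500):
    """Barzilai-Borwein gradient method for min 0.5 x^T A x - b^T x, A symmetric positive definite."""
    x = list(x0)
    g = [ax - bi for ax, bi in zip(matvec(A, x), b)]      # g = A x - b
    if math.sqrt(dot(g, g)) <= tol:
        return x, 0
    alpha = dot(g, g) / dot(g, matvec(A, g))              # assumed first step: exact SD step
    for k in range(1, max_iter + 1):
        x_new = [xi - alpha * gi for xi, gi in zip(x, g)]
        g_new = [ax - bi for ax, bi in zip(matvec(A, x_new), b)]
        if math.sqrt(dot(g_new, g_new)) <= tol:
            return x_new, k
        s = [p - q for p, q in zip(x_new, x)]             # s_{k-1} = x_k - x_{k-1}
        y = [p - q for p, q in zip(g_new, g)]             # y_{k-1} = g_k - g_{k-1}
        sy = dot(s, y)                                    # positive here because A is SPD
        alpha = dot(s, s) / sy if variant == "BB1" else sy / dot(y, y)
        x, g = x_new, g_new
    return x, max_iter

A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
b = [0.0, 0.0, 4.0]                                       # chosen so that the solution is (1, 2, 3)
x_bb1, iters_bb1 = bb_quadratic(A, b, [0.0, 0.0, 0.0], "BB1")
x_bb2, iters_bb2 = bb_quadratic(A, b, [0.0, 0.0, 0.0], "BB2")
print(x_bb1, iters_bb1)
print(x_bb2, iters_bb2)
```

No line search is used: the stepsize comes entirely from the previous step, which is why the gradient norms are generally nonmonotone.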

  7. Fletcher (2005), "On the Barzilai-Borwein method":
     $$\Delta u = -f \ \text{on}\ [0,1]^3, \qquad f = x(x-1)\,y(y-1)\,z(z-1)\,w(x,y,z)$$
     $$w = \exp\left(-\tfrac{1}{2}\sigma^2\left[(x-\alpha)^2 + (y-\beta)^2 + (z-\gamma)^2\right]\right)$$
     $$n = 10^6, \qquad A u = b \ \Leftrightarrow\ \min\ \tfrac{1}{2} u^T A u - b^T u$$
     Starting point $u_1 = 0$; stopping test $\|g_k\|_2 \le 10^{-6}\,\|g_1\|_2$.

  8. Numerical Results

     (σ, α, β, γ)                      BB          CG
     (20, 0.5, 0.5, 0.5)    double     543(859)    162(178)
                            single     462(964)    254(387)
     (50, 0.4, 0.7, 0.5)    double     640(1009)   285(306)
                            single     310(645)    290(443)

     But SD: after 2000 iterations, $\|g_{2000}\| / \|g_1\| = 0.18$!
     Google Scholar citations of BB: 806 (by Jan 5, 2014).
     Google Scholar citations of GPSR by Figueiredo, Wright and Nowak (2007): 1310 (by Jan 5, 2014).

  9. Efficiency Evidences of BB for Quadratic Minimization
     Barzilai-Borwein (1988): $n = 2$, R-superlinear $\left(\alpha_{k_{i_1}}^{-1} \to \lambda_1,\ \alpha_{k_{i_2}}^{-1} \to \lambda_2\right)$
     Dai & Fletcher (2005): $n = 3$, R-superlinear
     Dai & Fletcher (2005): Cyclic SD method, $m \ge \frac{n}{2} + 1$, R-superlinear
     In theory, how can one show that BB is better than SD for quadratic functions of any dimension?

  10. Quadratic Termination of Gradient Method
      $$g_{k+1} = g_k - \alpha_k A g_k = (I - \alpha_k A)\, g_k = \left[\prod_{j=1}^{k} (I - \alpha_j A)\right] g_1$$
      Assuming that $\lambda(A) = \{\lambda_1, \lambda_2, \ldots, \lambda_n\}$, by the Cayley-Hamilton theorem we must have $g_{n+1} = 0$ if
      $$\{\alpha_k : k = 1, \ldots, n\} = \{\lambda_k^{-1} : k = 1, \ldots, n\}.$$
      This property is first due to Yan-Lian Lai (1983).
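
The termination property is easy to check numerically. The sketch below assumes a 3 x 3 tridiagonal matrix whose eigenvalues are known in closed form ($2 - \sqrt{2}$, $2$, $2 + \sqrt{2}$); the matrix, the starting point, and the helper names are illustrative choices, not taken from the talk.

```python
import math

def matvec(A, v):
    return [sum(a * vi for a, vi in zip(row, v)) for row in A]

def gradient(A, b, x):
    return [ax - bi for ax, bi in zip(matvec(A, x), b)]   # g = A x - b

def run_gradient_method(A, b, x0, stepsizes):
    """Apply x <- x - alpha * g once for each prescribed stepsize alpha."""
    x = list(x0)
    for alpha in stepsizes:
        g = gradient(A, b, x)
        x = [xi - alpha * gi for xi, gi in zip(x, g)]
    return x

A = [[2.0, -1.0, 0.0], [-1.0, 2.0, -1.0], [0.0, -1.0, 2.0]]
b = [1.0, 0.0, 1.0]                                       # the solution of A x = b is (1, 1, 1)
eigenvalues = [2.0 - math.sqrt(2.0), 2.0, 2.0 + math.sqrt(2.0)]

# Use the inverse eigenvalues as the n = 3 stepsizes (any order works).
x_final = run_gradient_method(A, b, [3.0, -1.0, 2.0], [1.0 / lam for lam in eigenvalues])
final_norm = math.sqrt(sum(gi * gi for gi in gradient(A, b, x_final)))
print(x_final, final_norm)
```

Each factor $(I - \lambda_i^{-1} A)$ annihilates the component of the gradient along the $i$-th eigenvector, so after $n$ steps nothing is left apart from rounding error.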

  11. A Typical Nonmonotone Performance of BB (figure)

  12. For strictly convex quadratics of any dimension
      Raydan (1993): global convergence
      Dai & Liao (2002): R-linear convergence
      We can then show that the BB stepsize is asymptotically accepted by the nonmonotone line search in the context of unconstrained optimization. This property is similar to that of quasi-Newton methods, where the stepsize $\alpha_k = 1$ is usually tried first by the Wolfe line search and is eventually always accepted.

  13. Globalization Technique for General Functions
      Raydan (1997): GLL nonmonotone line search
      $$f(x_k - \alpha g_k) \le f_{\mathrm{ref}} - \delta \alpha \|g_k\|^2, \qquad f_{\mathrm{ref}} = \max_{j = 1, \ldots, m} f_{k-j}$$
      Dai & Zhang (2001): Adaptive nonmonotone line search
      Initialization: $f_{\mathrm{ref}} = +\infty$, $H \in [4, 10]$
      If $f_k \le f_{\mathrm{best}}$: $f_{\mathrm{best}} = f_k$, $f_c = f_k$, $h = 0$;
      Else: $f_c = \max\{f_c, f_k\}$, $h = h + 1$; if $h = H$: $f_{\mathrm{ref}} = f_c$, search, $f_c = f_k$, $h = 0$
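
A BB method globalized by a GLL-type nonmonotone line search can be sketched as follows. This is an illustrative sketch under several assumptions that are not in the slide: the reference value is the maximum of the most recent $m$ function values including the current one (the usual GLL convention), the trial step is halved on failure, the first trial step is $1/\|g_1\|$, a fallback step of 1 is used when $s^T y \le 0$, and the test function `f_demo` is made up for the example.

```python
import math
from collections import deque

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def bb_gll(f, grad, x0, m=10, delta=1e-4, tol=1e-8, max_iter=1000):
    """BB1 stepsize as the trial step, accepted through a nonmonotone (GLL-type) test."""
    x = list(x0)
    g = grad(x)
    recent_f = deque([f(x)], maxlen=m)                    # last m function values
    alpha = 1.0 / max(math.sqrt(dot(g, g)), 1e-12)        # assumed initial trial step
    for k in range(max_iter):
        gnorm2 = dot(g, g)
        if math.sqrt(gnorm2) <= tol:
            return x, k
        f_ref = max(recent_f)                             # nonmonotone reference value
        step = alpha
        while True:
            x_new = [xi - step * gi for xi, gi in zip(x, g)]
            f_new = f(x_new)
            if f_new <= f_ref - delta * step * gnorm2:    # sufficient decrease w.r.t. f_ref
                break
            step *= 0.5                                   # simple backtracking (assumption)
        g_new = grad(x_new)
        s = [a - b for a, b in zip(x_new, x)]
        y = [a - b for a, b in zip(g_new, g)]
        sy = dot(s, y)
        alpha = dot(s, s) / sy if sy > 0.0 else 1.0       # fallback value is an assumption
        x, g = x_new, g_new
        recent_f.append(f_new)
    return x, max_iter

def f_demo(x):
    return (x[0] - 1.0) ** 4 + (x[0] - 1.0) ** 2 + 5.0 * (x[1] + 2.0) ** 2

def grad_demo(x):
    return [4.0 * (x[0] - 1.0) ** 3 + 2.0 * (x[0] - 1.0), 10.0 * (x[1] + 2.0)]

x_star, n_iter = bb_gll(f_demo, grad_demo, [3.0, 1.0])
print(x_star, n_iter)
```

Because the test compares against the recent maximum rather than the current value, the BB trial step is usually accepted unchanged, which preserves the nonmonotone behaviour that makes BB efficient.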

  14. Section II. A Positive BB-Like Stepsize

  15. Motivation
      What to do if the BB stepsize
      $$\alpha_k^{BB1} = \frac{s_{k-1}^T s_{k-1}}{s_{k-1}^T y_{k-1}} \quad \text{or} \quad \alpha_k^{BB2} = \frac{s_{k-1}^T y_{k-1}}{y_{k-1}^T y_{k-1}}$$
      is very small or even negative?
      Project it onto the interval $[\alpha_k^{\min}, \alpha_k^{\max}]$?
      How to choose $\alpha_k^{\min}$ (and $\alpha_k^{\max}$)? $10^{-30}$, $10^{-8}$, $10^{-5}$, ...
      For a symmetric but not necessarily positive definite linear system $A x = b$, $x \in \mathbb{R}^n$, how to approximate the (inverse) Jacobian matrix by the form $\alpha I$, given that the matrix may have negative eigenvalues?

  16. The New Positive Stepsize
      $$\alpha_k = \frac{\|s_{k-1}\|}{\|y_{k-1}\|} \qquad (1)$$
      Mentioned on several previous occasions, but not carefully studied [e.g., Dai & Yuan (2001), Dai (2003), Dai & Yang (2006), Mehiddin Al-Baali (2007)].
      Property 1: Geometric mean
      $$\alpha_k = \sqrt{\alpha_k^{BB1} \cdot \alpha_k^{BB2}} \qquad (2)$$
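
The geometric-mean property (2) and the positivity of stepsize (1) can be checked directly. A minimal sketch follows; the function name `stepsizes` and the sample vectors are hypothetical. When $s^T y > 0$ the new stepsize lies between BB2 and BB1, and when $s^T y < 0$ both BB stepsizes are negative while (1) stays positive.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def stepsizes(s, y):
    """Return (BB1, BB2, new stepsize ||s|| / ||y||) for a pair (s, y) with s^T y != 0."""
    sy = dot(s, y)
    bb1 = dot(s, s) / sy
    bb2 = sy / dot(y, y)
    new = math.sqrt(dot(s, s)) / math.sqrt(dot(y, y))
    return bb1, bb2, new

# Positive curvature: s^T y = 6.3 > 0.
bb1, bb2, new = stepsizes([1.0, 2.0, -1.0], [0.5, 3.0, 0.2])
print(bb1, bb2, new, math.sqrt(bb1 * bb2))

# Negative curvature: s^T y = -2 < 0, so BB1 and BB2 are negative.
neg_bb1, neg_bb2, neg_new = stepsizes([1.0, 0.0], [-2.0, 0.1])
print(neg_bb1, neg_bb2, neg_new)
```

The product BB1 times BB2 equals $s^T s / y^T y$ regardless of the sign of $s^T y$, which is why the square root in (2) is always well defined.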

  17. The New Positive Stepsize (Cont.)
      Property 2: Certain quasi-Newton property
      Two features of $\nabla^2 f(x_k)$:
      $$s_{k-1}^T \nabla^2 f(x_k)\, s_{k-1} \approx s_{k-1}^T y_{k-1} \qquad (3)$$
      $$y_{k-1}^T \nabla^2 f(x_k)^{-1}\, y_{k-1} \approx s_{k-1}^T y_{k-1} \qquad (4)$$
      Approximation: $\nabla^2 f(x_k)^{-1} \leftarrow H = \alpha I$, $\nabla^2 f(x_k) \leftarrow H^{-1} = \alpha^{-1} I$
      $$\alpha_k = \arg\min_{H = \alpha I \succ 0} \left[ s_{k-1}^T H^{-1} s_{k-1} + y_{k-1}^T H y_{k-1} - 2\, s_{k-1}^T y_{k-1} \right]$$
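
The minimization in Property 2 can be worked out in one line, which shows why it returns the stepsize (1). In the sketch below, $\varphi$ is an auxiliary name introduced only for this derivation, and the indices $k-1$ on $s$ and $y$ are dropped for brevity.

```latex
\begin{aligned}
\varphi(\alpha) &= s^T (\alpha I)^{-1} s + y^T (\alpha I)\, y - 2\, s^T y
                 = \frac{\|s\|^2}{\alpha} + \alpha \|y\|^2 - 2\, s^T y, \qquad \alpha > 0, \\
\varphi'(\alpha) &= -\frac{\|s\|^2}{\alpha^2} + \|y\|^2 = 0
                 \;\Longrightarrow\; \alpha = \frac{\|s\|}{\|y\|}, \\
\varphi''(\alpha) &= \frac{2\|s\|^2}{\alpha^3} > 0, \qquad
\varphi\!\left(\tfrac{\|s\|}{\|y\|}\right) = 2\left(\|s\|\,\|y\| - s^T y\right) \ge 0 .
\end{aligned}
```

The minimum value is nonnegative by the Cauchy-Schwarz inequality, so the objective measures how far the scalar matrix $\alpha I$ is from satisfying (3) and (4) simultaneously.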

  18. Property 3: One-retard extension of [Dai & Yang, 2006]
      $$\alpha_k^{DY} = \frac{\|g_k\|}{\|A g_k\|} \qquad (5)$$
      The stepsize (5) is shown to tend to some optimal stepsize:
      $$\liminf_{k \to \infty} \alpha_k^{DY} = \frac{2}{\lambda_1 + \lambda_n} = \arg\min_{\alpha \ge 0} \|I - \alpha A\|. \qquad (6)$$
      Both the solution and the minimal/maximal eigenpairs can be obtained simultaneously (one stone, two birds).
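
The right-hand side of (6) follows from a short standard argument. The sketch below assumes that $A$ is symmetric positive definite with eigenvalues ordered as $0 < \lambda_1 \le \cdots \le \lambda_n$ and that $\|\cdot\|$ is the spectral norm; these assumptions are not spelled out on the slide.

```latex
\begin{aligned}
\|I - \alpha A\|_2 &= \max_{1 \le i \le n} |1 - \alpha \lambda_i|
                    = \max\bigl\{\, |1 - \alpha \lambda_1|,\ |1 - \alpha \lambda_n| \,\bigr\}, \\
1 - \alpha \lambda_1 &= -(1 - \alpha \lambda_n)
  \;\Longrightarrow\; \alpha^* = \frac{2}{\lambda_1 + \lambda_n}, \\
\|I - \alpha^* A\|_2 &= \frac{\lambda_n - \lambda_1}{\lambda_n + \lambda_1}
                      = \frac{\kappa - 1}{\kappa + 1}, \qquad \kappa = \frac{\lambda_n}{\lambda_1}.
\end{aligned}
```

The maximum of the two absolute values is smallest where they are equal, which gives $\alpha^*$; the resulting contraction factor is the same quantity that governs the steepest descent rate.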

  19. Section III. Analysis of The New Method

  20. Some notations
      Assume that
      $$A = \begin{pmatrix} 1 & \\ & \lambda \end{pmatrix}, \qquad b = \begin{pmatrix} 0 \\ 0 \end{pmatrix}, \qquad \lambda > 1.$$
      Denote $g_k = (g_k^{(1)}, g_k^{(2)})^T$.
      Assumption 1: $\lambda > 1$ (7)
      Assumption 2: $g_1^{(i)} \ne 0$, $g_2^{(i)} \ne 0$, $i = 1, 2$ (8)
      Define
      $$q_k = \frac{\left(g_k^{(1)}\right)^2}{\left(g_k^{(2)}\right)^2} \qquad (9)$$

  21. Some basic relations
      $$\alpha_k = \frac{\|s_{k-1}\|}{\|y_{k-1}\|} = \frac{\|g_{k-1}\|}{\|A g_{k-1}\|} = \frac{\sqrt{1 + q_{k-1}}}{\sqrt{\lambda^2 + q_{k-1}}} \qquad (10)$$
      $$g_{k+1} = (I - \alpha_k A)\, g_k \qquad (11)$$
      $$\begin{cases} g_{k+1}^{(1)} = (1 - \alpha_k)\, g_k^{(1)} \\ g_{k+1}^{(2)} = (1 - \lambda \alpha_k)\, g_k^{(2)} \end{cases}
      \ \Longrightarrow\
      \begin{cases} g_{k+1}^{(1)} = \dfrac{\sqrt{\lambda^2 + q_{k-1}} - \sqrt{1 + q_{k-1}}}{\sqrt{\lambda^2 + q_{k-1}}}\; g_k^{(1)} \\[2ex] g_{k+1}^{(2)} = \dfrac{\sqrt{\lambda^2 + q_{k-1}} - \lambda \sqrt{1 + q_{k-1}}}{\sqrt{\lambda^2 + q_{k-1}}}\; g_k^{(2)} \end{cases} \qquad (12)$$

  22. Recurrence relation of $q_k$
      $$q_{k+1} = q_k \left[ \frac{\sqrt{\lambda^2 + q_{k-1}} - \sqrt{1 + q_{k-1}}}{\sqrt{\lambda^2 + q_{k-1}} - \lambda \sqrt{1 + q_{k-1}}} \right]^2$$
      $$= q_k \left[ \frac{\left(\sqrt{\lambda^2 + q_{k-1}} - \sqrt{1 + q_{k-1}}\right)\left(\sqrt{\lambda^2 + q_{k-1}} + \lambda \sqrt{1 + q_{k-1}}\right)}{(\lambda^2 - 1)\, q_{k-1}} \right]^2$$
      $$= \frac{q_k}{q_{k-1}^2} \left[ \frac{\lambda - q_{k-1} + \sqrt{\tau(q_{k-1})}}{\lambda + 1} \right]^2, \qquad (13)$$
      where
      $$\tau(w) = (1 + w)(\lambda^2 + w), \quad w \ge 0 \qquad (14)$$
      $$h(w) = \frac{\lambda - w + \sqrt{\tau(w)}}{\lambda + 1}, \quad w \ge 0 \qquad (15)$$
      Define $M_k = \log q_k$. Then we obtain
      $$M_{k+1} = M_k - 2 M_{k-1} + 2 \log\bigl(h(q_{k-1})\bigr) \qquad (16)$$
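
The recurrence (13) can be sanity-checked by running the method on the two-dimensional model problem and comparing the observed ratios $q_k$ with the prediction $q_{k+1} = q_k\, h(q_{k-1})^2 / q_{k-1}^2$. This is a minimal sketch: the values $\lambda = 3$, $g_1 = (1, 1)$, and the first stepsize $\alpha_1 = 0.5$ are arbitrary illustrative choices (the talk does not specify a first step), and the function names are hypothetical.

```python
import math

def simulate_q(lam, g1, alpha1, steps):
    """Gradient method with alpha_k = ||g_{k-1}|| / ||A g_{k-1}|| on A = diag(1, lam), b = 0.

    Returns [q_1, q_2, ...] with q_k = (g_k^(1) / g_k^(2))^2.
    """
    def new_stepsize(g):
        return math.hypot(g[0], g[1]) / math.hypot(g[0], lam * g[1])   # relation (10)

    g_prev = list(g1)
    g = [(1.0 - alpha1) * g1[0], (1.0 - lam * alpha1) * g1[1]]          # g_2 from the assumed first step
    q = [(g_prev[0] / g_prev[1]) ** 2, (g[0] / g[1]) ** 2]
    for _ in range(steps):
        alpha = new_stepsize(g_prev)                                   # uses the previous gradient
        g_prev, g = g, [(1.0 - alpha) * g[0], (1.0 - lam * alpha) * g[1]]   # relation (12)
        q.append((g[0] / g[1]) ** 2)
    return q

def h(w, lam):
    tau = (1.0 + w) * (lam ** 2 + w)                                   # tau(w) from (14)
    return (lam - w + math.sqrt(tau)) / (lam + 1.0)                    # h(w) from (15)

def predicted_next_q(q_k, q_km1, lam):
    return q_k * (h(q_km1, lam) / q_km1) ** 2                          # recurrence (13)

lam = 3.0
q = simulate_q(lam, [1.0, 1.0], 0.5, 8)
for k in range(1, len(q) - 1):
    print(q[k + 1], predicted_next_q(q[k], q[k - 1], lam))
```

The check also confirms that the denominator in (13) is $\lambda + 1$, in agreement with the definition of $h$ in (15).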

  23. The difficulty: Previously, for the BB1 or BB2 method, we could get the linear recurrence relation $M_{k+1} = M_k - 2 M_{k-1}$. But now we have a nonlinear recurrence relation.
