
Liege University: Francqui Chair 2011-2012 Lecture 1: Intrinsic complexity of Black-Box Optimization

Liege University: Francqui Chair 2011-2012 Lecture 1: Intrinsic complexity of Black-Box Optimization Yurii Nesterov, CORE/INMA (UCL) February 24, 2012 Yu. Nesterov () Complexity of Black-Box Optimization 1/26 February 24, 2012 1 / 26


  1. Liege University: Francqui Chair 2011-2012. Lecture 1: Intrinsic complexity of Black-Box Optimization. Yurii Nesterov, CORE/INMA (UCL). February 24, 2012.

  2. Outline
     1. Basic NP-hard problem
     2. NP-hardness of some popular problems
     3. Lower complexity bounds for Global Minimization
     4. Nonsmooth Convex Minimization. Subgradient scheme.
     5. Smooth Convex Minimization. Lower complexity bounds
     6. Methods for Smooth Minimization with Simple Constraints

  3. Standard Complexity Classes
     Let the data be coded in a matrix $A$, and let $n$ be the dimension of the problem.
     Combinatorial Optimization
     NP-hard problems: $2^n$ operations. Solvable in $O(p(n)\,\|A\|)$.
     Fully polynomial approximation schemes: $O\!\left(\frac{p(n)}{\epsilon^{k}} \ln^{\alpha}\|A\|\right)$.
     Polynomial-time problems: $O(p(n) \ln^{\alpha}\|A\|)$.
     Continuous Optimization
     Sublinear complexity: $O\!\left(\frac{p(n)}{\epsilon^{\alpha}} \|A\|^{\beta}\right)$, $\alpha, \beta > 0$.
     Polynomial-time complexity: $O\!\left(p(n) \ln\!\left(\frac{1}{\epsilon}\|A\|\right)\right)$.

  4. Basic NP-hard problem: Problem of stones
     Given $n$ stones of integer weights $a_1, \dots, a_n$, decide if it is possible to divide them into two parts of equal weight.
     Mathematical formulation: find a Boolean solution $x_i = \pm 1$, $i = 1, \dots, n$, to a single linear equation
     $$\sum_{i=1}^n a_i x_i = 0.$$
     Another variant: $\sum_{i=2}^n a_i x_i = a_1$.
     NB: Solvable in $O\!\left(\ln n \cdot \sum_{i=1}^n |a_i|\right)$ by FFT transform.
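
The decision problem on slide 4 can be sketched in a few lines. This is a minimal illustration that assumes positive integer weights and uses a standard subset-sum bitset recursion, not the FFT-based method the slide mentions; the function name `can_split_equally` is hypothetical. Its running time grows with the total weight $\sum_i a_i$, which is consistent with the slide's remark that the difficulty comes from the size of the weights rather than from $n$ alone.

```python
def can_split_equally(weights):
    """Return True if the stones can be divided into two parts of equal weight.

    Assumes positive integer weights. Standard subset-sum bitset recursion
    (an illustrative substitute for the FFT approach named on the slide).
    """
    total = sum(weights)
    if total % 2 == 1:
        return False  # an odd total can never be split evenly
    target = total // 2
    # Bit s of `reachable` is 1 iff some subset of the stones seen so far weighs s.
    reachable = 1
    for w in weights:
        reachable |= reachable << w
    return (reachable >> target) & 1 == 1


print(can_split_equally([3, 1, 1, 2, 2, 1]))  # total 10, e.g. {3, 2} against {1, 1, 2, 1}
print(can_split_equally([2, 3, 4]))           # total 9 is odd
```

A subset of weight $\frac{1}{2}\sum_i a_i$ corresponds to the signs $x_i = +1$ on the subset and $x_i = -1$ elsewhere, which is exactly a solution of $\sum_i a_i x_i = 0$.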

  5. Immediate consequence: quartic polynomial
     Theorem: Minimization of a quartic polynomial of $n$ variables is NP-hard.
     Proof: Consider the following function:
     $$f(x) = \sum_{i=1}^n x_i^4 - \frac{1}{n}\left(\sum_{i=1}^n x_i^2\right)^2 + \left(\sum_{i=1}^n a_i x_i\right)^4 + (1 - x_1)^4.$$
     The first two terms equal $\langle A[x]^2, [x]^2 \rangle$, where $A = I - \frac{1}{n} e_n e_n^T \succeq 0$ with $A e_n = 0$, and $([x]^2)_i = x_i^2$, $i = 1, \dots, n$.
     Thus, $f(x) = 0$ iff all $x_i^2$ share a common value $\tau$, $\sum_{i=1}^n a_i x_i = 0$, and $x_1 = 1$ (so $\tau = 1$ and every $x_i = \pm 1$).
     Corollary: Minimization of a convex quartic polynomial over the unit sphere is NP-hard.
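
As a small numerical check of the reduction on slide 5, the sketch below evaluates the quartic polynomial at sign vectors. The weights `a` and the helper name `quartic` are assumptions made for illustration only.

```python
def quartic(x, a):
    """Evaluate f(x) = sum x_i^4 - (1/n)(sum x_i^2)^2 + (sum a_i x_i)^4 + (1 - x_1)^4."""
    n = len(x)
    sum_x4 = sum(v ** 4 for v in x)
    sum_x2 = sum(v ** 2 for v in x)
    lin = sum(ai * xi for ai, xi in zip(a, x))
    return sum_x4 - sum_x2 ** 2 / n + lin ** 4 + (1 - x[0]) ** 4


a = [3, 1, 1, 2, 2, 1]                 # hypothetical stone weights
balanced = [1, 1, 1, -1, -1, -1]       # 3 + 1 + 1 - 2 - 2 - 1 = 0, and x_1 = 1
unbalanced = [1, 1, 1, 1, -1, -1]      # the linear form equals 4
flipped = [-v for v in balanced]       # still balanced, but x_1 = -1

print(quartic(balanced, a))    # every term vanishes
print(quartic(unbalanced, a))  # the fourth power of the linear form remains
print(quartic(flipped, a))     # only the (1 - x_1)^4 term remains
```

The last term $(1 - x_1)^4$ only removes the sign ambiguity: if $\sigma$ solves the stones problem, so does $-\sigma$, and exactly one of the two has first coordinate $+1$.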

  6. Nonlinear Optimal Control: NP-hard
     Problem: $\min_u \{ f(x(1)) : x' = g(x, u),\ 0 \le t \le 1,\ x(0) = x_0 \}$.
     Consider $g(x, u) = \frac{1}{n}\, x \cdot \langle x, u \rangle - u$.
     Lemma. Let $\|x_0\|^2 = n$. Then $\|x(t)\|^2 = n$, $0 \le t \le 1$.
     Proof. Consider $\tilde g(x, u) = \left(\frac{x x^T}{\|x\|^2} - I\right) u$ and let $x' = \tilde g(x, u)$. Then
     $$\langle x', x \rangle = \left\langle \left(\frac{x x^T}{\|x\|^2} - I\right) u, x \right\rangle = 0.$$
     Thus, $\|x(t)\|^2 = \|x_0\|^2$. The same is true for $x(t)$ defined by $g$, since $g$ and $\tilde g$ coincide whenever $\|x\|^2 = n$.
     Note: We have enough degrees of freedom to put $x(1)$ at any position on the sphere.
     Hence, our problem is: $\min \{ f(y) : \|y\|^2 = n \}$.

  7. Descent direction of nonsmooth nonconvex function
     Consider
     $$\phi(x) = \left(1 - \frac{1}{\gamma}\right) \max_{1 \le i \le n} |x_i| - \min_{1 \le i \le n} |x_i| + |\langle a, x \rangle|,$$
     where $a \in \mathbb{Z}^n_+$ and $\gamma \overset{\mathrm{def}}{=} \sum_{i=1}^n a_i \ge 1$. Clearly, $\phi(0) = 0$.
     Lemma. It is NP-hard to decide if $\phi(x) < 0$ for some $x \in \mathbb{R}^n$.
     Proof:
     1. Assume that $\sigma \in \mathbb{R}^n$ with $\sigma_i = \pm 1$ satisfies $\langle a, \sigma \rangle = 0$. Then $\phi(\sigma) = -\frac{1}{\gamma} < 0$.
     2. Assume $\phi(x) < 0$ and $\max_{1 \le i \le n} |x_i| = 1$. Denote $\delta = |\langle a, x \rangle|$. Then $|x_i| > 1 - \frac{1}{\gamma} + \delta$, $i = 1, \dots, n$. Denoting $\sigma_i = \operatorname{sign} x_i$, we have $\sigma_i x_i > 1 - \frac{1}{\gamma} + \delta$. Therefore, $|\sigma_i - x_i| = 1 - \sigma_i x_i < \frac{1}{\gamma} - \delta$, and we conclude that
     $$|\langle a, \sigma \rangle| \le |\langle a, x \rangle| + |\langle a, \sigma - x \rangle| \le \delta + \gamma \max_{1 \le i \le n} |\sigma_i - x_i| < (1 - \gamma)\delta + 1 \le 1.$$
     Since $a \in \mathbb{Z}^n$, this is possible iff $\langle a, \sigma \rangle = 0$.
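
Part 1 of the proof on slide 7 can be checked numerically. The following sketch evaluates $\phi$ with exact rational arithmetic; the weight vector `a` and the function name `phi` are illustrative assumptions.

```python
from fractions import Fraction


def phi(x, a):
    """phi(x) = (1 - 1/gamma) * max|x_i| - min|x_i| + |<a, x>|, with gamma = sum(a)."""
    gamma = Fraction(sum(a))
    abs_x = [abs(v) for v in x]
    inner = sum(ai * xi for ai, xi in zip(a, x))
    return (1 - 1 / gamma) * max(abs_x) - min(abs_x) + abs(inner)


a = [3, 1, 1, 2, 2, 1]              # hypothetical weights, gamma = 10
sigma = [1, 1, 1, -1, -1, -1]       # <a, sigma> = 0
other = [1, 1, 1, 1, -1, -1]        # <a, other> = 4

print(phi(sigma, a))     # equals -1/gamma, so phi takes a negative value
print(phi(other, a))     # positive: this sign vector is not a descent point
print(phi([0] * 6, a))   # phi(0) = 0
```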

  8. Black-box optimization
     Oracle: Special unit for computing function value and derivatives at test points. (0-1-2 order.)
     Analytic complexity: Number of calls of the oracle that is necessary (sufficient) for solving any problem from the class. (Lower/Upper complexity bounds.)
     Solution: $\epsilon$-approximation of the minimum.
     Resisting oracle: creates the worst problem instance for a particular method. It starts from an "empty" problem. Its answers must be compatible with the description of the problem class. The bad problem is created after the method stops.

  9. Bounds for Global Minimization
     Problem: $f^* = \min_x \{ f(x) : x \in B_n \}$, $B_n = \{ x \in \mathbb{R}^n : 0 \le x \le e_n \}$.
     Problem Class: $|f(x) - f(y)| \le L \|x - y\|_\infty$ $\forall x, y \in B_n$.
     Oracle: $f(x)$ (zero order).
     Goal: Find $\bar x \in B_n$: $f(\bar x) - f^* \le \epsilon$.
     Theorem: $N(\epsilon) \ge \left\lfloor \frac{L}{2\epsilon} \right\rfloor^n$.
     Proof. Divide $B_n$ into $p^n$ $l_\infty$-balls of radius $\frac{1}{2p}$. Resisting oracle: at each test point reply $f(x) = 0$. Assume $N < p^n$. Then there exists a ball with no questions. Hence, we can take $f^* = -\frac{L}{2p}$. Hence, $\epsilon \ge \frac{L}{2p}$.
     Corollary: Uniform Grid method is worst-case optimal.
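
The uniform grid method named in the corollary of slide 9 can be sketched as follows. This is one possible variant, assumed here for illustration: it queries the centers of the $p^n$ cells from the proof, so every point of $B_n$ lies within $l_\infty$-distance $\frac{1}{2p}$ of a test point and the returned value is within $\frac{L}{2p}$ of $f^*$. The function name and the test function are hypothetical.

```python
import itertools


def uniform_grid_minimize(f, n, p):
    """Query f at the centers of the p**n cells of the unit box; return the best point."""
    centers_1d = [(2 * i + 1) / (2 * p) for i in range(p)]
    best_x, best_val = None, float("inf")
    for x in itertools.product(centers_1d, repeat=n):
        val = f(x)  # one zero-order oracle call
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val


# Hypothetical test function: Lipschitz with L = 1 in the l_infinity norm, f* = 0.
def f(x):
    return max(abs(xi - 0.37) for xi in x)


best_x, best_val = uniform_grid_minimize(f, n=2, p=5)
print(best_x, best_val)          # error stays below L / (2p) = 0.1
```

The cost is $p^n$ oracle calls, which matches the exponential growth of the lower bound in the theorem.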

  10. Nonsmooth Convex Minimization (NCM)
     Problem: $f^* = \min_x \{ f(x) : x \in Q \}$, where
     $Q \subseteq \mathbb{R}^n$ is a convex set: $x, y \in Q \Rightarrow [x, y] \subseteq Q$. It is simple.
     $f(x)$ is a sub-differentiable convex function:
     $$f(y) \ge f(x) + \langle f'(x), y - x \rangle, \quad x, y \in Q,$$
     for a certain subgradient $f'(x) \in \mathbb{R}^n$.
     Oracle: $f(x)$, $f'(x)$ (first order).
     Solution: $\epsilon$-approximation in function value.
     Main inequality: $\langle f'(x), x - x^* \rangle \ge f(x) - f^* \ge 0$, $\forall x \in Q$.
     NB: Anti-subgradient decreases the distance to the optimum.

  11. NCM: Lower Complexity Bounds
     Let $Q \equiv \{ \|x\| \le 2R \}$ and $x_{k+1} \in x_0 + \mathrm{Lin}\{ f'(x_0), \dots, f'(x_k) \}$.
     Consider the function $f_m(x) = L \max_{1 \le i \le m} x_i + \frac{\mu}{2} \|x\|^2$ with $\mu = \frac{L}{R m^{1/2}}$.
     From the problem $\min_\tau \left( L\tau + \frac{\mu m}{2} \tau^2 \right)$, we get
     $$\tau_* = -\frac{L}{\mu m} = -\frac{R}{m^{1/2}}, \quad f_m^* = -\frac{L^2}{2\mu m} = -\frac{LR}{2 m^{1/2}}, \quad \|x^*\|^2 = m \tau_*^2 = R^2.$$
     NB: If $x_0 = 0$, then after $k$ iterations we can keep $x_i = 0$ for $i > k$.
     Lipschitz continuity: $f_{k+1}(x_k) - f_{k+1}^* \ge -f_{k+1}^* = \frac{LR}{2 (k+1)^{1/2}}$.
     Strong convexity: $f_{k+1}(x_k) - f_{k+1}^* \ge -f_{k+1}^* = \frac{L^2}{2(k+1) \cdot \mu}$.
     Both lower bounds are exact!

  12. Subgradient Method
     Problem: $\min_{x \in Q} \{ f(x) : g(x) \le 0 \}$, where $Q$ is a closed convex set, and convex $f, g \in C^{0,0}_L(Q)$.
     Method: If $g(x_k) > h \|g'(x_k)\|$, then
     a) $x_{k+1} = \pi_Q\!\left( x_k - \frac{g(x_k)}{\|g'(x_k)\|^2}\, g'(x_k) \right)$,
     else
     b) $x_{k+1} = \pi_Q\!\left( x_k - \frac{h}{\|f'(x_k)\|}\, f'(x_k) \right)$.
     Denote $f_N^* = \min_{0 \le k \le N} \{ f(x_k) : k \in \text{b)} \}$. Let $N = N_a + N_b$.
     Theorem: If $N > \frac{1}{h^2} \|x_0 - x^*\|^2$, then $f_N^* - f^* \le hL$. ($h = \frac{\epsilon}{L}$.)
     Proof: Denote $r_k = \|x_k - x^*\|$.
     a): $r_{k+1}^2 - r_k^2 \le -\frac{2 g(x_k)}{\|g'(x_k)\|^2} \langle g'(x_k), x_k - x^* \rangle + \frac{g^2(x_k)}{\|g'(x_k)\|^2} \le -h^2$.
     b): $r_{k+1}^2 - r_k^2 \le -\frac{2h}{\|f'(x_k)\|} \langle f'(x_k), x_k - x^* \rangle + h^2 \le -\frac{2h}{L} (f(x_k) - f^*) + h^2$.
     Thus, $N_b \frac{2h}{L} (f_N^* - f^*) \le r_0^2 + h^2 (N_b - N_a) = r_0^2 + h^2 (2 N_b - N)$.
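
The switching scheme on slide 12 can be sketched directly from steps a) and b). Everything concrete below is an assumption chosen for illustration: the test problem ($f(x) = |x_1| + |x_2|$, $g(x) = 1 - x_1$, $Q = [-2, 2]^2$, so $x^* = (1, 0)$, $f^* = 1$ and $L = \sqrt{2}$), the convention of taking $+1$ as the subgradient of $|v|$ at $v = 0$, and all function names.

```python
import math


def norm(v):
    return math.sqrt(sum(c * c for c in v))


def switching_subgradient(f, f_sub, g, g_sub, project, x0, h, num_steps):
    """Run num_steps of the a)/b) scheme; return the best point seen at a b) step."""
    x = list(x0)
    best_x, best_f = None, float("inf")
    for _ in range(num_steps):
        gx, dg = g(x), g_sub(x)
        if gx > h * norm(dg):
            # Step a): the constraint is violated by more than h * ||g'||.
            step, direction = gx / norm(dg) ** 2, dg
        else:
            # Step b): nearly feasible, so record f and take a step of length h.
            fx, df = f(x), f_sub(x)
            if fx < best_f:
                best_x, best_f = list(x), fx
            if norm(df) == 0:
                break  # zero subgradient: x already minimizes f
            step, direction = h / norm(df), df
        x = project([xi - step * di for xi, di in zip(x, direction)])
    return best_x, best_f


# Hypothetical test problem (see the assumptions above).
f = lambda x: abs(x[0]) + abs(x[1])
f_sub = lambda x: [1.0 if v >= 0 else -1.0 for v in x]
g = lambda x: 1.0 - x[0]
g_sub = lambda x: [-1.0, 0.0]
project = lambda x: [min(2.0, max(-2.0, v)) for v in x]

x0, h = [-1.0, 2.0], 0.02
r0_sq = (x0[0] - 1.0) ** 2 + (x0[1] - 0.0) ** 2   # ||x0 - x*||^2 = 8
num_steps = int(r0_sq / h ** 2) + 1               # N > r0^2 / h^2
best_x, best_f = switching_subgradient(f, f_sub, g, g_sub, project, x0, h, num_steps)
print(best_x, best_f, "bound on f_N* - f*:", h * math.sqrt(2))
```

Points recorded at b) steps satisfy $g(x_k) \le h \|g'(x_k)\|$, so the returned point is feasible only up to that tolerance.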

  13. Smooth Convex Minimization (SCM)
     Lipschitz-continuous gradient: $\|f'(x) - f'(y)\| \le L \|x - y\|$.
     Geometric interpretation: for all $x, y \in \operatorname{dom} f$ we have
     $$0 \le f(y) - f(x) - \langle f'(x), y - x \rangle = \int_0^1 \langle f'(x + \tau (y - x)) - f'(x), y - x \rangle \, d\tau \le \frac{L}{2} \|x - y\|^2.$$
     Sufficient condition: $0 \preceq f''(x) \preceq L \cdot I_n$, $x \in \operatorname{dom} f$.
     Equivalent definition:
     $$f(y) \ge f(x) + \langle f'(x), y - x \rangle + \frac{1}{2L} \|f'(x) - f'(y)\|^2.$$
     Hint: Prove first that $f(x) - f^* \ge \frac{1}{2L} \|f'(x)\|^2$.

  14. SCM: Lower complexity bounds
     Consider the family of functions ($k \le n$):
     $$f_k(x) = \frac{1}{2} \left[ x_1^2 + \sum_{i=1}^{k-1} (x_i - x_{i+1})^2 + x_k^2 \right] - x_1 \equiv \frac{1}{2} \langle A_k x, x \rangle - x_1.$$
     Let $\mathbb{R}^n_k = \{ x \in \mathbb{R}^n : x_i = 0,\ i > k \}$. Then $f_{k+p}(x) = f_k(x)$, $x \in \mathbb{R}^n_k$.
     Clearly,
     $$0 \le \langle A_k h, h \rangle \le h_1^2 + \sum_{i=1}^{k-1} 2 (h_i^2 + h_{i+1}^2) + h_k^2 \le 4 \|h\|^2,$$
     $$A_k = \begin{pmatrix} \begin{matrix} 2 & -1 & & & 0 \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ 0 & & & -1 & 2 \end{matrix} & 0_{k,\,n-k} \\ 0_{n-k,\,k} & 0_{n-k,\,n-k} \end{pmatrix},$$
     where the upper-left tridiagonal block has $k$ rows.
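
The identities on slide 14 are easy to confirm numerically. The sketch below builds $A_k$, then checks $f_k(x) = \frac{1}{2}\langle A_k x, x\rangle - x_1$, the bounds $0 \le \langle A_k h, h\rangle \le 4\|h\|^2$, and $f_{k+p}(x) = f_k(x)$ on $\mathbb{R}^n_k$ for random integer vectors. The helper names and the sizes $k = 3$, $n = 5$ are assumptions for illustration.

```python
import random


def build_A(k, n):
    """n x n matrix whose upper-left k x k block is tridiagonal (-1, 2, -1)."""
    A = [[0] * n for _ in range(n)]
    for i in range(k):
        A[i][i] = 2
        if i + 1 < k:
            A[i][i + 1] = A[i + 1][i] = -1
    return A


def quad_form(A, h):
    """Compute <A h, h>."""
    m = len(h)
    return sum(A[i][j] * h[i] * h[j] for i in range(m) for j in range(m))


def f_k(x, k):
    """f_k(x) = (1/2) [x_1^2 + sum_{i<k} (x_i - x_{i+1})^2 + x_k^2] - x_1."""
    s = x[0] ** 2 + sum((x[i] - x[i + 1]) ** 2 for i in range(k - 1)) + x[k - 1] ** 2
    return 0.5 * s - x[0]


k, n = 3, 5
A = build_A(k, n)
random.seed(0)
for _ in range(100):
    h = [random.randint(-5, 5) for _ in range(n)]
    q = quad_form(A, h)
    assert f_k(h, k) == 0.5 * q - h[0]                # the two forms of f_k agree
    assert 0 <= q <= 4 * sum(v * v for v in h)        # 0 <= <A_k h, h> <= 4 ||h||^2
    x = h[:k] + [0] * (n - k)                         # a point of R^n_k
    assert f_k(x, k) == f_k(x, n)                     # f_{k+p} = f_k on R^n_k
print("all checks passed for k =", k, "and n =", n)
```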
