
Liege University: Francqui Chair 2011-2012
Lecture 1: Intrinsic complexity of Black-Box Optimization
Yurii Nesterov, CORE/INMA (UCL)
February 24, 2012

Outline
1. Basic NP-hard problem
2. NP-hardness of some popular problems
3. Lower complexity bounds for Global Minimization
4. Nonsmooth Convex Minimization. Subgradient scheme.
5. Smooth Convex Minimization. Lower complexity bounds
6. Methods for Smooth Minimization with Simple Constraints

Standard Complexity Classes
Let the data be coded in a matrix $A$, and let $n$ be the dimension of the problem.
Combinatorial Optimization
- NP-hard problems: $2^n$ operations. Solvable in $O(p(n)\,\|A\|)$.
- Fully polynomial approximation schemes: $O\left(\frac{p(n)}{\epsilon^k} \ln^\alpha \|A\|\right)$.
- Polynomial-time problems: $O\left(p(n) \ln^\alpha \|A\|\right)$.
Continuous Optimization
- Sublinear complexity: $O\left(\frac{p(n)}{\epsilon^\alpha} \|A\|^\beta\right)$, $\alpha, \beta > 0$.
- Polynomial-time complexity: $O\left(p(n) \ln\left(\frac{1}{\epsilon} \|A\|\right)\right)$.

Basic NP-hard problem: Problem of stones
Given $n$ stones of integer weights $a_1, \dots, a_n$, decide if it is possible to divide them into two parts of equal weight.
Mathematical formulation
Find a Boolean solution $x_i = \pm 1$, $i = 1, \dots, n$, to a single linear equation $\sum_{i=1}^n a_i x_i = 0$.
Another variant: $\sum_{i=2}^n a_i x_i = a_1$.
NB: Solvable in $O\left(\ln n \cdot \sum_{i=1}^n |a_i|\right)$ by FFT.
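To illustrate the pseudo-polynomial solvability mentioned in the NB, here is a minimal sketch that decides the problem of stones with a standard subset-sum dynamic programme over a bitmask. It is not the FFT-based scheme referred to on the slide; it is a simpler substitute whose cost also grows with $\sum_i |a_i|$ rather than with $2^n$. The function name `can_split` and the sample weights are hypothetical.

```python
def can_split(weights):
    """Return True if the stones can be divided into two parts of equal weight.

    Assumes nonnegative integer weights. Bit s of `reachable` is set when
    some subset of the stones processed so far has total weight s.
    """
    total = sum(weights)
    if total % 2:                      # an odd total can never be halved
        return False
    reachable = 1                      # only the empty subset (weight 0)
    for a in weights:
        reachable |= reachable << a    # either skip stone a or add it
    return bool((reachable >> (total // 2)) & 1)


print(can_split([3, 1, 1, 2, 2, 1]))   # a split exists: {3, 2} vs {1, 1, 2, 1}
print(can_split([2, 2, 3]))            # odd total weight, so no split
```

The running time depends on the magnitude of the weights, not only on their number, which is why this does not contradict NP-hardness: the weights are normally encoded in binary.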

Immediate consequence: quartic polynomial
Theorem: Minimization of a quartic polynomial of $n$ variables is NP-hard.
Proof: Consider the following function:
$$f(x) = \sum_{i=1}^n x_i^4 - \frac{1}{n}\left(\sum_{i=1}^n x_i^2\right)^2 + \left(\sum_{i=1}^n a_i x_i\right)^4 + (1 - x_1)^4.$$
The first part is $\langle A[x]^2, [x]^2 \rangle$, where $A = I - \frac{1}{n} e_n e_n^T \succeq 0$ with $A e_n = 0$, and $[x]^2_i = x_i^2$, $i = 1, \dots, n$.
Thus, $f(x) = 0$ iff all $x_i^2 = \tau$ for a common $\tau$, $\sum_{i=1}^n a_i x_i = 0$, and $x_1 = 1$. The last condition gives $\tau = 1$, so every $x_i = \pm 1$ and $x$ solves the problem of stones.
Corollary: Minimization of convex quartic polynomial over the unit sphere is NP-hard.
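The reduction can be checked numerically on a tiny instance. The sketch below evaluates the quartic polynomial from the proof and lists, by brute force, the sign vectors with $x_1 = 1$ at which it vanishes. The helper names `quartic` and `zero_sign_vectors` and the sample weights are hypothetical, and the enumeration is exponential in $n$, so this is only an illustration.

```python
from itertools import product


def quartic(x, a):
    """f(x) = sum x_i^4 - (1/n)(sum x_i^2)^2 + (sum a_i x_i)^4 + (1 - x_1)^4."""
    n = len(x)
    s4 = sum(xi ** 4 for xi in x)
    s2 = sum(xi ** 2 for xi in x)
    lin = sum(ai * xi for ai, xi in zip(a, x))
    return s4 - s2 ** 2 / n + lin ** 4 + (1 - x[0]) ** 4


def zero_sign_vectors(a):
    """All x in {+1, -1}^n with x_1 = 1 and f(x) = 0 (brute force)."""
    n = len(a)
    candidates = ((1,) + rest for rest in product((1, -1), repeat=n - 1))
    return [x for x in candidates if quartic(x, a) == 0]


a = [3, 1, 1, 2, 2, 1]
for x in zero_sign_vectors(a):
    # every zero of f on sign vectors is a balanced split of the stones
    assert sum(ai * xi for ai, xi in zip(a, x)) == 0
print(len(zero_sign_vectors(a)) > 0)
```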

Nonlinear Optimal Control: NP-hard
Problem: $\min_u \{ f(x(1)) : x' = g(x, u),\ 0 \le t \le 1,\ x(0) = x_0 \}$.
Consider $g(x, u) = \frac{1}{n}\, x \cdot \langle x, u \rangle - u$.
Lemma. Let $\|x_0\|^2 = n$. Then $\|x(t)\|^2 = n$, $0 \le t \le 1$.
Proof. Consider $\tilde g(x, u) = \left(\frac{x x^T}{\|x\|^2} - I\right) u$ and let $x' = \tilde g(x, u)$. Then
$$\langle x', x \rangle = \left\langle \left(\frac{x x^T}{\|x\|^2} - I\right) u, x \right\rangle = 0.$$
Thus, $\|x(t)\|^2 = \|x_0\|^2$. The same is true for $x(t)$ defined by $g$, since $g$ and $\tilde g$ coincide whenever $\|x\|^2 = n$.
Note: We have enough degrees of freedom to put $x(1)$ at any position on the sphere.
Hence, our problem is: $\min \{ f(y) : \|y\|^2 = n \}$.

Descent direction of nonsmooth nonconvex function
Consider $\phi(x) = \left(1 - \frac{1}{\gamma}\right) \max_{1 \le i \le n} |x_i| - \min_{1 \le i \le n} |x_i| + |\langle a, x \rangle|$, where $a \in Z^n_+$ and $\gamma \stackrel{\mathrm{def}}{=} \sum_{i=1}^n a_i \ge 1$. Clearly, $\phi(0) = 0$.
Lemma. It is NP-hard to decide if $\phi(x) < 0$ for some $x \in R^n$.
Proof:
1. Assume that $\sigma \in R^n$ with $\sigma_i = \pm 1$ satisfies $\langle a, \sigma \rangle = 0$. Then $\phi(\sigma) = -\frac{1}{\gamma} < 0$.
2. Assume $\phi(x) < 0$ and $\max_{1 \le i \le n} |x_i| = 1$ (no loss of generality, since $\phi$ is positively homogeneous). Denote $\delta = |\langle a, x \rangle|$. Then $|x_i| > 1 - \frac{1}{\gamma} + \delta$, $i = 1, \dots, n$. Denoting $\sigma_i = \mathrm{sign}\, x_i$, we have $\sigma_i x_i > 1 - \frac{1}{\gamma} + \delta$. Therefore, $|\sigma_i - x_i| = 1 - \sigma_i x_i < \frac{1}{\gamma} - \delta$, and we conclude that
$$|\langle a, \sigma \rangle| \le |\langle a, x \rangle| + |\langle a, \sigma - x \rangle| \le \delta + \gamma \max_{1 \le i \le n} |\sigma_i - x_i| < (1 - \gamma)\delta + 1 \le 1.$$
Since $a \in Z^n$, this is possible iff $\langle a, \sigma \rangle = 0$.
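Step 1 of the proof is easy to confirm on a small instance. The sketch below evaluates $\phi$ at a balanced sign vector and at an unbalanced one. The function name `phi` and the sample data are hypothetical.

```python
def phi(x, a):
    """phi(x) = (1 - 1/gamma) * max|x_i| - min|x_i| + |<a, x>|, gamma = sum(a)."""
    gamma = sum(a)
    abs_x = [abs(xi) for xi in x]
    inner = sum(ai * xi for ai, xi in zip(a, x))
    return (1 - 1 / gamma) * max(abs_x) - min(abs_x) + abs(inner)


a = [3, 1, 1, 2, 2, 1]                 # gamma = 10
sigma = [1, -1, -1, 1, -1, -1]         # balanced: <a, sigma> = 0
print(phi(sigma, a))                   # equals -1/gamma up to rounding
print(phi([1, 1, 1, 1, 1, 1], a))      # unbalanced sign vector: positive value
```

For any sign vector $\sigma$ the value is $-\frac{1}{\gamma} + |\langle a, \sigma \rangle|$, so it is negative exactly when the split is balanced.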

Black-box optimization
Oracle: Special unit for computing function value and derivatives at test points. (0-1-2 order.)
Analytic complexity: Number of calls of the oracle which is necessary (sufficient) for solving any problem from the class. (Lower/Upper complexity bounds.)
Solution: $\epsilon$-approximation of the minimum.
Resisting oracle: creates the worst problem instance for a particular method.
- Starts from an "empty" problem.
- Answers must be compatible with the description of the problem class.
- The bad problem is created after the method stops.

Bounds for Global Minimization
Problem: $f^* = \min_x \{ f(x) : x \in B_n \}$, $B_n = \{ x \in R^n : 0 \le x \le e_n \}$.
Problem Class: $|f(x) - f(y)| \le L \|x - y\|_\infty$ $\forall x, y \in B_n$.
Oracle: $f(x)$ (zero order).
Goal: Find $\bar x \in B_n$: $f(\bar x) - f^* \le \epsilon$.
Theorem: $N(\epsilon) \ge \left(\frac{L}{2\epsilon}\right)^n$.
Proof. Divide $B_n$ into $p^n$ $l_\infty$-balls of radius $\frac{1}{2p}$. Resisting oracle: at each test point reply $f(x) = 0$. Assume $N < p^n$. Then there exists a ball with no questions. Hence, we can take $f^* = -\frac{L}{2p}$, and the accuracy reached by the method satisfies $\epsilon \ge \frac{L}{2p}$.
Corollary: Uniform Grid method is worst-case optimal.
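The Uniform Grid method named in the corollary can be sketched as follows, assuming the grid consists of the centres of the $p^n$ cells used in the proof. Every point of $B_n$ is then within $l_\infty$-distance $\frac{1}{2p}$ of a test point, so the best value found is within $\frac{L}{2p}$ of $f^*$. The function name `uniform_grid_min` and the test function are hypothetical.

```python
from itertools import product


def uniform_grid_min(f, n, p):
    """Evaluate f at the centres of the p**n cells of the unit box B_n.

    Returns (best_point, best_value). For an l_inf-Lipschitz f with
    constant L, best_value is within L / (2 * p) of the minimum over B_n.
    """
    centres_1d = [(2 * i + 1) / (2 * p) for i in range(p)]
    best_x, best_val = None, float("inf")
    for x in product(centres_1d, repeat=n):   # p**n oracle calls
        val = f(x)
        if val < best_val:
            best_x, best_val = x, val
    return best_x, best_val


# Hypothetical test function: l_inf distance to a point c, so L = 1 and f* = 0.
c = (0.3, 0.8)
f = lambda x: max(abs(xi - ci) for xi, ci in zip(x, c))
x_bar, val = uniform_grid_min(f, n=2, p=10)
print(x_bar, val)
```

The number of oracle calls is $p^n$, which matches the lower bound up to the choice of $p \approx \frac{L}{2\epsilon}$.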

Nonsmooth Convex Minimization (NCM)
Problem: $f^* = \min_x \{ f(x) : x \in Q \}$, where
- $Q \subseteq R^n$ is a convex set: $x, y \in Q \Rightarrow [x, y] \subseteq Q$. It is simple.
- $f(x)$ is a sub-differentiable convex function: $f(y) \ge f(x) + \langle f'(x), y - x \rangle$, $x, y \in Q$, for certain subgradient $f'(x) \in R^n$.
Oracle: $f(x)$, $f'(x)$ (first order).
Solution: $\epsilon$-approximation in function value.
Main inequality: $\langle f'(x), x - x^* \rangle \ge f(x) - f^* \ge 0$, $\forall x \in Q$.
NB: Anti-subgradient decreases the distance to the optimum.

NCM: Lower Complexity Bounds
Let $Q \equiv \{ \|x\| \le 2R \}$ and $x_{k+1} \in x_0 + \mathrm{Lin}\{ f'(x_0), \dots, f'(x_k) \}$.
Consider the function $f_m(x) = L \max_{1 \le i \le m} x_i + \frac{\mu}{2} \|x\|^2$ with $\mu = \frac{L}{R m^{1/2}}$.
From the problem $\min_\tau \left( L\tau + \frac{\mu m}{2} \tau^2 \right)$, we get
$$\tau_* = -\frac{L}{\mu m} = -\frac{R}{m^{1/2}}, \quad f_m^* = -\frac{L^2}{2\mu m} = -\frac{LR}{2 m^{1/2}}, \quad \|x^*\|^2 = m \tau_*^2 = R^2.$$
NB: If $x_0 = 0$, then after $k$ iterations we can keep $x_i = 0$ for $i > k$.
Lipschitz continuity: $f_{k+1}(x_k) - f_{k+1}^* \ge -f_{k+1}^* = \frac{LR}{2 (k+1)^{1/2}}$.
Strong convexity: $f_{k+1}(x_k) - f_{k+1}^* \ge -f_{k+1}^* = \frac{L^2}{2 (k+1) \cdot \mu}$.
Both lower bounds are exact!

Subgradient Method
Problem: $\min_{x \in Q} \{ f(x) : g(x) \le 0 \}$, where $Q$ is a closed convex set, and convex $f, g \in C^{0,0}_L(Q)$.
Method: If $g(x_k) > h \|g'(x_k)\|$, then a) $x_{k+1} = \pi_Q\left( x_k - \frac{g(x_k)}{\|g'(x_k)\|^2} g'(x_k) \right)$, else b) $x_{k+1} = \pi_Q\left( x_k - \frac{h}{\|f'(x_k)\|} f'(x_k) \right)$.
Denote $f_N^* = \min_{0 \le k \le N} \{ f(x_k) : k \in \text{b)} \}$. Let $N = N_a + N_b$.
Theorem: If $N > \frac{1}{h^2} \|x_0 - x^*\|^2$, then $f_N^* - f^* \le hL$. ($h = \frac{\epsilon}{L}$.)
Proof: Denote $r_k = \|x_k - x^*\|$.
a): $r_{k+1}^2 - r_k^2 \le -\frac{2 g(x_k)}{\|g'(x_k)\|^2} \langle g'(x_k), x_k - x^* \rangle + \frac{g^2(x_k)}{\|g'(x_k)\|^2} \le -\frac{g^2(x_k)}{\|g'(x_k)\|^2} \le -h^2$.
b): $r_{k+1}^2 - r_k^2 \le -\frac{2h}{\|f'(x_k)\|} \langle f'(x_k), x_k - x^* \rangle + h^2 \le -\frac{2h}{L} (f(x_k) - f^*) + h^2$.
Thus, $N_b \frac{2h}{L} (f_N^* - f^*) \le r_0^2 + h^2 (N_b - N_a) = r_0^2 + h^2 (2 N_b - N)$.
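The switching rule of the method can be sketched in a few lines. The code below is a minimal illustration under stated assumptions, not a reference implementation: the function name `switching_subgradient`, its argument names, and the test problem ($f(x) = |x_1| + |x_2|$, $g(x) = 1 - x_1$, $Q = R^2$, so $x^* = (1, 0)$ and $f^* = 1$) are all hypothetical.

```python
import math


def switching_subgradient(f, f_sub, g, g_sub, project, x0, h, n_iters):
    """Subgradient scheme for min { f(x) : g(x) <= 0, x in Q }.

    f_sub and g_sub return one subgradient each; project is the Euclidean
    projection onto Q. Returns (best_x, best_f) over the steps of type b).
    """
    x = list(x0)
    best_x, best_f = None, math.inf
    for _ in range(n_iters + 1):               # test points x_0, ..., x_N
        gx, dg = g(x), g_sub(x)
        norm_dg = math.sqrt(sum(d * d for d in dg))
        if gx > h * norm_dg:
            if norm_dg == 0:
                raise ValueError("g has a positive minimum: infeasible")
            step = gx / norm_dg ** 2            # a) step on the constraint
            x = project([xi - step * di for xi, di in zip(x, dg)])
        else:
            fx, df = f(x), f_sub(x)
            if fx < best_f:
                best_x, best_f = list(x), fx
            norm_df = math.sqrt(sum(d * d for d in df))
            if norm_df == 0:                    # x already minimizes f
                break
            x = project([xi - h * di / norm_df for xi, di in zip(x, df)])
    return best_x, best_f


sign = lambda t: (t > 0) - (t < 0)
f = lambda x: abs(x[0]) + abs(x[1])
f_sub = lambda x: [sign(x[0]), sign(x[1])]
g = lambda x: 1 - x[0]
g_sub = lambda x: [-1.0, 0.0]

L, eps = math.sqrt(2), 0.1                      # Lipschitz constant, accuracy
h = eps / L
r0_sq = (3.0 - 1.0) ** 2 + (2.0 - 0.0) ** 2     # ||x_0 - x*||^2
n_iters = math.floor(r0_sq / h ** 2) + 1        # N > r_0^2 / h^2
best_x, best_f = switching_subgradient(
    f, f_sub, g, g_sub, lambda y: y, [3.0, 2.0], h, n_iters)
print(best_x, best_f)
```

With these choices the theorem gives $f_N^* - f^* \le hL = \epsilon$, and every recorded point satisfies $g(x_k) \le h \|g'(x_k)\|$.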

Smooth Convex Minimization (SCM)
Lipschitz-continuous gradient: $\|f'(x) - f'(y)\| \le L \|x - y\|$.
Geometric interpretation: for all $x, y \in \mathrm{dom}\, f$ we have
$$0 \le f(y) - f(x) - \langle f'(x), y - x \rangle = \int_0^1 \langle f'(x + \tau(y - x)) - f'(x), y - x \rangle \, d\tau \le \frac{L}{2} \|x - y\|^2.$$
Sufficient condition: $0 \preceq f''(x) \preceq L \cdot I_n$, $x \in \mathrm{dom}\, f$.
Equivalent definition: $f(y) \ge f(x) + \langle f'(x), y - x \rangle + \frac{1}{2L} \|f'(x) - f'(y)\|^2$.
Hint: Prove first that $f(x) - f^* \ge \frac{1}{2L} \|f'(x)\|^2$.
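The hint can be worked out in two short steps. The derivation below is a sketch that assumes $\mathrm{dom}\, f = R^n$, so that the gradient step stays in the domain; the slide does not state this assumption. The auxiliary function $\phi$ is introduced here for illustration only.

```latex
% Step 1: apply the upper bound of the geometric interpretation
% at the point y = x - (1/L) f'(x).
f^* \le f\bigl(x - \tfrac{1}{L} f'(x)\bigr)
    \le f(x) - \tfrac{1}{L}\|f'(x)\|^2 + \tfrac{L}{2}\cdot\tfrac{1}{L^2}\|f'(x)\|^2
    = f(x) - \tfrac{1}{2L}\|f'(x)\|^2 .

% Step 2: for fixed x, the function
%   \phi(y) = f(y) - \langle f'(x), y \rangle
% is convex, has an L-Lipschitz gradient \phi'(y) = f'(y) - f'(x),
% and attains its minimum at y = x because \phi'(x) = 0.
% Step 1 applied to \phi gives
\phi(y) - \phi(x) \ge \tfrac{1}{2L}\|\phi'(y)\|^2
                  = \tfrac{1}{2L}\|f'(y) - f'(x)\|^2 ,
% and \phi(y) - \phi(x) = f(y) - f(x) - \langle f'(x), y - x \rangle,
% which is the equivalent definition.
```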

SCM: Lower complexity bounds
Consider the family of functions ($k \le n$):
$$f_k(x) = \frac{1}{2} \left[ x_1^2 + \sum_{i=1}^{k-1} (x_i - x_{i+1})^2 + x_k^2 \right] - x_1 \equiv \frac{1}{2} \langle A_k x, x \rangle - x_1.$$
Let $R^n_k = \{ x \in R^n : x_i = 0,\ i > k \}$. Then $f_{k+p}(x) = f_k(x)$, $x \in R^n_k$.
Clearly, $0 \le \langle A_k h, h \rangle \le h_1^2 + \sum_{i=1}^{k-1} 2 (h_i^2 + h_{i+1}^2) + h_k^2 \le 4 \|h\|^2$, where
$$A_k = \begin{pmatrix} \begin{matrix} 2 & -1 & & & 0 \\ -1 & 2 & -1 & & \\ & \ddots & \ddots & \ddots & \\ & & -1 & 2 & -1 \\ 0 & & & -1 & 2 \end{matrix} & 0_{k, n-k} \\ 0_{n-k, k} & 0_{n-k, n-k} \end{pmatrix},$$
and the upper-left tridiagonal block has $k$ lines.
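The three facts stated on this slide (the quadratic-form identity, the nesting $f_{k+p} = f_k$ on $R^n_k$, and the bound $0 \le \langle A_k h, h \rangle \le 4\|h\|^2$) can be spot-checked numerically. The sketch below does so on random vectors; the helper names `f_k` and `quad_form`, the dimensions, and the random seed are hypothetical choices.

```python
import random


def f_k(x, k):
    """f_k(x) = 1/2 [x_1^2 + sum_{i<k} (x_i - x_{i+1})^2 + x_k^2] - x_1."""
    diffs = sum((x[i] - x[i + 1]) ** 2 for i in range(k - 1))
    return 0.5 * (x[0] ** 2 + diffs + x[k - 1] ** 2) - x[0]


def quad_form(h, k):
    """<A_k h, h> for the tridiagonal (-1, 2, -1) block of size k."""
    total = 0.0
    for i in range(k):
        row = 2 * h[i]                 # diagonal entry
        if i > 0:
            row -= h[i - 1]            # sub-diagonal entry
        if i < k - 1:
            row -= h[i + 1]            # super-diagonal entry
        total += row * h[i]
    return total


random.seed(0)
n, k, p = 8, 5, 2
h = [random.uniform(-1, 1) for _ in range(n)]
x = [random.uniform(-1, 1) for _ in range(k)] + [0.0] * (n - k)   # x in R^n_k

print(abs(f_k(h, k) - (0.5 * quad_form(h, k) - h[0])) < 1e-12)   # identity
print(abs(f_k(x, k + p) - f_k(x, k)) < 1e-12)                    # nesting
print(0 <= quad_form(h, k) <= 4 * sum(t * t for t in h))         # bound
```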
