Nonlinear Programming Models




  1. Nonlinear Programming Models. Fabio Schoen, 2008. http://gol.dsi.unifi.it/users/schoen

  2. Introduction

  3. NLP problems: min f(x), x ∈ S ⊆ Rⁿ. Standard form:
      min f(x)
      s.t. hᵢ(x) = 0, i = 1, …, m
           gⱼ(x) ≤ 0, j = 1, …, k
      Here S = {x ∈ Rⁿ : hᵢ(x) = 0 ∀i, gⱼ(x) ≤ 0 ∀j}.
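As an aside (not from the slides), a problem in this standard form can be handed directly to an off-the-shelf solver. A minimal sketch using SciPy's SLSQP method, with an illustrative objective and constraints:

```python
# Minimal sketch: solving a small NLP in the standard form
# min f(x) s.t. h(x) = 0, g(x) <= 0 with SciPy's SLSQP.
# The objective and constraints below are illustrative choices.
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] - 1)**2 + (x[1] - 2)**2   # objective
h = lambda x: x[0] + x[1] - 2                 # equality: h(x) = 0
g = lambda x: x[0]**2 - 1                     # inequality: g(x) <= 0

res = minimize(f, x0=np.zeros(2), method="SLSQP",
               constraints=[{"type": "eq", "fun": h},
                            # SciPy expects "ineq" as fun(x) >= 0,
                            # so g(x) <= 0 is passed as -g(x) >= 0:
                            {"type": "ineq", "fun": lambda x: -g(x)}])
print(res.x)  # constrained minimizer, here (0.5, 1.5)
```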

  4. Local and global optima. A global minimum (or global optimum) is any x⋆ ∈ S such that x ∈ S ⇒ f(x) ≥ f(x⋆). A point x̄ is a local optimum if ∃ε > 0 such that x ∈ S ∩ B(x̄, ε) ⇒ f(x) ≥ f(x̄), where B(x̄, ε) = {x ∈ Rⁿ : ‖x − x̄‖ ≤ ε} is a ball in Rⁿ. Any global optimum is also a local optimum, but the opposite is generally false.

  5. Convex Functions. A set S ⊆ Rⁿ is convex if x, y ∈ S ⇒ λx + (1 − λ)y ∈ S for all choices of λ ∈ [0, 1]. Let Ω ⊆ Rⁿ be a nonempty convex set. A function f : Ω → R is convex iff f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y) for all x, y ∈ Ω, λ ∈ [0, 1].
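A minimal numerical illustration (not from the slides) of this defining inequality, checked on a sampled grid of λ for an example convex function:

```python
# Minimal sketch: numerically checking the convexity inequality
# f(lx + (1-l)y) <= l f(x) + (1-l) f(y) for an example function.
import numpy as np

f = lambda x: np.sum(x**2)              # a convex function on R^n
rng = np.random.default_rng(0)
x, y = rng.standard_normal(3), rng.standard_normal(3)

for lam in np.linspace(0, 1, 11):
    assert f(lam*x + (1-lam)*y) <= lam*f(x) + (1-lam)*f(y) + 1e-12
print("convexity inequality holds at all sampled lambda")
```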

  6. Convex Functions. [figure: graph of a convex function; the chord joining (x, f(x)) and (y, f(y)) lies above the graph]

  7. Properties of convex functions. Every convex function is continuous in the interior of Ω; it may be discontinuous, but only on the boundary. If f is continuously differentiable, then it is convex iff f(y) ≥ f(x) + (y − x)ᵀ∇f(x) for all x, y ∈ Ω.

  8. Convex functions. [figure: graph of a convex function lying above its tangent line at x]

  9. If f is twice continuously differentiable, then f is convex iff its Hessian matrix ∇²f(x) := [∂²f/∂xᵢ∂xⱼ] is positive semi-definite: ∇²f(x) ⪰ 0, i.e. vᵀ∇²f(x)v ≥ 0 ∀v ∈ Rⁿ or, equivalently, all eigenvalues of ∇²f(x) are nonnegative.

  10. Example: an affine function is convex (and concave). For a quadratic function f(x) = ½xᵀQx + bᵀx + c (Q: symmetric matrix) we have ∇f(x) = Qx + b and ∇²f(x) = Q ⇒ f is convex iff Q ⪰ 0.
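A minimal sketch of this test: compute the eigenvalues of Q and check that they are all nonnegative. The matrix below is an arbitrary illustrative choice:

```python
# Minimal sketch: checking convexity of f(x) = 1/2 x^T Q x + b^T x + c
# by testing whether all eigenvalues of Q (the Hessian) are >= 0.
import numpy as np

Q = np.array([[ 2.0, -1.0],
              [-1.0,  2.0]])         # symmetric illustrative matrix
eigvals = np.linalg.eigvalsh(Q)      # eigenvalues of a symmetric matrix
print(eigvals)                       # [1. 3.] -> all nonnegative
print("f is convex:", np.all(eigvals >= 0))
```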

  11. Convex Optimization Problems. min f(x), x ∈ S is a convex optimization problem iff S is a convex set and f is convex on S. For a problem in standard form (min f(x) s.t. hᵢ(x) = 0, i = 1, …, m; gⱼ(x) ≤ 0, j = 1, …, k): if f is convex, the hᵢ(x) are affine functions, and the gⱼ(x) are convex functions, then the problem is convex.

  12. Maximization. With a slight abuse of notation, a problem max f(x), x ∈ S is called convex iff S is a convex set and f is a concave function. This is not to be confused with minimization of a concave function (or maximization of a convex function), which is NOT a convex optimization problem.

  13. Convex and non convex optimization. Convex optimization "is easy"; non convex optimization is usually very hard. Fundamental property of convex optimization problems: every local optimum is also a global optimum (a proof will be given later). Minimizing a positive semidefinite quadratic function over a polyhedron is easy (polynomially solvable); if even a single eigenvalue of the Hessian is negative, the problem becomes NP-hard.

  14. Convex functions: examples. Many (of course not all...) functions are convex!
      - affine functions aᵀx + b
      - quadratic functions ½xᵀQx + bᵀx + c with Q = Qᵀ, Q ⪰ 0
      - any norm is a convex function
      - x log x (however, log x is concave)
      - f is convex if and only if, for all x₀, d ∈ Rⁿ, its restriction to any line, φ(α) = f(x₀ + αd), is a convex function
      - a nonnegative linear combination of convex functions is convex
      - g(x, y) convex in x for all y ⇒ ∫ g(x, y) dy is convex

  15. more examples...
      - maxᵢ {aᵢᵀx + b} is convex
      - f, g convex ⇒ max{f(x), g(x)} is convex
      - fₐ a convex function for any a ∈ A (a possibly uncountable set) ⇒ sup_{a∈A} fₐ(x) is convex
      - f convex ⇒ f(Ax + b) is convex
      - let S ⊆ Rⁿ be any set ⇒ f(x) = sup_{s∈S} ‖x − s‖ is convex
      - Trace(AᵀX) = Σᵢⱼ AᵢⱼXᵢⱼ is convex (it is linear!)
      - log det X⁻¹ is convex over the set of matrices X ∈ Rⁿˣⁿ : X ≻ 0
      - λ_max(X) (the largest eigenvalue of a matrix X) is convex

  16. Data Approximation

  17. Table of contents: norm approximation; maximum likelihood; robust estimation.

  18. Norm approximation. Problem: min_x ‖Ax − b‖, where A, b are parameters. Usually the system is over-determined, i.e. b ∉ Range(A). For example, this happens when A ∈ R^{m×n} with m > n and A has full rank. r := Ax − b is the "residual".

  19. Examples:
      - ‖r‖ = √(rᵀr): least squares (or "regression")
      - ‖r‖ = √(rᵀPr) with P ≻ 0: weighted least squares
      - ‖r‖ = maxᵢ |rᵢ|: minimax, ℓ∞, or Chebyshev approximation
      - ‖r‖ = Σᵢ |rᵢ|: absolute or ℓ1 approximation
      Possible (convex) additional constraints:
      - maximum deviation from an initial estimate: ‖x − x_est‖ ≤ ε
      - simple bounds ℓᵢ ≤ xᵢ ≤ uᵢ
      - ordering: x₁ ≤ x₂ ≤ ⋯ ≤ xₙ
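A minimal sketch (assuming the cvxpy package is available) that solves the norm-approximation problem for three of the norms above on synthetic over-determined data and prints the resulting residual norms:

```python
# Minimal sketch (assumes cvxpy): solving min ||Ax - b|| for the
# l1, l2 and l-infinity norms on random over-determined data.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(0)
A, b = rng.standard_normal((100, 30)), rng.standard_normal(100)

for p in (1, 2, "inf"):
    x = cp.Variable(30)
    cp.Problem(cp.Minimize(cp.norm(A @ x - b, p))).solve()
    r = A @ x.value - b
    # report sum of |r_i|, ||r||_2 and max |r_i| for each fit
    print(p, np.round([np.abs(r).sum(),
                       np.linalg.norm(r),
                       np.abs(r).max()], 2))
```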

  20. Example: ℓ1 norm. Matrix A ∈ R^{100×30}. [figure: histogram of the ℓ1-optimal residuals over [−5, 5]]

  21. ℓ∞ norm. [figure: histogram of the ℓ∞-optimal residuals over [−5, 5]]

  22. ℓ2 norm. [figure: histogram of the ℓ2-optimal residuals over [−5, 5]]

  23. Variants: min Σᵢ h(yᵢ − aᵢᵀx), where h is a convex function:
      - linear-quadratic: h(z) = z² if |z| ≤ 1, 2|z| − 1 if |z| > 1
      - "dead zone": h(z) = 0 if |z| ≤ 1, |z| − 1 if |z| > 1
      - logarithmic barrier: h(z) = −log(1 − z²) if |z| < 1, ∞ if |z| ≥ 1
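A minimal sketch of these three penalties as NumPy functions of the residual z; the unit breakpoints match the definitions above:

```python
# Minimal sketch: the three penalty variants above, written as NumPy
# functions of the residual z (vectorized over arrays).
import numpy as np

def linear_quadratic(z):     # quadratic near 0, linear in the tails
    z = np.abs(z)
    return np.where(z <= 1, z**2, 2*z - 1)

def dead_zone(z):            # no penalty at all for |z| <= 1
    return np.maximum(np.abs(z) - 1, 0.0)

def log_barrier(z):          # finite only inside (-1, 1)
    with np.errstate(divide="ignore", invalid="ignore"):
        return np.where(np.abs(z) < 1, -np.log(1 - z**2), np.inf)
```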

  24. comparison. [figure: plots of the ℓ1, ℓ2, linear-quadratic, dead-zone and log-barrier penalty functions on [−2, 2]]

  25. Maximum likelihood. Given a sample X₁, X₂, …, X_k and a parametric family of probability density functions L(·; θ), the maximum likelihood estimate of θ given the sample is θ̂ = arg max_θ L(X₁, …, X_k; θ). Example: linear measurements with additive i.i.d. (independent, identically distributed) noise: Xᵢ = aᵢᵀθ + εᵢ (1), where the εᵢ are i.i.d. random variables with density p(·): L(X₁, …, X_k; θ) = ∏ᵢ₌₁ᵏ p(Xᵢ − aᵢᵀθ).

  26. Max likelihood estimate - MLE. Taking the logarithm (which does not change optimum points): θ̂ = arg max_θ Σᵢ log p(Xᵢ − aᵢᵀθ). If p is log-concave, this problem is convex. Examples: ε ∼ N(0, σ²), i.e. p(z) = (2πσ²)^{−1/2} exp(−z²/(2σ²)) ⇒ the MLE is the ℓ2 estimate: θ̂ = arg min_θ ‖Aθ − X‖₂; p(z) = (1/(2a)) exp(−|z|/a) ⇒ ℓ1 estimate: θ̂ = arg min_θ ‖Aθ − X‖₁.
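A minimal illustration (assuming cvxpy) of the double-exponential case: under Laplace noise the ℓ1 fit is the MLE, and on synthetic data it is typically closer to the true θ than the ℓ2 fit:

```python
# Minimal sketch (assumes cvxpy): with Laplace noise the l1 fit is
# the MLE; compare it with the ordinary l2 (least squares) fit.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(3)
A = rng.standard_normal((200, 5))
theta_true = rng.standard_normal(5)
X = A @ theta_true + rng.laplace(scale=0.5, size=200)

theta = cp.Variable(5)
cp.Problem(cp.Minimize(cp.norm(A @ theta - X, 1))).solve()
theta_l1 = theta.value                               # MLE under Laplace noise
theta_l2 = np.linalg.lstsq(A, X, rcond=None)[0]      # l2 estimate
print(np.linalg.norm(theta_l1 - theta_true),
      np.linalg.norm(theta_l2 - theta_true))         # l1 typically closer
```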

  27. p(z) = (1/a) exp(−z/a) 1{z ≥ 0} (negative exponential) ⇒ the estimate can be found by solving the LP problem: min 1ᵀ(X − Aθ) s.t. Aθ ≤ X. p uniform on [−a, a] ⇒ the MLE is any θ such that ‖Aθ − X‖∞ ≤ a.
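A minimal sketch of this LP with scipy.optimize.linprog: since 1ᵀX is constant, minimizing 1ᵀ(X − Aθ) amounts to minimizing −(1ᵀA)θ subject to Aθ ≤ X. The data below are synthetic:

```python
# Minimal sketch: the negative-exponential MLE as an LP via linprog.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(1)
A = rng.standard_normal((100, 5))
theta_true = rng.standard_normal(5)
X = A @ theta_true + rng.exponential(scale=0.5, size=100)  # eps_i >= 0

res = linprog(c=-(A.T @ np.ones(100)),        # min -(1^T A) theta
              A_ub=A, b_ub=X,                 # s.t. A theta <= X
              bounds=[(None, None)] * 5)      # theta is unrestricted
print(res.x)  # MLE of theta, close to theta_true
```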

  28. Ellipsoids. An ellipsoid is a subset of Rⁿ of the form E = {x ∈ Rⁿ : (x − x₀)ᵀP⁻¹(x − x₀) ≤ 1}, where x₀ ∈ Rⁿ is the center of the ellipsoid and P is a symmetric positive-definite matrix. Alternative representations: E = {x ∈ Rⁿ : ‖Ax − b‖₂ ≤ 1} where A ≻ 0, or E = {x ∈ Rⁿ : x = x₀ + Au, ‖u‖₂ ≤ 1} where A is square and non-singular (an affine transformation of the unit ball).
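A minimal sketch linking the first and third representations: taking A = P^{1/2} (the symmetric square root of P), a point x = x₀ + Au with ‖u‖ = 1 lands exactly on the boundary of E:

```python
# Minimal sketch: from the shape-matrix form to the affine-image form.
# With A = P^{1/2}, (x - x0)^T P^{-1} (x - x0) = u^T u for x = x0 + A u.
import numpy as np

P = np.array([[4.0, 1.0],
              [1.0, 2.0]])              # illustrative SPD matrix
x0 = np.array([1.0, -1.0])              # center

w, V = np.linalg.eigh(P)
A = V @ np.diag(np.sqrt(w)) @ V.T       # symmetric square root of P

u = np.array([0.6, -0.8])               # a point with ||u|| = 1
x = x0 + A @ u                          # lies on the boundary of E
print((x - x0) @ np.linalg.solve(P, x - x0))  # -> 1.0
```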

  29. Robust Least Squares. Least squares: x̂ = arg min_x √(Σᵢ (aᵢᵀx − bᵢ)²). Assumption: the aᵢ are not known exactly, but it is known that aᵢ ∈ Eᵢ = {āᵢ + Pᵢu : ‖u‖ ≤ 1}, where Pᵢ = Pᵢᵀ ⪰ 0. Definition: worst-case residual: max_{aᵢ∈Eᵢ} √(Σᵢ (aᵢᵀx − bᵢ)²). A robust estimate of x is the solution of x̂_r = arg min_x max_{aᵢ∈Eᵢ} √(Σᵢ (aᵢᵀx − bᵢ)²).

  30. RLS. It holds that |α + βᵀy| ≤ |α| + ‖β‖‖y‖. Choosing y⋆ = β/‖β‖ if α ≥ 0 and y⋆ = −β/‖β‖ if α < 0, we get ‖y⋆‖ = 1 and |α + βᵀy⋆| = |α + sign(α)‖β‖| = |α| + ‖β‖. Then: max_{aᵢ∈Eᵢ} |aᵢᵀx − bᵢ| = max_{‖u‖≤1} |āᵢᵀx − bᵢ + uᵀPᵢx| = |āᵢᵀx − bᵢ| + ‖Pᵢx‖.

  31. ... Thus the Robust Least Squares problem reduces to min_x (Σᵢ (|āᵢᵀx − bᵢ| + ‖Pᵢx‖)²)^{1/2} (a convex optimization problem). Transformation: min_{x,t} ‖t‖₂ s.t. |āᵢᵀx − bᵢ| + ‖Pᵢx‖ ≤ tᵢ ∀i, i.e. a problem with second-order cone constraints.
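A minimal sketch (assuming cvxpy) of this epigraph formulation, on synthetic data with illustrative shape matrices Pᵢ:

```python
# Minimal sketch (assumes cvxpy): robust least squares in epigraph
# form, min ||t||_2 s.t. |abar_i^T x - b_i| + ||P_i x|| <= t_i.
import numpy as np
import cvxpy as cp

rng = np.random.default_rng(2)
m, n = 20, 5
Abar = rng.standard_normal((m, n))          # nominal rows abar_i
b = rng.standard_normal(m)
Ps = [0.1 * np.eye(n) for _ in range(m)]    # illustrative shape matrices

x, t = cp.Variable(n), cp.Variable(m)
cons = [cp.abs(Abar[i] @ x - b[i]) + cp.norm(Ps[i] @ x, 2) <= t[i]
        for i in range(m)]
cp.Problem(cp.Minimize(cp.norm(t, 2)), cons).solve()
print(x.value)  # robust estimate; compare np.linalg.lstsq(Abar, b, rcond=None)[0]
```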
