1. Constrained Optimization in ℝ: Recap

2. Global Extrema on Closed Intervals
Recall the extreme value theorem: a continuous function f on a closed bounded interval [a, b] attains its maximum at some c ∈ [a, b] and its minimum at some d ∈ [a, b]. A consequence is that if either of c or d lies in (a, b), then it is a critical number of f; otherwise each of c and d must lie on one of the boundaries of [a, b]. This gives us a procedure for finding the maximum and minimum of a continuous function f on a closed bounded interval I.
Procedure [Finding extreme values on closed, bounded intervals]:
1. Find the critical points in int(I).
2. Compute the values of f at the critical points and at the endpoints of the interval.
3. Select the least and greatest of the computed values.

3. Global Extrema on Closed Intervals (contd)
To compute the maximum and minimum values of f(x) = 4x³ − 8x² + 5x on the interval [0, 1]:
▶ We first compute f′(x) = 12x² − 16x + 5, which is 0 at x = 1/2 and x = 5/6.
▶ The values at the critical points are f(1/2) = 1 and f(5/6) = 25/27.
▶ The values at the endpoints are f(0) = 0 and f(1) = 1.
▶ Therefore, the minimum value is f(0) = 0 and the maximum value is f(1) = f(1/2) = 1.
In this context, it is relevant to discuss the one-sided derivatives of a function at the endpoints of the closed interval on which it is defined.
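A minimal NumPy sketch of the three-step procedure, applied to this example (using np.poly1d is just one convenient way to obtain the critical points):

```python
import numpy as np

# Worked example from the slide: f(x) = 4x^3 - 8x^2 + 5x on [0, 1].
f = np.poly1d([4, -8, 5, 0])
a, b = 0.0, 1.0

# Step 1: critical points = real roots of f'(x) = 12x^2 - 16x + 5 in (a, b).
crit = [r.real for r in f.deriv().roots if abs(r.imag) < 1e-12 and a < r.real < b]

# Steps 2-3: evaluate f at the critical points and the endpoints; pick min/max.
candidates = crit + [a, b]
values = {x: f(x) for x in candidates}
print(sorted(crit))                  # [0.5, 0.8333...]  i.e. x = 1/2 and x = 5/6
print(min(values, key=values.get))   # 0.0 -> minimum f(0) = 0
print(max(values, key=values.get))   # 0.5 -> maximum f(1/2) = 1 (tied with f(1))
```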

4. Global Extrema on Closed Intervals (contd)
Definition [One-sided derivatives at endpoints]: Let f be defined on a closed bounded interval [a, b]. The (right-sided) derivative of f at x = a is defined as
f′(a) = lim_{h→0⁺} [f(a + h) − f(a)] / h
Similarly, the (left-sided) derivative of f at x = b is defined as
f′(b) = lim_{h→0⁻} [f(b + h) − f(b)] / h
Essentially, each of the one-sided derivatives defines a one-sided slope at the corresponding endpoint.
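A quick numeric illustration of these definitions (the choice f(x) = √x on [0, 1] is an assumed example, not from the slides; its right-sided derivative at 0 is +∞):

```python
import math

def right_derivative(f, a, h=1e-8):
    # f'(a+) ~ (f(a + h) - f(a)) / h for a small h > 0
    return (f(a + h) - f(a)) / h

def left_derivative(f, b, h=1e-8):
    # f'(b-) ~ (f(b - h) - f(b)) / (-h), i.e. h -> 0 from the left
    return (f(b - h) - f(b)) / (-h)

f = math.sqrt                          # f(x) = sqrt(x) on [0, 1]
print(right_derivative(f, 0.0))        # ~1e4, growing as h -> 0+: f'(0+) = +inf
print(left_derivative(f, 1.0))         # ~0.5 = f'(1-)
```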

5. Global Extrema on Closed Intervals (contd)
Based on these definitions, the following result can be derived.
Claim: If f is continuous on [a, b] and f′(a) exists as a real number or as ±∞, then we have the following necessary conditions for an extremum at a:
If f(a) is the maximum value of f on [a, b], then f′(a) ≤ 0 or f′(a) = −∞.
If f(a) is the minimum value of f on [a, b], then f′(a) ≥ 0 or f′(a) = ∞.
If f is continuous on [a, b] and f′(b) exists as a real number or as ±∞, then we have the following necessary conditions for an extremum at b:
If f(b) is the maximum value of f on [a, b], then f′(b) ≥ 0 or f′(b) = ∞.
If f(b) is the minimum value of f on [a, b], then f′(b) ≤ 0 or f′(b) = −∞.

6. Global Extrema on Closed Intervals (contd)
The following result gives a useful procedure for finding extrema on closed intervals.
Claim: Suppose f is continuous on [a, b] and f″(x) exists for all x ∈ (a, b). Then:
If f″(x) ≤ 0 ∀x ∈ (a, b), then the minimum value of f on [a, b] is either f(a) or f(b). If, in addition, f has a critical point c ∈ (a, b), then f(c) is the maximum value of f on [a, b].
If f″(x) ≥ 0 ∀x ∈ (a, b), then the maximum value of f on [a, b] is either f(a) or f(b). If, in addition, f has a critical point c ∈ (a, b), then f(c) is the minimum value of f on [a, b].
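A small numeric sanity check of the concave case (the choice f(x) = sin x on [0, π] is an illustrative assumption, not from the slides):

```python
import numpy as np

# f(x) = sin x on [0, pi]: f''(x) = -sin x <= 0 on (0, pi), so by the claim the
# minimum sits at an endpoint, and the critical point c = pi/2 gives the maximum.
xs = np.linspace(0.0, np.pi, 100001)
fx = np.sin(xs)
print(xs[np.argmax(fx)], fx.max())     # ~1.5708 (= pi/2), 1.0: interior maximum
print(xs[np.argmin(fx)], fx.min())     # 0.0 (an endpoint), 0.0: boundary minimum
```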

7. Global Extrema on Open Intervals
The next result is very useful for finding extrema on open intervals.
Claim: Let I be an open interval and let f″(x) exist ∀x ∈ I.
If f″(x) ≥ 0 ∀x ∈ I, and if there is a number c ∈ I where f′(c) = 0, then f(c) is the global minimum value of f on I.
If f″(x) ≤ 0 ∀x ∈ I, and if there is a number c ∈ I where f′(c) = 0, then f(c) is the global maximum value of f on I.
For example, let f(x) = (2/3)x − sec x and I = (−π/2, π/2). Then f′(x) = 2/3 − sec x tan x = 2/3 − sin x / cos²x = 0 ⇒ x = π/6. Further, f″(x) = −sec x (tan²x + sec²x) < 0 on (−π/2, π/2). Therefore, f attains the maximum value f(π/6) = π/9 − 2/√3 on I.
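A numeric check of this example (the grid search below is used only to confirm the calculus):

```python
import numpy as np

# f(x) = (2/3)x - sec x on I = (-pi/2, pi/2); sec x = 1/cos x.
f = lambda x: (2.0 / 3.0) * x - 1.0 / np.cos(x)
xs = np.linspace(-np.pi / 2 + 1e-4, np.pi / 2 - 1e-4, 200001)
print(xs[np.argmax(f(xs))])                      # ~0.5236 = pi/6
print(f(np.pi / 6), np.pi / 9 - 2 / np.sqrt(3))  # both ~ -0.8056
```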

8. Global Extrema on Open Intervals (contd)
As another example, let us find the dimensions of the cone with minimum volume that can contain a sphere with radius R. Let h be the height of the cone and r the radius of its base. The objective to be minimized is the volume f(r, h) = (1/3)πr²h. The constraint between r and h is shown in Figure 10: the triangle AEF is similar to the triangle ADB, and therefore (h − R)/R = √(h² + r²)/r.
Figure 10: A sphere of radius R inscribed in a cone with height h and base radius r.
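A SymPy sketch of the rest of the derivation (the closed form for r² below is derived from the similar-triangle constraint by squaring and rearranging; it is not quoted from the slides):

```python
import sympy as sp

R, h = sp.symbols('R h', positive=True)

# From (h - R)/R = sqrt(h**2 + r**2)/r, squaring and rearranging gives
# r**2 = R**2 * h / (h - 2*R), valid for h > 2R.
r2 = R**2 * h / (h - 2 * R)

# Substitute into the volume f(r, h) = (1/3)*pi*r**2*h and minimize over h.
V = sp.pi / 3 * r2 * h
h_star = sp.solve(sp.diff(V, h), h)      # h = 0 is excluded since h > 0
print(h_star)                            # [4*R]
print(sp.simplify(r2.subs(h, 4 * R)))    # 2*R**2, i.e. r = sqrt(2)*R
print(sp.simplify(V.subs(h, 4 * R)))     # 8*pi*R**3/3, the minimum volume
```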

9. Constrained Optimization and Subgradient Descent

10. Constrained Optimization
Consider the objective
min f(x) s.t. g_i(x) ≤ 0, ∀i
Recall the indicator function for g_i(x):
I_{g_i}(x) = 0 if g_i(x) ≤ 0, and ∞ otherwise.
▶ We have shown that this is convex if each g_i(x) is convex.
Option 1: Use subgradient descent to minimize f(x) + Σ_i I_{g_i}(x).
Option 2: Barrier method (approximate I_{g_i}(x) using some differentiable and non-decreasing function such as −(1/t) log(−u)), augmented Lagrangian, ADMM, etc.
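A toy sketch of Option 2's log-barrier idea (the 1-D problem, the damped Newton inner solver, and the schedule for t are all illustrative assumptions, not from the slides):

```python
# Toy 1-D sketch of the barrier idea: minimize f(x) = (x - 2)**2 subject to
# g(x) = x - 1 <= 0, replacing I_g(x) by the log barrier -(1/t)*log(-(x - 1)).
# As t grows, the barrier minimizers approach the constrained optimum x* = 1.
x = 0.0                                        # strictly feasible start, g(x) < 0
for t in [1.0, 10.0, 100.0, 1000.0]:
    for _ in range(50):                        # damped Newton on the barrier objective
        grad = 2 * (x - 2) + (1.0 / t) / (1.0 - x)
        hess = 2 + (1.0 / t) / (1.0 - x) ** 2
        step = grad / hess
        while x - step >= 1.0:                 # damping: stay in the domain x < 1
            step /= 2
        x -= step
    print(t, x)                                # x -> 1 from inside as t increases
```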

11. Option 1: (Sub)Gradient Descent with Sum of Indicators
Convert our objective to the following unconstrained optimization problem. Each C_i = {x | g_i(x) ≤ 0} is convex if g_i(x) is convex. We take
min_x F(x) = min_x f(x) + Σ_i I_{C_i}(x)
Recap a subgradient of F: h_F(x) = h_f(x) + Σ_i h_{I_{C_i}}(x). Recall that
▶ h_f(x) = ∇f(x) if f(x) is differentiable. Also, −∇f(x^k) optimizes the first-order approximation for f(x) around x^k:
−∇f(x^k) = argmin_h f(x^k) + ∇f(x^k)^T h + (1/2)||h||²
Variations on the form of (1/2)||h||² (e.g., replacing it with an entropic regularizer) lead to mirror descent, etc.
▶ h_{I_{C_i}}(x) is any d ∈ ℝⁿ s.t. d^T x ≥ d^T y, ∀y ∈ C_i. Also, h_{I_{C_i}}(x) = 0 if x is in the interior of C_i, and there are other solutions if x is on the boundary. Analysis for convex g_i's leads to the KKT conditions, dual ascent, etc.
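When each C_i is simple enough to project onto, this machinery is typically realized as projected (sub)gradient descent. A toy sketch of that practical instance (the quadratic f, the box constraint, and the step size are illustrative assumptions, not from the slides):

```python
import numpy as np

# Toy instance: minimize f(x) = ||x - x0||**2 over C = {x : x <= 1 componentwise},
# i.e. g_i(x) = x_i - 1 <= 0. Projection onto this C is componentwise clipping.
x0 = np.array([3.0, 0.5])
x = np.zeros(2)
for _ in range(200):
    grad = 2 * (x - x0)                  # h_f(x) = grad f(x), since f is smooth
    x = np.minimum(x - 0.1 * grad, 1.0)  # gradient step, then project onto C
print(x)                                 # [1.0, 0.5]: clipped where the constraint
                                         # is active, unconstrained optimum otherwise
```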

12. Option 1: Generalized Gradient Descent
Consider the problem of minimizing the following sum of a differentiable function f(x) and a (possibly) nondifferentiable function c(x) (an example being Σ_i I_{C_i}(x)):
min_x F(x) = min_x f(x) + c(x)
As in gradient descent, consider the first-order approximation for f(x) around x^k, leaving c(x) alone, to obtain the next iterate x^{k+1}:
x^{k+1} = argmin_x f(x^k) + ∇f(x^k)^T (x − x^k) + (1/(2t)) ||x − x^k||² + c(x)
Deleting f(x^k) from the objective and adding (t/2) ||∇f(x^k)||² to it (without any loss, since both are constant in x) to complete the square, we obtain x^{k+1} as:
x^{k+1} = argmin_x (1/(2t)) ||x − (x^k − t∇f(x^k))||² + c(x)
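For the classic instance c(x) = λ||x||₁, this argmin has a closed form, componentwise soft-thresholding, yielding the ISTA iteration. A minimal sketch (the random A, b and the value of λ are illustrative assumptions, not from the slides):

```python
import numpy as np

# F(x) = f(x) + c(x) with f(x) = 0.5*||A x - b||**2 and c(x) = lam*||x||_1.
# x^{k+1} = prox_{t c}(x^k - t*grad f(x^k)); for the l1 norm the prox is
# soft-thresholding, and the resulting iteration is ISTA.
rng = np.random.default_rng(0)
A, b, lam = rng.normal(size=(20, 5)), rng.normal(size=20), 0.5

def soft_threshold(v, s):
    return np.sign(v) * np.maximum(np.abs(v) - s, 0.0)

t = 1.0 / np.linalg.norm(A.T @ A, 2)   # step size 1/L, L = Lipschitz const of grad f
x = np.zeros(5)
for _ in range(500):
    grad = A.T @ (A @ x - b)           # grad f(x)
    x = soft_threshold(x - t * grad, t * lam)
print(x)                               # a (typically sparse) minimizer of F
```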
