
Introduction to Global Optimization — Fabio Schoen, 2008 — http://gol.dsi.unifi.it/users/schoen

Global Optimization Problems: min f(x), x ∈ S ⊆ R^n. What is meant by global optimization? Of course ...


  1. Let x̄ be the best known solution and let D(x̄) = { x ∈ Ω : c^T x ≤ c^T x̄ }. If D(x̄) ⊆ C then x̄ is optimal. Check: a polytope P (with known vertices) is built which contains D(x̄). If all vertices of P are in C ⇒ optimal solution. Otherwise let v be the best feasible vertex; the intersection of the segment [0, v] with ∂C (if feasible) is an improving point x. Otherwise a cut is introduced in P which is tangent to Ω at x.

  2. D(x̄) = { x ∈ Ω : c^T x ≤ c^T x̄ }. [Figure: the sets C and Ω, the level set c^T x = 0, the point x̄ and the region D(x̄) in the plane.]

  3. Initialization. Given a feasible solution x̄, take a polytope P such that P ⊇ D(x̄), i.e. such that y feasible with c^T y ≤ c^T x̄ ⇒ y ∈ P. If P ⊂ C, i.e. if y ∈ P ⇒ h(y) ≤ 0, then x̄ is optimal. Checking is easy if we know the vertices of P.

  4. P: D(x̄) ⊆ P, with vertices V_1, ..., V_k. V⋆ := arg max_j h(V_j). [Figure: the polytope P containing D(x̄), with the vertex V⋆ marked.]

  5. Step 1. Let V⋆ be the vertex with the largest h() value. Surely h(V⋆) > 0 (otherwise we stop with an optimal solution). Moreover h(0) < 0 (0 is in the interior of C). Thus the line from V⋆ to 0 must intersect the boundary of C. Let x_k be the intersection point. It might be feasible (⇒ improving) or not.

  6. x_k = ∂C ∩ [V⋆, 0]. [Figure: the intersection point x_k on the segment from V⋆ to the origin.]

  7. If x_k ∈ Ω, set x̄ := x_k. [Figure: the improved solution x̄ = x_k.]

  8. Otherwise, if x_k ∉ Ω, the polytope is divided. [Figure: subdivision of the polytope P.]

  9. Otherwise, if x_k ∉ Ω, the polytope is divided. [Figure: subdivision of the polytope P, continued.]

  10. Duality for d.c. problems. min_{x ∈ S} g(x) − h(x), where g and h are convex. Let h⋆(u) := sup { u^T x − h(x) : x ∈ R^n } and g⋆(u) := sup { u^T x − g(x) : x ∈ R^n } be the conjugate functions of h and g. The problem inf { h⋆(u) − g⋆(u) : u such that h⋆(u) < +∞ } is the Fenchel–Rockafellar dual. If min g(x) − h(x) admits an optimum, then the Fenchel dual is a strong dual.

  11. If x⋆ ∈ arg min g(x) − h(x), then u⋆ ∈ ∂h(x⋆) (∂ denotes the subdifferential) is dual optimal; and if u⋆ ∈ arg min h⋆(u) − g⋆(u), then x⋆ ∈ ∂g⋆(u⋆) is an optimal primal solution.

  12. A primal/dual algorithm. P_k: min_x g(x) − ( h(x_k) + (x − x_k)^T y_k ) and D_k: min_y h⋆(y) − ( g⋆(y_{k−1}) + x_k^T (y − y_{k−1}) ).

  13. Exact Global Optimization

  14. GlobOpt – relaxations. Consider the global optimization problem (P): min f(x), x ∈ X, and assume the minimum exists and is finite and that we can use a relaxation (R): min g(y), y ∈ Y. Usually both X and Y are subsets of the same space R^n. Recall: (R) is a relaxation of (P) iff X ⊆ Y and g(x) ≤ f(x) for all x ∈ X.

  15. Branch and Bound. 1. Solve the relaxation (R) and let L be its (global) optimum value (assume it is attained). 2. (Heuristically) solve the original problem (P) (or, more generally, find a “good” feasible solution to (P) in X). Let U be the best feasible function value known. 3. If U − L ≤ ε then stop: U is a certified ε-optimum for (P). 4. Otherwise split X and Y into two parts and apply the same method to each of them.
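A minimal sketch of this loop on box domains may help fix ideas; `lower_bound` and `feasible_value` are illustrative callbacks (not from the slides): the first solves the relaxation on a box, the second returns the value of some feasible point found heuristically.

```python
def branch_and_bound(lower_bound, feasible_value, box, eps=1e-6):
    """Certified eps-optimum by branch and bound over nested boxes (a sketch)."""
    best_u = float("inf")
    stack = [box]                                  # boxes still to be examined
    while stack:
        b = stack.pop()
        L = lower_bound(b)                         # step 1: relaxation value on this box
        best_u = min(best_u, feasible_value(b))    # step 2: heuristic upper bound
        if best_u - L <= eps:                      # step 3: this box is fathomed
            continue
        stack.extend(split(b))                     # step 4: split and recurse
    return best_u

def split(box):
    """Bisect a box (pair of lower/upper corner tuples) along its widest side."""
    lo, hi = box
    i = max(range(len(lo)), key=lambda j: hi[j] - lo[j])
    mid = 0.5 * (lo[i] + hi[i])
    lo2, hi1 = list(lo), list(hi)
    lo2[i], hi1[i] = mid, mid
    return (lo, tuple(hi1)), (tuple(lo2), hi)
```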

  16. Tools. “Good relaxations”: easy yet accurate. Good upper bounding, i.e., good heuristics for (P). Good relaxations can be obtained, e.g., through: convex relaxations; domain reduction.

  17. Convex relaxations. Assume X is convex and Y = X. If g is the convex envelope of f on X, then solving the convex relaxation (R) in one step gives the certified global optimum for (P). g(x) is a convex under-estimator of f on X if: g(x) is convex; g(x) ≤ f(x) ∀ x ∈ X. g is the convex envelope of f on X if: g is a convex under-estimator of f; g(x) ≥ h(x) ∀ x ∈ X, for every convex under-estimator h of f.

  18. A 1-D example. [Figure.]

  19. Convex under-estimator. [Figure.]

  20. Branching. [Figure.]

  21. Bounding. [Figure: upper bound, lower bounds of the subproblems, fathomed regions.]

  22. Relaxation of the feasible domain. Let min_{x ∈ S} f(x) be a GlobOpt problem where f is convex while S is non-convex. A relaxation (outer approximation) is obtained by replacing S with a larger set Q. If Q is convex ⇒ convex optimization problem. If the optimal solution to min_{x ∈ Q} f(x) belongs to S ⇒ it is an optimal solution to the original problem.

  23. Example: min −x − 2y, x ∈ [0, 5], y ∈ [0, 3], xy ≤ 3. [Figure: the box with the region cut out by the non-convex constraint xy ≤ 3.]

  24. Relaxation. min −x − 2y, x ∈ [0, 5], y ∈ [0, 3], xy ≤ 3. We know that (x + y)² = x² + y² + 2xy, thus xy = ((x + y)² − x² − y²)/2 and, as x and y are non-negative, x² ≤ 5x and y² ≤ 3y; thus a (convex) relaxation of xy ≤ 3 is (x + y)² − 5x − 3y ≤ 6.

  25. Relaxation. [Figure: the convex relaxed feasible region.] Optimal solution of the relaxed convex problem: (2, 3) (value: −8).

  26. Stronger relaxation. min −x − 2y, x ∈ [0, 5], y ∈ [0, 3], xy ≤ 3. From the bounds, (5 − x)(3 − y) ≥ 0 ⇒ 15 − 3x − 5y + xy ≥ 0 ⇒ xy ≥ 3x + 5y − 15. Thus a (convex) relaxation of xy ≤ 3 is 3x + 5y − 15 ≤ 3, i.e. 3x + 5y ≤ 18.

  27. Relaxation. [Figure: the linear relaxation of the feasible region.] The optimal solution of the convex (linear) relaxation is (1, 3), which is feasible ⇒ optimal for the original problem.
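This last relaxation is a plain LP and can be checked numerically; a small sketch, assuming SciPy is available:

```python
from scipy.optimize import linprog

# min -x - 2y   s.t.  3x + 5y <= 18,  x in [0, 5], y in [0, 3]
res = linprog(c=[-1, -2], A_ub=[[3, 5]], b_ub=[18],
              bounds=[(0, 5), (0, 3)])
x, y = res.x
print(x, y, res.fun)   # expected: x ~ 1, y ~ 3, objective ~ -7
# (1, 3) also satisfies xy <= 3, so it is optimal for the original problem.
```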

  28. Convex (concave) envelopes. How do we build convex envelopes of a function, or relax a non-convex constraint? Convex envelopes ⇒ lower bounds. Convex envelopes of −f(x) ⇒ upper bounds. Constraint g(x) ≤ 0: if h(x) is a convex underestimator of g, then h(x) ≤ 0 is a convex relaxation. Constraint g(x) ≥ 0: if h(x) is concave and h(x) ≥ g(x), then h(x) ≥ 0 is a “convex” constraint.

  29. Convex envelopes. Definition: a function is polyhedral if it is the pointwise maximum of a finite number of linear functions. (NB: in general, the convex envelope is the pointwise supremum of affine minorants.) The generating set X of a function f over a convex set P is X = { x ∈ R^n : (x, f(x)) is a vertex of epi(conv_P(f)) }. I.e., given f we first build its convex envelope on P and then take its epigraph { (x, y) : x ∈ P, y ≥ conv_P f(x) }. This is a convex set whose extreme points can be denoted by V; X consists of the x-coordinates of V.

  30. Generating sets. [Figure: examples of generating sets (marked points).]

  31. [Figure: generating sets, continued.]

  32. Characterization. Let f(x) be continuously differentiable on a polytope P. The convex envelope of f on P is polyhedral if and only if X(f) = Vert(P) (the generating set is the vertex set of P). Corollary: let f_1, ..., f_m ∈ C¹(P) and Σ_i f_i(x) possess polyhedral convex envelopes on P. Then Σ_i Conv(f_i(x)) = Conv(Σ_i f_i(x)) iff the generating set of Σ_i Conv(f_i(x)) is Vert(P).

  33. Characterization. If f(x) is such that Conv f(x) is polyhedral, then an affine function h(x) such that: 1. h(x) ≤ f(x) for all x ∈ Vert(P); 2. there exist n + 1 affinely independent vertices of P, V_1, ..., V_{n+1}, such that f(V_i) = h(V_i), i = 1, ..., n + 1; belongs to the polyhedral description of Conv f(x), and h(x) = Conv f(x) for any x ∈ Conv(V_1, ..., V_{n+1}).

  34. Characterization. The condition may be reversed: given m affine functions h_1, ..., h_m such that, for each of them, 1. h_j(x) ≤ f(x) for all x ∈ Vert(P); 2. there exist n + 1 affinely independent vertices of P, V_1, ..., V_{n+1}, such that f(V_i) = h_j(V_i), i = 1, ..., n + 1; then the function ψ(x) = max_j h_j(x) is the (polyhedral) convex envelope of f iff the generating set of ψ is Vert(P) and for every vertex V_i we have ψ(V_i) = f(V_i).

  35. Sufficient condition. If f(x) is lower semi-continuous on P and for all x ∉ Vert(P) there exists a line ℓ_x such that x is in the interior of P ∩ ℓ_x and f is concave in a neighborhood of x on ℓ_x, then Conv f(x) is polyhedral. Application: let f(x) = Σ_{i,j} α_{ij} x_i x_j. The sufficient condition holds for f on [0, 1]^n ⇒ bilinear forms are polyhedral on a hypercube.

  36. Application: a bilinear term (Al-Khayyal and Falk, 1983). Let x ∈ [ℓ_x, u_x], y ∈ [ℓ_y, u_y]. Then the convex envelope of xy on [ℓ_x, u_x] × [ℓ_y, u_y] is φ(x, y) = max { ℓ_y x + ℓ_x y − ℓ_x ℓ_y ; u_y x + u_x y − u_x u_y }. In fact φ(x, y) is an under-estimate of xy: (x − ℓ_x)(y − ℓ_y) ≥ 0 ⇒ xy ≥ ℓ_y x + ℓ_x y − ℓ_x ℓ_y, and analogously for xy ≥ u_y x + u_x y − u_x u_y.

  37. Bilinear terms. xy ≥ φ(x, y) = max { ℓ_y x + ℓ_x y − ℓ_x ℓ_y ; u_y x + u_x y − u_x u_y }. No other (polyhedral) function underestimating xy is tighter. In fact ℓ_y x + ℓ_x y − ℓ_x ℓ_y belongs to the convex envelope: it underestimates xy and coincides with xy at 3 vertices ((ℓ_x, ℓ_y), (ℓ_x, u_y), (u_x, ℓ_y)). Analogously for the other affine function. All vertices are interpolated by these 2 underestimating hyperplanes ⇒ they form the convex envelope of xy.
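A tiny sketch of this envelope (the function φ above), with a check that it interpolates xy at the box vertices:

```python
def bilinear_envelope(x, y, lx, ux, ly, uy):
    """Convex envelope of x*y on [lx, ux] x [ly, uy] (Al-Khayyal/Falk):
    the pointwise max of the two under-estimating hyperplanes."""
    return max(ly * x + lx * y - lx * ly,
               uy * x + ux * y - ux * uy)

# The envelope coincides with x*y at all four vertices of the box:
lx, ux, ly, uy = 0.0, 5.0, 0.0, 3.0
for vx in (lx, ux):
    for vy in (ly, uy):
        assert abs(bilinear_envelope(vx, vy, lx, ux, ly, uy) - vx * vy) < 1e-12
```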

  38. All easy then? Of course not! Many things can go wrong... It is true that, on the hypercube, a bilinear form Σ_{i<j} α_{ij} x_i x_j is polyhedral (easy to see), but we cannot guarantee in general that the generating set of the envelope is the set of vertices of the hypercube (in particular, if the α's have opposite signs). If the set is not a hypercube, even a bilinear term might be non-polyhedral: e.g. xy on the triangle { 0 ≤ x ≤ y ≤ 1 }. Finding the (polyhedral) convex envelope of a bilinear form on a generic polytope P is NP-hard!

  39. Fractional terms. A convex underestimate of a fractional term x/y over a box can be obtained through:
w ≥ ℓ_x/y + x/u_y − ℓ_x/u_y   if ℓ_x ≥ 0
w ≥ x/u_y − ℓ_x y/(ℓ_y u_y) + ℓ_x/ℓ_y   if ℓ_x < 0
w ≥ u_x/y + x/ℓ_y − u_x/ℓ_y   if ℓ_x ≥ 0
w ≥ x/ℓ_y − u_x y/(ℓ_y u_y) + u_x/u_y   if ℓ_x < 0
(a better underestimate exists).

  40. Univariate concave terms. If f(x), x ∈ [ℓ_x, u_x], is concave, then the convex envelope is simply its linear interpolation at the extremes of the interval: f(ℓ_x) + ((f(u_x) − f(ℓ_x)) / (u_x − ℓ_x)) (x − ℓ_x).
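In code the secant envelope of a concave univariate term is a one-liner; a small sketch:

```python
import math

def secant_envelope(f, lx, ux):
    """Convex envelope of a concave univariate f on [lx, ux]:
    the secant line through (lx, f(lx)) and (ux, f(ux))."""
    flx, fux = f(lx), f(ux)
    return lambda x: flx + (fux - flx) / (ux - lx) * (x - lx)

env = secant_envelope(math.sqrt, 0.0, 4.0)   # sqrt is concave on [0, 4]
print(env(1.0), math.sqrt(1.0))              # 0.5 <= 1.0; equality holds at the endpoints
```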

  41. Underestimating a general nonconvex function. Let f(x) ∈ C² be a general non-convex function. Then a convex underestimate on a box can be defined as φ(x) = f(x) − Σ_{i=1}^n α_i (x_i − ℓ_i)(u_i − x_i), where the α_i > 0 are parameters. The Hessian of φ is ∇²φ(x) = ∇²f(x) + 2 diag(α); φ is convex iff ∇²φ(x) is positive semi-definite on the box.
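A minimal sketch of this underestimator, assuming the α_i are already known (they must be large enough to convexify f, as discussed next):

```python
import numpy as np

def alpha_underestimator(f, alpha, lo, up):
    """phi(x) = f(x) - sum_i alpha_i * (x_i - lo_i) * (up_i - x_i):
    coincides with f at the vertices of the box and lies below f inside it."""
    alpha, lo, up = (np.asarray(v, dtype=float) for v in (alpha, lo, up))

    def phi(x):
        x = np.asarray(x, dtype=float)
        return f(x) - np.sum(alpha * (x - lo) * (up - x))

    return phi
```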

  42. How to choose the α_i's? One possibility: a uniform choice α_i = α. In this case convexity of φ is obtained iff α ≥ max { 0, −(1/2) min_{x ∈ [ℓ,u]} λ_min(x) }, where λ_min(x) is the minimum eigenvalue of ∇²f(x).

  43. Key properties. φ(x) ≤ f(x); φ interpolates f at all vertices of [ℓ, u]; φ is convex. Maximum separation: max ( f(x) − φ(x) ) = (1/4) Σ_i α_i (u_i − ℓ_i)². Thus the error in underestimation decreases when the box is split.

  44. Estimation of α. Compute an interval Hessian [H]: [H(x)]_ij = [h^L_ij, h^U_ij] on [ℓ, u]. Find α such that [H] + 2 diag(α) ⪰ 0. Gerschgorin theorem for real matrices: λ_min ≥ min_i ( h_ii − Σ_{j≠i} |h_ij| ). Extension to interval matrices: λ_min ≥ min_i ( h^L_ii − Σ_{j≠i} max { |h^L_ij|, |h^U_ij| } (u_j − ℓ_j)/(u_i − ℓ_i) ).
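A small sketch of the scaled interval-Gerschgorin bound above, returning a uniform α; the interval Hessian bounds HL, HU are assumed to be supplied (e.g. by interval arithmetic):

```python
import numpy as np

def gerschgorin_alpha(HL, HU, lo, up):
    """Uniform alpha from the scaled interval-Gerschgorin lower bound on the
    minimum eigenvalue of the interval Hessian [HL, HU] over the box [lo, up]."""
    HL, HU = np.asarray(HL, float), np.asarray(HU, float)
    lo, up = np.asarray(lo, float), np.asarray(up, float)
    w = up - lo                                    # box widths
    n = len(lo)
    lam = np.inf
    for i in range(n):
        off = sum(max(abs(HL[i, j]), abs(HU[i, j])) * w[j] / w[i]
                  for j in range(n) if j != i)
        lam = min(lam, HL[i, i] - off)
    return max(0.0, -0.5 * lam)                    # alpha >= max{0, -lambda_min / 2}
```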

  45. Improvements. New relaxation functions (other than quadratic). Example: Φ(x; γ) = − Σ_{i=1}^n (1 − e^{γ_i (x_i − ℓ_i)}) (1 − e^{γ_i (u_i − x_i)}) gives a tighter underestimate than the quadratic function. Partitioning: partition the domain into a small number of regions (hyper-rectangles); evaluate a convex underestimator in each region; join the underestimators to form a single convex function on the whole domain.

  46. Domain (range) reduction. Techniques for cutting the feasible region without cutting the global optimum solution. Simplest approaches: feasibility-based and optimality-based range reduction (RR). Let the problem be min_{x ∈ S} f(x). Feasibility-based RR asks for solving ℓ_i = min { x_i : x ∈ S } and u_i = max { x_i : x ∈ S } for all i = 1, ..., n, and then adding the constraints x ∈ [ℓ, u] to the problem (or to the sub-problems generated during Branch & Bound).

  47. Feasibility-based RR. If S is a polyhedron, RR requires the solution of LPs: [ℓ_j, u_j] = min / max { x_j : A x ≤ b, x ∈ [L, U] }. “Poor man's” LP-based RR: from every constraint Σ_j a_ij x_j ≤ b_i in which a_ik > 0, x_k ≤ (1/a_ik) ( b_i − Σ_{j≠k} a_ij x_j ) ⇒ x_k ≤ (1/a_ik) ( b_i − Σ_{j≠k} min { a_ij L_j, a_ij U_j } ).
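A sketch of one sweep of this “poor man's” range reduction over all linear constraints; the symmetric update of the lower bound, used when a coefficient is negative, is added here as a natural extension (it is not spelled out on the slide):

```python
import numpy as np

def poor_mans_rr(A, b, L, U):
    """One constraint-propagation pass for A x <= b, x in [L, U]:
    each constraint tightens the bound of each variable appearing in it."""
    A, b = np.asarray(A, float), np.asarray(b, float)
    L, U = np.array(L, float), np.array(U, float)
    m, n = A.shape
    for i in range(m):
        for k in range(n):
            if A[i, k] == 0.0:
                continue
            # smallest possible contribution of the other variables on the box
            rest = sum(min(A[i, j] * L[j], A[i, j] * U[j])
                       for j in range(n) if j != k)
            bound = (b[i] - rest) / A[i, k]
            if A[i, k] > 0:
                U[k] = min(U[k], bound)    # the case stated on the slide
            else:
                L[k] = max(L[k], bound)    # symmetric case (assumption)
    return L, U
```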

  48. Optimality-based RR. Given an incumbent solution x̄ ∈ S, ranges are updated by solving the sequence ℓ_i = min { x_i : f̂(x) ≤ f(x̄), x ∈ S } and u_i = max { x_i : f̂(x) ≤ f(x̄), x ∈ S }, where f̂(x) is a convex underestimate of f on the current domain. RR can be applied iteratively (i.e., at the end of a complete RR sequence we may start a new one using the new bounds).

  49. Generalization. Let min_{x ∈ X} f(x) s.t. g(x) ≤ 0 (P) be a (non-convex) problem, and let min_{x ∈ X̄} f̂(x) s.t. ĝ(x) ≤ 0 (R) be a convex relaxation of (P), i.e. { x ∈ X : g(x) ≤ 0 } ⊆ { x ∈ X̄ : ĝ(x) ≤ 0 } and, for x ∈ X with g(x) ≤ 0, f̂(x) ≤ f(x).

  50. R.H.S. perturbation. Let φ(y) = min { f̂(x) : x ∈ X̄, ĝ(x) ≤ y } (R_y) be a perturbation of (R). (R) convex ⇒ (R_y) convex for any y. Let x̄ be an optimal solution of (R) and assume that the i-th constraint is active: ĝ_i(x̄) = 0. Then, if x̄_y is an optimal solution of (R_y) with y_i ≤ 0, the constraint ĝ_i(x) ≤ y_i is active at x̄_y.

  51. Duality. Assume (R) has a finite optimum at x̄ with value φ(0) and Lagrange multipliers µ. Then the hyperplane H(y) = φ(0) − µ^T y is a supporting hyperplane of the graph of φ(y) at y = 0, i.e. φ(y) ≥ φ(0) − µ^T y ∀ y ∈ R^m.

  52. Main result. If (R) is convex with optimum value φ(0), constraint i is active at the optimum and its Lagrange multiplier is µ_i > 0, then, if U is an upper bound for the original problem (P), the constraint g_i(x) ≥ −(U − L)/µ_i (where L = φ(0)) is valid for the original problem (P), i.e. it does not exclude any feasible solution with value better than U.

  53. Proof. Problem (R_y) can be seen as a convex relaxation of the perturbed non-convex problem Φ(y) = min { f(x) : x ∈ X, g(x) ≤ y }, and thus φ(y) ≤ Φ(y): underestimating (R_y) produces an underestimate of Φ(y). Let y := e_i y_i; from duality, L − µ^T e_i y_i ≤ φ(e_i y_i) ≤ Φ(e_i y_i). Now take any x feasible for (P) with value at most U and, if g_i(x) < 0, set y_i := g_i(x); constraint i is then active at x in the perturbed problem, so x is feasible for Φ(e_i y_i) and U is an upper bound also for Φ(e_i y_i). Thus L − µ_i y_i ≤ U, i.e. L − µ_i g_i(x) ≤ U, which is the claimed inequality.

  54. Applications. Range reduction: let x ∈ [ℓ, u] in the convex relaxed problem. If variable x_i is at its upper bound in the optimal solution, then we can deduce x_i ≥ max { ℓ_i, u_i − (U − L)/λ_i }, where λ_i is the optimal multiplier associated with the i-th upper bound constraint. Analogously for active lower bounds: x_i ≤ min { u_i, ℓ_i + (U − L)/λ_i }.
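A hedged sketch of how these two update rules could be applied after solving the relaxation; all names are illustrative (U is the incumbent value, L the relaxation value, the multipliers and activity flags come from the relaxed solution):

```python
def tighten_box(lo, up, U, L, mult_up, mult_lo, active_up, active_lo):
    """Duality-based range reduction from active bound constraints.
    mult_up / mult_lo: multipliers of the upper / lower bound constraints;
    active_up / active_lo: which bounds are active at the relaxed optimum."""
    new_lo, new_up = list(lo), list(up)
    gap = U - L
    for i in range(len(lo)):
        if active_up[i] and mult_up[i] > 0:
            new_lo[i] = max(lo[i], up[i] - gap / mult_up[i])
        if active_lo[i] and mult_lo[i] > 0:
            new_up[i] = min(up[i], lo[i] + gap / mult_lo[i])
    return new_lo, new_up
```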

  55. Let the constraint a_i^T x ≤ b_i be active in an optimal solution of the convex relaxation (R). Then we can deduce the valid inequality a_i^T x ≥ b_i − (U − L)/µ_i.

  56. Methods based on “merit functions”. Bayesian algorithms: the objective function is considered as a realization of a stochastic process f(x) = F(x; ω). A loss function is defined, e.g. L(x_1, ..., x_n; ω) = min_{i=1,...,n} F(x_i; ω) − min_x F(x; ω), and the next point to sample is placed so as to minimize the expected loss (or risk): x_{n+1} = arg min E( L(x_1, ..., x_n, x_{n+1}; ω) | x_1, ..., x_n ) = arg min E( min_{i=1,...,n+1} F(x_i; ω) − min_x F(x; ω) | x_1, ..., x_n ).

  57. Radial basis method. Given k observations (x_1, f_1), ..., (x_k, f_k), an interpolant is built: s(x) = Σ_{i=1}^k λ_i Φ(‖x − x_i‖) + p(x), where p is a polynomial of a (prefixed) small degree m and Φ is a radial function like, e.g.:
Φ(r) = r (linear)
Φ(r) = r³ (cubic)
Φ(r) = r² log r (thin plate spline)
Φ(r) = e^{−γ r²} (Gaussian)
The polynomial p is necessary to guarantee the existence of a unique interpolant (i.e. when the matrix { Φ_ij = Φ(‖x_i − x_j‖) } is singular).
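A minimal sketch of such an interpolant with a cubic radial function and a linear polynomial tail (degree m = 1), solving the standard augmented linear system; no attention is paid here to conditioning or to the choice of Φ:

```python
import numpy as np

def rbf_interpolant(X, f, phi=lambda r: r ** 3):
    """Radial basis interpolant s(x) = sum_i lam_i * phi(||x - x_i||) + c0 + c^T x,
    fitted to the points (x_i, f_i). X: (k, n) array, f: (k,) array."""
    X, f = np.asarray(X, float), np.asarray(f, float)
    k, n = X.shape
    Phi = phi(np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2))
    P = np.hstack([np.ones((k, 1)), X])            # basis of the polynomial tail
    A = np.block([[Phi, P], [P.T, np.zeros((n + 1, n + 1))]])
    coef = np.linalg.solve(A, np.concatenate([f, np.zeros(n + 1)]))
    lam, c = coef[:k], coef[k:]

    def s(x):
        r = np.linalg.norm(np.asarray(x, float) - X, axis=1)
        return phi(r) @ lam + c[0] + c[1:] @ np.asarray(x, float)

    return s
```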

  58. “Bumpiness”. Let f⋆_k be an estimate of the value of the global optimum after k observations. Let s^y_k be the (unique) interpolant of the data points (x_i, f_i), i = 1, ..., k, plus the point (y, f⋆_k). Idea: the most likely location of y is the one for which the resulting interpolant has minimum “bumpiness”. Bumpiness measure: σ(s_k) = (−1)^{m+1} Σ_i λ_i s^y_k(x_i).

  59. TO BE DONE

  60. Stochastic methods. Pure Random Search: random uniform sampling over the feasible region. Best start: like Pure Random Search, but a local search is started from the best observation. Multistart: local searches are started from randomly generated starting points.
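A short sketch of Multistart on a box, assuming SciPy's local optimizer as the local search:

```python
import numpy as np
from scipy.optimize import minimize

def multistart(f, lo, up, n_starts=50, seed=0):
    """Multistart: run a local search from uniformly sampled starting points
    in the box [lo, up] and keep the best local optimum found."""
    rng = np.random.default_rng(seed)
    lo, up = np.asarray(lo, float), np.asarray(up, float)
    best_x, best_f = None, np.inf
    for _ in range(n_starts):
        x0 = rng.uniform(lo, up)                       # random starting point
        res = minimize(f, x0, bounds=list(zip(lo, up)))  # local search
        if res.fun < best_f:
            best_x, best_f = res.x, res.fun
    return best_x, best_f
```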

  61.–62. [Figures: uniformly sampled points and the local searches started from them, shown in the plane.]

  63. Clustering methods. Given a uniform sample, evaluate the objective function. Sample transformation (or concentration): either a fraction of the “worst” points is discarded, or a few steps of a gradient method are performed. The remaining points are clustered; from the best point in each cluster a single local search is started.

  64. Uniform sample. [Figure: a uniform sample of points over the feasible box, with contour levels of the objective.]

  65. Sample concentration. [Figure: the sample after the concentration step.]

  66. Clustering. [Figure: the clustered sample, with the best point of each cluster highlighted.]

  67. Local optimization. [Figure: local searches started from the best point of each cluster.]

  68. Clustering: MLSL. Sampling proceeds in batches of N points. Given sample points X_1, ..., X_k ∈ [0, 1]^n, label X_j as “clustered” iff ∃ Y ∈ { X_1, ..., X_k } such that ‖X_j − Y‖ ≤ ∆_k := (1/√π) ( σ Γ(1 + n/2) (log k)/k )^{1/n} and f(Y) ≤ f(X_j).
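A sketch of this labelling rule; the critical distance ∆_k is coded as reconstructed above (in the MLSL literature it also carries the Lebesgue measure of the feasible set, which is 1 for the unit cube assumed here):

```python
import math
import numpy as np

def mlsl_clustered(X, fvals, sigma=4.0):
    """Mark each point as 'clustered' per the MLSL rule: X_j is clustered iff some
    other sampled point within the critical distance has a better (or equal) value.
    X: (k, n) array of points in [0, 1]^n; fvals: their objective values."""
    X, fvals = np.asarray(X, float), np.asarray(fvals, float)
    k, n = X.shape
    delta_k = (sigma * math.gamma(1 + n / 2) * math.log(k) / k) ** (1.0 / n) / math.sqrt(math.pi)
    clustered = np.zeros(k, dtype=bool)
    for j in range(k):
        d = np.linalg.norm(X - X[j], axis=1)
        near_better = (d <= delta_k) & (fvals <= fvals[j])
        near_better[j] = False                 # exclude the point itself
        clustered[j] = near_better.any()
    return clustered
```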

  69. Simple Linkage. A sequential sample is generated (batches consist of a single observation). A local search is started only from the last sampled point (i.e. there is no “recall”), unless there exists a sufficiently near sampled point with a better function value.

  70. Smoothing methods. Given f: R^n → R, the Gaussian transform is defined as ⟨f⟩_λ(x) = (1 / (π^{n/2} λ^n)) ∫_{R^n} f(y) exp( −‖y − x‖² / λ² ) dy. When λ is sufficiently large, ⟨f⟩_λ is convex. Idea: starting with a large enough λ, minimize the smoothed function and slowly decrease λ towards 0.
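A minimal sketch of estimating the Gaussian transform at a point by Monte Carlo; it uses the fact that the kernel above is the density of a normal distribution with mean x and variance λ²/2 per coordinate:

```python
import numpy as np

def gaussian_transform(f, x, lam, n_samples=10_000, seed=0):
    """Monte Carlo estimate of <f>_lambda(x): the kernel
    exp(-||y - x||^2 / lambda^2) / (pi^(n/2) lambda^n) is the density of
    N(x, (lambda^2 / 2) I), so the transform is E[f(Y)] under that normal."""
    rng = np.random.default_rng(seed)
    x = np.asarray(x, float)
    Y = x + (lam / np.sqrt(2.0)) * rng.standard_normal((n_samples, x.size))
    return float(np.mean([f(y) for y in Y]))
```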

  71. Smoothing methods. [Figure: surface plot of the objective and of its Gaussian-smoothed version.]

  72.–75. [Figures: surface plots of the Gaussian-smoothed landscape for different values of λ.]

  76. Transformed function landscape. Elementary idea: local optimization smooths out many “high frequency” oscillations.
