Slide 1

Optimization and Simulation

Constrained optimization

Michel Bierlaire

Transport and Mobility Laboratory, School of Architecture, Civil and Environmental Engineering, École Polytechnique Fédérale de Lausanne

Slide 2

The problem

Outline

1. The problem
2. Duality
3. Feasible directions
4. Optimality conditions
   - Convex constraints
   - Lagrange multipliers: necessary conditions
   - Lagrange multipliers: sufficient conditions
5. Algorithms
   - Constrained Newton
   - Interior point methods
   - Augmented Lagrangian
   - Sequential quadratic programming

Slide 3

The problem

Optimization: the problem

min_{x ∈ R^n} f(x)

subject to

h(x) = 0
g(x) ≤ 0
x ∈ X ⊆ R^n

Modeling elements
1. Decision variables: x
2. Objective function: f: R^n → R (n > 0)
3. Constraints:
   - equality: h: R^n → R^m (m ≥ 0)
   - inequality: g: R^n → R^p (p ≥ 0)
   - X is a convex set
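For concreteness, here is a minimal sketch of this general form solved with scipy; the functions f, h and g below are illustrative toy choices, not from the slides. Note that scipy writes inequality constraints as fun(x) ≥ 0, so g(x) ≤ 0 is passed as −g.

```python
import numpy as np
from scipy.optimize import minimize

# Toy instance of min f(x) s.t. h(x) = 0, g(x) <= 0 (illustrative data).
f = lambda x: (x[0] - 1.0)**2 + (x[1] - 2.0)**2
h = lambda x: x[0] + x[1] - 1.0        # equality constraint h(x) = 0
g = lambda x: x[0] - 0.8               # inequality constraint g(x) <= 0

res = minimize(f, x0=np.zeros(2),
               constraints=[{"type": "eq", "fun": h},
                            {"type": "ineq", "fun": lambda x: -g(x)}])
print(res.x)   # -> approx (0, 1)
```

With constraints present and no method specified, scipy selects SLSQP, a sequential quadratic programming solver of the family discussed at the end of this deck.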

Slide 4

The problem

The problem

Assumptions
- x_i, i = 1, …, n, are continuous variables
- f, g and h are sufficiently differentiable
- Y = {x ∈ R^n | h(x) = 0, g(x) ≤ 0 and x ∈ X} is non-empty

Local minimum: x* ∈ Y is a local minimum of the above problem if there exists ε > 0 such that f(x*) ≤ f(x) for all x ∈ Y such that ‖x − x*‖ < ε.

Global minimum: x* ∈ Y is a global minimum of the above problem if f(x*) ≤ f(x) for all x ∈ Y.

Slide 5

Duality

Outline

1. The problem
2. Duality
3. Feasible directions
4. Optimality conditions
   - Convex constraints
   - Lagrange multipliers: necessary conditions
   - Lagrange multipliers: sufficient conditions
5. Algorithms
   - Constrained Newton
   - Interior point methods
   - Augmented Lagrangian
   - Sequential quadratic programming

Slide 6

Duality

Lagrangian

- Assume X = R^n in the above problem.
- Consider λ ∈ R^m and µ ∈ R^p.

Definition. The function L: R^{n+m+p} → R defined as

L(x, λ, µ) = f(x) + λ^T h(x) + µ^T g(x) = f(x) + Σ_{i=1}^m λ_i h_i(x) + Σ_{j=1}^p µ_j g_j(x)

is called the Lagrangian function.
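A small sketch of this definition on an illustrative toy problem (the functions below are assumptions for illustration, not from the slides):

```python
import numpy as np

# Toy problem: min x1^2 + x2^2  s.t.  h(x) = x1 + x2 - 1 = 0,  g(x) = -x1 <= 0.

def f(x):
    return x[0]**2 + x[1]**2

def h(x):
    return np.array([x[0] + x[1] - 1.0])   # h: R^2 -> R^1 (m = 1)

def g(x):
    return np.array([-x[0]])               # g: R^2 -> R^1 (p = 1)

def lagrangian(x, lam, mu):
    # L(x, lam, mu) = f(x) + lam^T h(x) + mu^T g(x)
    return f(x) + lam @ h(x) + mu @ g(x)

print(lagrangian(np.array([0.5, 0.5]), np.array([-1.0]), np.array([0.0])))  # 0.5
```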

Slide 7

Duality

Dual function

Dual function. The function q: R^{m+p} → R defined as

q(λ, µ) = min_{x ∈ R^n} L(x, λ, µ)

is called the dual function of the optimization problem.

Dual variables. The parameters λ and µ are called dual variables; x are called the primal variables.

Bound on the optimal solution. If x* is a global minimum of the optimization problem, then, for any λ ∈ R^m and any µ ∈ R^p with µ ≥ 0, we have q(λ, µ) ≤ f(x*).
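A numerical sketch of the dual function and of this bound, on a toy problem with one equality constraint (an illustrative choice, not from the slides):

```python
import numpy as np
from scipy.optimize import minimize

# Toy problem: min x1^2 + x2^2 s.t. x1 + x2 - 1 = 0.
# Its global minimum is x* = (0.5, 0.5) with f(x*) = 0.5.

def L(x, lam):
    return x[0]**2 + x[1]**2 + lam * (x[0] + x[1] - 1.0)

def q(lam):
    # q(lam) = min_x L(x, lam), computed with an unconstrained solver
    return minimize(lambda x: L(x, lam), np.zeros(2)).fun

print(q(1.0))    # -1.5 <= f(x*) = 0.5, as the bound predicts
print(q(-1.0))   #  0.5  = f(x*): the bound is attained at lam* = -1
```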

Slide 8

Duality

Dual problem

Constrain the dual function to be bounded. Let X_q ⊆ R^{m+p} be the domain of q, that is,

X_q = {(λ, µ) | q(λ, µ) > −∞}.

Dual problem.

max_{λ, µ} q(λ, µ)
subject to µ ≥ 0 and (λ, µ) ∈ X_q

is called the dual problem of the original problem, which is called the primal problem in this context.
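As a standard worked illustration of these definitions (a classical derivation, not taken from the slides), the dual of a linear program can be computed in closed form:

```latex
% Primal: \min_x c^\top x \text{ subject to } Ax - b = 0,\ -x \le 0.
% Lagrangian: L(x,\lambda,\mu) = c^\top x + \lambda^\top (Ax - b) - \mu^\top x.
q(\lambda,\mu) = \min_{x} \,(c + A^\top\lambda - \mu)^\top x - \lambda^\top b
 = \begin{cases} -\lambda^\top b & \text{if } \mu = c + A^\top\lambda, \\
                 -\infty & \text{otherwise.} \end{cases}
% Hence X_q = \{(\lambda,\mu) : \mu = c + A^\top\lambda\}, and imposing
% \mu \ge 0 gives the dual problem:
\max_{\lambda}\; -b^\top\lambda \quad \text{subject to} \quad A^\top\lambda + c \ge 0.
```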

Slide 9

Duality

Duality results

Weak duality theorem. Let x* be a global minimum of the primal problem, and (λ*, µ*) a global maximum of the dual problem. Then q(λ*, µ*) ≤ f(x*).

Convexity-concavity of the dual problem. The objective function of the dual problem is concave, and the feasible set of the dual problem is convex.

Slide 10

Feasible directions

Outline

1. The problem
2. Duality
3. Feasible directions
4. Optimality conditions
   - Convex constraints
   - Lagrange multipliers: necessary conditions
   - Lagrange multipliers: sufficient conditions
5. Algorithms
   - Constrained Newton
   - Interior point methods
   - Augmented Lagrangian
   - Sequential quadratic programming

Slide 11

Feasible directions

Feasible directions

Definitions
- x ∈ R^n is a feasible point if it verifies the constraints.
- Given x feasible, d is a feasible direction in x if there is η > 0 such that x + αd is feasible for any 0 ≤ α ≤ η.

Convex constraints. Let X ⊆ R^n be a convex set, and x, y ∈ X with x ≠ y. The direction d = y − x is feasible in x. Moreover, for each 0 ≤ α ≤ 1, αx + (1 − α)y is feasible.

Slide 12

Feasible directions

Feasible directions

Interior point. Let X ⊆ R^n, and let x be an interior point, that is, there exists ε > 0 such that ‖x − z‖ ≤ ε ⟹ z ∈ X. Then any direction d is feasible in x.

Slide 13

Feasible directions

Feasible sequence

Definition. Consider the generic optimization problem, and let x⁺ ∈ R^n be a feasible point. The sequence (x_k)_k is said to be feasible in x⁺ if
- lim_{k→∞} x_k = x⁺,
- there exists k_0 such that x_k is feasible for k ≥ k_0,
- x_k ≠ x⁺ for all k.

Slide 14

Feasible directions

Feasible sequence

Example. One equality constraint:

h(x) = x_1^2 − x_2 = 0.

Feasible point: x⁺ = (0, 0)^T. Feasible sequence:

x_k = (1/k, 1/k^2)^T,

which satisfies h(x_k) = (1/k)^2 − 1/k^2 = 0 for every k.

Slide 15

Feasible directions

Feasible sequence

[Figure: the parabola h(x) = x_1^2 − x_2 = 0 in the (x_1, x_2) plane, with the feasible sequence converging to x⁺ = 0.]

Slide 16

Feasible directions

Feasible direction at the limit

Main idea. Consider the sequence of directions

d_k = (x_k − x⁺) / ‖x_k − x⁺‖,

and take the limit.
- The directions d_k are not necessarily feasible.
- The sequence may not always converge.
- Subsequences must then be considered.

Slide 17

Feasible directions

Feasible direction at the limit

[Figure: the parabola h(x) = x_1^2 − x_2 = 0 with the normalized directions d_1, d_2, d_3 at x⁺ = 0 converging to the limit direction d.]

Slide 18

Feasible directions

Feasible direction at the limit

Example. Constraint:

h(x) = x_1^2 − x_2 = 0.

Feasible point: x⁺ = (0, 0)^T. Feasible sequence:

x_k = ((−1)^k / k, 1/k^2)^T.

Sequence of directions:

d_k = ((−1)^k k / √(k^2 + 1), 1 / √(k^2 + 1))^T.

There are two feasible directions at the limit: the even and odd subsequences of (d_k) converge to (1, 0)^T and (−1, 0)^T, respectively. A small numerical check follows.
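```python
import numpy as np

# Numerical check (sketch): the normalized directions for
# x_k = ((-1)^k / k, 1/k^2) approach (1, 0) along even k and (-1, 0) along odd k.

def d(k):
    xk = np.array([(-1.0)**k / k, 1.0 / k**2])   # x+ = (0, 0)
    return xk / np.linalg.norm(xk)

print(d(1000))   # even k: close to ( 1, 0)
print(d(1001))   # odd k:  close to (-1, 0)
```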
Slide 19

Feasible directions

Feasible direction at the limit

[Figure: the parabola with the directions d_1, …, d_4 alternating between the two branches, and the two limit directions d′ and d″ at x⁺ = 0.]

Slide 20

Feasible directions

Feasible direction at the limit

Definition. Consider the generic optimization problem, let x⁺ ∈ R^n be feasible, and let (x_k)_k be a feasible sequence in x⁺. Then d ≠ 0 is a feasible direction at the limit in x⁺ for the sequence (x_k)_k if there exists a subsequence (x_{k_i})_i such that

d / ‖d‖ = lim_{i→∞} (x_{k_i} − x⁺) / ‖x_{k_i} − x⁺‖.

Slide 21

Feasible directions

Feasible direction at the limit

Notes
- It is sometimes called a tangent direction. The set of all tangent directions is called the tangent cone.
- Any feasible direction d is also a feasible direction at the limit, for the sequence x_k = x⁺ + (1/k) d.

Slide 22

Feasible directions

Linearized cone

Definition. Consider the generic optimization problem, and let x⁺ ∈ R^n be feasible. The set of directions d such that

d^T ∇g_i(x⁺) ≤ 0 for all i = 1, …, p such that g_i(x⁺) = 0, and
d^T ∇h_i(x⁺) = 0, i = 1, …, m,

as well as their multiples αd, α > 0, is the linearized cone at x⁺.
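A membership test for the linearized cone in the running example (a sketch; the helper names are illustrative):

```python
import numpy as np

# Running example: h(x) = x1^2 - x2 = 0, no inequality constraints.

def grad_h(x):
    return np.array([2.0 * x[0], -1.0])

def in_linearized_cone(d, x_plus, tol=1e-9):
    # With only an equality constraint, the condition is d^T grad h(x+) = 0.
    return abs(d @ grad_h(x_plus)) <= tol

x_plus = np.zeros(2)
print(in_linearized_cone(np.array([ 1.0, 0.0]), x_plus))  # True (d')
print(in_linearized_cone(np.array([-1.0, 0.0]), x_plus))  # True (d'')
print(in_linearized_cone(np.array([ 0.0, 1.0]), x_plus))  # False
```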

Slide 23

Feasible directions

Linearized cone

[Figure: the parabola with the gradient ∇h(x⁺) at x⁺ = 0 and the two directions d′ and d″ spanning the linearized cone.]

Slide 24

Feasible directions

Linearized cone

Theorem. Consider the generic optimization problem, and let x⁺ ∈ R^n be feasible. If d is a feasible direction at the limit at x⁺, then d belongs to the linearized cone at x⁺.

Slide 25

Feasible directions

Constraint qualification

Definition. Consider the generic optimization problem, and let x⁺ ∈ R^n be feasible. The constraint qualification condition is verified if every direction in the linearized cone at x⁺ is also in the tangent cone, that is, if it is a feasible direction at the limit at x⁺. This is verified in particular if the constraints are linear, or if the gradients of the constraints active at x⁺ are linearly independent.

Slide 26

Optimality conditions

Outline

1. The problem
2. Duality
3. Feasible directions
4. Optimality conditions
   - Convex constraints
   - Lagrange multipliers: necessary conditions
   - Lagrange multipliers: sufficient conditions
5. Algorithms
   - Constrained Newton
   - Interior point methods
   - Augmented Lagrangian
   - Sequential quadratic programming

Slide 27

Optimality conditions

Optimality conditions

Necessary condition for the generic problem. Let x* be a local minimum of the generic problem. Then ∇f(x*)^T d ≥ 0 for each direction d that is feasible at the limit at x*.

Intuition: no "feasible" direction is a descent direction.

Slide 28

Optimality conditions Convex constraints

Optimality conditions: convex problem (I)

Consider the problem

min_x f(x) subject to x ∈ X ⊆ R^n,

where X is convex and not empty.

Necessary optimality condition. If x* is a local minimum of this problem, then, for any x ∈ X, ∇f(x*)^T (x − x*) ≥ 0.

Slide 29

Optimality conditions Convex constraints

Optimality conditions: convex problem (II)

Necessary condition with projection. Assume now that X is convex and closed. For any y ∈ R^n, we denote by [y]_P the projection of y on X. If x* is a local minimum, then

x* = [x* − α∇f(x*)]_P for all α > 0.

Moreover, if f is convex, the condition is sufficient.

Note: useful when the projection is easy to compute (e.g. bound constraints).
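A sketch of this fixed-point condition for bound constraints, where the projection is just np.clip (the toy problem below is an illustrative assumption):

```python
import numpy as np

# Toy problem: min (x - 2)^2 over X = [0, 1]; the minimum is x* = 1.
l, u = 0.0, 1.0
grad_f = lambda x: 2.0 * (x - 2.0)
project = lambda y: np.clip(y, l, u)

x_star = 1.0
for alpha in (0.1, 1.0, 10.0):
    # Fixed-point check: x* = [x* - alpha * grad f(x*)]_P
    print(np.isclose(x_star, project(x_star - alpha * grad_f(x_star))))  # True
```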

Slide 30

Optimality conditions Lagrange multipliers: necessary conditions

Optimality conditions: Karush-Kuhn-Tucker

The problem

min_{x ∈ R^n} f(x)

subject to
h(x) = 0 [h: R^n → R^m]
g(x) ≤ 0 [g: R^n → R^p]
x ∈ X = R^n

Assumptions
- x* is a local minimum.
- L is the Lagrangian L(x, λ, µ) = f(x) + λ^T h(x) + µ^T g(x).
- The constraint qualification condition is verified.

Slide 31

Optimality conditions Lagrange multipliers: necessary conditions

Optimality conditions: Karush-Kuhn-Tucker

Then there exist a unique λ* ∈ R^m and a unique µ* ∈ R^p such that

∇_x L(x*, λ*, µ*) = ∇f(x*) + (λ*)^T ∇h(x*) + (µ*)^T ∇g(x*) = 0,
µ*_j ≥ 0, j = 1, …, p, and
µ*_j g_j(x*) = 0, j = 1, …, p.

If f, g and h are twice differentiable, we also have

y^T ∇²_{xx} L(x*, λ*, µ*) y ≥ 0

for all y ≠ 0 such that y^T ∇h_i(x*) = 0, i = 1, …, m, and y^T ∇g_i(x*) = 0 for all i = 1, …, p such that g_i(x*) = 0.
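A numerical sketch verifying these conditions on the toy problem used earlier (min x1² + x2² subject to x1 + x2 − 1 = 0, with x* = (0.5, 0.5) and λ* = −1):

```python
import numpy as np

x_star = np.array([0.5, 0.5])
lam_star = -1.0

grad_f = 2.0 * x_star            # gradient of f at x*
grad_h = np.array([1.0, 1.0])    # gradient of the (linear) equality constraint

# Stationarity: grad f(x*) + lambda* grad h(x*) = 0
print(np.allclose(grad_f + lam_star * grad_h, 0.0))   # True

# Curvature on the tangent space {y : y^T grad h = 0}
hess_L = 2.0 * np.eye(2)         # Hessian of L in x (the constraint is linear)
y = np.array([1.0, -1.0])        # spans the tangent space here
print(y @ hess_L @ y >= 0.0)     # True
```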

Slide 32

Optimality conditions Lagrange multipliers: sufficient conditions

KKT: sufficient conditions

Let x* ∈ R^n, λ* ∈ R^m and µ* ∈ R^p be such that
- ∇_x L(x*, λ*, µ*) = 0,
- h(x*) = 0 and g(x*) ≤ 0,
- µ* ≥ 0, µ*_j g_j(x*) = 0 for all j, and µ*_j > 0 for all j such that g_j(x*) = 0,
- y^T ∇²_{xx} L(x*, λ*, µ*) y > 0 for all y ≠ 0 such that y^T ∇h_i(x*) = 0, i = 1, …, m, and y^T ∇g_i(x*) = 0 for all i such that g_i(x*) = 0.

Then x* is a strict local minimum of the problem.

Slide 33

Algorithms

Outline

1. The problem
2. Duality
3. Feasible directions
4. Optimality conditions
   - Convex constraints
   - Lagrange multipliers: necessary conditions
   - Lagrange multipliers: sufficient conditions
5. Algorithms
   - Constrained Newton
   - Interior point methods
   - Augmented Lagrangian
   - Sequential quadratic programming

Slide 34

Algorithms Constrained Newton

Constrained Newton

Context
- Problem with a convex constraint set.
- Assumption: it is easy to project on the set. Examples: bound constraints, linear constraints.

Main idea
- In the unconstrained case, Newton = preconditioned steepest descent.
- Consider first the projected gradient method (a sketch follows), then precondition it.
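A minimal sketch of the projected gradient method for bound constraints; the fixed step size is an illustrative simplification (a line search would be used in practice):

```python
import numpy as np

def projected_gradient(grad_f, x0, l, u, alpha=0.1, n_iter=100):
    x = np.clip(np.asarray(x0, dtype=float), l, u)
    for _ in range(n_iter):
        x = np.clip(x - alpha * grad_f(x), l, u)   # gradient step, then project
    return x

# Toy problem: min (x1 - 2)^2 + (x2 + 1)^2 on [0, 1]^2; the solution is (1, 0).
grad_f = lambda x: np.array([2.0 * (x[0] - 2.0), 2.0 * (x[1] + 1.0)])
print(projected_gradient(grad_f, [0.5, 0.5], 0.0, 1.0))   # -> approx [1. 0.]
```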

Slide 35

Algorithms Constrained Newton

Projected gradient method

[Figure: iterates x_0, x_1, x_2 of the projected gradient method on a polyhedral feasible set, converging to x*.]

Slide 36

Algorithms Constrained Newton

Condition number

Consider ∇²f(x) positive definite. Let λ_1 be the largest eigenvalue, and λ_n the smallest. The condition number is equal to λ_1/λ_n. Geometrically, it is the ratio between the largest and the smallest curvature. The closer it is to one, the better.

Slide 37

Algorithms Constrained Newton

Condition number

[Figure: level curves of two quadratic functions on [−2, 2]²: one with condition number 9/2 (elongated ellipses) and one with condition number 1 (circles).]

Slide 38

Algorithms Constrained Newton

Preconditioning

Preconditioning = appropriate change of variables. Let M ∈ R^{n×n} be invertible; the change of variables is the linear map x′ = Mx. Consider a function f: R^n → R and define

f̃(x′) = f(M^{−1} x′),
∇f̃(x′) = M^{−T} ∇f(M^{−1} x′) = M^{−T} ∇f(x),
∇²f̃(x′) = M^{−T} ∇²f(M^{−1} x′) M^{−1} = M^{−T} ∇²f(x) M^{−1}.

Slide 39

Algorithms Constrained Newton

Preconditioning

What change of variables? Recall from the previous slide that ∇²f̃(x′) = M^{−T} ∇²f(x) M^{−1}. Consider the factorization ∇²f(x) = L L^T, and define x′ = L^T x, that is, M = L^T. Then

∇²f̃(x′) = L^{−1} ∇²f(x) L^{−T} = L^{−1} L L^T L^{−T} = I.
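A small sketch of this Cholesky-based preconditioning on a fixed quadratic (the matrix below is an illustrative choice matching the earlier figure):

```python
import numpy as np

Q = np.array([[9.0, 0.0],
              [0.0, 2.0]])            # condition number 9/2, as in the figure
L = np.linalg.cholesky(Q)             # Q = L @ L.T

Linv = np.linalg.inv(L)
H_tilde = Linv @ Q @ Linv.T           # Hessian after the change of variables
print(np.allclose(H_tilde, np.eye(2)))   # True: condition number 1
```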

Slide 40

Algorithms Constrained Newton

Readings

Bierlaire (2006) Chapter 18. Bertsekas (1999) Section 2.3.

Slide 41

Algorithms Interior point methods

Interior point methods

Motivation
- At an interior point, every direction is feasible.
- It gives more freedom to the algorithm.

Main ideas
- Focus first on being feasible.
- Then try to become optimal.

Slide 42

Algorithms Interior point methods

Barrier functions

Definition. Let X ⊂ R^n be a closed set, and let g: R^n → R^m be a convex function. Let S be the set of interior points for g:

S = {x ∈ R^n | x ∈ X, g(x) < 0}.

A barrier function B: S → R is continuous and such that

lim_{x ∈ S, g(x) → 0} B(x) = +∞.

Slide 43

Algorithms Interior point methods

Barrier functions

Examples

B(x) = − Σ_{j=1}^m ln(−g_j(x)),
B(x) = − Σ_{j=1}^m 1 / g_j(x).
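A sketch of the logarithmic barrier, here for the box 1 ≤ x ≤ 3 used on the next slide (the helper below is an illustrative implementation):

```python
import numpy as np

def log_barrier(g_vals):
    # B(x) = -sum_j ln(-g_j(x)), defined only on the interior g(x) < 0.
    g_vals = np.asarray(g_vals)
    if np.any(g_vals >= 0.0):
        return np.inf                 # outside the interior S
    return -np.sum(np.log(-g_vals))

g = lambda x: np.array([1.0 - x, x - 3.0])   # 1 <= x <= 3 written as g(x) <= 0
print(log_barrier(g(2.0)))     # 0.0 at the center of the interval
print(log_barrier(g(1.001)))   # grows without bound near the boundary
```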

Slide 44

Algorithms Interior point methods

Barrier functions: example (logarithmic)

1 ≤ x ≤ 3 ⟹ B(x) = −ln(x − 1) − ln(3 − x).

[Figure: εB(x) on 1 < x < 3 for ε = 100, 10, 1; the barrier term flattens in the interior as ε decreases.]

Slide 45

Algorithms Interior point methods

Barrier methods

Algorithmic idea. Define a sequence of parameters (ε_k)_k such that

0 < ε_{k+1} < ε_k, k = 0, 1, …, and lim_k ε_k = 0.

At each iteration, solve

x_k = argmin_{x ∈ S} f(x) + ε_k B(x).

Issues
- The subproblem should be easy to solve. In particular, we should rely on unconstrained optimization.
- A descent method should not go outside the constraints, thanks to the barrier.
- The speed of convergence of (ε_k)_k is critical.

Slide 46

Algorithms Interior point methods

Barrier methods

Typical applications: linear optimization, convex optimization.

Example

min x_1 + 2x_2 + 3x_3
subject to x_1 + x_2 + x_3 = 1, x_i ≥ 0, i = 1, 2, 3.

Problem with barrier

x_k = argmin_{x_1+x_2+x_3=1, x>0} x_1 + 2x_2 + 3x_3 − ε_k (ln x_1 + ln x_2 + ln x_3).
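A minimal sketch of the barrier method on this example; eliminating x_3 = 1 − x_1 − x_2 removes the equality constraint, and the solver choice and ε schedule are illustrative assumptions:

```python
import numpy as np
from scipy.optimize import minimize

c = np.array([1.0, 2.0, 3.0])

def barrier_objective(y, eps):
    x = np.array([y[0], y[1], 1.0 - y[0] - y[1]])
    if np.any(x <= 0.0):
        return np.inf                  # outside the interior
    return c @ x - eps * np.sum(np.log(x))

y = np.array([1.0 / 3.0, 1.0 / 3.0])   # interior starting point
for eps in (1.0, 0.1, 0.01, 0.001):    # decreasing barrier parameters
    # Warm-start each subproblem at the previous iterate.
    y = minimize(barrier_objective, y, args=(eps,), method="Nelder-Mead").x

print(np.array([y[0], y[1], 1.0 - y[0] - y[1]]))  # approaches x* = (1, 0, 0)
```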

Slide 47

Algorithms Interior point methods

Central path

[Figure: the simplex x_1 + x_2 + x_3 = 1 with the central path running from x_∞ to the solution x*.]
Slide 48

Algorithms Interior point methods

Readings

Bierlaire (2006) Chapter 19. Bertsekas (1999) Section 4.1. See also: Wright, S. J. (1997) Primal-Dual Interior-Point Methods, SIAM

Slide 49

Algorithms Augmented Lagrangian

Augmented Lagrangian

Main ideas
- Focus first on reducing the objective function, even if constraints are violated; then recover feasibility.
- Inspired by the optimality conditions.

Only equality constraints:

min_{x ∈ R^n} f(x) subject to h(x) = 0 [h: R^n → R^m]

Slack variables: g(x) ≤ 0 ⟺ g(x) + z² = 0.

Slide 50

Algorithms Augmented Lagrangian

Augmented Lagrangian

Penalize constraint violation with a Lagrangian relaxation and a quadratic penalty function.

Augmented Lagrangian:

L_c(x, λ) = f(x) + λ^T h(x) + (c/2) ‖h(x)‖².
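A small sketch of this function on the example developed a few slides below (f(x) = (−x1² + x2²)/2, h(x) = x1 − 1):

```python
import numpy as np

def f(x):
    return 0.5 * (-x[0]**2 + x[1]**2)

def h(x):
    return np.array([x[0] - 1.0])

def aug_lagrangian(x, lam, c):
    # L_c(x, lam) = f(x) + lam^T h(x) + (c/2) ||h(x)||^2
    hx = h(x)
    return f(x) + lam @ hx + 0.5 * c * (hx @ hx)

# At the solution x* = (1, 0) the penalty vanishes, so L_c = f(x*) = -0.5:
print(aug_lagrangian(np.array([1.0, 0.0]), np.array([1.0]), 2.0))   # -0.5
```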

Slide 51

Algorithms Augmented Lagrangian

Augmented Lagrangian: Lagrangian relaxation

Assume λ* is known (see the optimality conditions). The solution is given by solving the unconstrained problem

min_{x ∈ R^n} L_c(x, λ*) = f(x) + (λ*)^T h(x) + (c/2) ‖h(x)‖²

with c sufficiently large.

Not practical
- λ* is not known.
- It is as complicated to obtain as solving the original problem.
- So we approximate it.

Slide 52

Algorithms Augmented Lagrangian

Augmented Lagrangian: quadratic penalty

Take any λ. If c becomes large enough, any infeasible point is non-optimal for

min_{x ∈ R^n} L_c(x, λ) = f(x) + λ^T h(x) + (c/2) ‖h(x)‖².

Consider a sequence (c_k)_k such that lim_{k→∞} c_k = +∞. Then, for a given λ, the sequence x_k = argmin_{x ∈ R^n} L_{c_k}(x, λ) converges to a solution of the constrained problem.

Slide 53

Algorithms Augmented Lagrangian

Augmented Lagrangian: quadratic penalty

Main issue
- If c_k is large, L_{c_k}(x, λ) is ill-conditioned.
- Methods for unconstrained optimization become slow, or may even fail to converge.
- But if λ is close to λ*, there is no need for large values of c_k.

Theoretical result. Under relatively general conditions, the sequence λ_k + c_k h(x_k) converges to λ*.

Slide 54

Algorithms Augmented Lagrangian

Example

Problem

min_{x ∈ R^2} (1/2)(−x_1² + x_2²) subject to x_1 = 1.

Solution: x* = (1, 0)^T and λ* = 1.

Augmented Lagrangian:

L_c(x, λ) = (1/2)(−x_1² + x_2²) + λ(x_1 − 1) + (c/2)(x_1 − 1)².

Slide 55

Algorithms Augmented Lagrangian

Example

Unconstrained minimization:

∇_x L_c(x, λ) = ( (c − 1)x_1 + λ − c , x_2 )^T,

which is zero at x = ( (c − λ)/(c − 1), 0 )^T, and

∇²_{xx} L_c(x, λ) = diag(c − 1, 1).
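A numerical sketch comparing the minimizer of L_c with this analytic formula (valid for c > 1):

```python
import numpy as np
from scipy.optimize import minimize

def L_c(x, lam, c):
    return (0.5 * (-x[0]**2 + x[1]**2)
            + lam * (x[0] - 1.0) + 0.5 * c * (x[0] - 1.0)**2)

for lam, c in ((1.0, 2.0), (0.0, 2.0), (0.0, 100.0)):
    x = minimize(L_c, np.zeros(2), args=(lam, c)).x
    print(lam, c, x[0], (c - lam) / (c - 1.0))
# With lam = lambda* = 1 the minimizer is exactly x1 = 1 for any c > 1;
# with lam = 0 it only approaches 1 as c grows (x1 = c/(c - 1)).
```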
Slide 56

Algorithms Augmented Lagrangian

Example

Constrained problem

[Figure: level curves of the objective with the constraint x_1 = 1; the solution x* = (1, 0)^T is marked.]

Slide 57

Algorithms Augmented Lagrangian

Example

λ = λ∗ = 1, c = 2

[Figure: level curves of L_c(·, λ) for λ = λ* = 1, c = 2; the minimum is at x*.]

Slide 58

Algorithms Augmented Lagrangian

Example

λ = λ∗ = 1, c = 0.5

[Figure: level curves of L_c(·, λ) for λ = λ* = 1, c = 0.5; with c < 1 the Hessian diag(c − 1, 1) is indefinite, and x* is no longer a minimum.]

Slide 59

Algorithms Augmented Lagrangian

Example

λ = λ∗ = 1, c = 10

[Figure: level curves of L_c(·, λ) for λ = λ* = 1, c = 10; the minimum is at x*.]

Slide 60

Algorithms Augmented Lagrangian

Example

λ = 0, c = 2

[Figure: level curves of L_c(·, λ) for λ = 0, c = 2; the minimizer (marked +) is at x_1 = c/(c − 1) = 2, away from x*.]

Slide 61

Algorithms Augmented Lagrangian

Example

λ = 0, c = 5

[Figure: level curves of L_c(·, λ) for λ = 0, c = 5; the minimizer (+) at x_1 = 5/4 moves toward x*.]

Slide 62

Algorithms Augmented Lagrangian

Example

λ = 0, c = 10

[Figure: level curves of L_c(·, λ) for λ = 0, c = 10; the minimizer (+) at x_1 = 10/9 is close to x*.]

Slide 63

Algorithms Augmented Lagrangian

Example

λ = 0, c = 100

[Figure: level curves of L_c(·, λ) for λ = 0, c = 100; the minimizer (+) is nearly at x*.]

Slide 64

Algorithms Augmented Lagrangian

Augmented Lagrangian: algorithm

1. Use an unconstrained optimization algorithm to solve x_{k+1} = argmin_{x ∈ R^n} L_{c_k}(x, λ_k) to a given precision ε_k.
2. If x_{k+1} is close to feasibility:
   - update the estimate of the multipliers: λ_{k+1} = λ_k + c_k h(x_{k+1}),
   - keep c_{k+1} = c_k, and require more precision: ε_{k+1} = ε_k / c_k.
3. If x_{k+1} is far from feasibility:
   - keep λ_{k+1} = λ_k,
   - increase c_k, and relax the precision: ε_{k+1} = ε_0 / c_{k+1}.

A minimal implementation sketch follows.
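This sketch runs the loop above on the running example; the feasibility tolerance, the factor 10 for c, and the fixed inner precision are illustrative choices, not prescribed by the slides:

```python
import numpy as np
from scipy.optimize import minimize

f = lambda x: 0.5 * (-x[0]**2 + x[1]**2)
h = lambda x: np.array([x[0] - 1.0])

def aug_lag(x, lam, c):
    hx = h(x)
    return f(x) + lam @ hx + 0.5 * c * (hx @ hx)

x, lam, c = np.zeros(2), np.zeros(1), 2.0
for k in range(20):
    x = minimize(aug_lag, x, args=(lam, c)).x    # inner unconstrained solve
    if np.linalg.norm(h(x)) < 1e-2:              # close to feasibility
        lam = lam + c * h(x)                     # multiplier update
    else:                                        # far from feasibility
        c *= 10.0

print(x, lam)   # -> approx (1, 0) and lambda* = 1
```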

Slide 65

Algorithms Augmented Lagrangian

Readings

Bierlaire (2006) Chapter 20. Bertsekas (1999) Section 4.2.

Slide 66

Algorithms Sequential quadratic programming

Sequential quadratic programming

Main ideas
- Apply Newton's method to solve the necessary optimality conditions ∇L(x*, λ*) = 0.
- One iteration amounts to solving a quadratic problem.
- Enforce global convergence with a merit function.

Only equality constraints:

min_{x ∈ R^n} f(x) subject to h(x) = 0 [h: R^n → R^m]

Slide 67

Algorithms Sequential quadratic programming

Sequential quadratic programming

Lagrangian and derivatives:

L(x, λ) = f(x) + λ^T h(x),

∇L(x, λ) = ( ∇_x L(x, λ) ; h(x) ),

∇²L(x, λ) =
[ ∇²_{xx} L(x, λ)   ∇h(x) ]
[ ∇h(x)^T           0     ].

Solve ∇L(x, λ) = 0 with Newton's method: at each iteration, find d ∈ R^{n+m} such that

∇²L(x_k, λ_k) d = −∇L(x_k, λ_k).

Slide 68

Algorithms Sequential quadratic programming

Sequential quadratic programming

Equivalent quadratic optimization problem:

min_d ∇f(x_k)^T d + (1/2) d^T ∇²_{xx} L(x_k, λ_k) d
subject to ∇h(x_k)^T d + h(x_k) = 0.

Quadratic optimization
- An analytical solution can be derived for this problem.
- In practice, dedicated iterative algorithms are used.
- Numerical stability and computational efficiency are important.
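A sketch of the Newton-KKT iteration on the running example (f(x) = (−x1² + x2²)/2, h(x) = x1 − 1, with solution x* = (1, 0) and λ* = 1); the problem data below are hard-coded for illustration:

```python
import numpy as np

def newton_kkt_step(x, lam):
    grad_f = np.array([-x[0], x[1]])
    grad_h = np.array([[1.0], [0.0]])      # gradient of h, as an n x m matrix
    hess_L = np.diag([-1.0, 1.0])          # Hessian of L in x (constant here)
    # Assemble and solve the Newton system grad^2 L * d = -grad L.
    kkt = np.block([[hess_L, grad_h],
                    [grad_h.T, np.zeros((1, 1))]])
    rhs = -np.concatenate([grad_f + grad_h @ np.array([lam]), [x[0] - 1.0]])
    d = np.linalg.solve(kkt, rhs)
    return x + d[:2], lam + d[2]

x, lam = np.array([2.0, 0.5]), 0.0
x, lam = newton_kkt_step(x, lam)
print(x, lam)   # (1, 0) and 1: one step suffices, the KKT system is linear here
```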

Slide 69

Algorithms Sequential quadratic programming

Sequential quadratic programming

Local convergence
- Newton's method is not globally convergent.
- The same applies to the SQP method described above.

Global convergence
- Idea: apply globalization techniques similar to the unconstrained case (line search, trust region).
- Main concept: reject a candidate if it is not sufficiently better than the current one.
- But what does "better" mean here? Two (potentially) conflicting objectives: decrease f(x), and bring h(x) close to 0.

Slide 70

Algorithms Sequential quadratic programming

Sequential quadratic programming

Merit function:

φ_c(x) = f(x) + c ‖h(x)‖_1 = f(x) + c Σ_{i=1}^m |h_i(x)|.

Globalization
- Line search: use Wolfe's conditions on the merit function.
- Technical difficulties: one must guarantee that d is a descent direction for φ_c, and deal with the non-differentiability of φ_c.
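A small sketch of this merit function on the running example; the penalty weight c = 10 is an illustrative choice:

```python
import numpy as np

f = lambda x: 0.5 * (-x[0]**2 + x[1]**2)
h = lambda x: np.array([x[0] - 1.0])

def merit(x, c=10.0):
    # phi_c(x) = f(x) + c * ||h(x)||_1
    return f(x) + c * np.sum(np.abs(h(x)))

# The Newton-KKT step of the previous sketch decreases the merit function:
print(merit(np.array([2.0, 0.5])))   #  8.125 at the starting point
print(merit(np.array([1.0, 0.0])))   # -0.5 at x*
```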

Slide 71

Algorithms Sequential quadratic programming

Readings

Bierlaire (2006) Chapter 21. Bertsekas (1999) Section 4.3.

Slide 72

Algorithms Sequential quadratic programming

Outline

1. The problem
2. Duality
3. Feasible directions
4. Optimality conditions
   - Convex constraints
   - Lagrange multipliers: necessary conditions
   - Lagrange multipliers: sufficient conditions
5. Algorithms
   - Constrained Newton
   - Interior point methods
   - Augmented Lagrangian
   - Sequential quadratic programming
