
SLIDE 1

CS/ECE/ISyE 524 Introduction to Optimization Spring 2017–18

15. Duality

• Upper and lower bounds
• General duality
• Constraint qualifications
• Counterexample
• Complementary slackness
• Examples
• Sensitivity analysis

Laurent Lessard (www.laurentlessard.com)

SLIDE 2

Upper bounds

Optimization problem (not necessarily convex!):

    minimize_{x ∈ D}  f0(x)
    subject to:  fi(x) ≤ 0  for i = 1, …, m
                 hj(x) = 0  for j = 1, …, r

• D is the domain of all functions involved.
• Suppose the optimal value is p⋆.
• Upper bounds: if x ∈ D satisfies fi(x) ≤ 0 and hj(x) = 0 for all i and j, then p⋆ ≤ f0(x).
• Any feasible x yields an upper bound for p⋆.
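The upper-bound property is easy to exercise numerically. A minimal sketch, using a made-up problem that is not from the slides (minimize x² + y² subject to 1 − x − y ≤ 0, whose true optimum is p⋆ = 1/2 at x = y = 1/2):

```python
# Hypothetical problem (illustrative, not from the slides):
#   minimize x^2 + y^2  subject to  f1(x, y) = 1 - x - y <= 0,
# whose true optimum is p* = 1/2 at x = y = 1/2.
def f0(x, y):
    return x**2 + y**2

def f1(x, y):
    return 1.0 - x - y          # feasible when f1(x, y) <= 0

p_star = 0.5

# Every feasible point yields an upper bound on p*:
for x, y in [(1.0, 0.0), (0.7, 0.4), (0.5, 0.5)]:
    assert f1(x, y) <= 0.0              # the point is feasible ...
    assert p_star <= f0(x, y) + 1e-12   # ... so f0 there upper-bounds p*
```

No solver is needed: feasibility alone certifies the bound.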

SLIDE 3

Lower bounds

Optimization problem (not necessarily convex!):

    minimize_{x ∈ D}  f0(x)
    subject to:  fi(x) ≤ 0  for i = 1, …, m
                 hj(x) = 0  for j = 1, …, r

• As with LPs, use the constraints to find lower bounds.
• For any λi ≥ 0 and νj ∈ ℝ, if x ∈ D is feasible, then

    f0(x) ≥ f0(x) + Σ_{i=1}^m λi fi(x) + Σ_{j=1}^r νj hj(x)

SLIDE 4

Lower bounds

    f0(x) ≥ f0(x) + Σ_{i=1}^m λi fi(x) + Σ_{j=1}^r νj hj(x) = L(x, λ, ν)   (the Lagrangian)

This is a lower bound on f0, but we want a lower bound on p⋆. Minimize the right side over x ∈ D and the left side over feasible x:

    p⋆ ≥ inf_{x ∈ D} L(x, λ, ν) = g(λ, ν)

This inequality holds whenever λ ≥ 0.

SLIDE 5

Lower bounds

    L(x, λ, ν) := f0(x) + Σ_{i=1}^m λi fi(x) + Σ_{j=1}^r νj hj(x)

Whenever λ ≥ 0, we have:

    g(λ, ν) := inf_{x ∈ D} L(x, λ, ν) ≤ p⋆

Useful fact: g(λ, ν) is a concave function. This is true even if the original optimization problem is not convex! (For each fixed x, L(x, λ, ν) is affine in (λ, ν), so g is a pointwise infimum of affine functions.)
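The concavity of g can be seen numerically by approximating the infimum over a grid. A sketch with toy data chosen here for illustration, not from the slides (f0(x) = x² with one constraint f1(x) = x − 1 ≤ 0):

```python
import numpy as np

# Toy data (illustrative, not from the slides): f0(x) = x^2, f1(x) = x - 1.
# For each fixed x, L(x, lam) = f0(x) + lam*f1(x) is affine in lam, so
# g(lam) = inf_x L(x, lam) is a pointwise infimum of affine functions -> concave.
xs = np.linspace(-10.0, 10.0, 2001)     # crude stand-in for "inf over x in D"
lams = np.linspace(0.0, 5.0, 101)

L = (xs**2)[:, None] + (xs - 1.0)[:, None] * lams[None, :]  # L[i, j] = L(xs[i], lams[j])
g = L.min(axis=0)                                           # sampled dual function

# Midpoint concavity: each middle sample dominates the average of its neighbors
# (tolerance absorbs the grid-discretization error).
assert np.all(g[1:-1] >= 0.5 * (g[:-2] + g[2:]) - 1e-4)

# Closed form for this toy problem: x_hat = -lam/2 and g(lam) = -lam^2/4 - lam.
assert np.allclose(g, -lams**2 / 4 - lams, atol=1e-3)
```

The same grid trick works for nonconvex f0 as well, since concavity of g never relied on convexity of the primal.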

SLIDE 6

General duality

Primal problem (P):

    minimize_{x ∈ D}  f0(x)
    subject to:  fi(x) ≤ 0  ∀i
                 hj(x) = 0  ∀j

Dual problem (D):

    maximize_{λ, ν}  g(λ, ν)
    subject to:  λ ≥ 0

If x and (λ, ν) are feasible points of (P) and (D) respectively:

    g(λ, ν) ≤ d⋆ ≤ p⋆ ≤ f0(x)

This is called the Lagrange dual. Bad news: strong duality (p⋆ = d⋆) does not always hold!

SLIDE 7

Example (Srikant)

    minimize_x  x² + 1
    subject to:  (x − 2)(x − 4) ≤ 0

[Plot of x² + 1 and (x − 2)(x − 4) over x ∈ [−1, 5]]

• The optimum occurs at x = 2 and has value p⋆ = 5.

SLIDE 8

Example (Srikant)

Lagrangian: L(x, λ) = x² + 1 + λ(x − 2)(x − 4)

[Plot of L(x, λ) over x ∈ [−1, 5] for several values of λ]

• Plot for different values of λ ≥ 0.
• g(λ) = inf_x L(x, λ) should be a lower bound on p⋆ = 5 for all λ ≥ 0.

SLIDE 9

Example (Srikant)

Lagrangian: L(x, λ) = x² + 1 + λ(x − 2)(x − 4)

• Minimize the Lagrangian:

    g(λ) = inf_x L(x, λ) = inf_x [(λ + 1)x² − 6λx + (8λ + 1)]

If λ ≤ −1, it is unbounded below. If λ > −1, the minimum occurs when 2(λ + 1)x − 6λ = 0, so x̂ = 3λ/(λ + 1).

    g(λ) = { −9λ²/(1 + λ) + 1 + 8λ   if λ > −1
           { −∞                      if λ ≤ −1

SLIDE 10

Example (Srikant)

    maximize_λ  −9λ²/(1 + λ) + 1 + 8λ
    subject to:  λ ≥ 0

[Plot of g(λ) over λ ∈ [−1, 5]]

• The optimum occurs at λ = 2 and has value d⋆ = 5.
• Same optimal value as the primal problem! (strong duality)
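The primal and dual optima claimed above can be checked numerically. A sketch of such a check (the grid resolutions are arbitrary choices, not from the slides):

```python
import numpy as np

# Srikant example from the slides: minimize x^2 + 1  s.t.  (x - 2)(x - 4) <= 0.
# Grid resolutions below are arbitrary choices for this numerical check.
xs = np.linspace(-1.0, 6.0, 70001)
feasible = (xs - 2) * (xs - 4) <= 0
p_star = (xs[feasible]**2 + 1).min()      # primal optimum, attained near x = 2

# Dual function derived on the slides (valid for lam > -1):
lams = np.linspace(0.0, 10.0, 100001)
g = -9 * lams**2 / (1 + lams) + 1 + 8 * lams
d_star = g.max()                          # dual optimum, attained near lam = 2

assert abs(p_star - 5.0) < 1e-3           # p* = 5
assert abs(d_star - 5.0) < 1e-6           # d* = 5: strong duality holds here
assert np.all(g <= p_star + 1e-6)         # weak duality: g(lam) <= p* for lam >= 0
```

The last assertion is the picture on slide 8 in numerical form: every dual value sits below the primal optimum.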

SLIDE 11

Constraint qualifications

• Weak duality (d⋆ ≤ p⋆) always holds, even when the optimization problem is not convex.
• Strong duality (d⋆ = p⋆) often holds for convex problems (but not always).

A constraint qualification is a condition that guarantees strong duality. An example we’ve already seen:

• If the optimization problem is an LP, strong duality holds.

SLIDE 12

Slater’s constraint qualification

    minimize_{x ∈ D}  f0(x)
    subject to:  fi(x) ≤ 0  for i = 1, …, m
                 hj(x) = 0  for j = 1, …, r

Slater’s constraint qualification: If the optimization problem is convex and strictly feasible, then strong duality holds.

• Convexity requires: D and the fi are convex and the hj are affine.
• Strict feasibility means there exists some x̃ in the interior of D such that fi(x̃) < 0 for i = 1, …, m.

SLIDE 13

Slater’s constraint qualification

If the optimization problem is convex and strictly feasible, then strong duality holds.

• Good news: Slater’s constraint qualification is rather weak, i.e. it is usually satisfied by convex problems.
• It can be relaxed so that strict feasibility is not required for the linear constraints.

SLIDE 14

Counterexample (Boyd)

    minimize_{x ∈ ℝ, y > 0}  e⁻ˣ
    subject to:  x²/y ≤ 0

• The function x²/y is convex for y > 0 (see plot).
• The objective e⁻ˣ is convex.
• Feasible set: {(0, y) | y > 0}.
• The solution is trivial (p⋆ = 1).

SLIDE 15

Counterexample (Boyd)

    minimize_{x ∈ ℝ, y > 0}  e⁻ˣ
    subject to:  x²/y ≤ 0

• Lagrangian: L(x, y, λ) = e⁻ˣ + λx²/y
• Dual function: g(λ) = inf_{x, y>0} (e⁻ˣ + λx²/y) = 0.
• The dual problem is:

    maximize_λ  0
    subject to:  λ ≥ 0

So we have d⋆ = 0 < 1 = p⋆.

• Slater’s constraint qualification is not satisfied!
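The claim g(λ) = 0 can be made concrete by driving the Lagrangian toward its infimum along an explicit sequence. A sketch (the multiplier value and sample points are arbitrary choices, not from the slides):

```python
import numpy as np

# Boyd counterexample from the slides: minimize e^(-x) over x in R, y > 0,
# subject to x^2 / y <= 0.  The feasible set is {(0, y) : y > 0}, so p* = 1.
p_star = 1.0

# For any fixed lam >= 0, L(x, y, lam) = e^(-x) + lam*x^2/y can be pushed
# toward 0 by taking x large and y much larger; the sample points below are
# arbitrary choices that illustrate this.
lam = 3.0
points = [(10.0, 1e6), (50.0, 1e12), (200.0, 1e24)]
vals = [np.exp(-x) + lam * x**2 / y for x, y in points]

assert vals[0] > vals[1] > vals[2] > 0.0   # L decreases along the sequence ...
assert vals[2] < 1e-15                     # ... approaching g(lam) = 0

d_star = 0.0         # dual problem: maximize g(lam) = 0 over lam >= 0
assert d_star < p_star                     # duality gap: d* = 0 < 1 = p*
```

Note that the infimum 0 is approached but never attained, which is exactly how the duality gap sneaks in.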

SLIDE 16

About Slater’s constraint qualification

Slater’s condition is only sufficient: (Slater) ⟹ (strong duality), but not conversely.

• There exist problems where Slater’s condition fails, yet strong duality holds.
• There exist nonconvex problems with strong duality.

SLIDE 17

Complementary slackness

Assume strong duality holds. If x⋆ is primal optimal and (λ⋆, ν⋆) is dual optimal, then we have g(λ⋆, ν⋆) = d⋆ = p⋆ = f0(x⋆), so:

    f0(x⋆) = g(λ⋆, ν⋆) = inf_{x ∈ D} [ f0(x) + Σ_{i=1}^m λ⋆i fi(x) + Σ_{j=1}^r ν⋆j hj(x) ]
           ≤ f0(x⋆) + Σ_{i=1}^m λ⋆i fi(x⋆) + Σ_{j=1}^r ν⋆j hj(x⋆)
           ≤ f0(x⋆)

The last inequality holds because x⋆ is primal feasible. We conclude that the inequalities must all be equalities.

SLIDE 18

Complementary slackness

• We concluded that:

    f0(x⋆) = f0(x⋆) + Σ_{i=1}^m λ⋆i fi(x⋆) + Σ_{j=1}^r ν⋆j hj(x⋆)

But fi(x⋆) ≤ 0 and hj(x⋆) = 0, so every term λ⋆i fi(x⋆) is nonpositive and they sum to zero. Therefore:

    λ⋆i fi(x⋆) = 0   for i = 1, …, m

• This property is called complementary slackness. We’ve seen it before for linear programs.

    λ⋆i > 0 ⟹ fi(x⋆) = 0   and   fi(x⋆) < 0 ⟹ λ⋆i = 0
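Complementary slackness can be verified directly on the Srikant example from the earlier slides, where strong duality holds with x⋆ = 2 and λ⋆ = 2:

```python
# Complementary slackness on the Srikant example from the slides:
# minimize x^2 + 1  s.t.  f1(x) = (x - 2)(x - 4) <= 0, where strong duality
# holds with primal optimum x* = 2 and dual optimum lam* = 2.
x_star, lam_star = 2.0, 2.0

f1_at_opt = (x_star - 2) * (x_star - 4)
assert f1_at_opt == 0.0                # lam* > 0 forces the constraint active
assert lam_star * f1_at_opt == 0.0     # lam*_1 * f_1(x*) = 0
```

Here the multiplier is strictly positive, so the constraint must be tight at the optimum, and indeed it is.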

SLIDE 19

Dual of an LP

    minimize_{x ≥ 0}  cᵀx
    subject to:  Ax ≥ b

• Lagrangian: L(x, λ) = cᵀx + λᵀ(b − Ax)
• Dual function: g(λ) = min_{x ≥ 0} [(c − Aᵀλ)ᵀx + λᵀb]

    g(λ) = { λᵀb   if Aᵀλ ≤ c
           { −∞    otherwise

SLIDE 20

Dual of an LP

    minimize_{x ≥ 0}  cᵀx
    subject to:  Ax ≥ b

• The dual is:

    maximize_{λ ≥ 0}  λᵀb
    subject to:  Aᵀλ ≤ c

• This is the same result that we found when we were studying duality for linear programs.
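These primal–dual relationships can be exercised numerically. A sketch with a small LP whose data are made up for illustration (not from the slides):

```python
import numpy as np

# Illustrative LP (data chosen here, not from the slides):
#   minimize c^T x  s.t.  A x >= b,  x >= 0
c = np.array([1.0, 2.0])
A = np.array([[1.0, 1.0],
              [2.0, 1.0]])
b = np.array([3.0, 4.0])

x_star = np.array([3.0, 0.0])      # candidate primal solution (a vertex)
lam_star = np.array([1.0, 0.0])    # candidate dual solution

# Primal and dual feasibility:
assert np.all(A @ x_star >= b - 1e-12) and np.all(x_star >= 0)
assert np.all(A.T @ lam_star <= c + 1e-12) and np.all(lam_star >= 0)

# Matching objectives certify optimality of BOTH points (weak duality
# sandwiches any other candidate between them): c^T x* = b^T lam* = 3.
assert np.isclose(c @ x_star, b @ lam_star)

# Weak duality: every dual-feasible lam lower-bounds the primal optimum.
rng = np.random.default_rng(0)
for _ in range(100):
    lam = rng.uniform(0, 1, size=2)
    if np.all(A.T @ lam <= c):                 # keep only dual-feasible samples
        assert b @ lam <= c @ x_star + 1e-12
```

The equal-objectives check is itself an application of duality: no solver output is trusted, only the certificate.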

SLIDE 21

Dual of an LP

What if we treat x ≥ 0 as a constraint instead? (D = ℝⁿ)

    minimize_x  cᵀx
    subject to:  Ax ≥ b
                 x ≥ 0

• Lagrangian: L(x, λ, µ) = cᵀx + λᵀ(b − Ax) − µᵀx
• Dual function: g(λ, µ) = min_x [(c − Aᵀλ − µ)ᵀx + λᵀb]

    g(λ, µ) = { λᵀb   if Aᵀλ + µ = c
              { −∞    otherwise

SLIDE 22

Dual of an LP

What if we treat x ≥ 0 as a constraint instead? (D = ℝⁿ)

    minimize_x  cᵀx
    subject to:  Ax ≥ b
                 x ≥ 0

• The dual is:

    maximize_{λ ≥ 0, µ ≥ 0}  λᵀb
    subject to:  Aᵀλ + µ = c

• The solution is the same; µ acts as the slack variable.

SLIDE 23

Dual of a convex QP

Suppose Q ≻ 0. Let’s find the dual of the QP:

    minimize_x  ½ xᵀQx
    subject to:  Ax ≥ b

• Lagrangian: L(x, λ) = ½ xᵀQx + λᵀ(b − Ax)
• Dual function: g(λ) = min_x [½ xᵀQx + λᵀ(b − Ax)]
• The minimum occurs at x̂ = Q⁻¹Aᵀλ, so:

    g(λ) = −½ λᵀAQ⁻¹Aᵀλ + λᵀb

SLIDE 24

Dual of a convex QP

Suppose Q ≻ 0. Let’s find the dual of the QP:

    minimize_x  ½ xᵀQx
    subject to:  Ax ≥ b

• The dual is also a QP:

    maximize_λ  −½ λᵀAQ⁻¹Aᵀλ + λᵀb
    subject to:  λ ≥ 0

• It’s still easy to solve (maximizing a concave function).
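The closed-form dual makes a numerical check easy. A sketch with illustrative data chosen here (Q = diag(2, 4), a single constraint x₁ + x₂ ≥ 2, not from the slides), where the one-dimensional dual can be maximized by hand:

```python
import numpy as np

# Illustrative convex QP (data chosen here, not from the slides):
#   minimize (1/2) x^T Q x  s.t.  A x >= b,  with Q > 0.
Q = np.array([[2.0, 0.0],
              [0.0, 4.0]])
A = np.array([[1.0, 1.0]])
b = np.array([2.0])
Qinv = np.linalg.inv(Q)

def g(lam):
    # Dual function from the slides: g(lam) = -1/2 lam^T A Q^{-1} A^T lam + lam^T b
    return -0.5 * lam @ A @ Qinv @ A.T @ lam + lam @ b

# With one constraint, the dual maximize_{lam >= 0} g(lam) is a scalar concave
# quadratic: here A Q^{-1} A^T = 3/4, so g(lam) = -(3/8) lam^2 + 2 lam,
# maximized at lam = 8/3 (which satisfies lam >= 0).
lam_star = np.array([8.0 / 3.0])
x_star = Qinv @ A.T @ lam_star      # primal recovery: x_hat = Q^{-1} A^T lam
d_star = g(lam_star)
p_star = 0.5 * x_star @ Q @ x_star

assert np.all(A @ x_star >= b - 1e-12)    # recovered x_star is primal feasible
assert np.isclose(p_star, d_star)         # strong duality (Slater holds here)
```

Recovering x̂ = Q⁻¹Aᵀλ⋆ and confirming p⋆ = d⋆ is the standard way to extract a primal solution from the dual.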

SLIDE 25

Sensitivity analysis

Primal:

    min_{x ∈ D}  f0(x)
    s.t.  fi(x) ≤ ui  ∀i
          hj(x) = vj  ∀j

Dual:

    max_{λ, ν}  g(λ, ν) − λᵀu − νᵀv
    s.t.  λ ≥ 0

• As with LPs, dual variables quantify the sensitivity of the optimal cost to changes in each of the constraints.
• A change in ui causes a bigger change in p⋆ if λ⋆i is larger.
• A change in vj causes a bigger change in p⋆ if ν⋆j is larger.
• If p⋆(u, v) is differentiable, then:

    λ⋆i = −∂p⋆(0, 0)/∂ui   and   ν⋆j = −∂p⋆(0, 0)/∂vj
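The derivative formula can be checked on the Srikant example from the earlier slides by perturbing the constraint to (x − 2)(x − 4) ≤ u and differentiating p⋆(u) numerically (the step size h is an arbitrary choice):

```python
import numpy as np

# Sensitivity check on the Srikant example from the slides:
# perturb the constraint to (x - 2)(x - 4) <= u and track p*(u).
# Feasible set: 3 - sqrt(1 + u) <= x <= 3 + sqrt(1 + u), so the minimizer of
# x^2 + 1 sits at the left endpoint and p*(u) = (3 - sqrt(1 + u))^2 + 1.
def p_star(u):
    return (3 - np.sqrt(1 + u))**2 + 1

lam_star = 2.0        # dual optimum found earlier (d* = p* = 5 at lam = 2)
h = 1e-6
dp_du = (p_star(h) - p_star(-h)) / (2 * h)   # central finite difference at u = 0

# The slides' formula: lam*_i = -d p*(0)/d u_i
assert abs(dp_du + lam_star) < 1e-4
```

Loosening the constraint (u > 0) lowers the optimal cost at rate λ⋆ = 2, exactly as the sensitivity formula predicts.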