
CS675: Convex and Combinatorial Optimization Fall 2019 Duality of Convex Optimization Problems

Instructor: Shaddin Dughmi


Outline

1. The Lagrange Dual Problem
2. Duality
3. Optimality Conditions

Recall: Optimization Problem in Standard Form

minimize f0(x)
subject to fi(x) ≤ 0, for i = 1, . . . , m,
hi(x) = 0, for i = 1, . . . , k.

For convex optimization problems in standard form, each fi is convex and each hi is affine. Let D denote the common domain of all these functions (i.e., the set where their values are finite).

This Lecture + Next

We will develop duality theory for convex optimization problems, generalizing linear programming duality.


Running Example: Linear Programming

We have already seen the standard form LP below:

maximize c⊺x subject to Ax ≤ b, x ≥ 0

or, written as a minimization problem in standard form,

−minimize −c⊺x subject to Ax − b ≤ 0, −x ≤ 0.

Along the way, we will recover the following standard form dual:

minimize y⊺b subject to A⊺y ≥ c, y ≥ 0.


The Lagrangian

minimize f0(x) subject to fi(x) ≤ 0 for i = 1, . . . , m; hi(x) = 0 for i = 1, . . . , k.

The basic idea of Lagrangian duality is to relax/soften the constraints by replacing each with a linear “penalty term” or “cost” in the objective.

The Lagrangian Function

L(x, λ, ν) = f0(x) + ∑_{i=1}^m λi fi(x) + ∑_{i=1}^k νi hi(x)

λi is the Lagrange multiplier for the i’th inequality constraint, and is required to be nonnegative.
νi is the Lagrange multiplier for the i’th equality constraint, and is allowed to be of arbitrary sign.
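To make the penalty view concrete, here is a minimal sketch on a toy instance of my own (minimize x² subject to 1 − x ≤ 0), not taken from the slides:

```python
# Toy instance (my own, not from the slides): minimize f0(x) = x^2
# subject to f1(x) = 1 - x <= 0 (i.e., x >= 1); no equality constraints.

def f0(x):
    return x * x

def f1(x):
    return 1.0 - x

def lagrangian(x, lam):
    # L(x, lambda) = f0(x) + lambda * f1(x); lambda must be nonnegative.
    return f0(x) + lam * f1(x)

# For any feasible x (f1(x) <= 0) and lambda >= 0, the penalty term is
# nonpositive, so L(x, lambda) <= f0(x).
x_feasible = 1.5
for lam in [0.0, 1.0, 2.0, 5.0]:
    assert lagrangian(x_feasible, lam) <= f0(x_feasible)
```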


The Lagrange Dual Function

minimize f0(x) subject to fi(x) ≤ 0 for i = 1, . . . , m; hi(x) = 0 for i = 1, . . . , k.

The Lagrange dual function gives the optimal value of the primal problem subject to the softened constraints:

g(λ, ν) = inf_{x∈D} L(x, λ, ν) = inf_{x∈D} ( f0(x) + ∑_{i=1}^m λi fi(x) + ∑_{i=1}^k νi hi(x) )

Observe: g is a concave function of the Lagrange multipliers, being a pointwise infimum of functions affine in (λ, ν).
We will see: it's quite common for the Lagrange dual function to be unbounded (−∞) for some λ and ν. By convention, the domain of g is the set of (λ, ν) such that g(λ, ν) > −∞.
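For a toy instance of my own (minimize x² subject to 1 − x ≤ 0), the dual function has the closed form g(λ) = λ − λ²/4 (the unconstrained minimizer of x² + λ(1 − x) is x = λ/2), and a brute-force infimum over a grid can check it numerically:

```python
# Dual function for the toy problem min x^2 s.t. 1 - x <= 0 (assumed
# example, not from the slides). Closed form: the unconstrained minimizer
# of x^2 + lam*(1 - x) is x = lam/2, giving g(lam) = lam - lam^2/4.

def g_closed_form(lam):
    return lam - lam * lam / 4.0

def g_grid(lam, lo=-10.0, hi=10.0, steps=20001):
    # Crude numerical infimum of the Lagrangian over a grid of x values.
    h = (hi - lo) / (steps - 1)
    return min(
        (lo + i * h) ** 2 + lam * (1.0 - (lo + i * h)) for i in range(steps)
    )

p_star = 1.0  # primal optimum: x = 1, f0(1) = 1
for lam in [0.0, 0.5, 1.0, 2.0, 3.0]:
    assert abs(g_grid(lam) - g_closed_form(lam)) < 1e-6
    assert g_closed_form(lam) <= p_star + 1e-12  # weak duality
# g is maximized at lam = 2, where g(2) = 1 = p*: no duality gap here.
```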



Lagrange Dual of LP

−minimize −c⊺x subject to Ax − b ≤ 0, −x ≤ 0

First, the Lagrangian function, with multipliers λ1 for the constraints Ax − b ≤ 0 and λ2 for −x ≤ 0:

L(x, λ) = −c⊺x + λ1⊺(Ax − b) − λ2⊺x = (A⊺λ1 − c − λ2)⊺x − λ1⊺b

And the Lagrange dual:

g(λ) = inf_x L(x, λ) = −λ1⊺b if A⊺λ1 − c − λ2 = 0, and −∞ otherwise.

So we restrict the domain of g to λ satisfying A⊺λ1 − c − λ2 = 0.
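A quick numeric sanity check of this derivation on a one-variable instance (the numbers are my own): once the coefficient of x vanishes, the Lagrangian is constant in x.

```python
# Tiny numeric check (1 variable, 1 constraint; my own numbers): for the
# LP  -min -c*x  s.t.  a*x - b <= 0, -x <= 0  with c=3, a=2, b=4,
# pick lam1 = 2 and lam2 = a*lam1 - c = 1 >= 0, so the coefficient of x
# in the Lagrangian vanishes and L is the constant -lam1*b.
c, a, b = 3.0, 2.0, 4.0
lam1 = 2.0
lam2 = a * lam1 - c          # = 1, nonnegative as required
assert lam2 >= 0

def L(x):
    return -c * x + lam1 * (a * x - b) - lam2 * x

vals = [L(x) for x in (-5.0, 0.0, 1.0, 100.0)]
assert all(abs(v - (-lam1 * b)) < 1e-9 for v in vals)  # constant = -8
```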


Interpretation: “Soft” Lower Bound

min f0(x) subject to fi(x) ≤ 0 for i = 1, . . . , m; hi(x) = 0 for i = 1, . . . , k.

The Lagrange Dual Function

g(λ, ν) = inf_{x∈D} L(x, λ, ν) = inf_{x∈D} ( f0(x) + ∑_{i=1}^m λi fi(x) + ∑_{i=1}^k νi hi(x) )

Fact

g(λ, ν) is a lower bound on OPT(primal) for every λ ≥ 0 and ν ∈ R^k.

Proof

For every primal feasible x, the penalty terms of L(x, λ, ν) are nonpositive: λi fi(x) ≤ 0 since λi ≥ 0 and fi(x) ≤ 0, and νi hi(x) = 0. In particular, for a primal optimal x∗, L(x∗, λ, ν) ≤ f0(x∗). Therefore g(λ, ν) ≤ L(x∗, λ, ν) ≤ f0(x∗) = OPT(primal).

Interpretation

A “hard” feasibility constraint can be thought of as imposing a penalty of +∞ if violated, and a penalty/reward of 0 if satisfied. The Lagrangian instead imposes a “soft” linear penalty for violating a constraint, and a reward for slack. The Lagrange dual function finds the optimal value subject to these soft constraints.


Interpretation: Geometric

Most easily visualized in the presence of a single inequality constraint: minimize f0(x) subject to f1(x) ≤ 0. Let G be the set of attainable (constraint value, objective value) tuples, i.e., (u, t) ∈ G if there is an x such that f1(x) = u and f0(x) = t.

p∗ = inf { t : (u, t) ∈ G, u ≤ 0 }
g(λ) = inf { λu + t : (u, t) ∈ G }

The line λu + t = g(λ) is a supporting hyperplane to G pointing northeast. It must intersect the vertical axis at or below p∗; therefore g(λ) ≤ p∗.
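This picture can be checked numerically on a toy problem (my own choice: f0(x) = x², f1(x) = 1 − x): sample G, form the supporting line, and verify it lies below G and below p∗.

```python
# Numeric illustration of the geometric picture (toy problem, assumed:
# f0(x) = x^2, f1(x) = 1 - x). Sample G = {(f1(x), f0(x))}, then check
# that the line lam*u + t = g(lam) supports G from below.
xs = [-5.0 + i * 0.001 for i in range(10001)]          # x in [-5, 5]
G = [(1.0 - x, x * x) for x in xs]                     # (u, t) pairs

lam = 2.0
g_lam = min(lam * u + t for (u, t) in G)               # g(2), = 1 here
p_star = min(t for (u, t) in G if u <= 0)              # = 1 (at x = 1)

assert all(lam * u + t >= g_lam - 1e-12 for (u, t) in G)  # supporting line
assert g_lam <= p_star + 1e-12                            # weak duality
```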

The Lagrange Dual Problem

This is the problem of finding the best lower bound on OPT(primal) implied by the Lagrange dual function:

maximize g(λ, ν) subject to λ ≥ 0

Note: this is a convex optimization problem (maximizing a concave function), regardless of whether the primal problem was convex. By convention, we sometimes add “dual feasibility” constraints to impose nontrivial lower bounds (i.e., g(λ, ν) > −∞). A pair (λ∗, ν∗) solving the above is referred to as a dual optimal solution.


Lagrange Dual Problem of LP

maximize c⊺x subject to Ax ≤ b, x ≥ 0, written in standard form as −minimize −c⊺x subject to Ax − b ≤ 0, −x ≤ 0.

Recall: our Lagrange dual function for this minimization LP, defined over the domain A⊺λ1 − c − λ2 = 0, is g(λ) = −λ1⊺b.

The Lagrange dual problem can then be written as

−maximize −λ1⊺b subject to A⊺λ1 − c − λ2 = 0, λ ≥ 0.

Since λ2 ≥ 0 serves only as a slack variable, we can eliminate it and replace the equality constraint by A⊺λ1 ≥ c. Renaming y = λ1, this is exactly the standard form dual

minimize y⊺b subject to A⊺y ≥ c, y ≥ 0.
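A sanity check of the recovered dual on a one-variable LP (the numbers are my own): the primal and dual optima coincide.

```python
# Sanity check of the recovered LP dual on a 1-variable instance (my own
# numbers): maximize c*x s.t. a*x <= b, x >= 0 with c, a, b > 0.
c, a, b = 3.0, 2.0, 4.0

x_star = b / a                 # primal optimum at the binding constraint
primal_opt = c * x_star        # = 6

y_star = c / a                 # dual: min y*b s.t. a*y >= c, y >= 0
dual_opt = y_star * b          # = 6

assert a * x_star <= b + 1e-12 and x_star >= 0          # primal feasible
assert a * y_star >= c - 1e-12 and y_star >= 0          # dual feasible
assert abs(primal_opt - dual_opt) < 1e-9                # strong duality
```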


Another Example: Conic Optimization Problem

minimize c⊺x subject to Ax = b, x ∈ K

The constraint x ∈ K can equivalently be written as z⊺x ≤ 0 for all z ∈ K◦, the polar cone of K. With a multiplier λz ≥ 0 for each such constraint,

L(x, λ, ν) = c⊺x + ν⊺(b − Ax) + ∑_{z∈K◦} λz z⊺x = (c − A⊺ν + ∑_{z∈K◦} λz z)⊺x + ν⊺b

We can think of λ ≥ 0 as choosing some s ∈ K◦ (a nonnegative combination of polar vectors), giving L(x, s, ν) = (c − A⊺ν + s)⊺x + ν⊺b. The Lagrange dual function g(s, ν) is bounded only when the coefficient of x is zero, in which case it has value ν⊺b. Eliminating s (we need s = A⊺ν − c ∈ K◦), the dual problem is

maximize ν⊺b subject to A⊺ν − c ∈ K◦
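As a sketch of the polar-cone reformulation in the special case K = R²₊ (where, as I understand it, K◦ is the nonpositive orthant), membership in K matches the test z⊺x ≤ 0 against polar directions:

```python
# Special-case check (assumed instance: K = nonnegative orthant in R^2,
# so the polar cone K° is the nonpositive orthant): x is in K iff
# z^T x <= 0 for all z in K°.
import itertools

def in_K(x):
    return all(xi >= 0 for xi in x)

# Finitely many sampled polar directions (enough for this orthant case).
polar_samples = [(-1.0, 0.0), (0.0, -1.0), (-1.0, -1.0), (-0.5, -2.0)]

def passes_polar_test(x):
    return all(z[0] * x[0] + z[1] * x[1] <= 0 for z in polar_samples)

for x in itertools.product([-1.0, 0.0, 2.0], repeat=2):
    assert in_K(x) == passes_polar_test(x)
```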



Weak Duality

Primal Problem: min f0(x) s.t. fi(x) ≤ 0 ∀i = 1, . . . , m; hi(x) = 0 ∀i = 1, . . . , k.
Dual Problem: max g(λ, ν) s.t. λ ≥ 0.

Weak Duality

OPT(dual) ≤ OPT(primal). We have already argued this holds for every optimization problem. The duality gap is the difference between the optimal primal and dual values.


Recall: Geometric Interpretation of Weak Duality

minimize f0(x) subject to f1(x) ≤ 0. Let G be the set of attainable (constraint value, objective value) tuples: (u, t) ∈ G if there is an x such that f1(x) = u and f0(x) = t.

p∗ = inf { t : (u, t) ∈ G, u ≤ 0 }
g(λ) = inf { λu + t : (u, t) ∈ G }

Fact

The equation λu + t = g(λ) defines a supporting hyperplane to G, intersecting the t axis at g(λ) ≤ p∗.

Strong Duality

We say strong duality holds if OPT(dual) = OPT(primal). Equivalently: there exists a setting of the Lagrange multipliers so that g(λ, ν) gives a tight lower bound on the primal optimal value. In general, strong duality does not hold for non-convex optimization problems. It usually, but not always, holds for convex optimization problems; mild assumptions, such as Slater’s condition, are needed.


Geometric Proof of Strong Duality

minimize f0(x) subject to f1(x) ≤ 0. Let A be everything northeast of (i.e., “worse” than) G: (u, t) ∈ A if there is an x such that f1(x) ≤ u and f0(x) ≤ t.

p∗ = inf { t : (0, t) ∈ A }
g(λ) = inf { λu + t : (u, t) ∈ A }

Fact

The equation λu + t = g(λ) defines a supporting hyperplane to A, intersecting the t axis at g(λ) ≤ p∗.


Geometric Proof of Strong Duality

minimize f0(x) subject to f1(x) ≤ 0

Fact

When f0 and f1 are convex, A is convex.

Proof

Assume (u, t) and (u′, t′) are in A. Then there exist x, x′ with (f1(x), f0(x)) ≤ (u, t) and (f1(x′), f0(x′)) ≤ (u′, t′), componentwise. By Jensen’s inequality (convexity of f0 and f1), for any α ∈ [0, 1],

(f1(αx + (1 − α)x′), f0(αx + (1 − α)x′)) ≤ (αu + (1 − α)u′, αt + (1 − α)t′).

Therefore the segment connecting (u, t) and (u′, t′) is also in A.


Geometric Proof of Strong Duality

Theorem (Informal)

There is a choice of λ so that g(λ) = p∗; therefore, strong duality holds.

Proof

Recall that (0, p∗) is on the boundary of A. By the supporting hyperplane theorem, there is a supporting hyperplane to A at (0, p∗). The direction (normal) of this supporting hyperplane gives us an appropriate λ.


I Lied (A Little)

In our proof, we ignored a technicality that can prevent strong duality from holding. What if our supporting hyperplane H at (0, p∗) is vertical? Then the normal to H is perpendicular to the t axis, and no finite λ exists such that (λ, 1) is normal to H. Somewhat counterintuitively, this can happen even in simple convex optimization problems (though it's somewhat rare in practice).


Violation of Strong Duality

minimize e^{−x} subject to x²/y ≤ 0

where the domain of the constraint is restricted to the region y ≥ 1. The problem is convex, with feasible region given by x = 0 (and y ≥ 1). The optimal value is 1, attained at x = 0 and y = 1.

Here A = R²₊₊ ∪ ({0} × [1, ∞)). Therefore, any supporting hyperplane to A at (0, 1) must be vertical. The optimal dual value is 0: a duality gap of 1.
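The gap can be demonstrated numerically (the specific evaluation points below are my own): for any λ ≥ 0, the Lagrangian can be driven arbitrarily close to 0, so the dual value is 0 while the primal optimum is 1.

```python
# Numeric illustration (evaluation points are my own): for the problem
# min e^{-x} s.t. x^2 / y <= 0 with y >= 1, the dual function
# g(lam) = inf_{x, y>=1} e^{-x} + lam * x^2 / y is 0 for every lam >= 0:
# take x large and y = x^3, so both terms vanish.
import math

def lagrangian(x, y, lam):
    return math.exp(-x) + lam * x * x / y

for lam in [0.5, 1.0, 10.0]:
    x = 1e6
    val = lagrangian(x, x ** 3, lam)   # e^{-1e6} + lam / 1e6
    assert 0 <= val < 1e-4             # so g(lam) <= val, far below p* = 1
# Hence OPT(dual) = 0 < 1 = OPT(primal): a duality gap of 1.
```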


Slater’s Condition

There exists a point x ∈ D where all inequality constraints are strictly satisfied (i.e., fi(x) < 0); that is, the optimization problem is strictly feasible. This is a sufficient condition for strong duality: it forces the supporting hyperplane to be non-vertical. It can be weakened to require strict feasibility only of the non-affine constraints.


Recall: Lagrangian Duality

Primal Problem: min f0(x) s.t. fi(x) ≤ 0 ∀i = 1, . . . , m; hi(x) = 0 ∀i = 1, . . . , k.
Dual Problem: max g(λ, ν) s.t. λ ≥ 0.

Weak Duality: OPT(dual) ≤ OPT(primal).
Strong Duality: OPT(dual) = OPT(primal).


Dual Solution as a Certificate

A dual solution serves as a certificate of optimality: if f0(x) = g(λ, ν) and both are feasible, then both are optimal. If f0(x) − g(λ, ν) ≤ ε, then both are within ε of optimality, since OPT(primal) and OPT(dual) both lie in the interval [g(λ, ν), f0(x)]. Primal-dual algorithms use dual certificates to recognize optimality, or to bound sub-optimality.
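A minimal certificate check on an assumed toy problem (minimize x² subject to 1 − x ≤ 0, whose dual function works out to g(λ) = min_x x² + λ(1 − x) = λ − λ²/4):

```python
# Certificate check for the toy problem min x^2 s.t. 1 - x <= 0 (assumed
# example): x = 1 with lambda = 2 certifies optimality, since
# f0(1) = 1 equals g(2) = 2 - 2^2/4 = 1.
def f0(x):
    return x * x

def g(lam):
    # Dual function of the toy problem: min_x x^2 + lam*(1 - x).
    return lam - lam * lam / 4.0

x, lam = 1.0, 2.0
assert 1.0 - x <= 0 and lam >= 0          # primal and dual feasibility
assert abs(f0(x) - g(lam)) < 1e-12        # zero duality gap: both optimal
```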


Implications of Strong Duality

Primal Problem: min f0(x) s.t. fi(x) ≤ 0 ∀i = 1, . . . , m; hi(x) = 0 ∀i = 1, . . . , k. Dual Problem: max g(λ, ν) s.t. λ ≥ 0.

Facts

If strong duality holds, and x∗ and (λ∗, ν∗) are feasible and optimal, then:
x∗ minimizes L(x, λ∗, ν∗) over all x.
λi∗ fi(x∗) = 0 for all i = 1, . . . , m (complementary slackness).

Proof

f0(x∗) = g(λ∗, ν∗) = min_x L(x, λ∗, ν∗)
≤ L(x∗, λ∗, ν∗) = f0(x∗) + ∑_{i=1}^m λi∗ fi(x∗) + ∑_{i=1}^k νi∗ hi(x∗)
≤ f0(x∗)

The last inequality holds because each λi∗ fi(x∗) ≤ 0 and each hi(x∗) = 0. Since the chain starts and ends at f0(x∗), every inequality is tight: x∗ minimizes L(x, λ∗, ν∗), and each term λi∗ fi(x∗) must equal 0.

Interpretation

The Lagrange multipliers (λ∗, ν∗) “simulate” the primal feasibility constraints. Interpreting λi as the “value” of the i’th constraint: at optimality, only the binding constraints are “valuable.” Recall the economic interpretation of LP duality.


Primal: min f0(x) s.t. fi(x) ≤ 0 ∀i = 1, . . . , m; hi(x) = 0 ∀i = 1, . . . , k. Dual: max g(λ, ν) s.t. λ ≥ 0.

KKT Conditions

Suppose the primal problem is convex and defined on an open domain, and moreover the objective and constraint functions are differentiable everywhere in the domain. If strong duality holds, then x∗ and (λ∗, ν∗) are optimal iff:
x∗ and (λ∗, ν∗) are feasible;
λi∗ fi(x∗) = 0 for all i (complementary slackness);
∇x L(x∗, λ∗, ν∗) = ∇f0(x∗) + ∑_{i=1}^m λi∗ ∇fi(x∗) + ∑_{i=1}^k νi∗ ∇hi(x∗) = 0 (stationarity).

Why are KKT Conditions Useful?

They let us derive an analytical solution to some convex optimization problems, and gain structural insights.

Example: Equality-Constrained Quadratic Program

minimize (1/2) x⊺Px + q⊺x + r subject to Ax = b

KKT conditions: Ax∗ = b and Px∗ + q + A⊺ν∗ = 0. Finding an optimal solution is simply a matter of solving a linear system in the variables x∗ and ν∗: m + n equations and m + n unknowns.
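A worked instance (the numbers are my own): with P = I, q = 0, A = [1 1], b = 1, the KKT system can be solved by hand, and the solution checked against both conditions.

```python
# Worked instance of the KKT system (my own numbers): minimize
# (1/2)(x1^2 + x2^2) subject to x1 + x2 = 1, i.e., P = I, q = 0,
# A = [1 1], b = 1. The KKT system [P A^T; A 0][x; nu] = [-q; b] gives,
# by symmetry, x1 = x2 = 1/2, and stationarity then forces nu = -1/2.
x = [0.5, 0.5]
nu = -0.5

# Feasibility: A x = b
assert abs(x[0] + x[1] - 1.0) < 1e-12
# Stationarity: P x + q + A^T nu = 0, componentwise (here: x_i + nu = 0)
assert all(abs(xi + nu) < 1e-12 for xi in x)
```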


Example: Market Equilibria (Fisher’s Model)

Buyers B and goods G. Buyer i has utility uij for each unit of good j ∈ G. Buyer i has budget mi, and there is one divisible unit of each good.

Does there exist a market equilibrium? That is, prices pj on the items such that each buyer can buy his favorite bundle among those he can afford, and the market clears (supply = demand).

Eisenberg–Gale Convex Program

maximize ∑_i mi log ( ∑_j uij xij )
subject to ∑_i xij ≤ 1, for j ∈ G,
x ≥ 0.

Using the KKT conditions, we can prove that the dual variables corresponding to the item supply constraints are market-clearing prices!
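A sketch check on a trivial instance of my own devising (two buyers, each valuing only one distinct good): the candidate prices clear the market, exhaust the budgets, and satisfy the Eisenberg–Gale stationarity condition mi·uij / (∑_j uij xij) = pj wherever xij > 0 (my reading of the KKT conditions for this program; not worked out on the slides).

```python
# Trivial Fisher-market instance (my own): buyer 1 only values good 1,
# buyer 2 only values good 2. Claimed equilibrium: each buyer takes
# "their" good, priced at that buyer's budget.
m = [3.0, 5.0]                     # budgets
u = [[1.0, 0.0], [0.0, 1.0]]       # utilities u[i][j]
x = [[1.0, 0.0], [0.0, 1.0]]       # allocation x[i][j]
p = [3.0, 5.0]                     # candidate equilibrium prices

for j in range(2):                 # market clearing: supply = demand
    assert abs(sum(x[i][j] for i in range(2)) - 1.0) < 1e-12
for i in range(2):                 # budgets exactly spent
    assert abs(sum(p[j] * x[i][j] for j in range(2)) - m[i]) < 1e-12
for i in range(2):                 # stationarity wherever x[i][j] > 0
    util = sum(u[i][j] * x[i][j] for j in range(2))
    for j in range(2):
        if x[i][j] > 0:
            assert abs(m[i] * u[i][j] / util - p[j]) < 1e-12
```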