Convex Optimization (EE227A: UC Berkeley), Lecture 6 (Conic optimization), 07 Feb 2013, Suvrit Sra. PowerPoint presentation transcript.


SLIDE 1

Convex Optimization
(EE227A: UC Berkeley)

Lecture 6 (Conic optimization)
07 Feb, 2013

Suvrit Sra
SLIDE 2

Organizational Info

◮ Quiz coming up on 19th Feb.
◮ Project teams by 19th Feb.
◮ Good if you can mix your research with class projects.
◮ More info in a few days.

SLIDE 3

Mini Challenge

Kummer's confluent hypergeometric function:
$$M(a, c, x) := \sum_{j \ge 0} \frac{(a)_j}{(c)_j} \frac{x^j}{j!}, \qquad a, c, x \in \mathbb{R},$$
where $(a)_0 = 1$ and $(a)_j = a(a+1)\cdots(a+j-1)$ is the rising factorial.

Claim: Let $c > a > 0$ and $x \ge 0$. Then the function
$$h_{a,c}(\mu; x) := \mu \mapsto \frac{\Gamma(a+\mu)}{\Gamma(c+\mu)}\, M(a+\mu,\, c+\mu,\, x)$$
is strictly log-convex on $[0, \infty)$ (note that $h$ is a function of $\mu$).

Recall: $\Gamma(x) := \int_0^\infty t^{x-1} e^{-t}\, dt$ is the Gamma function (which is known to be log-convex for $x \ge 1$; see also Exercise 3.52 of BV).
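The claim above can be probed numerically. The sketch below (a sanity check at a few points, not a proof) sums the series for $M$ directly and tests midpoint log-convexity of $h$; the helper names `kummer_M` and `h` are my own, not from the lecture.

```python
import math

def kummer_M(a, c, x, terms=200):
    """Partial sum of M(a, c, x) = sum_j (a)_j/(c)_j * x^j/j!."""
    total, term = 0.0, 1.0  # j = 0 term is (a)_0/(c)_0 * x^0/0! = 1
    for j in range(terms):
        total += term
        term *= (a + j) / (c + j) * x / (j + 1)  # ratio of consecutive terms
    return total

def h(mu, a, c, x):
    """h_{a,c}(mu; x) = Gamma(a+mu)/Gamma(c+mu) * M(a+mu, c+mu, x)."""
    return math.exp(math.lgamma(a + mu) - math.lgamma(c + mu)) * kummer_M(a + mu, c + mu, x)

# Midpoint log-convexity: log h((m1+m2)/2) <= (log h(m1) + log h(m2)) / 2.
a, c, x = 0.5, 1.5, 2.0  # satisfies c > a > 0, x >= 0
for m1, m2 in [(0.0, 2.0), (0.5, 3.0), (1.0, 4.0)]:
    mid = 0.5 * (m1 + m2)
    lhs = math.log(h(mid, a, c, x))
    rhs = 0.5 * (math.log(h(m1, a, c, x)) + math.log(h(m2, a, c, x)))
    assert lhs <= rhs + 1e-12
```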

SLIDE 5

LP formulation

Write $\min \|Ax - b\|_1$ over $x \in \mathbb{R}^n$ as a linear program.

$$\min_x\ \sum_i |a_i^T x - b_i|$$

$$\min_{x,t}\ \sum_i t_i \quad \text{s.t.}\ |a_i^T x - b_i| \le t_i, \quad i = 1, \dots, m.$$

$$\min_{x,t}\ \mathbf{1}^T t \quad \text{s.t.}\ -t_i \le a_i^T x - b_i \le t_i, \quad i = 1, \dots, m.$$

Exercise: Recast $\|Ax - b\|_2^2 + \lambda \|Bx\|_1$ as a QP.
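The final epigraph form above can be handed to any LP solver. A minimal sketch, assuming SciPy's `linprog` is available (it is not part of the lecture's CVX toolchain), on a tiny 1-D instance whose $\ell_1$ optimum is known by hand to be the median of $b$:

```python
import numpy as np
from scipy.optimize import linprog

# 1-D example: minimize |x - 0| + |x - 1| + |x - 3|, i.e. A = ones, m = 3.
A = np.ones((3, 1))
b = np.array([0.0, 1.0, 3.0])
m, n = A.shape

# Variables z = (x, t); minimize 1^T t subject to -t <= Ax - b <= t.
c = np.concatenate([np.zeros(n), np.ones(m)])
A_ub = np.block([[A, -np.eye(m)], [-A, -np.eye(m)]])
b_ub = np.concatenate([b, -b])
res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (n + m))

# The l1 minimizer in 1-D is the median of b: x = 1, objective 1 + 0 + 2 = 3.
assert abs(res.fun - 3.0) < 1e-6
```

Note the explicit `bounds=(None, None)`: `linprog` defaults to nonnegative variables, but here $x$ and $t$ are free.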

SLIDE 7

Cone programs – overview

◮ Last time we briefly saw LP, QP, SOCP, SDP

LP (standard form): $\min\ f^T x$ s.t. $Ax = b,\ x \ge 0$.
Feasible set $\mathcal{X} = \{x \mid Ax = b\} \cap \mathbb{R}^n_+$ (nonnegative orthant).

Input data: $(A, b, f)$. Structural constraint: $x \ge 0$.

How should we generalize this model?

SLIDE 11

Cone programs – overview

◮ Replace the linear map $x \mapsto Ax$ by a nonlinear map?
◮ Quickly becomes nonconvex, potentially intractable

Generalize the structural constraint $\mathbb{R}^n_+$ instead:
♣ Replace the nonnegative orthant by a convex cone $\mathcal{K}$
♣ Replace $\ge$ by a conic inequality
♣ Nesterov and Nemirovski developed a nice theory in the late 80s
♣ Rich class of cones for which cone programs are tractable

SLIDE 17

Conic inequalities

◮ We are looking for "good" vector inequalities $\succeq$ on $\mathbb{R}^n$
◮ Such an inequality is characterized by its set of nonnegative vectors
$\mathcal{K} := \{x \in \mathbb{R}^n \mid x \succeq 0\}$, via
$$x \succeq y \iff x - y \succeq 0 \iff x - y \in \mathcal{K}.$$
◮ A necessary and sufficient condition for a set $\mathcal{K} \subset \mathbb{R}^n$ to define a useful vector inequality is: it should be a nonempty, pointed, convex cone.

SLIDE 20

Cone programs – inequalities

  • $\mathcal{K}$ is nonempty: $\mathcal{K} \neq \emptyset$
  • $\mathcal{K}$ is closed under addition: $x, y \in \mathcal{K} \Rightarrow x + y \in \mathcal{K}$
  • $\mathcal{K}$ is closed under nonnegative scaling: $x \in \mathcal{K},\ \alpha \ge 0 \Rightarrow \alpha x \in \mathcal{K}$
  • $\mathcal{K}$ is pointed: $x \in \mathcal{K}$ and $-x \in \mathcal{K} \Rightarrow x = 0$

Cone inequality:
$$x \succeq_{\mathcal{K}} y \iff x - y \in \mathcal{K}, \qquad x \succ_{\mathcal{K}} y \iff x - y \in \operatorname{int}(\mathcal{K}).$$
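The four properties above can be checked numerically for a concrete cone. A small sketch for the second-order cone $\{(x, t) : \|x\|_2 \le t\}$ (the sampling demonstrates, it does not prove; the helper names are my own):

```python
import numpy as np

def in_soc(z, tol=1e-9):
    """Membership in the second-order cone {(x, t) : ||x||_2 <= t}."""
    x, t = z[:-1], z[-1]
    return np.linalg.norm(x) <= t + tol

rng = np.random.default_rng(1)

def random_soc_point(n=3):
    x = rng.standard_normal(n)
    t = np.linalg.norm(x) + rng.uniform(0, 1)  # choose t >= ||x||, so (x, t) is in the cone
    return np.concatenate([x, [t]])

for _ in range(200):
    u, v = random_soc_point(), random_soc_point()
    assert in_soc(u + v)                   # closed under addition
    assert in_soc(rng.uniform(0, 3) * u)   # closed under nonnegative scaling

# Pointedness: z and -z can both lie in the cone only when z = 0.
z = np.array([0.3, 0.4, 0.5])  # ||(0.3, 0.4)||_2 = 0.5, so z is on the boundary
assert in_soc(z) and not in_soc(-z)
```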

SLIDE 22

Conic inequalities

◮ The cone underlying the standard coordinatewise vector inequality
$$x \ge y \iff x_i \ge y_i\ \forall i \iff x_i - y_i \ge 0\ \forall i$$
is the nonnegative orthant $\mathbb{R}^n_+$.
◮ Two more important properties that $\mathbb{R}^n_+$ has as a cone:
  – It is closed: $x_k \in \mathbb{R}^n_+,\ x_k \to x \Rightarrow x \in \mathbb{R}^n_+$
  – It has nonempty interior (contains a Euclidean ball of positive radius)
◮ We'll require our cones to also satisfy these two properties.

SLIDE 25

Conic optimization problems

Standard form cone program:
$$\min\ f^T x \ \ \text{s.t.}\ Ax = b,\ x \in \mathcal{K} \qquad \text{or} \qquad \min\ f^T x \ \ \text{s.t.}\ Ax \preceq_{\mathcal{K}} b.$$

♣ The nonnegative orthant $\mathbb{R}^n_+$
♣ The second-order cone $\mathcal{Q}^n := \{(x, t) \in \mathbb{R}^n \times \mathbb{R} \mid \|x\|_2 \le t\}$
♣ The semidefinite cone $\mathbb{S}^n_+ := \{X = X^T \succeq 0\}$
♣ Other cones $\mathcal{K}$ given by Cartesian products of these
♣ These cones are "nice":
  ♣ LP, QP, SOCP, SDP: all are cone programs
  ♣ Can treat them theoretically in a uniform way (roughly)
♣ Not all cones are nice!

SLIDE 31

Cone programs – tough case

Copositive cone
Def. Let $\mathcal{CP}^n := \{A \in \mathbb{S}^{n \times n} \mid x^T A x \ge 0\ \forall x \ge 0\}$.

Exercise: Verify that $\mathcal{CP}^n$ is a convex cone.

If someone told you convex is "easy" ... they lied!
◮ Testing membership in $\mathcal{CP}^n$ is co-NP-complete.
(Deciding whether a given matrix is not copositive is NP-complete.)
◮ Copositive cone programming: NP-hard.

Exercise: Verify that the following matrix is copositive:
$$A := \begin{pmatrix} 1 & -1 & 1 & 1 & -1 \\ -1 & 1 & -1 & 1 & 1 \\ 1 & -1 & 1 & -1 & 1 \\ 1 & 1 & -1 & 1 & -1 \\ -1 & 1 & 1 & -1 & 1 \end{pmatrix}.$$
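The exercise matrix (this is Horn's matrix, under my reading of the slide) can at least be tested by sampling. Since membership is co-NP-complete, the sampled check below is evidence, not a proof; it also shows the matrix is not PSD, so copositivity is genuinely weaker than positive semidefiniteness:

```python
import numpy as np

# Horn's matrix, as printed on the slide.
A = np.array([
    [ 1, -1,  1,  1, -1],
    [-1,  1, -1,  1,  1],
    [ 1, -1,  1, -1,  1],
    [ 1,  1, -1,  1, -1],
    [-1,  1,  1, -1,  1],
], dtype=float)

# A is not PSD: some eigenvalue is negative (e.g. x = (1,1,0,-1,0) gives x^T A x = -3).
assert np.linalg.eigvalsh(A).min() < -1e-9

# Sampled copositivity check over random nonnegative vectors.
rng = np.random.default_rng(0)
for _ in range(20_000):
    x = rng.uniform(0, 1, 5)
    assert x @ A @ x >= -1e-9
```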

SLIDE 35

SOCP in conic form

$$\min\ f^T x \quad \text{s.t.}\ \|A_i x + b_i\|_2 \le c_i^T x + d_i, \quad i = 1, \dots, m.$$

Let $A_i \in \mathbb{R}^{n_i \times n}$; so $A_i x + b_i \in \mathbb{R}^{n_i}$. Take $\mathcal{K} = \mathcal{Q}^{n_1} \times \mathcal{Q}^{n_2} \times \cdots \times \mathcal{Q}^{n_m}$ and stack

$$A = \begin{bmatrix} -A_1 \\ -c_1^T \\ -A_2 \\ -c_2^T \\ \vdots \\ -A_m \\ -c_m^T \end{bmatrix}, \qquad b = \begin{bmatrix} b_1 \\ d_1 \\ b_2 \\ d_2 \\ \vdots \\ b_m \\ d_m \end{bmatrix}.$$

SOCP in conic form: $\min\ f^T x$ s.t. $Ax \preceq_{\mathcal{K}} b$.

SLIDE 37

SOCP representation

Exercise: Let $0 \prec Q = LL^T$; show that
$$x^T Q x + 2b^T x + c \le 0 \iff \|L^T x + L^{-1} b\|_2 \le \sqrt{b^T Q^{-1} b - c}.$$

Rotated second-order cone:
$$\mathcal{Q}^n_r := \{(x, y, z) \in \mathbb{R}^{n+1} \mid \|x\|_2 \le \sqrt{yz},\ y \ge 0,\ z \ge 0\}.$$

Convert into a standard SOC (verify!):
$$\left\| \begin{pmatrix} 2x \\ y - z \end{pmatrix} \right\|_2 \le y + z \iff \|x\|_2^2 \le yz.$$

Exercise: Rewrite the constraint $x^T Q x \le t$, where both $x$ and $t$ are variables, using the rotated second-order cone.
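The rotated-to-standard conversion above rests on the algebraic identity $(y+z)^2 - (y-z)^2 = 4yz$. A quick numerical check on random points (comparing squared norms to avoid square roots):

```python
import numpy as np

rng = np.random.default_rng(0)
for _ in range(1000):
    x = rng.standard_normal(3)
    y, z = rng.uniform(0, 2, 2)
    # ||(2x, y-z)||_2 <= y + z, in squared form.
    std_soc = 4 * (x @ x) + (y - z) ** 2 <= (y + z) ** 2
    # ||x||_2^2 <= y z, the rotated-cone condition.
    rot_soc = x @ x <= y * z
    assert std_soc == rot_soc
```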

SLIDE 40

Convex QP as SOCP

$$\min\ x^T Q x + c^T x \quad \text{s.t.}\ Ax = b.$$

$$\min_{x,t}\ c^T x + t \quad \text{s.t.}\ Ax = b,\ x^T Q x \le t.$$

$$\min_{x,t}\ c^T x + t \quad \text{s.t.}\ Ax = b,\ (L^T x,\ t,\ 1) \in \mathcal{Q}^n_r,$$

since $x^T Q x = x^T L L^T x = \|L^T x\|_2^2$.

SLIDE 43

Convex QCQPs as SOCP

Quadratically constrained QP:
$$\min\ q_0(x) \quad \text{s.t.}\ q_i(x) \le 0,\ i = 1, \dots, m,$$
where each $q_i(x) = x^T P_i x + b_i^T x + c_i$ is a convex quadratic.

Exercise: Show how QCQPs can be cast as SOCPs using $\mathcal{Q}^n_r$.
Hint: See Lecture 5!

Exercise: Explain why we cannot cast SOCPs as QCQPs. That is, why can we not simply use the equivalence
$$\|Ax + b\|_2 \le c^T x + d \iff \|Ax + b\|_2^2 \le (c^T x + d)^2,\ \ c^T x + d \ge 0?$$
Hint: Look carefully at the inequality!

SLIDE 46

Robust LP

$$\min\ c^T x \quad \text{s.t.}\ a_i^T x \le b_i\ \ \forall a_i \in \mathcal{E}_i, \qquad \text{where } \mathcal{E}_i := \{\bar{a}_i + P_i u \mid \|u\|_2 \le 1\}.$$

Robust half-space constraint:
◮ We wish to ensure $a_i^T x \le b_i$ holds irrespective of which $a_i$ we pick from the uncertainty set $\mathcal{E}_i$. This happens if $b_i \ge \sup_{a_i \in \mathcal{E}_i} a_i^T x$.

$$\sup_{\|u\|_2 \le 1}\ (\bar{a}_i + P_i u)^T x = \bar{a}_i^T x + \|P_i^T x\|_2.$$

◮ We used the fact that $\sup_{\|u\|_2 \le 1} u^T v = \|v\|_2$ (recall dual norms).

SOCP formulation:
$$\min\ c^T x \quad \text{s.t.}\ \bar{a}_i^T x + \|P_i^T x\|_2 \le b_i, \quad i = 1, \dots, m.$$
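The dual-norm supremum used above can be verified by sampling the unit ball: the maximizer is $u^\star = P^T x / \|P^T x\|_2$, and no other unit-ball $u$ does better. A small sketch with arbitrary data:

```python
import numpy as np

rng = np.random.default_rng(0)
a_bar = rng.standard_normal(4)
P = rng.standard_normal((4, 3))
x = rng.standard_normal(4)

closed_form = a_bar @ x + np.linalg.norm(P.T @ x)

# The maximizing u is P^T x / ||P^T x||_2 and attains the closed form.
u_star = P.T @ x / np.linalg.norm(P.T @ x)
assert np.isclose((a_bar + P @ u_star) @ x, closed_form)

# No random point of the unit ball exceeds it.
for _ in range(1000):
    u = rng.standard_normal(3)
    u /= max(np.linalg.norm(u), 1.0)  # project into the unit ball if outside
    assert (a_bar + P @ u) @ x <= closed_form + 1e-9
```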

SLIDE 52

Semidefinite Program (SDP)

Cone program (semidefinite):
$$\min\ c^T x \quad \text{s.t.}\ Ax = b,\ x \in \mathcal{K}, \qquad \text{where } \mathcal{K} \text{ is a product of semidefinite cones.}$$

Standard form
◮ Think of $x$ as a matrix variable $X$
◮ Wlog we may assume $\mathcal{K} = \mathbb{S}^n_+$ (Why?)
◮ Say $\mathcal{K} = \mathbb{S}^{n_1}_+ \times \mathbb{S}^{n_2}_+$
◮ Then $(X_1, X_2) \in \mathcal{K} \iff X := \mathrm{Diag}(X_1, X_2) \in \mathbb{S}^{n_1+n_2}_+$
◮ Thus, by forcing the off-diagonal blocks to be zero, we reduce to the case where $\mathcal{K}$ is the semidefinite cone itself (of suitable dimension).
◮ So, in matrix notation: $c^T x \to \mathrm{Tr}(CX)$; $a_i^T x = b_i \to \mathrm{Tr}(A_i X) = b_i$; and $x \in \mathcal{K}$ as $X \succeq 0$.

SLIDE 59

SDP

SDP (conic form):
$$\min_{y \in \mathbb{R}^n}\ c^T y \quad \text{s.t.}\ \mathcal{A}(y) := A_0 + y_1 A_1 + y_2 A_2 + \cdots + y_n A_n \succeq 0.$$

Standard form SDP:
$$\min\ \mathrm{Tr}(CX) \quad \text{s.t.}\ \mathrm{Tr}(A_i X) = b_i,\ i = 1, \dots, m, \quad X \succeq 0.$$

One form can be converted into the other.

SLIDE 62

SDP – CVX form

cvx_begin
    variable X(n,n) symmetric;
    minimize( trace(C*X) )
    subject to
        for i = 1:m,
            trace(A{i}*X) == b(i);
        end
        X == semidefinite(n);
cvx_end

Note: remember symmetric and semidefinite.

SLIDE 63

SDP representation – LP

LP as SDP: $\min\ f^T x$ s.t. $Ax \le b$.

SDP formulation:
$$\min\ f^T x \quad \text{s.t.}\ \mathcal{A}(x) := \mathrm{diag}(b_1 - a_1^T x, \dots, b_m - a_m^T x) \succeq 0.$$

SLIDE 65

SDP representation – SOCP

SOCP as SDP: $\min\ f^T x$ s.t. $\|A_i x + b_i\|_2 \le c_i^T x + d_i,\ i = 1, \dots, m$.

SDP formulation:
$$\|x\|_2 \le t \iff \begin{bmatrix} t & x^T \\ x & tI \end{bmatrix} \succeq 0.$$

Schur complements: for $C \succ 0$,
$$\begin{bmatrix} A & B^T \\ B & C \end{bmatrix} \succeq 0 \iff A - B^T C^{-1} B \succeq 0.$$

$$\|A_i x + b_i\|_2 \le c_i^T x + d_i \iff \begin{bmatrix} c_i^T x + d_i & (A_i x + b_i)^T \\ A_i x + b_i & (c_i^T x + d_i) I \end{bmatrix} \succeq 0.$$
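The norm-to-LMI equivalence above can be checked by computing eigenvalues: the arrow matrix $\begin{bmatrix} t & x^T \\ x & tI \end{bmatrix}$ has eigenvalues $t \pm \|x\|_2$ (and $t$), so it is PSD exactly when $\|x\|_2 \le t$. A sketch (helper name `soc_lmi` is my own):

```python
import numpy as np

def soc_lmi(x, t):
    """The arrow matrix [[t, x^T], [x, t I]] from the slide."""
    n = len(x)
    M = t * np.eye(n + 1)
    M[0, 1:] = x
    M[1:, 0] = x
    return M

rng = np.random.default_rng(0)
for _ in range(500):
    x = rng.standard_normal(3)
    t = rng.uniform(0, 3)
    psd = np.linalg.eigvalsh(soc_lmi(x, t)).min() >= -1e-9
    assert psd == (np.linalg.norm(x) <= t + 1e-9)
```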

SLIDE 68

SDP / LMI representation

Def. A set $S \subset \mathbb{R}^n$ is called linear matrix inequality (LMI) representable if there exist symmetric matrices $A_0, \dots, A_n$ such that
$$S = \{x \in \mathbb{R}^n \mid A_0 + x_1 A_1 + \cdots + x_n A_n \succeq 0\}.$$
$S$ is called SDP representable if it equals the projection of some higher-dimensional LMI representable set.

♠ Linear inequalities: $Ax \le b$ iff
$$\begin{bmatrix} b_1 - a_1^T x & & \\ & \ddots & \\ & & b_m - a_m^T x \end{bmatrix} \succeq 0.$$

SLIDE 70

SDP / LMI representation

♠ Convex quadratics: $x^T L L^T x + b^T x \le c$ iff
$$\begin{bmatrix} I & L^T x \\ x^T L & c - b^T x \end{bmatrix} \succeq 0.$$

♠ Eigenvalue inequalities:
$\lambda_{\max}(X) \le t$ iff $tI - X \succeq 0$;
$\lambda_{\min}(X) \ge t$ iff $X - tI \succeq 0$
($\lambda_{\max}$ is convex, $\lambda_{\min}$ concave).

♠ Matrix norm: for $X \in \mathbb{R}^{m \times n}$, $\|X\|_2 \le t$ (i.e., $\sigma_{\max}(X) \le t$) iff
$$\begin{bmatrix} tI_m & X \\ X^T & tI_n \end{bmatrix} \succeq 0.$$
Proof. $t^2 I \succeq X X^T \Rightarrow t^2 \ge \lambda_{\max}(X X^T) = \sigma_{\max}^2(X)$.

SLIDE 74

SDP / LMI representation

♠ Sum of top eigenvalues: For $X \in \mathbb{S}^n$, $\sum_{i=1}^k \lambda_i(X) \le t$ iff there exist $s \in \mathbb{R}$ and $Z \in \mathbb{S}^n$ such that
$$t - ks - \mathrm{Tr}(Z) \ge 0, \qquad Z \succeq 0, \qquad Z - X + sI \succeq 0.$$

Proof: Suppose $\sum_{i=1}^k \lambda_i(X) \le t$. Then, choosing $s = \lambda_k$ and $Z = \mathrm{Diag}(\lambda_1 - s, \dots, \lambda_k - s, 0, \dots, 0)$ (in a basis of eigenvectors of $X$), the above LMIs hold.

Conversely, if the above LMIs hold, then (since $Z \succeq 0$) $X \preceq Z + sI$, so
$$\sum_{i=1}^k \lambda_i(X) \le \sum_{i=1}^k (\lambda_i(Z) + s) \le \sum_{i=1}^n \lambda_i(Z) + ks = \mathrm{Tr}(Z) + ks \le t \quad \text{(from the first inequality)}.$$

SLIDE 78

SDP / LMI Representation

♠ Nuclear norm: For $X \in \mathbb{R}^{m \times n}$, $\|X\|_{\mathrm{tr}} := \sum_{i=1}^n \sigma_i(X) \le t$ iff there exist $s$ and $Z$ such that
$$t - ns - \mathrm{Tr}(Z) \ge 0, \qquad Z \succeq 0, \qquad Z - \begin{bmatrix} 0 & X \\ X^T & 0 \end{bmatrix} + sI_{m+n} \succeq 0.$$

Follows from: $\lambda\left(\begin{bmatrix} 0 & X \\ X^T & 0 \end{bmatrix}\right) = (\pm\sigma(X), 0, \dots, 0)$.

Alternatively, we may SDP-represent the nuclear norm as
$$\|X\|_{\mathrm{tr}} \le t \iff \exists\, U, V:\ \begin{bmatrix} U & X \\ X^T & V \end{bmatrix} \succeq 0, \quad \mathrm{Tr}(U + V) \le 2t.$$
Proof is slightly more involved (see lecture notes).
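The eigenvalue fact the first representation relies on is easy to check numerically: the symmetric embedding of $X$ has eigenvalues $\pm\sigma_i(X)$ padded with zeros. A sketch for a random $3 \times 5$ matrix:

```python
import numpy as np

rng = np.random.default_rng(0)
m, n = 3, 5
X = rng.standard_normal((m, n))

# Symmetric embedding [[0, X], [X^T, 0]].
H = np.zeros((m + n, m + n))
H[:m, m:] = X
H[m:, :m] = X.T

sigma = np.linalg.svd(X, compute_uv=False)  # m singular values (m <= n here)
eigs = np.sort(np.linalg.eigvalsh(H))
expected = np.sort(np.concatenate([sigma, -sigma, np.zeros(n - m)]))
assert np.allclose(eigs, expected)
```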

SLIDE 82

SDP example

Logarithmic Chebyshev approximation:
$$\min_x\ \max_{1 \le i \le m}\ |\log(a_i^T x) - \log b_i|$$

$$|\log(a_i^T x) - \log b_i| = \log \max(a_i^T x / b_i,\ b_i / a_i^T x)$$

Reformulation:
$$\min_{x,t}\ t \quad \text{s.t.}\ 1/t \le a_i^T x / b_i \le t, \quad i = 1, \dots, m.$$

The lower bound $1/t \le a_i^T x / b_i$ becomes an LMI:
$$\begin{bmatrix} a_i^T x / b_i & 1 \\ 1 & t \end{bmatrix} \succeq 0, \quad i = 1, \dots, m$$
(the upper bound $a_i^T x / b_i \le t$ is linear in $(x, t)$).
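Both ingredients above can be sanity-checked numerically: the log identity for positive ratios, and the fact that the $2 \times 2$ LMI $\begin{bmatrix} r & 1 \\ 1 & t \end{bmatrix} \succeq 0$ (for $r, t \ge 0$) encodes $rt \ge 1$, i.e. $1/t \le r$:

```python
import math
import numpy as np

rng = np.random.default_rng(0)

# |log r1 - log r2| = log max(r1/r2, r2/r1) for positive r1, r2.
for _ in range(100):
    r1, r2 = rng.uniform(0.1, 10, 2)
    assert math.isclose(abs(math.log(r1) - math.log(r2)),
                        math.log(max(r1 / r2, r2 / r1)))

# [[r, 1], [1, t]] is PSD  <=>  r, t >= 0 and r*t >= 1.
for _ in range(100):
    r, t = rng.uniform(0.1, 4, 2)
    psd = np.linalg.eigvalsh(np.array([[r, 1.0], [1.0, t]])).min() >= -1e-9
    assert psd == (r * t >= 1 - 1e-9)
```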

SLIDE 86

Least-squares SDP

$$\min_X\ \|X - Y\|_2^2 \quad \text{s.t.}\ X \succeq 0.$$

Exercise 1: Try solving using CVX (assume $Y^T = Y$); note $\|\cdot\|_2$ above is the operator 2-norm, not the Frobenius norm.
Exercise 2: Recast as an SDP. Hint: Begin with $\min_{X,t}\ t$ s.t. ...
Exercise 3: Solve the two questions also with $\|X - Y\|_F^2$.
Exercise 4: Verify against the analytic solution $X = U \Lambda_+ U^T$, where $Y = U \Lambda U^T$ and $\Lambda_+ = \mathrm{Diag}(\max(0, \lambda_1), \dots, \max(0, \lambda_n))$.
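For the Frobenius-norm variant (Exercise 3), the analytic solution of Exercise 4 is the projection onto the PSD cone: clip negative eigenvalues to zero. A sketch that builds it and checks, against random PSD candidates, that nothing gets closer (sampling, not a proof; `psd_project` is my own name):

```python
import numpy as np

def psd_project(Y):
    """Frobenius-norm projection onto the PSD cone: X = U diag(max(lam, 0)) U^T."""
    lam, U = np.linalg.eigh(Y)
    return (U * np.maximum(lam, 0.0)) @ U.T  # scales columns of U by clipped eigenvalues

rng = np.random.default_rng(0)
Y = rng.standard_normal((5, 5))
Y = (Y + Y.T) / 2  # symmetrize
X = psd_project(Y)

assert np.linalg.eigvalsh(X).min() >= -1e-9  # X is PSD
# No random PSD matrix is closer to Y in Frobenius norm.
for _ in range(200):
    B = rng.standard_normal((5, 5))
    Z = B @ B.T  # random PSD candidate
    assert np.linalg.norm(Z - Y) >= np.linalg.norm(X - Y) - 1e-9
```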

SLIDE 87

SDP relaxation

Binary least-squares:
$$\min\ \|Ax - b\|^2 \quad \text{s.t.}\ x_i \in \{-1, +1\},\ i = 1, \dots, n.$$
◮ Fundamental problem (engineering, computer science)
◮ Nonconvex; $x_i \in \{-1, +1\}$ gives $2^n$ possible solutions
◮ Very hard in general (even to approximate)

$$\min\ x^T A^T A x - 2 x^T A^T b + b^T b \quad \text{s.t.}\ x_i^2 = 1$$
$$\min\ \mathrm{Tr}(A^T A\, x x^T) - 2 b^T A x \quad \text{s.t.}\ x_i^2 = 1$$
$$\min\ \mathrm{Tr}(A^T A Y) - 2 b^T A x \quad \text{s.t.}\ Y = x x^T,\ \mathrm{diag}(Y) = \mathbf{1}.$$

◮ Still hard: $Y = x x^T$ is a nonconvex constraint.

SLIDE 91

SDP relaxation

Replace $Y = x x^T$ by $Y \succeq x x^T$. Thus we obtain
$$\min\ \mathrm{Tr}(A^T A Y) - 2 b^T A x \quad \text{s.t.}\ Y \succeq x x^T,\ \mathrm{diag}(Y) = \mathbf{1}.$$
This is an SDP, since
$$Y \succeq x x^T \iff \begin{bmatrix} Y & x \\ x^T & 1 \end{bmatrix} \succeq 0 \quad \text{(using Schur complements)}.$$
◮ Optimal value gives a lower bound on binary LS
◮ Recover a binary $x$ by randomized rounding
Exercise: Try the above problem in CVX.
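The Schur-complement step above (with the scalar block $1 > 0$) can be spot-checked numerically: the bordered matrix is PSD exactly when $Y - xx^T \succeq 0$. A sketch on random instances:

```python
import numpy as np

rng = np.random.default_rng(0)
n = 4
for _ in range(300):
    x = rng.standard_normal(n)
    B = rng.standard_normal((n, n))
    # Mix of cases: Y may or may not dominate x x^T.
    Y = B @ B.T + rng.uniform(-1, 1) * np.outer(x, x)
    big = np.block([[Y, x[:, None]], [x[None, :], np.ones((1, 1))]])
    lhs = np.linalg.eigvalsh(big).min() >= -1e-9
    rhs = np.linalg.eigvalsh(Y - np.outer(x, x)).min() >= -1e-9
    assert lhs == rhs
```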

SLIDE 94

Nonconvex quadratic optimization

$$\min\ x^T A x + b^T x \quad \text{s.t.}\ x^T P_i x + b_i^T x + c_i \le 0, \quad i = 1, \dots, m.$$

Exercise: Show that $x^T Q x = \mathrm{Tr}(Q x x^T)$ (where $Q$ is symmetric).

$$\min_{X, x}\ \mathrm{Tr}(AX) + b^T x \quad \text{s.t.}\ \mathrm{Tr}(P_i X) + b_i^T x + c_i \le 0,\ i = 1, \dots, m, \quad X \succeq 0,\ \mathrm{rank}(X) = 1.$$

◮ Relax the nonconvex constraint $\mathrm{rank}(X) = 1$ (i.e., $X = x x^T$) to $X \succeq x x^T$.
◮ Can be quite bad, but sometimes also quite tight.

SLIDE 98

References

1. L. Vandenberghe. MLSS 2012 lecture slides; EE236B slides.
2. A. Nemirovski. Lecture slides on modern convex optimization.