Convex Optimization — 4. Convex Optimization Problems — Prof. Ying Cui (PowerPoint presentation)



SLIDE 1

Convex Optimization

  • 4. Convex Optimization Problems
  • Prof. Ying Cui

Department of Electrical Engineering Shanghai Jiao Tong University

2018

SJTU Ying Cui 1 / 64

SLIDE 2

Outline

◮ Optimization problems
◮ Convex optimization
◮ Linear optimization problems
◮ Quadratic optimization problems
◮ Geometric programming
◮ Semidefinite programming
◮ Vector optimization
SLIDE 3

Optimization problem in standard form

min_x  f0(x)
s.t.   fi(x) ≤ 0,  i = 1, …, m
       hi(x) = 0,  i = 1, …, p

◮ optimization variable: x ∈ Rn
◮ objective function: f0 : Rn → R
◮ inequality constraint functions: fi : Rn → R, i = 1, …, m
◮ equality constraint functions: hi : Rn → R, i = 1, …, p
◮ domain: D = (∩_{i=0}^{m} dom fi) ∩ (∩_{i=1}^{p} dom hi)
◮ feasible point x: x ∈ D and x satisfies all constraints
◮ feasible set (constraint set) X: set of all feasible points
◮ feasible problem: problem with a nonempty feasible set

SLIDE 4

Optimal and locally optimal points

◮ optimal value: p∗ = inf{f0(x) | fi(x) ≤ 0, i = 1, …, m, hi(x) = 0, i = 1, …, p}

◮ p∗ = ∞ if problem is infeasible (standard convention: the infimum of the empty set is ∞) ◮ p∗ = −∞ if problem is unbounded below

◮ (globally) optimal point x∗: x∗ is feasible and f0(x∗) = p∗ ◮ optimal set Xopt: set of optimal points

◮ if Xopt is nonempty, optimal value is achieved and problem is solvable ◮ otherwise, optimal value is not achieved (always occurs when problem unbounded below)

◮ locally optimal point x: ∃ R > 0 such that x is optimal for

min_z  f0(z)
s.t.   fi(z) ≤ 0,  i = 1, …, m
       hi(z) = 0,  i = 1, …, p
       ‖z − x‖2 ≤ R

SLIDE 5

Optimal and locally optimal points

◮ if fi(x) = 0 for a feasible point x, the i-th inequality constraint fi(x) ≤ 0 is active at x
◮ if fi(x) < 0 for a feasible point x, the i-th inequality constraint fi(x) ≤ 0 is inactive at x
◮ equality constraints are active at all feasible points
◮ a constraint is redundant if deleting it does not change the feasible set

examples (simple unconstrained problems, n = 1, m = p = 0):
◮ f0(x) = 1/x, dom f0 = R++: p∗ = 0, not achieved
◮ f0(x) = −log x, dom f0 = R++: p∗ = −∞, problem unbounded below
◮ f0(x) = x log x, dom f0 = R++: p∗ = −1/e, achieved at the unique optimal point x∗ = 1/e
◮ f0(x) = x³ − 3x: p∗ = −∞, problem unbounded below, locally optimal point x = 1

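The x log x example above can be sanity-checked numerically (a minimal sketch; the grid search is only illustrative, not how such problems are solved):

```python
import math

# f0(x) = x*log(x) on R++ has p* = -1/e, attained at x* = 1/e
xs = [i / 10000 for i in range(1, 100000)]          # grid on (0, 10)
vals = [x * math.log(x) for x in xs]
i_best = min(range(len(vals)), key=vals.__getitem__)

x_star, p_star = xs[i_best], vals[i_best]
assert abs(x_star - 1 / math.e) < 1e-3              # minimizer ~ 1/e
assert abs(p_star - (-1 / math.e)) < 1e-6           # optimal value ~ -1/e
```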
SLIDE 6

Explicit and implicit constraints

◮ explicit constraints: fi(x) ≤ 0, i = 1, …, m, hi(x) = 0, i = 1, …, p
◮ a problem is unconstrained if it has no explicit constraints (m = p = 0)
◮ implicit constraint: x ∈ D = (∩_{i=0}^{m} dom fi) ∩ (∩_{i=1}^{p} dom hi)

example:

min_x  f0(x) = −∑_{i=1}^{k} log(bi − aiTx)

is an unconstrained problem with the implicit constraints aiTx < bi, i = 1, …, k

SLIDE 7

Feasibility problems

The feasibility problem is to determine whether the constraints are consistent and, if so, find a point that satisfies them:

find x
s.t.  fi(x) ≤ 0,  i = 1, …, m
      hi(x) = 0,  i = 1, …, p

It can be considered a special case of the general problem with f0(x) = 0:

min_x  0
s.t.   fi(x) ≤ 0,  i = 1, …, m
       hi(x) = 0,  i = 1, …, p

◮ p∗ = 0 if the feasible set X is nonempty; any feasible point x ∈ X is optimal
◮ p∗ = ∞ if the feasible set X is empty

SLIDE 8

Convex optimization problems in standard form

min_x  f0(x)
s.t.   fi(x) ≤ 0,  i = 1, …, m
       aiTx = bi,  i = 1, …, p  (or Ax = b)

◮ the problem is convex if the objective f0 and the inequality constraint functions f1, …, fm are convex and the equality constraint functions are affine
◮ the problem is quasiconvex if f0 is quasiconvex and f1, …, fm are convex
◮ the feasible set of a convex optimization problem is convex
  ◮ it is the intersection of the domain D = ∩_{i=0}^{m} dom fi with m sublevel sets {x | fi(x) ≤ 0} and p hyperplanes {x | aiTx = bi} (all convex)

SLIDE 9

Abstract form convex optimization problem

example:

min_x  f0(x) = x1² + x2²
s.t.   f1(x) = x1/(1 + x2²) ≤ 0
       h1(x) = (x1 + x2)² = 0

◮ not a convex optimization problem in standard form (according to our definition): f1 is not convex, h1 is not affine
◮ but it minimizes a convex function over a convex set: f0 is convex, and the feasible set {(x1, x2) | x1 = −x2 ≤ 0} is convex
◮ equivalent (but not identical) to the convex problem

min_x  x1² + x2²
s.t.   x1 ≤ 0
       x1 + x2 = 0

SLIDE 10

Local and global optima

fundamental property: any locally optimal point of a convex problem is (globally) optimal

proof: Suppose x is locally optimal, i.e., x is feasible and

(a)  f0(x) = inf{f0(z) | z feasible, ‖z − x‖2 ≤ R}  for some R > 0.

Suppose x is not globally optimal, i.e., there exists a feasible y such that f0(y) < f0(x). Evidently ‖y − x‖2 > R. Consider z = θy + (1 − θ)x with θ = R/(2‖y − x‖2). Then ‖z − x‖2 = θ‖y − x‖2 = R/2 < R, and by convexity of the feasible set, z is feasible. By convexity of f0,

f0(z) ≤ θf0(y) + (1 − θ)f0(x) < f0(x),

which contradicts (a).

SLIDE 11

Optimality criterion for differentiable f0

x is optimal iff x ∈ X and ∇f0(x)T (y − x) ≥ 0 for all y ∈ X where X = {x|fi(x) ≤ 0, i = 1, · · · , m, hi(x) = 0, i = 1, · · · , p} denotes the feasible set

Figure 4.2: Geometric interpretation of the optimality condition. The feasible set X is shown shaded. Some level curves of f0 are shown as dashed lines. The point x is optimal: −∇f0(x) defines a supporting hyperplane (shown as a solid line) to X at x.

◮ geometric interpretation: if ∇f0(x) ≠ 0, then −∇f0(x) defines a supporting hyperplane to the feasible set X at x

SLIDE 12

Optimality criterion for differentiable f0

◮ unconstrained problem: min_x f0(x)
  x is optimal iff x ∈ dom f0 and ∇f0(x) = 0
◮ equality constrained problem: min_x f0(x) s.t. Ax = b
  x is optimal iff there exists ν ∈ Rp such that x ∈ dom f0, Ax = b, ∇f0(x) + ATν = 0
◮ minimization over the nonnegative orthant: min_x f0(x) s.t. x ⪰ 0
  x is optimal iff x ∈ dom f0, x ⪰ 0, ∇f0(x) ⪰ 0, (∇f0(x))i xi = 0, i = 1, …, n

◮ condition (∇f0(x))ixi = 0 is called complementarity: the sparsity patterns (i.e., the set of indices corresponding to nonzero components) of the vectors x and ∇f0(x) are complementary (i.e., have empty intersection)

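As a concrete instance of the nonnegative-orthant condition (a hypothetical example, not from the slides): for f0(x) = ‖x − a‖2², the minimizer over x ⪰ 0 is x∗ = max(a, 0), and the three conditions can be checked directly:

```python
a = [1.0, -2.0]                            # assumed example data
x_star = [max(ai, 0.0) for ai in a]        # minimizer of ||x - a||^2 over x >= 0
grad = [2 * (xi - ai) for xi, ai in zip(x_star, a)]   # gradient of f0 at x*

assert all(xi >= 0 for xi in x_star)       # primal feasibility: x >= 0
assert all(gi >= 0 for gi in grad)         # grad f0(x*) >= 0
assert all(abs(gi * xi) < 1e-12 for gi, xi in zip(grad, x_star))  # complementarity
```

Note how the sparsity patterns are complementary: x∗ = (1, 0) while ∇f0(x∗) = (0, 4).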
SLIDE 13

Equivalent convex problems

◮ two problems are (informally) equivalent if the solution of one is readily obtained from the solution of the other, and vice versa
◮ some common equivalent transformations preserve convexity

SLIDE 14

Equivalent convex problems

eliminating equality constraints:

min_x  f0(x)
s.t.   fi(x) ≤ 0,  i = 1, …, m
       Ax = b

is equivalent to

min_z  f0(Fz + x0)
s.t.   fi(Fz + x0) ≤ 0,  i = 1, …, m

where F and x0 are such that Ax = b ⇔ x = Fz + x0 for some z
◮ in principle, we can restrict our attention to convex optimization problems without equality constraints
◮ in many cases, however, it is better to retain the equality constraints, for ease of analysis, or to avoid ruining the efficiency of an algorithm that solves the problem

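In the elimination above, x0 is any particular solution of Ax = b and the columns of F span null(A). A tiny hand-worked instance (assumed data, A = [1 1], b = 1, so F is the single direction (1, −1)):

```python
import random

# Ax = b with A = [1, 1], b = 1: particular solution x0, null-space direction F
x0 = [1.0, 0.0]
F = [1.0, -1.0]

for _ in range(100):
    z = random.uniform(-10, 10)
    x = [x0[0] + F[0] * z, x0[1] + F[1] * z]   # x = F z + x0
    assert abs(x[0] + x[1] - 1.0) < 1e-9       # Ax = b holds for every z
```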
SLIDE 15

Equivalent convex problems

introducing equality constraints:

min_x  f0(A0x + b0)
s.t.   fi(Aix + bi) ≤ 0,  i = 1, …, m

where Ai ∈ R^{ki×n}, is equivalent to

min_{x,y}  f0(y0)
s.t.       fi(yi) ≤ 0,  i = 1, …, m
           yi = Aix + bi,  i = 0, …, m

◮ introduce a new variable yi ∈ R^{ki}, replace fi(Aix + bi) with fi(yi), and add the linear equality constraint yi = Aix + bi

SLIDE 16

Equivalent convex problems

introducing slack variables for linear inequalities:

min_x  f0(x)
s.t.   aiTx ≤ bi,  i = 1, …, m

is equivalent to

min_{x,s}  f0(x)
s.t.       aiTx + si = bi,  i = 1, …, m
           si ≥ 0,  i = 1, …, m

◮ si is called the slack variable associated with aiTx ≤ bi
◮ this replaces each linear inequality constraint with a linear equality constraint and a nonnegativity constraint

SLIDE 17

Equivalent convex problems

epigraph problem form:

min_x  f0(x)
s.t.   fi(x) ≤ 0,  i = 1, …, m
       Ax = b

is equivalent to

min_{x,t}  t
s.t.       f0(x) − t ≤ 0
           fi(x) ≤ 0,  i = 1, …, m
           Ax = b

◮ an optimization problem in the ‘graph space’ (x, t): minimize t over the epigraph of f0, subject to the constraints on x
◮ a linear objective is universal for convex optimization: any convex optimization problem is readily transformed to one with a linear objective
◮ can simplify theoretical analysis and algorithm development

SLIDE 18

Equivalent convex problems

minimizing over some variables:

min_{x1,x2}  f0(x1, x2)
s.t.         fi(x1) ≤ 0,  i = 1, …, m1
             f̃i(x2) ≤ 0,  i = 1, …, m2

is equivalent to

min_{x1}  f̃0(x1)
s.t.      fi(x1) ≤ 0,  i = 1, …, m1

where f̃0(x1) = inf{f0(x1, z) | f̃i(z) ≤ 0, i = 1, …, m2}
note: f̃0(x1) is convex in x1
◮ if f(x, y) is convex in (x, y) and C is a convex nonempty set, then g(x) = inf_{y∈C} f(x, y) is convex

SLIDE 19

Quasiconvex optimization

p⋆ = min_x  f0(x)
s.t.        fi(x) ≤ 0,  i = 1, …, m
            Ax = b

where f0 : Rn → R is quasiconvex and f1, …, fm are convex
◮ quasiconvex inequality constraint functions can be replaced with equivalent convex ones having the same 0-sublevel sets

SLIDE 20

Quasiconvex optimization

locally optimal solutions and optimality conditions:
◮ a quasiconvex optimization problem can have locally optimal points that are not (globally) optimal
◮ optimality criterion for differentiable f0: x is optimal if x ∈ X and ∇f0(x)T(y − x) > 0 for all y ∈ X \ {x}
  ◮ the condition is only sufficient for optimality
  ◮ the condition requires the gradient of f0 to be nonzero

Figure 4.3: A quasiconvex function f on R, with a locally optimal point x that is not globally optimal. This example shows that the simple optimality condition f′(x) = 0, valid for convex functions, does not hold for quasiconvex functions.

SLIDE 21

Quasiconvex optimization

convex representation of sublevel sets of f0: if f0 is quasiconvex, there exists a family of functions φt such that
◮ φt(x) is convex in x for fixed t
◮ the t-sublevel set of f0 is the 0-sublevel set of φt, i.e., f0(x) ≤ t ⇔ φt(x) ≤ 0
◮ for each x, φt(x) is nonincreasing in t

example (convex-over-concave function): f0(x) = p(x)/q(x) with p convex, q concave, and p(x) ≥ 0, q(x) > 0 on dom f0; can take φt(x) = p(x) − t q(x):
◮ for t ≥ 0, φt is convex in x
◮ p(x)/q(x) ≤ t iff φt(x) ≤ 0
◮ for each x, φt(x) is nonincreasing in t

SLIDE 22

Quasiconvex optimization

quasiconvex optimization via convex feasibility problems: for any fixed t, solve the convex feasibility problem in x

find x
s.t.  φt(x) ≤ 0
      fi(x) ≤ 0,  i = 1, …, m
      Ax = b

◮ if feasible, t ≥ p∗; if infeasible, t ≤ p∗

Bisection method for quasiconvex optimization (⌈log2((u − l)/ε)⌉ iterations):
given l ≤ p∗, u ≥ p∗, tolerance ε > 0
repeat
  1. t := (l + u)/2.
  2. Solve the convex feasibility problem.
  3. if feasible, u := t; else l := t.
until u − l ≤ ε

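A minimal sketch of this bisection method on the convex-over-concave example, with f0(x) = (x² + 1)/x on x > 0 (p(x) = x² + 1 convex, q(x) = x concave; here p∗ = 2, attained at x = 1). The interval and the grid-based feasibility oracle are illustrative assumptions, not part of the method:

```python
def feasible(t, lo=0.01, hi=100.0, n=10000):
    # is there x in [lo, hi] with phi_t(x) = x^2 + 1 - t*x <= 0 ?
    # phi_t is convex in x; a grid scan stands in for a convex feasibility solver
    for k in range(n + 1):
        x = lo + (hi - lo) * k / n
        if x * x + 1 - t * x <= 0:
            return True
    return False

l, u, eps = 0.0, 10.0, 1e-4          # l <= p* <= u
while u - l > eps:
    t = (l + u) / 2
    if feasible(t):
        u = t
    else:
        l = t

assert abs(u - 2.0) < 1e-3           # p* = min (x^2 + 1)/x = 2, at x = 1
```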
SLIDE 23

Linear program (LP)

min_x  cTx + d
s.t.   Gx ⪯ h
       Ax = b

where c ∈ Rn, d ∈ R, G ∈ Rm×n, h ∈ Rm, A ∈ Rp×n, and b ∈ Rp
◮ objective and constraint functions are all affine
◮ linear programs are convex optimization problems
◮ can omit d in the objective (it affects neither Xopt nor X)
◮ the feasible set is a polyhedron

Figure 4.4: Geometric interpretation of an LP. The feasible set P, which is a polyhedron, is shaded. The objective cTx is linear, so its level curves are hyperplanes orthogonal to c (shown as dashed lines). The point x⋆ is optimal; it is the point in P as far as possible in the direction −c.

SLIDE 24

Linear program (LP)

two special cases of LP:

standard form LP (the only inequalities are componentwise nonnegativity constraints):

min_x  cTx
s.t.   Ax = b
       x ⪰ 0

inequality form LP (no equality constraints):

min_x  cTx
s.t.   Ax ⪯ b

SLIDE 25

Linear program (LP)

converting LPs to standard form: sometimes useful to transform a general LP to a standard form LP (e.g., in order to use an algorithm for standard form LPs)
◮ introduce slack variables si, i = 1, …, m, for the inequalities:

min_{x,s}  cTx + d
s.t.       Gx + s = h
           Ax = b
           s ⪰ 0

◮ express x = x+ − x− with x+, x− ⪰ 0:

min_{x+,x−,s}  cTx+ − cTx− + d
s.t.           Gx+ − Gx− + s = h
               Ax+ − Ax− = b
               x+ ⪰ 0, x− ⪰ 0, s ⪰ 0

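The two steps above can be sketched as a pure data transformation (a minimal sketch with list-of-lists matrices; the variable ordering z = (x+, x−, s) is an assumption):

```python
def lp_to_standard_form(c, G, h, A, b):
    """Map  min c'x s.t. Gx <= h, Ax = b  to  min c~'z s.t. A~z = b~, z >= 0,
    with z = (x+, x-, s)."""
    n, m, p = len(c), len(G), len(A)
    c_t = c + [-ci for ci in c] + [0.0] * m
    # rows  G x+ - G x- + s = h
    A_t = [G[i] + [-g for g in G[i]] + [1.0 if j == i else 0.0 for j in range(m)]
           for i in range(m)]
    # rows  A x+ - A x- = b
    A_t += [A[i] + [-a for a in A[i]] + [0.0] * m for i in range(p)]
    return c_t, A_t, h + b

c_t, A_t, b_t = lp_to_standard_form([1.0, 2.0],
                                    [[1.0, 1.0]], [4.0],
                                    [[1.0, -1.0]], [0.0])
assert len(c_t) == 2 * 2 + 1            # n (for x+) + n (for x-) + m slacks
assert len(A_t) == 2 and len(b_t) == 2  # m + p equality rows
assert A_t[0] == [1.0, 1.0, -1.0, -1.0, 1.0]
```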
SLIDE 26

LP-examples

diet problem: choose quantities x1, …, xn of n foods
◮ one unit of food j costs cj and contains amount aij of nutrient i
◮ a healthy diet requires nutrient i in quantity at least bi

to find the cheapest healthy diet:

min_x  cTx
s.t.   Ax ⪰ b, x ⪰ 0

SLIDE 27

LP-examples

Chebyshev center of a polyhedron: find the center xc of the largest Euclidean ball B = {xc + u | ‖u‖2 ≤ r} that lies in the polyhedron P = {x ∈ Rn | aiTx ≤ bi, i = 1, …, m}; xc is called the Chebyshev center of P (it is farthest from the boundary)

Figure 8.5: Chebyshev center of a polyhedron C, in the Euclidean norm. The center xcheb is the deepest point inside C, in the sense that it is farthest from the exterior, or complement, of C. It is also the center of the largest Euclidean ball (shown lightly shaded) that lies inside C.

◮ B ⊆ P (i.e., aiTx ≤ bi for all x ∈ B) iff

sup{aiT(xc + u) | ‖u‖2 ≤ r} = aiTxc + sup{aiTu | ‖u‖2 ≤ r} = aiTxc + r‖ai‖2 ≤ bi

◮ so xc and r solve the LP

max_{xc,r}  r
s.t.        aiTxc + r‖ai‖2 ≤ bi,  i = 1, …, m

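For a fixed center, the largest feasible radius is r(xc) = min_i (bi − aiTxc)/‖ai‖2, so a crude grid search over xc illustrates the LP's meaning (a sketch only; the unit square is an assumed example, whose Chebyshev center is (0.5, 0.5) with r = 0.5):

```python
import math

# unit square 0 <= x1, x2 <= 1, written as a_i^T x <= b_i
A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b = [1.0, 0.0, 1.0, 0.0]

def radius(xc):
    # largest r with a_i^T xc + r*||a_i||_2 <= b_i for all i (may be negative)
    return min((bi - sum(a * x for a, x in zip(ai, xc))) / math.hypot(*ai)
               for ai, bi in zip(A, b))

best = max(((x1 / 100, x2 / 100) for x1 in range(101) for x2 in range(101)),
           key=radius)
assert best == (0.5, 0.5)
assert abs(radius(best) - 0.5) < 1e-12
```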
SLIDE 28

LP-examples

piecewise-linear minimization:

min_x  max_{i=1,…,m} (aiTx + bi)

can be transformed to an equivalent LP
◮ form the epigraph problem

min_{x,t}  t
s.t.       max_{i=1,…,m} (aiTx + bi) ≤ t

◮ express the inequality as a set of m separate inequalities:

min_{x,t}  t
s.t.       aiTx + bi ≤ t,  i = 1, …, m

SLIDE 29

Linear-fractional program

minimize a ratio of affine functions over a polyhedron:

min_x  f0(x) = (cTx + d)/(eTx + f)   (dom f0 = {x | eTx + f > 0})
s.t.   Gx ⪯ h
       Ax = b

bisection method: linear-fractional programs are quasiconvex problems and can be solved by the bisection method
◮ f0 is quasiconvex (in fact quasilinear)

transforming to an LP: let y = x/(eTx + f) and z = 1/(eTx + f):

min_{y,z}  cTy + dz
s.t.       Gy ⪯ hz
           Ay = bz
           eTy + fz = 1
           z ≥ 0

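A quick numeric check of this change of variables (assumed one-dimensional data): for any x with eTx + f > 0, the point (y, z) = (x, 1)/(eTx + f) satisfies eTy + fz = 1 and attains the same objective value:

```python
c, d, e, f = 2.0, 1.0, 1.0, 3.0      # f0(x) = (2x + 1)/(x + 3), assumed data

for x in [0.0, 0.5, 4.0, 10.0]:
    denom = e * x + f
    y, z = x / denom, 1.0 / denom
    assert abs(e * y + f * z - 1.0) < 1e-12   # the added equality constraint
    assert z > 0
    obj_lp = c * y + d * z
    obj_frac = (c * x + d) / (e * x + f)
    assert abs(obj_lp - obj_frac) < 1e-12     # same objective value
```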
SLIDE 30

Linear-fractional program

generalized linear-fractional program:

f0(x) = max_{i=1,…,r} (ciTx + di)/(eiTx + fi),   dom f0 = {x | eiTx + fi > 0, i = 1, …, r}

generalized linear-fractional programs are quasiconvex problems and can be solved by the bisection method
◮ f0 is the pointwise maximum of r quasiconvex (quasilinear) functions, and therefore quasiconvex

example (Von Neumann growth problem): allocate activity to maximize the growth rate of the slowest-growing sector

max_{x,x+}  min_{i=1,…,n} xi+/xi
s.t.        x+ ⪰ 0, Bx+ ⪯ Ax

◮ x, x+ ∈ Rn: activity levels of n sectors, in the current and next period
◮ (Ax)i, (Bx+)i: produced and consumed amounts of good i
◮ xi+/xi: growth rate of sector i

SLIDE 31

Quadratic program (QP)

minimize a convex quadratic function over a polyhedron:

min_x  (1/2)xTPx + qTx + r
s.t.   Gx ⪯ h
       Ax = b

where P ∈ S+n, q ∈ Rn, r ∈ R, G ∈ Rm×n, and A ∈ Rp×n
◮ the objective function is convex quadratic and the constraint functions are affine
◮ QPs include LPs as a special case, by taking P = 0

Figure 4.5: Geometric illustration of a QP. The feasible set P, which is a polyhedron, is shown shaded. The contour lines of the objective function, which is convex quadratic, are shown as dashed curves. The point x⋆ is optimal.

SLIDE 32

QP-examples

least-squares: unconstrained QP (A ∈ Rk×n)

min_x  ‖Ax − b‖2² = xTATAx − 2bTAx + bTb

analytical solution x∗ = A†b
◮ singular value decomposition of A: A = UΣVT
◮ pseudo-inverse of A: A† = VΣ−1UT ∈ Rn×k

constrained least-squares: add linear constraints, e.g., upper and lower bounds on x:

min_x  ‖Ax − b‖2²
s.t.   l ⪯ x ⪯ u

no simple analytical solution

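For A with full column rank, x∗ = A†b coincides with the solution of the normal equations (ATA)x = ATb; a hand-solved two-variable sketch (assumed data, a line fit through three points):

```python
# least squares via the normal equations (A'A)x = A'b, two unknowns
A = [[1.0, 0.0], [1.0, 1.0], [1.0, 2.0]]   # columns: intercept, slope
b = [1.0, 2.0, 4.0]

AtA = [[sum(r[i] * r[j] for r in A) for j in range(2)] for i in range(2)]
Atb = [sum(r[i] * bi for r, bi in zip(A, b)) for i in range(2)]

det = AtA[0][0] * AtA[1][1] - AtA[0][1] * AtA[1][0]
x = [(AtA[1][1] * Atb[0] - AtA[0][1] * Atb[1]) / det,     # Cramer's rule
     (AtA[0][0] * Atb[1] - AtA[1][0] * Atb[0]) / det]

# optimality check: the residual Ax - b is orthogonal to the columns of A
resid = [sum(r[i] * x[i] for i in range(2)) - bi for r, bi in zip(A, b)]
for j in range(2):
    assert abs(sum(r[j] * ri for r, ri in zip(A, resid))) < 1e-9
assert abs(x[1] - 1.5) < 1e-9     # fitted slope
```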
SLIDE 33

QP-examples

linear program with random cost: minimize a risk-sensitive cost (a linear combination of expected cost and cost variance), capturing a trade-off between small expected cost and small cost variance:

min_x  c̄Tx + γxTΣx = E[cTx] + γ var(cTx)
s.t.   Gx ⪯ h, Ax = b

◮ c ∈ Rn is a random vector with mean c̄ and covariance Σ = E(c − c̄)(c − c̄)T
◮ for given x, cTx ∈ R is a random variable with mean E[cTx] = c̄Tx and variance var(cTx) = E(cTx − E[cTx])² = xTΣx
◮ the risk-aversion parameter γ > 0 controls the trade-off between expected cost and cost variance (risk)

SLIDE 34

Quadratically constrained quadratic program (QCQP)

min_x  (1/2)xTP0x + q0Tx + r0
s.t.   (1/2)xTPix + qiTx + ri ≤ 0,  i = 1, …, m
       Ax = b

where Pi ∈ S+n, qi ∈ Rn, ri ∈ R, i = 0, …, m, and A ∈ Rp×n
◮ objective and inequality constraint functions are convex quadratic
◮ if P1, …, Pm ∈ S++n, the feasible region is an intersection of m ellipsoids and p hyperplanes
◮ QCQPs include QPs as a special case, by taking Pi = 0, i = 1, …, m

SLIDE 35

Second-order cone programming (SOCP)

min_x  fTx
s.t.   ‖Aix + bi‖2 ≤ ciTx + di,  i = 1, …, m
       Fx = g

where f, ci ∈ Rn, Ai ∈ Rni×n, bi ∈ Rni, di ∈ R, F ∈ Rp×n, and g ∈ Rp
◮ each inequality constraint is a second-order cone (SOC) constraint: (Aix + bi, ciTx + di) ∈ second-order cone in Rni+1
◮ more general than QCQP (and of course LP)
  ◮ if ci = 0, i = 1, …, m, the SOCP reduces to a QCQP, by squaring each inequality constraint
  ◮ if Ai = 0, i = 1, …, m, the SOCP reduces to an LP

SLIDE 36

SOCP-example

robust linear programming:

min_x  cTx
s.t.   aiTx ≤ bi,  i = 1, …, m

where the parameters c, ai, bi can be uncertain; two common approaches to handling uncertainty (e.g., in ai):
◮ deterministic model: ai lies in a given ellipsoid Ei, and aiTx ≤ bi must hold for all ai ∈ Ei:

min_x  cTx
s.t.   aiTx ≤ bi for all ai ∈ Ei,  i = 1, …, m

◮ stochastic model: ai is a random variable, and aiTx ≤ bi must hold with probability at least η (a chance constraint):

min_x  cTx
s.t.   prob(aiTx ≤ bi) ≥ η,  i = 1, …, m

SLIDE 37

SOCP-example

deterministic approach via SOCP:
◮ choose an ellipsoid Ei with center āi ∈ Rn and semi-axes determined by the singular values/vectors of Pi ∈ Rn×n: Ei = {āi + Piu | ‖u‖2 ≤ 1}
◮ aiTx ≤ bi for all ai ∈ Ei iff

sup_{‖u‖2≤1} (āi + Piu)Tx = āiTx + sup_{‖u‖2≤1} (Piu)Tx = āiTx + ‖PiTx‖2 ≤ bi

◮ hence the robust LP

min_x  cTx
s.t.   aiTx ≤ bi for all ai ∈ Ei,  i = 1, …, m

is equivalent to the SOCP

min_x  cTx
s.t.   āiTx + ‖PiTx‖2 ≤ bi,  i = 1, …, m

SLIDE 38

SOCP-example

stochastic approach via SOCP:
◮ assume ai ∈ Rn is Gaussian with mean āi and covariance Σi, i.e., ai ∼ N(āi, Σi)
◮ aiTx is a Gaussian r.v. with mean āiTx and variance xTΣix, so

prob(aiTx ≤ bi) = Φ((bi − āiTx)/‖Σi^{1/2}x‖2)

where Φ(x) = (1/√(2π)) ∫_{−∞}^{x} e^{−t²/2} dt is the CDF of N(0, 1)
◮ the robust LP

min_x  cTx
s.t.   prob(aiTx ≤ bi) ≥ η,  i = 1, …, m

with η ≥ 1/2 is equivalent to the SOCP

min_x  cTx
s.t.   āiTx + Φ−1(η)‖Σi^{1/2}x‖2 ≤ bi,  i = 1, …, m

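A Monte Carlo sanity check of the chance-constraint formula (a sketch with assumed scalar data; Φ is evaluated via `math.erf`):

```python
import math, random

random.seed(0)
abar, sigma, x, b = 1.0, 2.0, 1.5, 4.0   # scalar instance: a ~ N(abar, sigma^2)

def Phi(z):                              # standard normal CDF
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))

# for x > 0:  prob(a*x <= b) = Phi((b - abar*x) / (sigma*x))
analytic = Phi((b - abar * x) / (sigma * x))
hits = sum(random.gauss(abar, sigma) * x <= b for _ in range(200_000))
assert abs(hits / 200_000 - analytic) < 0.01
```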
SLIDE 39

Geometric programming (GP)

monomial function f : Rn → R:

f(x) = c x1^{a1} x2^{a2} ⋯ xn^{an},   dom f = R++n

with c > 0 and ai ∈ R
◮ monomials are closed under multiplication and division

posynomial function (sum of monomials) f : Rn → R:

f(x) = ∑_{k=1}^{K} ck x1^{a1k} x2^{a2k} ⋯ xn^{ank},   dom f = R++n

with ck > 0 and aik ∈ R
◮ posynomials are closed under addition, multiplication, and nonnegative scaling
◮ a posynomial multiplied by a monomial is a posynomial
◮ a posynomial divided by a monomial is a posynomial

SLIDE 40

Geometric programming (GP)

geometric program (GP):

min_x  f0(x)
s.t.   fi(x) ≤ 1,  i = 1, …, m
       hi(x) = 1,  i = 1, …, p

with fi posynomial and hi monomial, implying domain D = R++n (i.e., the implicit constraint x ≻ 0)

extensions of GP (f posynomial, h, h1, h2 monomial, a > 0):
◮ f(x) ≤ h(x) ⇒ f(x)/h(x) ≤ 1 (f/h posynomial)
◮ f(x) ≤ a ⇒ f(x)/a ≤ 1 (f/a posynomial)
◮ h1(x) = h2(x) ⇒ h1(x)/h2(x) = 1 (h1/h2 monomial)
◮ maximize a nonzero monomial objective function by minimizing its inverse (which is also a monomial)

SLIDE 41

Geometric program in convex form

GPs are not convex in general, but can be transformed to convex problems by the change of variables yi = log xi and taking the log of the objective and constraint functions:
◮ a monomial f(x) = c x1^{a1} x2^{a2} ⋯ xn^{an} transforms to

log f(e^{y1}, …, e^{yn}) = aTy + b   (b = log c)

◮ a posynomial f(x) = ∑_{k=1}^{K} ck x1^{a1k} x2^{a2k} ⋯ xn^{ank} transforms to

log f(e^{y1}, …, e^{yn}) = log ∑_{k=1}^{K} e^{akTy + bk}   (bk = log ck)

◮ the geometric program transforms to the convex problem

min_y  log ∑_{k=1}^{K0} exp(a0kTy + b0k)
s.t.   log ∑_{k=1}^{Ki} exp(aikTy + bik) ≤ 0,  i = 1, …, m
       Gy + d = 0

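A quick check that the log–log change of variables linearizes a monomial (assumed two-variable data; the posynomial case works the same way via log-sum-exp):

```python
import math

c, a = 2.5, [1.0, -2.0]                 # monomial f(x) = 2.5 * x1 * x2^-2

def f(x):
    return c * x[0] ** a[0] * x[1] ** a[1]

y = [0.3, -1.2]                         # arbitrary point in y-space
x = [math.exp(yi) for yi in y]          # x_i = e^{y_i}
lhs = math.log(f(x))
rhs = sum(ai * yi for ai, yi in zip(a, y)) + math.log(c)   # a'y + b, b = log c
assert abs(lhs - rhs) < 1e-12           # log f(e^y) is affine in y
```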
SLIDE 42

GP-examples

design of cantilever beam: N segments with unit lengths and rectangular cross-sections of width wi and height hi, and given vertical force F applied at the right end, causing the beam to deflect (downward) and inducing stress in each segment

Figure 4.6: Segmented cantilever beam with 4 segments. Each segment has unit length and a rectangular profile. A vertical force F is applied at the right end of the beam.

minimize total weight subject to
◮ upper and lower bounds on wi, hi
◮ upper and lower bounds on the aspect ratios hi/wi
◮ an upper bound on the stress in each segment
◮ an upper bound on the vertical deflection at the end of the beam
variables: wi, hi for i = 1, …, N

SLIDE 43

GP-examples

objective and constraint functions:
◮ total weight w1h1 + ⋯ + wNhN is a posynomial
◮ aspect ratio hi/wi and inverse aspect ratio wi/hi are monomials
◮ maximum stress in segment i, given by 6iF/(wihi²), is a monomial
◮ vertical deflection yi and slope vi of the central axis at the right end of segment i are defined recursively, for i = N, N−1, …, 1, as

vi = 12(i − 1/2) F/(E wi hi³) + vi+1
yi = 6(i − 1/3) F/(E wi hi³) + vi+1 + yi+1

with vN+1 = yN+1 = 0 (E: Young’s modulus)
◮ vi and yi can be shown to be posynomial functions of w, h by induction

SLIDE 44

GP-examples

formulation as a GP:

min_{w,h}  w1h1 + ⋯ + wNhN
s.t.       wi/wmax ≤ 1,  wmin/wi ≤ 1,  i = 1, …, N
           hi/hmax ≤ 1,  hmin/hi ≤ 1,  i = 1, …, N
           hi/(wiSmax) ≤ 1,  Sminwi/hi ≤ 1,  i = 1, …, N
           6iF/(σmaxwihi²) ≤ 1,  i = 1, …, N
           y1/ymax ≤ 1

◮ write wmin ≤ wi ≤ wmax and hmin ≤ hi ≤ hmax as wmin/wi ≤ 1, wi/wmax ≤ 1, hmin/hi ≤ 1, hi/hmax ≤ 1
◮ write Smin ≤ hi/wi ≤ Smax as Sminwi/hi ≤ 1, hi/(wiSmax) ≤ 1

SLIDE 45

GP-examples

minimizing spectral radius of a nonnegative matrix

Perron–Frobenius eigenvalue λpf(A):
◮ suppose the matrix A ∈ Rn×n is (elementwise) nonnegative and irreducible (i.e., the matrix (I + A)^{n−1} is elementwise positive)
◮ Perron–Frobenius theorem: A has a positive real eigenvalue λpf(A) equal to its spectral radius (the largest magnitude of its eigenvalues), i.e., λpf(A) = max_i |λi(A)|
◮ alternative characterization: λpf(A) = inf{λ | Av ⪯ λv for some v ≻ 0}
◮ Av ⪯ λv can be expressed as

∑_{j=1}^{n} Aij vj/(λvi) ≤ 1,  i = 1, …, n

◮ λpf(A) determines the asymptotic growth (decay) rate of A^k (i.e., A^k ∼ λpf^k as k → ∞), and ((1/λpf)A)^k converges as k → ∞

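Power iteration recovers λpf for a small positive matrix (a minimal sketch; the 2×2 matrix is an assumed example whose Perron eigenvalue is 3):

```python
# power iteration for the Perron-Frobenius eigenvalue of a positive matrix
A = [[2.0, 1.0], [1.0, 2.0]]            # eigenvalues 3 and 1, so lambda_pf = 3
v = [1.0, 0.3]                           # any positive starting vector

for _ in range(100):
    w = [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]
    lam = max(w)
    v = [wi / lam for wi in w]           # renormalize so max component is 1

assert abs(lam - 3.0) < 1e-9
# the characterization: A v <= lam * v componentwise (up to tolerance)
Av = [sum(A[i][j] * v[j] for j in range(2)) for i in range(2)]
assert all(Av[i] <= lam * v[i] + 1e-9 for i in range(2))
```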
SLIDE 46

GP-examples

minimizing spectral radius of a matrix of posynomials:
◮ suppose the entries of the matrix A(x) are posynomial functions of an underlying variable x ∈ Rk
◮ choose x to minimize λpf(A(x)):

min_x  λpf(A(x))

◮ equivalent geometric program:

min_{λ,v,x}  λ
s.t.         ∑_{j=1}^{n} A(x)ij vj/(λvi) ≤ 1,  i = 1, …, n

SLIDE 47

GP-examples

power control in wireless networks:
◮ an interference network with K transmitter–receiver pairs
◮ gk: channel power of link k; pk: transmission power of transmitter k; nk: noise power at receiver k
◮ receive SINR at receiver k:

γk(p) = pkgk / (∑_{i=1,i≠k}^{K} pigi + nk)

◮ (maximum) rate of link k at high SINR:

rk(p) = log2(1 + γk(p)) ≈ log2(γk(p))

◮ sum rate:

Rsum(p) = ∑_{k=1}^{K} rk(p) ≈ log2 ∏_{k=1}^{K} [pkgk / (∑_{i=1,i≠k}^{K} pigi + nk)]

◮ weighted sum rate:

Rws(p) = ∑_{k=1}^{K} wkrk(p) ≈ log2 ∏_{k=1}^{K} [pkgk / (∑_{i=1,i≠k}^{K} pigi + nk)]^{wk}

◮ worst-case rate:

Rmin(p) = min_{k=1,…,K} rk(p) ≈ min_{k=1,…,K} log2 [pkgk / (∑_{i=1,i≠k}^{K} pigi + nk)]

SLIDE 48

GP-examples

◮ optimize the power allocation to maximize the sum rate:

max_p  Rsum(p)
s.t.   0 ≤ pk ≤ Pk, ∀k

⇔  min_p  ∏_{k=1}^{K} (∑_{i=1,i≠k}^{K} pigi + nk)/(pkgk)
   s.t.   pk/Pk ≤ 1, ∀k

◮ optimize the power allocation to maximize the weighted sum rate:

max_p  Rws(p)
s.t.   0 ≤ pk ≤ Pk, ∀k

⇔  min_p  ∏_{k=1}^{K} [(∑_{i=1,i≠k}^{K} pigi + nk)/(pkgk)]^{wk}
   s.t.   pk/Pk ≤ 1, ∀k

SLIDE 49

◮ optimize the power allocation to maximize the worst-case rate (max–min fairness):

max_p  Rmin(p)
s.t.   0 ≤ pk ≤ Pk, ∀k

⇔  min_{p,t}  t⁻¹
   s.t.       t(∑_{i=1,i≠k}^{K} pigi + nk)/(pkgk) ≤ 1, ∀k
              pk/Pk ≤ 1, ∀k

SLIDE 50

References for geometric programming

◮ M. Chiang, “Geometric programming for communication systems,” Foundations and Trends in Communications and Information Theory, vol. 2, no. 1–2, pp. 1–156, 2005.
◮ S. Boyd, S.-J. Kim, L. Vandenberghe, and A. Hassibi, “A tutorial on geometric programming,” Optimization and Engineering, vol. 8, no. 1, pp. 67–127, 2007.

SLIDE 51

Generalized inequality constraints

convex problem with generalized inequality constraints: allow the inequality constraint functions to be vector valued, and use generalized inequalities in the constraints:

min_x  f0(x)
s.t.   fi(x) ⪯_{Ki} 0,  i = 1, …, m
       Ax = b

where f0 : Rn → R is convex, Ki ⊆ R^{ki} are proper cones, and fi : Rn → R^{ki} are Ki-convex
◮ an ordinary convex optimization problem (Ki = R+^{ki}, i = 1, …, m) is a special case
◮ results for ordinary convex optimization problems still hold:
  ◮ the feasible set, any sublevel set, and the optimal set are convex
  ◮ any point that is locally optimal is globally optimal
  ◮ optimality condition for differentiable f0: x is optimal iff x ∈ X and ∇f0(x)T(y − x) ≥ 0 for all y ∈ X

SLIDE 52

Conic form problems (cone programs)

among the simplest convex optimization problems with generalized inequalities are conic form problems:

min_x  cTx
s.t.   Fx + g ⪯_K 0
       Ax = b

◮ generalize LPs (reduce to LPs when K = R+^m)

conic form problem in standard form:

min_x  cTx
s.t.   x ⪰_K 0
       Ax = b

conic form problem in inequality form:

min_x  cTx
s.t.   Fx + g ⪯_K 0

SLIDE 53

Semidefinite programming (SDP)

conic form problem with K = S+^k:

min_x  cTx
s.t.   x1F1 + x2F2 + ⋯ + xnFn + G ⪯ 0
       Ax = b

with G, Fi ∈ S^k and A ∈ Rp×n
◮ the inequality constraint is a linear matrix inequality (LMI)
◮ if G, F1, …, Fn are all diagonal, the LMI is equivalent to a set of k linear inequalities, and the SDP reduces to an LP

SLIDE 54

Semidefinite programming (SDP)

it is common to refer to a problem with linear objective, linear equality and inequality constraints, and several LMI constraints, i.e.,

min_x  cTx
s.t.   F(i)(x) = x1F1(i) + x2F2(i) + ⋯ + xnFn(i) + G(i) ⪯ 0,  i = 1, …, K
       Gx ⪯ h, Ax = b

as an SDP, since it is readily transformed to an SDP by forming one large block-diagonal LMI:

min_x  cTx
s.t.   diag(Gx − h, F(1)(x), …, F(K)(x)) ⪯ 0
       Ax = b

SLIDE 55

Semidefinite programming (SDP)

standard form SDP:

min_X  tr(CX)
s.t.   tr(AiX) = bi,  i = 1, …, p
       X ⪰ 0

where C, Ai ∈ Sn
◮ tr(CX) = ∑_{i,j=1}^{n} CijXij is the general form of a real-valued linear function on Sn

inequality form SDP:

min_x  cTx
s.t.   x1A1 + x2A2 + ⋯ + xnAn ⪯ B

where B, Ai ∈ Sk and c ∈ Rn

SLIDE 56

LP and SOCP as SDP

LP and equivalent SDP:

LP:   min_x  cTx  s.t.  Ax ⪯ b
SDP:  min_x  cTx  s.t.  diag(Ax − b) ⪯ 0

(note the different interpretations of the generalized inequality ⪯)

SOCP and equivalent SDP:

SOCP:  min_x  fTx  s.t.  ‖Aix + bi‖2 ≤ ciTx + di,  i = 1, …, m
SDP:   min_x  fTx  s.t.  [ (ciTx + di)I   Aix + bi
                           (Aix + bi)T    ciTx + di ] ⪰ 0,  i = 1, …, m

SLIDE 57

SDP-examples

eigenvalue minimization:

min_{x∈Rn}  λmax(A(x))

where A(x) = A0 + x1A1 + ⋯ + xnAn (with given Ai ∈ Sk)

equivalent SDP:

min_{x∈Rn, t∈R}  t
s.t.             A(x) ⪯ tI

◮ the constraint follows from λmax(A) ≤ t ⇔ A ⪯ tI

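The equivalence λmax(A) ≤ t ⇔ tI − A ⪰ 0 can be checked on a 2×2 symmetric example (assumed data; for a 2×2 symmetric M, M ⪰ 0 iff tr M ≥ 0 and det M ≥ 0):

```python
A = [[2.0, 1.0], [1.0, 2.0]]           # symmetric, eigenvalues 1 and 3

def psd_2x2(M):
    # 2x2 symmetric M is PSD iff trace >= 0 and det >= 0
    tr = M[0][0] + M[1][1]
    det = M[0][0] * M[1][1] - M[0][1] * M[1][0]
    return tr >= -1e-12 and det >= -1e-12

def shifted(t):                         # t*I - A
    return [[t - A[0][0], -A[0][1]], [-A[1][0], t - A[1][1]]]

lam_max = 3.0
for t in [0.0, 1.0, 2.9, 3.0, 3.1, 10.0]:
    assert psd_2x2(shifted(t)) == (t >= lam_max)
```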
SLIDE 58

SDP-examples

matrix norm minimization:

min_{x∈Rn}  ‖A(x)‖2 = (λmax(A(x)TA(x)))^{1/2}

where ‖·‖2 denotes the spectral norm (maximum singular value) and A(x) = A0 + x1A1 + ⋯ + xnAn (with given Ai ∈ Rp×q)
◮ this is a convex problem, as ‖A(x)‖2 is a convex function of x

equivalent SDP:

min_{x∈Rn, t∈R}  t
s.t.             [ tI      A(x)
                   A(x)T   tI  ] ⪰ 0

◮ the constraint follows from

‖A‖2 ≤ t  ⇔  ATA ⪯ t²I, t ≥ 0  ⇔  [ tI  A ; AT  tI ] ⪰ 0
SLIDE 59

Vector optimization

general vector optimization problem:

min_x (w.r.t. K)  f0(x)
s.t.              fi(x) ≤ 0,  i = 1, …, m
                  hi(x) = 0,  i = 1, …, p

with vector objective f0 : Rn → Rq minimized w.r.t. a proper cone K ⊆ Rq, and constraint functions fi : Rn → R, hi : Rn → R
◮ f0(x) ⪯_K f0(y) means x is ‘better than or equal’ in value to y w.r.t. K
◮ f0(x) and f0(y) may not be comparable

convex vector optimization problem:

min_x (w.r.t. K)  f0(x)
s.t.              fi(x) ≤ 0,  i = 1, …, m
                  Ax = b

with f0 K-convex, fi convex, and the equality constraints affine

SLIDE 60

Optimal and Pareto optimal points

set of achievable objective values: O = {f0(x) | x ∈ X} ⊆ Rq
◮ if f0(x) is the minimum value of O (f0(x) ⪯_K f0(y) for all y ∈ X):
  ◮ x is optimal and f0(x) is the optimal value
  ◮ most vector optimization problems do not have an optimal point (value), but this does occur in some special cases
◮ if f0(x) is a minimal value of O (f0(y) ⪯_K f0(x) for some y ∈ X implies f0(y) = f0(x)):
  ◮ x is Pareto optimal and f0(x) is a Pareto optimal value
  ◮ a vector optimization problem can have many Pareto optimal points (values)

Figure 4.7: The set O of achievable values for a vector optimization problem with objective values in R2 and cone K = R2+, shown shaded. Here the point labeled f0(x⋆) is the optimal value and x⋆ is an optimal point: f0(x⋆) can be compared to every other achievable value f0(y) and is better than or equal to it (‘better than or equal to’ means ‘below and to the left of’). The lightly shaded region f0(x⋆) + K is the set of all z ∈ R2 corresponding to objective values worse than (or equal to) f0(x⋆).

Figure 4.8: The set O for a problem with no optimal point or value, but with a set of Pareto optimal points, whose corresponding values form the darkened curve on the lower-left boundary of O. The point labeled f0(xpo) is a Pareto optimal value and xpo a Pareto optimal point. The lightly shaded region f0(xpo) − K is the set of all z ∈ R2 corresponding to objective values better than (or equal to) f0(xpo).

SLIDE 61

Scalarization

find Pareto optimal points of a general vector optimization problem by choosing λ ≻_{K∗} 0 and solving the scalar problem

min_x  λTf0(x)
s.t.   fi(x) ≤ 0,  i = 1, …, m
       hi(x) = 0,  i = 1, …, p

◮ an optimal point of the scalar problem is Pareto optimal for the vector optimization problem
◮ for a convex vector optimization problem, (almost) all Pareto optimal points can be found by varying λ ≻_{K∗} 0

Figure 4.9: Scalarization. The set O of achievable values for a vector optimization problem with cone K = R2+. Three Pareto optimal values f0(x1), f0(x2), f0(x3) are shown. The first two can be obtained by scalarization: f0(x1) minimizes λ1Tu over all u ∈ O, and f0(x2) minimizes λ2Tu, where λ1, λ2 ≻ 0. The value f0(x3) is Pareto optimal but cannot be found by scalarization.

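A discrete sketch of scalarization (assumed toy data): minimize λTu over a finite set of achievable values O ⊂ R² for several λ ≻ 0 and check that each minimizer is Pareto optimal w.r.t. K = R²+:

```python
# achievable objective values (assumed toy set); cone K = R^2_+
O = [(0.0, 3.0), (1.0, 1.0), (3.0, 0.5), (2.0, 2.0), (4.0, 4.0)]

def pareto_optimal(u):
    # no other v in O with v <= u componentwise and v != u
    return not any(v != u and v[0] <= u[0] and v[1] <= u[1] for v in O)

for lam in [(0.1, 0.9), (0.5, 0.5), (0.9, 0.1)]:
    u_min = min(O, key=lambda u: lam[0] * u[0] + lam[1] * u[1])
    assert pareto_optimal(u_min)     # scalarization yields Pareto optimal values
```

Note that varying λ sweeps out different Pareto optimal values, matching the picture in Figure 4.9.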
SLIDE 62

Multicriterion optimization

vector optimization problem with K = R+q and f0(x) = (F1(x), …, Fq(x))
◮ want all q scalar objectives Fi to be small
◮ a feasible x∗ is optimal if f0(x∗) ⪯ f0(y) for all feasible y
  ◮ if there exists an optimal point, the objectives are noncompeting
◮ a feasible xpo is Pareto optimal if f0(y) ⪯ f0(xpo) for feasible y implies f0(xpo) = f0(y)
  ◮ if there are multiple Pareto optimal values, there is a trade-off between the objectives

SLIDE 63

Multicriterion optimization-examples

regularized least-squares:

min_x (w.r.t. R2+)  (‖Ax − b‖2², ‖x‖2²)

with F1(x) = ‖Ax − b‖2² and F2(x) = ‖x‖2²

Figure 4.11: Optimal trade-off curve for a regularized least-squares problem. The shaded set is the set of achievable values (‖Ax − b‖2², ‖x‖2²). The optimal trade-off curve, shown darker, is the lower-left part of the boundary.

example for A ∈ R100×10, b ∈ R100; the heavy line is formed by Pareto optimal points

SLIDE 64

Multicriterion optimization-examples

risk–return trade-off in portfolio optimization:

min_x (w.r.t. R2+)  (−p̄Tx, xTΣx)
s.t.                1Tx = 1, x ⪰ 0

◮ x ∈ Rn is the investment portfolio; xi is the fraction invested in asset i
◮ p ∈ Rn is the vector of relative asset price changes, modeled as a random variable with mean p̄ and covariance Σ
◮ p̄Tx = E r is the expected return; xTΣx = var r is the return variance

Figure 4.12: Top: optimal risk–return trade-off curve for a simple portfolio optimization problem. The left endpoint corresponds to putting all resources in the risk-free asset, and so has zero standard deviation; the right endpoint corresponds to putting all resources in asset 1, which has the highest mean return. Bottom: corresponding optimal allocations.