SLIDE 1

Convex Optimization

1. Introduction

Prof. Ying Cui
Department of Electrical Engineering
Shanghai Jiao Tong University

2020

SLIDE 2

Outline

◮ Mathematical optimization
◮ Least-squares problems
◮ Linear programming
◮ Convex optimization
◮ Nonlinear optimization
◮ Outline of textbook

SLIDE 3

Mathematical optimization

Optimization problem:

  min_x  f0(x)
  s.t.   fi(x) ≤ bi,  i = 1, · · · , m

◮ optimization variable: x = (x1, · · · , xn) ∈ Rn
◮ objective function: f0 : Rn → R
◮ (inequality) constraint functions: fi : Rn → R, i = 1, · · · , m
◮ limits or bounds for the constraints: bi ∈ R, i = 1, · · · , m
◮ optimal point (solution): x⋆ has the smallest objective value among all vectors that satisfy the constraints
  ◮ for all z ∈ Rn with fi(z) ≤ bi, i = 1, · · · , m, we have f0(z) ≥ f0(x⋆)
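A minimal sketch of how this abstract form maps to code, assuming scipy is available; the particular f0, f1, and b1 below are illustrative placeholders, not from the slides:

```python
# Sketch: min f0(x) s.t. f1(x) <= b1, via scipy.optimize.minimize.
# f0, f1, b1 are illustrative placeholders, not from the slides.
import numpy as np
from scipy.optimize import minimize

f0 = lambda x: (x[0] - 1.0) ** 2 + (x[1] + 2.0) ** 2  # objective f0
f1 = lambda x: x[0] + x[1]                            # constraint function f1
b1 = 0.0                                              # bound b1

# scipy encodes inequality constraints as g(x) >= 0, so
# f1(x) <= b1 becomes b1 - f1(x) >= 0.
cons = [{"type": "ineq", "fun": lambda x: b1 - f1(x)}]

res = minimize(f0, x0=np.zeros(2), constraints=cons)
print(res.x, res.fun)  # approximate x* and f0(x*)
```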

SLIDE 4

Mathematical optimization

Classes of optimization problems
◮ linear optimization problems: f0, f1, · · · , fm are linear
  ◮ fi(αx + βy) = αfi(x) + βfi(y) for all x, y ∈ Rn and all α, β ∈ R
◮ nonlinear optimization problems
◮ convex optimization problems: f0, f1, · · · , fm are convex
  ◮ fi(αx + βy) ≤ αfi(x) + βfi(y) for all x, y ∈ Rn and all α, β ∈ R with α + β = 1, α ≥ 0, β ≥ 0 (see the numeric spot check at the end of this slide)
◮ nonconvex optimization problems
◮ convex optimization is a generalization of linear optimization
  ◮ convexity is more general than linearity: an inequality replaces the more restrictive equality, and the inequality must hold only for certain values of α and β
◮ "the greatest watershed in optimization isn't between linearity and nonlinearity, but convexity and nonconvexity"
  ◮ stated by Ralph Tyrrell Rockafellar in his 1993 SIAM survey paper [Roc93]
◮ the first formal argument that convex optimization problems are easier to solve than general nonlinear optimization problems was made by Nemirovski and Yudin in their 1983 book [NY83]
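A numeric spot check of the convexity inequality above; a sanity illustration on the arbitrary convex example f(x) = ||x||_2^2, not a proof:

```python
# Spot check of f(a*x + b*y) <= a*f(x) + b*f(y) with a + b = 1,
# a, b >= 0, for the convex example f(x) = ||x||_2^2.
import numpy as np

f = lambda x: np.dot(x, x)

rng = np.random.default_rng(0)
for _ in range(1000):
    x, y = rng.standard_normal(3), rng.standard_normal(3)
    a = rng.uniform()      # a in [0, 1), b = 1 - a
    b = 1.0 - a
    assert f(a * x + b * y) <= a * f(x) + b * f(y) + 1e-12
print("convexity inequality held on all samples")
```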

SLIDE 5

Mathematical optimization

Applications
◮ optimization is an abstraction of the problem of making the best possible choice from a set of candidate choices
  ◮ variable x represents the choice made
  ◮ constraints fi(x) ≤ bi, i = 1, · · · , m represent firm requirements or specifications limiting the possible choices
  ◮ objective value f0(x) represents the cost of choosing x (−f0(x) represents the value or utility of choosing x)
  ◮ a solution x⋆ corresponds to a choice with minimum cost (or maximum utility) among all possible choices
◮ many practical problems involving decision making, system design, analysis, and operation can be cast in the form of an optimization problem or some variation of it
◮ optimization has become an important tool in many areas
  ◮ e.g., electronic design automation, automatic control systems, and optimal design problems arising in civil, chemical, mechanical, and aerospace engineering

SLIDE 6

Mathematical optimization

Examples
◮ portfolio optimization: seek the best way to invest some capital in a set of assets
  ◮ variables: amounts invested in different assets
  ◮ constraints: budget, max./min. investment per asset, minimum return
  ◮ objective: overall risk or return variance
◮ device sizing in electronic circuits: choose the width and length of each device in an electronic circuit
  ◮ variables: device widths and lengths
  ◮ constraints: manufacturing limits, timing requirements, maximum area
  ◮ objective: power consumption
◮ data fitting: find a model that best fits some observed data
  ◮ variables: model parameters
  ◮ constraints: prior information, parameter limits
  ◮ objective: measure of misfit or prediction error

SLIDE 7

Mathematical optimization

Solving optimization problems
◮ effectiveness of solution methods varies considerably and depends on factors such as
  ◮ forms of the objective and constraint functions, numbers of variables and constraints, and special structure (e.g., sparsity)
  ◮ a problem is sparse if each constraint function depends on only a small number of the variables
◮ the general optimization problem is very difficult to solve, and approaches to it involve some compromise, e.g.,
  ◮ very long computation time, or the possibility of not finding the solution
◮ a few problem classes can be solved reliably and efficiently
  ◮ least-squares problems, linear programming problems, convex optimization problems

SLIDE 8

Least-squares problems

A least-squares problem is an optimization problem with no constraints and an objective which is a sum of squares of terms of the form aiT x − bi:

  min_x  ||Ax − b||_2^2 = Σ_{i=1}^k (aiT x − bi)^2 = (Ax − b)T (Ax − b) = xT ATAx − 2xT ATb + bTb

where A ∈ Rk×n (k ≥ n), aiT, i = 1, · · · , k are the rows of A, rank A = n, b ∈ Rk, and x ∈ Rn.

Solving least-squares problems
◮ analytical solution: x⋆ = (ATA)^{-1} ATb
  ◮ ∇f0(x) = 2ATAx − 2ATb = 0 ⟹ x⋆ = (ATA)^{-1} ATb
◮ reliable and efficient algorithms and software
◮ computation time proportional to n^2 k, less if structured (e.g., A is sparse, i.e., has far fewer than kn nonzero entries)
◮ a mature technology
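A short sketch comparing the analytical solution with numpy's least-squares solver, on random placeholder data; in practice np.linalg.lstsq is preferred over forming ATA explicitly, since the normal equations square the condition number of A:

```python
# Compare x = (A^T A)^{-1} A^T b (normal equations) with numpy's
# SVD-based solver. Random data as a placeholder example.
import numpy as np

rng = np.random.default_rng(1)
k, n = 20, 5
A = rng.standard_normal((k, n))   # k >= n, full column rank w.h.p.
b = rng.standard_normal(k)

x_normal = np.linalg.solve(A.T @ A, A.T @ b)      # normal equations
x_lstsq, *_ = np.linalg.lstsq(A, b, rcond=None)   # numerically stable solver

print(np.allclose(x_normal, x_lstsq))             # True
```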

SLIDE 9

Least-squares problems

Using least-squares
◮ basis for regression analysis, optimal control, and many parameter estimation and data fitting methods
◮ least-squares problems are easy to recognize
  ◮ verify that the objective is a quadratic function with positive semidefinite quadratic form, i.e., xT Px + qT x + r with P ∈ S^n_+
◮ several standard techniques are used to increase flexibility
  ◮ e.g., including weights, adding regularization terms
◮ weighted least-squares: Σ_{i=1}^k wi (aiT x − bi)^2 (see the code sketch below)
  ◮ weights wi chosen to reflect differing levels of concern about the sizes of the terms aiT x − bi
◮ regularization: Σ_{i=1}^k (aiT x − bi)^2 + ρ Σ_{i=1}^n xi^2
  ◮ parameter ρ chosen to give the right trade-off between making Σ_{i=1}^k (aiT x − bi)^2 small while keeping Σ_{i=1}^n xi^2 not too big
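Both variations admit closed forms, sketched below on illustrative data; the formulas x = (ATWA)^{-1} ATWb and x = (ATA + ρI)^{-1} ATb follow from setting the gradient to zero, as in the unweighted case:

```python
# Closed-form solutions for the two variations above (illustrative data):
# weighted LS:  x = (A^T W A)^{-1} A^T W b, W = diag(w_1, ..., w_k);
# regularized LS (ridge): x = (A^T A + rho*I)^{-1} A^T b.
import numpy as np

rng = np.random.default_rng(2)
k, n = 20, 5
A, b = rng.standard_normal((k, n)), rng.standard_normal(k)

w = rng.uniform(0.5, 2.0, size=k)   # per-term weights w_i > 0
W = np.diag(w)
x_weighted = np.linalg.solve(A.T @ W @ A, A.T @ W @ b)

rho = 0.1                           # regularization parameter
x_ridge = np.linalg.solve(A.T @ A + rho * np.eye(n), A.T @ b)
```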

SLIDE 10

Linear programming

Linear programming is a class of optimization problems where the objective and all constraint functions are linear:

  min_x  cT x
  s.t.   aiT x ≤ bi,  i = 1, · · · , m

where c, a1, · · · , am ∈ Rn are vectors, b1, · · · , bm ∈ R are scalars, and x ∈ Rn.

Solving linear programs
◮ no analytical formula for the solution
◮ reliable and efficient algorithms (e.g., the simplex method and interior-point methods) and software
◮ computation time proportional to n^2 m (assuming m ≥ n), less with structure (e.g., the problem is sparse, i.e., each constraint function depends on only a small number of the variables)
◮ a mature technology
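A minimal sketch of this form with scipy's linprog; the data are illustrative, and note that linprog assumes x ≥ 0 unless the bounds are lifted explicitly:

```python
# A small LP in the form above: min c^T x s.t. a_i^T x <= b_i.
# Illustrative data. linprog defaults to x >= 0, so we make the
# variables free and put nonnegativity into the constraints instead.
import numpy as np
from scipy.optimize import linprog

c = np.array([1.0, 2.0])
A_ub = np.array([[-1.0, 0.0],
                 [0.0, -1.0],
                 [1.0, 1.0]])        # rows a_i^T
b_ub = np.array([0.0, 0.0, 1.0])     # bounds b_i

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * 2)
print(res.x, res.fun)                # optimal x and c^T x
```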

SLIDE 11

Linear programming

Using linear programming
◮ not as easy to recognize as least-squares problems
◮ a few standard tricks are used to convert problems into linear programs
  ◮ e.g., problems involving the l1-norm or l∞-norm, or piecewise-linear functions
◮ Chebyshev approximation problem:

    min_x  max_{i=1,··· ,k} |aiT x − bi|

  with vectors a1, · · · , ak ∈ Rn and scalars b1, · · · , bk ∈ R
◮ equivalent linear program (see the code sketch below):

    min_{x,t}  t
    s.t.   aiT x − t ≤ bi,    i = 1, · · · , k
           −aiT x − t ≤ −bi,  i = 1, · · · , k
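A sketch of the reformulation above, solved with scipy's linprog on random placeholder data; the stacked variable vector is z = (x, t):

```python
# Chebyshev approximation as an LP in z = (x, t):
# minimize t s.t. a_i^T x - t <= b_i and -a_i^T x - t <= -b_i.
import numpy as np
from scipy.optimize import linprog

rng = np.random.default_rng(3)
k, n = 30, 4
A, b = rng.standard_normal((k, n)), rng.standard_normal(k)

# Stack the 2k inequality constraints.
A_ub = np.vstack([np.hstack([A, -np.ones((k, 1))]),
                  np.hstack([-A, -np.ones((k, 1))])])
b_ub = np.concatenate([b, -b])
c = np.zeros(n + 1)
c[-1] = 1.0                          # minimize t

res = linprog(c, A_ub=A_ub, b_ub=b_ub, bounds=[(None, None)] * (n + 1))
x, t = res.x[:n], res.x[-1]
print(np.isclose(t, np.max(np.abs(A @ x - b))))  # t equals the max residual
```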

SLIDE 12

Convex optimization

Convex optimization is a class of optimization problems where the objective and all constraint functions are convex:

  min_x  f0(x)
  s.t.   fi(x) ≤ bi,  i = 1, · · · , m

◮ the functions f0, · · · , fm : Rn → R are convex, i.e., satisfy fi(αx + βy) ≤ αfi(x) + βfi(y) for all x, y ∈ Rn and all α, β ∈ R with α + β = 1, α ≥ 0, β ≥ 0
◮ least-squares problems and linear programming problems are both special cases
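A generic convex problem in this form, sketched with cvxpy (a modeling layer that passes problems to convex solvers); the objective and constraints are illustrative convex choices, not from the slides:

```python
# An illustrative convex problem: convex objective, convex constraints.
import cvxpy as cp
import numpy as np

n = 3
x = cp.Variable(n)
f0 = cp.sum_squares(x - np.array([1.0, 2.0, 3.0]))  # convex objective
constraints = [cp.norm(x, 2) <= 1.0,                # convex constraint f1(x) <= b1
               cp.sum(x) <= 0.5]                    # linear constraint

prob = cp.Problem(cp.Minimize(f0), constraints)
prob.solve()
print(x.value, prob.value)
```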

SLIDE 13

Convex optimization

Solving convex optimization problems
◮ in general, no analytical formula for the solution
◮ reliable and efficient algorithms
  ◮ interior-point methods: almost always 10-100 iterations, each with computation time proportional to max{n^3, n^2 m, F}, where F is the cost of evaluating the first and second derivatives of f0, · · · , fm
◮ almost a technology

Using convex optimization
◮ often difficult to recognize
◮ many tricks for transforming problems into convex form
◮ surprisingly many problems can be solved via convex optimization

SLIDE 14

Nonlinear optimization

Nonlinear optimization (or nonlinear programming) is the term used to describe an optimization problem that is not linear, but not known to be convex.
◮ no effective methods for solving general nonconvex problems

Local optimization
◮ seek a locally optimal solution, which minimizes the objective function among feasible points near it
◮ widely used in applications where there is value in finding a good point, if not the very best
◮ local optimization methods
  ◮ advantages: can be fast, can handle large problems, and are widely applicable (only require differentiability of the objective and constraint functions)
  ◮ disadvantages: require an initial guess (which can greatly affect the objective value of the local solution obtained, as the sketch below illustrates), provide no information about the distance to the (global) optimum, and are sensitive to algorithm parameter values
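A small illustration of the initial-guess sensitivity: a one-dimensional nonconvex example (chosen purely for illustration) where scipy's local method returns different local minima from different starting points:

```python
# A nonconvex function with two local minima near x = -1 and x = +1;
# the local method converges to whichever basin the guess starts in.
import numpy as np
from scipy.optimize import minimize

f = lambda x: (x[0] ** 2 - 1.0) ** 2 + 0.3 * x[0]

for x0 in (-2.0, 2.0):
    res = minimize(f, x0=np.array([x0]))
    print(f"start {x0:+.1f} -> x* = {res.x[0]:+.4f}, f = {res.fun:.4f}")
```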

SLIDE 15

Nonlinear optimization

Global optimization
◮ seek a globally optimal solution, which minimizes the objective function among all feasible points
◮ used for problems with a small number of variables, where computing time is not critical and the value of finding the true global solution is very high
◮ global optimization methods
  ◮ advantages: guarantee global optimality
  ◮ disadvantages: worst-case complexity grows exponentially with the problem sizes n and m

SLIDE 16

Nonlinear optimization

Role of convex optimization in nonconvex problems
◮ initialization or iterations for local optimization
  ◮ use the exact solution to an approximate convex formulation of a nonconvex problem as the initial point
  ◮ successive convex approximation (SCA): successively approximate a nonconvex problem with convex problems
◮ convex heuristics for nonconvex optimization
  ◮ approximate a nonconvex term with a convex one
    ◮ e.g., approximate ||x||_0 with ||x||_1 (Eg. 6.4); see the code sketch at the end of this slide
  ◮ randomized algorithms: draw candidates from a probability distribution, and take the best one as an approximate solution
    ◮ semidefinite relaxation & Gaussian randomization (Ex. 11.23)
◮ bounds for global optimization
  ◮ constraint relaxation: solve a relaxed problem with the nonconvex constraints replaced by looser but convex constraints
  ◮ Lagrangian relaxation: solve the (convex) Lagrangian dual problem
  ◮ relaxation provides a lower bound on the optimal value of a nonconvex problem
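A sketch of the l0-to-l1 heuristic mentioned above, using cvxpy on synthetic data: recover a sparse x from underdetermined linear measurements by minimizing ||x||_1, a convex surrogate for the number of nonzeros:

```python
# l1 heuristic for sparsity: min ||x||_1 s.t. Ax = b, on synthetic data.
# For sufficiently sparse x_true this often recovers it exactly.
import cvxpy as cp
import numpy as np

rng = np.random.default_rng(4)
m, n = 30, 100
A = rng.standard_normal((m, n))
x_true = np.zeros(n)
x_true[rng.choice(n, 5, replace=False)] = rng.standard_normal(5)
b = A @ x_true

x = cp.Variable(n)
prob = cp.Problem(cp.Minimize(cp.norm(x, 1)), [A @ x == b])
prob.solve()
print("recovery error:", np.linalg.norm(x.value - x_true))  # small when the heuristic succeeds
```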

SLIDE 17

Outline of textbook

◮ theory (Chp.2-Chp.5): covers basic definitions, concepts, and results from convex analysis and convex optimization
  ◮ convex sets, convex functions, convex optimization problems, duality
  ◮ how to recognize and formulate convex optimization problems
◮ applications (Chp.6-Chp.8): describes a variety of applications of convex optimization, in areas like probability and statistics, computational geometry, and data fitting
  ◮ approximation and fitting, statistical estimation, geometric problems
  ◮ how to apply convex optimization in practice
◮ algorithms (Chp.9-Chp.11): describes numerical methods for solving convex optimization problems, focusing on Newton's algorithm and interior-point methods
  ◮ unconstrained minimization, equality constrained minimization, interior-point methods
  ◮ how to solve convex optimization problems

SLIDE 18

Brief history of convex optimization

◮ theory (convex analysis): 1900–1970
◮ algorithms
  ◮ 1947: simplex algorithm for linear programming (Dantzig)
  ◮ 1960s: early interior-point methods (Fiacco & McCormick, Dikin, ...)
  ◮ 1970s: ellipsoid method and other subgradient methods
  ◮ 1980s: polynomial-time interior-point methods for linear programming
  ◮ late 1980s–now: polynomial-time interior-point methods for nonlinear convex optimization (Nesterov & Nemirovski 1994)
◮ applications
  ◮ before 1990: mostly in operations research, few in engineering
  ◮ since 1990: many new applications in engineering (control, signal processing, communications, circuit design, ...); new problem classes (semidefinite and second-order cone programming, robust optimization, ...)
