SLIDE 1

Nonlinear Programming Models

Fabio Schoen 2008

http://gol.dsi.unifi.it/users/schoen

Nonlinear Programming Models – p.

SLIDE 2

Introduction

SLIDE 3

NLP problems

min f(x),  x ∈ S ⊆ R^n

Standard form:

min f(x)
h_i(x) = 0,  i = 1, …, m
g_j(x) ≤ 0,  j = 1, …, k

Here S = {x ∈ R^n : h_i(x) = 0 ∀ i, g_j(x) ≤ 0 ∀ j}

SLIDE 4

Local and global optima

A global minimum (or global optimum) is any x⋆ ∈ S such that x ∈ S ⇒ f(x) ≥ f(x⋆). A point x̄ ∈ S is a local optimum if ∃ ε > 0 such that x ∈ S ∩ B(x̄, ε) ⇒ f(x) ≥ f(x̄), where B(x̄, ε) = {x ∈ R^n : ‖x − x̄‖ ≤ ε} is a ball in R^n. Every global optimum is also a local optimum, but the converse is generally false.

SLIDE 5

Convex Functions

A set S ⊆ R^n is convex if x, y ∈ S ⇒ λx + (1 − λ)y ∈ S for all choices of λ ∈ [0, 1]. Let Ω ⊆ R^n be a nonempty convex set. A function f : Ω → R is convex iff

f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y)  for all x, y ∈ Ω, λ ∈ [0, 1]
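The defining inequality is easy to spot-check numerically. A minimal sketch in Python (the sample functions and the interval are illustrative; a failed check certifies non-convexity, a passed check is only evidence):

```python
import random

def looks_convex(f, trials=1000, lo=-5.0, hi=5.0, tol=1e-9):
    """Spot-check f(lam*x + (1-lam)*y) <= lam*f(x) + (1-lam)*f(y)
    on random one-dimensional pairs."""
    for _ in range(trials):
        x, y = random.uniform(lo, hi), random.uniform(lo, hi)
        lam = random.random()
        z = lam * x + (1 - lam) * y
        if f(z) > lam * f(x) + (1 - lam) * f(y) + tol:
            return False  # certificate: convexity violated at (x, y, lam)
    return True

print(looks_convex(lambda t: t * t))   # t^2 is convex: True
print(looks_convex(lambda t: t ** 3))  # t^3 is concave on t < 0, so not convex
```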

SLIDE 6

Convex Functions

[figure: a convex function; the chord between the points at x and y lies above the graph]

SLIDE 7

Properties of convex functions

Every convex function is continuous in the interior of Ω. It may be discontinuous, but only on the boundary. If f is continuously differentiable, then it is convex iff

f(y) ≥ f(x) + (y − x)ᵀ∇f(x)  for all x, y ∈ Ω
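For a concrete check of the gradient inequality, take f(x) = x² in one dimension: the gap f(y) − f(x) − (y − x)f′(x) works out to (y − x)² ≥ 0, so every tangent line lies below the graph. A small sketch:

```python
import random

def f(x):
    return x * x

def grad(x):
    return 2 * x

# Tangent-line inequality for a differentiable convex function:
# f(y) >= f(x) + (y - x) * f'(x); for f(x) = x^2 the gap is exactly (y - x)^2.
pairs = [(random.uniform(-10, 10), random.uniform(-10, 10)) for _ in range(200)]
ok = all(f(y) >= f(x) + (y - x) * grad(x) - 1e-9 for x, y in pairs)
print(ok)  # True
```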

SLIDE 8

Convex functions

[figure: a differentiable convex function lies above its tangent line at x]

SLIDE 9

If f is twice continuously differentiable ⇒ f is convex iff its Hessian matrix

∇²f(x) := [∂²f / ∂x_i ∂x_j]_{i,j}

is positive semi-definite: ∇²f(x) ⪰ 0, i.e.

vᵀ∇²f(x)v ≥ 0  ∀ v ∈ R^n

or, equivalently, all eigenvalues of ∇²f(x) are non-negative.

SLIDE 10

Example: an affine function is convex (and concave). For a quadratic function (Q: symmetric matrix):

f(x) = ½ xᵀQx + bᵀx + c

we have ∇f(x) = Qx + b, ∇²f(x) = Q ⇒ f is convex iff Q ⪰ 0
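The eigenvalue criterion can be tested directly for a 2 × 2 quadratic, where the eigenvalues of Q have a closed form. A sketch (pure Python; the example matrices are illustrative):

```python
import math

def eig_sym2(a, b, c):
    """Eigenvalues of the symmetric 2x2 matrix [[a, b], [b, c]]."""
    m = (a + c) / 2.0
    d = math.sqrt(((a - c) / 2.0) ** 2 + b * b)
    return m - d, m + d

def quadratic_is_convex(a, b, c, tol=1e-12):
    """f(x) = 1/2 x^T Q x + b^T x + c has Hessian Q = [[a, b], [b, c]]:
    convex iff the smallest eigenvalue of Q is non-negative."""
    return eig_sym2(a, b, c)[0] >= -tol

print(quadratic_is_convex(2.0, 1.0, 2.0))  # eigenvalues 1, 3 -> True
print(quadratic_is_convex(1.0, 2.0, 1.0))  # eigenvalues -1, 3 -> False
```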

SLIDE 11

Convex Optimization Problems

min f(x), x ∈ S is a convex optimization problem iff S is a convex set and f is convex on S. For a problem in standard form

min f(x)
h_i(x) = 0,  i = 1, …, m
g_j(x) ≤ 0,  j = 1, …, k

if f is convex, the h_i(x) are affine functions and the g_j(x) are convex functions, then the problem is convex.

SLIDE 12

Maximization

Slight abuse of notation: a problem max f(x), x ∈ S is called convex iff S is a convex set and f is a concave function. (Not to be confused with minimization of a concave function, or maximization of a convex function, which are NOT convex optimization problems.)

SLIDE 13

Convex and non convex optimization

Convex optimization “is easy”, non-convex optimization is usually very hard. Fundamental property of convex optimization problems: every local optimum is also a global optimum (a proof will be given later). Minimizing a positive semi-definite quadratic function over a polyhedron is easy (polynomially solvable); if even a single eigenvalue of the Hessian is negative ⇒ the problem becomes NP–hard.

SLIDE 14

Convex functions: examples

Many (of course not all . . . ) functions are convex!

affine functions aᵀx + b
quadratic functions ½ xᵀQx + bᵀx + c with Q = Qᵀ, Q ⪰ 0
any norm is a convex function
x log x (however log x is concave)
f is convex if and only if ∀ x0, d ∈ R^n its restriction to any line, φ(α) = f(x0 + αd), is a convex function
a linear non-negative combination of convex functions is convex
g(x, y) convex in x for all y ⇒ ∫ g(x, y) dy is convex
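The line-restriction criterion above can be exercised numerically: check midpoint convexity of φ(α) = f(x0 + αd) on an equally spaced grid. Below, a sketch for f(x) = Σ x_i log x_i, convex on the positive orthant; the base point and direction are illustrative and chosen so the segment stays in the domain:

```python
import math

def f(x):
    # f(x) = sum_i x_i * log(x_i), convex for x_i > 0
    return sum(t * math.log(t) for t in x)

def restriction_is_convex(f, x0, d, alphas, tol=1e-9):
    """Midpoint-convexity check of phi(a) = f(x0 + a*d) on an equally
    spaced grid: each value is at most the average of its neighbours."""
    phi = [f([x0i + a * di for x0i, di in zip(x0, d)]) for a in alphas]
    return all(2 * phi[k] <= phi[k - 1] + phi[k + 1] + tol
               for k in range(1, len(phi) - 1))

alphas = [k * 0.01 for k in range(100)]  # segment stays inside the domain
print(restriction_is_convex(f, [1.0, 2.0], [0.5, -0.5], alphas))  # True
```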

SLIDE 15

more examples . . .

max_i {a_iᵀx + b_i} is convex
f, g convex ⇒ max{f(x), g(x)} is convex
f_a convex for any a ∈ A (a possibly uncountable set) ⇒ sup_{a∈A} f_a(x) is convex
f convex ⇒ f(Ax + b) is convex
let S ⊆ R^n be any set ⇒ f(x) = sup_{s∈S} ‖x − s‖ is convex
Trace(AᵀX) = Σ_{i,j} A_ij X_ij is convex (it is linear!)
log det X⁻¹ is convex over the set of matrices X ∈ R^{n×n} : X ≻ 0
λ_max(X) (the largest eigenvalue of a symmetric matrix X) is convex

SLIDE 16

Data Approximation

SLIDE 17

Table of contents

norm approximation
maximum likelihood
robust estimation

SLIDE 18

Norm approximation

Problem:

min_x ‖Ax − b‖

where A, b are parameters. Usually the system is over-determined, i.e. b ∉ Range(A). For example, this typically happens when A ∈ R^{m×n} with m > n and A has full rank. r := Ax − b: the “residual”.

SLIDE 19

Examples

‖r‖₂ = √(rᵀr): least squares (or “regression”)
‖r‖ = √(rᵀPr) with P ≻ 0: weighted least squares
‖r‖∞ = max_i |r_i|: minimax, or ℓ∞, or Tchebichev approximation
‖r‖₁ = Σ_i |r_i|: absolute or ℓ1 approximation

Possible (convex) additional constraints:

maximum deviation from an initial estimate: ‖x − x_est‖ ≤ ε
simple bounds: ℓ_i ≤ x_i ≤ u_i
ordering: x1 ≤ x2 ≤ · · · ≤ xn
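The qualitative differences between these norms show up already in a one-parameter fit. A sketch (pure Python, coarse grid search; the data, with one outlier, are illustrative):

```python
def residuals(theta, a, b):
    return [theta * ai - bi for ai, bi in zip(a, b)]

def fit(norm, a, b, grid):
    """theta on the grid minimizing the chosen norm of the residual."""
    return min(grid, key=lambda th: norm(residuals(th, a, b)))

l1 = lambda r: sum(abs(t) for t in r)
l2 = lambda r: sum(t * t for t in r) ** 0.5
linf = lambda r: max(abs(t) for t in r)

a = [1.0, 1.0, 1.0, 1.0]
b = [0.0, 0.1, 0.2, 10.0]                 # last observation is an outlier
grid = [k * 0.01 for k in range(-100, 1101)]

th1, th2, thinf = (fit(n, a, b, grid) for n in (l1, l2, linf))
print(th1, th2, thinf)
# l1 stays near the bulk of the data (around 0.1-0.2), l2 is dragged toward
# the mean (about 2.6), l-infinity goes to the midrange (about 5)
```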

SLIDE 20

Example: ℓ1 norm

Matrix A ∈ R^{100×30}

[histogram of the norm-1 residuals]

SLIDE 21

ℓ∞ norm

[histogram of the ∞-norm residuals]

SLIDE 22

ℓ2 norm

[histogram of the norm-2 residuals]

SLIDE 23

Variants

min_x Σ_i h(y_i − a_iᵀx)

where h is a convex function:

linear–quadratic: h(z) = z² if |z| ≤ 1, 2|z| − 1 if |z| > 1
“dead zone”: h(z) = 0 if |z| ≤ 1, |z| − 1 if |z| > 1
logarithmic barrier: h(z) = −log(1 − z²) if |z| < 1, ∞ if |z| ≥ 1
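These three penalties are straightforward to implement; a sketch (the zero value of the dead-zone penalty inside [−1, 1] is implied by its name):

```python
import math

def linquad(z):
    """Linear-quadratic (Huber-like): quadratic inside [-1, 1], linear outside."""
    return z * z if abs(z) <= 1 else 2 * abs(z) - 1

def deadzone(z):
    """No cost inside [-1, 1], linear cost outside."""
    return 0.0 if abs(z) <= 1 else abs(z) - 1

def logbarrier(z):
    """-log(1 - z^2) inside (-1, 1), +infinity outside."""
    return -math.log(1 - z * z) if abs(z) < 1 else math.inf

print(linquad(0.5), linquad(2.0))    # 0.25 3.0
print(deadzone(0.5), deadzone(2.0))  # 0.0 1.0
print(logbarrier(1.0))               # inf
```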

SLIDE 24

comparison

[plot comparing the penalties: norm 1(x), norm 2(x), linquad(x), deadzone(x), logbarrier(x)]

SLIDE 25

Maximum likelihood

Given a sample X1, X2, . . . , Xk and a parametric family of probability density functions L(·; θ), the maximum likelihood estimate of θ given the sample is

θ̂ = arg max_θ L(X1, . . . , Xk; θ)

Example: linear measurements with additive i.i.d. (independent, identically distributed) noise:

X_i = a_iᵀθ + ε_i    (1)

where the ε_i are i.i.d. random variables with density p(·):

L(X1, . . . , Xk; θ) = ∏_{i=1}^{k} p(X_i − a_iᵀθ)

SLIDE 26

Max likelihood estimate - MLE

(taking the logarithm, which does not change optimum points):

θ̂ = arg max_θ Σ_i log p(X_i − a_iᵀθ)

If p is log–concave ⇒ this problem is convex. Examples:

ε ∼ N(0, σ²), i.e. p(z) = (2πσ²)^{−1/2} exp(−z²/(2σ²)) ⇒ MLE is the ℓ2 estimate: θ̂ = arg min_θ ‖Aθ − X‖₂
p(z) = (1/(2a)) exp(−|z|/a) ⇒ ℓ1 estimate: θ̂ = arg min_θ ‖Aθ − X‖₁
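The Gaussian case can be illustrated numerically: maximizing the log-likelihood over a grid picks, up to grid resolution, the same θ as minimizing the squared residuals. A sketch with synthetic scalar data (all numbers illustrative):

```python
import math
import random

random.seed(0)
theta_true = 1.7
a = [random.uniform(-1.0, 1.0) for _ in range(50)]
x = [ai * theta_true + random.gauss(0.0, 0.3) for ai in a]

def neg_sq(th):
    # negative sum of squared residuals (to be maximized)
    return -sum((xi - ai * th) ** 2 for ai, xi in zip(a, x))

def loglik(th, sigma=0.3):
    # Gaussian log-likelihood; the additive constant does not move the argmax
    return sum(-0.5 * math.log(2 * math.pi * sigma ** 2)
               - (xi - ai * th) ** 2 / (2 * sigma ** 2)
               for ai, xi in zip(a, x))

grid = [1.0 + k * 0.001 for k in range(1500)]
mle = max(grid, key=loglik)
ls = max(grid, key=neg_sq)
print(mle, ls)  # the two criteria agree up to grid resolution
```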

SLIDE 27

p(z) = (1/a) exp(−z/a)·1{z ≥ 0} (negative exponential) ⇒ the estimate can be found by solving the LP problem:

min 1ᵀ(X − Aθ)
Aθ ≤ X

p uniform on [−a, a] ⇒ the MLE is any θ such that ‖Aθ − X‖∞ ≤ a

SLIDE 28

Ellipsoids

An ellipsoid is a subset of R^n of the form

E = {x ∈ R^n : (x − x0)ᵀP⁻¹(x − x0) ≤ 1}

where x0 ∈ R^n is the center of the ellipsoid and P is a symmetric positive-definite matrix. Alternative representations:

E = {x ∈ R^n : ‖Ax − b‖₂ ≤ 1} where A ≻ 0, or
E = {x ∈ R^n : x = x0 + Au, ‖u‖₂ ≤ 1} where A is square and non-singular (affine transformation of the unit ball)

SLIDE 29

Robust Least Squares

Least Squares: x̂ = arg min_x Σ_i (a_iᵀx − b_i)²

Hypothesis: the a_i are not known, but it is known that a_i ∈ E_i = {ā_i + P_i u : ‖u‖ ≤ 1} where P_i = P_iᵀ ⪰ 0.

Definition: worst-case residuals:

max_{a_i ∈ E_i} Σ_i (a_iᵀx − b_i)²

A robust estimate of x is the solution of

x̂_r = arg min_x max_{a_i ∈ E_i} Σ_i (a_iᵀx − b_i)²

SLIDE 30

RLS

It holds that |α + βᵀy| ≤ |α| + ‖β‖‖y‖. Choosing y⋆ = β/‖β‖ if α ≥ 0 and y⋆ = −β/‖β‖ if α < 0, then ‖y⋆‖ = 1 and

|α + βᵀy⋆| = |α + sign(α) βᵀβ/‖β‖| = |α| + ‖β‖

Then:

max_{a_i ∈ E_i} |a_iᵀx − b_i| = max_{‖u‖≤1} |ā_iᵀx − b_i + uᵀP_i x| = |ā_iᵀx − b_i| + ‖P_i x‖
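The closed form for the worst-case residual can be verified against brute-force sampling of u on the unit circle (n = 2 here); the particular ā, b, P, x are illustrative:

```python
import math

def mat_vec(P, x):
    return (P[0][0] * x[0] + P[0][1] * x[1],
            P[1][0] * x[0] + P[1][1] * x[1])

def worst_case_closed(abar, b, P, x):
    """|abar^T x - b| + ||P x||, the claimed maximum over ||u|| <= 1."""
    alpha = abar[0] * x[0] + abar[1] * x[1] - b
    return abs(alpha) + math.hypot(*mat_vec(P, x))

def worst_case_sampled(abar, b, P, x, n=200000):
    """max over sampled u on the unit circle of |abar^T x - b + u^T P x|
    (the maximum of a convex function over the ball sits on the boundary)."""
    alpha = abar[0] * x[0] + abar[1] * x[1] - b
    Px = mat_vec(P, x)
    return max(abs(alpha + math.cos(t) * Px[0] + math.sin(t) * Px[1])
               for t in (2 * math.pi * k / n for k in range(n)))

abar, b, P, x = (1.0, 2.0), 0.5, ((1.0, 0.2), (0.2, 2.0)), (0.3, -0.7)
gap = worst_case_closed(abar, b, P, x) - worst_case_sampled(abar, b, P, x)
print(abs(gap) < 1e-6)  # True
```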

SLIDE 31

. . .

Thus the Robust Least Squares problem reduces to

min_x ( Σ_i (|ā_iᵀx − b_i| + ‖P_i x‖)² )^{1/2}

(a convex optimization problem). Transformation:

min_{x,t} ‖t‖₂
|ā_iᵀx − b_i| + ‖P_i x‖ ≤ t_i  ∀ i

i.e.

SLIDE 32

. . .

min_{x,t} ‖t‖₂
ā_iᵀx − b_i + ‖P_i x‖ ≤ t_i
−ā_iᵀx + b_i + ‖P_i x‖ ≤ t_i

(a Second Order Cone Problem). A norm cone is the convex set

C = {(x, t) ∈ R^{n+1} : ‖x‖ ≤ t}

SLIDE 33

Geometrical Problems

SLIDE 34

Geometrical Problems

projections and distances
polyhedral intersection
extremal volume ellipsoids
classification problems

SLIDE 35

Projection on a set

Given a set C, the projection of x on C is defined as:

P_C(x) = arg min_{z∈C} ‖z − x‖

[figure: sample points and their projections onto the set]

SLIDE 36

Projection on a convex set

If C = {x : Ax = b, f_i(x) ≤ 0, i = 1, …, m} where the f_i are convex ⇒ C is a convex set and the problem

P_C(x) = arg min_z ‖x − z‖
Az = b
f_i(z) ≤ 0,  i = 1, …, m

is convex.
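For simple convex sets the projection has a closed form, so no solver is needed. Two standard cases, sketched in Python (the test points are illustrative):

```python
def project_box(x, lo, hi):
    """Euclidean projection onto {z : lo_i <= z_i <= hi_i}.
    min ||z - x|| separates coordinate-wise, so each entry is clipped."""
    return [min(max(xi, li), ui) for xi, li, ui in zip(x, lo, hi)]

def project_halfspace(x, a, b):
    """Euclidean projection onto {z : a^T z <= b}: move x back along a."""
    viol = sum(ai * xi for ai, xi in zip(a, x)) - b
    if viol <= 0:
        return list(x)            # already feasible
    scale = viol / sum(ai * ai for ai in a)
    return [xi - scale * ai for xi, ai in zip(x, a)]

print(project_box([2.0, -3.0, 0.5], [0.0] * 3, [1.0] * 3))  # [1.0, 0.0, 0.5]
print(project_halfspace([3.0, 2.0], [1.0, 0.0], 1.0))       # [1.0, 2.0]
```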

SLIDE 37

Distance between convex sets

dist(C(1), C(2)) = min_{x ∈ C(1), y ∈ C(2)} ‖x − y‖

SLIDE 38

Distance between convex sets

If C(j) = {x : A(j)x = b(j), f_i(j)(x) ≤ 0} then the minimum distance can be found through a convex model:

min ‖x(1) − x(2)‖
A(1)x(1) = b(1)
A(2)x(2) = b(2)
f_i(1)(x(1)) ≤ 0
f_i(2)(x(2)) ≤ 0

SLIDE 39

Polyhedral intersection

1: polyhedra described by means of linear inequalities:

P1 = {x : Ax ≤ b},  P2 = {x : Cx ≤ d}

SLIDE 40

Polyhedral intersection

P1 ∩ P2 = ∅? It is a linear feasibility problem: Ax ≤ b, Cx ≤ d

P1 ⊆ P2? Just check sup{c_kᵀx : Ax ≤ b} ≤ d_k ∀ k (solution of a finite number of LP’s)

SLIDE 41

Polyhedral intersection (2)

2: polyhedra (polytopes) described through vertices:

P1 = conv{v1, . . . , vk},  P2 = conv{w1, . . . , wh}

P1 ∩ P2 = ∅? Need to find λ1, . . . , λk, µ1, . . . , µh ≥ 0:

Σ_i λ_i = 1,  Σ_j µ_j = 1,  Σ_i λ_i v_i = Σ_j µ_j w_j

P1 ⊆ P2? ∀ i = 1, . . . , k check whether ∃ µ_j ≥ 0:

Σ_j µ_j = 1,  Σ_j µ_j w_j = v_i

SLIDE 42

Minimal ellipsoid containing k points

Given v1, . . . , vk ∈ R^n find an ellipsoid E = {x : ‖Ax − b‖ ≤ 1} with minimal volume containing the k given points.

[figure: a cloud of points and the minimal enclosing ellipsoid]

SLIDE 43

A = Aᵀ ≻ 0. The volume of E is proportional to det A⁻¹ ⇒ convex optimization problem (in the unknowns A, b):

min log det A⁻¹
A = Aᵀ, A ≻ 0
‖Av_i − b‖ ≤ 1,  i = 1, …, k

SLIDE 44
Max. ellipsoid contained in a polyhedron

Given P = {x : Ax ≤ b} find an ellipsoid E = {By + d : ‖y‖ ≤ 1} contained in P with maximum volume.

SLIDE 45
Max. ellipsoid contained in a polyhedron

E ⊆ P ⇔ a_iᵀ(By + d) ≤ b_i  ∀ y : ‖y‖ ≤ 1
⇔ sup_{‖y‖≤1} {a_iᵀBy + a_iᵀd} ≤ b_i  ∀ i
⇔ ‖Ba_i‖ + a_iᵀd ≤ b_i

max_{B,d} log det B
B = Bᵀ ≻ 0
‖Ba_i‖ + a_iᵀd ≤ b_i,  i = 1, . . .

SLIDE 46

Difficult variants

These problems are hard: find a maximal volume ellipsoid contained in a polyhedron given by its vertices.

[figure: a set of vertices and an inscribed ellipsoid]

SLIDE 47

find a minimal volume ellipsoid containing a polyhedron described as a system of linear inequalities.

SLIDE 48

It is already a difficult problem to decide whether a given ellipsoid E contains a polyhedron P = {x : Ax ≤ b}. This problem is still difficult even when the ellipsoid is a sphere: it is equivalent to norm maximization over a polyhedron – an NP–hard concave optimization problem.

SLIDE 49

Linear classification (separation)

[figure: two classes of points in the plane, linearly separable by a hyperplane]

SLIDE 50

Given two point sets X1, . . . , Xk, Y1, . . . , Yh find a hyperplane aᵀx = t such that:

aᵀX_i ≥ t + 1,  i = 1, . . . , k
aᵀY_j ≤ t − 1,  j = 1, . . . , h

(an LP feasibility problem).
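The slide poses separation as an LP feasibility problem; as a dependency-free sketch, the classical perceptron rule (a different but standard technique) also finds a separating hyperplane aᵀx = t whenever the two clouds are linearly separable. The sample points below are illustrative:

```python
def separate(X, Y, sweeps=1000):
    """Perceptron: find (a, t) with a^T x > t on X and a^T x < t on Y,
    assuming the two point sets are linearly separable."""
    labelled = [(p, 1.0) for p in X] + [(p, -1.0) for p in Y]
    a, t = [0.0] * len(X[0]), 0.0
    for _ in range(sweeps):
        clean = True
        for p, lab in labelled:
            s = sum(ai * pi for ai, pi in zip(a, p)) - t
            if lab * s <= 0:              # misclassified or on the boundary
                a = [ai + lab * pi for ai, pi in zip(a, p)]
                t -= lab
                clean = False
        if clean:
            return a, t
    return None                           # did not converge in the sweep budget

X = [(2.0, 2.0), (3.0, 1.0), (2.5, 3.0)]
Y = [(0.0, 0.0), (-1.0, 0.5), (0.5, -1.0)]
a, t = separate(X, Y)
print(all(sum(ai * pi for ai, pi in zip(a, p)) - t > 0 for p in X))  # True
print(all(sum(ai * pi for ai, pi in zip(a, p)) - t < 0 for p in Y))  # True
```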

SLIDE 51

Robust separation

[figure: the same two classes with a maximal-margin separating hyperplane]

SLIDE 52

Robust separation

Find a “maximal” separation:

max_{a : ‖a‖ ≤ 1} ( min_i aᵀX_i − max_j aᵀY_j )

equivalent to the convex problem:

max t1 − t2
aᵀX_i ≥ t1  ∀ i
aᵀY_j ≤ t2  ∀ j
‖a‖ ≤ 1