
slide-1
SLIDE 1

CS675: Convex and Combinatorial Optimization Fall 2019 Convex Functions

Instructor: Shaddin Dughmi

slide-2
SLIDE 2

Outline

1. Convex Functions
2. Examples of Convex and Concave Functions
3. Convexity-Preserving Operations

slide-7
SLIDE 7

Convex Functions

A function $f : \mathbb{R}^n \to \mathbb{R}$ is convex if the line segment between any two points on the graph of f lies on or above the graph. That is, if $x, y \in \mathbb{R}^n$ and $\theta \in [0, 1]$, then

$f(\theta x + (1 - \theta) y) \le \theta f(x) + (1 - \theta) f(y)$

  • This inequality is called Jensen’s inequality (basic form).
  • f is convex iff its restriction to any line $\{x + tv : t \in \mathbb{R}\}$ is convex.
  • f is strictly convex if the inequality is strict whenever $x \ne y$ and $\theta \in (0, 1)$.
  • The definition is analogous when the domain of f is a convex subset D of $\mathbb{R}^n$.
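The basic form of Jensen's inequality can be probed numerically. The sketch below is only an illustration: the test function (squared Euclidean norm), the helper names, and the sampling ranges are assumptions of this example, not part of the lecture. A negative gap on any segment disproves convexity; nonnegative gaps on random segments are evidence, not a proof.

```python
import random

def f(x):
    """Squared Euclidean norm, a standard convex function (example choice)."""
    return sum(xi * xi for xi in x)

def jensen_gap(f, x, y, theta):
    """theta*f(x) + (1-theta)*f(y) - f(theta*x + (1-theta)*y).

    For a convex f this gap is nonnegative for every theta in [0, 1].
    """
    z = [theta * xi + (1 - theta) * yi for xi, yi in zip(x, y)]
    return theta * f(x) + (1 - theta) * f(y) - f(z)

def min_gap_on_random_segments(f, dim=3, trials=1000, seed=0):
    """Smallest gap over random segments; a clearly negative value disproves convexity."""
    rng = random.Random(seed)
    smallest = float("inf")
    for _ in range(trials):
        x = [rng.uniform(-5, 5) for _ in range(dim)]
        y = [rng.uniform(-5, 5) for _ in range(dim)]
        smallest = min(smallest, jensen_gap(f, x, y, rng.random()))
    return smallest
```

Applying the same check to the negated function produces negative gaps, which matches the fact that the negated squared norm is concave rather than convex.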

slide-9
SLIDE 9

Concave and Affine Functions

A function $f : \mathbb{R}^n \to \mathbb{R}$ is concave if $-f$ is convex. Equivalently:

  • The line segment between any two points on the graph of f lies on or below the graph.
  • If $x, y \in \mathbb{R}^n$ and $\theta \in [0, 1]$, then $f(\theta x + (1 - \theta) y) \ge \theta f(x) + (1 - \theta) f(y)$.

$f : \mathbb{R}^n \to \mathbb{R}$ is affine if it is both concave and convex. Equivalently:

  • The line segment between any two points on the graph of f lies on the graph of f.
  • $f(x) = a^\top x + b$ for some $a \in \mathbb{R}^n$ and $b \in \mathbb{R}$.

slide-11
SLIDE 11

We will now look at some equivalent definitions of convex functions.

First Order Definition

A differentiable $f : \mathbb{R}^n \to \mathbb{R}$ is convex if and only if the first-order approximation centered at any point x underestimates f everywhere:

$f(y) \ge f(x) + \nabla f(x)^\top (y - x)$ for all $x, y \in \mathbb{R}^n$

  • Local information → global information.
  • If $\nabla f(x) = 0$ then x is a global minimizer of f.
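As a rough illustration of the first-order condition, the sketch below measures how far f sits above its tangent plane. The function (a sum of fourth powers) and all helper names are assumptions chosen for this example.

```python
import random

def f(x):
    """Sum of fourth powers: convex and differentiable (example choice)."""
    return sum(xi ** 4 for xi in x)

def grad_f(x):
    """Gradient of f, computed by hand: d/dx_i of x_i^4 is 4 x_i^3."""
    return [4 * xi ** 3 for xi in x]

def first_order_slack(x, y):
    """f(y) - [f(x) + grad_f(x)^T (y - x)]; nonnegative when f is convex."""
    g = grad_f(x)
    linear = f(x) + sum(gi * (yi - xi) for gi, xi, yi in zip(g, x, y))
    return f(y) - linear
```

At a point with zero gradient (here the origin) the slack is just f(y) - f(0), so the inequality says directly that the origin is a global minimizer.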

slide-13
SLIDE 13

Second Order Definition

A twice differentiable $f : \mathbb{R}^n \to \mathbb{R}$ is convex if and only if its Hessian matrix $\nabla^2 f(x)$ is positive semi-definite for all x. (We write $\nabla^2 f(x) \succeq 0$.)

Interpretation

  • Recall the definition of PSD: $z^\top \nabla^2 f(x) z \ge 0$ for all $z \in \mathbb{R}^n$.
  • When n = 1, this is $f''(x) \ge 0$.
  • More generally, $\frac{z^\top \nabla^2 f(x) z}{\|z\|^2}$ is the second derivative of f along the line $\{x + tz : t \in \mathbb{R}\}$, traversed at unit speed. So if $\nabla^2 f(x) \succeq 0$ then f curves upwards along any line.
  • Moving from x to $x + \delta z$, the infinitesimal change in the gradient is $\delta \nabla^2 f(x) z$. When $\nabla^2 f(x) \succeq 0$, this is in roughly the same direction as z (its inner product with z is nonnegative).
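The interpretation above can be checked on a small case. The sketch below assumes a hypothetical quadratic $f(x) = x_1^2 + x_1 x_2 + x_2^2$ with constant Hessian, compares a finite-difference second derivative along a unit direction against $z^\top H z / \|z\|^2$, and uses the trace/determinant test, which characterizes PSD only for symmetric 2x2 matrices.

```python
import math

H = [[2.0, 1.0], [1.0, 2.0]]  # Hessian of f below (constant, since f is quadratic)

def f(x):
    return x[0] ** 2 + x[0] * x[1] + x[1] ** 2

def quad_form(H, z):
    """z^T H z for a 2x2 matrix H."""
    return sum(z[i] * H[i][j] * z[j] for i in range(2) for j in range(2))

def curvature_along(x, z, h=1e-3):
    """Central-difference second derivative of t -> f(x + t*u), with u = z/||z||."""
    norm = math.hypot(z[0], z[1])
    u = [z[0] / norm, z[1] / norm]
    point = lambda t: [x[0] + t * u[0], x[1] + t * u[1]]
    return (f(point(h)) - 2 * f(x) + f(point(-h))) / (h * h)

def is_psd_2x2(H):
    """A symmetric 2x2 matrix is PSD iff its trace and determinant are both >= 0."""
    trace = H[0][0] + H[1][1]
    det = H[0][0] * H[1][1] - H[0][1] * H[1][0]
    return trace >= 0 and det >= 0
```

For larger matrices one would typically check eigenvalues or attempt a Cholesky factorization instead of the 2x2 shortcut used here.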

slide-15
SLIDE 15

Epigraph

The epigraph of f is the set of points above the graph of f. Formally, epi(f) = {(x, t) : t ≥ f(x)}

Epigraph Definition

f is a convex function if and only if its epigraph is a convex set.


slide-17
SLIDE 17

Jensen’s Inequality (General Form)

$f : \mathbb{R}^n \to \mathbb{R}$ is convex if and only if either of the following holds:

  • For every $x_1, \dots, x_k$ in the domain of f, and $\theta_1, \dots, \theta_k \ge 0$ such that $\sum_i \theta_i = 1$, we have

    $f\left(\sum_i \theta_i x_i\right) \le \sum_i \theta_i f(x_i)$

  • Given a probability measure D on the domain of f, and $x \sim D$,

    $f(\mathbb{E}[x]) \le \mathbb{E}[f(x)]$

Adding zero-mean noise to x can only increase f(x) in expectation.
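A small simulation can make the "adding noise" reading concrete. The sketch below is an assumption-laden example: it uses the exponential as the convex function, Gaussian noise around the point 1.0, and a fixed seed. Because the empirical distribution of the samples is itself a finite convex combination, the finite form of Jensen's inequality applies to it directly.

```python
import math
import random

def jensen_sides(f, samples):
    """Return (f(mean of samples), mean of f(samples)) for scalar samples."""
    mean_x = sum(samples) / len(samples)
    mean_fx = sum(f(s) for s in samples) / len(samples)
    return f(mean_x), mean_fx

rng = random.Random(0)  # fixed seed so the run is reproducible
noise = [rng.gauss(0.0, 1.0) for _ in range(10_000)]

# Left side: f at the average point. Right side: average of f over noisy points.
lhs, rhs = jensen_sides(math.exp, [1.0 + e for e in noise])
```

For a concave function such as the logarithm (on positive samples) the inequality would point the other way.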

slide-19
SLIDE 19

Local and Global Optimality

Local minimum

x is a local minimum of f if there is an open ball B containing x where $f(y) \ge f(x)$ for all $y \in B$.

Local and Global Optimality

When f is convex, x is a local minimum of f if and only if it is a global minimum. This fact underlies much of the tractability of convex optimization.

slide-22
SLIDE 22

Sub-level sets

[Figure: level sets of $f(x, y) = \sqrt{x^2 + y^2}$]

Sublevel set

The α-sublevel set of f is $\{x \in \mathrm{domain}(f) : f(x) \le \alpha\}$.

Fact

Every sub-level set of a convex function is a convex set.

  • This fact also underlies the tractability of convex optimization.
  • Note: the converse is false, but this is nevertheless a useful check, since a function with a non-convex sub-level set cannot be convex.
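To see why the converse fails, consider a function whose sub-level sets are all intervals but which is not convex. The choice $f(x) = \sqrt{|x|}$ below is an example picked for this sketch, not one taken from the slides.

```python
import math

def f(x):
    """sqrt(|x|): every sub-level set is an interval [-a^2, a^2], yet f is not convex."""
    return math.sqrt(abs(x))

def in_sublevel(x, alpha):
    """Membership test for the alpha-sublevel set of f."""
    return f(x) <= alpha

# Jensen's inequality fails at the midpoint of 0 and 4:
# f(2) is about 1.414, while the average of f(0) = 0 and f(4) = 2 is 1.
x, y = 0.0, 4.0
mid = 0.5 * (x + y)
violates_jensen = f(mid) > 0.5 * f(x) + 0.5 * f(y)
```

Functions with convex sub-level sets are usually called quasiconvex; the example shows that this class is strictly larger than the convex functions.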

slide-24
SLIDE 24

Other Basic Properties

Continuity

Real-valued convex functions are continuous on the interior of their domain.

Extended-value extension

If a function $f : D \to \mathbb{R}$ is convex on its domain, and $D \subseteq \mathbb{R}^n$ is convex, then it can be extended to a convex function on $\mathbb{R}^n$ by setting $f(x) = \infty$ whenever $x \notin D$.

  • This simplifies notation.
  • The resulting function $f : \mathbb{R}^n \to \mathbb{R} \cup \{\infty\}$ is “convex” with respect to the ordering on $\mathbb{R} \cup \{\infty\}$.

slide-25
SLIDE 25

Outline

1. Convex Functions
2. Examples of Convex and Concave Functions
3. Convexity-Preserving Operations

slide-26
SLIDE 26

Functions on the reals

  • Affine: $ax + b$
  • Exponential: $e^{ax}$ is convex for any $a \in \mathbb{R}$
  • Powers: $x^a$ is convex on $\mathbb{R}_{++}$ when $a \ge 1$ or $a \le 0$, and concave for $0 \le a \le 1$
  • Logarithm: $\log x$ is concave on $\mathbb{R}_{++}$

slide-28
SLIDE 28

Norms

Norms are convex:

$\|\theta x + (1 - \theta) y\| \le \|\theta x\| + \|(1 - \theta) y\| = \theta \|x\| + (1 - \theta) \|y\|$

  • Uses both norm axioms: the triangle inequality and homogeneity.
  • Applies to matrix norms, such as the spectral norm (radius of induced ellipsoid).

Max

$\max_i x_i$ is convex:

$$
\begin{aligned}
\max_i \, (\theta x + (1 - \theta) y)_i &= \max_i \, (\theta x_i + (1 - \theta) y_i) \\
&\le \max_i \theta x_i + \max_i (1 - \theta) y_i \\
&= \theta \max_i x_i + (1 - \theta) \max_i y_i
\end{aligned}
$$

If I'm allowed to pick the maximum entry of $\theta x$ and $(1 - \theta) y$ independently, I can only do better.

slide-30
SLIDE 30

  • Log-sum-exp: $\log(e^{x_1} + e^{x_2} + \dots + e^{x_n})$ is convex
  • Geometric mean: $\left(\prod_{i=1}^n x_i\right)^{1/n}$ is concave (on $\mathbb{R}^n_{++}$)
  • Log-determinant: $\log \det X$ is concave (on positive definite X)
  • Quadratic form: $x^\top A x$ is convex iff $A \succeq 0$
  • Other examples in book

[Figure: plot of $f(x, y) = \log(e^x + e^y)$]

Proving convexity often comes down to case-by-case reasoning, involving:

  • Definition: restrict to a line and check Jensen’s inequality
  • Write down the Hessian and prove it is PSD
  • Express as a combination of other convex functions through convexity-preserving operations (next)
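As an illustration of the "check Jensen's inequality" route, the sketch below evaluates log-sum-exp and its midpoint gap. Subtracting the maximum before exponentiating is a common numerical-stability trick added here as an assumption of the example; it is not discussed in the slides. The helper names are hypothetical.

```python
import math

def log_sum_exp(x):
    """log(sum_i exp(x_i)), computed by factoring out max(x) to avoid overflow."""
    m = max(x)
    return m + math.log(sum(math.exp(xi - m) for xi in x))

def midpoint_gap(x, y):
    """0.5*lse(x) + 0.5*lse(y) - lse((x+y)/2); nonnegative because lse is convex."""
    mid = [0.5 * (xi + yi) for xi, yi in zip(x, y)]
    return 0.5 * log_sum_exp(x) + 0.5 * log_sum_exp(y) - log_sum_exp(mid)
```

Log-sum-exp is sandwiched between max(x) and max(x) + log n, which is one reason it is often used as a smooth stand-in for the maximum.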

slide-31
SLIDE 31

Outline

1. Convex Functions
2. Examples of Convex and Concave Functions
3. Convexity-Preserving Operations

slide-33
SLIDE 33

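Nonnegative Weighted Combinations

If $f_1, f_2, \dots, f_k$ are convex, and $w_1, w_2, \dots, w_k \ge 0$, then $g = w_1 f_1 + w_2 f_2 + \dots + w_k f_k$ is convex.

proof (k = 2)

$$
\begin{aligned}
g\left(\frac{x + y}{2}\right) &= w_1 f_1\left(\frac{x + y}{2}\right) + w_2 f_2\left(\frac{x + y}{2}\right) \\
&\le w_1 \frac{f_1(x) + f_1(y)}{2} + w_2 \frac{f_2(x) + f_2(y)}{2} \\
&= \frac{g(x) + g(y)}{2}
\end{aligned}
$$

The same argument with weights $\theta$ and $1 - \theta$ in place of 1/2 gives the general inequality.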

slide-35
SLIDE 35

Nonnegative Weighted Combinations

The result for nonnegative weighted sums of convex functions extends to integrals $g(x) = \int_y w(y) f_y(x)$ with $w(y) \ge 0$, and therefore to expectations $\mathbb{E}_y f_y(x)$.

Worth Noting

Minimizing the expectation of a random convex cost function is also a convex optimization problem! A stochastic convex optimization problem is a convex optimization problem.

slide-37
SLIDE 37

Example: Stochastic Facility Location

Average Distance

  • k customers located at $y_1, y_2, \dots, y_k \in \mathbb{R}^n$
  • If I place a facility at $x \in \mathbb{R}^n$, the average distance to a customer is $g(x) = \sum_i \frac{1}{k} \|x - y_i\|$

Since the distance to any one customer is convex in x, so is the average distance. This extends to a probability measure over customers.
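Because the average distance is convex, simple local methods can minimize it. The sketch below uses a plain subgradient method with step size $1/\sqrt{k}$ and Euclidean distance; the method, step rule, iteration count, and function names are all assumptions of this illustration, not something the lecture prescribes.

```python
import math

def avg_distance(x, customers):
    """Average Euclidean distance from facility x to the customers."""
    return sum(math.dist(x, y) for y in customers) / len(customers)

def subgradient(x, customers):
    """One subgradient of avg_distance at x (zero contribution where x == y_i)."""
    g = [0.0] * len(x)
    for y in customers:
        d = math.dist(x, y)
        if d > 0:
            for j in range(len(x)):
                g[j] += (x[j] - y[j]) / (d * len(customers))
    return g

def minimize_avg_distance(customers, x0, steps=500):
    """Subgradient method with diminishing steps; returns the best iterate seen."""
    x = list(x0)
    best, best_val = list(x0), avg_distance(x0, customers)
    for k in range(1, steps + 1):
        g = subgradient(x, customers)
        x = [xj - gj / math.sqrt(k) for xj, gj in zip(x, g)]
        val = avg_distance(x, customers)
        if val < best_val:  # subgradient steps need not decrease g, so track the best
            best, best_val = list(x), val
    return best, best_val
```

With four customers at the corners of a square, symmetry puts the minimizer at the center, which gives a convenient sanity check for the routine.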

slide-38
SLIDE 38

Implication

Convex functions are a convex cone in the vector space of functions from $\mathbb{R}^n$ to $\mathbb{R}$. The set of convex functions is the solution set of an infinite family of homogeneous linear inequalities indexed by $x, y, \theta$:

$f(\theta x + (1 - \theta) y) - \theta f(x) - (1 - \theta) f(y) \le 0$

slide-42
SLIDE 42

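Composition with Affine Function

If $f : \mathbb{R}^n \to \mathbb{R}$ is convex, and $A \in \mathbb{R}^{n \times m}$, $b \in \mathbb{R}^n$, then $g(x) = f(Ax + b)$ is a convex function from $\mathbb{R}^m$ to $\mathbb{R}$.

Proof

  • $(x, t) \in \mathrm{graph}(g) \iff t = g(x) = f(Ax + b) \iff (Ax + b, t) \in \mathrm{graph}(f)$
  • $(x, t) \in \mathrm{epi}(g) \iff t \ge g(x) = f(Ax + b) \iff (Ax + b, t) \in \mathrm{epi}(f)$
  • epi(g) is the inverse image of epi(f) under the affine mapping $(x, t) \mapsto (Ax + b, t)$. The inverse image of a convex set under an affine mapping is convex, so epi(g) is convex.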

slide-43
SLIDE 43

Examples

  • $\|Ax + b\|$ is convex
  • $\max(Ax + b)$ is convex
  • $\log\left(e^{a_1^\top x + b_1} + e^{a_2^\top x + b_2} + \dots + e^{a_n^\top x + b_n}\right)$ is convex

slide-45
SLIDE 45

Maximum

If $f_1, f_2$ are convex, then $g(x) = \max\{f_1(x), f_2(x)\}$ is also convex.

  • Generalizes to the maximum of any number of functions, $\max_{i=1}^k f_i(x)$, and also to the supremum of an infinite set of functions, $\sup_y f_y(x)$.
  • $\mathrm{epi}\, g = \mathrm{epi}\, f_1 \cap \mathrm{epi}\, f_2$

slide-47
SLIDE 47

Example: Robust Facility Location

Maximum Distance

  • k customers located at $y_1, y_2, \dots, y_k \in \mathbb{R}^n$
  • If I place a facility at $x \in \mathbb{R}^n$, the maximum distance to a customer is $g(x) = \max_i \|x - y_i\|$

Since the distance to any one customer is convex in x, so is the worst-case distance.

slide-48
SLIDE 48

Example: Robust Facility Location

Worth Noting

When a convex cost function is uncertain, minimizing the worst-case cost is also a convex optimization problem! A robust (in the worst-case sense) convex optimization problem is a convex optimization problem.

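In one dimension the worst-case distance is a convex function of the facility position, so a bracketing search finds its minimum. The sketch below uses ternary search, which is a choice made for this example (the slides do not name an algorithm); the customer positions and function names are hypothetical.

```python
def worst_case_distance(x, customers):
    """Maximum distance from a facility at x to any customer on the real line."""
    return max(abs(x - y) for y in customers)

def ternary_search_min(g, lo, hi, iters=200):
    """Minimize a convex function g on [lo, hi] by repeatedly shrinking the bracket.

    Convexity guarantees that the discarded third never contains the only minimizer.
    """
    for _ in range(iters):
        m1 = lo + (hi - lo) / 3
        m2 = hi - (hi - lo) / 3
        if g(m1) <= g(m2):
            hi = m2  # a minimizer lies in [lo, m2]
        else:
            lo = m1  # a minimizer lies in [m1, hi]
    return 0.5 * (lo + hi)
```

On the line, the best worst-case location is the midpoint of the two extreme customers, which the search should recover.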

slide-49
SLIDE 49

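Other Examples

  • Maximum eigenvalue of a symmetric matrix A is convex in A: $\max \{v^\top A v : \|v\| = 1\}$
  • Sum of the k largest components of a vector x is convex in x: $\max \{\mathbf{1}_S \cdot x : |S| = k\}$

Each is a maximum of functions that are linear in the argument (A or x), so convexity follows from the maximum rule.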
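The second example can be verified by brute force on small vectors. The sketch below computes the sum of the k largest entries two ways: by sorting, and as the maximum of the linear functions $\mathbf{1}_S \cdot x$ over all index sets S of size k. Function names are assumptions of this example, and the subset enumeration is exponential, so it is only meant for tiny inputs.

```python
from itertools import combinations

def sum_k_largest(x, k):
    """Sum of the k largest components, computed by sorting."""
    return sum(sorted(x, reverse=True)[:k])

def sum_k_largest_as_max(x, k):
    """Same quantity as a maximum of linear functions 1_S . x over all |S| = k."""
    return max(sum(x[i] for i in S) for S in combinations(range(len(x)), k))
```

The second form is the one that exposes convexity: each fixed S gives a linear function of x, and a pointwise maximum of linear functions is convex.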

slide-51
SLIDE 51

Minimization

If f(x, y) is jointly convex in (x, y) and C is convex and nonempty, then $g(x) = \inf_{y \in C} f(x, y)$ is convex.

Proof (for $C = \mathbb{R}^k$)

epi g is the projection of epi f onto the hyperplane y = 0.

[Figure: $f(x, y) = x^2 + y^2$ and $g(x) = x^2$]
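A slightly less trivial worked case may help, since in the pictured example the minimizing y is always 0. The function below is chosen for this illustration and does not come from the slides; it is jointly convex because its Hessian has eigenvalues 1 and 3.

```latex
\begin{aligned}
f(x, y) &= x^2 + xy + y^2,
\qquad
\nabla^2 f = \begin{pmatrix} 2 & 1 \\ 1 & 2 \end{pmatrix} \succeq 0 \\
\frac{\partial f}{\partial y} &= x + 2y = 0
\;\Longrightarrow\; y^\star(x) = -\frac{x}{2} \\
g(x) &= \inf_{y \in \mathbb{R}} f(x, y)
      = x^2 - \frac{x^2}{2} + \frac{x^2}{4}
      = \frac{3}{4} x^2
\end{aligned}
```

The partial minimum $g(x) = \tfrac{3}{4} x^2$ is again convex, as the rule predicts.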

slide-52
SLIDE 52

Example

Distance from a convex set C: $f(x) = \inf_{y \in C} \|x - y\|$
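For some convex sets the infimum has a closed form. The sketch below assumes C is an axis-aligned box and the norm is Euclidean, in which case the nearest point is obtained by clamping each coordinate; both the choice of set and the function names are assumptions made for this example.

```python
import math

def project_onto_box(x, lo, hi):
    """Nearest point to x in the box {y : lo_i <= y_i <= hi_i}, by clamping."""
    return [min(max(xi, l), h) for xi, l, h in zip(x, lo, hi)]

def dist_to_box(x, lo, hi):
    """Euclidean distance from x to the box; zero when x is inside."""
    return math.dist(x, project_onto_box(x, lo, hi))
```

Points on opposite sides of the box illustrate convexity: their midpoint can lie inside the box, where the distance drops to zero.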

slide-53
SLIDE 53

Composition Rules

If $g : \mathbb{R}^n \to \mathbb{R}^k$ and $h : \mathbb{R}^k \to \mathbb{R}$, then $f = h \circ g$ is convex if either

  • the $g_i$ are convex, and h is convex and nondecreasing in each argument, or
  • the $g_i$ are concave, and h is convex and nonincreasing in each argument.

Proof (n = k = 1)

$f''(x) = h''(g(x)) \, g'(x)^2 + h'(g(x)) \, g''(x)$
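The sign bookkeeping behind the one-dimensional proof can be spelled out as follows, assuming g and h are twice differentiable. The closing example $f(x) = e^{x^2}$ is an illustration chosen here, not one from the slides.

```latex
\begin{aligned}
f''(x) &= \underbrace{h''(g(x))}_{\ge 0 \ (h \text{ convex})} \, \underbrace{g'(x)^2}_{\ge 0}
        \;+\; h'(g(x)) \, g''(x) \\[4pt]
\text{Case 1: } & h' \ge 0,\ g'' \ge 0 \;\Longrightarrow\; h'(g(x)) \, g''(x) \ge 0 \\
\text{Case 2: } & h' \le 0,\ g'' \le 0 \;\Longrightarrow\; h'(g(x)) \, g''(x) \ge 0 \\[4pt]
\text{Example: } & h(u) = e^u,\ g(x) = x^2:\quad
f''(x) = e^{x^2} (2x)^2 + e^{x^2} \cdot 2 = (4x^2 + 2) \, e^{x^2} > 0
\end{aligned}
```

In both cases every term of $f''$ is nonnegative, so f is convex.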

slide-54
SLIDE 54

Perspective

If f is convex then $g(x, t) = t f(x/t)$ is also convex (on $t > 0$).

Proof

epi g is the inverse image of epi f under the perspective function.
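A standard instance is the quadratic-over-linear function: the perspective of $f(u) = u^2$ is $g(x, t) = x^2 / t$. The sketch below builds the perspective of a scalar function and can be used to spot-check convexity numerically; the helper name `perspective` and the restriction to scalar x are assumptions of this example.

```python
import random

def perspective(f):
    """Return g(x, t) = t * f(x / t), defined here only for t > 0."""
    def g(x, t):
        if t <= 0:
            raise ValueError("perspective is defined for t > 0")
        return t * f(x / t)
    return g

# Perspective of u -> u^2, i.e. g(x, t) = x**2 / t.
quad_over_lin = perspective(lambda u: u * u)
```

Applying the same helper to the negative logarithm gives $t \log(t / x)$, the building block of relative entropy, which is another commonly cited perspective function.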