SLIDE 1


Introduction to Convex Optimization

Xuezhi Wang

Computer Science Department, Carnegie Mellon University

10-701 recitation, Jan 29


SLIDE 2


Outline

1. Convexity: Convex Sets, Convex Functions

2. Unconstrained Convex Optimization: First-order Methods, Newton’s Method

3. Constrained Optimization: Primal and dual problems, KKT conditions



SLIDE 4


Convex Sets

Definition: a set X is convex if for all x, x′ ∈ X and λ ∈ [0, 1], it follows that λx + (1 − λ)x′ ∈ X.

Examples:

  • Empty set ∅, single point {x0}, the whole space Rn
  • Hyperplanes {x | a⊤x = b}, halfspaces {x | a⊤x ≤ b}
  • Euclidean balls {x | ||x − xc||2 ≤ r}
  • Positive semidefinite matrices: Sn+ = {A ∈ Sn | A ⪰ 0}, where Sn is the set of symmetric n × n matrices
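A quick numeric sanity check (my own sketch, not from the slides) that convex combinations of two points in a Euclidean ball stay in the ball; the center, radius, and sample points below are arbitrary choices:

```python
import numpy as np

# Hypothetical example: a Euclidean ball {x : ||x - xc||_2 <= r} is convex,
# so every convex combination of two of its points stays inside it.
xc, r = np.zeros(3), 1.0                    # center and radius (chosen here)

def in_ball(x):
    return np.linalg.norm(x - xc) <= r + 1e-12

rng = np.random.default_rng(0)
x = rng.normal(size=3); x *= 0.9 * r / np.linalg.norm(x)   # a point in the ball
y = rng.normal(size=3); y *= 0.5 * r / np.linalg.norm(y)   # another point

for lam in np.linspace(0.0, 1.0, 11):
    z = lam * x + (1 - lam) * y             # convex combination
    assert in_ball(z)
```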


SLIDE 5


Convexity Preserving Set Operations

Given convex sets C, D, the following are convex:

  • Translation {x + b | x ∈ C}
  • Scaling {λx | x ∈ C}
  • Affine image {Ax + b | x ∈ C}
  • Intersection C ∩ D
  • Set sum C + D = {x + y | x ∈ C, y ∈ D}



SLIDE 7


Convex Functions

A function f is convex if dom f is convex and for all x, y ∈ dom f and λ ∈ [0, 1]:

λf(x) + (1 − λ)f(y) ≥ f(λx + (1 − λ)y)

  • First-order condition: if f is differentiable, f(y) ≥ f(x) + ∇f(x)⊤(y − x)
  • Second-order condition: if f is twice differentiable, ∇²f(x) ⪰ 0
  • Strictly convex: ∇²f(x) ≻ 0
  • Strongly convex: ∇²f(x) ⪰ dI with d > 0
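To make the first-order condition concrete, here is a small numeric check (my example, not from the slides) on f(x) = x², whose tangent lines must everywhere lie below the graph:

```python
import numpy as np

# First-order condition for convex f: f(y) >= f(x) + f'(x)(y - x).
# For f(x) = x^2 the gap is f(y) - f(x) - 2x(y - x) = (y - x)^2 >= 0.
f = lambda x: x**2
grad = lambda x: 2 * x

xs = np.linspace(-3, 3, 13)
gap = min(f(y) - (f(x) + grad(x) * (y - x)) for x in xs for y in xs)
assert gap >= -1e-9   # tangent never exceeds the function
```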


SLIDE 8


Convex Functions

  • Sublevel sets of a convex function are convex: if X = {x | f(x) ≤ c}, then f(λx + (1 − λ)y) ≤ λf(x) + (1 − λ)f(y) ≤ c, hence λx + (1 − λ)y ∈ X for x, y ∈ X
  • Convex functions don’t have non-global local minima. Proof by contradiction: linear interpolation toward a lower point breaks the local-minimum condition
  • Convex hull: Conv(X) = {x̄ | x̄ = ∑i αixi where αi ≥ 0 and ∑i αi = 1}. The convex hull of any set is a convex set


SLIDE 9


Convex Functions: Examples

  • Exponential: eax is convex on R for any a ∈ R
  • Powers: xa is convex on R++ when a ≥ 1 or a ≤ 0, and concave for 0 ≤ a ≤ 1
  • Powers of absolute value: |x|p for p ≥ 1 is convex on R
  • Logarithm: log x is concave on R++
  • Norms: every norm on Rn is convex
  • Max: f(x) = max{x1, . . . , xn} is convex on Rn
  • Log-sum-exp: f(x) = log(ex1 + · · · + exn) is convex on Rn
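As a small verification (my own sketch, not from the slides), the convexity inequality for log-sum-exp can be checked numerically on random points:

```python
import numpy as np

def logsumexp(x):
    # Numerically stable log(sum(exp(x))): shift by the max first.
    m = x.max()
    return m + np.log(np.exp(x - m).sum())

rng = np.random.default_rng(1)
x, y = rng.normal(size=4), rng.normal(size=4)
for lam in np.linspace(0.0, 1.0, 11):
    lhs = logsumexp(lam * x + (1 - lam) * y)
    rhs = lam * logsumexp(x) + (1 - lam) * logsumexp(y)
    assert lhs <= rhs + 1e-9   # convexity inequality
```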


SLIDE 10


Convexity Preserving Function Operations

Given convex functions f(x), g(x), the following are convex:

  • Nonnegative weighted sum: af(x) + bg(x) for a, b ≥ 0
  • Pointwise maximum: f(x) = max{f1(x), . . . , fm(x)}
  • Composition with an affine function: f(Ax + b)
  • Composition with a nondecreasing convex g: g(f(x))
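The pointwise-maximum rule can be checked numerically; the three component functions below are my own choices, not from the slides:

```python
import numpy as np

# f is the pointwise max of three convex functions, hence convex itself.
f = lambda x: max(x**2, np.exp(x) - 1, -2 * x)

xs = np.linspace(-2, 2, 41)
for x in xs:
    for y in xs:
        for lam in (0.25, 0.5, 0.75):
            z = lam * x + (1 - lam) * y
            assert f(z) <= lam * f(x) + (1 - lam) * f(y) + 1e-9
```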



SLIDE 12


Gradient Descent

Given a starting point x ∈ dom f, repeat:

  • 1. ∆x := −∇f(x)
  • 2. Choose step size t via exact or backtracking line search.
  • 3. Update: x := x + t∆x.

until the stopping criterion is satisfied.

Key idea

  • The negative gradient −∇f(x) is a descent direction
  • Locally, the gradient gives a good first-order approximation of the objective function

Gradient descent with line search

  • Get a descent direction
  • Do an unconstrained line search along it
  • Exponential convergence for strongly convex objectives
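A minimal sketch of this loop in code (my implementation; the Armijo parameters alpha and beta and the quadratic test problem are assumptions, not from the slides):

```python
import numpy as np

def gradient_descent(f, grad, x0, alpha=0.3, beta=0.8, tol=1e-8, max_iter=500):
    """Gradient descent with backtracking (Armijo) line search."""
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        g = grad(x)
        if np.linalg.norm(g) < tol:        # stopping criterion
            break
        dx = -g                            # 1. descent direction
        t = 1.0
        while f(x + t * dx) > f(x) + alpha * t * (g @ dx):
            t *= beta                      # 2. backtracking line search
        x = x + t * dx                     # 3. update
    return x

# hypothetical test problem: quadratic with minimum at (1, -2)
f = lambda x: (x[0] - 1)**2 + 5 * (x[1] + 2)**2
grad = lambda x: np.array([2 * (x[0] - 1), 10 * (x[1] + 2)])
x_star = gradient_descent(f, grad, np.zeros(2))
```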


SLIDE 13


Convergence Analysis

  • Assume ∇f is L-Lipschitz continuous. Then gradient descent with fixed step size t ≤ 1/L has convergence rate O(1/k), i.e., to get f(x(k)) − f(x∗) ≤ ε we need O(1/ε) iterations.
  • Assume in addition that strong convexity holds for f, i.e., ∇²f(x) ⪰ dI with d > 0. Then gradient descent with fixed step size t ≤ 2/(d + L) has convergence rate O(ck) for some c ∈ (0, 1), i.e., to get f(x(k)) − f(x∗) ≤ ε we need O(log(1/ε)) iterations.
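The geometric O(cᵏ) rate is easy to observe (my construction, not from the slides): for a quadratic with Hessian eigenvalues d and L, the fixed step t = 2/(d + L) contracts the error by exactly c = (L − d)/(L + d) per iteration:

```python
import numpy as np

# f(x) = 0.5 x^T A x with Hessian eigenvalues d and L (chosen here).
d, L = 1.0, 10.0
A = np.diag([d, L])
t = 2 / (d + L)
c = (L - d) / (L + d)              # contraction factor per step

x = np.array([1.0, 1.0])
errs = []
for _ in range(20):
    errs.append(np.linalg.norm(x))  # distance to minimizer x* = 0
    x = x - t * (A @ x)             # fixed-step gradient step
ratios = [errs[k + 1] / errs[k] for k in range(len(errs) - 1)]
assert all(abs(r - c) < 1e-9 for r in ratios)
```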



SLIDE 15


Newton’s method

  • Convex objective function f with positive semidefinite Hessian: ∇²f(x) ⪰ 0
  • Taylor expansion: f(x + δ) = f(x) + δ⊤∇f(x) + ½ δ⊤∇²f(x) δ + O(||δ||³)
  • Minimize the quadratic approximation and iterate until converged: x ← x − [∇²f(x)]⁻¹∇f(x)
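The update can be sketched in one dimension (my example; f(x) = cosh(x), minimized at x = 0, is an arbitrary smooth convex test function, not from the slides):

```python
import numpy as np

def newton(grad, hess, x0, tol=1e-12, max_iter=50):
    """1-D Newton iteration: x <- x - f''(x)^{-1} f'(x)."""
    x = x0
    for _ in range(max_iter):
        step = grad(x) / hess(x)
        x -= step
        if abs(step) < tol:
            break
    return x

# f(x) = cosh(x): f'(x) = sinh(x), f''(x) = cosh(x), minimum at x* = 0
x_min = newton(np.sinh, np.cosh, x0=1.0)
```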


SLIDE 16


Convergence Analysis

Two convergence regimes:

  • As slow as gradient descent outside the region where the Taylor expansion is good: ||∇f(x∗) − ∇f(x) − ∇²f(x)(x∗ − x)|| ≤ γ||x∗ − x||²
  • Quadratic convergence once the bound holds: ||xn+1 − x∗|| ≤ γ||[∇²f(xn)]⁻¹|| ||xn − x∗||²
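The quadratic regime can be observed numerically (my example, not from the slides): Newton on f(x) = eˣ − x, minimized at x∗ = 0, roughly squares the error at each step:

```python
import numpy as np

# Newton update for f(x) = e^x - x:  x <- x - f'(x)/f''(x) = x - 1 + e^{-x}.
x, errs = 1.0, []
for _ in range(5):
    errs.append(abs(x))                   # error |x_n - x*| with x* = 0
    x = x - (np.exp(x) - 1) / np.exp(x)
# quadratic convergence: e_{n+1} / e_n^2 stays bounded (here near 0.5)
ratios = [errs[k + 1] / errs[k]**2 for k in range(len(errs) - 1)]
assert all(0.3 < r < 0.6 for r in ratios)
```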



SLIDE 18


Constrained Optimization

Primal problem:

minimize f(x) over x ∈ Rn
subject to hi(x) ≤ 0, i = 1, . . . , m
           lj(x) = 0, j = 1, . . . , r

Lagrangian:

L(x, u, v) = f(x) + ∑i=1…m ui hi(x) + ∑j=1…r vj lj(x)

where u ∈ Rm, v ∈ Rr, and u ≥ 0.

Lagrange dual function: g(u, v) = minₓ L(x, u, v), minimizing over x ∈ Rn.
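A worked instance of these definitions (my toy problem, not from the slides): for min x² subject to h(x) = 1 − x ≤ 0, minimizing the Lagrangian x² + u(1 − x) at x = u/2 gives g(u) = u − u²/4, which a brute-force minimization reproduces:

```python
import numpy as np

def g(u):
    # Lagrange dual: minimize L(x, u) = x^2 + u*(1 - x) over x (grid search).
    xs = np.linspace(-5, 5, 100001)
    return np.min(xs**2 + u * (1 - xs))

for u in [0.0, 1.0, 2.0, 3.0]:
    assert abs(g(u) - (u - u**2 / 4)) < 1e-6   # matches closed form
```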


SLIDE 19


Constrained Optimization

Dual problem:

maximize g(u, v) over u, v, subject to u ≥ 0

  • The dual problem is always a convex optimization problem, since g is always concave (even if the primal problem is not convex)
  • The primal and dual optimal values always satisfy weak duality: f∗ ≥ g∗
  • Slater’s condition: for a convex primal, if there is an x such that h1(x) < 0, . . . , hm(x) < 0 and l1(x) = 0, . . . , lr(x) = 0, then strong duality holds: f∗ = g∗
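A toy check of weak and strong duality (my example, not from the slides): for min x² subject to x ≥ 1, the primal optimum is f∗ = 1 at x = 1, and the dual g(u) = u − u²/4 (from minimizing the Lagrangian x² + u(1 − x) at x = u/2) peaks at u∗ = 2 with g∗ = 1; Slater's condition holds since x = 2 is strictly feasible:

```python
import numpy as np

f_star = 1.0                           # primal optimum, attained at x = 1
us = np.linspace(0.0, 10.0, 100001)
g_star = np.max(us - us**2 / 4)        # dual optimum over u >= 0

assert f_star >= g_star - 1e-9         # weak duality: f* >= g*
assert abs(f_star - g_star) < 1e-6     # strong duality (Slater holds here)
```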



SLIDE 21


KKT conditions

If x∗, u∗, v∗ are primal and dual solutions with zero duality gap (strong duality holds), then x∗, u∗, v∗ satisfy the KKT conditions:

  • Stationarity: 0 ∈ ∂f(x∗) + ∑i ui∗ ∂hi(x∗) + ∑j vj∗ ∂lj(x∗)
  • Complementary slackness: ui∗ hi(x∗) = 0 for all i
  • Primal feasibility: hi(x∗) ≤ 0 and lj(x∗) = 0 for all i, j
  • Dual feasibility: ui∗ ≥ 0 for all i

Proof:

f(x∗) = g(u∗, v∗)
      = minₓ [ f(x) + ∑i ui∗ hi(x) + ∑j vj∗ lj(x) ]
      ≤ f(x∗) + ∑i ui∗ hi(x∗) + ∑j vj∗ lj(x∗)
      ≤ f(x∗)

Hence all these inequalities are actually equalities: x∗ minimizes the Lagrangian (giving stationarity), and ∑i ui∗ hi(x∗) = 0 with ui∗ ≥ 0 and hi(x∗) ≤ 0 gives complementary slackness.
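The four conditions can be verified directly on a toy problem (my example, not from the slides): min x² subject to h(x) = 1 − x ≤ 0, with primal solution x∗ = 1 and dual solution u∗ = 2:

```python
# KKT check for: minimize x^2 subject to h(x) = 1 - x <= 0.
x_star, u_star = 1.0, 2.0

grad_f = 2 * x_star   # f'(x*) for f(x) = x^2
grad_h = -1.0         # h'(x*) for h(x) = 1 - x

assert abs(grad_f + u_star * grad_h) < 1e-12   # stationarity
assert abs(u_star * (1 - x_star)) < 1e-12      # complementary slackness
assert 1 - x_star <= 0                         # primal feasibility
assert u_star >= 0                             # dual feasibility
```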


SLIDE 22

Appendix: For Further Reading

S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004.
