The How of (Sub)Gradient



slide-1
SLIDE 1

The How of (Sub)Gradient

Note: The subdifferential is an intersection of infinitely many half-spaces and is therefore convex and closed.

August 31, 2018 82 / 402

slide-2
SLIDE 2

The How of (Sub)Gradient

Note: The subdifferential is an intersection of infinitely many half-spaces and is therefore a closed convex set, even if $f$ is NOT convex.
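The half-space claim can be read off the definition of the subdifferential. The block below writes out the standard definition; the symbol $h$ for a candidate subgradient is a notational choice made here, not taken from the slides.

```latex
\partial f(x)
  \;=\; \{\, h : f(y) \ge f(x) + h^{\top}(y - x) \ \ \forall\, y \in \operatorname{dom} f \,\}
  \;=\; \bigcap_{y \in \operatorname{dom} f} \{\, h : h^{\top}(y - x) \le f(y) - f(x) \,\}
```

For each fixed $y$, the set in the intersection is a closed half-space in $h$ (or all of $\Re^n$ when $y = x$). An intersection of closed convex sets is closed and convex, and nothing in this argument uses convexity of $f$.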

slide-3
SLIDE 3

First peek into subgradient calculus: Function Convexity First

The following functions are convex, but may not be differentiable everywhere. How does one compute their subgradients at points of non-differentiability?

Pointwise maximum: If $f_1, f_2, \ldots, f_m$ are convex, then $f(x) = \max\{f_1(x), f_2(x), \ldots, f_m(x)\}$ is convex.

In Quiz 1, problem 1: $m = 2$, $f_1(x) = \|x\|_1$, $f_2(x) = \|x\|_\infty$.

slide-4
SLIDE 4

First peek into subgradient calculus: Function Convexity First

Pointwise maximum: If $f_1, f_2, \ldots, f_m$ are convex, then $f(x) = \max\{f_1(x), f_2(x), \ldots, f_m(x)\}$ is also convex (invoking convexity of $f_1, \ldots, f_m$). For example:

▶ Sum of $r$ largest components of $x \in \Re^n$: $f(x) = x_{[1]} + x_{[2]} + \ldots + x_{[r]}$, where $x_{[i]}$ is the $i$-th largest component of $x$, is a convex function.

Proof: Either from first principles, or by inspecting the intersection of the epigraphs of $f_1, \ldots, f_m$.

Will our proof of convexity hold for an infinite (possibly even uncountable) number of indices $i$ (which took a finite set of values $1, \ldots, m$ above)? ANS: Yes!
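The sum of the $r$ largest components fits the pointwise-maximum rule because it equals the maximum of $x_{i_1} + \ldots + x_{i_r}$ over all choices of $r$ distinct indices, and each such sum is linear in $x$. The sketch below checks this numerically on random points; the function names are illustrative, and a random test like this only fails to find a counterexample, it does not prove convexity.

```python
import itertools
import random


def sum_r_largest(x, r):
    """Sum of the r largest components of x."""
    return sum(sorted(x, reverse=True)[:r])


def sum_r_largest_as_max(x, r):
    """Same value, written as a pointwise maximum of linear functions:
    one linear function per subset of r distinct indices."""
    return max(sum(x[i] for i in idx)
               for idx in itertools.combinations(range(len(x)), r))


def jensen_gap(f, x, y, t):
    """t*f(x) + (1-t)*f(y) - f(t*x + (1-t)*y); nonnegative when f is convex."""
    z = [t * a + (1 - t) * b for a, b in zip(x, y)]
    return t * f(x) + (1 - t) * f(y) - f(z)


random.seed(0)
n, r = 5, 2
worst_gap = float("inf")
for _ in range(1000):
    x = [random.uniform(-1, 1) for _ in range(n)]
    y = [random.uniform(-1, 1) for _ in range(n)]
    t = random.random()
    # The two formulations agree up to rounding.
    assert abs(sum_r_largest(x, r) - sum_r_largest_as_max(x, r)) < 1e-12
    worst_gap = min(worst_gap, jensen_gap(lambda v: sum_r_largest(v, r), x, y, t))

print("smallest Jensen gap observed:", worst_gap)
```

The subset enumeration is exponential in general and is used here only to mirror the max-of-linear-functions argument; sorting is the practical way to evaluate the function.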

slide-5
SLIDE 5

First peek into subgradient calculus: Function Convexity First

Pointwise maximum: If $f_1, f_2, \ldots, f_m$ are convex, then $f(x) = \max\{f_1(x), f_2(x), \ldots, f_m(x)\}$ is also convex. For example:

▶ Sum of $r$ largest components of $x \in \Re^n$: $f(x) = x_{[1]} + x_{[2]} + \ldots + x_{[r]}$, where $x_{[i]}$ is the $i$-th largest component of $x$, is a convex function.

Pointwise supremum: If $f(x, y)$ is convex in $x$ for every $y \in S$, then $g(x) = \sup_{y \in S} f(x, y)$ is convex. Here $S$ is a set of possibly infinitely many indices.

The proof is similar to that on the board: the RHS will have sup over $y$ instead of max over $i$. Similarly, the LHS will also have sup over $y$ instead of max over $i$.

slide-6
SLIDE 6

First peek into subgradient calculus: Function Convexity First

Pointwise supremum: If $f(x, y)$ is convex in $x$ for every $y \in S$, then $g(x) = \sup_{y \in S} f(x, y)$ is convex. For example:

▶ The function that returns the maximum eigenvalue of a symmetric matrix $X$, viz., $\lambda_{\max}(X) = \sup_{y \in S} \frac{\|Xy\|_2}{\|y\|_2}$, is a convex function obtained as the supremum, over an infinite number of $y$ with $\|y\|_2 = 1$, of the function $\|Xy\|_2$.
slide-7
SLIDE 7

First peek into subgradient calculus: Function Convexity First

Pointwise supremum: If $f(x, y)$ is convex in $x$ for every $y \in S$, then $g(x) = \sup_{y \in S} f(x, y)$ is convex. For example:

▶ The function that returns the maximum eigenvalue of a symmetric matrix $X$, viz., $\lambda_{\max}(X) = \sup_{y \in S} \frac{\|Xy\|_2}{\|y\|_2}$, is a convex function of the symmetric matrix $X$.

If $X$ is symmetric, the max eigenvalue of $X^T X$ is the square of the largest eigenvalue of $X$ in absolute value. The supremum above therefore equals $\max_i |\lambda_i(X)|$, which coincides with $\lambda_{\max}(X)$ when $X$ is positive semidefinite.
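A small numerical sketch of this example, restricted to $2 \times 2$ symmetric matrices $[[a, b], [b, c]]$ so that eigenvalues have a closed form and no external library is needed. The function names and the angle-sampling scheme are assumptions made for illustration. It compares a sampled supremum of $\|Xy\|_2$ over unit vectors with $\max_i |\lambda_i(X)|$ and checks the convexity inequality on random pairs.

```python
import math
import random


def eigenvalues_sym2(a, b, c):
    """Eigenvalues of the symmetric matrix [[a, b], [b, c]] in closed form."""
    mean = (a + c) / 2.0
    radius = math.hypot((a - c) / 2.0, b)
    return mean - radius, mean + radius


def sup_ratio(a, b, c, samples=2000):
    """Approximate sup of ||Xy||_2 over unit vectors y = (cos t, sin t)."""
    best = 0.0
    for k in range(samples):
        t = math.pi * k / samples  # y and -y give the same norm: half a circle suffices
        y0, y1 = math.cos(t), math.sin(t)
        best = max(best, math.hypot(a * y0 + b * y1, b * y0 + c * y1))
    return best


def spectral_norm(a, b, c):
    """Largest eigenvalue in absolute value of [[a, b], [b, c]]."""
    lo, hi = eigenvalues_sym2(a, b, c)
    return max(abs(lo), abs(hi))


random.seed(1)
max_sup_error = 0.0
min_jensen_gap = float("inf")
for _ in range(200):
    X = [random.uniform(-2, 2) for _ in range(3)]  # entries (a, b, c)
    Y = [random.uniform(-2, 2) for _ in range(3)]
    t = random.random()
    Z = [t * p + (1 - t) * q for p, q in zip(X, Y)]
    max_sup_error = max(max_sup_error, abs(sup_ratio(*X) - spectral_norm(*X)))
    gap = t * spectral_norm(*X) + (1 - t) * spectral_norm(*Y) - spectral_norm(*Z)
    min_jensen_gap = min(min_jensen_gap, gap)

print("largest sampling error:", max_sup_error)
print("smallest Jensen gap:", min_jensen_gap)
```

For the diagonal matrix with entries $2$ and $-3$, `spectral_norm(2, 0, -3)` is $3$ while the largest eigenvalue is $2$, which illustrates why the two quantities agree only when the dominant eigenvalue is nonnegative.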

slide-8
SLIDE 8

Basic Subgradient Calculus: Illustration for Pointwise Maximum

Finite pointwise maximum: if $f(x) = \max_{i=1,\ldots,m} f_i(x)$, then $\partial f(x)$ is the convex hull of the subdifferentials $\partial f_i(x)$ for all $i$ such that $f(x) = f_i(x)$. This convex hull includes the union of those subdifferentials.

In particular, at points $x$ where there is a unique (unambiguous) maximizer $f_i$, the subdifferential of $f$ is the subdifferential of that unique maximizer: $\partial f(x) = \partial f_i(x)$.

slide-9
SLIDE 9

Basic Subgradient Calculus: Illustration for Pointwise Maximum

Finite pointwise maximum: if $f(x) = \max_{i=1,\ldots,m} f_i(x)$, then $\partial f(x) = \mathrm{conv}\left( \bigcup_{i : f_i(x) = f(x)} \partial f_i(x) \right)$, which is the convex hull of the union of subdifferentials of all active functions at $x$.

General pointwise maximum: if $f(x) = \max_{s \in S} f_s(x)$, then under some regularity conditions (on $S$, $f_s$), $\partial f(x)$ is the closure of the convex hull of the union of subdifferentials of the active functions. The closure is an additional operation that ensures the subdifferential is closed.

slide-10
SLIDE 10

Basic Subgradient Calculus: Illustration for Pointwise Maximum

Finite pointwise maximum: if $f(x) = \max_{i=1,\ldots,m} f_i(x)$, then $\partial f(x) = \mathrm{conv}\left( \bigcup_{i : f_i(x) = f(x)} \partial f_i(x) \right)$, which is the convex hull of the union of subdifferentials of all active functions at $x$.

General pointwise maximum: if $f(x) = \max_{s \in S} f_s(x)$, then under some regularity conditions (on $S$, $f_s$), $\partial f(x) = \mathrm{cl}\left\{ \mathrm{conv}\left( \bigcup_{s : f_s(x) = f(x)} \partial f_s(x) \right) \right\}$.
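The finite rule can be illustrated with a maximum of affine pieces $f_i(x) = a_i^T x + b_i$, where each $\partial f_i(x)$ is the single vector $a_i$. The sketch below is an assumption-laden illustration (the helper names, the tolerance `tol`, and the `weights` parameter are all hypothetical): it forms one element of the convex hull of the active gradients and checks the subgradient inequality $f(y) \ge f(x) + g^T (y - x)$ at random points.

```python
import random


def affine_values(A, b, x):
    """Values a_i^T x + b_i of all affine pieces at x."""
    return [sum(aij * xj for aij, xj in zip(ai, x)) + bi for ai, bi in zip(A, b)]


def f_max(A, b, x):
    """Pointwise maximum of the affine pieces."""
    return max(affine_values(A, b, x))


def active_indices(A, b, x, tol=1e-9):
    """Indices i with f_i(x) = f(x), up to the tolerance tol."""
    vals = affine_values(A, b, x)
    top = max(vals)
    return [i for i, v in enumerate(vals) if top - v <= tol]


def subgradient(A, b, x, weights=None):
    """A convex combination of the gradients a_i of the active pieces.
    With weights=None the active gradients are averaged."""
    act = active_indices(A, b, x)
    if weights is None:
        weights = [1.0 / len(act)] * len(act)
    return [sum(w * A[i][j] for w, i in zip(weights, act)) for j in range(len(x))]


# f(x) = max(x1, x2, -x1 - x2): all three pieces are active at the origin.
A = [[1.0, 0.0], [0.0, 1.0], [-1.0, -1.0]]
b = [0.0, 0.0, 0.0]
x0 = [0.0, 0.0]
g = subgradient(A, b, x0)

random.seed(2)
violations = 0
for _ in range(1000):
    y = [random.uniform(-3, 3) for _ in range(2)]
    lower = f_max(A, b, x0) + sum(gj * (yj - xj) for gj, yj, xj in zip(g, y, x0))
    if f_max(A, b, y) < lower - 1e-12:
        violations += 1

print("active pieces at x0:", active_indices(A, b, x0))
print("violations of the subgradient inequality:", violations)
```

At a point with a unique maximizer, such as `[2.0, 1.0]` here, only one piece is active and the function returns that piece's gradient, matching the special case described on the slides.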

slide-11
SLIDE 11

Subgradient of $\|x\|_1$

Assume $x \in \Re^n$. Then $\|x\|_1$ is the max over $2^n$ functions, each corresponding to $s^T x$.

slide-12
SLIDE 12

Subgradient of $\|x\|_1$

Assume $x \in \Re^n$. Then $\|x\|_1 = \max_{s \in \{-1,+1\}^n} x^T s$, which is a pointwise maximum of $2^n$ functions.

Let $S^* \subseteq \{-1,+1\}^n$ be the set of $s$ such that for each $s \in S^*$, the value of $x^T s$ is the same max value. Thus, $\partial \|x\|_1 = \mathrm{conv}\left( \bigcup_{s \in S^*} s \right)$.
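A brute-force sketch of this construction for small $n$: it enumerates the sign vectors, keeps those attaining the maximum (the set $S^*$), and compares with the coordinate-wise description that follows from taking the convex hull, namely $\mathrm{sign}(x_i)$ where $x_i \ne 0$ and any value in $[-1, 1]$ where $x_i = 0$. The function names and the `fill` parameter are hypothetical and chosen for illustration.

```python
import itertools
import random


def l1_norm(x):
    return sum(abs(v) for v in x)


def maximizing_signs(x):
    """All s in {-1, +1}^n with x^T s equal to ||x||_1 (the set S*)."""
    best = l1_norm(x)
    return [s for s in itertools.product((-1, 1), repeat=len(x))
            if abs(sum(si * xi for si, xi in zip(s, x)) - best) <= 1e-12]


def l1_subgradient(x, fill=0.0):
    """One element of the subdifferential of ||x||_1.
    Nonzero coordinates get sign(x_i); zero coordinates get `fill`,
    which may be any value in [-1, 1]."""
    if not -1.0 <= fill <= 1.0:
        raise ValueError("fill must lie in [-1, 1]")
    return [1.0 if v > 0 else -1.0 if v < 0 else fill for v in x]


x = [1.5, 0.0, -2.0]
S_star = maximizing_signs(x)  # the sign of the zero coordinate is free
g = l1_subgradient(x)         # midpoint of the two vectors in S_star

random.seed(3)
violations = 0
for _ in range(1000):
    y = [random.uniform(-3, 3) for _ in x]
    lower = l1_norm(x) + sum(gi * (yi - xi) for gi, yi, xi in zip(g, y, x))
    if l1_norm(y) < lower - 1e-12:
        violations += 1

print("S* =", S_star)
print("subgradient:", g, "violations:", violations)
```

Enumerating $2^n$ sign vectors is only feasible for tiny $n$; the coordinate-wise rule in `l1_subgradient` is what one would use in practice.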

slide-13
SLIDE 13

More Subgradient Calculus: Function Convexity first

The following functions are again convex, but again may not be differentiable everywhere. How does one compute their subgradients at points of non-differentiability?

Nonnegative weighted sum: $f = \sum_{i=1}^{n} \alpha_i f_i$ is convex if each $f_i$, $1 \le i \le n$, is convex and $\alpha_i \ge 0$, $1 \le i \le n$.

Composition with affine function: $f(Ax + b)$ is convex if $f$ is convex. If $A$ is $m \times n$, then $f(\cdot)$ is defined on $\Re^m$, whereas $f(Ax + b)$, as a function of $x$, is defined on $\Re^n$. For example:

▶ The log barrier for linear inequalities, $f(x) = -\sum_{i=1}^{m} \log(b_i - a_i^T x)$, is convex since $-\log(x)$ is convex.

▶ Any norm of an affine function, $f(x) = \|Ax + b\|$, is convex.
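The log barrier combines both rules: each term $-\log(b_i - a_i^T x)$ is a convex function composed with an affine map, and the barrier is their sum with weights $1$. The sketch below evaluates the barrier for a hypothetical box constraint $-1 \le x_j \le 1$ in $\Re^2$ (an example chosen here, not taken from the slides) and checks the convexity inequality on random feasible points.

```python
import math
import random


def log_barrier(A, b, x):
    """f(x) = -sum_i log(b_i - a_i^T x); +inf outside the open feasible set."""
    total = 0.0
    for ai, bi in zip(A, b):
        slack = bi - sum(aij * xj for aij, xj in zip(ai, x))
        if slack <= 0.0:
            return math.inf
        total -= math.log(slack)
    return total


# Rows encode x1 <= 1, -x1 <= 1, x2 <= 1, -x2 <= 1.
A = [[1.0, 0.0], [-1.0, 0.0], [0.0, 1.0], [0.0, -1.0]]
b = [1.0, 1.0, 1.0, 1.0]

random.seed(4)
min_jensen_gap = float("inf")
for _ in range(1000):
    x = [random.uniform(-0.99, 0.99) for _ in range(2)]
    y = [random.uniform(-0.99, 0.99) for _ in range(2)]
    t = random.random()
    z = [t * p + (1 - t) * q for p, q in zip(x, y)]
    gap = t * log_barrier(A, b, x) + (1 - t) * log_barrier(A, b, y) - log_barrier(A, b, z)
    min_jensen_gap = min(min_jensen_gap, gap)

print("barrier at the origin:", log_barrier(A, b, [0.0, 0.0]))
print("smallest Jensen gap:", min_jensen_gap)
```

Returning `math.inf` outside the feasible region is one common convention for extending the barrier to all of $\Re^n$; it keeps the extended function convex.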