Developing Tools for Convexity Analysis of f(x1, x2, ..., xn)

SLIDE 1

Developing Tools for Convexity Analysis of f(x1, x2, ..., xn)

Instructor: Prof. Ganesh Ramakrishnan

From ℜ to ℜ^n : CS709

  • Prof. Ganesh Ramakrishnan (IIT Bombay)

26/12/2016 1/ 210

SLIDE 2

Summary of Optimization Principles for Univariate Functions

Detailed slides at https://www.cse.iitb.ac.in/~cs709/notes/enotes/2-08-01-2018-univariateprinciples.pdf, video at https://tinyurl.com/yc4d2aqg and Section 4.1.1 (pages 213 to 214) of the notes at https://www.cse.iitb.ac.in/~cs709/notes/BasicsOfConvexOptimization.pdf.

SLIDE 3

Maximum and Minimum values of univariate functions

Let f : D → ℜ. Now f has
  • an absolute maximum (or global maximum) value at a point c ∈ D if f(x) ≤ f(c), ∀x ∈ D;
  • an absolute minimum (or global minimum) value at c ∈ D if f(x) ≥ f(c), ∀x ∈ D;
  • a local maximum value at c if there is an open interval I containing c in which f(c) ≥ f(x), ∀x ∈ I;
  • a local minimum value at c if there is an open interval I containing c in which f(c) ≤ f(x), ∀x ∈ I;
  • a local extreme value at c, if f(c) is either a local maximum or a local minimum value of f in an open interval I with c ∈ I.

SLIDE 4

First Derivative Test & Extreme Value Theorem

First derivative test for a local extreme value of f, when f is differentiable at the extremum:

f′(x) = 0 at every local extremum.

SLIDE 5

First Derivative Test & Extreme Value Theorem

First derivative test for a local extreme value of f, when f is differentiable at the extremum.

Claim

If f(c) is a local extreme value and if f is differentiable at x = c, then f′(c) = 0.

The Extreme Value Theorem

A function has global extremes if (a) it is continuous, (b) its domain is bounded and (c) its domain is closed.

SLIDE 6

First Derivative Test & Extreme Value Theorem

The Extreme Value Theorem

Claim

A continuous function f(x) on a closed and bounded interval [a, b] attains a minimum value f(c) for some c ∈ [a, b] and a maximum value f(d) for some d ∈ [a, b]. That is, a continuous function on a closed, bounded interval attains a minimum and a maximum value. Note that either or both of the points c and d may be end points of the interval [a, b].

SLIDE 7

Taylor’s Theorem and nth degree polynomial approximation

The nth degree polynomial approximation of a function is used to prove a generalization of the mean value theorem, called Taylor’s theorem.

Claim

Taylor’s theorem states that if f and its first n derivatives f′, f′′, ..., f^(n) are continuous on the closed interval [a, b] and differentiable on (a, b), then there exists a number c ∈ (a, b) such that

$$f(b) = f(a) + f'(a)(b-a) + \frac{1}{2!}f''(a)(b-a)^2 + \ldots + \frac{1}{n!}f^{(n)}(a)(b-a)^n + \frac{1}{(n+1)!}f^{(n+1)}(c)(b-a)^{n+1}$$

Mean Value Theorem = Taylor’s theorem with n = 0

The approximation involves dropping the last term.
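As a rough illustration of dropping the last term, the sketch below evaluates the degree-n Taylor polynomial and compares the resulting error with a bound on the dropped term. The choice f(x) = exp(x), a = 0, b = 0.5, n = 4 and the helper name `taylor_polynomial` are assumptions made only for this example; they do not come from the slides.

```python
import math

def taylor_polynomial(derivs_at_a, a, b):
    """Evaluate sum_{k=0}^{n} f^(k)(a) (b - a)^k / k!,
    given the list [f(a), f'(a), ..., f^(n)(a)]."""
    return sum(d * (b - a) ** k / math.factorial(k)
               for k, d in enumerate(derivs_at_a))

# Assumed example: f(x) = exp(x) around a = 0, where every derivative equals 1.
a, b, n = 0.0, 0.5, 4
approx = taylor_polynomial([1.0] * (n + 1), a, b)
error = abs(math.exp(b) - approx)

# The dropped term is f^(n+1)(c) (b - a)^(n+1) / (n+1)! for some c in (a, b).
# For exp, f^(n+1)(c) = exp(c) <= exp(b), which gives an upper bound on the error.
bound = math.exp(b) * (b - a) ** (n + 1) / math.factorial(n + 1)

print(approx, error, bound)
```

The bound works here only because the (n+1)th derivative of exp is easy to bound on (a, b); for other functions a different bound on f^(n+1) would be needed.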

SLIDE 8

Mean Value, Taylor’s Theorem and words of caution

Note that if f fails to be differentiable at even one number in the interval, then the conclusion of the mean value theorem may be false. For example, if f(x) = x^(2/3), then f′(x) = 2/(3 ∛x) and the theorem does not hold in the interval [−3, 3], since f is not differentiable at 0 as can be seen in Figure 1.
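A small numerical sketch of why the conclusion fails for this example: the secant slope over [−3, 3] is 0, while |f′(x)| stays bounded away from 0 wherever it is defined. The helper `cbrt` and the sampling grid are assumptions of this sketch.

```python
import math

def cbrt(x):
    # Real cube root, valid for negative x as well.
    return math.copysign(abs(x) ** (1.0 / 3.0), x)

def f(x):
    return cbrt(x) ** 2            # x^(2/3)

def f_prime(x):
    return 2.0 / (3.0 * cbrt(x))   # undefined at x = 0

# Slope of the secant through (-3, f(-3)) and (3, f(3)).
secant_slope = (f(3.0) - f(-3.0)) / (3.0 - (-3.0))

# |f'(x)| = 2 / (3 |x|^(1/3)) is smallest at the end points of [-3, 3],
# so no sampled point has a derivative close to the secant slope.
min_abs_slope = min(abs(f_prime(k / 100.0)) for k in range(-300, 301) if k != 0)

print(secant_slope, min_abs_slope)
```

Sampling is only suggestive; the actual argument is that 2/(3 ∛x) is never zero.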

Figure 1:

SLIDE 10

Sufficient Conditions for Increasing and decreasing functions

A function f is said to be
  • increasing on an interval I in its domain D if f(t) < f(x) whenever t < x, for t, x ∈ I;
  • decreasing on an interval I ⊆ D if f(t) > f(x) whenever t < x, for t, x ∈ I.

Consequently:

Claim

Let I be an interval and suppose f is continuous on I and differentiable on int(I). Then:

1. if f′(x) > 0 for all x ∈ int(I), then f is increasing on I;
2. if f′(x) < 0 for all x ∈ int(I), then f is decreasing on I;
3. f′(x) = 0 for all x ∈ int(I), iff, f is constant on I.

SLIDE 11

Illustration of Sufficient Conditions

Figure 2 illustrates the intervals in (−∞, ∞) on which the function f(x) = 3x^4 + 4x^3 − 36x^2 is decreasing and increasing. First we note that f(x) is differentiable everywhere on (−∞, ∞) and compute f′(x) = 12(x^3 + x^2 − 6x) = 12(x − 2)(x + 3)x, which is negative on (−∞, −3) and (0, 2) and positive on (−3, 0) and (2, ∞). We observe that f is decreasing in the intervals (−∞, −3] and [0, 2], while it is increasing in the intervals [−3, 0] and [2, ∞).
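The sign pattern of f′ can be spot-checked by sampling, as in the sketch below. The helper `sign_on_interval`, the sample count, and the truncation of the unbounded intervals at ±10 are arbitrary choices made for this illustration.

```python
def f_prime(x):
    # Derivative of f(x) = 3x^4 + 4x^3 - 36x^2.
    return 12 * x**3 + 12 * x**2 - 72 * x

def sign_on_interval(lo, hi, samples=200):
    """Return +1 or -1 if f' has that sign at every interior sample point
    of (lo, hi); return 0 if the sampled signs are mixed."""
    step = (hi - lo) / (samples + 1)
    signs = {1 if f_prime(lo + i * step) > 0 else -1
             for i in range(1, samples + 1)}
    return signs.pop() if len(signs) == 1 else 0

# Unbounded intervals are truncated at -10 and 10 purely for sampling.
intervals = [(-10, -3), (-3, 0), (0, 2), (2, 10)]
signs = [sign_on_interval(lo, hi) for lo, hi in intervals]
print(signs)
```

A uniform sign on the samples is evidence, not proof; the factorization 12(x − 2)(x + 3)x is what settles the sign on each interval.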

SLIDE 13

Necessary conditions for increasing/decreasing function

The conditions for increasing and decreasing properties of f(x) stated so far are not necessary.

Figure 3:

Figure 3 shows that for the function f(x) = x^5, though f(x) is increasing in (−∞, ∞), f′(0) = 0.

SLIDE 14

Another sufficient condition for increasing/decreasing function

Thus, a modified sufficient condition for a function f to be increasing/decreasing on an interval I can be stated as follows:

f′(x) > 0 everywhere except at a finite number of points where f′(x) = 0

SLIDE 15

Another sufficient condition for increasing/decreasing function

Thus, a modified sufficient condition for a function f to be increasing/decreasing on an interval I can be stated as follows:

Claim

Let I be an interval and suppose f is continuous on I and differentiable on int(I). Then:

1. if f′(x) ≥ 0 for all x ∈ int(I), and if f′(x) = 0 at only finitely many x ∈ I, then f is increasing on I;
2. if f′(x) ≤ 0 for all x ∈ int(I), and if f′(x) = 0 at only finitely many x ∈ I, then f is decreasing on I.

For example, the derivative of the function f(x) = 6x^5 − 15x^4 + 10x^3, namely f′(x) = 30x^2(x − 1)^2, vanishes at 0 and 1, and f′(x) > 0 elsewhere. So f(x) is increasing on (−∞, ∞).

SLIDE 16

Necessary conditions for increasing/decreasing function (contd.)

We have a slightly different necessary condition.

Claim

Let I be an interval, and suppose f is continuous on I and differentiable in int(I). Then:

1. if f is increasing on I, then f′(x) ≥ 0 for all x ∈ int(I);
2. if f is decreasing on I, then f′(x) ≤ 0 for all x ∈ int(I).

SLIDE 17

Critical Point

This concept will help us derive the general condition for local extrema.

Definition

[Critical Point]: A point c in the domain D of f is called a critical point of f if either f′(c) = 0 or f′(c) does not exist.

The following general condition for local extrema extends the result in theorem 1 to general non-differentiable functions.

Claim

If f(c) is a local extreme value, then c is a critical number of f.

The converse of the above statement does not hold (see Figure 3); 0 is a critical number (f′(0) = 0), although f(0) is not a local extreme value.

SLIDE 18

Critical Point and Local Extreme Value

Given a critical point c, the following test helps determine if f(c) is a local extreme value:

Procedure

[Local Extreme Value]: Let c be an isolated critical point of f.

1. f(c) is a local minimum if f(x) is decreasing in an interval [c − ϵ1, c] and increasing in an interval [c, c + ϵ2] with ϵ1, ϵ2 > 0.
2. f(c) is a local maximum if f(x) is increasing in an interval [c − ϵ1, c] and decreasing in an interval [c, c + ϵ2] with ϵ1, ϵ2 > 0.

SLIDE 20

First Derivative Test: Critical Point and Local Extreme Value

As an example, the function f(x) = 3x^5 − 5x^3 has the derivative f′(x) = 15x^2(x + 1)(x − 1). The critical points are 0, 1 and −1. Of the three, the sign of f′(x) changes at 1 and −1, which are a local minimum and a local maximum respectively. The sign does not change at 0, which is therefore not a local extremum.
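The sign-change check in this example can be sketched in a few lines. The function name `classify` and the step `eps` are assumptions of this sketch; `eps` must be small enough that no other critical point lies within `eps` of c.

```python
def f_prime(x):
    # Derivative of f(x) = 3x^5 - 5x^3, i.e. 15x^2 (x + 1)(x - 1).
    return 15 * x**2 * (x + 1) * (x - 1)

def classify(c, eps=1e-3):
    """Classify an isolated critical point c by the sign of f'
    just to the left and just to the right of c."""
    left, right = f_prime(c - eps), f_prime(c + eps)
    if left < 0 < right:
        return "local minimum"       # decreasing, then increasing
    if left > 0 > right:
        return "local maximum"       # increasing, then decreasing
    return "not a local extremum"    # no sign change

results = {c: classify(c) for c in (-1, 0, 1)}
print(results)
```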

Figure 4:

SLIDE 22

First Derivative Test: Critical Point and Local Extreme Value

As another example, consider the function

$$f(x) = \begin{cases} -x & \text{if } x \le 0 \\ 1 & \text{if } x > 0 \end{cases}$$

Then,

$$f'(x) = \begin{cases} -1 & \text{if } x < 0 \\ 0 & \text{if } x > 0 \end{cases}$$

Note that f(x) is discontinuous at x = 0, and therefore f′(x) is not defined at x = 0. All numbers x ≥ 0 are critical numbers. f(0) = 0 is a local minimum, whereas f(x) = 1 is a local minimum as well as a local maximum ∀x > 0.

SLIDE 23

Strict Convexity and Extremum

A differentiable function f is said to be strictly convex (or strictly concave up) on an open interval I, iff, f′(x) is increasing on I.

Recall the graphical interpretation of the first derivative f′(x); f′(x) > 0 implies that f(x) is increasing at x. Similarly, f′(x) is increasing when

  • Sufficient condition ==> f′′(x) > 0
  • Sufficient condition ==> f′′(x) ≥ 0 and f′′(x) vanishes at a finite number of points
  • Necessary condition ==> f′′(x) ≥ 0

SLIDE 24

Strict Convexity and Extremum

Definition (for a differentiable function)

A differentiable function f is said to be strictly convex (or strictly concave up) on an open interval I, iff, f′(x) is increasing on I.

Recall the graphical interpretation of the first derivative f′(x); f′(x) > 0 implies that f(x) is increasing at x. Similarly, f′(x) is increasing when f′′(x) > 0. This gives us a sufficient condition for the strict convexity of a function:

Claim

If at all points in an open interval I, f(x) is doubly differentiable and if f′′(x) > 0, ∀x ∈ I, then the slope of the function is always increasing with x and the graph is strictly convex. This is illustrated in Figure 5.

On the other hand, if the function is strictly convex and doubly differentiable in I, then f′′(x) ≥ 0, ∀x ∈ I. This is a necessary condition for strict convexity of a differentiable function.

SLIDE 25

Strict Convexity and Extremum (Illustrated)

Figure 5:

Between x1 and x2, the graph of the function lies completely (strictly) below the line segment joining (x1, f(x1)) to (x2, f(x2)).

SLIDE 26

Strict Convexity and Extremum: Slopeless interpretation (SI)

Claim

A function f is strictly convex on an open interval I, iff

f(a x1 + (1 − a) x2) < a f(x1) + (1 − a) f(x2)    (1)

whenever x1, x2 ∈ I, x1 ≠ x2 and 0 < a < 1.
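Inequality (1) can be probed numerically by searching for a counterexample, as in the sketch below. The function name `violates_strict_convexity`, the trial count, the range of a, and the test functions x^2 and x^3 are all assumptions chosen for illustration.

```python
import random

def violates_strict_convexity(f, lo, hi, trials=10_000, seed=0):
    """Search for x1 != x2 in [lo, hi] and 0 < a < 1 that violate inequality (1).
    Returns True if a violation is found, False otherwise."""
    rng = random.Random(seed)
    for _ in range(trials):
        x1, x2 = rng.uniform(lo, hi), rng.uniform(lo, hi)
        a = rng.uniform(0.01, 0.99)
        if abs(x1 - x2) < 1e-3:
            continue  # skip nearly equal points, where rounding error dominates
        if f(a * x1 + (1 - a) * x2) >= a * f(x1) + (1 - a) * f(x2):
            return True
    return False

print(violates_strict_convexity(lambda x: x**2, -5, 5))  # no violation expected
print(violates_strict_convexity(lambda x: x**3, -5, 5))  # x^3 is concave for x < 0
```

A returned `True` disproves strict convexity on the interval; a returned `False` is only evidence, since finitely many samples cannot establish the inequality for all x1, x2 and a.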

SLIDE 27

Strict Concavity

A differentiable function f is said to be strictly concave on an open interval I, iff, f′(x) is decreasing on I.

Recall from theorem 4, the graphical interpretation of the first derivative f′(x); f′(x) < 0 implies that f(x) is decreasing at x. Similarly, f′(x) is (strictly) monotonically decreasing when

f′′(x) < 0

SLIDE 28

Strict Concavity

On the other hand, if the function is strictly concave and doubly differentiable in I, then f′′(x) ≤ 0, ∀x ∈ I. This is illustrated in Figure 6.

Figure 6:

SLIDE 29

Strict Concavity (slopeless interpretation)

There is also a slopeless interpretation of concavity as stated below:

Claim

A differentiable function f is strictly concave on an open interval I, iff

f(a x1 + (1 − a) x2) > a f(x1) + (1 − a) f(x2)    (2)

whenever x1, x2 ∈ I, x1 ≠ x2 and 0 < a < 1.

The proof is similar to that for the slopeless interpretation of convexity.

SLIDE 30

Convex & Concave Regions and Inflection Point

Study the function f(x) = x^3 − x + 2. Its slope decreases as x increases to 0 (f′′(x) < 0) and then the slope increases beyond x = 0 (f′′(x) > 0). The point 0, where f′′(x) changes sign, is called the inflection point; the graph is strictly concave for x < 0 and strictly convex for x > 0. See Figure 7.
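One way to locate such a sign change of f′′ numerically is bisection, sketched below under the assumption that f′′ is continuous and has opposite signs at the two ends of the bracket. The bracket [−1, 2], the tolerance, and the helper name `bisect_sign_change` are choices made for this illustration.

```python
def f_second(x):
    # Second derivative of f(x) = x^3 - x + 2.
    return 6 * x

def bisect_sign_change(g, lo, hi, tol=1e-10):
    """Locate a point where g changes sign on [lo, hi],
    assuming g(lo) and g(hi) have opposite signs."""
    if g(lo) * g(hi) >= 0:
        raise ValueError("g must have opposite signs at lo and hi")
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid   # the sign change lies in [lo, mid]
        else:
            lo = mid   # the sign change lies in [mid, hi]
    return (lo + hi) / 2

inflection = bisect_sign_change(f_second, -1.0, 2.0)
print(inflection)
```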

Figure 7:
SLIDE 31

Convex & Concave Regions and Inflection Point

Along similar lines, study the function

$$f(x) = \frac{1}{20}x^5 - \frac{7}{12}x^4 + \frac{7}{6}x^3 - \frac{15}{2}x^2.$$

SLIDE 32

First Derivative Test: Restated using Strict Convexity

The first derivative test for local extrema can be restated in terms of strict convexity and concavity of functions.

  • Expect convexity around a point of (local) minimum.
  • Expect concavity around a point of (local) maximum.

SLIDE 34

First Derivative Test: Restated using Strict Convexity

The first derivative test for local extrema can be restated in terms of strict convexity and concavity of functions.

Procedure

[First derivative test in terms of strict convexity]: Let c be a critical number of f and f′(c) = 0. Then,

1. f(c) is a local minimum if the graph of f(x) is strictly convex on an open interval containing c (sufficient condition for local min).
2. f(c) is a local maximum if the graph of f(x) is strictly concave on an open interval containing c (sufficient condition for local max).

Intuitively, relaxing strictness should give you sufficient conditions for local min/max ==> Revising with proofs for R^n case

SLIDE 35

Strict Convexity: Restated using Second Derivative

If the second derivative f′′(c) exists, then the strict convexity conditions for the critical number can be stated in terms of the sign of f′′(c), making use of previous results. This is called the second derivative test.

Procedure

[Second derivative test]: Let c be a critical number of f where f′(c) = 0 and f′′(c) exists.

1. If f′′(c) > 0 then f(c) is a local minimum.
2. If f′′(c) < 0 then f(c) is a local maximum.
3. If f′′(c) = 0 then f(c) could be a local maximum, a local minimum, neither or both. That is, the test fails.

strict convexity
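The procedure can be written as a small function, sketched here under the assumption that f′ and f′′ are supplied as callables. The name `second_derivative_test`, the tolerance `tol`, and the example f(x) = x^3 − 3x are hypothetical and not taken from the slides.

```python
def second_derivative_test(d1, d2, c, tol=1e-9):
    """Apply the second derivative test at c, given callables d1 = f' and d2 = f''."""
    if abs(d1(c)) > tol:
        raise ValueError("c is not a point where f'(c) = 0")
    curvature = d2(c)
    if curvature > tol:
        return "local minimum"
    if curvature < -tol:
        return "local maximum"
    return "inconclusive"  # f''(c) = 0: the test fails

# Assumed example: f(x) = x^3 - 3x, so f'(x) = 3x^2 - 3 and f''(x) = 6x.
d1 = lambda x: 3 * x**2 - 3
d2 = lambda x: 6 * x

print(second_derivative_test(d1, d2, 1.0))
print(second_derivative_test(d1, d2, -1.0))
```

Comparing against `tol` rather than exactly 0 is a design choice for floating-point inputs; with exact arithmetic the three cases are simply f′′(c) > 0, f′′(c) < 0 and f′′(c) = 0.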

SLIDE 36

Convexity, Minima and Maxima: Illustrations

Study the functions f(x) = x^4, f(x) = −x^4 and f(x) = x^3:
  • If f(x) = x^4, then f′(0) = 0 and f′′(0) = 0 and we can see that f(0) is a local minimum.
  • If f(x) = −x^4, then f′(0) = 0 and f′′(0) = 0 and we can see that f(0) is a local maximum.
  • If f(x) = x^3, then f′(0) = 0 and f′′(0) = 0 and we can see that f(0) is neither a local minimum nor a local maximum. (0, 0) is an inflection point in this case.

SLIDE 37

Convexity, Minima and Maxima: Illustrations (contd.)

Study the functions f(x) = x + 2 sin x and f(x) = x + 1/x:
  • If f(x) = x + 2 sin x, then f′(x) = 1 + 2 cos x. In [0, 2π], f′(x) = 0 for x = 2π/3, 4π/3, which are the critical numbers. f′′(2π/3) = −2 sin(2π/3) = −√3 < 0 ⇒ f(2π/3) = 2π/3 + √3 is a local maximum value. On the other hand, f′′(4π/3) = √3 > 0 ⇒ f(4π/3) = 4π/3 − √3 is a local minimum value.
  • If f(x) = x + 1/x, then f′(x) = 1 − 1/x^2. The critical numbers are x = ±1. Note that x = 0 is not a critical number, even though f′(0) does not exist, because 0 is not in the domain of f. f′′(x) = 2/x^3. f′′(−1) = −2 < 0 and therefore f(−1) = −2 is a local maximum. f′′(1) = 2 > 0 and therefore f(1) = 2 is a local minimum.
