Developing Tools for Convexity Analysis of f(x1, x2, ..., xn)
Instructor: Prof. Ganesh Ramakrishnan (IIT Bombay)
From ℜ to ℜ^n : CS709
26/12/2016
Summary of Optimization Principles for Univariate
Detailed slides at https://www.cse.iitb.ac.in/~cs709/notes/enotes/2-08-01-2018-univariateprinciples.pdf, video at https://tinyurl.com/yc4d2aqg and Section 4.1.1 (pages 213 to 214) of the notes at https://www.cse.iitb.ac.in/~cs709/notes/BasicsOfConvexOptimization.pdf.
Let f: D → ℜ. Now f has
An absolute maximum (or global maximum) value at point c ∈ D if f(x) ≤ f(c), ∀x ∈ D
An absolute minimum (or global minimum) value at c ∈ D if f(x) ≥ f(c), ∀x ∈ D
A local maximum value at c if there is an open interval I containing c in which f(c) ≥ f(x), ∀x ∈ I
A local minimum value at c if there is an open interval I containing c in which f(c) ≤ f(x), ∀x ∈ I
A local extreme value at c, if f(c) is either a local maximum or local minimum value of f in an open interval I with c ∈ I
First derivative test for local extreme value of f, when f is differentiable at the extremum.
f′(c) = 0 at every local extreme value f(c) where f is differentiable
Claim
If f(c) is a local extreme value and if f is differentiable at x = c, then f′(c) = 0.
The Extreme Value Theorem
A function has global extremes if (a) it is continuous, (b) the domain is bounded, and (c) the domain is closed.
Claim
A continuous function f(x) on a closed and bounded interval [a,b] attains a minimum value f(c) for some c ∈ [a,b] and a maximum value f(d) for some d ∈ [a,b]. That is, a continuous function on a closed, bounded interval attains a minimum and a maximum value. We must point out that either or both of the points c and d may be end points of the interval.
The nth degree polynomial approximation of a function is used to prove a generalization of the mean value theorem, called Taylor's theorem.
Claim
Taylor's theorem states that if f and its first n derivatives f′, f′′, ..., f^(n) are continuous on the closed interval [a,b], and differentiable on (a,b), then there exists a number c ∈ (a,b) such that
$$f(b) = f(a) + f'(a)(b-a) + \frac{1}{2!} f''(a)(b-a)^2 + \ldots + \frac{1}{n!} f^{(n)}(a)(b-a)^n + \frac{1}{(n+1)!} f^{(n+1)}(c)(b-a)^{n+1}$$
Mean Value Theorem = Taylor's theorem with n = 0
The nth degree approximation involves dropping the last term
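The role of the last term can be checked numerically. The sketch below is only an illustration: it assumes f = exp (chosen because every derivative of exp is exp), and the helper names are hypothetical. Since f^(n+1)(c) = exp(c) for some c ∈ (a,b), the error of the nth degree approximation should lie strictly between the values of the last term at c = a and c = b.

```python
import math

def taylor_polynomial_exp(a, b, n):
    """n-th degree Taylor polynomial of exp about a, evaluated at b."""
    # Every derivative of exp at a equals exp(a).
    return sum(math.exp(a) * (b - a) ** k / math.factorial(k)
               for k in range(n + 1))

def last_term_range_exp(a, b, n):
    """Values of the last term f^(n+1)(c)(b-a)^(n+1)/(n+1)! at c = a and c = b (a < b)."""
    scale = (b - a) ** (n + 1) / math.factorial(n + 1)
    return math.exp(a) * scale, math.exp(b) * scale

a, b, n = 0.0, 1.0, 4
approx = taylor_polynomial_exp(a, b, n)
error = math.exp(b) - approx          # what is lost by dropping the last term
low, high = last_term_range_exp(a, b, n)
print(f"approximation={approx:.6f}, error={error:.6f}, range=({low:.6f}, {high:.6f})")
```

With n = 0 the same check reduces to the mean value theorem: exp(1) − exp(0) equals exp(c) for some c in (0, 1).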
Note that if f fails to be differentiable at even one number in the interval, then the conclusion of the mean value theorem may be false. For example, if f(x) = x^(2/3), then f′(x) = 2/(3 ∛x) and the theorem does not hold in the interval [−3,3], since f is not differentiable at 0, as can be seen in Figure 1.
Figure 1:
A function f is said to be
increasing on an interval I in its domain D if f(t) < f(x) whenever t < x;
decreasing on an interval I ⊆ D if f(t) > f(x) whenever t < x.
Consequently:
Claim
Let I be an interval and suppose f is continuous on I and differentiable on int(I). Then:
1 if f′(x) > 0 for all x ∈ int(I), then f is (strictly) increasing on I;
2 if f′(x) < 0 for all x ∈ int(I), then f is (strictly) decreasing on I;
3 f′(x) = 0 for all x ∈ int(I) iff f is constant on I.
Figure 2 illustrates the intervals in (−∞,∞) on which the function f(x) = 3x^4 + 4x^3 − 36x^2 is decreasing and increasing. First we note that f(x) is differentiable everywhere on (−∞,∞) and compute f′(x) = 12(x^3 + x^2 − 6x) = 12(x−2)(x+3)x, which is negative in the interiors of the intervals (−∞,−3] and [0,2] and positive in the interiors of the intervals [−3,0] and [2,∞). We conclude that f is decreasing in the intervals (−∞,−3] and [0,2], while it is increasing in the intervals [−3,0] and [2,∞).
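The sign analysis above can be spot-checked numerically. The following is a minimal sketch, not a proof: it samples f on a finite grid inside each interval (the helper names and the truncation of the unbounded intervals to [−6,−3] and [2,5] are assumptions made for illustration).

```python
def f(x):
    return 3 * x**4 + 4 * x**3 - 36 * x**2

def f_prime(x):
    # Factored form of 12(x^3 + x^2 - 6x)
    return 12 * x * (x - 2) * (x + 3)

def grid(lo, hi, steps=200):
    """Evenly spaced sample points from lo to hi, inclusive."""
    return [lo + (hi - lo) * k / steps for k in range(steps + 1)]

def is_increasing_on(xs):
    return all(f(s) < f(t) for s, t in zip(xs, xs[1:]))

def is_decreasing_on(xs):
    return all(f(s) > f(t) for s, t in zip(xs, xs[1:]))

# Sign of the derivative at one interior point of each interval
signs = [f_prime(x) for x in (-4, -1, 1, 3)]
print(signs)
print(is_decreasing_on(grid(-6, -3)), is_increasing_on(grid(-3, 0)),
      is_decreasing_on(grid(0, 2)), is_increasing_on(grid(2, 5)))
```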
The conditions for increasing and decreasing properties of f(x) stated so far are not necessary.
Figure 3:
Figure 3 shows that for the function f(x) = x^5, though f(x) is increasing in (−∞,∞), f′(0) = 0.
Thus, a modified sufficient condition for a function f to be increasing/decreasing on an interval I can be stated as follows:
f′(x) > 0 everywhere except at a finite number of points where f′(x) = 0
Claim
Let I be an interval and suppose f is continuous on I and differentiable on int(I). Then:
1 if f′(x) ≥ 0 for all x ∈ int(I), and if f′(x) = 0 at only finitely many x ∈ I, then f is increasing on I;
2 if f′(x) ≤ 0 for all x ∈ int(I), and if f′(x) = 0 at only finitely many x ∈ I, then f is decreasing on I.
For example, the derivative of the function f(x) = 6x^5 − 15x^4 + 10x^3 vanishes at 0 and 1, and f′(x) > 0 elsewhere. So f(x) is increasing on (−∞,∞).
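A quick numerical sanity check of this example is sketched below. It uses the factored derivative f′(x) = 30x^2(x−1)^2 (obtained by differentiating and factoring, so it is a derived form rather than one stated on the slide) and a sample grid on [−1, 2] chosen only for illustration.

```python
def f(x):
    return 6 * x**5 - 15 * x**4 + 10 * x**3

def f_prime(x):
    # 30x^4 - 60x^3 + 30x^2 factors as 30 x^2 (x - 1)^2
    return 30 * x**2 * (x - 1)**2

xs = [k / 10 for k in range(-10, 21)]   # sample points from -1.0 to 2.0

# Sample points where the derivative vanishes
zeros = [x for x in xs if f_prime(x) == 0]
# f still increases strictly across those points
strictly_increasing = all(f(s) < f(t) for s, t in zip(xs, xs[1:]))
print(zeros, strictly_increasing)
```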
We have a slightly different necessary condition:
Claim
Let I be an interval, and suppose f is continuous on I and differentiable in int(I). Then:
1 if f is increasing on I, then f′(x) ≥ 0 for all x ∈ int(I);
2 if f is decreasing on I, then f′(x) ≤ 0 for all x ∈ int(I).
This concept will help us derive the general condition for local extrema.
Definition
[Critical Point]: A point c in the domain D of f is called a critical point of f if either f′(c) = 0 or f′(c) does not exist.
The following general condition for local extrema extends the result in Theorem 1 to general non-differentiable functions.
Claim
If f(c) is a local extreme value, then c is a critical number of f.
The converse of the above statement does not hold (see Figure 3); 0 is a critical number (f′(0) = 0), although f(0) is not a local extreme value.
Given a critical point c, the following test helps determine if f(c) is a local extreme value:
Procedure
[Local Extreme Value]: Let c be an isolated critical point of f.
1 f(c) is a local minimum if f(x) is decreasing in an interval [c−ϵ1, c] and increasing in an interval [c, c+ϵ2] with ϵ1, ϵ2 > 0.
2 f(c) is a local maximum if f(x) is increasing in an interval [c−ϵ1, c] and decreasing in an interval [c, c+ϵ2] with ϵ1, ϵ2 > 0.
As an example, the function f(x) = 3x^5 − 5x^3 has the derivative f′(x) = 15x^2(x+1)(x−1). The critical points are 0, 1 and −1. Of the three, the sign of f′(x) changes at 1 and −1, which are a local minimum and a local maximum respectively. The sign does not change at 0, which is therefore not a local extremum.
Figure 4:
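The sign-change reasoning for f(x) = 3x^5 − 5x^3 can be sketched in code. The helper `classify` and the offset `eps` are hypothetical choices for illustration; the check is only meaningful when `eps` is small enough that no other critical point lies within `eps` of c.

```python
def f_prime(x):
    return 15 * x**2 * (x + 1) * (x - 1)

def classify(c, eps=1e-3):
    """Classify an isolated critical point c by the sign of f' on either side."""
    left, right = f_prime(c - eps), f_prime(c + eps)
    if left < 0 < right:
        return "local minimum"      # decreasing, then increasing
    if left > 0 > right:
        return "local maximum"      # increasing, then decreasing
    return "no sign change"

labels = {c: classify(c) for c in (-1, 0, 1)}
print(labels)
```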
As another example, consider the function
$$f(x) = \begin{cases} -x & \text{if } x \le 0 \\ 1 & \text{if } x > 0 \end{cases}$$
Then,
$$f'(x) = \begin{cases} -1 & \text{if } x < 0 \\ 0 & \text{if } x > 0 \end{cases}$$
Note that f(x) is discontinuous at x = 0, and therefore f′(x) is not defined at x = 0. All numbers x ≥ 0 are critical numbers. f(0) = 0 is a local minimum, whereas f(x) = 1 is a local minimum as well as a local maximum ∀x > 0.
Definition (for a differentiable function)
A differentiable function f is said to be strictly convex (or strictly concave up) on an open interval I, iff, f′(x) is increasing on I.
Recall the graphical interpretation of the first derivative f′(x); f′(x) > 0 implies that f(x) is increasing at x. Similarly, f′(x) is increasing when f′′(x) > 0. This gives us a sufficient condition for the strict convexity of a function:
Claim
If at all points in an open interval I, f(x) is doubly differentiable and if f′′(x) > 0, ∀x ∈ I, then the slope of the function is always increasing with x and the graph is strictly convex. This is illustrated in Figure 5.
On the other hand, if the function is strictly convex and doubly differentiable in I, then f′′(x) ≥ 0, ∀x ∈ I. This is a necessary condition for strict convexity of a differentiable function.
Sufficient condition: f′′(x) > 0
Sufficient condition: f′′(x) ≥ 0 and f′′(x) vanishes at only a finite number of points
Necessary condition: f′′(x) ≥ 0
Figure 5:
Between x1 and x2, the graph of the function lies completely (strictly) below the line segment joining (x1, f(x1)) to (x2, f(x2))
Claim
A function f is strictly convex on an open interval I, iff
$$f(a x_1 + (1-a) x_2) < a f(x_1) + (1-a) f(x_2) \qquad (1)$$
whenever x1, x2 ∈ I, x1 ≠ x2 and 0 < a < 1.
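Inequality (1) lends itself to a sampled check. The sketch below is an assumption-laden illustration: the function name `satisfies_inequality_1`, the sample points, the weights, and the two test functions x^2 and −x^2 are all chosen here and do not come from the slides. A sampled check can refute strict convexity on the sampled range but cannot prove it.

```python
def satisfies_inequality_1(f, points, weights=(0.25, 0.5, 0.75)):
    """Check f(a*x1 + (1-a)*x2) < a*f(x1) + (1-a)*f(x2) on all sampled triples."""
    for x1 in points:
        for x2 in points:
            if x1 == x2:
                continue            # the claim requires x1 != x2
            for a in weights:       # every weight satisfies 0 < a < 1
                chord = a * f(x1) + (1 - a) * f(x2)
                if not f(a * x1 + (1 - a) * x2) < chord:
                    return False
    return True

points = [k / 2 for k in range(-4, 5)]          # -2.0, -1.5, ..., 2.0
convex_ok = satisfies_inequality_1(lambda x: x * x, points)
concave_ok = satisfies_inequality_1(lambda x: -x * x, points)
print(convex_ok, concave_ok)
```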
A differentiable function f is said to be strictly concave on an open interval I, iff, f′(x) is decreasing on I. Recall from Theorem 4 the graphical interpretation of the first derivative f′(x); f′(x) < 0 implies that f(x) is decreasing at x. Similarly, f′(x) is (strictly) monotonically decreasing when f′′(x) < 0.
On the other hand, if the function is strictly concave and doubly differentiable in I, then f′′(x) ≤ 0, ∀x ∈ I. This is illustrated in Figure 6.
Figure 6:
There is also a slopeless interpretation of concavity as stated below:
Claim
A differentiable function f is strictly concave on an open interval I, iff
$$f(a x_1 + (1-a) x_2) > a f(x_1) + (1-a) f(x_2) \qquad (2)$$
whenever x1, x2 ∈ I, x1 ≠ x2 and 0 < a < 1.
The proof is similar to that for the slopeless interpretation of convexity.
Study the function f(x) = x^3 − x + 2. Its slope decreases as x increases to 0 (f′′(x) < 0) and then the slope increases beyond x = 0 (f′′(x) > 0). The point 0, where f′′(x) changes sign, is called the inflection point; the graph is strictly concave for x < 0 and strictly convex for x > 0. See Figure 7.
Figure 7:
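The behaviour of the slope of f(x) = x^3 − x + 2 around the inflection point can be sketched as follows. The derivatives f′(x) = 3x^2 − 1 and f′′(x) = 6x are computed by hand, and the sample ranges [−2, −0.1] and [0.1, 2] are arbitrary illustrative choices.

```python
def f_prime(x):
    return 3 * x**2 - 1     # slope of x^3 - x + 2

def f_second(x):
    return 6 * x

negative_side = [-2 + k / 10 for k in range(20)]    # -2.0 up to -0.1
positive_side = [k / 10 for k in range(1, 21)]      # 0.1 up to 2.0

# Slope decreases while x < 0 (strictly concave part)
slope_decreasing_left = all(
    f_prime(s) > f_prime(t) for s, t in zip(negative_side, negative_side[1:]))
# Slope increases while x > 0 (strictly convex part)
slope_increasing_right = all(
    f_prime(s) < f_prime(t) for s, t in zip(positive_side, positive_side[1:]))
# The second derivative changes sign at 0
sign_change_at_zero = f_second(-0.1) < 0 < f_second(0.1)
print(slope_decreasing_left, slope_increasing_right, sign_change_at_zero)
```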
Along similar lines, study the function
$$f(x) = \frac{1}{20} x^5 - \frac{7}{12} x^4 + \frac{7}{6} x^3 - \frac{15}{2} x^2.$$
The first derivative test for local extrema can be restated in terms of strict convexity and concavity of functions.
Expect convexity around a point of (local) minimum, and expect concavity around a point of (local) maximum.
Procedure
[First derivative test in terms of strict convexity]: Let c be a critical number of f and f′(c) = 0. Then,
1 f(c) is a local minimum if the graph of f(x) is strictly convex on an open interval containing c (sufficient condition for local min).
2 f(c) is a local maximum if the graph of f(x) is strictly concave on an open interval containing c (sufficient condition for local max).
Intuitively, relaxing strictness should still give sufficient conditions for a local min/max; this is revisited with proofs for the ℜ^n case.
If the second derivative f′′(c) exists, then the strict convexity conditions for the critical number can be stated in terms of the sign of f′′(c), making use of previous results. This is called the second derivative test.
Procedure
[Second derivative test]: Let c be a critical number of f where f′(c) = 0 and f′′(c) exists.
1 If f′′(c) > 0 then f(c) is a local minimum.
2 If f′′(c) < 0 then f(c) is a local maximum.
3 If f′′(c) = 0 then f(c) could be a local maximum, a local minimum, or neither.
strict convexity
Study the functions f(x) = x^4, f(x) = −x^4 and f(x) = x^3:
If f(x) = x^4, then f′(0) = 0 and f′′(0) = 0 and we can see that f(0) is a local minimum.
If f(x) = −x^4, then f′(0) = 0 and f′′(0) = 0 and we can see that f(0) is a local maximum.
If f(x) = x^3, then f′(0) = 0 and f′′(0) = 0 and we can see that f(0) is neither a local minimum nor a local maximum. (0,0) is an inflection point in this case.
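Since the second derivative vanishes at 0 for all three functions, the test is inconclusive and one has to look at the function values directly. A rough sketch follows; the helper `classify_by_neighbours` and the offset `eps` are hypothetical, and comparing against just two neighbouring points is a heuristic rather than a proof.

```python
def classify_by_neighbours(f, c, eps=1e-2):
    """Compare f(c) with the values at c - eps and c + eps (heuristic check)."""
    left, centre, right = f(c - eps), f(c), f(c + eps)
    if centre < left and centre < right:
        return "local minimum"
    if centre > left and centre > right:
        return "local maximum"
    return "neither"

results = {
    "x^4": classify_by_neighbours(lambda x: x**4, 0.0),
    "-x^4": classify_by_neighbours(lambda x: -x**4, 0.0),
    "x^3": classify_by_neighbours(lambda x: x**3, 0.0),
}
print(results)
```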
Study the functions f(x) = x + 2 sin x and f(x) = x + 1/x:
If f(x) = x + 2 sin x, then f′(x) = 1 + 2 cos x. In [0, 2π], f′(x) = 0 for x = 2π/3, 4π/3, which are the critical numbers. f′′(2π/3) = −2 sin(2π/3) = −√3 < 0 ⇒ f(2π/3) = 2π/3 + √3 is a local maximum value. On the other hand, f′′(4π/3) = √3 > 0 ⇒ f(4π/3) = 4π/3 − √3 is a local minimum value.
If f(x) = x + 1/x, then f′(x) = 1 − 1/x^2. The critical numbers are x = ±1. Note that x = 0 is not a critical number, even though f′(0) does not exist, because 0 is not in the domain of f. f′′(x) = 2/x^3. f′′(−1) = −2 < 0 and therefore f(−1) = −2 is a local maximum. f′′(1) = 2 > 0 and therefore f(1) = 2 is a local minimum.
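Both examples can be replayed with a small second derivative test helper. This is a minimal sketch under stated assumptions: the function `second_derivative_test` is a hypothetical name, the derivatives are entered by hand, and the caller is expected to pass only points c where f′(c) = 0.

```python
import math

def second_derivative_test(f_second, c):
    """Apply the second derivative test at a point c with f'(c) = 0."""
    value = f_second(c)
    if value > 0:
        return "local minimum"
    if value < 0:
        return "local maximum"
    return "inconclusive"

# f(x) = x + 2 sin x, so f'(x) = 1 + 2 cos x and f''(x) = -2 sin x
sin_results = {
    "2pi/3": second_derivative_test(lambda x: -2 * math.sin(x), 2 * math.pi / 3),
    "4pi/3": second_derivative_test(lambda x: -2 * math.sin(x), 4 * math.pi / 3),
}

# f(x) = x + 1/x, so f'(x) = 1 - 1/x^2 and f''(x) = 2/x^3
reciprocal_results = {
    -1: second_derivative_test(lambda x: 2 / x**3, -1),
    1: second_derivative_test(lambda x: 2 / x**3, 1),
}
print(sin_results, reciprocal_results)
```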