Functions of Several Variables

A function of several variables is just what it sounds like. It may be viewed in at least three different ways. We will use a function of two variables as an example.

  • z = f(x, y) may be viewed as a function of the two independent variables x, y.

  • It may be viewed as a function defined at different points (x, y) in the plane.

  • It may be viewed as a function whose domain is the set of vectors <x, y> or xi + yj.

Limits of Functions of Several Variables

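We define a limit of a function of several variables essentially the same way we define a limit for an ordinary function:

Definition 1 (Limit). $\lim_{x \to c} f(x) = L$ if $\forall \epsilon > 0$, $\exists \delta > 0$ such that $|f(x) - L| < \epsilon$ whenever $0 < |x - c| < \delta$.

Definition 2 (Limit). $\lim_{\mathbf{x} \to \mathbf{c}} f(\mathbf{x}) = L$ if $\forall \epsilon > 0$, $\exists \delta > 0$ such that $|f(\mathbf{x}) - L| < \epsilon$ whenever $0 < |\mathbf{x} - \mathbf{c}| < \delta$.

The two definitions read the same. The only difference is that in Definition 2, $\mathbf{x}$ and $\mathbf{c}$ are vectors and $|\mathbf{x} - \mathbf{c}|$ is the distance between them.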

Properties of Limits

Rule of Thumb: If a property of limits makes sense when translated to refer to a limit of a function of several variables, then it is valid for a function of several variables. For example, the limit of a sum will be the sum of the limits, the limit of a difference will be the difference of the limits, the limit of a product will be the product of the limits and the limit of a quotient will be the quotient of the limits, provided the limit of the denominator is not zero.

Continuity

The definition of continuity for a function of several variables is essentially the same as the definition for an ordinary function.

Definition 3 (Continuity). A function $f$ is continuous at $c$ if $\lim_{x \to c} f(x) = f(c)$.

Definition 4 (Continuity for a Function of Several Variables). A function $f$ is continuous at $\mathbf{c}$ if $\lim_{\mathbf{x} \to \mathbf{c}} f(\mathbf{x}) = f(\mathbf{c})$.

As with ordinary functions, functions of several variables will generally be continuous except where there’s an obvious reason for them not to be.

Partial Derivatives

For a function of several variables, we have partial derivatives with respect to each of its variables. The definition is based on the definition of an ordinary derivative.

Definition 5 (Derivative). Let $f : \mathbb{R} \to \mathbb{R}$.
$$\frac{df}{dx}(x) = \lim_{h \to 0} \frac{f(x + h) - f(x)}{h}.$$

Definition 6 (Partial Derivative). Let $f : \mathbb{R}^2 \to \mathbb{R}$.
$$\frac{\partial f}{\partial x}(x, y) = \lim_{h \to 0} \frac{f(x + h, y) - f(x, y)}{h}, \qquad \frac{\partial f}{\partial y}(x, y) = \lim_{h \to 0} \frac{f(x, y + h) - f(x, y)}{h}.$$

The obvious generalizations hold for functions with more than two independent variables.

Calculation of Partial Derivatives

Effectively, we calculate the partial derivative of a function with respect to one of its independent variables by acting as if the other independent variables were actually constants.
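As a rough numerical illustration of treating the other variable as a constant, the following Python sketch compares hand-computed partial derivatives with difference quotients. The function f(x, y) = x²y + sin y, the sample point, and the helper names `partial_x` and `partial_y` are assumptions chosen for illustration. The sketch uses a symmetric difference quotient rather than the one-sided quotient of the definition, since it tends to be more accurate numerically.

```python
import math

def f(x, y):
    # Illustrative function (an assumption, not from the notes)
    return x**2 * y + math.sin(y)

def partial_x(func, x, y, h=1e-6):
    # Symmetric difference quotient in x, with y held fixed
    return (func(x + h, y) - func(x - h, y)) / (2 * h)

def partial_y(func, x, y, h=1e-6):
    # Symmetric difference quotient in y, with x held fixed
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

x0, y0 = 1.5, 0.5
fx_exact = 2 * x0 * y0            # differentiate x^2*y + sin(y) with y constant
fy_exact = x0**2 + math.cos(y0)   # differentiate x^2*y + sin(y) with x constant

print(partial_x(f, x0, y0), fx_exact)
print(partial_y(f, x0, y0), fy_exact)
```

The two numbers printed on each line should agree to several decimal places.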

Notation

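The following notations for the partial derivatives of a function $z = f(x, y)$ are equivalent.
$$f_x = \frac{\partial f}{\partial x} = \frac{\partial z}{\partial x} = f_1 = D_1 f = D_x f$$
$$f_y = \frac{\partial f}{\partial y} = \frac{\partial z}{\partial y} = f_2 = D_2 f = D_y f$$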

Higher Order Derivatives

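Since a partial derivative is itself a function of several variables, it has its own partial derivatives.
$$(f_x)_y = f_{xy} = f_{12} = \frac{\partial}{\partial y}\left(\frac{\partial f}{\partial x}\right) = \frac{\partial^2 f}{\partial y\,\partial x} = \frac{\partial^2 z}{\partial y\,\partial x}$$
$$(f_y)_x = f_{yx} = f_{21} = \frac{\partial}{\partial x}\left(\frac{\partial f}{\partial y}\right) = \frac{\partial^2 f}{\partial x\,\partial y} = \frac{\partial^2 z}{\partial x\,\partial y}$$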

Changing the Order of Differentiation

Theorem 1 (Clairaut’s Theorem). If $f_{xy}$ and $f_{yx}$ are both continuous on a disk containing $(a, b)$, then $f_{xy}(a, b) = f_{yx}(a, b)$.

Proof.

Let $\phi(h) = f(x + h, y + h) - f(x, y + h) - f(x + h, y) + f(x, y)$. The motivation comes from writing either $f_{xy}$ or $f_{yx}$ as a limit.

We may write $\phi(h) = \alpha(y + h) - \alpha(y)$, where $\alpha(t) = f(x + h, t) - f(x, t)$. The Mean Value Theorem implies $\alpha(y + h) - \alpha(y) = \alpha'(t)h$ for some $t$ between $y$ and $y + h$. Since $\alpha'(t) = f_2(x + h, t) - f_2(x, t)$, we have $\phi(h) = [f_2(x + h, t) - f_2(x, t)]h$.

If we write $\beta(s) = f_2(s, t)$, then $f_2(x + h, t) - f_2(x, t) = \beta(x + h) - \beta(x)$.

By the Mean Value Theorem, $\beta(x + h) - \beta(x) = \beta'(s)h$ for some $s$ between $x$ and $x + h$. Since $\beta'(s) = f_{21}(s, t)$, we get $f_2(x + h, t) - f_2(x, t) = f_{21}(s, t)h$, so $\phi(h) = f_{21}(s, t)h^2$.

Thus $\frac{\phi(h)}{h^2} = f_{21}(s, t) \to f_{21}(x, y)$ as $h \to 0$, since $(s, t) \to (x, y)$ and $f_{21}$ is continuous at $(x, y)$.

A similar calculation, with the roles of the two variables interchanged, shows $\frac{\phi(h)}{h^2} = f_{12}(s, t)$ for a possibly different point $(s, t)$, so $\frac{\phi(h)}{h^2} \to f_{12}(x, y)$ as $h \to 0$, showing $f_{12}(x, y) = f_{21}(x, y)$.
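The quantity φ(h)/h² from the proof can also be watched numerically. The Python sketch below evaluates it for shrinking h and compares it with the mixed partial derivative computed by hand. The function f(x, y) = x³y² + sin(xy) and the point (1, 2) are assumptions chosen for illustration.

```python
import math

def f(x, y):
    # Illustrative function (an assumption, not from the notes)
    return x**3 * y**2 + math.sin(x * y)

def phi(x, y, h):
    # The second difference used in the proof of Clairaut's Theorem
    return f(x + h, y + h) - f(x, y + h) - f(x + h, y) + f(x, y)

def mixed_partial_exact(x, y):
    # f_xy = f_yx = 6x^2 y + cos(xy) - xy sin(xy), computed by hand in either order
    return 6 * x**2 * y + math.cos(x * y) - x * y * math.sin(x * y)

x0, y0 = 1.0, 2.0
for h in (1e-1, 1e-2, 1e-3, 1e-4):
    print(h, phi(x0, y0, h) / h**2, mixed_partial_exact(x0, y0))
```

The middle column should approach the last column as h shrinks.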

Tangent Planes

Consider a surface $z = f(x, y)$ and suppose we are interested in the plane tangent to the surface at the point $(a, b, c)$, where $c = f(a, b)$.

Here, and elsewhere as we look at tangent planes, tangent plane approximations and differentials, the partial derivative shown really means the partial derivative’s value at the relevant point, in this case $(a, b)$.

Since $\frac{\partial z}{\partial x}$ represents about how much $z$ will change if $x$ changes by 1 and $y$ is fixed, it seems reasonable to expect the vector $\langle 1, 0, \frac{\partial z}{\partial x} \rangle$ to be tangent to the surface. Similarly, it is reasonable to expect the vector $\langle 0, 1, \frac{\partial z}{\partial y} \rangle$ to be tangent to the surface.

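We thus expect
$$\mathbf{n} = \begin{vmatrix} \mathbf{i} & \mathbf{j} & \mathbf{k} \\ 1 & 0 & \frac{\partial z}{\partial x} \\ 0 & 1 & \frac{\partial z}{\partial y} \end{vmatrix} = -\frac{\partial z}{\partial x}\mathbf{i} - \frac{\partial z}{\partial y}\mathbf{j} + \mathbf{k}$$
to be a normal vector to the tangent plane. We thus take $\mathbf{n} = \langle -\frac{\partial z}{\partial x}, -\frac{\partial z}{\partial y}, 1 \rangle$.

We thus get
$$\left\langle -\frac{\partial z}{\partial x}, -\frac{\partial z}{\partial y}, 1 \right\rangle \cdot \langle x - a, y - b, z - c \rangle = 0$$
as an equation for the tangent plane, or
$$-\frac{\partial z}{\partial x}(x - a) - \frac{\partial z}{\partial y}(y - b) + (z - c) = 0,$$
or
$$z - c = \frac{\partial z}{\partial x}(x - a) + \frac{\partial z}{\partial y}(y - b).$$
This should be reminiscent of the Point-Slope Formula for the equation of a line.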

Tangent Hyperplanes

It generalizes to
$$y - b = \sum_{i=1}^{n} \frac{\partial y}{\partial x_i}(x_i - a_i)$$
as an equation for the hyperplane tangent to the hypersurface $y = f(x_1, x_2, \ldots, x_n)$ at the point $(a_1, a_2, \ldots, a_n, b)$.

Tangent Plane Approximations and Differentials

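If we take $z - c = \frac{\partial z}{\partial x}(x - a) + \frac{\partial z}{\partial y}(y - b)$ and solve for $z$, we get
$$z = c + \frac{\partial z}{\partial x}(x - a) + \frac{\partial z}{\partial y}(y - b).$$
This should be reminiscent of the Tangent Line Approximation for ordinary functions. We may use this formula to approximate $f(x, y)$ at a point $(x, y)$ close to a point $(a, b)$.

Definition 7 (Differentials).
$$dx = \Delta x = x - a, \qquad dy = \Delta y = y - b, \qquad dz = \frac{\partial z}{\partial x}(x - a) + \frac{\partial z}{\partial y}(y - b)$$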
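A small Python sketch of the tangent plane approximation follows. The function f(x, y) = √(x² + y²), the base point (3, 4) and the nearby point (3.01, 3.98) are assumptions chosen so the arithmetic is easy to check by hand.

```python
import math

def f(x, y):
    # Illustrative function (an assumption, not from the notes)
    return math.sqrt(x**2 + y**2)

a, b = 3.0, 4.0
c = f(a, b)      # c = f(a, b) = 5
fx = a / c       # dz/dx at (a, b), computed by hand: x / sqrt(x^2 + y^2)
fy = b / c       # dz/dy at (a, b), computed by hand: y / sqrt(x^2 + y^2)

def tangent_plane(x, y):
    # z = c + z_x (x - a) + z_y (y - b)
    return c + fx * (x - a) + fy * (y - b)

x, y = 3.01, 3.98
print(tangent_plane(x, y), f(x, y))
```

By hand, the approximation is 5 + 0.6(0.01) + 0.8(−0.02) = 4.99, which is close to the value of f at the nearby point.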

Differentials

We may use the differential $dz$ to approximate the change $\Delta z = \Delta f$ of a function $f(x, y)$ if the independent variables $x$ and $y$ change by amounts $dx$ and $dy$. This generalizes in the obvious way to functions of more than two variables.

Differentiability

Recall that for an ordinary function $y = f(x)$ which was differentiable at a point, we found $\frac{dy - \Delta y}{\Delta x} \to 0$ as $\Delta x \to 0$. We take the analogue of this as a definition of differentiability for functions of several variables. We state the definition for the case of a function of two variables; the variation for more variables should be obvious.

Definition 8 (Differentiable). We say a function $f(x, y)$ is differentiable at a point if
$$\frac{dz - \Delta z}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} \to 0 \quad \text{as} \quad \sqrt{(\Delta x)^2 + (\Delta y)^2} \to 0.$$

Recall $\sqrt{(\Delta x)^2 + (\Delta y)^2}$ is the distance between $(x, y)$ and the point in question. Effectively, we are defining a function of several variables to be differentiable when an approximation using differentials is reasonable.

We still need a reasonable way of determining whether a function is differentiable. This is given by the following theorem.
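To get a feel for the definition of differentiability, the Python sketch below computes the ratio |dz − Δz| / √((Δx)² + (Δy)²) for shrinking steps. The function f(x, y) = x²y, the point (1, 2) and the choice Δx = Δy are assumptions made for illustration; a numerical check along one direction is only suggestive and is not a proof of differentiability.

```python
import math

def f(x, y):
    # Illustrative function (an assumption, not from the notes)
    return x**2 * y

a, b = 1.0, 2.0
fx, fy = 2 * a * b, a**2   # partial derivatives at (a, b), computed by hand

def ratio(dx, dy):
    dz = fx * dx + fy * dy                  # the differential
    delta_z = f(a + dx, b + dy) - f(a, b)   # the actual change in z
    return abs(dz - delta_z) / math.hypot(dx, dy)

ratios = [ratio(s, s) for s in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

The ratios should shrink roughly in proportion to the step size.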

Theorem 2. If both partial derivatives of a function $z = f(x, y)$ are continuous in some open disc $\{(x, y) : (x - a)^2 + (y - b)^2 < r\}$ centered at $(a, b)$, then $f(x, y)$ is differentiable at $(a, b)$.

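Proof. We need to show
$$\frac{dz - \Delta z}{\sqrt{(\Delta x)^2 + (\Delta y)^2}} \to 0 \quad \text{as} \quad \sqrt{(\Delta x)^2 + (\Delta y)^2} \to 0.$$
We may write
$$\Delta z - dz = f(x, y) - f(a, b) - \left[\frac{\partial z}{\partial x}(x - a) + \frac{\partial z}{\partial y}(y - b)\right] = \left[f(x, y) - f(a, y) - \frac{\partial z}{\partial x}(x - a)\right] + \left[f(a, y) - f(a, b) - \frac{\partial z}{\partial y}(y - b)\right].$$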

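By the Mean Value Theorem, $f(x, y) - f(a, y) = \frac{\partial z}{\partial x}(x^*, y)(x - a)$ for some $x^*$ between $a$ and $x$ if $x$ is close enough to $a$. Similarly, $f(a, y) - f(a, b) = \frac{\partial z}{\partial y}(a, y^*)(y - b)$ for some $y^*$ between $b$ and $y$ if $y$ is close enough to $b$. We thus get
$$\Delta z - dz = \frac{\partial z}{\partial x}(x^*, y)(x - a) - \frac{\partial z}{\partial x}(x - a) + \frac{\partial z}{\partial y}(a, y^*)(y - b) - \frac{\partial z}{\partial y}(y - b) = \left[\frac{\partial z}{\partial x}(x^*, y) - \frac{\partial z}{\partial x}\right](x - a) + \left[\frac{\partial z}{\partial y}(a, y^*) - \frac{\partial z}{\partial y}\right](y - b).$$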

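Since both
$$\frac{|x - a|}{\sqrt{(x - a)^2 + (y - b)^2}} \le 1 \quad \text{and} \quad \frac{|y - b|}{\sqrt{(x - a)^2 + (y - b)^2}} \le 1,$$
we have
$$\frac{\left|\left[\frac{\partial z}{\partial x}(x^*, y) - \frac{\partial z}{\partial x}\right](x - a)\right|}{\sqrt{(x - a)^2 + (y - b)^2}} \le \left|\frac{\partial z}{\partial x}(x^*, y) - \frac{\partial z}{\partial x}\right| \to 0$$
and
$$\frac{\left|\left[\frac{\partial z}{\partial y}(a, y^*) - \frac{\partial z}{\partial y}\right](y - b)\right|}{\sqrt{(x - a)^2 + (y - b)^2}} \le \left|\frac{\partial z}{\partial y}(a, y^*) - \frac{\partial z}{\partial y}\right| \to 0$$
since both partial derivatives are continuous near $(a, b)$.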

The Chain Rule

For an ordinary function, if $y = f(u)$ and $u = g(x)$, making $y = f \circ g(x)$ a composite function, we can differentiate with respect to $x$ using the Chain Rule: $\frac{dy}{dx} = \frac{dy}{du}\frac{du}{dx}$.

Suppose we have a function $z = f(x, y)$, but $x = g(t)$ and $y = h(t)$, making $z = f(g(t), h(t))$ a composite function of $t$. We can come up with a variation of the Chain Rule, which holds under appropriate conditions. The conditions we will assume are that all the relevant derivatives exist and are continuous near $t$ and all the relevant partial derivatives exist and are continuous near $(g(t), h(t))$.

By the definition of a derivative,
$$\frac{dz}{dt} = \lim_{k \to 0} \frac{f(g(t + k), h(t + k)) - f(g(t), h(t))}{k}.$$

We can rewrite the numerator as
$$f(g(t+k), h(t+k)) - f(g(t), h(t)) = [f(g(t+k), h(t+k)) - f(g(t), h(t+k))] + [f(g(t), h(t+k)) - f(g(t), h(t))].$$
Using the Mean Value Theorem, the first difference may be written
$$f(g(t+k), h(t+k)) - f(g(t), h(t+k)) = f_1(u, h(t+k))[g(t+k) - g(t)],$$
where $u$ is between $g(t + k)$ and $g(t)$. But, also by the Mean Value Theorem, $g(t + k) - g(t) = g'(t^*)k$, where $t^*$ is between $t$ and $t + k$. We thus have
$$f(g(t+k), h(t+k)) - f(g(t), h(t+k)) = f_1(u, h(t+k))\,g'(t^*)\,k.$$
Similarly,
$$f(g(t), h(t+k)) - f(g(t), h(t)) = f_2(g(t), v)\,h'(t^{**})\,k,$$
where $v$ is between $h(t)$ and $h(t + k)$ and $t^{**}$ is between $t$ and $t + k$.

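We thus get
$$\frac{dz}{dt} = \lim_{k \to 0} \frac{f_1(u, h(t+k))\,g'(t^*)\,k + f_2(g(t), v)\,h'(t^{**})\,k}{k} = \lim_{k \to 0} \left[f_1(u, h(t+k))\,g'(t^*) + f_2(g(t), v)\,h'(t^{**})\right] = f_1(g(t), h(t))\,g'(t) + f_2(g(t), h(t))\,h'(t).$$
Using Leibniz’ Notation, this may be written as:
$$\frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt}.$$
This is one variation of the Chain Rule.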
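The Chain Rule formula can be sanity-checked numerically. In the Python sketch below, z = x²y with x = cos t and y = sin t are assumptions chosen for illustration; the right-hand side of the formula is compared with a difference quotient of the composite function.

```python
import math

def f(x, y):
    # Illustrative outer function (an assumption): z = x^2 * y
    return x**2 * y

def g(t):
    return math.cos(t)   # x = g(t)

def h(t):
    return math.sin(t)   # y = h(t)

def chain_rule(t):
    x, y = g(t), h(t)
    dz_dx = 2 * x * y        # partial derivative with respect to x
    dz_dy = x**2             # partial derivative with respect to y
    dx_dt = -math.sin(t)
    dy_dt = math.cos(t)
    return dz_dx * dx_dt + dz_dy * dy_dt

def difference_quotient(t, k=1e-6):
    # Symmetric difference quotient of the composite z = f(g(t), h(t))
    return (f(g(t + k), h(t + k)) - f(g(t - k), h(t - k))) / (2 * k)

t0 = 0.7
print(chain_rule(t0), difference_quotient(t0))
```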

Partial Derivatives Via the Chain Rule

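Suppose $z = f(x, y)$, while $x = g(s, t)$ and $y = h(s, t)$. Then $z = f(g(s, t), h(s, t))$ can be thought of as a function of $s$ and $t$. We might then want to calculate the partial derivatives $\frac{\partial z}{\partial s}$ and $\frac{\partial z}{\partial t}$. By the nature of partial differentiation, the Chain Rule we just derived can be adjusted to give formulas for these partial derivatives.
$$\frac{\partial z}{\partial s} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial s} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial s}$$
$$\frac{\partial z}{\partial t} = \frac{\partial z}{\partial x}\frac{\partial x}{\partial t} + \frac{\partial z}{\partial y}\frac{\partial y}{\partial t}$$
If we have functions involving more than two variables, this may be adjusted in the hopefully obvious way.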

Implicit Differentiation

The Chain Rule may be used to derive a formula for implicit differentiation.

Theorem 3 (Implicit Differentiation). If a differentiable function $y = f(x)$ is defined implicitly by an equation $F(x, y) = 0$, then
$$\frac{dy}{dx} = -\frac{F_x}{F_y} = -\frac{\partial F / \partial x}{\partial F / \partial y}.$$
Note: We have assumed $y = f(x)$ is differentiable. We are not here dealing with how one knows whether such a function is differentiable. In general, if such a function is not differentiable, it will be relatively obvious.

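Proof. Using the Chain Rule,
$$\frac{dF}{dx} = \frac{\partial F}{\partial x}\frac{dx}{dx} + \frac{\partial F}{\partial y}\frac{dy}{dx} = \frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\frac{dy}{dx}.$$
Since $F(x, y) = 0$, it follows that $\frac{dF}{dx} = 0$, so $\frac{\partial F}{\partial x} + \frac{\partial F}{\partial y}\frac{dy}{dx} = 0$. Solving for $\frac{dy}{dx}$, we get $\frac{\partial F}{\partial y}\frac{dy}{dx} = -\frac{\partial F}{\partial x}$, so
$$\frac{dy}{dx} = -\frac{\partial F / \partial x}{\partial F / \partial y}.$$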
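As an illustration of the implicit differentiation formula, the Python sketch below uses the circle x² + y² − 25 = 0 at the point (3, 4). The equation, the point and the helper names are assumptions chosen for illustration. On the upper half of the circle, y can also be solved for explicitly, which gives an independent check.

```python
import math

def F(x, y):
    # Illustrative equation F(x, y) = 0 (an assumption): a circle of radius 5
    return x**2 + y**2 - 25

def dydx_implicit(x, y):
    Fx = 2 * x   # partial derivative of F with respect to x
    Fy = 2 * y   # partial derivative of F with respect to y
    return -Fx / Fy

def y_explicit(x):
    # Upper half of the circle, solved for y
    return math.sqrt(25 - x**2)

def dydx_numeric(x, h=1e-6):
    # Symmetric difference quotient of the explicit solution
    return (y_explicit(x + h) - y_explicit(x - h)) / (2 * h)

print(F(3.0, 4.0))               # the point lies on the curve
print(dydx_implicit(3.0, 4.0))   # -F_x / F_y
print(dydx_numeric(3.0))         # slope of the explicit solution
```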

Directional Derivatives

Consider a function $z = f(x, y)$ and its graph, which will be a surface. The partial derivative $\frac{\partial z}{\partial x}$ may be thought of as representing how fast the surface is rising above one’s head if one is walking on the xy-plane in the direction of the x-axis. Similarly, the partial derivative $\frac{\partial z}{\partial y}$ may be thought of as representing how fast the surface is rising above one’s head if one is walking on the xy-plane in the direction of the y-axis.

For a given unit vector $\mathbf{u}$, we define the directional derivative $D_{\mathbf{u}} z$ to represent how fast the surface is rising above one’s head if one is walking on the xy-plane in the direction of $\mathbf{u}$.

Definition 9 (Directional Derivative). Let $f : \mathbb{R}^n \to \mathbb{R}$ and let $\mathbf{u} \in \mathbb{R}^n$ be a unit vector. Let $g(t) = f(\mathbf{x} + \mathbf{u}t)$. $D_{\mathbf{u}} f(\mathbf{x}) = g'(0)$ is called the directional derivative of $f$ at $\mathbf{x}$ in the direction of $\mathbf{u}$.

Note that if $n = 1$, then the directional derivative is the same as the ordinary derivative, while the directional derivatives in the directions of the coordinate axes are the same as the partial derivatives.

The Del Operator and the Gradient

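Definition 10 (Del Operator). $\nabla = \left\langle \frac{\partial}{\partial x}, \frac{\partial}{\partial y} \right\rangle$

Note this is really just a symbolic entity. By itself, it is meaningless, but we use it as a mnemonic device.

Definition 11 (Gradient). $\operatorname{grad} f = \nabla f = \left\langle \frac{\partial f}{\partial x}, \frac{\partial f}{\partial y} \right\rangle$

The gradient turns out to be convenient when calculating directional derivatives. It also generalizes to higher dimensions.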

Calculating Directional Derivatives

Theorem 4. If all the partial derivatives of $z = f(\mathbf{x})$ are continuous in some open ball centered at $\mathbf{x}$, then $D_{\mathbf{u}} f(\mathbf{x}) = (\nabla f) \cdot \mathbf{u}$.

This theorem gives us a convenient way to calculate any directional derivative of a function and also shows that it is sufficient to be able to calculate all the partial derivatives.
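The following Python sketch compares the two ways of getting a directional derivative: the dot product of the gradient with the unit vector, and g′(0) from the definition, estimated by a difference quotient. The function f(x, y) = x² + 3xy, the point (1, 2) and the unit vector ⟨3/5, 4/5⟩ are assumptions chosen for illustration.

```python
def f(x, y):
    # Illustrative function (an assumption, not from the notes)
    return x**2 + 3 * x * y

def grad_f(x, y):
    # Gradient computed by hand: <2x + 3y, 3x>
    return (2 * x + 3 * y, 3 * x)

def directional_derivative(x, y, u):
    # (grad f) . u
    gx, gy = grad_f(x, y)
    return gx * u[0] + gy * u[1]

def g_prime_at_zero(x, y, u, h=1e-6):
    # g(t) = f(x + u1 t, y + u2 t); estimate g'(0) by a symmetric difference quotient
    def g(t):
        return f(x + u[0] * t, y + u[1] * t)
    return (g(h) - g(-h)) / (2 * h)

u = (3 / 5, 4 / 5)   # a unit vector
print(directional_derivative(1.0, 2.0, u))
print(g_prime_at_zero(1.0, 2.0, u))
```

By hand, the gradient at (1, 2) is ⟨8, 3⟩, so the dot product is 8(3/5) + 3(4/5) = 7.2.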

Proof

We will prove the theorem for R2, but a similar proof will work for higher dimensions; only the notation would get messier.

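Proof. Consider a function $f(x, y)$ and a unit vector $\mathbf{u} = \langle a, b \rangle$. Let $z = g(t)$ be defined by letting $z = f(x, y)$, where $x = x_0 + at$, $y = y_0 + bt$. By definition, $D_{\mathbf{u}} f(x_0, y_0) = g'(0)$. By the Chain Rule,
$$g'(t) = \frac{dz}{dt} = \frac{\partial z}{\partial x}\frac{dx}{dt} + \frac{\partial z}{\partial y}\frac{dy}{dt} = \frac{\partial z}{\partial x}\,a + \frac{\partial z}{\partial y}\,b = (\nabla z) \cdot \mathbf{u}.$$
Evaluating this at 0 gives the result.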

Maximum Value of the Directional Derivative

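$D_{\mathbf{u}} f = (\nabla f) \cdot \mathbf{u} = |\nabla f|\,|\mathbf{u}| \cos\theta$, where $\theta$ is the angle between $\nabla f$ and $\mathbf{u}$. Since $-1 \le \cos\theta \le 1$, the maximal value obviously occurs when $\theta = 0$ and $\cos\theta = 1$, in other words, when $\mathbf{u}$ is in the same direction as $\nabla f$.

There’s a catch: This depends on the property $\mathbf{u} \cdot \mathbf{v} = |\mathbf{u}||\mathbf{v}| \cos\theta$, which we’ve seen for $\mathbb{R}^2$ and $\mathbb{R}^3$, but whose very meaning is unclear for higher dimensions.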

Cauchy-Schwarz Inequality

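We can give $\mathbf{u} \cdot \mathbf{v} = |\mathbf{u}||\mathbf{v}| \cos\theta$ meaning through the Cauchy-Schwarz Inequality $|\mathbf{u} \cdot \mathbf{v}| \le |\mathbf{u}||\mathbf{v}|$. We will show the Cauchy-Schwarz Inequality holds in any dimension, with equality holding if and only if one vector is a multiple of the other.

Consider a vector $\mathbf{u} - t\mathbf{v}$. Certainly $(\mathbf{u} - t\mathbf{v}) \cdot (\mathbf{u} - t\mathbf{v}) \ge 0$, with equality holding if and only if $\mathbf{u} = t\mathbf{v}$. Since
$$(\mathbf{u} - t\mathbf{v}) \cdot (\mathbf{u} - t\mathbf{v}) = \mathbf{u} \cdot \mathbf{u} - 2t\,\mathbf{u} \cdot \mathbf{v} + t^2\,\mathbf{v} \cdot \mathbf{v} = |\mathbf{v}|^2 t^2 - 2(\mathbf{u} \cdot \mathbf{v})t + |\mathbf{u}|^2,$$
we get $|\mathbf{v}|^2 t^2 - 2(\mathbf{u} \cdot \mathbf{v})t + |\mathbf{u}|^2 \ge 0$.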

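It follows that, when $\mathbf{v} \ne \mathbf{0}$, the quadratic equation $|\mathbf{v}|^2 t^2 - 2(\mathbf{u} \cdot \mathbf{v})t + |\mathbf{u}|^2 = 0$ in $t$ can’t have more than one solution, so the discriminant $(-2\,\mathbf{u} \cdot \mathbf{v})^2 - 4|\mathbf{v}|^2|\mathbf{u}|^2$ can’t be positive. In other words, $(-2\,\mathbf{u} \cdot \mathbf{v})^2 - 4|\mathbf{v}|^2|\mathbf{u}|^2 \le 0$, so $4(\mathbf{u} \cdot \mathbf{v})^2 - 4|\mathbf{v}|^2|\mathbf{u}|^2 \le 0$, so $(\mathbf{u} \cdot \mathbf{v})^2 - |\mathbf{v}|^2|\mathbf{u}|^2 \le 0$, so $(\mathbf{u} \cdot \mathbf{v})^2 \le |\mathbf{v}|^2|\mathbf{u}|^2$, so $|\mathbf{u} \cdot \mathbf{v}| \le |\mathbf{u}||\mathbf{v}|$. If $\mathbf{v} = \mathbf{0}$, both sides are 0.

Equality clearly holds if and only if either $\mathbf{u} - t\mathbf{v} = \mathbf{0}$ for some $t$ or $\mathbf{v} = \mathbf{0}$, in other words, if and only if either $\mathbf{u}$ is a scalar multiple of $\mathbf{v}$ or $\mathbf{v} = \mathbf{0}$.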

Cauchy-Schwarz and Directional Derivatives

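Since $|\mathbf{u} \cdot \mathbf{v}| \le |\mathbf{u}||\mathbf{v}|$, it follows that $-1 \le \frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}||\mathbf{v}|} \le 1$. We may thus define the angle $\theta$ between $\mathbf{u}$ and $\mathbf{v}$ by
$$\theta = \arccos\left(\frac{\mathbf{u} \cdot \mathbf{v}}{|\mathbf{u}||\mathbf{v}|}\right).$$
It follows that $\mathbf{u} \cdot \mathbf{v} = |\mathbf{u}||\mathbf{v}| \cos\theta$, so the argument we used before about the directional derivative being maximal in the direction of the gradient can legitimately be used.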
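The definition of the angle works in any dimension, as the Python sketch below illustrates for two vectors in R⁴. The vectors and the helper names `dot`, `norm` and `angle` are assumptions chosen for illustration.

```python
import math

def dot(u, v):
    return sum(a * b for a, b in zip(u, v))

def norm(u):
    return math.sqrt(dot(u, u))

def angle(u, v):
    # theta = arccos(u . v / (|u||v|)), defined for nonzero u and v
    return math.acos(dot(u, v) / (norm(u) * norm(v)))

u = (1.0, 2.0, 3.0, 4.0)
v = (4.0, 3.0, 2.0, 1.0)

# Cauchy-Schwarz: |u . v| <= |u||v|
print(abs(dot(u, v)), norm(u) * norm(v))
print(angle(u, v))

# Equality case: w is a scalar multiple of u
w = tuple(2 * a for a in u)
print(abs(dot(u, w)), norm(u) * norm(w))
```

Here u · v = 20 and |u||v| = 30, so cos θ = 2/3.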

Tangent Planes and Gradients

Recall the formula for the plane tangent to the surface $z = f(x, y)$ at a point $(a, b)$:
$$z - c = \frac{\partial z}{\partial x}(x - a) + \frac{\partial z}{\partial y}(y - b).$$
Using the language of gradients, this could be written in the form $z - c = (\nabla f) \cdot \langle x - a, y - b \rangle$ or $z - c = (\nabla f) \cdot (\mathbf{x} - \mathbf{x}_0)$, where $\mathbf{x} = \langle x, y \rangle$ and $\mathbf{x}_0 = \langle a, b \rangle$. Since one standard form for the equation of a plane is $z - z_0 = \mathbf{n} \cdot (\mathbf{x} - \mathbf{x}_0)$, with $\mathbf{n}$ being a normal to the plane, it follows that $\nabla f$ is normal to the tangent plane.

Tangent Planes for Surfaces Defined Implicitly

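Suppose a surface is the graph of an equation $\phi(x, y, z) = 0$. At most points (where there is a tangent plane and the tangent plane isn’t vertical), a portion of the surface near the point can be considered the graph of a function $z = f(x, y)$ defined implicitly by the equation $\phi(x, y, z) = 0$ along with some side conditions.

By the formula for implicit differentiation,
$$\frac{\partial z}{\partial x} = -\frac{\partial \phi / \partial x}{\partial \phi / \partial z} \quad \text{and} \quad \frac{\partial z}{\partial y} = -\frac{\partial \phi / \partial y}{\partial \phi / \partial z},$$
so the equation of the tangent plane may be written
$$z - c = -\frac{\partial \phi / \partial x}{\partial \phi / \partial z}(x - a) - \frac{\partial \phi / \partial y}{\partial \phi / \partial z}(y - b).$$
Simplifying:
$$\frac{\partial \phi}{\partial z}(z - c) = -\frac{\partial \phi}{\partial x}(x - a) - \frac{\partial \phi}{\partial y}(y - b),$$
$$\frac{\partial \phi}{\partial x}(x - a) + \frac{\partial \phi}{\partial y}(y - b) + \frac{\partial \phi}{\partial z}(z - c) = 0.$$
This can also be written in the form $(\nabla \phi) \cdot \langle x - a, y - b, z - c \rangle = 0$.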
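As a concrete check of the gradient form of the tangent plane, the Python sketch below uses the sphere x² + y² + z² − 14 = 0 at the point (1, 2, 3). The surface, the point and the helper names are assumptions chosen for illustration. A point taken from the plane written in the solved-for-z form should satisfy the gradient form as well.

```python
def phi(x, y, z):
    # Illustrative surface phi(x, y, z) = 0 (an assumption): a sphere
    return x**2 + y**2 + z**2 - 14

def grad_phi(x, y, z):
    # Gradient computed by hand: <2x, 2y, 2z>
    return (2 * x, 2 * y, 2 * z)

a, b, c = 1.0, 2.0, 3.0
px, py, pz = grad_phi(a, b, c)

def plane_lhs(x, y, z):
    # Left-hand side of (grad phi) . <x - a, y - b, z - c> = 0
    return px * (x - a) + py * (y - b) + pz * (z - c)

def z_on_plane(x, y):
    # The same plane solved for z, using dz/dx = -phi_x/phi_z and dz/dy = -phi_y/phi_z
    return c - (px / pz) * (x - a) - (py / pz) * (y - b)

print(phi(a, b, c))                                  # the point lies on the surface
print(plane_lhs(1.3, 1.9, z_on_plane(1.3, 1.9)))     # should be (nearly) zero
```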