Multivariable Calculus
The world is not one-dimensional, and calculus doesn’t stop with a single independent variable.
The ideas of partial derivatives and multiple integrals are not too different from their single-
variable counterparts, but some of the details about manipulating them are not so obvious. Some
are downright tricky.
8.1 Partial Derivatives
The basic idea of derivatives and of integrals in two, three, or more dimensions follows the same
pattern as for one dimension. They’re just more complicated.
The derivative of a function of one variable is defined as
$$\frac{df(x)}{dx} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x) - f(x)}{\Delta x} \tag{8.1}$$
You would think that the derivative of a function of x and y would then be defined as
$$\frac{\partial f(x, y)}{\partial x} = \lim_{\Delta x \to 0} \frac{f(x + \Delta x, y) - f(x, y)}{\Delta x} \tag{8.2}$$
and more-or-less it is. The ∂ notation instead of d is a reminder that there are other coordinates
floating around that are temporarily being treated as constants.
In order to see why I used the phrase “more-or-less,” take a very simple example: f (x, y) =
y. Use the preceding definition, and because y is being held constant, the derivative ∂f /∂x = 0.
What could be easier?
I don’t like these variables so I’ll switch to a different set of coordinates, x′ and y′:
$$y' = x + y \qquad\text{and}\qquad x' = x$$
What is ∂f/∂x′ now?
$$f(x, y) = y = y' - x = y' - x'$$
Now the derivative of f with respect to x′ is −1, because I’m keeping the other coordinate fixed.
Or is the derivative still zero because x′ = x and I’m taking ∂f/∂x, and why should that change
just because I’m using a different coordinate system?
The problem is that the notation is ambiguous. When you see ∂f /∂x it doesn’t tell you
what to hold constant. Is it to be y or y′ or yet something else? In some contexts the answer is
clear and you won’t have any difficulty deciding, but you’ve already encountered cases for which
the distinction is crucial. In thermodynamics, when you add heat to a gas to raise its temperature
does this happen at constant pressure or at constant volume or with some other constraint? The
specific heat at constant pressure is not the same as the specific heat at constant volume; it is
necessarily bigger because during an expansion some of the energy has to go into the work of
changing the volume. This sort of derivative depends on the type of process that you’re using, and
for a classical ideal gas the difference between the two molar specific heats obeys the equation
$$c_p - c_v = R$$
James Nearing, University of Miami
8—Multivariable Calculus
If the gas isn’t ideal, this equation is replaced by a more complicated and general one, but the
same observation applies, that the two derivatives dQ/dT aren’t the same.
In thermodynamics there are so many variables in use that there is a standard notation for
a partial derivative, indicating exactly which other variables are to be held constant.
$$\left(\frac{\partial U}{\partial V}\right)_T \qquad\text{and}\qquad \left(\frac{\partial U}{\partial V}\right)_P$$
represent the change in the internal energy of an object per change in volume during processes in
which respectively the temperature and the pressure are held constant. In the previous example
with the function f = y, this says
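$$\left(\frac{\partial f}{\partial x}\right)_y = 0 \qquad\text{and}\qquad \left(\frac{\partial f}{\partial x}\right)_{y'} = -1$$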
This notation is a way to specify the direction in the x-y plane along which you’re taking the
derivative.
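A small numerical sketch can make the ambiguity concrete. It is not from the text: the helper `partial_x` and the sample point are made up for illustration, and the primed coordinates are taken to be x′ = x, y′ = x + y as above. The same function f = y gives 0 or −1 depending on which second coordinate is held fixed.

```python
def partial_x(func, x, other, h=1e-6):
    """Central-difference derivative in the first argument, holding `other` fixed."""
    return (func(x + h, other) - func(x - h, other)) / (2 * h)


def f_in_xy(x, y):
    """f = y, written in the coordinates (x, y)."""
    return y


def f_in_x_yprime(x, y_prime):
    """The same f, written in (x', y') with x' = x and y' = x + y, so y = y' - x'."""
    return y_prime - x


x0, y0 = 0.7, 1.3            # arbitrary sample point (an assumption)
y0_prime = x0 + y0

hold_y = partial_x(f_in_xy, x0, y0)                    # (df/dx) at constant y
hold_y_prime = partial_x(f_in_x_yprime, x0, y0_prime)  # (df/dx') at constant y'
print(hold_y, hold_y_prime)
```

Both calls differentiate with respect to the same coordinate value x = x′; only the quantity held constant differs.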
8.2 Chain Rule
For functions of one variable, the chain rule allows you to differentiate with respect to still another
variable: y a function of x and x a function of t allows
$$\frac{dy}{dt} = \frac{dy}{dx}\,\frac{dx}{dt} \tag{8.3}$$
You can derive this simply from the definition of a derivative.
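$$\begin{aligned}
\frac{\Delta y}{\Delta t} &= \frac{y\big(x(t + \Delta t)\big) - y\big(x(t)\big)}{\Delta t} \\
&= \frac{y\big(x(t + \Delta t)\big) - y\big(x(t)\big)}{x(t + \Delta t) - x(t)} \cdot \frac{x(t + \Delta t) - x(t)}{\Delta t}
 = \frac{\Delta y}{\Delta x} \cdot \frac{\Delta x}{\Delta t}
\end{aligned}$$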
Take the limit of this product as ∆t → 0. Necessarily then you have that ∆x → 0 too (unless the
derivative doesn’t exist anyway). The second factor is then the definition of the derivative dx/dt,
and the first factor is the definition of dy/dx. The Leibniz notation as written in Eq. (8.3) leads
you to the required proof.
What happens with more variables? Roughly the same thing but with more manipulation,
the same sort of manipulation that you use in deriving the product rule for derivatives of one
variable (as in problem 1.44).
$$\text{Compute}\qquad \frac{d}{dt}\, f\big(x(t), y(t)\big)$$
Back to the ∆’s. The manipulation is much like the preceding except that you have to add and
subtract a term in the second line.
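$$\begin{aligned}
\frac{\Delta f}{\Delta t} &= \frac{f\big(x(t + \Delta t), y(t + \Delta t)\big) - f\big(x(t), y(t)\big)}{\Delta t} \\
&= \frac{f\big(x(t + \Delta t), y(t + \Delta t)\big) - f\big(x(t), y(t + \Delta t)\big) + f\big(x(t), y(t + \Delta t)\big) - f\big(x(t), y(t)\big)}{\Delta t} \\
&= \frac{f\big(x(t + \Delta t), y(t + \Delta t)\big) - f\big(x(t), y(t + \Delta t)\big)}{x(t + \Delta t) - x(t)} \cdot \frac{x(t + \Delta t) - x(t)}{\Delta t} \\
&\quad + \frac{f\big(x(t), y(t + \Delta t)\big) - f\big(x(t), y(t)\big)}{y(t + \Delta t) - y(t)} \cdot \frac{y(t + \Delta t) - y(t)}{\Delta t} \\
&= \frac{\Delta f}{\Delta x} \cdot \frac{\Delta x}{\Delta t} + \frac{\Delta f}{\Delta y} \cdot \frac{\Delta y}{\Delta t}
\end{aligned}$$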
In the first factor of the first term, ∆f /∆x, the variable x is changed but y is not. In the first
factor of the second term, the reverse holds true. The limit of this expression is then
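$$\lim_{\Delta t \to 0} \frac{\Delta f}{\Delta t} = \frac{df}{dt} = \left(\frac{\partial f}{\partial x}\right)_y \frac{dx}{dt} + \left(\frac{\partial f}{\partial y}\right)_x \frac{dy}{dt} \tag{8.4}$$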
If these manipulations look familiar, then you probably studied section 1.5. That case is like this
one, with the special values x ≡ y ≡ t.
Example: (When you want to check out an equation, you should construct an example so
that it reveals a lot of structure without requiring a lot of calculation.)
$$f(x, y) = Axy^2, \qquad\text{and}\qquad x(t) = Ct^3, \qquad y(t) = Dt^2$$
First do it using the chain rule.
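$$\begin{aligned}
\frac{df}{dt} &= \left(\frac{\partial f}{\partial x}\right)_y \frac{dx}{dt} + \left(\frac{\partial f}{\partial y}\right)_x \frac{dy}{dt} \\
&= Ay^2 \cdot 3Ct^2 + 2Axy \cdot 2Dt \\
&= A(Dt^2)^2 \cdot 3Ct^2 + 2A(Ct^3)(Dt^2) \cdot 2Dt \\
&= 7ACD^2 t^6
\end{aligned}$$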
Now repeat the calculation by first substituting the values of x and y and then differentiating.
$$\begin{aligned}
\frac{df}{dt} &= \frac{d}{dt}\, A(Ct^3)(Dt^2)^2 \\
&= \frac{d}{dt}\, ACD^2 t^7 \\
&= 7ACD^2 t^6
\end{aligned}$$
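The same check can be run numerically. This is only a sketch: the values of A, C, D and the time t0 are arbitrary choices, not from the text. It compares the chain-rule expression of Eq. (8.4), the closed form 7ACD²t⁶, and a finite-difference derivative of f(x(t), y(t)).

```python
import math

A, C, D = 2.0, 3.0, 0.5      # arbitrary constants (an assumption)


def x(t):
    return C * t**3


def y(t):
    return D * t**2


def f(x_val, y_val):
    return A * x_val * y_val**2


def df_dt_chain(t):
    """(df/dx)_y dx/dt + (df/dy)_x dy/dt, Eq. (8.4)."""
    df_dx = A * y(t)**2
    df_dy = 2 * A * x(t) * y(t)
    return df_dx * 3 * C * t**2 + df_dy * 2 * D * t


def df_dt_closed(t):
    """Substitute first, then differentiate: 7 A C D^2 t^6."""
    return 7 * A * C * D**2 * t**6


def df_dt_numeric(t, h=1e-6):
    """Central difference of the composite function."""
    return (f(x(t + h), y(t + h)) - f(x(t - h), y(t - h))) / (2 * h)


t0 = 1.2
print(df_dt_chain(t0), df_dt_closed(t0), df_dt_numeric(t0))
```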
What if f also has an explicit t in it: f(t, x(t), y(t))? That simply adds another term.
Remember, dt/dt = 1.
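$$\frac{df}{dt} = \left(\frac{\partial f}{\partial t}\right)_{x,y} + \left(\frac{\partial f}{\partial x}\right)_{y,t} \frac{dx}{dt} + \left(\frac{\partial f}{\partial y}\right)_{x,t} \frac{dy}{dt} \tag{8.5}$$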
Sometimes you see the chain rule written in a slightly different form. You can change
coordinates from (x, y) to (r, φ), switching from rectangular to polar. You can switch from
(x, y) to a system such as (x′, y′) = (x + y, x − y). The function can be expressed in the new
coordinates explicitly: solve for x, y in terms of r, φ or x′, y′ and then differentiate with respect
to the new coordinate. Or you can use the chain rule to differentiate with respect to the new
variable.
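$$\left(\frac{\partial f}{\partial x'}\right)_{y'} = \left(\frac{\partial f}{\partial x}\right)_y \left(\frac{\partial x}{\partial x'}\right)_{y'} + \left(\frac{\partial f}{\partial y}\right)_x \left(\frac{\partial y}{\partial x'}\right)_{y'} \tag{8.6}$$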
This is actually not a different equation from Eq. (8.4). It only looks different because in addition
to t there’s another variable that you have to keep constant: t → x′, and y′ is constant.
Example: When you switch from rectangular to plane polar coordinates what is ∂f /∂φ in
terms of the x and y derivatives?
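$$x = r\cos\phi, \qquad y = r\sin\phi, \qquad\text{so}$$
$$\begin{aligned}
\left(\frac{\partial f}{\partial \phi}\right)_r &= \left(\frac{\partial f}{\partial x}\right)_y \left(\frac{\partial x}{\partial \phi}\right)_r + \left(\frac{\partial f}{\partial y}\right)_x \left(\frac{\partial y}{\partial \phi}\right)_r \\
&= \left(\frac{\partial f}{\partial x}\right)_y (-r\sin\phi) + \left(\frac{\partial f}{\partial y}\right)_x (r\cos\phi)
\end{aligned}$$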
If f(x, y) = x² + y² this had better be zero, because I’m finding how f changes when r is held
fixed. Check it out; it is: 2x(−r sin φ) + 2y(r cos φ) = −2r² cos φ sin φ + 2r² sin φ cos φ = 0.
The equation (8.6) presents the form that is most important in many applications.
Example: What is the derivative of y with respect to φ at constant x?
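$$\begin{aligned}
\left(\frac{\partial y}{\partial \phi}\right)_x &= \left(\frac{\partial y}{\partial r}\right)_\phi \left(\frac{\partial r}{\partial \phi}\right)_x + \left(\frac{\partial y}{\partial \phi}\right)_r \left(\frac{\partial \phi}{\partial \phi}\right)_x \\
&= [\sin\phi] \cdot r\,\frac{\sin\phi}{\cos\phi} + [r\cos\phi] \cdot 1 = r\,\frac{1}{\cos\phi}
\end{aligned} \tag{8.7}$$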
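[Figure: at constant x, a change ∆φ moves the point a distance r∆φ along the circle and ∆y vertically; the angle between these two displacements is φ.]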
You see a graphical interpretation of the calculation in this diagram: φ changes by ∆φ, so
the coordinate moves up by ∆y (x is constant). The angle between the lines ∆y and r∆φ is φ
itself. This means that ∆y ÷ r∆φ = 1/cos φ, and that is precisely the preceding equation for
(∂y/∂φ)ₓ, the derivative at constant x.
In doing the calculation leading to Eq. (8.7), do you see how to do the calculation for
(∂r/∂φ)ₓ, the derivative of r at constant x? Differentiate the equation x = r cos φ with respect to φ.
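$$x = r\cos\phi \;\longrightarrow\; \left(\frac{\partial x}{\partial \phi}\right)_x = 0 = \left(\frac{\partial r}{\partial \phi}\right)_x \cos\phi + r \left(\frac{\partial \cos\phi}{\partial \phi}\right)_x = \left(\frac{\partial r}{\partial \phi}\right)_x \cos\phi - r\sin\phi$$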
Solve for the unknown derivative and you have the result.
Another example: f(x, y) = x² − 2xy. The transformation between rectangular and polar
coordinates is x = r cos φ, y = r sin φ. What is (∂f/∂x) at constant r?
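$$\left(\frac{\partial f}{\partial x}\right)_r = \left(\frac{\partial f}{\partial x}\right)_y \left(\frac{\partial x}{\partial x}\right)_r + \left(\frac{\partial f}{\partial y}\right)_x \left(\frac{\partial y}{\partial x}\right)_r = (2x - 2y) + (-2x) \left(\frac{\partial y}{\partial x}\right)_r$$
$$\left(\frac{\partial y}{\partial x}\right)_r = \frac{(\partial y/\partial \phi)_r}{(\partial x/\partial \phi)_r} = \frac{r\cos\phi}{-r\sin\phi} = -\cot\phi \tag{8.8}$$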
(Remember problem 1.49?) Put these together and
$$\left(\frac{\partial f}{\partial x}\right)_r = (2x - 2y) + (-2x)(-\cot\phi) = 2x - 2y + 2x\cot\phi \tag{8.9}$$
The brute-force way to do this is to express the function f explicitly in terms of the variables x
and r, eliminating y and φ.
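$$y = r\sin\phi = \sqrt{r^2 - x^2}, \qquad\text{then}$$
$$\begin{aligned}
\left(\frac{\partial f}{\partial x}\right)_r &= \left(\frac{\partial}{\partial x}\Big[x^2 - 2x\sqrt{r^2 - x^2}\,\Big]\right)_r \\
&= 2x - 2\sqrt{r^2 - x^2} - 2x\,\frac{1}{\sqrt{r^2 - x^2}}\,(-x) = 2x + \frac{-2\left(r^2 - x^2\right) + 2x^2}{\sqrt{r^2 - x^2}}
\end{aligned} \tag{8.10}$$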
You can see that this is the same as the equation (8.9) if you look at the next-to-last form
of equation (8.10).
$$\frac{x}{\sqrt{r^2 - x^2}} = \frac{r\cos\phi}{\sqrt{r^2 - r^2\cos^2\phi}} = \cot\phi$$
Is this result reasonable? Look at what happens to y when you change x
by a little bit. Constant r is a circle, and if φ puts the position over near the
right side (ten or twenty degrees), a little change in x causes a big change in y
as shown by the rectangle. As drawn, ∆y/∆x is big and negative, sort of like
the (negative) cotangent of φ as in Eq. (8.8).
[Figure: a circle of constant r with a small rectangle of sides ∆x and ∆y near its right side.]
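Equation (8.9) can also be checked numerically by stepping x along the circle of constant r. The sketch below is an illustration only: the helper name `d_dx_at_fixed_r` and the sample point are assumptions, and it uses the upper half of the circle (y ≥ 0) so that y = √(r² − x²).

```python
import math


def f(x, y):
    return x * x - 2 * x * y


def d_dx_at_fixed_r(func, x, r, h=1e-6):
    """Central difference in x while staying on the circle of radius r (y >= 0)."""
    def on_circle(x_val):
        return func(x_val, math.sqrt(r * r - x_val * x_val))
    return (on_circle(x + h) - on_circle(x - h)) / (2 * h)


r0, phi0 = 2.0, 0.5                      # arbitrary point with 0 < phi < pi/2
x0, y0 = r0 * math.cos(phi0), r0 * math.sin(phi0)

numeric = d_dx_at_fixed_r(f, x0, r0)
eq_8_9 = 2 * x0 - 2 * y0 + 2 * x0 / math.tan(phi0)   # 2x - 2y + 2x cot(phi)
print(numeric, eq_8_9)
```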
8.3 Differentials
For a function of a single variable you can write
$$df = \frac{df}{dx}\,dx \tag{8.11}$$
and read (sort of) that the infinitesimal change in the function f is the slope times the infinitesimal
change in x. Does this really make any sense? What is an infinitesimal change? Is it zero? Is
dx a number or isn’t it? What’s going on?
It is possible to translate this intuitive idea into something fairly simple and that makes
perfectly good sense. Once you understand what it really means you’ll be able to use the intuitive
idea and its notation with more security.
Let g be a function of two variables, x and h.
$$g(x, h) = \frac{df(x)}{dx}\,h \qquad\text{has the property that}\qquad \frac{1}{h}\Big[f(x + h) - f(x) - g(x, h)\Big] \longrightarrow 0 \quad\text{as } h \to 0$$
That is, the function g(x, h) approximates very well the change in f as you go from x to x + h.
The difference between g and ∆f = f (x + h) − f (x) goes to zero so fast that even after you’ve
divided by h the difference goes to zero.
The usual notation is to use the symbol dx instead of h and to call the function df instead*
of g.
$$df(x, dx) = f'(x)\,dx \qquad\text{has the property that}\qquad \frac{1}{dx}\Big[f(x + dx) - f(x) - df(x, dx)\Big] \longrightarrow 0 \quad\text{as } dx \to 0 \tag{8.12}$$
In this language dx is just another variable that can go from −∞ to +∞ and df is just a specified
function of two variables. The point is that this function is useful because when the variable dx
is small df provides a very good approximation to the increment ∆f in f .
What is the volume of the peel on an orange? The volume of a sphere is V = 4πr³/3, so
its differential is dV = 4πr² dr. If the radius of the orange is 3 cm and the thickness of the peel
is 2 mm, the volume of the peel is
$$dV = 4\pi r^2\,dr = 4\pi (3\ \text{cm})^2 (0.2\ \text{cm}) = 23\ \text{cm}^3$$
The whole volume of the orange is (4/3)π(3 cm)³ = 113 cm³, so this peel is about 20% of the
volume.
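How good is the differential here? A quick sketch, assuming (this is my reading, not stated in the text) that the 3 cm radius includes the peel, compares dV with the exact volume of the shell between r − dr and r.

```python
import math

r, dr = 3.0, 0.2                                   # cm, values from the example

dV = 4 * math.pi * r**2 * dr                       # differential estimate
peel_exact = 4 / 3 * math.pi * (r**3 - (r - dr)**3)  # exact shell, peel inside r
V_total = 4 / 3 * math.pi * r**3

print(round(dV, 2), round(peel_exact, 2), round(dV / V_total, 3))
```

The two peel volumes differ by a few percent, which is the size of the neglected terms of order dr².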
Differentials in Several Variables
The analog of Eq. (8.11) for several variables is
$$df = df(x, y, dx, dy) = \left(\frac{\partial f}{\partial x}\right)_y dx + \left(\frac{\partial f}{\partial y}\right)_x dy \tag{8.13}$$
Roughly speaking, near a point in the x-y plane, the value of the function f changes as a linear
function of the coordinates as you move a (little) distance away. This function df describes this
change to high accuracy. It bears the same relation to Eq. (8.4) that (8.11) bears to Eq. (8.3).
For example, take the function f(x, y) = x² + y². At the point (x, y) = (1, 2), the
differential is
$$df(1, 2, dx, dy) = (2x)\Big|_{(1,2)}\,dx + (2y)\Big|_{(1,2)}\,dy = 2\,dx + 4\,dy$$
so that
$$f(1.01, 1.99) \approx f(1, 2) + df(1, 2, .01, -.01) = 1^2 + 2^2 + 2(.01) + 4(-.01) = 4.98$$
compared to the exact answer, 4.9802.
* Who says that a variable in algebra must be only a single letter? You would never write a
computer program that way. d(Fred²)/d Fred = 2 Fred is perfectly sensible.
The equation analogous to (8.12) is
$$df(x, y, dx, dy) \quad\text{has the property that}\quad \frac{1}{dr}\Big[f(x + dx, y + dy) - f(x, y) - df(x, y, dx, dy)\Big] \longrightarrow 0 \quad\text{as } dr \to 0 \tag{8.14}$$
where dr = √(dx² + dy²) is the distance to (x, y). It’s not that you will be able to do a lot more
with this precise definition than you could with the intuitive idea. You will however be able to
work with a better understanding of your actions. When you say that “dx is an infinitesimal”
you can understand that this means simply that dx is any number but that the equations using
it are useful only for very small values of that number.
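The defining property in Eq. (8.14) is easy to watch numerically. The sketch below is an illustration, not part of the text: it reuses f = x² + y² at the point (1, 2), picks displacements along (1, −1) as an arbitrary direction, and prints the remainder divided by dr. For this particular f the remainder is exactly dr², so the ratio equals dr and shrinks with it.

```python
import math


def f(x, y):
    return x * x + y * y


def df(x, y, dx, dy):
    """The differential of Eq. (8.13) for f = x^2 + y^2."""
    return 2 * x * dx + 2 * y * dy


x0, y0 = 1.0, 2.0
scales = (1e-1, 1e-2, 1e-3)
ratios = []
for scale in scales:
    dx, dy = scale, -scale                 # arbitrary direction (an assumption)
    dr = math.hypot(dx, dy)
    remainder = f(x0 + dx, y0 + dy) - f(x0, y0) - df(x0, y0, dx, dy)
    ratios.append(remainder / dr)
    print(dr, remainder / dr)
```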
You can’t use this notation for everything as the notation for the derivative demonstrates.
The symbol “df /dx” does not mean to divide a function by a length; it refers to a well-defined
limiting process. This notation is however constructed so that it provides an intuitive guide, and
even if you do think of it as the function df divided by the variable dx, you get the right answer.
Why should such a thing as a differential exist? It’s essentially the first terms after the
constant in the power series representation of the original function: section 2.5. But how to tell
if such a series works anyway? I’ve been notably cavalier about proofs. The answer is that there
is a proper theorem guaranteeing Eq. (8.14) works. It is that if both partial derivatives exist in
the neighborhood of the expansion point and if these derivatives are continuous there, then the
differential exists and has the value that I stated in Eq. (8.13). It has the properties stated in
Eq. (8.14). For all this refer to one of many advanced calculus texts, such as Apostol’s.*
8.4 Geometric Interpretation
For one variable, the picture of the differential is simple. Start with a graph of the function and
at a point (x, y) = (x, f (x)), find the straight line that best approximates the function in the
immediate neighborhood of that point. Now set up a new coordinate system with origin at this
(x, y) and call the new coordinates dx and dy. In this coordinate system the straight line passes
through the origin and the slope is the derivative df (x)/dx. The equation for the straight line is
then Eq. (8.11), describing the differential.
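$$dy = \frac{df(x)}{dx}\,dx$$
[Figure: the graph of f with its tangent line at (x, y) and the new axes dx and dy centered at that point.]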
For two variables, the picture parallels this one. At a point (x, y, z) = (x, y, f (x, y)) find
the plane that best approximates the function in the immediate neighborhood of that point. Set
up a new coordinate system with origin at this (x, y, z) and call the new coordinates dx, dy, and
dz. The equation for a plane that passes through this origin is α dx + β dy + γ dz = 0, and for
* Mathematical Analysis, Addison-Wesley
this best approximating plane, the equation is nothing more than the equation for the differential,
Eq. (8.13).
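$$dz = \left(\frac{\partial f(x, y)}{\partial x}\right)_y dx + \left(\frac{\partial f(x, y)}{\partial y}\right)_x dy$$
[Figure: the tangent plane at (x, y, z) with the axes dx, dy, and dz.]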
The picture is a bit harder to draw, but with a little practice you can do it.
For the case of three independent variables, I’ll leave the sketch to you.
Examples
The temperature on the surface of a heated disk is given to be T(r, φ) = T₀ + T₁(1 − r²/a²),
where a is the radius of the disk and T₀ and T₁ are constants. If you start at position x = c < a,
y = 0 and move parallel to the y-axis at speed v₀ what is the rate of change of temperature that
you feel?
Use Eq. (8.4), and the relation r = √(x² + y²).
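$$\begin{aligned}
\frac{dT}{dt} &= \left(\frac{\partial T}{\partial r}\right)_\phi \frac{dr}{dt} + \left(\frac{\partial T}{\partial \phi}\right)_r \frac{d\phi}{dt}
 = \left(\frac{\partial T}{\partial r}\right)_\phi \left[\left(\frac{\partial r}{\partial x}\right)_y \frac{dx}{dt} + \left(\frac{\partial r}{\partial y}\right)_x \frac{dy}{dt}\right] \\
&= \left(-2T_1 \frac{r}{a^2}\right) \cdot \frac{y}{\sqrt{x^2 + y^2}}\, v_0
 = -2T_1 \frac{\sqrt{c^2 + v_0^2 t^2}}{a^2} \cdot \frac{v_0 t}{\sqrt{c^2 + v_0^2 t^2}}\, v_0 \\
&= -2T_1 \frac{v_0^2 t}{a^2}
\end{aligned}$$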
As a check, the dimensions are correct (are they?). At time zero, this vanishes, and that’s
what you should expect because at the beginning of the motion you’re starting to move in the
direction perpendicular to the direction in which the temperature is changing. The farther you
go, the more nearly parallel to the direction of the radius you’re moving. If you are moving exactly
parallel to the radius, this time-derivative is easier to calculate; it’s then almost a problem in a
single variable.
$$\frac{dT}{dt} \approx \frac{dT}{dr}\,\frac{dr}{dt} \approx -2T_1 \frac{r}{a^2}\, v_0 \approx -2T_1 \frac{v_0 t}{a^2}\, v_0$$
So the approximate and the exact calculation agree. In fact they agree so well that you should try
to find out if this is a lucky coincidence or if there is some special aspect of the problem that you
might have seen from the beginning and that would have made the whole thing much simpler.
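A numerical spot check of the result dT/dt = −2T₁v₀²t/a² is straightforward. In this sketch the constants T0, T1, a, c, v0 and the time t0 are arbitrary illustrative values, not from the text; the temperature is written directly in x and y and differentiated along the path x = c, y = v₀t.

```python
import math

T0, T1, a = 20.0, 5.0, 1.0      # arbitrary disk parameters (assumptions)
c, v0 = 0.3, 0.2                # starting x and speed, with c < a


def temperature(x, y):
    """T = T0 + T1 (1 - r^2/a^2) with r^2 = x^2 + y^2."""
    return T0 + T1 * (1 - (x * x + y * y) / (a * a))


def temperature_on_path(t):
    return temperature(c, v0 * t)


def dT_dt_numeric(t, h=1e-6):
    return (temperature_on_path(t + h) - temperature_on_path(t - h)) / (2 * h)


def dT_dt_formula(t):
    return -2 * T1 * v0**2 * t / a**2


t0 = 1.5
print(dT_dt_numeric(t0), dT_dt_formula(t0))
```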
8.5 Gradient
The equation (8.13) for the differential has another geometric interpretation. For a function such
as f(x, y) = x² + 4y², the equations representing constant values of f describe curves in the
x-y plane. In this example, they are ellipses. If you start from any fixed point in the plane and
start to move away from it, the rate at which the value of f changes will depend on the direction
in which you move. If you move along the curve defined by f = constant then f won’t change
at all. If you move perpendicular to that direction then f may change a lot.
The gradient of f at a point is the vector pointing in
the direction in which f is increasing most rapidly, and
the component of the gradient along that direction is the
derivative of f with respect to the distance in that direction.
To relate this to the partial derivatives that we’ve been using, and to understand how to
compute and to use the gradient, return to Eq. (8.13) and write it in vector form. Use the
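common notation for the basis: x̂ and ŷ. Then let
$$d\vec r = dx\,\hat x + dy\,\hat y \qquad\text{and}\qquad \vec G = \left(\frac{\partial f}{\partial x}\right)_y \hat x + \left(\frac{\partial f}{\partial y}\right)_x \hat y \tag{8.15}$$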
The equation for the differential is now
$$df = df(x, y, dx, dy) = \vec G \cdot d\vec r \tag{8.16}$$
[Figure: the vectors G and dr separated by the angle θ.]
Because you know the properties of the dot product, you know that this is G dr cos θ and it
is largest when the directions of dr and of G are the same. It’s zero when they are perpendicular.
You also know that df is zero when dr is in the direction along the curve where f is constant.
The vector G is therefore perpendicular to this curve. It is in the direction in which f is changing
most rapidly. Also because df = G dr cos 0, you see that G is the derivative of f with respect
to distance along that direction. G is the gradient.
For the example f(x, y) = x² + 4y², G = 2x x̂ + 8y ŷ. At each point in the x-y plane
it provides a vector showing the steepness of f at that point and the direction in which f is
changing most rapidly.
Notice that the gradient vectors are twice as long where the ellipses are closest together as
they are at the ends where the ellipses are farthest apart. The function changes more rapidly in
the y-direction.
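These statements about G can be tested numerically. The sketch below is only an illustration with made-up helper names and sample points: it checks that the directional derivative along G equals the length of G, that it vanishes along the perpendicular (tangent) direction, and that on the ellipse x² + 4y² = 4 the gradient at (0, 1) is twice as long as at (2, 0).

```python
import math


def f(x, y):
    return x * x + 4 * y * y


def grad_f(x, y):
    """G = 2x x_hat + 8y y_hat for f = x^2 + 4y^2."""
    return (2 * x, 8 * y)


def directional_derivative(x, y, angle, h=1e-6):
    """Derivative of f with respect to distance along the unit vector at `angle`."""
    nx, ny = math.cos(angle), math.sin(angle)
    return (f(x + h * nx, y + h * ny) - f(x - h * nx, y - h * ny)) / (2 * h)


x0, y0 = 1.0, 0.5                                   # arbitrary sample point
gx, gy = grad_f(x0, y0)
g_length = math.hypot(gx, gy)
g_angle = math.atan2(gy, gx)

along_g = directional_derivative(x0, y0, g_angle)                    # close to |G|
along_curve = directional_derivative(x0, y0, g_angle + math.pi / 2)  # close to 0

length_at_end = math.hypot(*grad_f(2.0, 0.0))    # on x^2 + 4y^2 = 4, long axis
length_at_side = math.hypot(*grad_f(0.0, 1.0))   # same ellipse, short axis
print(along_g, g_length, along_curve, length_at_side / length_at_end)
```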
The United States Coast and Geodetic Survey makes a large number of maps, and hikers
are particularly interested in the contour maps. They show curves indicating the lines of constant
altitude. As the highest altitude in Florida is less than 100 meters, denizens of that state may
never have seen one of these maps, but they’re important where there are mountains.
The gravitational potential energy of a mass m near the Earth’s surface is mgh. This
divided by the mass is the gravitational potential, gh. These lines of constant altitude are
then lines of constant potential, equipotentials of the gravitational field. Walk along such an
equipotential and you are doing no work against gravity, just walking on the level.
8.6 Electrostatics
The electric field can be described in terms of a gradient. For a single point charge at the origin
the electric field is
$$\vec E(x, y, z) = \frac{kq}{r^2}\,\hat r$$
where r̂ is the unit vector pointing away from the origin and r is the distance to the origin. This
vector can be written as a gradient. Because this E is everywhere pointing away from the origin,
it’s everywhere perpendicular to the sphere centered at the origin.
$$\vec E = -\operatorname{grad} \frac{kq}{r}$$
You can verify this in several ways. The first is to go straight to the definition of a gradient.
(There’s a blizzard of minus signs in this approach, so have a little patience. It will get better.)
This function (1/r) is increasing most rapidly in the direction moving toward the origin. The
derivative with respect to distance in this direction is −d/dr, so −d/dr(1/r) = +1/r². The
direction of greatest increase is along −r̂, so grad (1/r) = −r̂(1/r²). But the relation to the
electric field has another −1 in it, so
$$-\operatorname{grad} \frac{kq}{r} = +\hat r\,\frac{kq}{r^2}$$
There’s got to be a better way.
Yes, instead of insisting that you move in the direction in which the function is increasing
most rapidly, simply move in the direction in which it is changing most rapidly. The derivative
with respect to distance in that direction is the component in that direction and the plus or
minus signs take care of themselves. The derivative with respect to r of (1/r) is −1/r². That
is the component in the direction r̂, the direction in which you took the derivative. This says
grad (1/r) = −r̂(1/r²). You get the same result as before but without so much fussing. This
also makes it look more like the familiar ordinary derivative in one dimension.
Still another way is from the Stallone-Schwarzenegger brute force school of computing.
Put everything in rectangular coordinates and do the partial derivatives using Eqs. (8.15) and
(8.6).
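$$\left(\frac{\partial (1/r)}{\partial x}\right)_{y,z} = \left(\frac{\partial (1/r)}{\partial r}\right)_{\theta,\phi} \left(\frac{\partial r}{\partial x}\right)_{y,z} = -\frac{1}{r^2}\,\frac{\partial}{\partial x}\sqrt{x^2 + y^2 + z^2} = -\frac{1}{r^2}\,\frac{x}{\sqrt{x^2 + y^2 + z^2}}$$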
Repeat this for y and z with similar results and assemble the output.
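$$-\operatorname{grad} \frac{kq}{r} = \frac{kq}{r^2}\,\frac{x\,\hat x + y\,\hat y + z\,\hat z}{\sqrt{x^2 + y^2 + z^2}} = \frac{kq}{r^2}\,\frac{\vec r}{r} = \frac{kq}{r^2}\,\hat r$$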
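The brute-force route is also the easiest one to hand to a computer. This sketch (the helper `numeric_grad` and the sample point are assumptions for illustration) takes central differences of 1/r in rectangular coordinates and compares the result with −r̂/r², whose rectangular components are −x/r³, −y/r³, −z/r³.

```python
import math


def inverse_r(x, y, z):
    return 1.0 / math.sqrt(x * x + y * y + z * z)


def numeric_grad(func, point, h=1e-6):
    """Central-difference gradient of func at a 3-component point."""
    grad = []
    for i in range(3):
        plus, minus = list(point), list(point)
        plus[i] += h
        minus[i] -= h
        grad.append((func(*plus) - func(*minus)) / (2 * h))
    return grad


point = (1.0, -2.0, 2.0)                       # arbitrary point with r = 3
r = math.sqrt(sum(c * c for c in point))

numeric = numeric_grad(inverse_r, point)
expected = [-c / r**3 for c in point]          # components of -r_hat / r^2
print(numeric, expected)
```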
The symbol ∇ is commonly used for the gradient operator. This vector operator will appear
in several other places; the curl of a vector field will be the one you see most often.
$$\nabla = \hat x\,\frac{\partial}{\partial x} + \hat y\,\frac{\partial}{\partial y} + \hat z\,\frac{\partial}{\partial z} \tag{8.17}$$
From Eq. (8.15) you have
$$\operatorname{grad} f = \nabla f \tag{8.18}$$
8.7 Plane Polar Coordinates
When doing integrals in the plane there are many coordinate systems to choose from, but
rectangular and polar coordinates are the most common. You can find the element of area with a
simple sketch: The lines (or curves) of constant coordinate enclose an area that is, for small
enough increments in the coordinates, a rectangle. Then you just multiply the sides. In one case
∆x · ∆y and in the other case ∆r · r∆φ.
[Figure: area elements bounded by x, x + dx, y, y + dy in rectangular coordinates and by r, r + dr, φ, φ + dφ in polar coordinates.]
Vibrating Drumhead
A circular drumhead can vibrate in many complicated ways. The simplest and lowest frequency
mode is approximately
$$z(r, \phi, t) = z_0 \left(1 - r^2/R^2\right) \cos\omega t \tag{8.19}$$
where R is the radius of the drum and ω is the frequency of oscillation. (The shape is more
accurately described by Eq. (4.22) but this approximation is pretty good for a start.) The kinetic
energy density of the moving drumhead is u = ½σ(∂z/∂t)². That is, in a small area ∆A, the
kinetic energy is ∆K = u∆A and the limit as ∆A → 0 of ∆K/∆A is the area-energy-density.
In the same way, σ is the area mass density, dm/dA.
What is the total kinetic energy because of this oscillation? It is ∫u dA = ∫u d²r. To
evaluate it, use polar coordinates and integrate over the area of the drumhead. The notation d²r
is another notation for dA just as d³r is used for a piece of volume.
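$$\begin{aligned}
\int u\,dA &= \int_0^R r\,dr \int_0^{2\pi} d\phi\; \frac{\sigma}{2} \Big[z_0 \left(1 - r^2/R^2\right) \omega \sin\omega t\Big]^2 \\
&= \frac{\sigma}{2}\, 2\pi z_0^2 \omega^2 \sin^2\omega t \int_0^R dr\, r \left(1 - r^2/R^2\right)^2 \\
&= \sigma\pi z_0^2 \omega^2 \sin^2\omega t\; \frac{1}{2} \int_{r=0}^{r=R} d(r^2) \left(1 - r^2/R^2\right)^2 \\
&= \sigma\pi z_0^2 \omega^2 \sin^2\omega t\; \frac{1}{2}\, R^2\, \frac{1}{3} \left(1 - r^2/R^2\right)^3 (-1) \bigg|_{r=0}^{r=R} \\
&= \frac{1}{6}\, \sigma R^2 \pi z_0^2 \omega^2 \sin^2\omega t
\end{aligned} \tag{8.20}$$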
See problem 8.10 and following for more on this.*
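As a check on Eq. (8.20), the radial integral can be done with a simple midpoint sum. This is a sketch under assumptions: the numbers for σ, z₀, R, ω, and t are arbitrary, and the φ integral is replaced by its value 2π because the integrand does not depend on φ.

```python
import math

sigma, z0, R, omega, t = 0.5, 0.01, 0.3, 200.0, 0.004   # arbitrary values


def energy_density(r):
    """u = (sigma/2) (dz/dt)^2 for z = z0 (1 - r^2/R^2) cos(omega t)."""
    dz_dt = -z0 * (1 - r * r / (R * R)) * omega * math.sin(omega * t)
    return 0.5 * sigma * dz_dt**2


def kinetic_energy_numeric(n=2000):
    """Midpoint rule in r; each ring has area 2 pi r dr."""
    dr = R / n
    total = 0.0
    for i in range(n):
        r = (i + 0.5) * dr
        total += energy_density(r) * 2 * math.pi * r * dr
    return total


def kinetic_energy_formula():
    """Last line of Eq. (8.20)."""
    return sigma * R**2 * math.pi * z0**2 * omega**2 * math.sin(omega * t)**2 / 6


print(kinetic_energy_numeric(), kinetic_energy_formula())
```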
8.8 Cylindrical, Spherical Coordinates
The three common coordinate systems used in three dimensions are rectangular, cylindrical, and
spherical coordinates, and these are the ones you have to master. When you need to use prolate
spheroidal coordinates you can look them up.
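[Figure: the rectangular (x, y, z), cylindrical (r, φ, z), and spherical (r, θ, φ) coordinate systems.]
rectangular          cylindrical          spherical
−∞ < x < ∞           0 < r < ∞            0 < r < ∞
−∞ < y < ∞           0 < φ < 2π           0 < θ < π
−∞ < z < ∞           −∞ < z < ∞           0 < φ < 2π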
The surfaces that have constant values of these coordinates are planes in rectangular
coordinates; planes and cylinders in cylindrical; planes, spheres, and cones in spherical. In every
one of these cases the constant-coordinate surfaces intersect each other at right angles, hence
the name “orthogonal coordinate” systems. In spherical coordinates I used the coordinate θ as
the angle from the z-axis and φ as the angle around the axis. In mathematics books these are
typically reversed, so watch out for the notation. On the globe of the Earth, φ is like the longitude
and θ like the latitude except that longitude goes 0 to 180° East and 0 to 180° West from the
Greenwich meridian instead of zero to 2π. Latitude is 0 to 90° North or South from the equator
instead of zero to π from the pole. Except for the North-South terminology, latitude is 90° − θ.
The volume elements for these systems come straight from the drawings, just as the area
elements do in plane coordinates. In every case you can draw six surfaces, bounded by constant
coordinates, and surrounding a small box. Because these are orthogonal coordinates you can
compute the volume of the box easily as the product of its three edges.
* For some animations showing these oscillations and others, check out
www.physics.miami.edu/nearing/mathmethods/drumhead-animations.html
In the spherical case, one side is ∆r. Another side is r∆θ. The third side is not r∆φ; it
is r sin θ∆φ. The reason for the factor sin θ is that the arc of the circle made at constant r and
constant θ is not in a plane passing through the origin. It is in a plane parallel to the x-y plane,
so it has a radius r sin θ.
                rectangular     cylindrical              spherical
volume  d³r =   dx dy dz        r dr dφ dz               r² sin θ dr dθ dφ
area    d²r =   dx dy           r dφ dz or r dφ dr       r² sin θ dθ dφ
Examples of Multiple Integrals
Even in rectangular coordinates integration can be tricky.
That’s because you have to pay
attention to the limits of integration far more closely than you do for simple one dimensional
integrals. I’ll illustrate this with two dimensional rectangular coordinates first, and will choose a
problem that is easy but still shows what you have to look for.
An Area
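Find the area in the x-y plane between the curves y = x²/a and y = x.
$$\text{(A)}\quad \int_0^a dx \int_{x^2/a}^{x} dy\; 1 \qquad\text{and}\qquad \text{(B)}\quad \int_0^a dy \int_{y}^{\sqrt{ay}} dx\; 1$$
[Figure: the region between the two curves, drawn twice, with a strip at fixed x for (A) and a strip at fixed y for (B).]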
In the first instance I fix x and add the pieces of dy in the strip indicated. The lower limit
of the dy integral comes from the specified equation of the lower curve. The upper limit is the
value of y for the given x at the upper curve. After that the limits on the sum over dx come
from the intersection of the two curves: y = x = x²/a gives x = a for that limit.
In the second instance I fix y and sum over dx first. The left limit is easy, x = y, and the
upper limit comes from solving y = x²/a for x in terms of y. When that integral is done, the
remaining dy integral starts at zero and goes up to the intersection at y = x = a.
Now do the integrals.
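$$\text{(A)}\quad \int_0^a dx \left[x - x^2/a\right] = \frac{a^2}{2} - \frac{a^3}{3a} = \frac{a^2}{6}$$
$$\text{(B)}\quad \int_0^a dy \left[\sqrt{ay} - y\right] = a^{1/2}\,\frac{a^{3/2}}{3/2} - \frac{a^2}{2} = \frac{a^2}{6}$$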
If you would care to try starting this calculation from the beginning, without drawing any pictures,
be my guest.
A Moment of Inertia
The moment of inertia about an axis is ∫r⊥² dm. Here, r⊥ is the perpendicular distance to the
axis. What is the moment of inertia of a uniform sheet of mass M in the shape of a right triangle
of sides a and b? Take the moment about the right-angled vertex. The area mass density,
σ = dm/dA, is 2M/ab. The moment of inertia is then
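$$\begin{aligned}
\int (x^2 + y^2)\,\sigma\,dA &= \int_0^a dx \int_0^{b(a-x)/a} dy\; \sigma (x^2 + y^2) = \int_0^a dx\; \sigma \Big[x^2 y + y^3/3\Big]_0^{b(a-x)/a} \\
&= \int_0^a dx\; \sigma \left[x^2\, \frac{b}{a}(a - x) + \frac{1}{3} \left(\frac{b}{a}\right)^3 (a - x)^3\right] \\
&= \sigma \left[\frac{b}{a} \left(\frac{a^4}{3} - \frac{a^4}{4}\right) + \frac{1}{3}\, \frac{b^3}{a^3}\, \frac{a^4}{4}\right] \\
&= \frac{1}{12}\, \sigma \left(ba^3 + ab^3\right) = \frac{M}{6} \left(a^2 + b^2\right)
\end{aligned}$$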
The dimensions are correct. For another check take the case where a = 0, reducing this to
Mb²/6. But wait, this now looks like a thin rod, and I remember that the moment of inertia of
a thin rod about its end is Mb²/3. What went wrong? Nothing. Look again more closely. Show
why this limiting answer ought to be less than Mb²/3.
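The double integral, limits and all, can be checked with a midpoint sum. This sketch is an illustration only: the function names and the values of M, a, b are made up, and the triangle is assumed to have its right angle at the origin with legs along the axes, as in the integral above.

```python
import math


def moment_numeric(M, a, b, n=400):
    """Midpoint rule for the integral of sigma (x^2 + y^2) over the triangle."""
    sigma = 2 * M / (a * b)
    dx = a / n
    total = 0.0
    for i in range(n):
        x = (i + 0.5) * dx
        y_top = b * (a - x) / a        # upper limit of the dy integral at this x
        dy = y_top / n
        for j in range(n):
            y = (j + 0.5) * dy
            total += sigma * (x * x + y * y) * dx * dy
    return total


def moment_formula(M, a, b):
    return M * (a * a + b * b) / 6


M, a, b = 2.0, 3.0, 4.0                # arbitrary values (assumptions)
print(moment_numeric(M, a, b), moment_formula(M, a, b))
```

Note that the inner limit `y_top` is recomputed for every x; that is the whole content of "paying attention to the limits."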
Volume of a Sphere
What is the volume of a sphere of radius R? The most obvious approach would be to use spherical
coordinates. See problem 8.16 for that. I’ll use cylindrical coordinates instead. The element of
volume is dV = r dr dφ dz, and the integrals can be done a couple of ways.
$$\int d^3r = \int_0^R r\,dr \int_0^{2\pi} d\phi \int_{-\sqrt{R^2 - r^2}}^{+\sqrt{R^2 - r^2}} dz = \int_{-R}^{+R} dz \int_0^{2\pi} d\phi \int_0^{\sqrt{R^2 - z^2}} r\,dr \tag{8.21}$$
You can finish these now; see problem 8.17.
A Surface Charge Density
An example that appears in electrostatics: The surface charge density, dq/dA, on a sphere of
radius R is σ(θ, φ) = σ₀ sin² θ cos² φ. What is the total charge on the sphere?
The element of area is R² sin θ dθ dφ, so the total charge is ∫σ dA,
$$Q = \int_0^\pi \sin\theta\,d\theta\, R^2 \int_0^{2\pi} d\phi\; \sigma_0 \sin^2\theta \cos^2\phi = R^2 \int_{-1}^{+1} d\cos\theta\; \sigma_0 \left(1 - \cos^2\theta\right) \int_0^{2\pi} d\phi\, \cos^2\phi$$
The mean value of cos² is 1/2, so the φ integral gives π. For the rest, it is
$$\sigma_0 \pi R^2 \left[\cos\theta - \frac{1}{3}\cos^3\theta\right]_{-1}^{+1} = \frac{4}{3}\, \sigma_0 \pi R^2$$
Limits of Integration
Sometimes the trickiest part of multiple integrals is determining the limits of integration.
Especially when you have to change the order of integration, the new limits may not be obvious. Are
there any special techniques or tricks to doing this? Yes, there is one, perhaps obscure, method
that you may not be accustomed to.
Draw Pictures.
If you have an integral such as the first one, you have to draw a picture of the integration
domain to switch limits.
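$$\int_0^1 dy \int_y^{\sqrt{2 - y^2}} dx\; f(x, y) \qquad\qquad \left[\int_0^1 dx \int_0^x dy + \int_1^{\sqrt 2} dx \int_0^{\sqrt{2 - x^2}} dy\right] f(x, y) \tag{8.22}$$
[Figure: the integration domain in the x-y plane, bounded by the x-axis, the line y = x, and the circle x² + y² = 2.]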
Of course, once you’ve drawn the picture you may realize that simply interchanging the order of
integration won’t help, but that polar coordinates may.
$$\int_0^{\sqrt 2} r\,dr \int_0^{\pi/4} d\phi$$
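If you want reassurance that the three sets of limits describe the same domain, integrate a test function all three ways. The sketch below is not from the text: `double_integral` is a hypothetical midpoint-rule helper, and f(x, y) = xy is an arbitrary test function whose integral over this wedge works out to 1/4.

```python
import math


def double_integral(g, outer_lo, outer_hi, inner_lo, inner_hi, n=400):
    """Midpoint rule for g(outer, inner); inner limits are functions of the outer variable."""
    total = 0.0
    d_outer = (outer_hi - outer_lo) / n
    for i in range(n):
        u = outer_lo + (i + 0.5) * d_outer
        lo, hi = inner_lo(u), inner_hi(u)
        d_inner = (hi - lo) / n
        for j in range(n):
            v = lo + (j + 0.5) * d_inner
            total += g(u, v) * d_outer * d_inner
    return total


def f(x, y):
    return x * y


sqrt2 = math.sqrt(2.0)

# First form of Eq. (8.22): y outside, x inside.
dy_then_dx = double_integral(lambda y, x: f(x, y), 0.0, 1.0,
                             lambda y: y, lambda y: math.sqrt(2 - y * y))

# Second form: x outside, y inside, in two pieces.
dx_then_dy = (
    double_integral(lambda x, y: f(x, y), 0.0, 1.0,
                    lambda x: 0.0, lambda x: x)
    + double_integral(lambda x, y: f(x, y), 1.0, sqrt2,
                      lambda x: 0.0, lambda x: math.sqrt(2 - x * x))
)

# Polar form: the area element carries the extra factor r.
polar = double_integral(lambda r, phi: f(r * math.cos(phi), r * math.sin(phi)) * r,
                        0.0, sqrt2, lambda r: 0.0, lambda r: math.pi / 4)

print(dy_then_dx, dx_then_dy, polar)
```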
8.9 Vectors: Cylindrical, Spherical Bases
When you describe vectors in three dimensions are you restricted to the basis x̂, ŷ, ẑ? In a
different coordinate system you should use basis vectors that are adapted to that system. In
rectangular coordinates these vectors have the convenient property that they point along the
direction perpendicular to the plane where the corresponding coordinate is constant. They also
point in the direction in which the other two coordinates are constant. E.g. the unit vector x̂
points perpendicular to the plane of constant x (the y-z plane); it also points along the line where
y and z are constant.
[Figure: the unit vectors x̂, ŷ, ẑ in rectangular coordinates; r̂, φ̂, ẑ in cylindrical coordinates; and r̂, θ̂, φ̂ in spherical coordinates.]
Do the same thing for cylindrical coordinates. The unit vector ẑ points perpendicular to
the x-y plane. The unit vector r̂ points perpendicular to the cylinder r = constant. The unit
vector φ̂ points perpendicular to the plane φ = constant and along the direction for which r and
z are constant. The conventional right-hand rule specifies ẑ = r̂ × φ̂.
For spherical coordinates r̂ points perpendicular to the sphere r = constant. The φ̂ vector
is perpendicular to the plane φ = constant and points along the direction where r = constant
and θ = constant and toward increasing coordinate φ. Finally θ̂ is perpendicular to the cone
θ = constant and again, points toward increasing θ. Then φ̂ = r̂ × θ̂, and on the Earth, these
vectors r̂, θ̂, and φ̂ are $\widehat{\text{up}}$, $\widehat{\text{South}}$, and $\widehat{\text{East}}$.
Solenoid
A standard solenoid is a cylindrical coil of wire, so that when the wire carries a current it produces
a magnetic field. To describe this field, it seems that cylindrical coordinates are advised. Until
you know something about the field the most general thing that you can write is
$$\vec B(r, \phi, z) = \hat r\, B_r(r, \phi, z) + \hat\phi\, B_\phi(r, \phi, z) + \hat z\, B_z(r, \phi, z)$$
In a real solenoid that’s it; all three of these components are present. If you have an ideal,
infinitely long solenoid, with the current going strictly around in the φ̂ direction, (found only in
textbooks) the use of Maxwell’s equations and appropriately applied symmetry arguments will
simplify this to ẑ B_z(r).
Gravitational Field
The gravitational field of the Earth is simple, g = −r̂ GM/r², pointing straight toward the center
of the Earth. Well no, not really. The Earth has a bulge at the equator; its equatorial diameter is
about 43 km larger than its polar diameter. This changes the g-field so that it has a noticeable θ̂
component. At least it’s noticeable if you’re trying to place a satellite in orbit or to send a craft
to another planet.
A better approximation to the gravitational field of the Earth is
$$\vec g = -\frac{GM}{r^2}\,\hat r - G\,\frac{3Q}{r^4} \Big[\hat r \left(3\cos^2\theta - 1\right)/2 + \hat\theta \cos\theta \sin\theta\Big] \tag{8.23}$$
The letter Q stands for the quadrupole moment. |Q| ≪ MR², and it’s a measure of the bulge.
By convention a football (American football) has a positive Q; the Earth’s Q is negative. (What
about a European football?)
Nuclear Magnetic Field
The magnetic field from the nucleus of many atoms (even as simple an atom as hydrogen) is
proportional to
$$\frac{1}{r^3}\bigl[2\hat{r}\cos\theta + \hat{\theta}\sin\theta\bigr] \tag{8.24}$$
As with the preceding example these are in spherical coordinates, and the component along the
$\hat{\phi}$ direction is zero. This field’s effect on the electrons in the atom is small but detectable.
The magnetic properties of the nucleus are central to the subject of nuclear magnetic resonance
(NMR), and that has its applications in magnetic resonance imaging* (MRI).
* In medicine MRI was originally called NMR, but someone decided that this would disconcert
the patients.
8.10 Gradient in other Coordinates
The equation for the gradient computed in rectangular coordinates is Eq. (8.15) or (8.18). How
do you compute it in cylindrical or spherical coordinates? You do it the same way that you got
Eq. (8.15) from Eq. (8.13). The coordinates r, φ, and z are just more variables, so Eq. (8.13)
is simply
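$$df = df(r, \phi, z, dr, d\phi, dz) = \left(\frac{\partial f}{\partial r}\right)_{\phi,z} dr + \left(\frac{\partial f}{\partial \phi}\right)_{r,z} d\phi + \left(\frac{\partial f}{\partial z}\right)_{r,\phi} dz \tag{8.25}$$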
All that’s left is to write $d\vec{r}$ in these coordinates, just as in Eq. (8.15).
$$d\vec{r} = \hat{r}\, dr + \hat{\phi}\, r\, d\phi + \hat{z}\, dz \tag{8.26}$$
The part in the $\hat{\phi}$ direction is the displacement of $d\vec{r}$ in that direction. As φ changes by a small
amount the distance moved is not dφ; it is r dφ. The equation
$$df = df(r, \phi, z, dr, d\phi, dz) = \operatorname{grad} f \cdot d\vec{r}$$
combined with the two equations (8.25) and (8.26) gives grad f as
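$$\operatorname{grad} f = \hat{r}\,\frac{\partial f}{\partial r} + \hat{\phi}\,\frac{1}{r}\frac{\partial f}{\partial \phi} + \hat{z}\,\frac{\partial f}{\partial z} = \nabla f \tag{8.27}$$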
Notice that the units work out right too.
In spherical coordinates the procedure is identical. All that you have to do is to identify
what $d\vec{r}$ is.
$$d\vec{r} = \hat{r}\, dr + \hat{\theta}\, r\, d\theta + \hat{\phi}\, r \sin\theta\, d\phi$$
Again with this case you have to look at the distance moved when a coordinate changes by
a small amount. Just as with cylindrical coordinates this determines the gradient in spherical
coordinates.
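$$\operatorname{grad} f = \hat{r}\,\frac{\partial f}{\partial r} + \hat{\theta}\,\frac{1}{r}\frac{\partial f}{\partial \theta} + \hat{\phi}\,\frac{1}{r\sin\theta}\frac{\partial f}{\partial \phi} = \nabla f \tag{8.28}$$
The equations (8.15), (8.27), and (8.28) define the gradient (and correspondingly $\nabla$) in
three coordinate systems.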
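A quick numerical test can make the cylindrical form, Eq. (8.27), more believable. What follows is only a minimal sketch: the test function `f_cart`, the sample point, and the step size `h` are hypothetical choices made for illustration. It computes the gradient by central differences in rectangular coordinates, computes it again from the cylindrical components, rotates the latter back to rectangular components, and compares.

```python
import math

def f_cart(x, y, z):
    # Hypothetical test function, chosen only for illustration.
    return x * x * y + z * math.sin(x)

def f_cyl(r, phi, z):
    # The same function expressed in cylindrical coordinates.
    return f_cart(r * math.cos(phi), r * math.sin(phi), z)

def grad_cart(x, y, z, h=1e-6):
    # Central differences for (df/dx, df/dy, df/dz).
    return (
        (f_cart(x + h, y, z) - f_cart(x - h, y, z)) / (2 * h),
        (f_cart(x, y + h, z) - f_cart(x, y - h, z)) / (2 * h),
        (f_cart(x, y, z + h) - f_cart(x, y, z - h)) / (2 * h),
    )

def grad_cyl(r, phi, z, h=1e-6):
    # Components along r-hat, phi-hat, z-hat, following Eq. (8.27).
    d_r = (f_cyl(r + h, phi, z) - f_cyl(r - h, phi, z)) / (2 * h)
    d_phi = (f_cyl(r, phi + h, z) - f_cyl(r, phi - h, z)) / (2 * h)
    d_z = (f_cyl(r, phi, z + h) - f_cyl(r, phi, z - h)) / (2 * h)
    return (d_r, d_phi / r, d_z)

def cyl_to_cart_components(g, phi):
    # r-hat = (cos phi, sin phi, 0) and phi-hat = (-sin phi, cos phi, 0).
    g_r, g_phi, g_z = g
    return (
        g_r * math.cos(phi) - g_phi * math.sin(phi),
        g_r * math.sin(phi) + g_phi * math.cos(phi),
        g_z,
    )

r, phi, z = 1.7, 0.6, -0.4          # arbitrary sample point
x, y = r * math.cos(phi), r * math.sin(phi)
from_cyl = cyl_to_cart_components(grad_cyl(r, phi, z), phi)
from_cart = grad_cart(x, y, z)
max_diff = max(abs(u - v) for u, v in zip(from_cyl, from_cart))
print(max_diff)
```

Dropping the factor 1/r from the φ component will in general spoil the agreement, which is another way to see why the distance moved is r dφ and not dφ.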
8.11 Maxima, Minima, Saddles
With one variable you can look for a maximum or a minimum by taking a derivative and setting
it to zero. For several variables you do it several times so that you will get as many equations as
you have unknown coordinates.
Put this in the language of gradients: $\nabla f = 0$. The derivative of f vanishes in every
direction as you move from such a point. As examples,
$$f(x, y) = x^2 + y^2, \qquad\text{or}\qquad = -x^2 - y^2, \qquad\text{or}\qquad = x^2 - y^2$$
For all three of these the gradient is zero at (x, y) = (0, 0); the first has a minimum there, the
second a maximum, and the third neither — it is a “saddle point.” Draw a picture to see the
reason for the name. The generic term for all three of these is “critical point.”
An important example of finding a minimum is “least square fitting” of functions. How
close are two functions to each other? The most commonly used, and in every way the simplest,
definition of the distance (squared) between f and g on the interval a < x < b is
$$\int_a^b dx\, \bigl| f(x) - g(x) \bigr|^2 \tag{8.29}$$
This means that a large deviation of one function from the other in a small region counts more
than smaller deviations spread over a larger domain. The square sees to that. As a specific
example, take a function f on the interval 0 < x < L and try to fit it to the sum of a couple of
trigonometric functions. The best fit will be the one that minimizes the distance between f and
the sum. (Take f to be a real-valued function for now.)
$$D^2(\alpha, \beta) = \int_0^L dx \left( f(x) - \alpha \sin\frac{\pi x}{L} - \beta \sin\frac{2\pi x}{L} \right)^2 \tag{8.30}$$
D is the distance between the given function and the sines used to fit it. To minimize the
distance, take derivatives with respect to the parameters α and β.
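$$\frac{\partial D^2}{\partial \alpha} = 2\int_0^L dx \left( f(x) - \alpha \sin\frac{\pi x}{L} - \beta \sin\frac{2\pi x}{L} \right)\left( -\sin\frac{\pi x}{L} \right) = 0$$
$$\frac{\partial D^2}{\partial \beta} = 2\int_0^L dx \left( f(x) - \alpha \sin\frac{\pi x}{L} - \beta \sin\frac{2\pi x}{L} \right)\left( -\sin\frac{2\pi x}{L} \right) = 0$$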
These two equations determine the parameters α and β.
$$\alpha \int_0^L dx\, \sin^2\frac{\pi x}{L} = \int_0^L dx\, f(x) \sin\frac{\pi x}{L}$$
$$\beta \int_0^L dx\, \sin^2\frac{2\pi x}{L} = \int_0^L dx\, f(x) \sin\frac{2\pi x}{L}$$
The other integrals vanish because of the orthogonality of sin πx/L and sin 2πx/L on this
interval. What you get is exactly the coefficients of the Fourier series expansion of f . The
Fourier series is the best fit (in the least square sense) of a sum of orthogonal functions to f .
See section 11.6 for more on this.
Is it a minimum? Yes. Look at the coefficients of $\alpha^2$ and $\beta^2$ in Eq. (8.30). They are
positive; $+\alpha^2 + \beta^2$ has a minimum, not a maximum or saddle point, and there is no cross term
in αβ to mess it up.
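To see the least-square fit at work numerically, here is a minimal sketch. The function `f`, the interval length `L = 1`, the midpoint-rule integrator, and the size of the trial perturbations are all hypothetical choices for illustration. It computes α and β from the two decoupled equations above and then checks that moving either parameter away from those values increases $D^2$.

```python
import math

L = 1.0
N = 2000  # number of midpoint-rule panels (an arbitrary choice)

def f(x):
    # Hypothetical function to fit; any real-valued f on (0, L) would do.
    return x * (L - x)

def integrate(g):
    # Midpoint rule on the interval 0 < x < L.
    dx = L / N
    return sum(g((k + 0.5) * dx) for k in range(N)) * dx

def s1(x):
    return math.sin(math.pi * x / L)

def s2(x):
    return math.sin(2 * math.pi * x / L)

def dist2(alpha, beta):
    # D^2(alpha, beta) of Eq. (8.30).
    return integrate(lambda x: (f(x) - alpha * s1(x) - beta * s2(x)) ** 2)

# The two decoupled equations for the best-fit parameters.
alpha = integrate(lambda x: f(x) * s1(x)) / integrate(lambda x: s1(x) ** 2)
beta = integrate(lambda x: f(x) * s2(x)) / integrate(lambda x: s2(x) ** 2)

best = dist2(alpha, beta)
nearby = [
    dist2(alpha + da, beta + db)
    for da in (-0.01, 0.0, 0.01)
    for db in (-0.01, 0.0, 0.01)
    if (da, db) != (0.0, 0.0)
]
print(alpha, beta, best, min(nearby))
```

For this particular f the second coefficient comes out essentially zero, because f is symmetric about the midpoint of the interval while sin 2πx/L is antisymmetric.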
The distance function Eq. (8.29) is simply (the square of) the norm in the vector space
sense of the difference of the two vectors f and g. Equations (6.12) and (6.7) here become
$$\|f - g\|^2 = \langle f - g, f - g \rangle = \int_a^b dx\, \bigl| f(x) - g(x) \bigr|^2$$
[Figure: the vectors $\vec{e}_1$ and $\vec{e}_2$ span a plane; the figure marks the shortest distance to the plane.]
The geometric meaning of Eq. (8.30) is that $\vec{e}_1$ and $\vec{e}_2$ provide a basis for the two dimensional
space
$$\alpha\,\vec{e}_1 + \beta\,\vec{e}_2 = \alpha \sin\frac{\pi x}{L} + \beta \sin\frac{2\pi x}{L}$$
The plane is the set of all linear combinations of the two vectors, and for a general vector not
in this plane, the shortest distance to the plane defines the vector in the plane that is the best
fit to the given vector. It’s the one that’s closest. Because the vectors $\vec{e}_1$ and $\vec{e}_2$ are orthogonal
it makes it easy to find the closest vector. You require that the difference, $\vec{v} - \alpha\vec{e}_1 - \beta\vec{e}_2$, have
only an $\vec{e}_3$ component. That is Fourier series.
Hessian
In this example leading to Fourier components, it’s pretty easy to see that you’re dealing with a
minimum and not anything else. In other situations it may not be so easy. You may have a lot
of variables. You may have complicated cross terms. Is $x^2 + xy + y^2$ a minimum at the origin?
Is $x^2 + 3xy + y^2$? (Yes and No respectively.)
When there’s only one variable there is a simple rule that lets you decide. Check the second
derivative. If it’s positive you have a minimum; if it’s negative you have a maximum. If it’s zero
you have more work to do. Is there a similar method for several variables? Yes, and I’ll show it
explicitly for two variables. Once you see how to do it in two dimensions, the generalization to
N is just a matter of how much work you’re willing to do (or how much computer time you can
use).
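The Taylor series in two variables, Eq. (2.16), is to second order
$$f(x + dx, y + dy) = f(x, y) + \frac{\partial f}{\partial x}dx + \frac{\partial f}{\partial y}dy + \frac{1}{2}\left[\frac{\partial^2 f}{\partial x^2}dx^2 + 2\frac{\partial^2 f}{\partial x\,\partial y}dx\,dy + \frac{\partial^2 f}{\partial y^2}dy^2\right] + \cdots$$
Write this in a more compact notation in order to emphasize the important parts.
$$f(\vec{r} + d\vec{r}) - f(\vec{r}) = \nabla f \cdot d\vec{r} + \tfrac{1}{2}\langle d\vec{r}, H\, d\vec{r}\rangle + \cdots$$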
The part with the gradient is familiar, and to have either a minimum or a maximum, that will
have to be zero. The next term introduces a new idea, the Hessian, constructed from all the
second derivative terms. Write these second order terms as a matrix to see what they are, and in
order to avoid a lot of clumsy notation use subscripts as an abbreviation for the partial derivatives.
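$$\langle d\vec{r}, H\, d\vec{r}\rangle = \begin{pmatrix} dx & dy \end{pmatrix} \begin{pmatrix} f_{xx} & f_{xy} \\ f_{yx} & f_{yy} \end{pmatrix} \begin{pmatrix} dx \\ dy \end{pmatrix} \qquad\text{where}\qquad d\vec{r} = \hat{x}\, dx + \hat{y}\, dy \tag{8.31}$$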
This matrix is symmetric because of the properties of mixed partials. How do I tell from
this whether the function f has a minimum or a maximum (or neither) at a point where the
gradient of f is zero? Eq. (8.31) describes a function of two variables even after I’ve fixed the
values of x and y by saying that $\nabla f = 0$. It is a quadratic function of dx and dy. Expressed in
the language of vectors this says that f has a minimum if (8.31) is positive no matter what the
direction of $d\vec{r}$ is: H is positive definite.
Pull back from the problem a step. This is a 2 × 2 symmetric matrix sandwiched inside a
scalar product.
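$$h(x, y) = \begin{pmatrix} x & y \end{pmatrix} \begin{pmatrix} a & b \\ b & c \end{pmatrix} \begin{pmatrix} x \\ y \end{pmatrix} \tag{8.32}$$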
Is h positive definite? That is, positive for all x, y? If this matrix is diagonal it’s much easier to
see what is happening, so diagonalize it. Find the eigenvectors and use those for a basis.
$$\begin{pmatrix} a & b \\ b & c \end{pmatrix}\begin{pmatrix} x \\ y \end{pmatrix} = \lambda \begin{pmatrix} x \\ y \end{pmatrix} \qquad\text{requires}\qquad \det\begin{pmatrix} a - \lambda & b \\ b & c - \lambda \end{pmatrix} = 0$$
$$\lambda^2 - \lambda(a + c) + ac - b^2 = 0 \implies \lambda = \frac{(a + c) \pm \sqrt{(a - c)^2 + 4b^2}}{2} \tag{8.33}$$
For the applications here all the a, b, c are the real partial derivatives, so the eigenvalues are
real and the only question is whether the λs are positive or negative, because they will be the
(diagonal) components of the Hessian matrix in the new basis. If this is a double root, the matrix
was already diagonal. You can verify that the eigenvalues are positive if a > 0, c > 0, and
$ac > b^2$, and that will indicate a minimum point.
Geometrically the equation z = h(x, y) from Eq. (8.32) defines a surface. If it is positive
definite the surface is a paraboloid opening upward. If negative definite it is a paraboloid opening
down. The mixed case is a hyperboloid — a saddle.
In this 2 × 2 case you have a quadratic formula to fall back on, and with more variables
there are standard algorithms for determining eigenvalues of matrices, but I’ll leave those to some
other book.
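For the 2 × 2 case the test is short enough to write out. The following is a minimal sketch, not a general-purpose routine: the function names `eigenvalues_2x2` and `classify` are hypothetical, and the eigenvalues come straight from the quadratic formula applied to the characteristic equation of the symmetric matrix with rows (a, b) and (b, c). It is applied to the two examples from the start of this section.

```python
import math

def eigenvalues_2x2(a, b, c):
    # Eigenvalues of the symmetric matrix [[a, b], [b, c]],
    # from lambda^2 - (a + c) lambda + (a c - b^2) = 0.
    mean = (a + c) / 2
    half_gap = math.sqrt((a - c) ** 2 + 4 * b ** 2) / 2
    return mean - half_gap, mean + half_gap

def classify(a, b, c):
    # Classify a critical point from the entries of its Hessian.
    lo, hi = eigenvalues_2x2(a, b, c)
    if lo > 0:
        return "minimum"
    if hi < 0:
        return "maximum"
    if lo < 0 < hi:
        return "saddle"
    return "undetermined"  # a zero eigenvalue: more work to do

# x^2 + xy + y^2 has f_xx = 2, f_xy = 1, f_yy = 2
print(classify(2, 1, 2))
# x^2 + 3xy + y^2 has f_xx = 2, f_xy = 3, f_yy = 2
print(classify(2, 3, 2))
```

The first example has eigenvalues 1 and 3, the second has −1 and 5, in agreement with the "Yes and No" answers quoted earlier.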
8.12 Lagrange Multipliers
This is an incredibly clever method to handle problems of maxima and minima in several variables
when there are constraints.
An example: “What is the largest rectangle?” obviously has no solution, but “What is the
largest rectangle contained in an ellipse?” does.
Another: Particles are to be placed into states of specified energies. You know the total
number of particles; you know the total energy. All else being equal, what is the most probable
distribution of the number of particles in each state?
I’ll describe this procedure for two variables; it’s the same for more. The problem stated
is that I want to find the maximum (or minimum) of a function f (x, y) given the fact that the
coordinates x and y must lie on the curve φ(x, y) = 0. If you can solve the φ equation for y in
terms of x explicitly, then you can substitute it into f and turn it into a problem in ordinary one
variable calculus. What if you can’t?
Analyze this graphically. The equation φ(x, y) = 0 represents one curve in the plane. The
succession of equations f (x, y) = constant represent many curves in the plane, one for each
constant. Think of equipotentials.
[Figure: two sketches of the curve φ = 0 crossing the level curves f = 0, 1, 2, 3, 4, 5.]
Look at the intersections of the φ-curve and the f -curves. Where they intersect, they will
usually cross each other. Ask if such a crossing could possibly be a point where f is a maximum.
Clearly the answer is no, because as you move along the φ-curve you’re then moving from a point
where f has one value to where it has another.
The only way to have f be a maximum at a point on the φ-curve is for them to touch
and not cross. When that happens the values of f will increase as you approach the point from
one side and decrease on the other. That makes it a maximum. In this sketch, the values of f
decrease from 4 to 3 to 2 and then back to 3, 4, and 5. This point where the curve f = 2
touches the φ = 0 curve is then a minimum of f along φ = 0.
To implement this picture so that you can compute with it, look at the gradient of f
and the gradient of φ. The gradient vectors are perpendicular to the curves f = constant and
φ = constant respectively, and at the point where the curves are tangent to each other these
gradients are in the same direction (or opposite, no matter). Either way one vector is a scalar
times the other.
$$\nabla f = \lambda \nabla \phi \tag{8.34}$$
In the second picture, the arrows are the gradient vectors for f and for φ. Break this into
components and you have
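$$\frac{\partial f}{\partial x} - \lambda\frac{\partial \phi}{\partial x} = 0, \qquad \frac{\partial f}{\partial y} - \lambda\frac{\partial \phi}{\partial y} = 0, \qquad \phi(x, y) = 0$$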
There are three equations in three unknowns (x, y, λ), and these are the equations to solve for
the position of the maximum or minimum value of f . You’re looking for x and y, so you’ll be
tempted to ignore the third variable λ and to eliminate it. Look again. This parameter, the
Lagrange multiplier, has a habit of being significant.
Examples of Lagrange Multipliers
The first example that I mentioned: What is the largest rectangle that you can inscribe in an
ellipse? Let the ellipse and the rectangle be centered at the origin. If the upper right corner of the
rectangle is at (x, y), then the area of the rectangle is
$$\text{Area} = f(x, y) = 4xy, \qquad\text{with constraint}\qquad \phi(x, y) = \frac{x^2}{a^2} + \frac{y^2}{b^2} - 1 = 0$$
The equations to solve are now
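$$\nabla(f - \lambda\phi) = 0, \qquad\text{and}\qquad \phi = 0,$$
which become
$$4y - \lambda\frac{2x}{a^2} = 0, \qquad 4x - \lambda\frac{2y}{b^2} = 0, \qquad \frac{x^2}{a^2} + \frac{y^2}{b^2} - 1 = 0 \tag{8.35}$$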
The solutions to these three equations are straightforward. They are $x = a/\sqrt{2}$, $y = b/\sqrt{2}$,
λ = 2ab. The maximum area is then 4xy = 2ab. The Lagrange multiplier turns out to be the
required area. Does this reduce to the correct result for a circle?
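A numerical check of this solution takes only a few lines. This is a minimal sketch under stated assumptions: the semi-axes `a` and `b` are hypothetical values, and the brute-force search simply walks the corner of the rectangle along the first quadrant of the ellipse using the parametrization x = a cos t, y = b sin t, which is not part of the Lagrange-multiplier method itself.

```python
import math

a, b = 3.0, 2.0  # hypothetical semi-axes

# Solution quoted in the text.
x, y, lam = a / math.sqrt(2), b / math.sqrt(2), 2 * a * b

# Residuals of the three equations (8.35); all should be near zero.
residuals = (
    4 * y - lam * 2 * x / a**2,
    4 * x - lam * 2 * y / b**2,
    x**2 / a**2 + y**2 / b**2 - 1,
)

# Brute-force comparison: move the corner along the ellipse in the first quadrant.
steps = 100_000
best_area = max(
    4 * (a * math.cos(t)) * (b * math.sin(t))
    for t in (0.5 * math.pi * k / steps for k in range(steps + 1))
)
print(residuals, 4 * x * y, best_area)
```

Both the analytic area 4xy and the brute-force maximum come out equal to 2ab, which is also the value of λ.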
The second example said that you have several different allowed energies, typical of what
happens in quantum mechanics. If the total number of particles and the total energy are given,
how are the particles distributed among the different energies?
If there are N particles and exactly two energy levels, $E_1$ and $E_2$,
$$N = n_1 + n_2, \qquad\text{and}\qquad E = n_1 E_1 + n_2 E_2$$
you have two equations in two unknowns and all you have to do is solve them for the numbers
$n_1$ and $n_2$, the number of particles in each state. If there are three or more possible energies the
answer isn’t uniquely determined by just two equations, and there can be many ways that you
can put particles into different energy states and still have the same number of particles and the
same total energy.
If you’re dealing with four particles and three energies, you can perhaps count the possibilities
by hand. How many ways can you put four particles in three states? (400), (310), (301),
(220), (211), etc. There’s only one way to get the (400) configuration: All four particles go into
state 1. For (310) there are four ways to do it; any one of the four particles can be in the second
state and the rest in the first. Keep going. If you have $10^{20}$ particles you have to find a better
way.
If you have a total of N particles and you place $n_1$ of them in the first state, the number
of ways that you can do that is N for the first particle, (N − 1) for the second particle, etc.
$= N(N - 1)(N - 2) \cdots (N - n_1 + 1) = N!/(N - n_1)!$. This is over-counting because you don’t
care which one went into the first state first, only that it’s there. There are $n_1!$ rearrangements
of these $n_1$ particles, so you have to divide by that to get the number of ways that you can get
this number of particles into state 1: $N!/\bigl(n_1!\,(N - n_1)!\bigr)$. For example, N = 4, $n_1 = 4$ as in the
(400) configuration in the preceding paragraph is 4!/(0! 4!) = 1, or 4!/(3! 1!) = 4 as in the (310)
configuration.
Once you’ve got $n_1$ particles into the first state you want to put $n_2$ into the second state
(out of the remaining $N - n_1$). Then on to state 3.
The total number of ways that you can do this is the product of all of these numbers. For
three allowed energies it is
$$\frac{N!}{n_1!\,(N - n_1)!} \cdot \frac{(N - n_1)!}{n_2!\,(N - n_1 - n_2)!} \cdot \frac{(N - n_1 - n_2)!}{n_3!\,(N - n_1 - n_2 - n_3)!} = \frac{N!}{n_1!\,n_2!\,n_3!} \tag{8.36}$$
There’s a lot of cancellation and the final factor in the denominator is one because of the constraint
$n_1 + n_2 + n_3 = N$.
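For four particles and three states the counting formula can be checked against brute force. The sketch below is only an illustration: the function names `ways` and `brute_force` are hypothetical, and the brute-force count treats the particles as distinguishable labels and tries every assignment of particles to states, which is feasible only for very small numbers.

```python
from itertools import product
from math import factorial

def ways(occupations):
    # N! / (n1! n2! n3! ...), Eq. (8.36), for a tuple of occupation numbers.
    total = factorial(sum(occupations))
    for n in occupations:
        total //= factorial(n)
    return total

def brute_force(occupations):
    # Assign each labeled particle to a state and count the assignments
    # whose occupation numbers match the requested configuration.
    n_particles, n_states = sum(occupations), len(occupations)
    target = tuple(occupations)
    count = 0
    for assignment in product(range(n_states), repeat=n_particles):
        tally = tuple(assignment.count(s) for s in range(n_states))
        if tally == target:
            count += 1
    return count

for config in [(4, 0, 0), (3, 1, 0), (2, 2, 0), (2, 1, 1)]:
    print(config, ways(config), brute_force(config))
```

The (400) and (310) configurations give 1 and 4, matching the counts done by hand above.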
Lacking any other information about the particles, the most probable configuration is the
one for which Eq. (8.36) is a maximum. This calls for Lagrange multipliers because you want
to maximize a complicated function of several variables subject to constraints on N and on E.
Now all you have to do is to figure out how to differentiate with respect to integers. Answer: If
N is large you will be able to treat these variables as continuous and to use standard calculus to
manipulate them.
For large n, recall Stirling’s formula, Eq. (2.20),
$$n! \sim \sqrt{2\pi n}\; n^n e^{-n} \qquad\text{or its log:}\qquad \ln(n!) \sim \ln\sqrt{2\pi n} + n \ln n - n \tag{8.37}$$
This, I can differentiate. Maximizing (8.36) is the same as maximizing its logarithm, and that’s
easier to work with.
$$\text{maximize } f = \ln(N!) - \ln(n_1!) - \ln(n_2!) - \ln(n_3!)$$
$$\text{subject to } n_1 + n_2 + n_3 = N \qquad\text{and}\qquad n_1 E_1 + n_2 E_2 + n_3 E_3 = E$$
There are two constraints here, so there are two Lagrange multipliers.
$$\nabla\bigl( f - \lambda_1(n_1 + n_2 + n_3 - N) - \lambda_2(n_1 E_1 + n_2 E_2 + n_3 E_3 - E) \bigr) = 0$$
For f, use Stirling’s approximation, but not quite. The term $\ln\sqrt{2\pi n}$ is negligible. For n as
small as $10^6$, it is about $6 \times 10^{-7}$ of the whole. Logarithms are much smaller than powers. That
means that I can use
$$\nabla \sum_{\ell=1}^{3} \Bigl[ -n_\ell \ln(n_\ell) + n_\ell - \lambda_1 n_\ell - \lambda_2 n_\ell E_\ell \Bigr] = 0$$
This is easier than it looks because each derivative involves only one coordinate.
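$$\frac{\partial}{\partial n_1} \to -\ln n_1 - 1 + 1 - \lambda_1 - \lambda_2 E_1 = 0, \text{ etc.}$$
This is
$$n_\ell = e^{-\lambda_1 - \lambda_2 E_\ell}, \qquad \ell = 1, 2, 3$$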
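There are two unknowns here, $\lambda_1$ and $\lambda_2$. There are two equations, for N and E, and the
parameter $\lambda_1$ simply determines an overall constant, $e^{-\lambda_1} = C$.
$$C \sum_{\ell=1}^{3} e^{-\lambda_2 E_\ell} = N, \qquad\text{and}\qquad C \sum_{\ell=1}^{3} E_\ell\, e^{-\lambda_2 E_\ell} = E$$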
The quantity $\lambda_2$ is usually denoted β in this type of problem, and it is related to temperature
by β = 1/kT, where as usual the Lagrange multiplier is important on its own. It is usual to
manipulate these results by defining the “partition function”
$$Z(\beta) = \sum_{\ell=1}^{3} e^{-\beta E_\ell} \tag{8.38}$$
In terms of this function Z you have
$$C = N/Z, \qquad\text{and}\qquad E = -\frac{N}{Z}\frac{dZ}{d\beta} \tag{8.39}$$
For a lot more on this subject, you can refer to any one of many books on thermodynamics or
statistical physics. There for example you can find the reason that β is related to the temperature
and how the partition function can form the basis for computing everything there is to compute
in thermodynamics. Especially there you will find that more powerful versions of the same ideas
will arise when you allow the total energy and the total number of particles to be variables too.
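To make the three-level result concrete, here is a minimal numerical sketch. Everything specific in it is a hypothetical choice for illustration: the energy levels, the particle number, the total energy, and the use of bisection to find β. It solves E = −(N/Z) dZ/dβ for β, then builds the occupation numbers from C = N/Z and checks both constraints.

```python
import math

energies = [0.0, 1.0, 2.0]   # hypothetical levels E_1, E_2, E_3
N_total = 1000.0             # hypothetical particle number
E_total = 600.0              # hypothetical total energy

def Z(beta):
    # Partition function, Eq. (8.38).
    return sum(math.exp(-beta * e) for e in energies)

def total_energy(beta):
    # E = -(N/Z) dZ/dbeta, Eq. (8.39), with dZ/dbeta written out.
    dZ = sum(-e * math.exp(-beta * e) for e in energies)
    return -N_total * dZ / Z(beta)

# The total energy decreases as beta increases, so bisection finds beta.
lo, hi = -50.0, 50.0
for _ in range(200):
    mid = 0.5 * (lo + hi)
    if total_energy(mid) > E_total:
        lo = mid
    else:
        hi = mid
beta = 0.5 * (lo + hi)

C = N_total / Z(beta)
occupations = [C * math.exp(-beta * e) for e in energies]
print(beta, occupations)
```

With these numbers β comes out positive, so the occupation numbers decrease with increasing energy. Asking for a total energy above N times the average of the three levels would force β negative.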
8.13 Solid Angle
The extension of the concept of angle to three dimensions is called “solid angle.” To explain what
this is, I’ll first show a definition of ordinary angle that’s different from what you’re accustomed
to. When you see that, the extension to one more dimension is easy.
Place an object in the plane somewhere not at the origin. You are at the origin and look
at it. I want a definition that describes what fraction of the region around you is spanned by
this object. For this, draw a circle of radius R centered at the origin and draw all the lines from
everywhere on the object to the origin. These lines will intersect the circle on an arc (or even a
set of arcs) of length s. Define the angle subtended by the object to be θ = s/R.
[Figure: an arc of length s on a circle of radius R, and an area A on a sphere of radius R.]
Now step up to three dimensions and again place yourself at the origin. This time place
a sphere of radius R around the origin and draw all the lines from the three dimensional object
to the origin. This time the lines intersect the sphere on an area of size A. Define the solid
angle subtended by the object to be $\Omega = A/R^2$. (If you want four or more dimensions, see
problem 8.52.)
For the circle, the circumference is 2πR, so if you’re surrounded, the angle subtended is
2πR/R = 2π radians. For the sphere, the area is $4\pi R^2$, so this time if you’re surrounded, the
solid angle subtended is $4\pi R^2/R^2 = 4\pi$ steradians. That is the name for this unit.
All very pretty. Is it useful? Only if you want to describe radiative transfer, nuclear
scattering, illumination, the structure of the atom, or rainbows. Except for illumination, these
subjects center around one idea, that of a “cross section.”
Cross Section, Absorption
Before showing how to use solid angle to describe scattering, I’ll take a simpler example: absorption.
There is a hole in a wall and I propose to measure its area. Instead of taking a ruler
to it I blindly fire bullets at the wall and see how many go in. The bigger the area, the larger
the fraction that will go into the hole of course, but I have to make this quantitative to make it
useful.
Define the flux of bullets: f = dN/(dt dA). That is, suppose that I’m firing all the bullets
in the same direction, but not starting from the same place. Pick an area ∆A perpendicular to
the stream of bullets and pick a time interval ∆t. How many bullets pass through this area in
this time? ∆N , and that’s proportional to both ∆A and ∆t. The limit of this quotient is the
flux.
$$\lim_{\substack{\Delta t \to 0 \\ \Delta A \to 0}} \frac{\Delta N}{\Delta t\, \Delta A} = f \tag{8.40}$$
Having defined the flux as a kind of density, call the (unknown) area of the hole σ. The rate at
which these bullets enter the hole is proportional to the size of the hole and to the flux of bullets,
R = f σ, where R is the rate of entry and σ is the area of the hole. If I can measure the rate
of absorption R and the flux f , I have measured the area of the hole, σ = R/f . This letter is
commonly used for cross sections.
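The bullet experiment is easy to imitate with random numbers. The following is a minimal sketch under stated assumptions: the wall size, the circular shape and radius of the hole, the number of bullets, the firing time, and the fixed random seed are all hypothetical choices. It computes the flux f, the absorption rate R, and then the measured area σ = R/f, and compares that with the geometric area of the hole.

```python
import math
import random

random.seed(0)  # fixed seed so that the sketch is repeatable

wall_side = 1.0          # hypothetical square wall, 1 m on a side
hole_radius = 0.3        # hypothetical circular hole at the center of the wall
n_bullets = 200_000
duration = 100.0         # seconds of firing (arbitrary)

# Flux f = dN / (dt dA), uniform over the wall.
flux = n_bullets / (duration * wall_side**2)

hits = 0
for _ in range(n_bullets):
    x = random.uniform(0.0, wall_side)
    y = random.uniform(0.0, wall_side)
    if (x - 0.5 * wall_side) ** 2 + (y - 0.5 * wall_side) ** 2 < hole_radius**2:
        hits += 1

rate = hits / duration          # R, bullets absorbed per second
sigma = rate / flux             # measured area, sigma = R / f
true_area = math.pi * hole_radius**2
print(sigma, true_area)
```

The firing time cancels out of σ, as it should. The agreement with the true area improves only as the square root of the number of bullets.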
Why go to this complicated trouble for a hole? I probably shouldn’t, but to measure
absorption of neutrons hitting nuclei this is precisely what you do. I can’t use a ruler on a
nucleus, but I can throw things at it. In this example, neutron absorption by nuclei, the value
of the measured absorption cross section can vary from millibarns to kilobarns, where a barn is
$10^{-24}\ \text{cm}^2$. The radii of nuclei vary by a factor of only about six from hydrogen through uranium
($\sqrt[3]{238} = 6.2$), so the cross section measured this way has little to do with the geometric area
$\pi r^2$. It is instead a measure of interaction strength.
Cross Section, Scattering
There are many types of cross sections besides absorption, and the next simplest is the scattering
cross section, especially the differential scattering cross section.
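[Figure: particles arriving with impact parameter between b and b + db, in the ring of area dσ = 2πb db, scatter through the angle θ into the solid angle dΩ.]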
The same flux of particles that you throw at an object may not be absorbed, but may
scatter instead. You detect the scattering by using a detector. (You were expecting a catcher’s
mitt?) The detector will have an area ∆A facing the particles and be at a distance r from the
center of scattering. The detection rate will be proportional to the area of the detector, but if
I double r for the same ∆A, the detection rate will go down by a factor of four. The detection
rate is proportional to $\Delta A/r^2$, but this is just the solid angle of the detector from the center:
$$\Delta\Omega = \Delta A/r^2 \tag{8.41}$$
The detection rate is proportional to the incoming flux and to the solid angle of the detector.
The proportionality is an effective scattering area, ∆σ.
$$\Delta R = f\, \Delta\sigma, \qquad\text{so}\qquad \frac{d\sigma}{d\Omega} = \frac{dR}{f\, d\Omega} \tag{8.42}$$
This is the differential scattering cross section.
You can compute this if you know something about the interactions involved. The one
thing that you need is the relationship between where the particle comes in and the direction in
which it leaves. That is, the incoming particle is aimed to hit at a distance b (called the impact
parameter) from the center and it scatters at an angle θ, called of course the scattering angle,
from its original direction. Particles that come in at distance between b and b + db from the axis
through the center will scatter into directions between θ and θ + dθ.
The cross section for being sent in a direction between these two angles is the area of
the ring: dσ = 2πb db. Anything that hits in there will scatter into the outgoing angles shown.
How much solid angle is this? Put the z-axis of spherical coordinates to the right, so that θ is
the usual spherical coordinate angle from z. The element of area on the surface of a sphere is
$dA = r^2 \sin\theta\, d\theta\, d\phi$, so the integral over all the azimuthal angles φ around the ring just gives a
factor 2π. The element of solid angle is then
$$d\Omega = \frac{dA}{r^2} = 2\pi \sin\theta\, d\theta$$
As a check on this, do the integral over all theta to get the total solid angle around a point,
verifying that it is 4π.
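The check is a one-line integral by hand, and it can also be done numerically. This is a minimal sketch; the midpoint rule and the number of steps are arbitrary choices, and the function name `total_solid_angle` is hypothetical.

```python
import math

def total_solid_angle(n_steps=100_000):
    # Midpoint rule for the integral of 2*pi*sin(theta) d(theta) from 0 to pi.
    d_theta = math.pi / n_steps
    return sum(
        2 * math.pi * math.sin((k + 0.5) * d_theta) * d_theta
        for k in range(n_steps)
    )

omega = total_solid_angle()
print(omega, 4 * math.pi)
```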
Divide the effective area for this scattering by the solid angle, and the result is the differential
scattering cross section.
$$\frac{d\sigma}{d\Omega} = \frac{2\pi b\, db}{2\pi \sin\theta\, d\theta} = \frac{b}{\sin\theta}\frac{db}{d\theta}$$
If you have θ as a function of b, you can compute this. There are a couple of very minor
modifications that you need in order to complete this development. The first is that the derivative
db/dθ can easily be negative, but both the area and the solid angle are positive. That means
that you need an absolute value here. One other complication is that one value of θ can come
from several values of b. It may sound unlikely, but it happens routinely. It even happens in the
example that comes up in the next section.
$$\frac{d\sigma}{d\Omega} = \sum_i \frac{b_i}{\sin\theta}\left|\frac{db_i}{d\theta}\right| \tag{8.43}$$
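As a simple illustration of Eq. (8.43), take a case that is not worked out in this text: specular reflection from a hard sphere of radius R, for which the relation between impact parameter and scattering angle is assumed to be b = R cos(θ/2), with a single branch. The sketch below differentiates that relation numerically; the function names, the radius, the step size `h`, and the sample angles are hypothetical choices.

```python
import math

R = 1.0  # hypothetical sphere radius

def b_of_theta(theta):
    # Assumed hard-sphere relation between impact parameter and scattering angle.
    return R * math.cos(0.5 * theta)

def dsigma_domega(theta, h=1e-6):
    # One-branch version of Eq. (8.43): (b / sin theta) * |db/dtheta|,
    # with db/dtheta from a central difference.
    db_dtheta = (b_of_theta(theta + h) - b_of_theta(theta - h)) / (2 * h)
    return b_of_theta(theta) / math.sin(theta) * abs(db_dtheta)

values = [dsigma_domega(math.radians(deg)) for deg in (20, 60, 100, 140)]
print(values)
```

Under this assumption the result is the same at every angle, $R^2/4$, and multiplying by the total solid angle 4π gives the geometric area $\pi R^2$. Here db/dθ is negative, so the absolute value is doing real work.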
The differential cross section often becomes much more involved than this, especially
when it involves nuclei breaking up in a collision, resulting in a range of possible energies of each
part of the debris. In such collisions particles can even be created, and the probabilities and energy
ranges of the results are described by their own differential cross sections. You will wind up with
differential cross sections that look like $d\sigma/d\Omega_1\, d\Omega_2 \ldots dE_1\, dE_2 \ldots$. These rapidly become so
complex that it takes some elaborate computer programming to handle the information.
8.14 Rainbow
An interesting, if slightly complicated example is the rainbow. Sunlight scatters from small drops
of water in the air and the detector is your eye. The water drops are small enough that I’ll assume
them to be spheres, where surface tension is enough to hold them in this shape for the ordinary
small sizes of water droplets in the air. The first and simplest model uses geometric optics and
Snell’s law to figure out where the scattered light goes. This model ignores the wave nature of
light and it does not take into account the fraction of the light that is transmitted and reflected
at each surface.
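[Figure: a light ray entering a spherical drop at distance b from the axis; the angles α and β are marked where the ray meets the surface, and θ is the scattering angle.]
$$\sin\beta = n \sin\alpha, \qquad \theta = (\beta - \alpha) + (\pi - 2\alpha) + (\beta - \alpha), \qquad b = R \sin\beta \tag{8.44}$$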
The light comes in at the indicated distance b from the axis through the center of the
sphere. It is then refracted, reflected, and refracted. Snell’s law describes the first and third of
these, and the middle one has equal angles of incidence and reflection. The dashed lines are from
the center of the sphere. The three terms in Eq. (8.44) for the evaluation of θ come from the
three places at which the light changes direction, and they are the amount of deflection at each
place. The third equation simply relates b to the radius of the sphere.
From these three equations, eliminate the two variables α and β to get the single relation
between b and θ that I’m looking for. When you do this, you find that the resulting equations are
a bit awkward. It’s sometimes easier to use one of the two intermediate angles as a parameter,
and in this case you will want to use β. From the picture you know that it varies from zero to
π/2. The third equation gives b in terms of β. The first equation gives α in terms of β. The
second equation determines θ in terms of β and the α that you’ve just found.
The parametrized relation between b and θ is then
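$$b = R \sin\beta, \qquad \theta = \pi + 2\beta - 4\sin^{-1}\!\left(\frac{1}{n}\sin\beta\right), \qquad (0 < \beta < \pi/2) \tag{8.45}$$
or you can carry it through and eliminate β.
$$\theta = \pi + 2\sin^{-1}\!\left(\frac{b}{R}\right) - 4\sin^{-1}\!\left(\frac{1}{n}\frac{b}{R}\right) \tag{8.46}$$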
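The derivative $db/d\theta = 1\big/[d\theta/db]$. Compute this.
$$\frac{d\theta}{db} = \frac{2}{\sqrt{R^2 - b^2}} - \frac{4}{\sqrt{n^2 R^2 - b^2}} \tag{8.47}$$
In the parametrized form this is
$$\frac{db}{d\theta} = \frac{db/d\beta}{d\theta/d\beta} = \frac{R\cos\beta}{2 - 4\cos\beta\big/\sqrt{n^2 - \sin^2\beta}}$$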
In analyzing this, it’s convenient to have both forms, as you never know which one will be easier
to interpret. (Have you checked to see if they agree with each other in any special cases?)
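One way to do that check is numerically. The sketch below is a minimal illustration: the function names are hypothetical, and R = 1, n = 1.33, and the three sample values of β are arbitrary choices. It evaluates db/dθ once as the reciprocal of Eq. (8.47) and once from the parametrized form, using b = R sin β to connect the two.

```python
import math

def db_dtheta_from_847(b, n, R=1.0):
    # Reciprocal of Eq. (8.47).
    dtheta_db = 2 / math.sqrt(R * R - b * b) - 4 / math.sqrt(n * n * R * R - b * b)
    return 1 / dtheta_db

def db_dtheta_param(beta, n, R=1.0):
    # The parametrized form, with beta as the parameter.
    return R * math.cos(beta) / (
        2 - 4 * math.cos(beta) / math.sqrt(n * n - math.sin(beta) ** 2)
    )

n = 1.33
pairs = []
for beta in (0.2, 0.7, 1.3):
    b = math.sin(beta)  # b = R sin(beta) with R = 1
    pairs.append((db_dtheta_from_847(b, n), db_dtheta_param(beta, n)))
print(pairs)
```

The two columns agree, and the sign of db/dθ changes between the second and third sample points, which is a hint that something interesting happens in between.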
[Figure: left, b (from 0 to R) versus θ (from 0 to 180 degrees) for n = 1 to 1.5, left to right; right, dσ/dΩ versus θ over the same range.]
These graphs are generated from Eq. (8.45) for eleven values of the index of refraction
equally spaced from 1 to 1.5, and the darker curve corresponds to n = 1.3. The key factor that
enters the cross-section calculation, Eq. (8.43), is db/dθ, because it goes to infinity when the
curve has a vertical tangent. For water, with n = 1.33, the b-θ curve has a vertical slope that
occurs for θ a little less than 140°. That is the rainbow.
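The location of the vertical tangent can be computed directly. Setting dθ/db = 0 in Eq. (8.47) gives $n^2R^2 - b^2 = 4(R^2 - b^2)$, or $b_0 = R\sqrt{(4 - n^2)/3}$. The following minimal sketch evaluates this and the corresponding angle from Eq. (8.45); the function names are hypothetical and R = 1 is an arbitrary choice.

```python
import math

def theta_of_beta(beta, n):
    # Eq. (8.45): scattering angle as a function of the parameter beta.
    return math.pi + 2 * beta - 4 * math.asin(math.sin(beta) / n)

def rainbow(n, R=1.0):
    # d(theta)/db = 0 in Eq. (8.47) gives b0^2 = R^2 (4 - n^2) / 3.
    b0 = R * math.sqrt((4 - n * n) / 3)
    theta0 = theta_of_beta(math.asin(b0 / R), n)
    return b0, theta0

b0, theta0 = rainbow(1.33)
print(b0, math.degrees(theta0), 180 - math.degrees(theta0))
```

For n = 1.33 this puts the rainbow at a scattering angle between 137° and 138°, with $b_0$ about 0.86 R.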
To complete this I should finish with dσ/dΩ. The interesting part of the problem is near
the vertical part of the curve. To see what happens near such a point use a power series expansion
near there. Not b(θ) but θ(b). This has zero derivative here, so near the vertical point
$$\theta(b) = \theta_0 + \gamma(b - b_0)^2$$
At $(b_0, \theta_0)$, Eq. (8.47) gives zero and Eq. (8.46) tells you $\theta_0$. The coefficient γ comes from the
second derivative of Eq. (8.46) at $b_0$. What is the differential scattering cross section in this
neighborhood?
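$$b = b_0 \pm \sqrt{(\theta - \theta_0)/\gamma}, \qquad\text{so}\qquad \frac{db}{d\theta} = \pm\frac{1}{2\sqrt{\gamma(\theta - \theta_0)}}$$
$$\begin{aligned}
\frac{d\sigma}{d\Omega} &= \sum_i \frac{b_i}{\sin\theta}\left|\frac{db_i}{d\theta}\right| \\
&= \frac{b_0 + \sqrt{(\theta - \theta_0)/\gamma}}{\sin\theta}\,\frac{1}{2\sqrt{\gamma(\theta - \theta_0)}} + \frac{b_0 - \sqrt{(\theta - \theta_0)/\gamma}}{\sin\theta}\,\frac{1}{2\sqrt{\gamma(\theta - \theta_0)}} \\
&= \frac{b_0}{\sin\theta\,\sqrt{\gamma(\theta - \theta_0)}} \approx \frac{b_0}{\sin\theta_0\,\sqrt{\gamma(\theta - \theta_0)}}
\end{aligned} \tag{8.48}$$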
In the final expression, because this is near $\theta = \theta_0$ and because I’m doing a power series expansion
of the exact solution anyway, I dropped all the θ-dependence except the dominant factors. This
is the only consistent thing to do because I’ve previously dropped higher order terms in the
expansion of θ(b).
Why is this a rainbow? (1) With the sun at your back you see a bright arc of a circle in the
direction for which the scattering cross-section is very large. The angular radius of this circle is
$\pi - \theta_0 \approx 42^\circ$. (2) The value of $\theta_0$ depends on the index of refraction, n, and that varies slightly
with wavelength. The variation of this angle of peak intensity is
$$\frac{d\theta_0}{d\lambda} = \frac{d\theta_0}{db_0}\frac{db_0}{dn}\frac{dn}{d\lambda} \tag{8.49}$$
When you graph Eq. (8.48) note carefully that it is zero on the left of $\theta_0$ (smaller θ)
and large on the right. Large scattering angles correspond to the region of the sky underneath
the rainbow, toward the center of the circular arc. This implies that there is much more light
scattered toward your eye underneath the arc of the rainbow than there is above it. Look at your
next rainbow and compare the area of sky below and above the rainbow.
There’s a final point about this calculation. I didn’t take into account the fact that when
light hits a surface, some is transmitted and some is reflected. The largest effect is at the point
of internal reflection, because typically only about two percent of the light is reflected and the
rest goes through. The cross section should be multiplied by this factor to be complete. The
detailed equations for this are called the Fresnel formulas and they tell you the fraction of the
light transmitted and reflected at a surface as a function of angle and polarization.
This is far from the whole story about rainbows. Light is a wave, and the geometric optics
approximation that I’ve used doesn’t account for everything. In fact Eq. (8.43) doesn’t apply
to waves, so the whole development has to be redone. To get an idea of some of the other
phenomena associated with the rainbow, see for example
www.usna.edu/Users/oceano/raylee/RainbowBridge/Chapter 8.html
www.philiplaven.com/links.html
8.15 3D Visualization
Wrapping your mind around three dimensional objects is a practiced skill, one that takes time to
master. For an interesting way to enhance this ability, I recommend the Java Applet
www.ausserfern.at/pbeck/blockout/
Exercises
1 For the functions $f(x, y) = Axy^2 \sin(xy)$, $x(t) = Ct^3$, $y(t) = Dt^2$, compute df/dt two
ways. First use the chain rule, then do explicit substitution and compute it directly.
2 Compute $(\partial f/\partial x)_y$ and $(\partial f/\partial y)_x$ for
(a) $f(x, y) = x^2 - 2xy + y^2$,
(b) f(x, y) = ln(y/x),
(c) f(x, y) = (y + x)/(y − x)
3 Compute df/dx using the chain rule for
(a) f(x, y) = ln(y/x), $y = x^2$,
(b) f(x, y) = (y + x)/(y − x), y = αx,
(c) f(x, y) = sin(xy), y = 1/x
Also calculate the results by substituting y explicitly and then differentiating, comparing the
results.
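4 Let $f(x, y) = x^2 - 2xy$, and the polar coordinates are x = r cos φ, y = r sin φ. Compute
$$\left(\frac{\partial f}{\partial x}\right)_y, \quad \left(\frac{\partial f}{\partial y}\right)_x, \quad \left(\frac{\partial f}{\partial x}\right)_r, \quad \left(\frac{\partial f}{\partial y}\right)_r, \quad \left(\frac{\partial f}{\partial x}\right)_\phi, \quad \left(\frac{\partial f}{\partial y}\right)_\phi$$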
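5 Let $f(x, y) = x^2 - 2xy$, and the polar coordinates are x = r cos φ, y = r sin φ. Compute
$$\left(\frac{\partial f}{\partial r}\right)_\phi, \quad \left(\frac{\partial f}{\partial \phi}\right)_r, \quad \left(\frac{\partial f}{\partial r}\right)_x, \quad \left(\frac{\partial f}{\partial \phi}\right)_x, \quad \left(\frac{\partial f}{\partial r}\right)_y, \quad \left(\frac{\partial f}{\partial \phi}\right)_y$$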
6 For the function $f(u, v) = u^3 - v^3$, what is the value at (u, v) = (2, 1)? Approximately what
is its value at (u, v) = (2.01, 1.01)? Approximately what is its value at (u, v) = (2.01, 0.99)?
7 Assume the Earth’s atmosphere has uniform density and is 10 km high. What is its volume? What
is the ratio of this volume to the Earth’s?
8 For a cube 1 m on a side, what volume of paint will you need in order to paint it to a thickness
of 0.2 mm? Don’t forget to paint all the sides.
9 What is grad r2? Do it in both rectangular and polar coordinates. Two dimensions will do.
Are your results really the same?
10 What is grad $(\alpha x^2 + \beta y^2)$? Do this in both rectangular and polar coordinates. For the polar
form, put x and y in terms of r and φ, then refer to Eq. (8.27) for the polar form of the gradient.
Finally, compare the two results.
Problems
8.1 Let $r = \sqrt{x^2 + y^2}$, $x = A\sin\omega t$, $y = B\cos\omega t$. Use the chain rule to compute the
derivative with respect to $t$ of $e^{kr}$. Notice the various checks you can do on the result, verifying
(or disproving) your result.
8.2 Sketch these functions* in plane polar coordinates:
(a) r = a cos φ
(b) r = a sec φ
(c) r = aφ
(d) r = a/φ
(e) $r^2 = a^2 \sin 2\phi$
8.3 The two coordinates x and y are related by f (x, y) = 0. What is the derivative of y with
respect to x under these conditions? [What is df along this curve? And have you drawn a
sketch?] Make up a test function (with enough structure to be a test but still simple enough to
verify your answer independently) and see if your answer is correct. Ans: $-(\partial f/\partial x)\big/(\partial f/\partial y)$
8.4 If $x = u + v$ and $y = u - v$, show that
$$\left(\frac{\partial y}{\partial x}\right)_u = -\left(\frac{\partial y}{\partial x}\right)_v$$
Do this by application of the chain rule, Eq. (8.6). Then as a check do the calculation by explicit
elimination of the respective variables v and u.
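8.5 If $x = r\cos\phi$ and $y = r\sin\phi$, compute
$$\left(\frac{\partial x}{\partial r}\right)_\phi \quad\text{and}\quad \left(\frac{\partial x}{\partial r}\right)_y$$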
8.6 What is the differential of $f(x, y, z) = \ln(xyz)$?
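8.7 If $f(x, y) = x^3 + y^3$ and you switch to plane polar coordinates, use the chain rule to evaluate
$$\left(\frac{\partial f}{\partial r}\right)_\phi,\quad \left(\frac{\partial f}{\partial \phi}\right)_r,\quad \left(\frac{\partial^2 f}{\partial r^2}\right)_\phi,\quad \left(\frac{\partial^2 f}{\partial \phi^2}\right)_r,\quad \frac{\partial^2 f}{\partial r\,\partial\phi}$$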
Check one or more of these by substituting r and φ explicitly and doing the derivatives.
8.8 When current $I$ flows through a resistance $R$ the heat produced is $I^2R$. Two terminals are
connected in parallel by two resistors having resistance $R_1$ and $R_2$. Given that the total current
is divided as $I = I_1 + I_2$, show that the condition that the total heat generated is a minimum
leads to the relation $I_1R_1 = I_2R_2$. You don’t need Lagrange multipliers to solve this problem,
but try them anyway.
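If you want a numerical sanity check on whatever you derive, the following is a minimal sketch. The resistances and total current are made-up numbers, not from the text. It eliminates $I_2$ with the constraint, scans $I_1$ by brute force, and compares with the stationarity conditions $2I_1R_1 = \lambda = 2I_2R_2$ that come from the function $I_1^2R_1 + I_2^2R_2 - \lambda(I_1 + I_2 - I)$.

```python
# Hypothetical values for illustration only: ohms and amperes.
R1, R2, I_total = 3.0, 5.0, 2.0

def heat(I1):
    """Total heating rate with the constraint I2 = I_total - I1 already imposed."""
    I2 = I_total - I1
    return I1**2 * R1 + I2**2 * R2

# Brute-force scan of the one remaining free variable.
N = 200_000
candidates = [I_total * k / N for k in range(N + 1)]
I1_best = min(candidates, key=heat)
I2_best = I_total - I1_best

# Lagrange conditions 2*I1*R1 = lam = 2*I2*R2 together with I1 + I2 = I_total.
I1_lagrange = I_total * R2 / (R1 + R2)
lam = 2 * I1_lagrange * R1

print(I1_best * R1, I2_best * R2)   # the two products should agree
print(I1_best, I1_lagrange, lam)
```

The scan knows nothing about multipliers, so agreement between `I1_best` and `I1_lagrange` is an independent check on the algebra, not a circular one.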
8.9 Sketch the magnetic field represented by Eq. (8.24). I suggest that you start by fixing $r$
and drawing the $\vec{B}$-vectors at various values of $\theta$. It will probably help your sketch if you first
compute the magnitude of $\vec{B}$ to see how it varies around the circle. Recall, this field is expressed
in spherical coordinates, though you can take advantage of its symmetry about the $z$-axis to
make the drawing simpler. Don’t stop with just the field at fixed $r$ as I suggested you begin. The
field fills space, so try to describe it.
* See www-groups.dcs.st-and.ac.uk/~history/Curves/Curves.html for more.
8.10 A drumhead can vibrate in more complex modes. One such mode that vibrates at a
frequency higher than that of Eq. (8.19) looks approximately like
$$z(r, \phi, t) = A\,r\left(1 - r^2/R^2\right)\sin\phi\,\cos\omega_2 t$$
(a) Find the total kinetic energy of this oscillating drumhead.
(b) Sketch the shape of the drumhead at t = 0. Compare it to the shape of Eq. (8.19).
At the instant that the total kinetic energy is a maximum, what is the shape of the drumhead?
Ans: $\frac{\pi}{48}\,\sigma A^2\omega_2^2 R^4 \sin^2\omega_2 t$
8.11 Just as there is kinetic energy in a vibrating drumhead, there is potential energy, and as
the drumhead moves its total potential energy will change because of the slight stretching of the
material. The potential energy density ($d\,\mathrm{P.E.}/dA$) in a drumhead is
$$u_p = \frac{1}{2}\,T\,(\nabla z)^2$$
T is the tension in the drumhead. It has units of Newtons/meter and it is the force per length
you would need if you cut a small slit in the surface and had to hold the two sides of the slit
together. This potential energy arises from the slight stretching of the drumhead as it moves
away from the plane of equilibrium.
(a) For the motion described by Eq. (8.19) compute the total potential energy. (Naturally, you
will have checked the dimensions first to see if the claimed expression for $u_p$ is sensible.)
(b) Energy is conserved, so the sum of the total potential energy and the total kinetic energy
from Eq. (8.20) must be a constant. What must the frequency $\omega$ be for this to hold? Is this a
plausible result? A more accurate result, from solving a differential equation, is $2.405\sqrt{T/\sigma R^2}$.
Ans: $\sqrt{6T/\sigma R^2} = 2.45\sqrt{T/\sigma R^2}$
8.12 Repeat the preceding problem for the drumhead mode of problem 8.10. The exact result,
calculated in terms of roots of Bessel functions, is $3.832\sqrt{T/\sigma R^2}$.
Ans: $4\sqrt{T/\sigma R^2}$
8.13 Sketch the gravitational field of the Earth from Eq. (8.23). Is the direction of the field
plausible? Draw lots of arrows.
8.14 Prove that the unit vectors in polar coordinates are related to those in rectangular coordinates by
$$\hat{r} = \hat{x}\cos\phi + \hat{y}\sin\phi, \qquad \hat{\phi} = -\hat{x}\sin\phi + \hat{y}\cos\phi$$
What are $\hat{x}$ and $\hat{y}$ in terms of $\hat{r}$ and $\hat{\phi}$?
8.15 Prove that the unit vectors in spherical coordinates are related to those in rectangular
coordinates by
$$\hat{r} = \hat{x}\sin\theta\cos\phi + \hat{y}\sin\theta\sin\phi + \hat{z}\cos\theta$$
$$\hat{\theta} = \hat{x}\cos\theta\cos\phi + \hat{y}\cos\theta\sin\phi - \hat{z}\sin\theta$$
$$\hat{\phi} = -\hat{x}\sin\phi + \hat{y}\cos\phi$$
8.16 Compute the volume of a sphere using spherical coordinates. Also do it using rectangular
coordinates. Also do it in cylindrical coordinates.
8.17 Finish both integrals Eq. (8.21). Draw sketches to demonstrate that the limits stated there
are correct.
8.18 Find the volume under the plane 2x + 2y + z = 8a and over the triangle bounded by the
lines $x = 0$, $y = 2a$, and $x = y$ in the $x$-$y$ plane. Ans: $8a^3$
8.19 Find the volume enclosed by the doughnut-shaped surface (spherical coordinates) $r = a\sin\theta$. Ans: $\pi^2a^3/4$
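A quick numerical check of this answer can be sketched as follows, assuming the standard spherical volume element $r^2 \sin\theta\, dr\, d\theta\, d\phi$. The radial integral from 0 to $a\sin\theta$ and the $\phi$ integral are done by hand, leaving a single $\theta$ integral for the midpoint rule. The function name and the number of steps are arbitrary choices of mine.

```python
import math

def doughnut_volume(a, n=2000):
    """Midpoint-rule estimate of V = 2*pi * integral_0^pi (r_max**3 / 3) sin(theta) d(theta),
    where r_max = a sin(theta) is the surface of the doughnut."""
    d_theta = math.pi / n
    total = 0.0
    for k in range(n):
        theta = (k + 0.5) * d_theta
        r_max = a * math.sin(theta)
        total += (r_max**3 / 3.0) * math.sin(theta) * d_theta
    return 2.0 * math.pi * total

a = 1.0
print(doughnut_volume(a), math.pi**2 * a**3 / 4)
```

The two printed values should agree closely, which at least confirms the limits of integration even if it says nothing about how you did the $\theta$ integral analytically.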
8.20 In plane polar coordinates, compute $\partial\hat{r}/\partial\phi$, also $\partial\hat{\phi}/\partial\phi$. This means that $r$ is fixed and
you’re finding the change in these vectors as you move around a circle. In both cases express the
answer in terms of the $\hat{r}$-$\hat{\phi}$ vectors. Draw pictures that will demonstrate that your answers are
at least in the right direction. Ans: $\partial\hat{\phi}/\partial\phi = -\hat{r}$
8.21 Compute the gradient of the distance from the origin (in three dimensions) in three
coordinate systems and verify that they agree.
8.22 Taylor’s power series expansion of a function of several variables was discussed in section
2.5. The Taylor series in one variable was expressed in terms of an exponential in problem 2.30.
Show that the series in three variables can be written as
$$e^{\vec{h}\cdot\nabla} f(x, y, z)$$
8.23 The wave equation is (a) below. Change variables to z = x − vt and w = x + vt and show
that in these coordinates this equation is (b) (except for a constant factor). Did you explicitly
note which variables are kept fixed at each stage of the calculation? See also problem 8.53.
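$$\text{(a)}\quad \frac{\partial^2 u}{\partial x^2} - \frac{1}{v^2}\frac{\partial^2 u}{\partial t^2} = 0 \qquad\qquad \text{(b)}\quad \frac{\partial^2 u}{\partial z\,\partial w} = 0$$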
8.24 The equation (8.23) comes from taking the gradient of the Earth’s gravitational potential
in an expansion to terms in 1/r3.
$$V = -\frac{GM}{r} - \frac{GQ}{r^3}P_2(\cos\theta)$$
where $P_2(\cos\theta) = \frac{3}{2}\cos^2\theta - \frac{1}{2}$ is the second order Legendre polynomial. Compute $\vec{g} = -\nabla V$.
8.25 In problem 2.25 you computed the electric potential at large distances from a pair of
charges, $-q$ at the origin and $+q$ at $z = a$ ($r \gg a$). The result was
$$V = \frac{kqa}{r^2}P_1(\cos\theta)$$
where $P_1(\cos\theta) = \cos\theta$ is the first order Legendre polynomial. Compute the electric field from
this potential, $\vec{E} = -\nabla V$. And sketch it of course.
8.26 In problem 2.26 you computed the electric potential at large distances from a set of three
charges, $-2q$ at the origin and $+q$ at $z = \pm a$ ($r \gg a$). The result was
$$V = \frac{kqa^2}{r^3}P_2(\cos\theta)$$
where $P_2(\cos\theta)$ is the second order Legendre polynomial. Compute the electric field from this
potential, $\vec{E} = -\nabla V$. And sketch it of course.
8.27 Compute the area of an ellipse having semi-major and semi-minor axes a and b. Compare
your result to that of Eq. (8.35). Ans: πab
8.28 Two equal point charges $q$ are placed at $z = \pm a$. The origin is a point of equilibrium;
$\vec{E} = 0$ there. (a) Compute the potential near the origin, writing $V$ in terms of powers of $x$, $y$,
and $z$ near there, carrying the powers high enough to describe the nature of the equilibrium point.
Is $V$ maximum, minimum, or saddle point there? It will be easier if you carry the calculation as
far as possible using vector notation, such as $|\vec{r} - a\hat{z}| = \sqrt{(\vec{r} - a\hat{z})^2}$, and $r \ll a$.
(b) Write your result for $V$ near the origin in spherical coordinates also.
Ans: $\dfrac{2q}{4\pi\epsilon_0 a}\left[1 + \dfrac{r^2}{a^2}\left(\dfrac{3}{2}\cos^2\theta - \dfrac{1}{2}\right)\right]$
8.29 When current $I$ flows through a resistance $R$ the heat produced is $I^2R$. Two terminals are
connected in parallel by three resistors having resistance $R_1$, $R_2$, and $R_3$. Given that the total
current is divided as $I = I_1 + I_2 + I_3$, show that the condition that the total heat generated
is a minimum leads to the relation $I_1R_1 = I_2R_2 = I_3R_3$. You can easily do problem 8.8 by
eliminating a coordinate then doing a derivative. Here it’s starting to get sufficiently complex that
you should use Lagrange multipliers. Does $\lambda$ have any significance this time?
8.30 Given a right circular cylinder of volume $V$, what radius and height will provide the minimum
total area for the cylinder? Ans: $r = (V/2\pi)^{1/3}$, $h = 2r$
8.31 Sometimes the derivative isn’t zero at a maximum or a minimum. Also, there are two
types of maxima and minima: local and global. The former is one that is max or min in the
immediate neighborhood of a point and the latter is biggest or smallest over the entire domain
of the function. Examine these functions for maxima and minima both inside the domains and
on the boundary.
$|x|$, $(-1 \le x \le +2)$
$T_0\left(x^2 - y^2\right)/a^2$, $(-a \le x \le a,\ -a \le y \le a)$
$V_0(r^2/R^2)P_2(\cos\theta)$, ($r \le R$, 3 dimensions)
8.32 In Eq. (8.39) it is more common to specify $N$ and $\beta = 1/kT$, the Lagrange multiplier,
than it is to specify $N$ and $E$, the total energy. Pick three energies, $E_\ell$, to be 1, 2, and 3 electron
volts. (a) What is the average energy, $E/N$, as $\beta \to \infty$ ($T \to 0$)?
(b) What is the average energy as β → 0?
(c) What are $n_1$, $n_2$, and $n_3$ in these two cases?
8.33 (a) Find the gradient of $V$, where $V = V_0(x^2 + y^2 + z^2)\,a^{-2}\,e^{-\sqrt{x^2+y^2+z^2}/a}$. (b) Find
the gradient of $V$, where $V = V_0(x + y + z)\,a^{-1}\,e^{-(x+y+z)/a}$.
8.34 A billiard ball of radius R is suspended in space and is held rigidly in position. Very small
pellets are thrown at it and the scattering from the surface is completely elastic, with no friction.
Compute the relation between the impact parameter b and the scattering angle θ. Then compute
the differential scattering cross section dσ/dΩ.
Finally compute the total scattering cross section, the integral of this over dΩ.
8.35 Modify the preceding problem so that the incoming object is a ball of radius R1 and the
fixed billiard ball has radius R2.
8.36 Find the differential scattering cross section from a spherical drop of water, but instead
of Snell’s law, use a pre-Snell law: β = nα, without the sines. Is there a rainbow in this case?
Sketch dσ/dΩ versus θ.
Ans: $\dfrac{R^2\sin 2\beta}{4\sin\theta\,|1 - 2/n|}$, where $\theta = \pi + 2(1 - 2/n)\beta$
8.37 From the equation (8.43), assuming only a single $b$ for a given $\theta$, what is the integral over
all $d\Omega$ of $d\sigma/d\Omega$? Ans: $\pi b_{\max}^2$
8.38 Solve Eq. (8.47) for b when dθ/db = 0. For n = 1.33 what value of θ does this give?
8.39 If the scattering angle $\theta = \frac{\pi}{2}\sin(\pi b/R)$ for $0 < b < R$, what is the resulting differential
scattering cross section (with graph)? What is the total scattering cross section? Start by
sketching a graph of $\theta$ versus $b$. Ans: $2R^2\Big/\left[\pi^2\sin\theta\,\sqrt{1 - (2\theta/\pi)^2}\,\right]$
8.40 Work out the signs of all the factors in Eq. (8.49), and determine from that whether red
or blue is on the outside of the rainbow. Ans: Look
8.41 If it suddenly starts to rain small, spherical diamonds instead of water, what happens to
the rainbow? n = 2.4 for diamond.
8.42 What would the rainbow look like for n = 2? You’ll have to look closely at the expansions
in this case. For small b, where does the ray hit the inside surface of the drop?
8.43 (a) The secondary rainbow occurs because there can be two internal reflections before the
light leaves the drop. What is the analog of Eqs. (8.44) for this case? (b) Repeat problems 8.38
and 8.40 for this case.
8.44 What is the shortest distance from the origin to the plane defined by $\vec{A}\cdot(\vec{r} - \vec{r}_0) = 0$? Do
this using Lagrange multipliers, and then explain why of course the answer is correct.
8.45 The U.S. Post Office has decided to use a norm like Eq. (6.11)(2) to measure boxes. The
size is defined to be the sum of the height and the circumference of the box, and the circumference
is around the thickest part of the package: “length plus girth.” What is the maximum volume
you can ship if this size is constrained to be less than 130 inches? For this purpose, assume the
box is rectangular, not cylindrical, though you may expect the cylinder to improve the result.
Assume that the box’s dimensions are $a$, $a$, $b$, with volume $a^2b$.
(a) Show that if you assume that the girth is 4a, then you will conclude that b > a and that you
didn’t measure the girth at the thickest part of the package.
(b) Do it again with the opposite assumption, that you assume b is big so that the girth is 2b+2a.
Again show that it is a contradiction.
(c) You have two inequalities that you must satisfy: girth plus length measured one way is less
than L = 130 inches and girth plus length measured the other way is too. That is, 4a+b < L and
3a + 2b < L. Plot these regions in the a-b plane, showing the allowed region in a-b space. Also
plot some curves of constant volume, $V = a^2b$. Show that the point of maximum volume subject
to these constraints is on the edge of this allowed region, and that it is at the corner of intersection
of the two inequalities. This is the beginning of the subject called “linear programming.”
Ans: a cube
8.46 Plot θ versus b in equation (8.45) or (8.46).
8.47 A disk of radius $R$ is at a distance $c$ above the $x$-$y$ plane and parallel to that plane. What
is the solid angle that this disk subtends from the origin? Ans: $2\pi\left[1 - c/\sqrt{c^2 + R^2}\,\right]$
8.48 Within a sphere of radius $R$, what is the volume contained between the planes defined by
$z = a$ and $z = b$? Ans: $\pi(b - a)\left[R^2 - \frac{1}{3}(b^2 + ab + a^2)\right]$
8.49 Find the mean-square distance, $\frac{1}{V}\int r^2\,dV$, from a point on the surface of a sphere to
points inside the sphere. Note: Plan ahead and try to make this problem as easy as possible.
Ans: $8R^2/5$
8.50 Find the mean distance, $\frac{1}{V}\int r\,dV$, from a point on the surface of a sphere to points inside
the sphere. Unlike the preceding problem, this requires some brute force. Ans: $6R/5$
8.51 A volume mass density is specified in spherical coordinates to be
$$\rho(r, \theta, \phi) = \rho_0\left(1 + r^2/R^2\right)\left[1 + \tfrac{1}{2}\cos\theta\,\sin^2\phi + \tfrac{1}{4}\cos^2\theta\,\sin^3\phi\right]$$
Compute the total mass in the volume $0 < r < R$. Ans: $32\pi\rho_0R^3/15$
8.52 The circumference of a circle is some constant times its radius ($C_1r$). For the two-dimensional
surface that is a sphere in three dimensions the area is of the form $C_2r^2$. Start
from the fact that you know the integral $\int_{-\infty}^{\infty} dx\, e^{-x^2} = \pi^{1/2}$ and write out the following
two dimensional integral twice. It is over the entire plane.
$$\int dA\, e^{-r^2} \qquad\text{using}\quad dA = dx\,dy \quad\text{and using}\quad dA = C_1r\,dr$$
From this, evaluate $C_1$. Repeat this for $dV$ and $C_2r^2$ in three dimensions, evaluating $C_2$.
Now repeat this in arbitrary dimensions to evaluate $C_n$. Do you need to reread chapter one? In
particular, what is $C_3$? It tells you about the three dimensional hypersphere in four dimensions.
From this, what is the total “hypersolid angle” in four dimensions (like $4\pi$ in three)? Ans: $2\pi^2$
8.53 Do the reverse of problem 8.23. Start with the second equation there and change variables
to see that it reverts to a constant times the first equation.
8.54 Carry out the interchange of limits in Eq. (8.22). Does the drawing really represent the
integral?
8.55 Is $x^2 + xy + y^2$ a minimum or maximum or something else at $(0, 0)$? Do the same question
for $x^2 + 2xy + y^2$ and for $x^2 + 3xy + y^2$. Sketch the surface $z = f(x, y)$ in each case.
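All three functions have the form $x^2 + cxy + y^2$, so one way to organize the answer is through the matrix of second derivatives, whose eigenvalues for this family are $2 + c$ and $2 - c$. The following is only a sketch of that bookkeeping; the function name and the wording of the labels are mine, and it is no substitute for the sketches the problem asks for.

```python
def classify_quadratic(c):
    """Classify the critical point (0, 0) of f(x, y) = x**2 + c*x*y + y**2.

    The matrix of second derivatives is [[2, c], [c, 2]],
    whose eigenvalues are 2 + c and 2 - c.
    """
    eigenvalues = (2 + c, 2 - c)
    if all(e > 0 for e in eigenvalues):
        return "minimum"
    if all(e < 0 for e in eigenvalues):
        return "maximum"
    if any(e == 0 for e in eigenvalues):
        return "degenerate (second-derivative test is inconclusive)"
    return "saddle"

for c in (1, 2, 3):
    print(c, classify_quadratic(c))
```

The middle case is the instructive one: a zero eigenvalue means the second derivatives alone don't settle the question, and you have to look at the function itself, which here is a perfect square.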
8.56 Derive the conditions stated after Eq. (8.33), expressing the circumstances under which
the Hessian matrix is positive definite.
8.57 In the spirit of problems 8.10 et seq. what happens if you have a rectangular drumhead
instead of a circular one? Let 0 < x < a and 0 < y < b. The drumhead is tied down at its
edges, so an appropriate function that satisfies these conditions is
z(x, y) = A sin(nπx/a) sin(mπy/b) cos ωt
Compute the total kinetic and the total potential energy for this oscillation, a function of time.
For energy to be conserved the total energy must be a constant, so compute the frequency ω for
which this is true. As compared to the previous problems about a circular drumhead, this turns
out to give the exact results instead of only approximate ones. Ans: $\omega^2 = \pi^2\,\dfrac{T}{\mu}\left[\dfrac{n^2}{a^2} + \dfrac{m^2}{b^2}\right]$
8.58 Repeat problem 8.45 by another method. Instead of assuming that the box has a square
end, allow it to be any rectangular box, so that its volume is V = abc. Now you have three
independent variables to use, maximizing the volume subject to the post office’s constraint on
length plus girth. This looks like it will have to be harder. Instead, it’s much easier. Draw
pictures! Ans: still a cube
8.59 An asteroid is headed in the general direction of Earth, and its speed
when far away is $v_0$ relative to the Earth. What is the total cross section for
its hitting Earth? It is not necessary to compute the complete orbit; all you
have to do is use a couple of conservation laws. Express the result in terms
of the escape speed from Earth.
Ans: $\sigma = \pi R^2\left[1 + (v_{\rm esc}/v_0)^2\right]$
8.60 In three dimensions the differential scattering cross section appeared in Eqs. (8.42) and
(8.43). If the world were two dimensional this area would be a length instead. What are the
two corresponding equations in that case, giving you an expression for $d\ell/d\theta$? Apply this to the
light scattering from a (two dimensional) drop of water and describe the scattering results. For
simplicity this time, assume the pre-Snell law as in problem 8.36.
8.61 As in the preceding problem, but use the regular Snell law instead.
8.62 This double integral is over the isosceles right triangle in the figure. The
function to be integrated is $f(t') = \alpha t'^3$, BUT FIRST, set it up for an arbitrary
$f(t')$ and then set it up again but with the order of integration reversed. In
one of the two cases you should be able to do one integral without knowing $f$.
Having done this, apply your two results to this particular $f$ as a test case that
your work was correct. In the figure, $t'$ and $t''$ are the two coordinates and $t$
is the coordinate of the top of the triangle.