Unit C3: Partial Derivatives and the Gradient
Differentiation in several variables, from partial derivatives and the multivariable chain rule to the gradient vector that drives all of multivariable optimization and geometry.
The unit starts with level sets, then confronts the new subtlety of multivariable limits: a function must approach the same value along every path into the point. On that foundation we build partial derivatives, the higher-order and mixed partials governed by Clairaut's theorem, and the multivariable chain rule. The payoff is the gradient vector, which encodes the direction of steepest increase and is perpendicular to every level set. Work through the proofs and examples carefully; the gap between the existence of partial derivatives and genuine differentiability is the conceptual core of this unit.
Functions of Several Variables
A function of several variables assigns a real number to each point of a region in $\mathbb{R}^n$. When $n=2$ we write $z = f(x,y)$; the domain is a subset of the plane and the graph is a surface in three-dimensional space. The tools of single-variable calculus carry over, but the input can now move in infinitely many directions.
Definition (function of two variables). A function $f$ of two variables is a rule that assigns to each ordered pair $(x,y)$ in a set $D \subseteq \mathbb{R}^2$ a unique real number $f(x,y)$. The set $D$ is the domain and the set of output values is the range. The graph of $f$ is the surface $\{(x,y,z) : z = f(x,y),\ (x,y)\in D\}$.
A level curve of $f(x,y)$ is the set of points where $f$ takes a fixed value $k$. The collection of level curves for several values of $k$ is a contour map: closely spaced curves indicate a steep surface, widely spaced curves a gentle one. For three variables the analogous objects are level surfaces $f(x,y,z)=k$.
Worked Example 1.1: domain and level curves of a radical function
Find the domain and describe the level curves of $f(x,y) = \sqrt{9 - x^2 - y^2}$.
The radicand must be nonnegative, so the domain is the closed disk
$$ 9 - x^2 - y^2 \ge 0 \iff x^2 + y^2 \le 9, $$a disk of radius $3$ centered at the origin. A level curve sets $f(x,y)=k$ with $0 \le k \le 3$:
$$ \sqrt{9 - x^2 - y^2} = k \implies x^2 + y^2 = 9 - k^2, $$a circle of radius $\sqrt{9-k^2}$. The graph is the upper hemisphere of radius $3$; its contour map is a family of concentric circles that shrink to a point as $k \to 3$.
Reading the shape of a surface from its contours
Contour maps are not just bookkeeping. The spacing of level curves encodes the steepness of the surface, and their shape encodes its qualitative geometry. A saddle, a bowl, and a ridge each leave a distinctive contour signature, which is why geologists and meteorologists read topographic and pressure maps the way a calculus student reads a graph. The next two examples sharpen the skill of moving between the algebraic rule for $f$ and the geometry of its contours.
Worked Example 1.2: contours of a saddle and a paraboloid
Describe the level curves of $g(x,y) = x^2 - y^2$ and of $p(x,y) = x^2 + y^2$, and say what the surfaces look like.
Saddle. Setting $g=k$ gives $x^2 - y^2 = k$. For $k>0$ these are hyperbolas opening along the $x$-axis; for $k<0$ they open along the $y$-axis; for $k=0$ the level set degenerates to the two lines $y = \pm x$. The surface $z = x^2 - y^2$ is the standard saddle: it rises along the $x$-axis and falls along the $y$-axis, so the origin is neither a max nor a min.
Paraboloid. Setting $p=k$ gives $x^2 + y^2 = k$, a circle of radius $\sqrt{k}$ for $k>0$ and the single point $(0,0)$ for $k=0$; there is no level curve for $k<0$ because $p \ge 0$ always. Equally spaced values of $k$ produce circles whose radii grow like $\sqrt{k}$, so the circles crowd together as you move outward. That crowding is the contour map telling you the bowl gets steeper away from the origin.
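The crowding of the paraboloid's level circles can be checked numerically. The following is only an illustrative sketch; the helper name `level_circle_radius` and the choice of levels $k = 1,\dots,6$ are ours, not part of the unit.

```python
import math

def level_circle_radius(k):
    """Radius of the level curve x^2 + y^2 = k of p(x, y) = x^2 + y^2 (needs k >= 0)."""
    if k < 0:
        raise ValueError("p >= 0, so there is no level curve for k < 0")
    return math.sqrt(k)

# Equally spaced levels k = 1, 2, ..., 6 (an arbitrary illustrative choice)
levels = list(range(1, 7))
radii = [level_circle_radius(k) for k in levels]

# Distance between consecutive circles; it shrinks as k grows
gaps = [outer - inner for inner, outer in zip(radii, radii[1:])]

for k, r in zip(levels, radii):
    print(f"k = {k}: radius = {r:.4f}")
print("gaps:", [round(g, 4) for g in gaps])
```

The shrinking entries of `gaps` are the numerical counterpart of contours bunching up where the bowl is steep.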
Worked Example 1.3: a function of three variables and its level surfaces
Find the domain of $F(x,y,z) = \dfrac{1}{\sqrt{x^2+y^2+z^2-1}}$ and describe its level surfaces.
The square root requires $x^2+y^2+z^2 - 1 > 0$ (strictly positive, since it also sits in a denominator). So the domain is the exterior of the unit sphere, $\{(x,y,z): x^2+y^2+z^2 > 1\}$, with the sphere itself excluded.
A level surface sets $F = k$ for some $k>0$:
$$ \frac{1}{\sqrt{x^2+y^2+z^2-1}} = k \implies x^2+y^2+z^2 - 1 = \frac{1}{k^2} \implies x^2+y^2+z^2 = 1 + \frac{1}{k^2}. $$Each level surface is a sphere centered at the origin with radius $\sqrt{1 + 1/k^2} > 1$. Large $k$ (large output) corresponds to a radius just above $1$, that is, points hugging the excluded sphere; small $k$ corresponds to large spheres far out. The level surfaces are nested spheres filling the entire exterior region.
Limits and Continuity
The statement $\lim_{(x,y)\to(a,b)} f(x,y) = L$ means that $f(x,y)$ can be made arbitrarily close to $L$ by taking $(x,y)$ sufficiently close to $(a,b)$, whatever path the approach follows.
Definition (epsilon-delta). $\lim_{(x,y)\to(a,b)} f(x,y)=L$ if for every $\varepsilon>0$ there is a $\delta>0$ such that $0 < \sqrt{(x-a)^2+(y-b)^2} < \delta$ implies $|f(x,y)-L| < \varepsilon$.
Definition (continuity). $f$ is continuous at $(a,b)$ if $\lim_{(x,y)\to(a,b)} f(x,y) = f(a,b)$. Polynomials are continuous everywhere; rational functions are continuous wherever the denominator is nonzero; sums, products, quotients, and compositions of continuous functions are continuous on their domains.
Worked Example 2.1: a nonexistent limit by the two-path test
Show that $\displaystyle \lim_{(x,y)\to(0,0)} \frac{xy}{x^2+y^2}$ does not exist.
Approach along the $x$-axis ($y=0$): the function is $\frac{0}{x^2}=0$, so the limit along this path is $0$.
Approach along the line $y=x$:
$$ \frac{x\cdot x}{x^2+x^2} = \frac{x^2}{2x^2} = \frac{1}{2}. $$Two paths give two different values ($0$ and $\tfrac12$), so the limit does not exist.
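A quick numerical look at the two paths, as a sketch only: the step sizes $10^{-1},\dots,10^{-5}$ are an arbitrary choice, and sampling can suggest that a limit fails but never proves that one exists.

```python
def f(x, y):
    """f(x, y) = xy / (x^2 + y^2), defined away from the origin."""
    return x * y / (x**2 + y**2)

# Points marching toward (0, 0)
steps = [10.0 ** (-n) for n in range(1, 6)]

along_x_axis = [f(t, 0.0) for t in steps]    # path y = 0
along_diagonal = [f(t, t) for t in steps]    # path y = x

print("y = 0:", along_x_axis)
print("y = x:", along_diagonal)
```

The first list stays at $0$ and the second stays at $\tfrac12$, which is the disagreement the two-path test exploits.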
Going deeper: paths are not enough, but polar coordinates can close the case
Agreement along every straight line does not guarantee a limit; a path may be curved. Consider $g(x,y)=\dfrac{x^2 y}{x^4 + y^2}$. Along any line $y=mx$ the limit is $0$, yet along the parabola $y=x^2$ we get $\dfrac{x^4}{x^4+x^4}=\tfrac12$, so the limit does not exist.
To prove a limit exists, bound it. For $h(x,y)=\dfrac{x^3}{x^2+y^2}$ switch to polar coordinates $x=r\cos\theta,\ y=r\sin\theta$:
$$ |h| = \left| \frac{r^3\cos^3\theta}{r^2} \right| = r\,|\cos^3\theta| \le r \to 0 \quad \text{as } r\to 0. $$The bound is independent of $\theta$, so the limit is $0$ regardless of direction. The squeeze on $r$ is what makes the argument airtight.
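Both observations can be probed numerically. This is a hedged sketch: the slope $m = 2$, the scale $10^{-3}$, and the 360 sample angles are illustrative assumptions, and the sampled maximum only illustrates the bound $|h| \le r$ rather than proving it.

```python
import math

def g(x, y):
    """g(x, y) = x^2 y / (x^4 + y^2), away from the origin."""
    return x**2 * y / (x**4 + y**2)

def h(x, y):
    """h(x, y) = x^3 / (x^2 + y^2), away from the origin."""
    return x**3 / (x**2 + y**2)

t = 1e-3
line_value = g(t, 2.0 * t)     # on the line y = 2x: already small
parabola_value = g(t, t**2)    # on the parabola y = x^2: about 1/2

# Sample |h| on the circle of radius r and compare with the polar bound r
r = 1e-3
angles = [2 * math.pi * j / 360 for j in range(360)]
max_abs_h = max(abs(h(r * math.cos(a), r * math.sin(a))) for a in angles)

print(line_value, parabola_value)
print(max_abs_h, "<= about", r)
```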
Worked Example 2.2: an epsilon-delta proof of a two-variable limit
Prove directly from the definition that $\displaystyle \lim_{(x,y)\to(0,0)} \frac{3x^2 y}{x^2+y^2} = 0$.
Fix $\varepsilon>0$. We must produce $\delta>0$ so that $0 < \sqrt{x^2+y^2} < \delta$ forces $\left|\frac{3x^2 y}{x^2+y^2} - 0\right| < \varepsilon$. The key estimate uses $x^2 \le x^2 + y^2$:
$$ \left| \frac{3x^2 y}{x^2+y^2} \right| = 3\,\frac{x^2}{x^2+y^2}\,|y| \le 3\cdot 1 \cdot |y| = 3|y| \le 3\sqrt{x^2+y^2}. $$So if we choose $\delta = \varepsilon/3$, then $\sqrt{x^2+y^2} < \delta$ gives
$$ \left| \frac{3x^2 y}{x^2+y^2} \right| \le 3\sqrt{x^2+y^2} < 3\delta = \varepsilon. $$This holds for every $\varepsilon>0$, so the limit is $0$. Notice the structure: bound the messy fraction by a constant times the distance $\sqrt{x^2+y^2}$, then read off $\delta$ by solving the resulting inequality. The factor $\frac{x^2}{x^2+y^2} \le 1$ is the workhorse, and it is exactly the kind of bound that path-testing cannot supply.
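The choice $\delta = \varepsilon/3$ can be spot-checked by random sampling. This sketch is not a substitute for the proof; the function names `q`, `delta_for`, and `check`, the sample count, and the seed are all illustrative assumptions.

```python
import math
import random

def q(x, y):
    """q(x, y) = 3 x^2 y / (x^2 + y^2), away from the origin."""
    return 3 * x**2 * y / (x**2 + y**2)

def delta_for(eps):
    """The choice delta = eps / 3 made in the proof."""
    return eps / 3

def check(eps, samples=10_000, seed=0):
    """Sample points with 0 < distance < delta and test |q| < eps at each one."""
    rng = random.Random(seed)
    delta = delta_for(eps)
    for _ in range(samples):
        r = delta * rng.uniform(0.001, 0.999)    # keeps 0 < r < delta
        theta = rng.uniform(0.0, 2 * math.pi)
        x, y = r * math.cos(theta), r * math.sin(theta)
        if abs(q(x, y)) >= eps:
            return False
    return True

print(check(0.1), check(1e-3))
```

A `True` result only says no sampled point violated the bound; the inequality chain above is what covers every point.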
Partial Derivatives
A partial derivative measures the rate of change of $f$ as one variable varies while the others are held fixed. Geometrically, $f_x(a,b)$ is the slope of the curve in which the plane $y=b$ cuts the surface. Computationally, you differentiate with respect to one variable and treat the rest as constants.
Notation. The partial derivative of $z=f(x,y)$ with respect to $x$ is written $f_x$, $\dfrac{\partial f}{\partial x}$, or $\dfrac{\partial z}{\partial x}$. The symbol $\partial$ distinguishes a partial derivative from the ordinary derivative $d/dx$ of a single-variable function. Higher dimensions are identical: for $f(x,y,z)$ there are three first-order partials $f_x, f_y, f_z$.
Worked Example 3.1: computing both first partials
Let $f(x,y) = x^3 y^2 + \sin(xy)$. Find $f_x$ and $f_y$.
For $f_x$, treat $y$ as a constant and differentiate in $x$:
$$ f_x = 3x^2 y^2 + y\cos(xy). $$For $f_y$, treat $x$ as a constant and differentiate in $y$:
$$ f_y = 2x^3 y + x\cos(xy). $$The chain rule supplies the inner factor ($y$ in $f_x$, $x$ in $f_y$) on the $\sin(xy)$ term.
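Hand-computed partials are easy to sanity-check against a central difference. In this sketch the helper `central_diff`, the step $h = 10^{-5}$, and the test point $(1.2, -0.7)$ are arbitrary illustrative choices.

```python
import math

def f(x, y):
    return x**3 * y**2 + math.sin(x * y)

def f_x(x, y):
    """Partial in x found above: 3 x^2 y^2 + y cos(xy)."""
    return 3 * x**2 * y**2 + y * math.cos(x * y)

def f_y(x, y):
    """Partial in y found above: 2 x^3 y + x cos(xy)."""
    return 2 * x**3 * y + x * math.cos(x * y)

def central_diff(func, x, y, var, h=1e-5):
    """Central-difference estimate of a first partial of func at (x, y)."""
    if var == "x":
        return (func(x + h, y) - func(x - h, y)) / (2 * h)
    return (func(x, y + h) - func(x, y - h)) / (2 * h)

x0, y0 = 1.2, -0.7    # arbitrary test point
print(f_x(x0, y0), central_diff(f, x0, y0, "x"))
print(f_y(x0, y0), central_diff(f, x0, y0, "y"))
```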
Worked Example 3.2: a partial derivative straight from the limit definition
Use the limit definition to compute $f_x(1,2)$ for $f(x,y) = x^2 y + 3y$.
Hold $y=2$ fixed and form the difference quotient in $x$:
$$ f_x(1,2) = \lim_{h\to 0} \frac{f(1+h,2) - f(1,2)}{h}. $$Compute the two values. Since $f(x,2) = 2x^2 + 6$, we have $f(1+h,2) = 2(1+h)^2 + 6 = 2 + 4h + 2h^2 + 6$ and $f(1,2) = 2 + 6 = 8$. Then
$$ \frac{f(1+h,2)-f(1,2)}{h} = \frac{(8 + 4h + 2h^2) - 8}{h} = \frac{4h + 2h^2}{h} = 4 + 2h. $$Letting $h\to 0$ gives $f_x(1,2) = 4$. As a check, the rule $f_x = 2xy$ gives $f_x(1,2) = 2\cdot 1\cdot 2 = 4$. The definition and the shortcut agree, which is the point of the shortcut.
Worked Example 3.3: quotient and exponential partials in three variables
Let $g(x,y,z) = \dfrac{x\,e^{z}}{y}$. Find all three first partials at $(2,1,0)$.
Treat the other two variables as constants each time. With respect to $x$, the factor $e^z/y$ is constant:
$$ g_x = \frac{e^z}{y}. $$With respect to $y$, write $g = x e^z\cdot y^{-1}$, so
$$ g_y = -\frac{x e^z}{y^2}. $$With respect to $z$, the factor $x/y$ is constant and $\frac{\partial}{\partial z}e^z = e^z$:
$$ g_z = \frac{x e^z}{y}. $$Evaluate at $(2,1,0)$, where $e^0 = 1$: $\,g_x = 1,\ g_y = -2,\ g_z = 2$. Notice that $g_z = g$ identically, not just at this point, because the exponential is its own derivative.
Going deeper: a partial derivative is an ordinary derivative of a slice
The cleanest way to think about $f_x(a,b)$ is to freeze $y=b$ first and study the one-variable function
$$ \phi(x) = f(x,b). $$This $\phi$ is the slice of the surface cut by the vertical plane $y=b$. Then by the single-variable definition,
$$ \phi'(a) = \lim_{h\to 0}\frac{\phi(a+h)-\phi(a)}{h} = \lim_{h\to 0}\frac{f(a+h,b)-f(a,b)}{h} = f_x(a,b). $$So $f_x(a,b)$ is literally the ordinary derivative of the slice $\phi$ at $x=a$, which is why every single-variable rule (product, quotient, chain) transfers verbatim once you decide which variable is "the" variable. Geometrically, $\phi'(a)$ is the slope of the tangent line to the slice curve, and that tangent line lies in the plane $y=b$. The pair of tangent lines from $f_x$ and $f_y$ spans the tangent plane studied in Section 7.
Higher Partial Derivatives
For well-behaved functions the order of differentiation in a mixed partial does not matter (Clairaut's theorem). It is this symmetry that makes the gradient of a well-behaved function behave so cleanly.
Theorem (Clairaut, equality of mixed partials). If $f_{xy}$ and $f_{yx}$ are both continuous on an open disk containing $(a,b)$, then $f_{xy}(a,b) = f_{yx}(a,b)$. The notation $f_{xy}$ means differentiate first with respect to $x$, then with respect to $y$.
Worked Example 4.1: verifying Clairaut's theorem
Let $f(x,y) = x^2 y^3 + e^{x} y$. Compute $f_{xy}$ and $f_{yx}$ and confirm they agree.
First partials:
$$ f_x = 2xy^3 + e^x y, \qquad f_y = 3x^2 y^2 + e^x. $$Now the mixed partials:
$$ f_{xy} = \frac{\partial}{\partial y}\big(2xy^3 + e^x y\big) = 6xy^2 + e^x, $$ $$ f_{yx} = \frac{\partial}{\partial x}\big(3x^2 y^2 + e^x\big) = 6xy^2 + e^x. $$They match, as Clairaut's theorem guarantees for this smooth function.
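The common value of the mixed partials can be compared with a finite-difference estimate. This is a sketch under stated assumptions: the four-point stencil in `mixed_numeric`, the step $h = 10^{-3}$, and the test point are our choices. The stencil is symmetric in $x$ and $y$, so it estimates the shared value and cannot by itself tell $f_{xy}$ from $f_{yx}$.

```python
import math

def f(x, y):
    return x**2 * y**3 + math.exp(x) * y

def mixed_exact(x, y):
    """f_xy = f_yx = 6 x y^2 + e^x, as computed above."""
    return 6 * x * y**2 + math.exp(x)

def mixed_numeric(func, x, y, h=1e-3):
    """Symmetric four-point estimate of the mixed second partial of func."""
    return (func(x + h, y + h) - func(x + h, y - h)
            - func(x - h, y + h) + func(x - h, y - h)) / (4 * h * h)

x0, y0 = 0.5, 1.5    # arbitrary test point
print(mixed_exact(x0, y0), mixed_numeric(f, x0, y0))
```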
Worked Example 4.2: a function satisfying Laplace's equation
Show that $u(x,y) = e^{x}\cos y$ is harmonic, meaning it satisfies Laplace's equation $u_{xx} + u_{yy} = 0$.
Differentiate twice in each variable. First in $x$:
$$ u_x = e^{x}\cos y, \qquad u_{xx} = e^{x}\cos y. $$Now in $y$:
$$ u_y = -e^{x}\sin y, \qquad u_{yy} = -e^{x}\cos y. $$Add them:
$$ u_{xx} + u_{yy} = e^{x}\cos y - e^{x}\cos y = 0. $$So $u$ is harmonic. Functions obeying Laplace's equation describe steady-state temperature, electrostatic potential, and incompressible flow, which is why the second-order partials of this section are the gateway to partial differential equations. As a side check, the mixed partials $u_{xy} = -e^{x}\sin y = u_{yx}$ agree, exactly as Clairaut promises.
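A numerical cross-check of harmonicity, offered as a sketch: `laplacian_numeric` uses the standard five-point difference stencil, and the step size and sample points are arbitrary assumptions.

```python
import math

def u(x, y):
    return math.exp(x) * math.cos(y)

def laplacian_numeric(func, x, y, h=1e-3):
    """Five-point finite-difference estimate of func_xx + func_yy at (x, y)."""
    return (func(x + h, y) + func(x - h, y)
            + func(x, y + h) + func(x, y - h)
            - 4 * func(x, y)) / (h * h)

sample_points = [(0.0, 0.0), (1.0, 2.0), (-0.5, 0.3)]
estimates = [laplacian_numeric(u, px, py) for px, py in sample_points]

for point, value in zip(sample_points, estimates):
    print(point, value)    # each value should be close to 0
```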
Worked Example 4.3: the full set of second partials of a product
For $f(x,y) = x^3 y - 2xy^2$, compute all four second-order partials and verify the mixed ones agree.
First partials:
$$ f_x = 3x^2 y - 2y^2, \qquad f_y = x^3 - 4xy. $$Pure second partials:
$$ f_{xx} = 6xy, \qquad f_{yy} = -4x. $$Mixed partials, computed both ways:
$$ f_{xy} = \frac{\partial}{\partial y}\big(3x^2 y - 2y^2\big) = 3x^2 - 4y, \qquad f_{yx} = \frac{\partial}{\partial x}\big(x^3 - 4xy\big) = 3x^2 - 4y. $$The two mixed partials are identical, so Clairaut is confirmed. Note that $f_{xx}$ and $f_{yy}$ are generally different from each other; only the mixed pair is forced to agree.
Going deeper: why continuity is required for Clairaut's theorem
The hypothesis of continuity is not cosmetic. The classic counterexample is
$$ f(x,y) = \begin{cases} \dfrac{xy(x^2-y^2)}{x^2+y^2}, & (x,y)\ne(0,0) \\[4pt] 0, & (x,y)=(0,0). \end{cases} $$A direct computation from the limit definitions gives $f_{xy}(0,0) = -1$ but $f_{yx}(0,0) = +1$. The mixed partials disagree at the origin precisely because they fail to be continuous there. Away from the origin the function is smooth and the two mixed partials coincide as usual.
The takeaway: order of differentiation can be swapped freely for the smooth functions you meet in practice, but the underlying reason is the continuity hypothesis, not a universal law.
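The disagreement at the origin shows up in nested difference quotients. This is an illustrative sketch: the inner step `h` and the outer step `k` are assumptions, chosen with `k` much larger than `h` so that the inner quotient is a good stand-in for the first partial.

```python
def f(x, y):
    """The counterexample: xy(x^2 - y^2)/(x^2 + y^2), with f(0, 0) = 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y * (x**2 - y**2) / (x**2 + y**2)

def f_x(x, y, h=1e-6):
    """Central-difference estimate of the first partial in x."""
    return (f(x + h, y) - f(x - h, y)) / (2 * h)

def f_y(x, y, h=1e-6):
    """Central-difference estimate of the first partial in y."""
    return (f(x, y + h) - f(x, y - h)) / (2 * h)

k = 1e-2    # outer step, much larger than the inner step h
f_xy_at_origin = (f_x(0.0, k) - f_x(0.0, -k)) / (2 * k)    # x first, then y
f_yx_at_origin = (f_y(k, 0.0) - f_y(-k, 0.0)) / (2 * k)    # y first, then x

print(f_xy_at_origin, f_yx_at_origin)    # close to -1 and +1
```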
The Multivariable Chain Rule
If $z = f(x,y)$ is differentiable and $x = x(t)$, $y = y(t)$ are differentiable functions of $t$, then
$$ \frac{dz}{dt} = f_x\,\frac{dx}{dt} + f_y\,\frac{dy}{dt}. $$If instead $x = x(s,t)$ and $y = y(s,t)$, then
$$ \frac{\partial z}{\partial t} = f_x\,\frac{\partial x}{\partial t} + f_y\,\frac{\partial y}{\partial t}, $$and similarly for $\partial z/\partial s$.
Worked Example 5.1: chain rule along a parametrized path
Let $z = x^2 y$ with $x = \cos t$ and $y = \sin t$. Find $dz/dt$.
The partials and the parameter derivatives are
$$ \frac{\partial z}{\partial x} = 2xy, \quad \frac{\partial z}{\partial y} = x^2, \quad \frac{dx}{dt} = -\sin t, \quad \frac{dy}{dt} = \cos t. $$Assemble:
$$ \frac{dz}{dt} = 2xy(-\sin t) + x^2(\cos t) = -2\cos t\sin^2 t + \cos^3 t. $$Substituting $x=\cos t,\ y=\sin t$ gives the answer entirely in $t$.
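The chain-rule answer can be compared with a direct numerical derivative of the composite $t \mapsto z$. In this sketch the function names, the step $h = 10^{-6}$, and the sample values of $t$ are illustrative assumptions.

```python
import math

def z_of_t(t):
    """z = x^2 y evaluated along the path x = cos t, y = sin t."""
    x, y = math.cos(t), math.sin(t)
    return x**2 * y

def dz_dt_chain(t):
    """Chain-rule result from above: -2 cos t sin^2 t + cos^3 t."""
    return -2 * math.cos(t) * math.sin(t)**2 + math.cos(t)**3

def dz_dt_numeric(t, h=1e-6):
    """Central-difference derivative of the composite function."""
    return (z_of_t(t + h) - z_of_t(t - h)) / (2 * h)

sample_times = [0.0, 0.7, 2.0]
for t in sample_times:
    print(t, dz_dt_chain(t), dz_dt_numeric(t))
```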
Worked Example 5.2: two independent variables via a tree diagram
Let $z = f(x,y)$ with $x = s^2 - t^2$ and $y = 2st$. Express $\partial z/\partial s$ and $\partial z/\partial t$ in terms of $f_x, f_y, s, t$.
There are two paths from $s$ up to $z$: through $x$ and through $y$. Sum the product of partials along each path:
$$ \frac{\partial z}{\partial s} = f_x\,\frac{\partial x}{\partial s} + f_y\,\frac{\partial y}{\partial s} = f_x(2s) + f_y(2t) = 2s\,f_x + 2t\,f_y. $$Likewise for $t$, using $\partial x/\partial t = -2t$ and $\partial y/\partial t = 2s$:
$$ \frac{\partial z}{\partial t} = f_x(-2t) + f_y(2s) = -2t\,f_x + 2s\,f_y. $$The substitutions $x = s^2 - t^2,\ y = 2st$ are the real and imaginary parts of $(s+it)^2$, so this is exactly how the chain rule converts derivatives under the complex-squaring change of variables. Even without that observation, the tree diagram makes the bookkeeping mechanical: one term per path, partials multiplied along the path, results summed.
Worked Example 5.3: converting a Laplacian-type derivative to polar coordinates
If $z = f(x,y)$ with $x = r\cos\theta$ and $y = r\sin\theta$, find $\partial z/\partial r$ and $\partial z/\partial\theta$.
The parameter derivatives are
$$ \frac{\partial x}{\partial r} = \cos\theta, \quad \frac{\partial y}{\partial r} = \sin\theta, \quad \frac{\partial x}{\partial\theta} = -r\sin\theta, \quad \frac{\partial y}{\partial\theta} = r\cos\theta. $$Apply the chain rule along both paths:
$$ \frac{\partial z}{\partial r} = f_x\cos\theta + f_y\sin\theta, \qquad \frac{\partial z}{\partial\theta} = -r\,f_x\sin\theta + r\,f_y\cos\theta. $$These two formulas are the starting point for rewriting the Laplacian $f_{xx}+f_{yy}$ in polar coordinates, a calculation every physics and engineering student eventually needs. The first equation also says $\partial z/\partial r = \nabla f \cdot \langle\cos\theta,\sin\theta\rangle$, the directional derivative of $f$ in the radial direction, a preview of Section 6.
Going deeper: deriving the chain rule from the increment formula
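Suppose $f$ is differentiable at $(x,y)$ and that $x = x(t)$, $y = y(t)$ are differentiable (hence continuous) functions of $t$, so that $\Delta x, \Delta y \to 0$ as $\Delta t \to 0$. Differentiability of $f$ means the increment satisfies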
$$ \Delta z = f_x\,\Delta x + f_y\,\Delta y + \varepsilon_1\,\Delta x + \varepsilon_2\,\Delta y, $$where $\varepsilon_1,\varepsilon_2 \to 0$ as $(\Delta x,\Delta y)\to(0,0)$. Divide by $\Delta t$ and let $\Delta t \to 0$:
$$ \frac{dz}{dt} = f_x\,\frac{dx}{dt} + f_y\,\frac{dy}{dt} + \lim_{\Delta t\to 0}\left(\varepsilon_1\frac{\Delta x}{\Delta t} + \varepsilon_2\frac{\Delta y}{\Delta t}\right). $$Because $\Delta x/\Delta t \to dx/dt$ and $\Delta y/\Delta t \to dy/dt$ are finite while $\varepsilon_1,\varepsilon_2\to 0$, the trailing limit vanishes. What remains is exactly the chain-rule formula. The hypothesis of differentiability (not merely the existence of partials) is what makes the linear increment approximation valid.
The Gradient Vector
The gradient $\nabla f$ packages all the first-order partial derivatives into a single vector, $\nabla f = \langle f_x, f_y\rangle$ in two variables. It points in the direction of steepest increase of $f$, its length is the maximum rate of increase, and it is perpendicular to the level curves and level surfaces of $f$. Nearly every result in multivariable optimization and geometry flows from these three facts.
Properties of the gradient. At a point where $\nabla f \ne \mathbf{0}$: (1) $\nabla f$ points in the direction of greatest rate of increase of $f$; (2) the maximum rate of increase equals $|\nabla f|$; (3) $\nabla f$ is orthogonal to the level curve (in 2D) or level surface (in 3D) through that point. The directional derivative in a unit direction $\mathbf{u}$ is the dot product $\nabla f \cdot \mathbf{u}$, which Unit C4 develops in full.
Worked Example 6.1: gradient and steepest-ascent direction
For $f(x,y) = x^2 + 3y^2$, find $\nabla f$ at $(1,1)$, the direction of steepest increase, and the maximum rate of increase.
The partials are $f_x = 2x$ and $f_y = 6y$, so
$$ \nabla f = \langle 2x, 6y\rangle, \qquad \nabla f(1,1) = \langle 2, 6\rangle. $$The direction of steepest increase is that of $\langle 2,6\rangle$; as a unit vector,
$$ \mathbf{u} = \frac{\langle 2,6\rangle}{\sqrt{4+36}} = \frac{1}{\sqrt{40}}\langle 2,6\rangle. $$The maximum rate of increase is $|\nabla f(1,1)| = \sqrt{2^2 + 6^2} = \sqrt{40} = 2\sqrt{10}$.
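The same computation in a few lines of code, as a minimal sketch; the helper names `grad_f` and `norm` are ours.

```python
import math

def grad_f(x, y):
    """Gradient of f(x, y) = x^2 + 3 y^2, namely <2x, 6y>."""
    return (2 * x, 6 * y)

def norm(v):
    """Euclidean length of a 2D vector."""
    return math.hypot(v[0], v[1])

g = grad_f(1.0, 1.0)
max_rate = norm(g)                         # equals sqrt(40) = 2 sqrt(10)
u = (g[0] / max_rate, g[1] / max_rate)     # unit vector of steepest increase

print(g, max_rate, u)
```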
Worked Example 6.2: the gradient as a normal vector to a level surface
The surface $x^2 + y^2 - z^2 = 1$ is a level surface of $F(x,y,z) = x^2 + y^2 - z^2$ at level $k=1$. Find a vector normal to the surface at the point $(1,1,1)$.
By the orthogonality property, $\nabla F$ is normal to the level surface. Compute:
$$ \nabla F = \langle 2x,\ 2y,\ -2z\rangle, \qquad \nabla F(1,1,1) = \langle 2,\ 2,\ -2\rangle. $$Any nonzero scalar multiple is also normal, so $\langle 1,1,-1\rangle$ serves as a tidy normal vector. This is the engine behind tangent planes: in Unit C4 the plane tangent to the surface at $(1,1,1)$ is $2(x-1) + 2(y-1) - 2(z-1) = 0$, whose coefficients are precisely the components of $\nabla F$. The gradient of the defining function hands you the normal for free.
Worked Example 6.3: steepest descent and a level-set tangent direction
For $f(x,y) = x^2 + xy + y^2$ at the point $(1,2)$, find the direction of steepest decrease and a direction in which $f$ does not change to first order.
The gradient is
$$ \nabla f = \langle 2x + y,\ x + 2y\rangle, \qquad \nabla f(1,2) = \langle 4,\ 5\rangle. $$Steepest increase is along $\langle 4,5\rangle$, so steepest decrease is along the opposite vector $-\nabla f = \langle -4,-5\rangle$. The rate of fastest decrease is $-|\nabla f| = -\sqrt{16+25} = -\sqrt{41}$.
A direction of zero first-order change is one perpendicular to the gradient, since the directional derivative $\nabla f\cdot\mathbf u$ vanishes there. Rotating $\langle 4,5\rangle$ by ninety degrees gives $\langle -5,4\rangle$ (or its negative). Moving along $\langle -5,4\rangle$ keeps you tangent to the level curve through $(1,2)$, so to first order $f$ stays constant, confirming the orthogonality of $\nabla f$ to level sets.
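Both directions can be tested with a numerical directional derivative. This is a sketch: `directional_derivative` is a hypothetical helper that normalizes its direction argument and uses a central difference with an assumed step $h = 10^{-6}$.

```python
import math

def f(x, y):
    return x**2 + x * y + y**2

def grad_f(x, y):
    """Gradient found above: <2x + y, x + 2y>."""
    return (2 * x + y, x + 2 * y)

def directional_derivative(func, p, v, h=1e-6):
    """Central-difference rate of change of func at p along the unit vector of v."""
    length = math.hypot(v[0], v[1])
    ux, uy = v[0] / length, v[1] / length
    return (func(p[0] + h * ux, p[1] + h * uy)
            - func(p[0] - h * ux, p[1] - h * uy)) / (2 * h)

p = (1.0, 2.0)
g = grad_f(*p)               # (4.0, 5.0)
descent = (-g[0], -g[1])     # steepest-descent direction
tangent = (-g[1], g[0])      # gradient rotated by ninety degrees

rate_descent = directional_derivative(f, p, descent)    # close to -sqrt(41)
rate_tangent = directional_derivative(f, p, tangent)    # close to 0
print(rate_descent, rate_tangent)
```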
Going deeper: why the gradient is orthogonal to level curves
Let $\mathbf{r}(t) = \langle x(t), y(t)\rangle$ be a curve lying on a single level curve, so $f(x(t),y(t)) = k$ is constant. Differentiate both sides with respect to $t$ using the chain rule:
$$ \frac{d}{dt} f(x(t),y(t)) = f_x\,x'(t) + f_y\,y'(t) = \nabla f \cdot \mathbf{r}'(t) = 0. $$The right side is $0$ because $k$ is constant. Thus $\nabla f$ is orthogonal to the tangent vector $\mathbf{r}'(t)$ of every curve through the point that stays on the level set. Since $\mathbf{r}'(t)$ spans the tangent direction of the level curve, $\nabla f$ is normal to the level curve. The same argument in three variables shows $\nabla f$ is normal to the level surface, which is exactly what gives tangent planes their normal vectors in Unit C4.
Going deeper: why steepest ascent points along the gradient
Assume $f$ is differentiable, so the rate of change of $f$ at a point in a unit direction $\mathbf u$ is the directional derivative $D_{\mathbf u}f = \nabla f\cdot\mathbf u$. We maximize this over all unit vectors $\mathbf u$. By the dot-product form,
$$ \nabla f\cdot\mathbf u = |\nabla f|\,|\mathbf u|\cos\alpha = |\nabla f|\cos\alpha, $$where $\alpha$ is the angle between $\nabla f$ and $\mathbf u$ and $|\mathbf u| = 1$. Since $\cos\alpha$ ranges over $[-1,1]$, the expression is largest when $\cos\alpha = 1$, that is, when $\mathbf u$ points the same way as $\nabla f$. At that maximum,
$$ \max_{|\mathbf u|=1} D_{\mathbf u}f = |\nabla f|. $$This proves both gradient properties at once: the direction of steepest increase is $\nabla f/|\nabla f|$, and the maximum rate of increase is exactly $|\nabla f|$. The minimum, $\cos\alpha = -1$, gives steepest descent along $-\nabla f$ with rate $-|\nabla f|$, and the zero-change directions, $\cos\alpha = 0$, are precisely those perpendicular to $\nabla f$, that is, tangent to the level set. Every qualitative claim about the gradient is a corollary of the Cauchy-Schwarz bound $|\nabla f\cdot\mathbf u| \le |\nabla f|$.
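The maximization can also be carried out by brute force over unit vectors $\mathbf u = \langle\cos a, \sin a\rangle$. This is a sketch: the function `best_direction` and the 3600-angle grid are illustrative assumptions, and the gradient $\langle 4,5\rangle$ is borrowed from Worked Example 6.3.

```python
import math

def best_direction(grad, samples=3600):
    """Scan unit vectors u = (cos a, sin a); return (largest grad . u, its angle a)."""
    best_rate, best_angle = -math.inf, 0.0
    for j in range(samples):
        a = 2 * math.pi * j / samples
        rate = grad[0] * math.cos(a) + grad[1] * math.sin(a)    # grad . u
        if rate > best_rate:
            best_rate, best_angle = rate, a
    return best_rate, best_angle

grad = (4.0, 5.0)
rate, angle = best_direction(grad)

print(rate, math.hypot(*grad))                 # scanned maximum vs |grad f|
print(angle, math.atan2(grad[1], grad[0]))     # best angle vs gradient angle
```

Up to the resolution of the angle grid, the scan recovers $|\nabla f|$ as the largest rate and the gradient's own angle as the maximizing direction.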
Going Deeper
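The rigorous definition of differentiability is that a good linear approximation, the tangent plane, exists near the point. Understanding the gap between the existence of partials and differentiability is what separates mechanical computation from genuine multivariable analysis. Precisely, $f$ is differentiable at $(a,b)$ if the increment $\Delta z = f(a+\Delta x, b+\Delta y) - f(a,b)$ can be written as
$$ \Delta z = f_x(a,b)\,\Delta x + f_y(a,b)\,\Delta y + \varepsilon_1\Delta x + \varepsilon_2\Delta y, $$with $\varepsilon_1, \varepsilon_2 \to 0$ as $(\Delta x,\Delta y)\to(0,0)$.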
Theorem (sufficient condition for differentiability). If $f_x$ and $f_y$ exist and are continuous on an open disk around $(a,b)$, then $f$ is differentiable at $(a,b)$, and therefore continuous there. Continuous partials are the practical guarantee you almost always rely on.
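To see why the continuity hypothesis matters, here is a small sketch using the function $f(x,y) = xy/(x^2+y^2)$ with $f(0,0)=0$, chosen for illustration. Away from the origin the quotient rule gives $f_x = y(y^2-x^2)/(x^2+y^2)^2$, while at the origin the limit definition gives $f_x(0,0) = 0$. The code evaluates $f_x$ along the $y$-axis, where it equals $1/y$ and so cannot approach $f_x(0,0)$.

```python
def f_x(x, y):
    """Partial derivative in x of f(x, y) = x*y / (x**2 + y**2), with f(0, 0) = 0.

    Away from the origin this is the quotient-rule formula; at the origin
    the limit definition gives 0.
    """
    if x == 0.0 and y == 0.0:
        return 0.0
    return y * (y * y - x * x) / (x * x + y * y) ** 2

# Approach the origin along the y-axis (x = 0), where f_x(0, y) = 1 / y.
ys = (1e-1, 1e-2, 1e-3)
values = [f_x(0.0, y) for y in ys]
print(values)
```

The values grow without bound as $y \to 0$, so $f_x$ is not continuous at the origin and the theorem gives no conclusion for this function there.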
Worked Example 7.1: partials exist but the function is discontinuous
Consider
$$ f(x,y) = \begin{cases} \dfrac{xy}{x^2+y^2}, & (x,y)\ne(0,0) \\[4pt] 0, & (x,y)=(0,0). \end{cases} $$From the limit definition, $f_x(0,0) = \lim_{h\to 0}\frac{f(h,0)-0}{h} = \lim_{h\to 0}\frac{0}{h} = 0$, and likewise $f_y(0,0)=0$. Both partials exist at the origin.
Yet $f$ is not continuous at the origin: along $y=x$ the value is $\frac{x^2}{2x^2}=\tfrac12 \ne 0 = f(0,0)$. So a function can have all first partials at a point and still fail to be continuous there. Differentiability is strictly stronger than the existence of partials.
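Both halves of this example can be reproduced numerically. The following is a minimal sketch; the step sizes are arbitrary choices, and the difference quotients only illustrate the limits computed above rather than prove them.

```python
def f(x, y):
    """The function of Worked Example 7.1, with f(0, 0) defined to be 0."""
    if x == 0.0 and y == 0.0:
        return 0.0
    return x * y / (x * x + y * y)

steps = (1e-1, 1e-3, 1e-6)

# Difference quotients for f_x(0, 0) and f_y(0, 0): f vanishes on both axes.
fx_quotients = [(f(h, 0.0) - f(0.0, 0.0)) / h for h in steps]
fy_quotients = [(f(0.0, h) - f(0.0, 0.0)) / h for h in steps]

# Values along the diagonal y = x as the point approaches the origin.
diagonal_values = [f(t, t) for t in steps]

print(fx_quotients, fy_quotients, diagonal_values)
```

The quotients are all $0$, while the diagonal values stay at $\tfrac12$ however close the point is to the origin, which is the discontinuity.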
Worked Example 7.2: verifying differentiability from the definition
Show that $f(x,y) = x^2 + y^2$ is differentiable at $(1,1)$ directly from the linear-approximation criterion.
Here $f(1,1) = 2$, $f_x = 2x$ so $f_x(1,1) = 2$, and $f_y = 2y$ so $f_y(1,1) = 2$. The proposed linear approximation is $L(\Delta x,\Delta y) = 2 + 2\Delta x + 2\Delta y$. Compute the exact increment:
$$ f(1+\Delta x, 1+\Delta y) = (1+\Delta x)^2 + (1+\Delta y)^2 = 2 + 2\Delta x + 2\Delta y + \Delta x^2 + \Delta y^2. $$Subtract the linear part to isolate the remainder:
$$ f(1+\Delta x,1+\Delta y) - L = \Delta x^2 + \Delta y^2. $$For differentiability we need this remainder to be small compared with the step length $\rho = \sqrt{\Delta x^2 + \Delta y^2}$. Indeed
$$ \frac{\Delta x^2 + \Delta y^2}{\rho} = \frac{\rho^2}{\rho} = \rho \to 0 \quad \text{as } (\Delta x,\Delta y)\to(0,0). $$The error vanishes faster than first order, so the tangent plane $z = 2 + 2(x-1) + 2(y-1)$ genuinely approximates the surface, and $f$ is differentiable at $(1,1)$. This is the positive counterpart to Worked Example 7.1: there the partials existed but the function was not even continuous; here the remainder test passes cleanly.
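The remainder test can also be watched numerically. In the sketch below, the approach direction $(1,-2)$ and the step sizes are hypothetical choices; any direction should show the same behavior, since the ratio equals $\rho$ exactly.

```python
import math

def f(x, y):
    return x * x + y * y

def linear_part(dx, dy):
    """L = f(1,1) + f_x(1,1)*dx + f_y(1,1)*dy, with f_x = f_y = 2 at (1, 1)."""
    return 2.0 + 2.0 * dx + 2.0 * dy

def remainder_ratio(dx, dy):
    """(f(1+dx, 1+dy) - L) / rho, which should tend to 0 with rho."""
    rho = math.hypot(dx, dy)
    return (f(1.0 + dx, 1.0 + dy) - linear_part(dx, dy)) / rho

# Assumed approach direction (1, -2), scaled by shrinking step sizes h.
steps = (1e-1, 1e-2, 1e-3, 1e-4)
ratios = [remainder_ratio(h, -2.0 * h) for h in steps]
rhos = [math.hypot(h, -2.0 * h) for h in steps]

print(list(zip(rhos, ratios)))
```

Each ratio should match the corresponding $\rho$ up to rounding and shrink with it, as the computation above predicts.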
Going deeper: the chain of implications, and where it breaks
For functions of several variables the logical hierarchy is strictly one-directional:
$$ \text{continuous partials} \implies \text{differentiable} \implies \begin{cases} \text{continuous} \\ \text{all partials exist} \end{cases} $$Each arrow is a theorem, and none of them reverses. Reading the diagram:
Differentiable implies continuous. If the increment $\Delta z = f_x\Delta x + f_y\Delta y + \varepsilon_1\Delta x + \varepsilon_2\Delta y$ holds with $\varepsilon_i\to 0$, then as $(\Delta x,\Delta y)\to(0,0)$ every term on the right tends to $0$, so $\Delta z\to 0$, which is exactly continuity at the point.
Differentiable implies partials exist. Set $\Delta y = 0$ in the increment formula and divide by $\Delta x$; the limit is $f_x$. So differentiability automatically produces both partials.
The converses fail. Worked Example 7.1 has both partials at the origin yet is discontinuous there, so existence of the partials gives neither continuity nor differentiability. And the single-variable function $|x|$ is continuous at $0$ but not differentiable there, so continuity does not give differentiability even in one dimension. The only hypothesis that guarantees everything below it in the chain is the top one, continuity of the partials, which is why most practical theorems assume it. Memorize the direction of the arrows: continuous partials is the strongest condition, while continuity and existence of partials are the weakest, and neither of those two implies the other.
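The $|x|$ counterexample is easy to probe numerically. The sketch below uses a hypothetical helper `difference_quotient` and arbitrary step sizes; it illustrates, rather than proves, that the one-sided slopes at $0$ disagree while the function values still tend to $0$.

```python
def difference_quotient(g, a, h):
    """Slope of the secant of g between a and a + h."""
    return (g(a + h) - g(a)) / h

steps = (1e-1, 1e-3, 1e-6)

# One-sided difference quotients of |x| at 0.
right_slopes = [difference_quotient(abs, 0.0, h) for h in steps]
left_slopes = [difference_quotient(abs, 0.0, -h) for h in steps]

# Function values near 0 shrink with the step: |x| is continuous at 0.
nearby_values = [abs(0.0 + h) for h in steps]

print(right_slopes, left_slopes, nearby_values)
```

The right-hand slopes are all $1$ and the left-hand slopes all $-1$, so no single derivative exists at $0$ even though the values themselves approach $|0| = 0$.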
Readiness Checklist
Check that you can do each item without notes.
- Find the domain of a function of two variables and describe its level curves.
- Decide whether a multivariable limit exists using the two-path test, and prove a limit using polar bounds.
- Compute first partial derivatives, treating the other variables as constants.
- Compute second-order partials and verify Clairaut's theorem for the mixed partials.
- Apply the multivariable chain rule with one and with two independent variables.
- Compute the gradient and identify the direction of steepest increase and the maximum rate of increase.
- Explain why the gradient is orthogonal to level curves and level surfaces.
- State the difference between existence of partials and differentiability, and the sufficient condition for differentiability.