University Calculus · Calculus III大学微积分 · 微积分 III

Unit C4: Directional Derivatives, Tangent Planes, Linearization第 C4 单元：方向导数（`directional derivative`）、切平面（`tangent plane`）与线性化（`linearization`）

From the gradient to the geometry it controls: rates of change in any direction, tangent planes to surfaces, and the linear approximations that make multivariable calculus computable.从梯度（gradient）出发，掌握它所支配的几何：任意方向上的变化率、曲面的切平面，以及让多元微积分可计算的线性近似。

Calculus III微积分 III · Multivariable多变量 · Vector Calculus向量微积分 · MIT 18.02 / GT 2551 / Princeton MAT 201

Read me first.阅读须知。 This unit turns the gradient from Unit C3 into a geometric tool. You will compute directional derivatives as $\nabla f \cdot \mathbf{u}$, read off the steepest-ascent direction, write tangent planes to both graphs and level surfaces, and use linearization and differentials for estimation and error propagation. Keep one fact in view throughout: the gradient is perpendicular to level sets and points in the direction of fastest increase.本单元把 C3 单元的梯度（gradient）转化为几何工具。你将把方向导数（directional derivative）计算为 $\nabla f \cdot \mathbf{u}$，读出最速上升方向，为函数图像和等值面（level surface）写出切平面（tangent plane），并用线性化（linearization）与微分（differential）做估值和误差传播。请始终牢记一个事实：梯度垂直于等值集，并指向增长最快的方向。

Section 1第 1 节

Directional Derivatives方向导数（`directional derivative`）

Partial derivatives measure the rate of change of $f$ along the coordinate axes. The directional derivative generalizes this to the rate of change along any unit vector, which is the central object of this unit.偏导数（partial derivative）度量 $f$ 沿坐标轴方向的变化率。方向导数（directional derivative）把它推广为沿任意单位向量（unit vector）方向的变化率，这是本单元的核心对象。

Key idea.核心思想。 The directional derivative of $f$ at a point $\mathbf{a}$ in the direction of a unit vector $\mathbf{u}$ is the instantaneous rate of change of $f$ as you step away from $\mathbf{a}$ along the line $\mathbf{a} + t\mathbf{u}$. It is a single number that answers: how fast does $f$ change per unit of distance traveled in direction $\mathbf{u}$?$f$ 在点 $\mathbf{a}$ 处沿单位向量（unit vector）$\mathbf{u}$ 方向的方向导数，是当你沿直线 $\mathbf{a} + t\mathbf{u}$ 离开 $\mathbf{a}$ 时 $f$ 的瞬时变化率。它是一个数，回答了：沿方向 $\mathbf{u}$ 每移动单位距离，$f$ 变化多快？

Definition (limit form)定义（极限形式）

$$ D_{\mathbf{u}} f(\mathbf{a}) = \lim_{t \to 0} \frac{f(\mathbf{a} + t\mathbf{u}) - f(\mathbf{a})}{t}, \qquad |\mathbf{u}| = 1. $$

When $f$ is differentiable at $\mathbf{a}$, this limit is computed without returning to the definition. Composing $f$ with the line $\mathbf{r}(t) = \mathbf{a} + t\mathbf{u}$ and applying the chain rule gives the gradient dot-product formula.当 $f$ 在 $\mathbf{a}$ 处可微（differentiable）时，这个极限无需回到定义即可计算。把 $f$ 与直线 $\mathbf{r}(t) = \mathbf{a} + t\mathbf{u}$ 复合，再用链式法则（Chain Rule），即得梯度点积公式。

Computation via the gradient用梯度计算

$$ D_{\mathbf{u}} f(\mathbf{a}) = \nabla f(\mathbf{a}) \cdot \mathbf{u} = f_x(\mathbf{a})\, u_1 + f_y(\mathbf{a})\, u_2 + \cdots $$

Remark.说明。 The unit-length requirement is essential. If you use a non-unit vector $\mathbf{v}$, first normalize: $\mathbf{u} = \mathbf{v} / |\mathbf{v}|$. Skipping normalization scales the answer by $|\mathbf{v}|$ and the result no longer has the meaning of rate of change per unit distance.单位长度的要求至关重要。若使用非单位向量 $\mathbf{v}$，须先归一化（normalize）：$\mathbf{u} = \mathbf{v} / |\mathbf{v}|$。跳过归一化会把答案放大 $|\mathbf{v}|$ 倍，结果便不再具有"每单位距离的变化率"这一含义。

It is worth seeing why the gradient formula is true rather than only memorizing it. Define the single-variable slice $g(t) = f(\mathbf{a} + t\mathbf{u})$. By the limit definition, $D_{\mathbf{u}} f(\mathbf{a})$ is exactly $g'(0)$. If $f$ is differentiable at $\mathbf{a}$, the multivariable chain rule applied to $g(t) = f(x(t), y(t))$ with $x(t) = a_1 + t u_1$ and $y(t) = a_2 + t u_2$ gives $g'(t) = f_x\, x'(t) + f_y\, y'(t) = f_x u_1 + f_y u_2$. Evaluating at $t = 0$ returns $\nabla f(\mathbf{a}) \cdot \mathbf{u}$. The whole content of the formula is that, for a differentiable function, the rate of change along a line is the scalar projection of the gradient onto the unit direction $\mathbf{u}$ of that line.值得弄清梯度公式为何成立，而不只是死记。定义单变量切片 $g(t) = f(\mathbf{a} + t\mathbf{u})$。由极限定义，$D_{\mathbf{u}} f(\mathbf{a})$ 恰为 $g'(0)$。若 $f$ 在 $\mathbf{a}$ 处可微，对 $g(t) = f(x(t), y(t))$（其中 $x(t) = a_1 + t u_1$，$y(t) = a_2 + t u_2$）应用多元链式法则，得 $g'(t) = f_x\, x'(t) + f_y\, y'(t) = f_x u_1 + f_y u_2$。在 $t = 0$ 处取值即得 $\nabla f(\mathbf{a}) \cdot \mathbf{u}$。公式的全部内容是：对可微函数而言，沿一条直线的变化率就是梯度在该直线单位方向 $\mathbf{u}$ 上的标量投影。

Common error.常见错误。 A very frequent mistake is to plug the raw direction vector $\mathbf{v}$ into $\nabla f \cdot \mathbf{v}$ without normalizing. For $\mathbf{v} = \langle 3,4\rangle$ this returns $\nabla f \cdot \langle 3,4\rangle$, which is $5$ times too large, because $|\mathbf{v}| = 5$. The fix is mechanical: always divide by $|\mathbf{v}|$ first, and remember that a directional derivative depends only on the direction of $\mathbf{v}$, not on its length. A second, sneakier error is to compute $\nabla f$ as a function but forget to evaluate it at the point $\mathbf{a}$ before dotting with $\mathbf{u}$.一个非常常见的错误是把原始方向向量 $\mathbf{v}$ 直接代入 $\nabla f \cdot \mathbf{v}$ 而不归一化。对 $\mathbf{v} = \langle 3,4\rangle$，这给出 $\nabla f \cdot \langle 3,4\rangle$，由于 $|\mathbf{v}| = 5$，结果偏大 $5$ 倍。改正方法很机械：总是先除以 $|\mathbf{v}|$，并记住方向导数只取决于 $\mathbf{v}$ 的方向，而与其长度无关。第二个更隐蔽的错误是：把 $\nabla f$ 当作函数求出后，忘记在与 $\mathbf{u}$ 点乘之前先在点 $\mathbf{a}$ 处取值。

Worked Example 1.1: a directional derivative from the gradient例题 1.1：由梯度求方向导数

Let $f(x,y) = x^2 y + 3y$. Find $D_{\mathbf{u}} f$ at $(1,2)$ in the direction of $\mathbf{v} = \langle 3, 4 \rangle$.设 $f(x,y) = x^2 y + 3y$。求 $f$ 在 $(1,2)$ 处沿 $\mathbf{v} = \langle 3, 4 \rangle$ 方向的 $D_{\mathbf{u}} f$。

First the gradient: $f_x = 2xy$, $f_y = x^2 + 3$, so $\nabla f(1,2) = \langle 4, 4 \rangle$.先求梯度：$f_x = 2xy$，$f_y = x^2 + 3$，故 $\nabla f(1,2) = \langle 4, 4 \rangle$。

Normalize: $|\mathbf{v}| = 5$, so $\mathbf{u} = \langle 3/5, 4/5 \rangle$. Then归一化：$|\mathbf{v}| = 5$，故 $\mathbf{u} = \langle 3/5, 4/5 \rangle$。于是

$$ D_{\mathbf{u}} f(1,2) = \langle 4,4 \rangle \cdot \langle 3/5, 4/5 \rangle = \tfrac{12}{5} + \tfrac{16}{5} = \tfrac{28}{5}. $$
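The value $28/5$ can be cross-checked against the limit definition. The sketch below is only a numerical sanity check, not part of the method: the helper name `directional_derivative_fd` and the step size `h` are assumptions made for illustration. It normalizes $\mathbf{v}$ and then applies a central difference along the line $\mathbf{a} + t\mathbf{u}$.

```python
import math


def f(x, y):
    # f(x, y) = x^2 y + 3y from Worked Example 1.1
    return x**2 * y + 3 * y


def directional_derivative_fd(func, point, v, h=1e-6):
    """Central-difference estimate of D_u func at point, after normalizing v."""
    norm = math.sqrt(sum(c * c for c in v))
    u = [c / norm for c in v]
    forward = [p + h * c for p, c in zip(point, u)]
    backward = [p - h * c for p, c in zip(point, u)]
    return (func(*forward) - func(*backward)) / (2 * h)


estimate = directional_derivative_fd(f, (1.0, 2.0), (3.0, 4.0))
exact = 28 / 5  # gradient dot-product answer from the example
print(estimate, exact)
```

The two printed numbers should agree to several decimal places. Passing `(6.0, 8.0)` instead of `(3.0, 4.0)` should give the same estimate, because the helper normalizes before stepping.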

Worked Example 1.2: a three-variable directional derivative例题 1.2：三变量的方向导数

Let $f(x,y,z) = xy^2 z^3$. Find $D_{\mathbf{u}} f$ at the point $(2,-1,1)$ in the direction toward $(0,0,0)$, that is along $\mathbf{v} = \langle 0,0,0\rangle - \langle 2,-1,1\rangle = \langle -2, 1, -1\rangle$.设 $f(x,y,z) = xy^2 z^3$。求 $f$ 在点 $(2,-1,1)$ 处沿指向 $(0,0,0)$ 的方向的 $D_{\mathbf{u}} f$，即沿 $\mathbf{v} = \langle 0,0,0\rangle - \langle 2,-1,1\rangle = \langle -2, 1, -1\rangle$。

The partials are $f_x = y^2 z^3$, $f_y = 2xy z^3$, $f_z = 3xy^2 z^2$. At $(2,-1,1)$:各偏导数为 $f_x = y^2 z^3$，$f_y = 2xy z^3$，$f_z = 3xy^2 z^2$。在 $(2,-1,1)$ 处：

$$ \nabla f(2,-1,1) = \langle (1)(1),\ 2(2)(-1)(1),\ 3(2)(1)(1)\rangle = \langle 1,\ -4,\ 6\rangle. $$

Normalize the direction: $|\mathbf{v}| = \sqrt{4 + 1 + 1} = \sqrt6$, so $\mathbf{u} = \tfrac{1}{\sqrt6}\langle -2,1,-1\rangle$. Then把方向归一化：$|\mathbf{v}| = \sqrt{4 + 1 + 1} = \sqrt6$，故 $\mathbf{u} = \tfrac{1}{\sqrt6}\langle -2,1,-1\rangle$。于是

$$ D_{\mathbf{u}} f = \langle 1,-4,6\rangle \cdot \tfrac{1}{\sqrt6}\langle -2,1,-1\rangle = \frac{-2 - 4 - 6}{\sqrt6} = \frac{-12}{\sqrt6} = -2\sqrt6. $$

The negative sign tells us $f$ is decreasing as we head from $(2,-1,1)$ toward the origin, at a rate of $2\sqrt6 \approx 4.90$ units of $f$ per unit of distance.负号表明：从 $(2,-1,1)$ 朝原点方向走时 $f$ 在减小，速率为每单位距离减少 $2\sqrt6 \approx 4.90$ 个单位的 $f$。
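The same steps (gradient at the point, normalize, dot) can be sketched in a few lines of Python. The function name `grad_f` is an illustrative choice, and the partials are typed in by hand from the example rather than derived symbolically.

```python
import math


def grad_f(x, y, z):
    # Partials of f(x, y, z) = x * y**2 * z**3, as computed in the example.
    return (y**2 * z**3, 2 * x * y * z**3, 3 * x * y**2 * z**2)


point = (2.0, -1.0, 1.0)
v = tuple(0.0 - p for p in point)          # direction toward the origin
norm_v = math.sqrt(sum(c * c for c in v))  # |v| = sqrt(6)
u = tuple(c / norm_v for c in v)           # unit vector

rate = sum(g * c for g, c in zip(grad_f(*point), u))
print(rate, -2 * math.sqrt(6))
```

Both printed values should match up to floating-point rounding.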

Worked Example 1.3: recovering a partial derivative as a special direction例题 1.3：把偏导数看作特殊方向的特例

Directional derivatives generalize partials, so the partials must reappear as special cases. Take $f(x,y) = e^{x}\sin y$ and the standard basis direction $\mathbf{u} = \mathbf{i} = \langle 1, 0\rangle$ at a general point $(x,y)$.方向导数是偏导数的推广，因此偏导数必定作为特例重新出现。取 $f(x,y) = e^{x}\sin y$，在一般点 $(x,y)$ 处沿标准基方向 $\mathbf{u} = \mathbf{i} = \langle 1, 0\rangle$。

$\nabla f = \langle e^x \sin y,\ e^x \cos y\rangle$, and since $\mathbf{i}$ is already a unit vector,$\nabla f = \langle e^x \sin y,\ e^x \cos y\rangle$，由于 $\mathbf{i}$ 已是单位向量，

$$ D_{\mathbf{i}} f = \nabla f \cdot \langle 1,0\rangle = e^x \sin y = f_x. $$

Likewise $D_{\mathbf{j}} f = f_y = e^x \cos y$. This is the sanity check that anchors the whole subject: the directional derivative in the direction of a coordinate axis is just the partial derivative for that variable. If a formula ever fails this test, it is wrong.同理 $D_{\mathbf{j}} f = f_y = e^x \cos y$。这是贯穿整个主题的检验基准：沿坐标轴方向的方向导数就是该变量的偏导数。任何公式若通不过这一检验，就是错的。

Let $f(x,y)=x^2+y^2$ with $\nabla f(1,1)=\langle 2,2\rangle$. What is $D_{\mathbf{u}}f(1,1)$ in the direction $\mathbf{v}=\langle 1,1\rangle$?设 $f(x,y)=x^2+y^2$，且 $\nabla f(1,1)=\langle 2,2\rangle$。沿方向 $\mathbf{v}=\langle 1,1\rangle$ 的 $D_{\mathbf{u}}f(1,1)$ 是多少？

1.1

$4$

$2\sqrt{2}$

$\sqrt{2}$

$0$

Correct. Normalize: $\mathbf{u}=\langle 1/\sqrt2,1/\sqrt2\rangle$, so $\langle 2,2\rangle\cdot\mathbf{u}=2/\sqrt2+2/\sqrt2=4/\sqrt2=2\sqrt2$.正确。归一化：$\mathbf{u}=\langle 1/\sqrt2,1/\sqrt2\rangle$，故 $\langle 2,2\rangle\cdot\mathbf{u}=2/\sqrt2+2/\sqrt2=4/\sqrt2=2\sqrt2$。

Use $\nabla f\cdot\mathbf{u}$ with a unit vector. The value $4$ forgets to normalize; $\sqrt2$ and $0$ are not consistent with the dot product.应对单位向量使用 $\nabla f\cdot\mathbf{u}$。$4$ 忘了归一化；$\sqrt2$ 和 $0$ 与点积结果不符。

Section 2第 2 节

The Gradient and Steepest Ascent梯度（`gradient`）与最速上升

Key idea.核心思想。 Because $D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u} = |\nabla f|\cos\theta$, where $\theta$ is the angle between $\mathbf{u}$ and $\nabla f$, the directional derivative is largest when $\theta = 0$. The gradient points in the direction of steepest increase of $f$, and its magnitude is the maximum rate of increase.因为 $D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u} = |\nabla f|\cos\theta$，其中 $\theta$ 是 $\mathbf{u}$ 与 $\nabla f$ 的夹角，所以当 $\theta = 0$ 时方向导数最大。梯度指向 $f$ 增长最陡的方向，其模即为最大增长率。

Steepest ascent, descent, and level directions最速上升、最速下降与水平方向

$$ \max_{|\mathbf{u}|=1} D_{\mathbf{u}} f = |\nabla f|\ \text{(at } \mathbf{u}=\nabla f/|\nabla f|), \qquad \min_{|\mathbf{u}|=1} D_{\mathbf{u}} f = -|\nabla f|, \qquad D_{\mathbf{u}} f = 0 \iff \mathbf{u}\perp\nabla f. $$

The three facts follow from the single formula $D_{\mathbf{u}} f = |\nabla f|\cos\theta$. The steepest descent direction is $-\nabla f$, and any direction perpendicular to $\nabla f$ keeps $f$ momentarily constant, which is exactly the tangent direction to the level curve.这三个事实都来自同一个公式 $D_{\mathbf{u}} f = |\nabla f|\cos\theta$。最速下降方向是 $-\nabla f$，任何垂直于 $\nabla f$ 的方向都使 $f$ 瞬时保持不变，而这正是等值线（level curve）的切线方向。

Going deeper: why $\nabla f$ is the steepest-ascent direction深入探讨：为何 $\nabla f$ 是最速上升方向

Fix a point where $\nabla f \neq \mathbf{0}$. For any unit vector $\mathbf{u}$, the geometric form of the dot product gives固定一个满足 $\nabla f \neq \mathbf{0}$ 的点。对任意单位向量 $\mathbf{u}$，由点积的几何公式得

$$ D_{\mathbf{u}} f = \nabla f \cdot \mathbf{u} = |\nabla f|\,|\mathbf{u}|\cos\theta = |\nabla f|\cos\theta. $$

Since $-1 \le \cos\theta \le 1$, the value ranges over $[-|\nabla f|,\, |\nabla f|]$. The maximum $|\nabla f|$ is attained exactly when $\cos\theta = 1$, that is when $\mathbf{u}$ points in the same direction as $\nabla f$. The minimum is attained at $\mathbf{u} = -\nabla f / |\nabla f|$, and $D_{\mathbf{u}} f = 0$ precisely when $\cos\theta = 0$, i.e. $\mathbf{u}\perp\nabla f$. If $\nabla f = \mathbf{0}$ every directional derivative is zero and the point is critical.由于 $-1 \le \cos\theta \le 1$，该值取遍 $[-|\nabla f|,\, |\nabla f|]$。最大值 $|\nabla f|$ 恰在 $\cos\theta = 1$ 时取得，即 $\mathbf{u}$ 与 $\nabla f$ 同向时。最小值在 $\mathbf{u} = -\nabla f / |\nabla f|$ 处取得；而 $D_{\mathbf{u}} f = 0$ 当且仅当 $\cos\theta = 0$，即 $\mathbf{u}\perp\nabla f$。若 $\nabla f = \mathbf{0}$，则每个方向导数都为零，该点为临界点（critical point）。
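The argument above can be illustrated by brute force. The following sketch uses an illustrative gradient value $\langle 3, -4\rangle$ chosen for this example, and a sampling resolution of 3600 directions that is likewise an arbitrary assumption. It scans unit vectors around the circle and records where $\nabla f \cdot \mathbf{u}$ is largest.

```python
import math

grad = (3.0, -4.0)  # illustrative gradient value at some point (an assumption)
grad_norm = math.sqrt(grad[0] ** 2 + grad[1] ** 2)

samples = 3600
best_rate, best_angle = -math.inf, None
for k in range(samples):
    angle = 2 * math.pi * k / samples
    u = (math.cos(angle), math.sin(angle))      # unit vector at this angle
    rate = grad[0] * u[0] + grad[1] * u[1]      # directional derivative
    if rate > best_rate:
        best_rate, best_angle = rate, angle

# Angle of the gradient itself, mapped into [0, 2*pi) for comparison.
grad_angle = math.atan2(grad[1], grad[0]) % (2 * math.pi)
print(best_rate, grad_norm)
print(best_angle, grad_angle)
```

Up to the sampling resolution, the best sampled rate should approach $|\nabla f|$ from below, and the best angle should sit next to the angle of the gradient.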

Common error.常见错误。 Students often report the maximum rate of increase as the gradient vector $\nabla f$ itself, or as a component of it, rather than as the magnitude $|\nabla f|$. The maximum rate is a single nonnegative number, $|\nabla f|$; the gradient vector answers a different question, namely the direction. A related slip is to give the steepest-descent direction as $\nabla f$ with the sign of the rate flipped. The descent direction is the vector $-\nabla f$, and the rate in that direction is $-|\nabla f|$. Keep the direction (a vector) and the rate (a scalar) in separate boxes.学生常把最大增长率报成梯度向量 $\nabla f$ 本身或它的某个分量，而非模 $|\nabla f|$。最大增长率是一个非负的数 $|\nabla f|$；梯度向量回答的是另一个问题，即方向。相关的失误是把最速下降方向写成 $\nabla f$ 而只翻转速率的符号。下降方向是向量 $-\nabla f$，该方向上的速率为 $-|\nabla f|$。请把方向（向量）和速率（标量）分开存放。

Worked Example 2.1: hottest direction to walk例题 2.1：往哪个方向走升温最快

Temperature is $T(x,y) = 100 - x^2 - 2y^2$. At $(2,1)$, in which direction does $T$ increase fastest, and how fast?温度为 $T(x,y) = 100 - x^2 - 2y^2$。在 $(2,1)$ 处，$T$ 沿哪个方向增长最快，增长有多快？

$\nabla T = \langle -2x, -4y \rangle$, so $\nabla T(2,1) = \langle -4, -4 \rangle$. The steepest-ascent direction is $\langle -4,-4\rangle$, or as a unit vector $\langle -1/\sqrt2, -1/\sqrt2\rangle$. The maximum rate is $|\nabla T| = \sqrt{16+16} = 4\sqrt2$ degrees per unit distance.$\nabla T = \langle -2x, -4y \rangle$，故 $\nabla T(2,1) = \langle -4, -4 \rangle$。最速上升方向是 $\langle -4,-4\rangle$，写成单位向量为 $\langle -1/\sqrt2, -1/\sqrt2\rangle$。最大速率为 $|\nabla T| = \sqrt{16+16} = 4\sqrt2$ 度每单位距离。

Worked Example 2.2: a prescribed rate of change in a chosen direction例题 2.2：在选定方向上达到指定的变化率

Let $f(x,y) = x e^{y}$. At the point $P=(2,0)$, find a unit vector $\mathbf{u}$ for which $D_{\mathbf{u}} f(P) = 1$, and explain when no such direction exists.设 $f(x,y) = x e^{y}$。在点 $P=(2,0)$ 处，求使 $D_{\mathbf{u}} f(P) = 1$ 的单位向量 $\mathbf{u}$，并说明何时不存在这样的方向。

First, $\nabla f = \langle e^y,\ x e^y\rangle$, so $\nabla f(2,0) = \langle 1, 2\rangle$ and $|\nabla f| = \sqrt5$. Writing $D_{\mathbf{u}} f = |\nabla f|\cos\theta = \sqrt5\cos\theta$, we need $\sqrt5 \cos\theta = 1$, hence $\cos\theta = 1/\sqrt5 \approx 0.447$, giving $\theta \approx 63.4^\circ$ measured from the gradient. A prescribed rate $r$ is achievable as a directional derivative precisely when $|r| \le |\nabla f|$, because $D_{\mathbf{u}} f$ ranges over $[-|\nabla f|, |\nabla f|]$. Here $1 < \sqrt5$, so two directions work (one on each side of $\nabla f$). Concretely, rotating the unit gradient $\langle 1/\sqrt5, 2/\sqrt5\rangle$ clockwise by $\theta$ gives $\mathbf{u} = \langle 1, 0\rangle$, and rotating it counterclockwise by $\theta$ gives $\mathbf{u} = \langle -3/5, 4/5\rangle$; both satisfy $\nabla f \cdot \mathbf{u} = 1$. Asking for a rate above $\sqrt5$ would be impossible: no direction can beat the steepest one.首先，$\nabla f = \langle e^y,\ x e^y\rangle$，故 $\nabla f(2,0) = \langle 1, 2\rangle$，$|\nabla f| = \sqrt5$。写成 $D_{\mathbf{u}} f = |\nabla f|\cos\theta = \sqrt5\cos\theta$，需要 $\sqrt5 \cos\theta = 1$，从而 $\cos\theta = 1/\sqrt5 \approx 0.447$，即从梯度方向起量得 $\theta \approx 63.4^\circ$。指定速率 $r$ 能作为方向导数实现，当且仅当 $|r| \le |\nabla f|$，因为 $D_{\mathbf{u}} f$ 取遍 $[-|\nabla f|, |\nabla f|]$。此处 $1 < \sqrt5$，故有两个方向可行（梯度两侧各一个）。具体而言：把单位梯度 $\langle 1/\sqrt5, 2/\sqrt5\rangle$ 顺时针旋转 $\theta$ 得 $\mathbf{u} = \langle 1, 0\rangle$，逆时针旋转 $\theta$ 得 $\mathbf{u} = \langle -3/5, 4/5\rangle$；两者都满足 $\nabla f \cdot \mathbf{u} = 1$。要求速率超过 $\sqrt5$ 则不可能：没有方向能胜过最陡的方向。
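The rotation step in this example can be carried out explicitly. The sketch below is one possible way to do it, assuming a standard 2D rotation; the helper `rotate` is a name introduced here for illustration. It rotates the unit gradient by $\pm\theta$ and confirms that both resulting directions produce the prescribed rate.

```python
import math

grad = (1.0, 2.0)      # grad f(2, 0) for f(x, y) = x * e**y, from the example
grad_norm = math.sqrt(grad[0] ** 2 + grad[1] ** 2)
target_rate = 1.0

# Angle between the gradient and the desired direction.
theta = math.acos(target_rate / grad_norm)
unit_grad = (grad[0] / grad_norm, grad[1] / grad_norm)


def rotate(vec, angle):
    """Rotate a 2D vector counterclockwise by angle (radians)."""
    c, s = math.cos(angle), math.sin(angle)
    return (c * vec[0] - s * vec[1], s * vec[0] + c * vec[1])


directions = [rotate(unit_grad, theta), rotate(unit_grad, -theta)]
rates = [grad[0] * u[0] + grad[1] * u[1] for u in directions]
print(directions)
print(rates)
```

If `target_rate` were set above $\sqrt5$, `math.acos` would raise a `ValueError`, which is the numerical counterpart of the statement that no such direction exists.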

Worked Example 2.3: steepest ascent on a hillside例题 2.3：山坡上的最速上升

The height of a hill is $h(x,y) = 200 - \tfrac{1}{100}(3x^2 + 2y^2)$ meters, with $x,y$ in meters. A hiker stands above the ground point $(60, 40)$. Find the bearing of steepest ascent and the slope encountered in that direction.某山丘的高度为 $h(x,y) = 200 - \tfrac{1}{100}(3x^2 + 2y^2)$ 米，其中 $x,y$ 以米为单位。一名登山者站在地面点 $(60, 40)$ 正上方。求最速上升的方位以及该方向上的坡度。

$\nabla h = \langle -\tfrac{6x}{100}, -\tfrac{4y}{100}\rangle = \langle -0.06x, -0.04y\rangle$. At $(60,40)$: $\nabla h = \langle -3.6,\ -1.6\rangle$. Steepest ascent points along $\langle -3.6, -1.6\rangle$, i.e. back toward smaller $x$ and $y$, which makes sense because the summit is at the origin where $h$ is largest. The slope in that direction is$\nabla h = \langle -\tfrac{6x}{100}, -\tfrac{4y}{100}\rangle = \langle -0.06x, -0.04y\rangle$。在 $(60,40)$ 处：$\nabla h = \langle -3.6,\ -1.6\rangle$。最速上升沿 $\langle -3.6, -1.6\rangle$，即朝更小的 $x$ 和 $y$ 回退，这很合理，因为山顶在原点，那里 $h$ 最大。该方向上的坡度为

$$ |\nabla h| = \sqrt{(-3.6)^2 + (-1.6)^2} = \sqrt{12.96 + 2.56} = \sqrt{15.52} \approx 3.94. $$

So the hiker rises about $3.94$ meters of height per meter of horizontal travel in the steepest direction. Walking perpendicular to $\nabla h$, along $\langle 1.6, -3.6\rangle$, keeps elevation momentarily constant: that direction is tangent to the contour line through the hiker's position.因此登山者沿最陡方向每水平前进 $1$ 米，高度约上升 $3.94$ 米。沿垂直于 $\nabla h$ 的方向 $\langle 1.6, -3.6\rangle$ 行走则瞬时保持海拔不变：该方向与过登山者所在位置的等高线相切。

If $\nabla f(\mathbf{a}) = \langle 3, -4 \rangle$, what is the maximum value of $D_{\mathbf{u}} f(\mathbf{a})$ over all unit $\mathbf{u}$?若 $\nabla f(\mathbf{a}) = \langle 3, -4 \rangle$，则在所有单位 $\mathbf{u}$ 上 $D_{\mathbf{u}} f(\mathbf{a})$ 的最大值是多少？

2.1

$-1$

$7$

$25$

$5$

Correct. The maximum directional derivative equals $|\nabla f| = \sqrt{9+16} = 5$.正确。最大方向导数等于 $|\nabla f| = \sqrt{9+16} = 5$。

The maximum rate is the magnitude $|\nabla f|=\sqrt{3^2+(-4)^2}=5$, not the component sum, the squared magnitude, or a difference.最大速率是模 $|\nabla f|=\sqrt{3^2+(-4)^2}=5$，而非分量之和、模的平方或某个差。

Section 3第 3 节

Tangent Planes切平面（`tangent plane`）

Key idea.核心思想。 If $z = f(x,y)$ is differentiable at $(a,b)$, the graph has a tangent plane at the point $(a,b,f(a,b))$. The plane is built from the two partial derivatives, which give its slopes in the $x$ and $y$ directions. Differentiability is exactly the condition that this plane is a good local fit to the surface.若 $z = f(x,y)$ 在 $(a,b)$ 处可微，则其图像在点 $(a,b,f(a,b))$ 处有切平面（tangent plane）。该平面由两个偏导数构成，它们给出平面在 $x$ 与 $y$ 方向上的斜率。可微性正是保证该平面在局部很好地贴合曲面的条件。

Tangent plane to $z = f(x,y)$$z = f(x,y)$ 的切平面

$$ z = f(a,b) + f_x(a,b)(x-a) + f_y(a,b)(y-b). $$

This is the graph version of linearization. The same plane can be written with the gradient, which makes the perpendicularity to $\nabla F$ for a level surface (Section 6) transparent. There is a useful bridge between the two viewpoints: a graph $z = f(x,y)$ is itself a level surface of the three-variable function $F(x,y,z) = f(x,y) - z$ at level $0$. Then $\nabla F = \langle f_x, f_y, -1\rangle$, so the surface normal is $\langle f_x, f_y, -1\rangle$ and the plane $f_x(x-a) + f_y(y-b) - (z - f(a,b)) = 0$ rearranges to exactly the formula above. The $-1$ in the third slot is the signature of a graph.这是线性化（linearization）的图像版本。同一平面也可用梯度写出，这使第 6 节中它与等值面法向量 $\nabla F$ 的垂直关系一目了然。两种视角之间有一座有用的桥梁：图像 $z = f(x,y)$ 本身就是三变量函数 $F(x,y,z) = f(x,y) - z$ 在水平值 $0$ 处的等值面。于是 $\nabla F = \langle f_x, f_y, -1\rangle$，故曲面法向量为 $\langle f_x, f_y, -1\rangle$，而平面 $f_x(x-a) + f_y(y-b) - (z - f(a,b)) = 0$ 整理后恰好是上面的公式。第三个分量上的 $-1$ 是"图像"这一情形的标志。

Common error.常见错误。 A tangent plane is not found by treating $z$ as a constant. The most common mistake is to forget the base value $f(a,b)$ and write $z = f_x(a,b)(x-a) + f_y(a,b)(y-b)$, which is a plane through $(a,b,0)$, not through the point $(a,b,f(a,b))$ on the surface. Always anchor the plane at the actual surface point: it must satisfy $z = f(a,b)$ when $(x,y) = (a,b)$. A second error is to evaluate the partials symbolically and forget to plug in $(a,b)$, leaving variables where numbers belong; the coefficients of a tangent plane are constants.求切平面不能把 $z$ 当作常数处理。最常见的错误是漏掉基准值 $f(a,b)$，写成 $z = f_x(a,b)(x-a) + f_y(a,b)(y-b)$，这个平面经过 $(a,b,0)$，而非曲面上的点 $(a,b,f(a,b))$。始终把平面锚定在真实的曲面点上：当 $(x,y) = (a,b)$ 时它必须满足 $z = f(a,b)$。第二个错误是把偏导数符号化求出后忘记代入 $(a,b)$，在本应是数字的位置留下了变量；切平面的系数都是常数。

Worked Example 3.1: tangent plane to a paraboloid例题 3.1：抛物面的切平面

Find the tangent plane to $z = x^2 + y^2$ at $(1, 2, 5)$.求 $z = x^2 + y^2$ 在 $(1, 2, 5)$ 处的切平面。

$f_x = 2x$, $f_y = 2y$, so $f_x(1,2) = 2$ and $f_y(1,2) = 4$. With $f(1,2) = 5$:$f_x = 2x$，$f_y = 2y$，故 $f_x(1,2) = 2$，$f_y(1,2) = 4$。又 $f(1,2) = 5$：

$$ z = 5 + 2(x-1) + 4(y-2) = 2x + 4y - 5. $$

Check: at $(1,2)$ this gives $z = 2 + 8 - 5 = 5$, matching the point on the surface.检验：在 $(1,2)$ 处得 $z = 2 + 8 - 5 = 5$，与曲面上的点一致。
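Beyond matching at the base point, a tangent plane should hug the surface nearby. As a rough illustration (the step sizes are arbitrary choices, not from the text), the sketch below measures the vertical gap between the paraboloid and the plane from this example at points progressively closer to $(1,2)$.

```python
def f(x, y):
    # The paraboloid z = x^2 + y^2
    return x**2 + y**2


def tangent_plane(x, y):
    # z = 5 + 2(x - 1) + 4(y - 2), from Worked Example 3.1
    return 5 + 2 * (x - 1) + 4 * (y - 2)


steps = (0.1, 0.01, 0.001)
gaps = []
for step in steps:
    x, y = 1 + step, 2 + step
    gaps.append(f(x, y) - tangent_plane(x, y))  # surface minus plane

for step, gap in zip(steps, gaps):
    print(step, gap)
```

For this particular surface the gap works out to $(x-1)^2 + (y-2)^2$, so shrinking the step by a factor of 10 should shrink the gap by roughly a factor of 100. That quadratic decay is what "good local fit" means here.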

Worked Example 3.2: tangent plane to a non-polynomial graph例题 3.2：非多项式图像的切平面

Find the tangent plane to $z = \ln(x^2 + y^2)$ at the point above $(1, 0)$.求 $z = \ln(x^2 + y^2)$ 在 $(1, 0)$ 正上方那一点处的切平面。

First the base value: $f(1,0) = \ln(1) = 0$, so the surface point is $(1,0,0)$. The partials are先求基准值：$f(1,0) = \ln(1) = 0$，故曲面点为 $(1,0,0)$。各偏导数为

$$ f_x = \frac{2x}{x^2+y^2}, \qquad f_y = \frac{2y}{x^2+y^2}. $$

At $(1,0)$: $f_x = 2/1 = 2$ and $f_y = 0/1 = 0$. The tangent plane is在 $(1,0)$ 处：$f_x = 2/1 = 2$，$f_y = 0/1 = 0$。切平面为

$$ z = 0 + 2(x-1) + 0(y-0) = 2(x-1) = 2x - 2. $$

Notice the plane does not involve $y$ at all, because the surface is locally flat in the $y$ direction at this point: $f_y(1,0) = 0$. That is geometrically sensible since $(1,0)$ sits on the $x$-axis, a symmetry line of $\ln(x^2+y^2)$.注意该平面完全不含 $y$，因为在这一点曲面沿 $y$ 方向局部是平的：$f_y(1,0) = 0$。这在几何上很合理，因为 $(1,0)$ 位于 $x$ 轴上，而 $x$ 轴是 $\ln(x^2+y^2)$ 的一条对称线。

Worked Example 3.3: the normal line to a graph例题 3.3：图像的法线

The tangent plane comes paired with a normal line. For $z = x^2 + y^2$ at $(1,2,5)$, using the graph normal $\langle f_x, f_y, -1\rangle = \langle 2, 4, -1\rangle$, the normal line through $(1,2,5)$ is切平面总与一条法线（normal line）配对出现。对 $z = x^2 + y^2$ 在 $(1,2,5)$ 处，用图像法向量 $\langle f_x, f_y, -1\rangle = \langle 2, 4, -1\rangle$，过 $(1,2,5)$ 的法线为

$$ (x,y,z) = (1,2,5) + t\langle 2, 4, -1\rangle, \quad \text{i.e.}\quad x = 1 + 2t,\ y = 2 + 4t,\ z = 5 - t. $$

This line is perpendicular to the tangent plane $2x + 4y - z = 5$ found by rearranging Worked Example 3.1. A quick check: the plane's coefficient vector $\langle 2, 4, -1\rangle$ is its normal vector, and it matches the direction vector of the normal line, so the line is indeed perpendicular to the plane.该直线垂直于把例题 3.1 整理后得到的切平面 $2x + 4y - z = 5$。快速检验：平面的系数向量 $\langle 2, 4, -1\rangle$ 就是它的法向量，且与法线的方向向量一致，因此该直线确实垂直于平面。

For $f(x,y) = xy$ with $f_x(2,3)=3$ and $f_y(2,3)=2$, which is the tangent plane at $(2,3,6)$?设 $f(x,y) = xy$，$f_x(2,3)=3$，$f_y(2,3)=2$，则 $(2,3,6)$ 处的切平面是哪一个？

3.1

$z = 6 + 3(x-2) + 2(y-3)$

$z = 6 + 2(x-2) + 3(y-3)$

$z = 6 + 3x + 2y$

$z = 3(x-2) + 2(y-3)$

Correct. The plane is $f(a,b)+f_x(a,b)(x-a)+f_y(a,b)(y-b)$, here $6+3(x-2)+2(y-3)$.正确。平面为 $f(a,b)+f_x(a,b)(x-a)+f_y(a,b)(y-b)$，此处即 $6+3(x-2)+2(y-3)$。

Match each partial to its own variable, keep the base value $f(a,b)=6$, and use the increments $(x-2)$ and $(y-3)$.让每个偏导数对应它自己的变量，保留基准值 $f(a,b)=6$，并使用增量 $(x-2)$ 与 $(y-3)$。

Section 4第 4 节

Linear Approximation and Differentials线性近似与微分（`differential`）

Key idea.核心思想。 The linearization $L(x,y)$ of $f$ at $(a,b)$ is the function whose graph is the tangent plane. For points near $(a,b)$, $f(x,y) \approx L(x,y)$, which lets us estimate values and propagate small errors without evaluating $f$ exactly.$f$ 在 $(a,b)$ 处的线性化（linearization）$L(x,y)$，就是以切平面为图像的那个函数。对 $(a,b)$ 附近的点，$f(x,y) \approx L(x,y)$，这让我们无需精确计算 $f$ 就能估值并传播小误差。

Linearization线性化

$$ L(x,y) = f(a,b) + f_x(a,b)(x-a) + f_y(a,b)(y-b). $$

Total differential全微分（total differential）

$$ dz = f_x(a,b)\, dx + f_y(a,b)\, dy, \qquad \Delta z \approx dz. $$

The differential $dz$ is the change predicted by the tangent plane when the inputs change by $dx$ and $dy$. It is the workhorse of error estimation: if measured quantities carry small uncertainties $dx, dy$, then $dz$ estimates the resulting uncertainty in $z$.微分（differential）$dz$ 是当输入变化 $dx$ 和 $dy$ 时切平面所预测的改变量。它是误差估计的主力工具：若测量量带有小的不确定度 $dx, dy$，则 $dz$ 估计出 $z$ 中由此产生的不确定度。

Common error.常见错误。 Two errors dominate here. First, picking a base point $(a,b)$ that is not actually easy: the linearization is only useful when $f(a,b)$ and its partials are clean numbers and $(a,b)$ is close to the target. Choosing $(a,b)=(3,4)$ for $\sqrt{(3.02)^2+(3.97)^2}$ works because $\sqrt{3^2+4^2}=5$ is exact and the increments are tiny. Second, sign errors in the increments $dx = x_{\text{target}} - a$ and $dy = y_{\text{target}} - b$. If the target coordinate is smaller than the base, the increment is negative, and that sign must be carried through. Writing $dy = 3.97 - 4 = -0.03$, not $+0.03$, is the difference between right and wrong.这里有两个最常见的错误。第一，选了一个其实并不"好算"的基点 $(a,b)$：只有当 $f(a,b)$ 及其偏导数是干净的数、且 $(a,b)$ 靠近目标时，线性化才有用。对 $\sqrt{(3.02)^2+(3.97)^2}$ 取 $(a,b)=(3,4)$ 之所以可行，是因为 $\sqrt{3^2+4^2}=5$ 是精确值且增量很小。第二，增量 $dx = x_{\text{target}} - a$ 与 $dy = y_{\text{target}} - b$ 的符号错误。若目标坐标小于基点坐标，增量为负，这个符号必须一路带下去。写成 $dy = 3.97 - 4 = -0.03$（而非 $+0.03$）正是对与错的分界。

Worked Example 4.1: estimating a value例题 4.1：估计一个数值

Use linearization to estimate $\sqrt{(3.02)^2 + (3.97)^2}$.用线性化估计 $\sqrt{(3.02)^2 + (3.97)^2}$。

Let $f(x,y) = \sqrt{x^2 + y^2}$ at $(a,b) = (3,4)$, where $f(3,4) = 5$. The partials are $f_x = x/\sqrt{x^2+y^2}$ and $f_y = y/\sqrt{x^2+y^2}$, so $f_x(3,4) = 3/5$, $f_y(3,4) = 4/5$.取 $f(x,y) = \sqrt{x^2 + y^2}$，基点 $(a,b) = (3,4)$，其中 $f(3,4) = 5$。偏导数为 $f_x = x/\sqrt{x^2+y^2}$，$f_y = y/\sqrt{x^2+y^2}$，故 $f_x(3,4) = 3/5$，$f_y(3,4) = 4/5$。

$$ L(3.02, 3.97) = 5 + \tfrac{3}{5}(0.02) + \tfrac{4}{5}(-0.03) = 5 + 0.012 - 0.024 = 4.988. $$
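To see how close the estimate is, the linearization can be compared with a direct evaluation. The sketch below is a minimal check; the factory function `linearize` is a hypothetical helper written for this example, and the coefficients are entered by hand from the computation above.

```python
import math


def linearize(f0, fx, fy, a, b):
    """Return L(x, y) = f0 + fx*(x - a) + fy*(y - b) as a callable."""
    def L(x, y):
        return f0 + fx * (x - a) + fy * (y - b)
    return L


# Base point (3, 4): f = 5, f_x = 3/5, f_y = 4/5.
L = linearize(5.0, 3 / 5, 4 / 5, 3.0, 4.0)

estimate = L(3.02, 3.97)
exact = math.sqrt(3.02**2 + 3.97**2)
print(estimate, exact, exact - estimate)
```

The printed difference should be on the order of $10^{-4}$, small compared with the increments $0.02$ and $-0.03$ themselves.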

Worked Example 4.2: linearizing a product例题 4.2：对乘积做线性化

Estimate $(2.01)^3 (0.98)^4$ using a linear approximation.用线性近似估计 $(2.01)^3 (0.98)^4$。

Let $f(x,y) = x^3 y^4$ at $(a,b) = (2,1)$, where $f(2,1) = 8$. The partials are $f_x = 3x^2 y^4$ and $f_y = 4x^3 y^3$, so $f_x(2,1) = 3(4)(1) = 12$ and $f_y(2,1) = 4(8)(1) = 32$. The increments are $dx = 0.01$, $dy = -0.02$. Then取 $f(x,y) = x^3 y^4$，基点 $(a,b) = (2,1)$，其中 $f(2,1) = 8$。偏导数为 $f_x = 3x^2 y^4$，$f_y = 4x^3 y^3$，故 $f_x(2,1) = 3(4)(1) = 12$，$f_y(2,1) = 4(8)(1) = 32$。增量为 $dx = 0.01$，$dy = -0.02$。于是

$$ L(2.01, 0.98) = 8 + 12(0.01) + 32(-0.02) = 8 + 0.12 - 0.64 = 7.48. $$

For comparison the exact value is $(2.01)^3(0.98)^4 \approx 8.1206 \times 0.92237 \approx 7.490$, so the linear estimate $7.48$ is within about $0.15\%$. The size of the error scales with the second derivatives times the square of the increments, which is why small increments make the estimate sharp.作为对照，精确值为 $(2.01)^3(0.98)^4 \approx 8.1206 \times 0.92237 \approx 7.490$，故线性估计 $7.48$ 的误差约在 $0.15\%$ 以内。误差大小与二阶导数乘以增量的平方成比例，这正是增量越小估计越精确的原因。

Worked Example 4.3: percentage error propagation例题 4.3：百分比误差的传播

The period of a pendulum is $T = 2\pi\sqrt{L/g}$. If $L$ is measured with a relative error up to $0.5\%$ and $g$ with a relative error up to $0.1\%$, bound the relative error in $T$.单摆的周期为 $T = 2\pi\sqrt{L/g}$。若 $L$ 的相对误差不超过 $0.5\%$，$g$ 的相对误差不超过 $0.1\%$，请给出 $T$ 的相对误差上界。

Take logarithms before differentiating, a standard trick for products and powers: $\ln T = \ln(2\pi) + \tfrac12\ln L - \tfrac12 \ln g$. The differential is微分前先取对数，这是处理乘积与幂的标准技巧：$\ln T = \ln(2\pi) + \tfrac12\ln L - \tfrac12 \ln g$。其微分为

$$ \frac{dT}{T} = \frac{1}{2}\frac{dL}{L} - \frac{1}{2}\frac{dg}{g}. $$

In the worst case the two contributions add in magnitude:在最坏情况下，两项贡献在量级上相加：

$$ \left|\frac{dT}{T}\right| \le \frac{1}{2}(0.005) + \frac{1}{2}(0.001) = 0.0025 + 0.0005 = 0.003 = 0.3\%. $$

The logarithmic differential converts each input's relative error into a weighted contribution, and the weights are exactly the exponents in the formula, here $+\tfrac12$ for $L$ and $-\tfrac12$ for $g$.对数微分把每个输入的相对误差转化为一项加权贡献，而权重正是公式中的指数，此处 $L$ 为 $+\tfrac12$，$g$ 为 $-\tfrac12$。
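The $0.3\%$ bound is a first-order estimate, so it is worth comparing with a direct evaluation. In the sketch below the nominal values `L0 = 1.0` and `g0 = 9.81` are assumptions chosen only for illustration; the relative error does not depend on them. It perturbs $L$ and $g$ to the corners of their error ranges and records the largest relative change in $T$.

```python
import math


def period(length, g):
    # T = 2*pi*sqrt(L/g)
    return 2 * math.pi * math.sqrt(length / g)


L0, g0 = 1.0, 9.81           # illustrative nominal values (assumed)
rel_L, rel_g = 0.005, 0.001  # relative error bounds from the example

T0 = period(L0, g0)
# T is monotone in each input, so the extremes occur at the corners.
worst = max(
    abs(period(L0 * (1 + sL * rel_L), g0 * (1 + sg * rel_g)) - T0) / T0
    for sL in (-1, 1)
    for sg in (-1, 1)
)
bound = 0.5 * rel_L + 0.5 * rel_g  # differential estimate, 0.003
print(worst, bound)
```

The exact worst case should agree with the differential bound to first order. It can differ in the later decimal places, since the differential ignores second-order terms.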

Going deeper: differentials and error propagation深入探讨：微分与误差传播

A rectangle is measured as $x = 30$ cm and $y = 24$ cm, each with a possible error of $\pm 0.1$ cm. Estimate the maximum error in the computed area $A = xy$.一个矩形测得 $x = 30$ cm，$y = 24$ cm，每个测量都可能有 $\pm 0.1$ cm 的误差。估计所算面积 $A = xy$ 的最大误差。

$dA = A_x\, dx + A_y\, dy = y\, dx + x\, dy$. At $(30,24)$ with $|dx|, |dy| \le 0.1$:$dA = A_x\, dx + A_y\, dy = y\, dx + x\, dy$。在 $(30,24)$ 处，且 $|dx|, |dy| \le 0.1$：

$$ |dA| \le 24(0.1) + 30(0.1) = 2.4 + 3.0 = 5.4 \ \text{cm}^2. $$

So the area $720\ \text{cm}^2$ carries an estimated uncertainty of about $5.4\ \text{cm}^2$, a relative error near $0.75\%$. The differential turns input tolerances into an output tolerance via a single linear formula.因此面积 $720\ \text{cm}^2$ 的估计不确定度约为 $5.4\ \text{cm}^2$，相对误差接近 $0.75\%$。微分通过一条线性公式把输入的容差转化为输出的容差。
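For this rectangle the exact worst case is easy to compute, which gives a sense of what the differential leaves out. The following is a small sketch under the same tolerances; the variable names are illustrative.

```python
def area(x, y):
    return x * y


x0, y0, tol = 30.0, 24.0, 0.1

# Differential estimate: |dA| <= y*|dx| + x*|dy|
dA_bound = y0 * tol + x0 * tol

# Exact worst case over the four corners of the tolerance box.
worst = max(
    abs(area(x0 + sx * tol, y0 + sy * tol) - area(x0, y0))
    for sx in (-1, 1)
    for sy in (-1, 1)
)
relative = dA_bound / area(x0, y0)
print(dA_bound, worst, relative)
```

The exact worst case exceeds the differential estimate by the product term $dx\,dy = 0.01\ \text{cm}^2$, which is the second-order piece that the linear formula drops.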

If $f_x(1,2)=3$ and $f_y(1,2)=-1$, the differential $dz$ for $dx=0.1$, $dy=0.2$ is若 $f_x(1,2)=3$，$f_y(1,2)=-1$，则当 $dx=0.1$，$dy=0.2$ 时微分 $dz$ 等于

4.1

$0.5$

$0.3$

$0.1$

$-0.1$

Correct. $dz = 3(0.1) + (-1)(0.2) = 0.3 - 0.2 = 0.1$.正确。$dz = 3(0.1) + (-1)(0.2) = 0.3 - 0.2 = 0.1$。

Apply $dz=f_x\,dx+f_y\,dy=3(0.1)+(-1)(0.2)=0.1$. Keep the sign of $f_y$ and pair each partial with its own increment.套用 $dz=f_x\,dx+f_y\,dy=3(0.1)+(-1)(0.2)=0.1$。保留 $f_y$ 的符号，并让每个偏导数与它自己的增量配对。

Section 5第 5 节

The Chain Rule Revisited再看链式法则（`Chain Rule`）

Key idea.核心思想。 When the inputs of $f$ themselves depend on other variables, derivatives flow along every path through the dependency tree and add up. The directional derivative of Section 1 is the simplest case: composing $f$ with a straight-line path. The general multivariable chain rule handles curved paths and chains of variables.当 $f$ 的输入本身又依赖于其他变量时，导数沿依赖关系树中的每条路径流动并相加。第 1 节的方向导数是最简单的情形：把 $f$ 与一条直线路径复合。一般的多元链式法则（Chain Rule）则能处理曲线路径以及多层变量链。

One independent variable一个自变量

$$ \frac{df}{dt} = f_x \frac{dx}{dt} + f_y \frac{dy}{dt} = \nabla f \cdot \mathbf{r}'(t), \quad \text{where } \mathbf{r}(t) = (x(t), y(t)). $$

Two independent variables两个自变量

$$ \frac{\partial f}{\partial s} = f_x \frac{\partial x}{\partial s} + f_y \frac{\partial y}{\partial s}, \qquad \frac{\partial f}{\partial t} = f_x \frac{\partial x}{\partial t} + f_y \frac{\partial y}{\partial t}. $$

The pattern is mechanical once you draw the tree: sum over each path from $f$ to the independent variable, multiplying the partial derivatives along each branch. Writing $df/dt = \nabla f \cdot \mathbf{r}'(t)$ also recovers the directional derivative when $\mathbf{r}'(t)$ is a unit vector.一旦画出依赖树，这个套路就很机械：对从 $f$ 到该自变量的每条路径求和，沿每条分支把偏导数相乘。把式子写成 $df/dt = \nabla f \cdot \mathbf{r}'(t)$，当 $\mathbf{r}'(t)$ 是单位向量时还能重新得到方向导数。

Common error.常见错误。 When $f$ depends on $x$ and $y$, and both depend on $t$, the answer $df/dt$ is a sum over both paths: $f_x x' + f_y y'$. A frequent error is to keep only one term, or to mismatch a partial with the wrong inner derivative (pairing $f_x$ with $y'$). Another trap appears with two independent variables $s,t$: the symbol $\partial f/\partial s$ in the chain rule means the rate holding $t$ fixed, not the partial of the outer $f$ alone. Draw the dependency tree, list every path from $f$ down to the variable you are differentiating against, and sum the products of the branch derivatives. The tree removes the guesswork.当 $f$ 依赖 $x$ 和 $y$，且两者都依赖 $t$ 时，答案 $df/dt$ 是对两条路径求和：$f_x x' + f_y y'$。常见错误是只保留一项，或把某个偏导数与错误的内层导数配对（把 $f_x$ 配 $y'$）。两个自变量 $s,t$ 时还有一个陷阱：链式法则中的符号 $\partial f/\partial s$ 指的是固定 $t$ 时的变化率，而不是单纯外层 $f$ 的偏导数。画出依赖树，列出从 $f$ 到所求变量的每条路径，把各分支导数之积相加。依赖树能消除瞎猜。

Worked Example 5.1: chain rule along a path例题 5.1：沿路径的链式法则

Let $f(x,y) = x^2 y$ with $x = \cos t$, $y = \sin t$. Find $df/dt$ at $t = 0$.设 $f(x,y) = x^2 y$，且 $x = \cos t$，$y = \sin t$。求 $t = 0$ 处的 $df/dt$。

$f_x = 2xy$, $f_y = x^2$, and $x'(t) = -\sin t$, $y'(t) = \cos t$. So$f_x = 2xy$，$f_y = x^2$，且 $x'(t) = -\sin t$，$y'(t) = \cos t$。于是

$$ \frac{df}{dt} = (2xy)(-\sin t) + (x^2)(\cos t). $$

At $t = 0$: $x = 1$, $y = 0$, $\sin 0 = 0$, $\cos 0 = 1$, giving $df/dt = (0)(0) + (1)(1) = 1$.在 $t = 0$ 处：$x = 1$，$y = 0$，$\sin 0 = 0$，$\cos 0 = 1$，得 $df/dt = (0)(0) + (1)(1) = 1$。
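The chain-rule result can be compared with a direct numerical derivative of the composite $t \mapsto f(\cos t, \sin t)$. This is a sketch for checking purposes only; the function names and the step `h` are illustrative assumptions.

```python
import math


def f(x, y):
    return x**2 * y


def df_dt_chain(t):
    """df/dt = f_x * x'(t) + f_y * y'(t) with x = cos t, y = sin t."""
    x, y = math.cos(t), math.sin(t)
    fx, fy = 2 * x * y, x**2
    dx, dy = -math.sin(t), math.cos(t)
    return fx * dx + fy * dy


def df_dt_numeric(t, h=1e-6):
    """Central difference applied to the composite function of t."""
    def g(s):
        return f(math.cos(s), math.sin(s))
    return (g(t + h) - g(t - h)) / (2 * h)


chain_value = df_dt_chain(0.0)
numeric_value = df_dt_numeric(0.0)
print(chain_value, numeric_value)
```

Both values should be close to $1$. Trying other values of `t` is a quick way to catch a dropped term or a mismatched pairing such as $f_x$ with $y'$.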

Worked Example 5.2: two independent variables (polar coordinates)例题 5.2：两个自变量（极坐标）

Let $f(x,y) = x^2 + y^2$ with $x = r\cos\theta$ and $y = r\sin\theta$. Compute $\partial f/\partial r$ and $\partial f/\partial\theta$ by the chain rule and confirm against direct substitution.设 $f(x,y) = x^2 + y^2$，且 $x = r\cos\theta$，$y = r\sin\theta$。用链式法则求 $\partial f/\partial r$ 和 $\partial f/\partial\theta$，并与直接代入的结果相互印证。

$f_x = 2x$, $f_y = 2y$. The inner partials are $x_r = \cos\theta$, $y_r = \sin\theta$, $x_\theta = -r\sin\theta$, $y_\theta = r\cos\theta$. Then$f_x = 2x$，$f_y = 2y$。内层偏导数为 $x_r = \cos\theta$，$y_r = \sin\theta$，$x_\theta = -r\sin\theta$，$y_\theta = r\cos\theta$。于是

$$ \frac{\partial f}{\partial r} = 2x\cos\theta + 2y\sin\theta = 2r\cos^2\theta + 2r\sin^2\theta = 2r, $$

$$ \frac{\partial f}{\partial\theta} = 2x(-r\sin\theta) + 2y(r\cos\theta) = -2r^2\cos\theta\sin\theta + 2r^2\sin\theta\cos\theta = 0. $$

Direct substitution gives $f = r^2\cos^2\theta + r^2\sin^2\theta = r^2$, so indeed $f_r = 2r$ and $f_\theta = 0$. The vanishing $\theta$ derivative is the statement that $f$ is rotationally symmetric: it depends only on the radius.直接代入得 $f = r^2\cos^2\theta + r^2\sin^2\theta = r^2$，故确有 $f_r = 2r$，$f_\theta = 0$。对 $\theta$ 的导数为零，正说明 $f$ 具有旋转对称性：它只依赖于半径。
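The two-variable chain rule can be spot-checked at a sample point. In the sketch below the point $(r, \theta) = (1.5, 0.7)$ is an arbitrary choice made for illustration; the code evaluates the tree sum for each independent variable.

```python
import math


def chain_rule_polar(r, theta):
    """Return (df/dr, df/dtheta) for f = x^2 + y^2 via the chain rule."""
    x, y = r * math.cos(theta), r * math.sin(theta)
    fx, fy = 2 * x, 2 * y                            # outer partials
    x_r, y_r = math.cos(theta), math.sin(theta)      # inner partials in r
    x_t, y_t = -r * math.sin(theta), r * math.cos(theta)  # inner partials in theta
    return fx * x_r + fy * y_r, fx * x_t + fy * y_t


r, theta = 1.5, 0.7  # arbitrary sample point (assumed)
df_dr, df_dtheta = chain_rule_polar(r, theta)
print(df_dr, 2 * r)
print(df_dtheta)
```

The first line should show two matching values, and the second should be zero up to rounding, in line with $f = r^2$.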

Worked Example 5.3: implicit differentiation from the chain rule例题 5.3：由链式法则得到隐函数求导

The chain rule yields the implicit-differentiation formula. Suppose $y$ is defined implicitly by $F(x,y) = 0$. Differentiate with respect to $x$, treating $y = y(x)$:链式法则给出隐函数求导（implicit differentiation）公式。设 $y$ 由 $F(x,y) = 0$ 隐式确定。把 $y$ 视为 $x$ 的函数 $y = y(x)$，对 $x$ 求导：

$$ F_x \cdot 1 + F_y \cdot \frac{dy}{dx} = 0 \quad\Longrightarrow\quad \frac{dy}{dx} = -\frac{F_x}{F_y}\ \ (F_y \neq 0). $$

Apply it to the folium-style curve $x^3 + y^3 = 6xy$ at the point $(3,3)$. Let $F = x^3 + y^3 - 6xy$. Then $F_x = 3x^2 - 6y$ and $F_y = 3y^2 - 6x$. At $(3,3)$: $F_x = 27 - 18 = 9$ and $F_y = 27 - 18 = 9$, so把它用于笛卡尔叶形线（folium）$x^3 + y^3 = 6xy$ 在点 $(3,3)$ 处。设 $F = x^3 + y^3 - 6xy$。则 $F_x = 3x^2 - 6y$，$F_y = 3y^2 - 6x$。在 $(3,3)$ 处：$F_x = 27 - 18 = 9$，$F_y = 27 - 18 = 9$，故

$$ \frac{dy}{dx} = -\frac{9}{9} = -1. $$

The tangent line to the curve at $(3,3)$ has slope $-1$. This is the multivariable chain rule doing implicit differentiation in one clean step, with no need to solve for $y$.曲线在 $(3,3)$ 处的切线斜率为 $-1$。这就是多元链式法则一步干净地完成隐函数求导，无需解出 $y$。
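
The slope from Example 5.3 can be checked against the curve itself. The sketch below is an illustration under stated assumptions: `implicit_slope` applies $-F_x/F_y$, while `solve_y` (a hypothetical helper using Newton's method in $y$) finds nearby points on the curve so that a secant slope can be compared with the formula.

```python
def F(x, y):
    return x**3 + y**3 - 6 * x * y

def implicit_slope(x, y):
    """dy/dx = -F_x / F_y, valid only where F_y != 0."""
    F_x = 3 * x**2 - 6 * y
    F_y = 3 * y**2 - 6 * x
    if F_y == 0:
        raise ZeroDivisionError("F_y = 0: the formula does not apply here")
    return -F_x / F_y

def solve_y(x, y_guess, steps=20):
    """Newton's method in y for F(x, y) = 0 at fixed x (assumes F_y != 0 nearby)."""
    y = y_guess
    for _ in range(steps):
        y -= F(x, y) / (3 * y**2 - 6 * x)
    return y

slope = implicit_slope(3, 3)

# Secant slope through two points of the curve on either side of x = 3.
h = 1e-5
secant = (solve_y(3 + h, 3.0) - solve_y(3 - h, 3.0)) / (2 * h)
print(slope, secant)
```

The formula gives the slope without solving for $y$; the Newton step is only there to produce an independent comparison.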

With $z=f(x,y)$, $x=x(t)$, $y=y(t)$, the correct chain rule for $dz/dt$ is当 $z=f(x,y)$，$x=x(t)$，$y=y(t)$ 时，$dz/dt$ 的正确链式法则是

5.1

$f_x + f_y$

$f_x\,\dfrac{dx}{dt} + f_y\,\dfrac{dy}{dt}$

$f_x\,\dfrac{dy}{dt} + f_y\,\dfrac{dx}{dt}$

$\dfrac{dx}{dt}\,\dfrac{dy}{dt}$

Correct. Sum over each path: $f_x$ times $dx/dt$ plus $f_y$ times $dy/dt$.正确。对每条路径求和：$f_x$ 乘 $dx/dt$ 加上 $f_y$ 乘 $dy/dt$。

Each partial pairs with the derivative of its own variable: $f_x\,dx/dt + f_y\,dy/dt$. The other forms mismatch or drop the inner derivatives.每个偏导数都与自己变量的导数配对：$f_x\,dx/dt + f_y\,dy/dt$。其他形式要么配错，要么漏掉了内层导数。

Section 6第 6 节

Gradients and Level Surfaces梯度与等值面（`level surface`）

Key idea.核心思想。 For a function of three variables $F(x,y,z)$, the gradient $\nabla F$ at a point on a level surface $F = k$ is orthogonal to that surface. This single fact gives the normal vector to the surface, and hence its tangent plane, directly from the gradient, without solving for $z$.对三变量函数 $F(x,y,z)$，在等值面（level surface）$F = k$ 上某点处的梯度 $\nabla F$ 与该曲面正交。仅凭这一事实，就能由梯度直接得到曲面的法向量，进而得到切平面，而无需解出 $z$。

Tangent plane to a level surface $F(x,y,z)=k$等值面 $F(x,y,z)=k$ 的切平面

$$ F_x(\mathbf{a})(x-a) + F_y(\mathbf{a})(y-b) + F_z(\mathbf{a})(z-c) = 0, \quad \mathbf{a}=(a,b,c). $$

The normal line to the surface at $\mathbf{a}$ runs in the direction $\nabla F(\mathbf{a})$. In two variables the same principle says $\nabla f$ is perpendicular to the level curve $f = k$, which is why moving perpendicular to $\nabla f$ keeps $f$ constant (Section 2).曲面在 $\mathbf{a}$ 处的法线（normal line）沿方向 $\nabla F(\mathbf{a})$。在二维情形下同样的原理表明 $\nabla f$ 垂直于等值线 $f = k$，这也正是为何沿垂直于 $\nabla f$ 的方向移动会保持 $f$ 不变（见第 2 节）。

Going deeper: why $\nabla F \perp$ the level surface深入探讨：为何 $\nabla F$ 垂直于等值面

Let $\mathbf{r}(t)$ be any differentiable curve lying entirely in the level surface $F(x,y,z) = k$ with $\mathbf{r}(t_0) = \mathbf{a}$. Then $F(\mathbf{r}(t)) = k$ is constant, so differentiating with the chain rule:设 $\mathbf{r}(t)$ 是完全位于等值面 $F(x,y,z) = k$ 内的任一可微曲线，且 $\mathbf{r}(t_0) = \mathbf{a}$。则 $F(\mathbf{r}(t)) = k$ 为常数，用链式法则求导：

$$ \frac{d}{dt}F(\mathbf{r}(t)) = \nabla F(\mathbf{r}(t)) \cdot \mathbf{r}'(t) = 0. $$

At $t = t_0$ this gives $\nabla F(\mathbf{a}) \cdot \mathbf{r}'(t_0) = 0$. Since $\mathbf{r}'(t_0)$ can be the tangent vector of any curve through $\mathbf{a}$ in the surface, $\nabla F(\mathbf{a})$ is orthogonal to every tangent vector, hence normal to the surface. The tangent plane is the plane through $\mathbf{a}$ with normal $\nabla F(\mathbf{a})$.在 $t = t_0$ 处得 $\nabla F(\mathbf{a}) \cdot \mathbf{r}'(t_0) = 0$。由于 $\mathbf{r}'(t_0)$ 可以是曲面内过 $\mathbf{a}$ 的任意曲线的切向量，故 $\nabla F(\mathbf{a})$ 与每个切向量都正交，因而是曲面的法向量。切平面就是过 $\mathbf{a}$ 且以 $\nabla F(\mathbf{a})$ 为法向量的平面。
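
The orthogonality argument can be observed numerically. The sketch below assumes a particular curve `r(t)` on the sphere $x^2+y^2+z^2 = 9$ (a hypothetical choice made only for illustration) and checks that $\nabla F(\mathbf{r}(t)) \cdot \mathbf{r}'(t)$ is zero up to finite-difference error at a few sample times.

```python
import math

def r(t):
    """A sample curve lying on the sphere x^2 + y^2 + z^2 = 9."""
    c = math.cos(2 * t)
    return (3 * math.cos(t) * c, 3 * math.sin(t) * c, 3 * math.sin(2 * t))

def grad_F(p):
    """Gradient of F = x^2 + y^2 + z^2."""
    x, y, z = p
    return (2 * x, 2 * y, 2 * z)

def velocity(t, h=1e-6):
    """Central-difference approximation of r'(t)."""
    a, b = r(t + h), r(t - h)
    return tuple((ai - bi) / (2 * h) for ai, bi in zip(a, b))

def dot(u, v):
    return sum(ui * vi for ui, vi in zip(u, v))

dots = [dot(grad_F(r(t)), velocity(t)) for t in (0.0, 0.4, 1.3, 2.5)]
print(dots)  # each entry should be close to 0
```

Any other differentiable curve in the surface would serve equally well; the point is that the dot product vanishes for every such curve.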

Common error.常见错误。 For a level surface $F(x,y,z) = k$, the normal is $\nabla F = \langle F_x, F_y, F_z\rangle$. Moving the constant to the other side (writing $F - k = 0$) does not change the gradient, but you must differentiate every term of $F$, not just part of it. A more damaging error is to mix up the two tangent-plane formulas: for a graph $z = f(x,y)$ the normal is $\langle f_x, f_y, -1\rangle$, while for a level surface $F = k$ the normal is $\langle F_x, F_y, F_z\rangle$. Using the graph formula on a level surface, or vice versa, drops or invents a $-1$. Decide first which form you are in: is the surface given as "$z$ equals" (graph) or as "$F$ equals constant" (level surface)?对等值面 $F(x,y,z) = k$，法向量为 $\nabla F = \langle F_x, F_y, F_z\rangle$。把常数移到另一边（写成 $F - k = 0$）不会改变梯度，但必须对 $F$ 的每一项求导，而非只对其中一部分。更严重的错误是把两条切平面公式混淆：对图像 $z = f(x,y)$，法向量为 $\langle f_x, f_y, -1\rangle$；而对等值面 $F = k$，法向量为 $\langle F_x, F_y, F_z\rangle$。在等值面上用图像公式，或反之，都会丢掉或凭空多出一个 $-1$。先判断你处于哪种情形：曲面是以"$z$ 等于"给出（图像），还是以"$F$ 等于常数"给出（等值面）？

Worked Example 6.1: tangent plane to a sphere例题 6.1：球面的切平面

Find the tangent plane to $x^2 + y^2 + z^2 = 9$ at $(2, 1, 2)$.求 $x^2 + y^2 + z^2 = 9$ 在 $(2, 1, 2)$ 处的切平面。

Let $F = x^2 + y^2 + z^2$. Then $\nabla F = \langle 2x, 2y, 2z \rangle$, so $\nabla F(2,1,2) = \langle 4, 2, 4 \rangle$. The plane is设 $F = x^2 + y^2 + z^2$。则 $\nabla F = \langle 2x, 2y, 2z \rangle$，故 $\nabla F(2,1,2) = \langle 4, 2, 4 \rangle$。切平面为

$$ 4(x-2) + 2(y-1) + 4(z-2) = 0, \quad \text{i.e.}\quad 2x + y + 2z = 9. $$

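This is consistent with the geometry: the normal $\langle 4,2,4\rangle$ is twice the radius vector $\langle 2,1,2\rangle$, and the radius of a sphere is always normal to it.这与几何一致：法向量 $\langle 4,2,4\rangle$ 是半径向量 $\langle 2,1,2\rangle$ 的两倍，而球面的半径总是与球面正交。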
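
The recipe of Example 6.1 (evaluate the gradient, then dot it with the base point to get the constant) fits in a few lines. This is a sketch; `tangent_plane` is a hypothetical helper name, and it returns the plane in the form $\mathbf{n}\cdot(x,y,z) = d$.

```python
def tangent_plane(grad, point):
    """Return (normal, d) for the plane normal . (x, y, z) = d through `point`."""
    d = sum(g * p for g, p in zip(grad, point))
    return grad, d

point = (2, 1, 2)
grad = tuple(2 * c for c in point)   # grad F = <2x, 2y, 2z> for the sphere
normal, d = tangent_plane(grad, point)
print(normal, d)  # (4, 2, 4) 18
```

Dividing $4x + 2y + 4z = 18$ by $2$ recovers the plane $2x + y + 2z = 9$ found above.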

Worked Example 6.2: tangent plane and normal line to an ellipsoid例题 6.2：椭球面的切平面与法线

Find the tangent plane and the normal line to the ellipsoid $\dfrac{x^2}{4} + y^2 + \dfrac{z^2}{9} = 3$ at $(2, 1, 3)$.求椭球面 $\dfrac{x^2}{4} + y^2 + \dfrac{z^2}{9} = 3$ 在 $(2, 1, 3)$ 处的切平面与法线。

Let $F = \tfrac{x^2}{4} + y^2 + \tfrac{z^2}{9}$. Then $\nabla F = \langle \tfrac{x}{2},\ 2y,\ \tfrac{2z}{9}\rangle$. At $(2,1,3)$: $\nabla F = \langle 1,\ 2,\ \tfrac{2}{3}\rangle$. The tangent plane is设 $F = \tfrac{x^2}{4} + y^2 + \tfrac{z^2}{9}$。则 $\nabla F = \langle \tfrac{x}{2},\ 2y,\ \tfrac{2z}{9}\rangle$。在 $(2,1,3)$ 处：$\nabla F = \langle 1,\ 2,\ \tfrac{2}{3}\rangle$。切平面为

$$ 1(x-2) + 2(y-1) + \tfrac{2}{3}(z-3) = 0 \quad\Longrightarrow\quad x + 2y + \tfrac{2}{3}z = 6, $$

or, clearing the fraction, $3x + 6y + 2z = 18$. The normal line uses the same direction $\langle 1, 2, \tfrac23\rangle$ (or its scalar multiple $\langle 3, 6, 2\rangle$):或者去分母得 $3x + 6y + 2z = 18$。法线沿同一方向 $\langle 1, 2, \tfrac23\rangle$（或其标量倍 $\langle 3, 6, 2\rangle$）：

$$ (x,y,z) = (2,1,3) + t\langle 3, 6, 2\rangle. $$

A quick verification: the point $(2,1,3)$ satisfies $3(2)+6(1)+2(3) = 6+6+6 = 18$, so it lies on the plane, as it must.快速验证：点 $(2,1,3)$ 满足 $3(2)+6(1)+2(3) = 6+6+6 = 18$，故它在该平面上，正如理应如此。
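
The same verification can be carried out in exact arithmetic. The sketch below uses Python's `fractions.Fraction` so that the component $\tfrac23$ is not rounded; the names `normal_line` and `plane_residual` are illustrative assumptions.

```python
from fractions import Fraction

def grad_F(x, y, z):
    """Gradient of F = x^2/4 + y^2 + z^2/9 in exact rational arithmetic."""
    return (Fraction(x) / 2, 2 * Fraction(y), 2 * Fraction(z) / 9)

a = (2, 1, 3)
n = grad_F(*a)                    # <1, 2, 2/3>
n_int = tuple(3 * c for c in n)   # scalar multiple <3, 6, 2>
d = sum(ni * ai for ni, ai in zip(n_int, a))  # right-hand side of 3x + 6y + 2z = d

def normal_line(t):
    """Point on the normal line (2, 1, 3) + t <3, 6, 2>."""
    return tuple(ai + t * ni for ai, ni in zip(a, n_int))

def plane_residual(p):
    """Zero exactly when p lies on the tangent plane."""
    return sum(ni * pi for ni, pi in zip(n_int, p)) - d

print(n, d, plane_residual(a), plane_residual(normal_line(1)))
```

The residual is $0$ at the base point and grows as $49t$ along the normal line, since $|\langle 3,6,2\rangle|^2 = 49$: the line leaves the plane perpendicularly.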

Worked Example 6.3: angle of intersection of two surfaces例题 6.3：两曲面的相交角

The surfaces $x^2 + y^2 + z^2 = 6$ and $x^2 + y^2 - z = 0$ both pass through $(1, 1, 2)$. Find the angle between them there, defined as the angle between their normal vectors.曲面 $x^2 + y^2 + z^2 = 6$ 与 $x^2 + y^2 - z = 0$ 都过 $(1, 1, 2)$。求它们在该处的夹角，定义为两个法向量之间的夹角。

For $F = x^2 + y^2 + z^2$: $\nabla F = \langle 2x, 2y, 2z\rangle$, so $\nabla F(1,1,2) = \langle 2, 2, 4\rangle$. For $G = x^2 + y^2 - z$: $\nabla G = \langle 2x, 2y, -1\rangle$, so $\nabla G(1,1,2) = \langle 2, 2, -1\rangle$. The angle $\phi$ between the normals satisfies对 $F = x^2 + y^2 + z^2$：$\nabla F = \langle 2x, 2y, 2z\rangle$，故 $\nabla F(1,1,2) = \langle 2, 2, 4\rangle$。对 $G = x^2 + y^2 - z$：$\nabla G = \langle 2x, 2y, -1\rangle$，故 $\nabla G(1,1,2) = \langle 2, 2, -1\rangle$。法向量之间的夹角 $\phi$ 满足

$$ \cos\phi = \frac{\nabla F \cdot \nabla G}{|\nabla F||\nabla G|} = \frac{4 + 4 - 4}{\sqrt{4+4+16}\,\sqrt{4+4+1}} = \frac{4}{\sqrt{24}\,\sqrt{9}} = \frac{4}{3\sqrt{24}} = \frac{4}{6\sqrt6} = \frac{2}{3\sqrt6}. $$

Numerically $\cos\phi = 2/(3\sqrt6) \approx 0.2722$, so $\phi \approx 74.2^\circ$. The surfaces meet at roughly $74$ degrees at that point. The gradient is doing all the geometric work: it converts each implicit surface into a concrete normal direction.数值上 $\cos\phi = 2/(3\sqrt6) \approx 0.2722$，故 $\phi \approx 74.2^\circ$。两曲面在该点约以 $74$ 度相交。梯度承担了全部几何工作：它把每个隐式曲面转化为一个具体的法向量方向。
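
The arithmetic of Example 6.3 is easy to reproduce. The following sketch assumes a small helper `angle_between` (an illustrative name) that applies the dot-product formula and converts the result to degrees.

```python
import math

def angle_between(u, v):
    """Angle in degrees between two nonzero vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return math.degrees(math.acos(dot / (norm_u * norm_v)))

grad_F = (2, 2, 4)    # gradient of x^2 + y^2 + z^2 at (1, 1, 2)
grad_G = (2, 2, -1)   # gradient of x^2 + y^2 - z at (1, 1, 2)
phi = angle_between(grad_F, grad_G)
print(round(phi, 1))  # about 74.2
```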

At a point on the level surface $F(x,y,z)=k$, the vector $\nabla F$ is在等值面 $F(x,y,z)=k$ 上某点处，向量 $\nabla F$ 是

6.1

tangent to the surface与曲面相切

always the zero vector总是零向量

normal (perpendicular) to the surface与曲面正交（垂直）

parallel to the $z$-axis平行于 $z$ 轴

Correct. The gradient of $F$ is orthogonal to the level surface $F=k$, so it serves as the surface normal.正确。$F$ 的梯度与等值面 $F=k$ 正交，故可作为曲面的法向量。

Differentiating $F(\mathbf{r}(t))=k$ gives $\nabla F\cdot\mathbf{r}'=0$ for every in-surface tangent, so $\nabla F$ is normal, not tangent or axis-aligned.对 $F(\mathbf{r}(t))=k$ 求导给出 $\nabla F\cdot\mathbf{r}'=0$，对曲面内每个切向量都成立，故 $\nabla F$ 是法向量，而非切向量或与坐标轴对齐。

Section 7第 7 节

Going Deeper深入探讨

Key idea.核心思想。 Differentiability is stronger than the mere existence of directional derivatives. A function can have every directional derivative at a point and still fail to be differentiable there. The gradient dot-product formula $D_{\mathbf{u}}f = \nabla f \cdot \mathbf{u}$ is guaranteed for every direction only when $f$ is differentiable.可微性比仅仅存在方向导数要强。一个函数可以在某点拥有全部方向导数，却仍在该处不可微。梯度点积公式 $D_{\mathbf{u}}f = \nabla f \cdot \mathbf{u}$ 只有在 $f$ 可微时才能保证对每个方向都成立。

The unifying statement of this unit is the differentiability condition: near $\mathbf{a} = (a,b)$,本单元的统一表述就是可微性条件：在 $\mathbf{a} = (a,b)$ 附近，

Differentiability (first-order Taylor form)可微性（一阶泰勒形式）

$$ f(\mathbf{a}+\mathbf{h}) = f(\mathbf{a}) + \nabla f(\mathbf{a})\cdot\mathbf{h} + \varepsilon(\mathbf{h})|\mathbf{h}|, \quad \varepsilon(\mathbf{h}) \to 0 \text{ as } \mathbf{h}\to\mathbf{0}. $$

This says the tangent plane (the linear term $\nabla f \cdot \mathbf{h}$) approximates $f$ with error that vanishes faster than $|\mathbf{h}|$. A useful sufficient condition: if $f_x$ and $f_y$ exist and are continuous near $\mathbf{a}$, then $f$ is differentiable at $\mathbf{a}$, and all of the formulas in this unit apply.这表明切平面（线性项 $\nabla f \cdot \mathbf{h}$）以比 $|\mathbf{h}|$ 更快趋于零的误差逼近 $f$。一个有用的充分条件：若 $f_x$ 和 $f_y$ 在 $\mathbf{a}$ 附近存在且连续，则 $f$ 在 $\mathbf{a}$ 处可微，本单元的全部公式都适用。
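
To see the condition $\varepsilon(\mathbf{h}) \to 0$ in action, one can tabulate the remainder ratio for a smooth function. The sketch below assumes the sample function $f(x,y) = x^2 y$, the base point $(1,2)$, and the fixed direction $\langle 0.6, 0.8\rangle$; all three are arbitrary illustrative choices.

```python
import math

def f(x, y):
    return x**2 * y

def grad_f(x, y):
    return (2 * x * y, x**2)

def remainder_ratio(a, h):
    """|f(a+h) - f(a) - grad f(a) . h| / |h|, i.e. |epsilon(h)|."""
    gx, gy = grad_f(*a)
    linear = gx * h[0] + gy * h[1]
    err = f(a[0] + h[0], a[1] + h[1]) - f(*a) - linear
    return abs(err) / math.hypot(*h)

a = (1.0, 2.0)
# Shrink h along the unit direction <0.6, 0.8>.
ratios = [remainder_ratio(a, (s * 0.6, s * 0.8)) for s in (1e-1, 1e-2, 1e-3)]
print(ratios)
```

For this polynomial the ratio works out to $1.68\,s + 0.288\,s^2$ with $s = |\mathbf{h}|$, so each tenfold reduction of $|\mathbf{h}|$ shrinks it roughly tenfold, as the definition requires.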

Common error.常见错误。 The single most common false belief in this subject is that "both partials exist" is enough for differentiability, the tangent-plane formula, and the gradient dot-product rule. It is not. The counterexample below has both partials equal to $0$ at the origin yet is not even continuous there. The correct chain of implications is: continuously differentiable ($C^1$, meaning the partials exist and are continuous) $\Rightarrow$ differentiable $\Rightarrow$ all directional derivatives exist and equal $\nabla f \cdot \mathbf{u}$, and also $\Rightarrow$ $f$ continuous. None of these arrows reverses. Treat "partials exist" as the weakest hypothesis, never as a license to use the gradient formula.本主题中最常见的错误信念是："两个偏导数都存在"就足以保证可微性、切平面公式以及梯度点积法则。其实不然。下面的反例在原点处两个偏导数都等于 $0$，却连连续都做不到。正确的蕴含链是：连续可微（$C^1$，即偏导数存在且连续）$\Rightarrow$ 可微 $\Rightarrow$ 全部方向导数存在且等于 $\nabla f \cdot \mathbf{u}$，同时 $\Rightarrow$ $f$ 连续。这些箭头都不可逆。把"偏导数存在"当作最弱的假设，绝不能据此就放心使用梯度公式。

Going deeper: why continuous partials force differentiability深入探讨：为何偏导数连续就能保证可微

We prove the standard sufficient condition: if $f_x$ and $f_y$ exist and are continuous on a disk around $\mathbf{a} = (a,b)$, then $f$ is differentiable at $\mathbf{a}$. The engine is the one-variable Mean Value Theorem applied one coordinate at a time.我们证明标准的充分条件：若 $f_x$ 和 $f_y$ 在 $\mathbf{a} = (a,b)$ 的某个圆盘上存在且连续，则 $f$ 在 $\mathbf{a}$ 处可微。核心引擎是逐个坐标地应用一元中值定理（Mean Value Theorem）。

Write the increment $f(a+h, b+k) - f(a,b)$ and split it through the corner point $(a+h, b)$:写出增量 $f(a+h, b+k) - f(a,b)$，并经由拐角点 $(a+h, b)$ 拆分：

$$ \Delta f = \big[f(a+h, b+k) - f(a+h, b)\big] + \big[f(a+h, b) - f(a,b)\big]. $$

Apply the Mean Value Theorem to each bracket. In the first, only the second coordinate changes, so there is some $k^\ast$ between $0$ and $k$ with $f(a+h, b+k) - f(a+h, b) = f_y(a+h, b+k^\ast)\,k$. In the second, only the first coordinate changes, so there is some $h^\ast$ between $0$ and $h$ with $f(a+h, b) - f(a,b) = f_x(a+h^\ast, b)\,h$. Hence对每个方括号应用中值定理。在第一个里只有第二个坐标变化，故存在介于 $0$ 与 $k$ 之间的某个 $k^\ast$，使 $f(a+h, b+k) - f(a+h, b) = f_y(a+h, b+k^\ast)\,k$。在第二个里只有第一个坐标变化，故存在介于 $0$ 与 $h$ 之间的某个 $h^\ast$，使 $f(a+h, b) - f(a,b) = f_x(a+h^\ast, b)\,h$。于是

$$ \Delta f = f_x(a+h^\ast, b)\,h + f_y(a+h, b+k^\ast)\,k. $$

Now subtract the proposed linear part $f_x(a,b)h + f_y(a,b)k$:现在减去拟用的线性部分 $f_x(a,b)h + f_y(a,b)k$：

$$ \Delta f - \big[f_x(a,b)h + f_y(a,b)k\big] = \underbrace{\big[f_x(a+h^\ast,b) - f_x(a,b)\big]}_{=\,\varepsilon_1}\,h + \underbrace{\big[f_y(a+h,b+k^\ast) - f_y(a,b)\big]}_{=\,\varepsilon_2}\,k. $$

Because $f_x$ and $f_y$ are continuous at $(a,b)$, and because $(a+h^\ast, b) \to (a,b)$ and $(a+h, b+k^\ast) \to (a,b)$ as $(h,k) \to (0,0)$, both $\varepsilon_1 \to 0$ and $\varepsilon_2 \to 0$. Finally, since $|h| \le |\mathbf{h}|$ and $|k| \le |\mathbf{h}|$ where $\mathbf{h} = (h,k)$,由于 $f_x$ 和 $f_y$ 在 $(a,b)$ 处连续，且当 $(h,k) \to (0,0)$ 时 $(a+h^\ast, b) \to (a,b)$、$(a+h, b+k^\ast) \to (a,b)$，故 $\varepsilon_1 \to 0$ 且 $\varepsilon_2 \to 0$。最后，由于 $|h| \le |\mathbf{h}|$ 且 $|k| \le |\mathbf{h}|$，其中 $\mathbf{h} = (h,k)$，

$$ \frac{\big|\Delta f - (f_x(a,b)h + f_y(a,b)k)\big|}{|\mathbf{h}|} \le |\varepsilon_1| + |\varepsilon_2| \longrightarrow 0. $$

That is exactly the differentiability condition with $\nabla f(\mathbf{a})\cdot\mathbf{h} = f_x h + f_y k$ as the linear term. So $C^1$ implies differentiable, which is the workhorse theorem every formula in this unit quietly relies on.这正是以 $\nabla f(\mathbf{a})\cdot\mathbf{h} = f_x h + f_y k$ 为线性项的可微性条件。于是 $C^1$ 蕴含可微，这是本单元每条公式默默依赖的主力定理。

Going deeper: directional derivatives without differentiability深入探讨：有方向导数却不可微

Consider考虑

$$ f(x,y) = \begin{cases} \dfrac{x^2 y}{x^2 + y^2}, & (x,y)\neq(0,0), \\[4pt] 0, & (x,y)=(0,0). \end{cases} $$

Along any unit direction $\mathbf{u} = \langle a, b\rangle$ we compute沿任意单位方向 $\mathbf{u} = \langle a, b\rangle$ 计算

$$ D_{\mathbf{u}}f(0,0) = \lim_{t\to 0}\frac{f(ta,tb)}{t} = \lim_{t\to 0}\frac{1}{t}\cdot\frac{t^3 a^2 b}{t^2(a^2+b^2)} = \frac{a^2 b}{a^2+b^2} = a^2 b. $$

So every directional derivative exists at the origin. Yet if the gradient formula held we would need $D_{\mathbf{u}}f = \nabla f(0,0)\cdot\mathbf{u}$ to be linear in $\mathbf{u}$. Here $f_x(0,0) = 0$ and $f_y(0,0) = 0$, so the formula predicts $0$ for every direction, contradicting $a^2 b$ (take $a=b=1/\sqrt2$, giving $1/(2\sqrt2)\neq 0$). Hence $f$ is not differentiable at the origin, even though all directional derivatives exist.所以原点处每个方向导数都存在。然而若梯度公式成立，就需要 $D_{\mathbf{u}}f = \nabla f(0,0)\cdot\mathbf{u}$ 关于 $\mathbf{u}$ 是线性的。此处 $f_x(0,0) = 0$，$f_y(0,0) = 0$，故公式对每个方向都预测为 $0$，与 $a^2 b$ 矛盾（取 $a=b=1/\sqrt2$，得 $1/(2\sqrt2)\neq 0$）。因此尽管全部方向导数都存在，$f$ 在原点仍不可微。
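
The mismatch between the true directional derivative and the gradient prediction can be checked numerically. In this sketch the function name `directional_derivative_at_origin` and the step `t` are illustrative assumptions; the direction is the one used above, $a = b = 1/\sqrt2$.

```python
import math

def f(x, y):
    if x == 0 and y == 0:
        return 0.0
    return x**2 * y / (x**2 + y**2)

def directional_derivative_at_origin(theta, t=1e-6):
    """Difference quotient (f(t*u) - f(0, 0)) / t with u = <cos theta, sin theta>."""
    a, b = math.cos(theta), math.sin(theta)
    return (f(t * a, t * b) - f(0, 0)) / t

theta = math.pi / 4                        # a = b = 1/sqrt(2)
limit_value = directional_derivative_at_origin(theta)
predicted = math.cos(theta)**2 * math.sin(theta)   # a^2 * b
gradient_formula = 0.0 * math.cos(theta) + 0.0 * math.sin(theta)  # grad f(0,0) = <0, 0>
print(limit_value, predicted, gradient_formula)
```

Because $f(t\mathbf{u}) = t\,a^2 b$ exactly along each ray, the difference quotient does not depend on `t`; it matches $a^2 b = 1/(2\sqrt2)$ while the gradient formula returns $0$.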

Worked Example 7.1: partials exist but the function is not even continuous例题 7.1：偏导数存在，函数却连连续都不是

The previous counterexample still had all directional derivatives. This one is more dramatic: both partials exist at the origin, yet the function is discontinuous there, so it cannot possibly be differentiable. Let上一个反例好歹还拥有全部方向导数。这一个更极端：两个偏导数在原点都存在，函数却在那里不连续，因而绝不可能可微。设

$$ g(x,y) = \begin{cases} \dfrac{xy}{x^2 + y^2}, & (x,y)\neq(0,0), \\[4pt] 0, & (x,y)=(0,0). \end{cases} $$

The partial $g_x(0,0)$ uses only the slice $y = 0$, where $g(x,0) = 0$ for all $x$, so $g_x(0,0) = 0$. By symmetry $g_y(0,0) = 0$. Both partials exist.偏导数 $g_x(0,0)$ 只用到切片 $y = 0$，在该切片上对所有 $x$ 都有 $g(x,0) = 0$，故 $g_x(0,0) = 0$。由对称性 $g_y(0,0) = 0$。两个偏导数都存在。

But approach the origin along $y = x$: then $g(x,x) = \dfrac{x^2}{2x^2} = \dfrac12$ for every $x \neq 0$, whereas along $y = 0$ the value is $0$. The two path limits disagree, so $\lim_{(x,y)\to(0,0)} g$ does not exist and $g$ is discontinuous at the origin. Since differentiability implies continuity, $g$ is not differentiable at $(0,0)$ despite having both partial derivatives there. This is the cleanest possible warning that "partials exist" carries almost no analytic weight on its own.但沿 $y = x$ 趋近原点：对每个 $x \neq 0$ 有 $g(x,x) = \dfrac{x^2}{2x^2} = \dfrac12$，而沿 $y = 0$ 该值为 $0$。两条路径的极限不一致，故 $\lim_{(x,y)\to(0,0)} g$ 不存在，$g$ 在原点不连续。由于可微蕴含连续，尽管 $g$ 在 $(0,0)$ 处两个偏导数都存在，它在那里仍不可微。这是最干净不过的警示：单凭"偏导数存在"几乎不带任何分析上的分量。
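
The two path limits in Example 7.1 can be tabulated directly. This is a minimal sketch; the list names and the sequence of step sizes are illustrative assumptions.

```python
def g(x, y):
    if x == 0 and y == 0:
        return 0.0
    return x * y / (x * x + y * y)

steps = [10.0 ** (-k) for k in range(1, 8)]   # points approaching the origin

along_diagonal = [g(t, t) for t in steps]     # path y = x
along_x_axis = [g(t, 0.0) for t in steps]     # path y = 0

# Difference quotients for g_x(0, 0); they use only the slice y = 0.
g_x_quotients = [(g(t, 0.0) - g(0.0, 0.0)) / t for t in steps]

print(along_diagonal[-1], along_x_axis[-1], g_x_quotients[-1])
```

The diagonal values stay at $\tfrac12$ while the axis values and the difference quotients stay at $0$, which is the disagreement that rules out continuity at the origin even though $g_x(0,0)$ exists.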

Which statement is true?下列哪个说法是正确的？

7.1

Existence of both partials at a point guarantees differentiability there.某点处两个偏导数都存在就能保证该处可微。

Continuity of $f_x$ and $f_y$ near a point guarantees differentiability there.$f_x$ 与 $f_y$ 在某点附近连续就能保证该处可微。

If all directional derivatives exist, then $D_{\mathbf{u}}f=\nabla f\cdot\mathbf{u}$ must hold.若全部方向导数都存在，则 $D_{\mathbf{u}}f=\nabla f\cdot\mathbf{u}$ 必定成立。

Differentiability is weaker than the existence of partial derivatives.可微性比偏导数存在更弱。

Correct. Continuously differentiable (class $C^1$) implies differentiable; this is the standard sufficient condition.正确。连续可微（$C^1$ 类）蕴含可微；这是标准的充分条件。

Mere existence of partials, or even of all directional derivatives, does not imply differentiability or the gradient formula. Continuity of the partials is what suffices.仅有偏导数存在、甚至全部方向导数存在，都不蕴含可微性或梯度公式。真正充分的是偏导数的连续性。

Flashcards闪卡

Definition of the directional derivative $D_{\mathbf{u}}f(\mathbf{a})$ (limit form)方向导数 $D_{\mathbf{u}}f(\mathbf{a})$ 的定义（极限形式）

$D_{\mathbf{u}}f(\mathbf{a})=\lim_{t\to 0}\dfrac{f(\mathbf{a}+t\mathbf{u})-f(\mathbf{a})}{t}$, with $|\mathbf{u}|=1$.$D_{\mathbf{u}}f(\mathbf{a})=\lim_{t\to 0}\dfrac{f(\mathbf{a}+t\mathbf{u})-f(\mathbf{a})}{t}$，其中 $|\mathbf{u}|=1$。

Compute $D_{\mathbf{u}}f$ when $f$ is differentiable当 $f$ 可微时如何计算 $D_{\mathbf{u}}f$

$D_{\mathbf{u}}f(\mathbf{a})=\nabla f(\mathbf{a})\cdot\mathbf{u}$. Always normalize the direction first: $\mathbf{u}=\mathbf{v}/|\mathbf{v}|$.$D_{\mathbf{u}}f(\mathbf{a})=\nabla f(\mathbf{a})\cdot\mathbf{u}$。务必先把方向归一化：$\mathbf{u}=\mathbf{v}/|\mathbf{v}|$。

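Direction and rate of steepest ascent最速上升的方向与速率

Steepest ascent is in the direction of $\nabla f(\mathbf{a})$, and the maximum rate of change is $|\nabla f(\mathbf{a})|$.最速上升沿 $\nabla f(\mathbf{a})$ 方向，最大变化率为 $|\nabla f(\mathbf{a})|$。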

When is $D_{\mathbf{u}}f=0$?$D_{\mathbf{u}}f=0$ 何时成立？

When $\mathbf{u}\perp\nabla f$. These are the directions tangent to the level curve or surface.当 $\mathbf{u}\perp\nabla f$ 时。这些是与等值线或等值面相切的方向。

Tangent plane to a graph $z=f(x,y)$ at $(a,b)$图像 $z=f(x,y)$ 在 $(a,b)$ 处的切平面

$z=f(a,b)+f_x(a,b)(x-a)+f_y(a,b)(y-b)$.$z=f(a,b)+f_x(a,b)(x-a)+f_y(a,b)(y-b)$。

Linearization $L(x,y)$ of $f$ at $(a,b)$$f$ 在 $(a,b)$ 处的线性化 $L(x,y)$

$L(x,y)=f(a,b)+f_x(a,b)(x-a)+f_y(a,b)(y-b)$, and $f\approx L$ near $(a,b)$.$L(x,y)=f(a,b)+f_x(a,b)(x-a)+f_y(a,b)(y-b)$，且在 $(a,b)$ 附近 $f\approx L$。

Total differential $dz$全微分 $dz$

$dz=f_x\,dx+f_y\,dy$, used to estimate $\Delta z$ and propagate small measurement errors.$dz=f_x\,dx+f_y\,dy$，用于估计 $\Delta z$ 并传播小的测量误差。

Chain rule for $\dfrac{df}{dt}$, with $x(t),y(t)$当 $x(t),y(t)$ 时 $\dfrac{df}{dt}$ 的链式法则

$\dfrac{df}{dt}=f_x\dfrac{dx}{dt}+f_y\dfrac{dy}{dt}=\nabla f\cdot\mathbf{r}'(t)$.$\dfrac{df}{dt}=f_x\dfrac{dx}{dt}+f_y\dfrac{dy}{dt}=\nabla f\cdot\mathbf{r}'(t)$。

Gradient and a level surface $F(x,y,z)=k$梯度与等值面 $F(x,y,z)=k$

$\nabla F$ is normal (perpendicular) to the surface; it gives the surface normal and the normal line direction.$\nabla F$ 与曲面正交（垂直）；它给出曲面法向量和法线方向。

Tangent plane to a level surface $F=k$ at $\mathbf{a}$等值面 $F=k$ 在 $\mathbf{a}$ 处的切平面

$F_x(\mathbf{a})(x-a)+F_y(\mathbf{a})(y-b)+F_z(\mathbf{a})(z-c)=0$.$F_x(\mathbf{a})(x-a)+F_y(\mathbf{a})(y-b)+F_z(\mathbf{a})(z-c)=0$。

Sufficient condition for differentiability可微性的充分条件

If $f_x,f_y$ exist and are continuous near $\mathbf{a}$ (class $C^1$), then $f$ is differentiable at $\mathbf{a}$.若 $f_x,f_y$ 在 $\mathbf{a}$ 附近存在且连续（$C^1$ 类），则 $f$ 在 $\mathbf{a}$ 处可微。

Why the gradient formula can fail梯度公式为何可能失效

Existence of all directional derivatives does not imply differentiability. $D_{\mathbf{u}}f=\nabla f\cdot\mathbf{u}$ requires $f$ differentiable.全部方向导数都存在并不蕴含可微。$D_{\mathbf{u}}f=\nabla f\cdot\mathbf{u}$ 要求 $f$ 可微。

Check Yourself自我检测

Unit Quiz单元测验

For $f(x,y)=x^2-xy$, the gradient at $(2,1)$ is $\langle 3,-2\rangle$. Find $D_{\mathbf{u}}f(2,1)$ in the direction $\mathbf{v}=\langle 0,5\rangle$.设 $f(x,y)=x^2-xy$，其在 $(2,1)$ 处的梯度为 $\langle 3,-2\rangle$。求沿方向 $\mathbf{v}=\langle 0,5\rangle$ 的 $D_{\mathbf{u}}f(2,1)$。

$3$

$-10$

$-2$

$1$

Correct. $\mathbf{u}=\langle 0,1\rangle$, so $\langle 3,-2\rangle\cdot\langle 0,1\rangle=-2$.正确。$\mathbf{u}=\langle 0,1\rangle$，故 $\langle 3,-2\rangle\cdot\langle 0,1\rangle=-2$。

Normalize $\mathbf{v}$ to $\langle 0,1\rangle$, then dot with the gradient to get $-2$.把 $\mathbf{v}$ 归一化为 $\langle 0,1\rangle$，再与梯度点乘得 $-2$。

The direction of steepest descent of $f$ at a point is$f$ 在某点处的最速下降方向是

$-\nabla f$

$+\nabla f$

any vector perpendicular to $\nabla f$任何垂直于 $\nabla f$ 的向量

the zero vector零向量

Correct. The minimum directional derivative $-|\nabla f|$ occurs in the direction $-\nabla f$.正确。最小方向导数 $-|\nabla f|$ 在方向 $-\nabla f$ 上取得。

$+\nabla f$ is steepest ascent; perpendicular directions give rate zero. Steepest descent points opposite the gradient.$+\nabla f$ 是最速上升；垂直方向上速率为零。最速下降指向梯度的反方向。

The tangent plane to $z=\ln(x+y)$ at $(1,0)$ uses $f_x(1,0)=f_y(1,0)=1$ and $f(1,0)=0$. It is$z=\ln(x+y)$ 在 $(1,0)$ 处的切平面用到 $f_x(1,0)=f_y(1,0)=1$ 和 $f(1,0)=0$。它是

$z=x+y$

$z=1+(x-1)+y$

$z=(x-1)y$

$z=(x-1)+y$

Correct. $z=0+1(x-1)+1(y-0)=(x-1)+y$.正确。$z=0+1(x-1)+1(y-0)=(x-1)+y$。

Use $f(a,b)+f_x(x-a)+f_y(y-b)=0+(x-1)+(y-0)$. The base value is $0$, not $1$.套用 $f(a,b)+f_x(x-a)+f_y(y-b)=0+(x-1)+(y-0)$。基准值是 $0$，不是 $1$。

A cylinder has radius $r=5$ and height $h=10$, each measured with error up to $\pm0.1$. Using $V=\pi r^2 h$, the estimate of $|dV|$ is closest to某圆柱半径 $r=5$、高 $h=10$，每个测量误差不超过 $\pm0.1$。用 $V=\pi r^2 h$，则 $|dV|$ 的估计最接近

$5\pi$

$12.5\pi$

$25\pi$

$2.5\pi$

Correct. $dV=2\pi r h\,dr+\pi r^2\,dh$. Max $|dV|=2\pi(5)(10)(0.1)+\pi(25)(0.1)=10\pi+2.5\pi=12.5\pi$.正确。$dV=2\pi r h\,dr+\pi r^2\,dh$。最大 $|dV|=2\pi(5)(10)(0.1)+\pi(25)(0.1)=10\pi+2.5\pi=12.5\pi$。

Add both contributions: $2\pi r h\,dr=10\pi$ and $\pi r^2\,dh=2.5\pi$, totaling $12.5\pi$.把两项贡献相加：$2\pi r h\,dr=10\pi$ 与 $\pi r^2\,dh=2.5\pi$，合计 $12.5\pi$。

With $z=f(x,y)$, $x=s^2$, $y=st$, the term in $\partial z/\partial t$ coming through $y$ is当 $z=f(x,y)$，$x=s^2$，$y=st$ 时，$\partial z/\partial t$ 中经由 $y$ 的那一项是

$f_x\cdot 2s$

$f_y\cdot t$

$f_y\cdot s$

$f_x\cdot s$

Correct. $\partial y/\partial t=s$, so the path through $y$ contributes $f_y\cdot s$.正确。$\partial y/\partial t=s$，故经由 $y$ 的路径贡献 $f_y\cdot s$。

Differentiate $y=st$ with respect to $t$: $\partial y/\partial t=s$. The $y$-path term is $f_y\cdot s$.对 $y=st$ 关于 $t$ 求导：$\partial y/\partial t=s$。$y$ 路径项为 $f_y\cdot s$。

The gradient $\nabla F(1,1,1)$, which is normal to the surface $x^2+2y^2+3z^2=6$ at $(1,1,1)$, is曲面 $x^2+2y^2+3z^2=6$ 在 $(1,1,1)$ 处的梯度 $\nabla F(1,1,1)$（即该处的一个法向量）是

$\langle 2,4,6\rangle$

$\langle 1,1,1\rangle$

$\langle 1,2,3\rangle$

$\langle 2,2,2\rangle$

Correct. $\nabla F=\langle 2x,4y,6z\rangle$, which at $(1,1,1)$ is $\langle 2,4,6\rangle$.正确。$\nabla F=\langle 2x,4y,6z\rangle$，在 $(1,1,1)$ 处为 $\langle 2,4,6\rangle$。

The normal is $\nabla F=\langle 2x,4y,6z\rangle$ evaluated at the point, giving $\langle 2,4,6\rangle$.法向量是 $\nabla F=\langle 2x,4y,6z\rangle$ 在该点取值，得 $\langle 2,4,6\rangle$。

Before You Move On继续之前

Readiness Checklist学习自测清单

Tap each item you can do without notes.点击你无需参考资料即可完成的项目。

Compute a directional derivative as a normalized gradient dot product.把方向导数计算为归一化后的梯度点积。
State the steepest-ascent direction and the maximum rate of change at a point.说出某点处的最速上升方向和最大变化率。
Write the tangent plane to a graph z = f(x,y) using the partial derivatives.用偏导数写出图像 z = f(x,y) 的切平面。
Build the linearization L(x,y) and use it to estimate a nearby value.构造线性化 L(x,y) 并用它估计附近的数值。
Use the total differential dz to estimate change and propagate measurement error.用全微分 dz 估计改变量并传播测量误差。
Apply the multivariable chain rule along a path and for chains of variables.沿路径以及对多层变量链应用多元链式法则。
Find the tangent plane and normal line to a level surface using the gradient.用梯度求等值面的切平面与法线。
Explain why directional derivatives can exist where f is not differentiable.解释为何在 f 不可微处方向导数仍可能存在。

