University Calculus · Calculus III

Unit C5: Optimization and Lagrange Multipliers

Find and classify the extrema of multivariable functions, search closed regions for global extrema, and optimize under one or two constraints with Lagrange multipliers.

Calculus III · Multivariable · Vector Calculus · MIT 18.02 / GT 2551 / Princeton MAT 201
Read me first. This unit develops the full optimization toolkit for functions of several variables. We start with critical points and the second derivatives test, extend to absolute extrema on closed bounded regions, and then build the method of Lagrange multipliers for one and two constraints. Each idea is grounded in the first-order condition that the gradient vanishes or aligns with the constraint gradients, and we close by examining degenerate cases where the standard tests fall silent.

Critical Points

Key idea. A local extremum of a differentiable function of several variables can occur only where every first partial derivative vanishes. Such a point is called a critical point. The vanishing of the gradient is necessary, not sufficient, so a critical point may be a maximum, a minimum, or a saddle point.

Let $f(x,y)$ be defined on an open set $D\subseteq\mathbb{R}^2$. We say $f$ has a local maximum at $(a,b)$ if $f(x,y)\le f(a,b)$ for all $(x,y)$ in some disk centered at $(a,b)$, and a local minimum if the reverse inequality holds.

Definition: critical point
$$(a,b)\ \text{is critical}\iff f_x(a,b)=0\ \text{and}\ f_y(a,b)=0,\quad\text{i.e.}\ \nabla f(a,b)=\mathbf{0},$$

or one of the partials fails to exist. The geometric content is that the tangent plane $z=f(a,b)+f_x(a,b)(x-a)+f_y(a,b)(y-b)$ is horizontal at an interior extremum.

Fermat's theorem (multivariable form)
$$\text{If } f \text{ has a local extremum at an interior point } (a,b) \text{ and } \nabla f(a,b) \text{ exists, then } \nabla f(a,b)=\mathbf{0}.$$
Going deeper: why the gradient must vanish at an interior extremum

Suppose $f$ has a local maximum at the interior point $(a,b)$. Restrict $f$ to the horizontal line $y=b$, giving the single-variable function $g(x)=f(x,b)$. Then $g$ has a local maximum at $x=a$, so by the single-variable Fermat theorem $g'(a)=0$. But $g'(a)=f_x(a,b)$, hence $f_x(a,b)=0$.

The identical argument along $x=a$ with $h(y)=f(a,y)$ gives $h'(b)=f_y(a,b)=0$. Therefore $\nabla f(a,b)=\mathbf{0}$.

$$g'(a)=\lim_{x\to a}\frac{f(x,b)-f(a,b)}{x-a}=f_x(a,b)=0.$$
Worked Example 1.1: locating critical points

Find all critical points of $f(x,y)=x^3+y^3-3xy$.

Compute the partials and set them to zero:

$$f_x=3x^2-3y=0,\qquad f_y=3y^2-3x=0.$$

From the first equation $y=x^2$. Substituting into the second gives $3x^4-3x=0$, so $3x(x^3-1)=0$, giving $x=0$ or $x=1$. Then $y=x^2$ yields the points

$$(0,0)\quad\text{and}\quad(1,1).$$

Both satisfy $\nabla f=\mathbf{0}$, and the system has no other real solutions (the only real root of $x^3=1$ is $x=1$), so these are the only critical points.
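The back-substitution step can be checked mechanically. The following is a minimal sketch, not part of the original notes: the helper name `grad_f` is hypothetical, and the partials are typed in by hand from the formulas above. It also evaluates the gradient at $(2,4)$, an illustrative point that lies on $y=x^2$ (so $f_x=0$) but is not critical.

```python
# Gradient of f(x, y) = x^3 + y^3 - 3xy, entered by hand from the partials above.
def grad_f(x, y):
    fx = 3 * x**2 - 3 * y
    fy = 3 * y**2 - 3 * x
    return (fx, fy)

candidates = [(0, 0), (1, 1)]

# A point is critical only if BOTH components vanish at the same (x, y).
is_critical = {p: grad_f(*p) == (0, 0) for p in candidates}
print(is_critical)

# (2, 4) satisfies f_x = 0 alone, so it is not a critical point.
print(grad_f(2, 4))
```

Integer inputs keep the arithmetic exact, so the equality test against `(0, 0)` is safe here; with floating-point candidates a tolerance would be needed instead.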

Worked Example 1.2: a critical point along a line

Find all critical points of $f(x,y)=x^2-2xy+y^2+2x$.

The partials are

$$f_x=2x-2y+2,\qquad f_y=-2x+2y.$$

Setting $f_y=0$ gives $x=y$. Substituting into $f_x=0$ gives $2x-2x+2=2\ne 0$, a contradiction. The system is therefore inconsistent, so $f$ has no critical points at all. This is expected behaviour: the quadratic part $x^2-2xy+y^2=(x-y)^2$ is degenerate (a trough along the line $y=x$, not a bowl), and the linear term $2x$ tilts the trough so that its floor never flattens. A function need not have any critical point.

Worked Example 1.3: a non-differentiable critical point

Locate and describe the extremum of $f(x,y)=\sqrt{x^2+y^2}$, the distance from the origin.

Away from the origin the partials are

$$f_x=\frac{x}{\sqrt{x^2+y^2}},\qquad f_y=\frac{y}{\sqrt{x^2+y^2}},$$

which never both vanish, since $f_x^2+f_y^2=1$ wherever they are defined. So there is no point where $\nabla f=\mathbf{0}$. Yet $f$ has an obvious global minimum of $0$ at the origin. The resolution is that $f$ is not differentiable at $(0,0)$: the cone has a sharp tip there. The full definition of a critical point includes points where a partial derivative fails to exist, and this is exactly such a point.

Common error. Students often solve $f_x=0$ and $f_y=0$ separately and pair up the roots independently, reporting every combination as a critical point. The equations must be solved as a simultaneous system: a point is critical only if it satisfies both at once. In Worked Example 1.1, $f_x=0$ gives $y=x^2$ and that relation must be carried into $f_y=0$; treating $x$ from one equation and $y$ from the other would invent points where the surface is not actually flat. Always back-substitute and verify $\nabla f=\mathbf{0}$ at each candidate.
Which condition is necessary for $f(x,y)$ to have a local extremum at an interior point where it is differentiable?
1.1
The Hessian determinant is positive.
$\nabla f=\mathbf{0}$ at the point.
$f_{xx}>0$ at the point.
$f$ is bounded on its domain.
Correct. At an interior differentiable extremum the gradient must vanish; this is the multivariable Fermat theorem.
A vanishing gradient is the necessary first-order condition. The Hessian and second partials classify a critical point but are not necessary for an extremum to exist, and boundedness is unrelated.

The Second Derivatives Test

Key idea. The sign of the Hessian determinant, together with the sign of $f_{xx}$, classifies a nondegenerate critical point as a local maximum, a local minimum, or a saddle point. The test reads the local quadratic shape of the surface.
Discriminant (Hessian determinant)
$$D(a,b)=f_{xx}(a,b)\,f_{yy}(a,b)-\big[f_{xy}(a,b)\big]^2=\det\begin{pmatrix}f_{xx}&f_{xy}\\ f_{xy}&f_{yy}\end{pmatrix}.$$

Let $(a,b)$ be a critical point of $f$ with continuous second partials. The classification is:

Classification
$$\begin{aligned} D>0,\ f_{xx}>0 &\implies \text{local minimum},\\ D>0,\ f_{xx}<0 &\implies \text{local maximum},\\ D<0 &\implies \text{saddle point},\\ D=0 &\implies \text{test inconclusive}. \end{aligned}$$

When $D>0$ the two eigenvalues of the Hessian share a sign, so $f_{yy}$ has the same sign as $f_{xx}$ and either may be used. When $D<0$ the eigenvalues have opposite signs, producing a directional ascent and a directional descent through the point.
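The classification table translates directly into a few lines of code. The sketch below is illustrative only: the function name `classify` is an assumption, and it expects the three second partials already evaluated at a critical point. The sample calls use the second partials of $f=x^3+y^3-3xy$ from Worked Example 1.1, namely $f_{xx}=6x$, $f_{yy}=6y$, $f_{xy}=-3$.

```python
def classify(fxx, fyy, fxy):
    """Second derivatives test, given the second partials at a critical point."""
    d = fxx * fyy - fxy**2          # discriminant D
    if d > 0:
        # D > 0 forces fxx != 0, so its sign alone separates min from max.
        return "local minimum" if fxx > 0 else "local maximum"
    if d < 0:
        return "saddle point"
    return "inconclusive"           # D = 0: the test gives no information

print(classify(6, 6, -3))    # x^3 + y^3 - 3xy at (1, 1)
print(classify(0, 0, -3))    # x^3 + y^3 - 3xy at (0, 0)
print(classify(2, 8, 4))     # a made-up case with D = 0
```

Note that the function never inspects $f_{yy}$ on its own; only the discriminant and the sign of $f_{xx}$ matter, exactly as in the table.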

Going deeper: the test from the quadratic Taylor expansion

Near a critical point $(a,b)$, write $\Delta x=x-a$, $\Delta y=y-b$. Because $\nabla f(a,b)=\mathbf{0}$, the second-order Taylor expansion is

$$f(x,y)-f(a,b)\approx \tfrac12\big(f_{xx}\,\Delta x^2+2f_{xy}\,\Delta x\,\Delta y+f_{yy}\,\Delta y^2\big).$$

Write $Q$ for the quadratic form on the right-hand side, with all second partials evaluated at $(a,b)$. Assume $f_{xx}\ne 0$ and complete the square in $\Delta x$:

$$Q=\frac{1}{2f_{xx}}\Big[(f_{xx}\,\Delta x+f_{xy}\,\Delta y)^2+(f_{xx}f_{yy}-f_{xy}^2)\,\Delta y^2\Big].$$

The bracket contains a perfect square plus $D\,\Delta y^2$. If $D>0$ the bracket is positive for all nonzero displacements, so $Q$ has the sign of $f_{xx}$: a minimum when $f_{xx}>0$, a maximum when $f_{xx}<0$. If $D<0$ the bracket changes sign, giving a saddle. This is exactly the test.

Worked Example 2.1: classifying critical points

Classify the critical points of $f(x,y)=x^3+y^3-3xy$ found earlier, namely $(0,0)$ and $(1,1)$.

The second partials are $f_{xx}=6x$, $f_{yy}=6y$, $f_{xy}=-3$, so $D=36xy-9$.

At $(0,0)$: $D=-9<0$, so the origin is a saddle point.

At $(1,1)$: $D=36-9=27>0$ and $f_{xx}=6>0$, so $(1,1)$ is a local minimum, with value $f(1,1)=-1$.

Worked Example 2.2: a function with several critical points

Find and classify all critical points of $f(x,y)=x^4+y^4-4xy+1$.

The partials are $f_x=4x^3-4y$ and $f_y=4y^3-4x$. Setting both to zero gives $y=x^3$ and $x=y^3$. Substituting, $x=(x^3)^3=x^9$, so $x^9-x=0$, that is $x(x^8-1)=0$. The real roots are $x=0,\ x=1,\ x=-1$, giving the critical points

$$(0,0),\qquad (1,1),\qquad (-1,-1).$$

The second partials are $f_{xx}=12x^2$, $f_{yy}=12y^2$, $f_{xy}=-4$, so $D=144x^2y^2-16$.

At $(0,0)$: $D=0-16=-16<0$, a saddle. At $(1,1)$ and $(-1,-1)$: $D=144-16=128>0$ with $f_{xx}=12>0$, so both are local minima, each with value $f=1+1-4+1=-1$. This is a classic two-well surface with a saddle ridge separating the wells.

Worked Example 2.3: a local maximum

Classify the critical point of $f(x,y)=4+2x+2y-x^2-y^2$.

Setting $f_x=2-2x=0$ and $f_y=2-2y=0$ gives the single critical point $(1,1)$. The second partials are $f_{xx}=-2$, $f_{yy}=-2$, $f_{xy}=0$, so

$$D=(-2)(-2)-0^2=4>0,\qquad f_{xx}=-2<0.$$

Therefore $(1,1)$ is a local maximum, with value $f(1,1)=4+2+2-1-1=6$. Completing the square confirms it: $f=6-(x-1)^2-(y-1)^2\le 6$, so the maximum is global as well.

Common error. A widespread mistake is to conclude "local minimum" whenever $f_{xx}>0$ and $f_{yy}>0$ (or "local maximum" whenever both are negative) without computing $D$. The discriminant, not the individual second partials, decides: a large cross term $f_{xy}$ can make $D<0$ and turn an apparent bowl into a saddle. (When $f_{xx}$ and $f_{yy}$ have opposite signs, $D<0$ automatically and the point really is a saddle.) Equally common is reading $D>0$ alone as "minimum"; you must then check the sign of $f_{xx}$ to tell a max from a min. And $D=0$ never means "saddle": it means the test says nothing, and you must analyse $f$ directly along curves through the point.
At a critical point, $f_{xx}=2$, $f_{yy}=8$, and $f_{xy}=5$. What is the classification?
2.1
Local minimum.
Local maximum.
Inconclusive.
Saddle point.
Correct. $D=(2)(8)-5^2=16-25=-9<0$, so the point is a saddle.
Compute $D=f_{xx}f_{yy}-f_{xy}^2=16-25=-9$. Since $D<0$ the point is a saddle regardless of the sign of $f_{xx}$.

Absolute Extrema on Closed Regions

Key idea. A continuous function on a closed and bounded region attains a global maximum and a global minimum. These occur either at an interior critical point or on the boundary, so the strategy is to gather all interior critical values and all boundary extreme values, then compare.
Extreme value theorem (two variables)
$$f \text{ continuous on a closed, bounded } R\subset\mathbb{R}^2 \implies f \text{ attains an absolute max and min on } R.$$

The closed-region method has three steps:

Closed-region procedure
$$\begin{aligned} &\text{1. Find interior critical points; evaluate } f \text{ there.}\\ &\text{2. Optimize } f \text{ on each boundary piece (reduce to one variable or use Lagrange).}\\ &\text{3. Include corner points; the largest and smallest of all values are the absolute extrema.} \end{aligned}$$

On a boundary curve one typically parametrizes the edge and reduces to a single-variable optimization, then checks the endpoints of each edge as well.

Worked Example 3.1: extrema on a rectangle

Find the absolute extrema of $f(x,y)=x^2+y^2-x-y$ on the square $R=[0,1]\times[0,1]$.

Interior. Setting $f_x=2x-1=0$ and $f_y=2y-1=0$ gives the single interior critical point $(\tfrac12,\tfrac12)$ with $f=\tfrac14+\tfrac14-\tfrac12-\tfrac12=-\tfrac12$.

Boundary. On $y=0$: $f=x^2-x$, minimized at $x=\tfrac12$ giving $-\tfrac14$, with endpoint values $f(0,0)=0$, $f(1,0)=0$. By symmetry the edges $x=0$, $x=1$, $y=1$ give the same range, with $f(1,1)=0$.

Compare. The candidate values are $-\tfrac12$, $-\tfrac14$, and $0$. The absolute minimum is $-\tfrac12$ at $(\tfrac12,\tfrac12)$ and the absolute maximum is $0$ at the corners.
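The comparison step is just a table lookup once the candidates are listed. As a hedged illustration (the candidate list is typed in by hand from the work above, and the variable names are assumptions), exact rational arithmetic avoids any rounding doubt:

```python
from fractions import Fraction as F

def f(x, y):
    return x**2 + y**2 - x - y

half = F(1, 2)
candidates = [
    (half, half),                                             # interior critical point
    (half, F(0)), (half, F(1)),                               # edge critical points on y = 0, y = 1
    (F(0), half), (F(1), half),                               # edge critical points on x = 0, x = 1
    (F(0), F(0)), (F(1), F(0)), (F(0), F(1)), (F(1), F(1)),   # corners
]

values = {pt: f(*pt) for pt in candidates}
abs_min = min(values.values())
abs_max = max(values.values())
argmin = [pt for pt, v in values.items() if v == abs_min]
argmax = [pt for pt, v in values.items() if v == abs_max]
print(abs_min, argmin)
print(abs_max, argmax)
```

The code only compares candidates; finding them (steps 1 and 2 of the procedure) still has to be done analytically.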

Worked Example 3.2: extrema on a triangular region

Find the absolute extrema of $f(x,y)=1+4x-5y$ on the closed triangle $T$ with vertices $(0,0)$, $(2,0)$, and $(0,3)$.

Interior. Since $f_x=4$ and $f_y=-5$ never vanish, there are no interior critical points. A linear function on a polygon attains its extrema only on the boundary, in fact at the vertices.

Edges. We still check each edge to be systematic. On $y=0$, $0\le x\le 2$: $f=1+4x$ runs from $1$ to $9$. On $x=0$, $0\le y\le 3$: $f=1-5y$ runs from $1$ down to $-14$. On the hypotenuse from $(2,0)$ to $(0,3)$, parametrize $x=2-2t$, $y=3t$, $t\in[0,1]$: $f=1+4(2-2t)-15t=9-23t$, decreasing from $9$ to $-14$.

Compare. The vertex values are $f(0,0)=1$, $f(2,0)=9$, $f(0,3)=-14$. The absolute maximum is $9$ at $(2,0)$ and the absolute minimum is $-14$ at $(0,3)$.

Worked Example 3.3: a curved boundary handled with substitution

Find the absolute extrema of $f(x,y)=x^2+2y^2-x$ on the closed disk $x^2+y^2\le 1$.

Interior. $f_x=2x-1=0$ and $f_y=4y=0$ give the single critical point $(\tfrac12,0)$, which lies inside the disk. There $f=\tfrac14-\tfrac12=-\tfrac14$.

Boundary. On $x^2+y^2=1$ replace $y^2=1-x^2$, with $-1\le x\le 1$. Then

$$f=x^2+2(1-x^2)-x=-x^2-x+2.$$

This single-variable function has $\tfrac{d}{dx}(-x^2-x+2)=-2x-1=0$ at $x=-\tfrac12$, giving $f=-\tfrac14+\tfrac12+2=\tfrac94$. Check the interval endpoints $x=\pm 1$ (where $y=0$): $f(1,0)=0$ and $f(-1,0)=2$.

Compare. Candidates are $-\tfrac14$, $\tfrac94$, $0$, $2$. The absolute minimum is $-\tfrac14$ at $(\tfrac12,0)$ and the absolute maximum is $\tfrac94$ at $\left(-\tfrac12,\pm\tfrac{\sqrt3}{2}\right)$.
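A numerical sanity check is possible by sampling the boundary circle in the angle $\theta$, a different parametrization from the substitution used above. This is a rough sketch under stated assumptions: the sample count `n` is an arbitrary illustrative choice, and sampling can only approximate extrema, not prove them.

```python
import math

def f(x, y):
    return x**2 + 2 * y**2 - x

n = 3600  # boundary samples; arbitrary, chosen so theta = 2*pi/3 is on the grid
boundary = [
    f(math.cos(2 * math.pi * k / n), math.sin(2 * math.pi * k / n))
    for k in range(n)
]
interior_value = f(0.5, 0.0)          # value at the interior critical point

abs_max = max(boundary + [interior_value])
abs_min = min(boundary + [interior_value])
print(abs_max, abs_min)
```

The sampled maximum agrees with $\tfrac94$ up to floating-point error, and the minimum comes from the interior critical point, matching the comparison above.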

Going deeper: why a closed bounded region guarantees extrema

The extreme value theorem in two variables rests on two topological facts. First, a closed and bounded subset of $\mathbb{R}^2$ is compact (the Heine-Borel theorem). Second, a continuous function maps a compact set to a compact set, and a compact subset of $\mathbb{R}$ is closed and bounded, hence contains its supremum and infimum.

To see why both hypotheses are needed, drop one at a time. On the open disk $x^2+y^2<1$ the function $f=x$ has supremum $1$ but never attains it, because the boundary where $x=1$ is excluded. On the unbounded strip $0\le x$, $0\le y\le 1$ the function $f=x$ is continuous but has no maximum. Closedness supplies the boundary; boundedness keeps the values from escaping to infinity. Remove either and the guarantee fails.

This is exactly why the closed-region procedure must scan the entire boundary, not just interior critical points: the theorem promises the extremum exists somewhere on the compact set, and the boundary is precisely where it hides when no interior critical point wins.

Common error. The most frequent slip on closed-region problems is forgetting the endpoints of each boundary edge. Reducing an edge to a one-variable function and finding its interior critical point is only half the job: the corners where edges meet are candidates too, and on a curved boundary the parameter interval endpoints must be evaluated. A close second is parametrizing the boundary but then dropping the constraint that the parameter stays in its interval, which can produce a fake extreme value lying outside the region.
When finding the absolute extrema of a continuous $f$ on a closed bounded region, which set of candidates must be compared?
3.1
Interior critical points and boundary extreme points (including corners).
Only interior critical points.
Only the four corners of the region.
Points where the Hessian determinant is zero.
Correct. Global extrema occur at interior critical points or on the boundary, so both sets of candidates must be evaluated and compared.
A global extremum can sit on the boundary even when no interior critical point is extreme, so interior points alone are not enough, and boundaries are more than just corners.

Lagrange Multipliers: One Constraint

Key idea. To extremize $f$ subject to a constraint $g=k$, look for points where the gradient of $f$ is parallel to the gradient of the constraint. At a constrained extremum the level curve of $f$ is tangent to the constraint curve, so their gradients are scalar multiples of each other.
Lagrange condition (one constraint)
$$\nabla f(\mathbf{x})=\lambda\,\nabla g(\mathbf{x}),\qquad g(\mathbf{x})=k,$$

where the scalar $\lambda$ is the Lagrange multiplier. In two variables this is the system $f_x=\lambda g_x$, $f_y=\lambda g_y$, $g(x,y)=k$, three equations in the three unknowns $x,y,\lambda$. The method requires $\nabla g\ne\mathbf{0}$ on the constraint set.

Going deeper: why the gradients must be parallel

Parametrize the constraint curve $g(x,y)=k$ as $\mathbf{r}(t)$. Along it, define $\phi(t)=f(\mathbf{r}(t))$. At a constrained extremum $\phi'(t_0)=0$. By the chain rule,

$$\phi'(t_0)=\nabla f\cdot \mathbf{r}'(t_0)=0,$$

so $\nabla f$ is orthogonal to the tangent $\mathbf{r}'(t_0)$. But $\nabla g$ is also orthogonal to that tangent, since $g$ is constant along the curve and $\tfrac{d}{dt}g(\mathbf{r}(t))=\nabla g\cdot\mathbf{r}'=0$. In the plane, two vectors orthogonal to the same nonzero direction are parallel, hence

$$\nabla f=\lambda\,\nabla g.$$
Worked Example 4.1: extremize on a circle

Find the extreme values of $f(x,y)=xy$ subject to $x^2+y^2=8$.

Here $g=x^2+y^2$, so $\nabla f=(y,x)$ and $\nabla g=(2x,2y)$. The Lagrange system is

$$y=2\lambda x,\qquad x=2\lambda y,\qquad x^2+y^2=8.$$

Substituting the first into the second gives $x=2\lambda(2\lambda x)=4\lambda^2 x$, so $x=0$ or $\lambda^2=\tfrac14$. If $x=0$ then $y=0$, which violates the constraint, so $\lambda=\pm\tfrac12$, giving $y=\pm x$.

With $y=x$: $2x^2=8$, so $x=\pm 2$ and $f=4$. With $y=-x$: $f=-4$. The maximum is $4$ and the minimum is $-4$.
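As an independent cross-check that avoids multipliers altogether, the circle can be parametrized directly. The parameter $t$ is introduced here only for illustration and is not part of the Lagrange method:

$$x=2\sqrt2\cos t,\quad y=2\sqrt2\sin t\ \implies\ f=xy=8\cos t\sin t=4\sin 2t\in[-4,4].$$

The extreme values $\pm4$ occur where $\sin 2t=\pm1$, which is where $y=\pm x$, in agreement with the Lagrange solution.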

Worked Example 4.2: maximize a product under a sum constraint

Maximize $f(x,y,z)=xyz$ subject to $x+y+z=12$ with $x,y,z>0$.

Here $\nabla f=(yz,xz,xy)$ and $\nabla g=(1,1,1)$. The Lagrange system is

$$yz=\lambda,\qquad xz=\lambda,\qquad xy=\lambda,\qquad x+y+z=12.$$

From $yz=xz$ and $z>0$ we get $x=y$. From $xz=xy$ and $x>0$ we get $y=z$. Hence $x=y=z$, and the constraint gives $3x=12$, so $x=y=z=4$. The maximum is

$$f(4,4,4)=64.$$

This is the three-variable AM-GM inequality in disguise: among positive numbers with a fixed sum, the product is largest when they are equal.
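A coarse grid search gives a quick plausibility check of the result. The sketch below is an assumption-laden illustration, not a proof: the grid spacing `step` is arbitrary, and only points with $x,y,z>0$ on the plane $x+y+z=12$ are visited.

```python
# Brute-force check of Worked Example 4.2 on a coarse grid.
step = 0.5                  # illustrative grid spacing
n = int(12 / step)          # number of grid steps along each axis

best_value, best_point = 0.0, None
for i in range(1, n):
    for j in range(1, n - i):          # keeps z = 12 - x - y strictly positive
        x, y = i * step, j * step
        z = 12 - x - y
        value = x * y * z
        if value > best_value:
            best_value, best_point = value, (x, y, z)

print(best_value, best_point)
```

Because the true maximizer $(4,4,4)$ happens to lie on this grid, the search recovers it exactly; on a grid that misses it, the search would only return a nearby point.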

Worked Example 4.3: an inequality proved by Lagrange

Find the extreme values of $f(x,y)=2x+y$ on the ellipse $x^2+4y^2=1$, and read off the resulting inequality.

With $g=x^2+4y^2$, $\nabla f=(2,1)$ and $\nabla g=(2x,8y)$. The system is

$$2=2\lambda x,\qquad 1=8\lambda y,\qquad x^2+4y^2=1.$$

So $x=\tfrac{1}{\lambda}$ and $y=\tfrac{1}{8\lambda}$. Substituting into the constraint, $\tfrac{1}{\lambda^2}+4\cdot\tfrac{1}{64\lambda^2}=1$, that is $\tfrac{1}{\lambda^2}\left(1+\tfrac{1}{16}\right)=1$, so $\lambda^2=\tfrac{17}{16}$ and $\lambda=\pm\tfrac{\sqrt{17}}{4}$.

Then $f=2x+y=\tfrac{2}{\lambda}+\tfrac{1}{8\lambda}=\tfrac{17}{8\lambda}=\pm\tfrac{17}{8}\cdot\tfrac{4}{\sqrt{17}}=\pm\tfrac{\sqrt{17}}{2}$. So the maximum is $\tfrac{\sqrt{17}}{2}$ and the minimum is $-\tfrac{\sqrt{17}}{2}$. Equivalently $|2x+y|\le \tfrac{\sqrt{17}}{2}$ for every point on the ellipse, a sharp bound attained at the two tangency points.
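The same bound can be confirmed without multipliers by the Cauchy-Schwarz inequality, which serves as a useful cross-check on the arithmetic. Writing $2x+y=2\cdot x+\tfrac12\cdot(2y)$ and using the constraint $x^2+(2y)^2=1$:

$$(2x+y)^2\le\Big(2^2+\big(\tfrac12\big)^2\Big)\big(x^2+(2y)^2\big)=\tfrac{17}{4}\cdot 1,\qquad\text{so}\qquad |2x+y|\le\tfrac{\sqrt{17}}{2}.$$

Equality holds when $(x,2y)$ is proportional to $(2,\tfrac12)$, which matches the Lagrange relations $x=\tfrac1\lambda$, $y=\tfrac{1}{8\lambda}$.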

Common error. A frequent mistake is to cancel a variable when manipulating the Lagrange equations without recording the case where that variable is zero. From $yz=xz$ one divides by $z$ to get $x=y$, but $z=0$ is a separate branch that must be examined (it may satisfy or violate the constraint and can hide an extremum). A second error is solving for $\lambda$ and stopping: $\lambda$ is only an auxiliary unknown. You must return to find the actual points $(x,y)$ and evaluate $f$ there, then compare, because Lagrange produces candidates, not a labelled max and min.
The Lagrange condition for extremizing $f$ subject to $g=k$ states that at a solution:
4.1
$\nabla f=\mathbf{0}$ and $g=k$.
$\nabla f$ is orthogonal to $\nabla g$.
$\nabla f=\lambda\,\nabla g$ for some scalar $\lambda$, with $g=k$.
$f=g$ at the optimal point.
Correct. The gradients are parallel at a constrained extremum, expressed as $\nabla f=\lambda\nabla g$, together with the constraint equation.
At a constrained extremum the gradient of $f$ need not vanish; instead it is parallel (not orthogonal) to $\nabla g$, which is the relation $\nabla f=\lambda\nabla g$.

Lagrange Multipliers: Two Constraints

Key idea. With two constraints in space, the feasible set is the curve of intersection of two surfaces. At a constrained extremum the gradient of $f$ lies in the plane spanned by the two constraint gradients, so $\nabla f$ is a linear combination of $\nabla g$ and $\nabla h$.
Lagrange condition (two constraints)
$$\nabla f=\lambda\,\nabla g+\mu\,\nabla h,\qquad g(\mathbf{x})=k_1,\qquad h(\mathbf{x})=k_2.$$

In three variables this gives five equations ($f_x=\lambda g_x+\mu h_x$ and its $y,z$ analogues, plus the two constraints) in the five unknowns $x,y,z,\lambda,\mu$. The method requires $\nabla g$ and $\nabla h$ to be linearly independent along the intersection curve.

The geometry: the intersection curve has tangent direction $\nabla g\times\nabla h$. At an extremum $\nabla f$ must be orthogonal to this tangent, which forces $\nabla f$ into the span of $\nabla g$ and $\nabla h$.

Worked Example 5.1: two plane constraints

Maximize $f(x,y,z)=x+2y+3z$ subject to the two constraints $g=x+y+z=1$ and $h=x-y+z=0$. The feasible set is the line of intersection of the two planes.

Gradients: $\nabla f=(1,2,3)$, $\nabla g=(1,1,1)$, $\nabla h=(1,-1,1)$. The condition $\nabla f=\lambda\nabla g+\mu\nabla h$ gives the component equations

$$1=\lambda+\mu,\qquad 2=\lambda-\mu,\qquad 3=\lambda+\mu.$$

The first and third equations conflict ($1\ne 3$), so $\nabla f$ does not lie in the span of $\nabla g,\nabla h$ at any point. Geometrically $f$ is linear and the feasible set is a line, so $f$ is unbounded along it and has no finite extremum. This illustrates how an inconsistent Lagrange system signals that no constrained extremum exists (here $\nabla g$ and $\nabla h$ are linearly independent, so the method applies).
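The unboundedness can be seen explicitly by parametrizing the line. The base point and parameter $t$ below are chosen for illustration: subtracting the constraints gives $y=\tfrac12$ and $x+z=\tfrac12$, and the direction $\nabla g\times\nabla h=(2,0,-2)$ is rescaled to $(1,0,-1)$.

$$\mathbf{r}(t)=\big(\tfrac12+t,\ \tfrac12,\ -t\big),\qquad f(\mathbf{r}(t))=\big(\tfrac12+t\big)+2\cdot\tfrac12+3(-t)=\tfrac32-2t.$$

Since $\tfrac32-2t$ takes every real value as $t$ ranges over $\mathbb{R}$, $f$ has neither a maximum nor a minimum on the line.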

Worked Example 5.2: plane meets sphere

Find the extreme values of $f(x,y,z)=z$ on the intersection of the plane $g=x+y+z=0$ and the sphere $h=x^2+y^2+z^2=1$.

Gradients: $\nabla f=(0,0,1)$, $\nabla g=(1,1,1)$, $\nabla h=(2x,2y,2z)$. The system $\nabla f=\lambda\nabla g+\mu\nabla h$ reads

$$0=\lambda+2\mu x,\quad 0=\lambda+2\mu y,\quad 1=\lambda+2\mu z.$$

Subtracting the first two gives $2\mu(x-y)=0$, so $\mu=0$ (impossible, since then $\lambda=0$ and $1=0$) or $x=y$. With $x=y$, the plane gives $z=-2x$. The sphere gives $x^2+x^2+4x^2=1$, so $x^2=\tfrac16$, $x=\pm\tfrac{1}{\sqrt6}$. Then $z=-2x=\mp\tfrac{2}{\sqrt6}$.

$$z_{\max}=\frac{2}{\sqrt6},\qquad z_{\min}=-\frac{2}{\sqrt6}.$$
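A direct parametrization of the intersection circle offers a cross-check. The orthonormal vectors $\mathbf{u}$ and $\mathbf{v}$ below are one illustrative choice of basis for the plane $x+y+z=0$, not something fixed by the problem:

$$\mathbf{u}=\tfrac{1}{\sqrt2}(1,-1,0),\quad \mathbf{v}=\tfrac{1}{\sqrt6}(1,1,-2),\quad \mathbf{r}(t)=\cos t\,\mathbf{u}+\sin t\,\mathbf{v}\ \implies\ z(t)=-\tfrac{2}{\sqrt6}\sin t.$$

Hence $z$ ranges over $\left[-\tfrac{2}{\sqrt6},\tfrac{2}{\sqrt6}\right]$, and the maximum occurs at $\mathbf{r}=-\mathbf{v}=\tfrac{1}{\sqrt6}(-1,-1,2)$, where indeed $x=y$.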
Worked Example 5.3: nearest point on a line in space

Find the point on the line of intersection of the planes $x+y+z=1$ and $x-y+2z=2$ that is closest to the origin.

Minimize $f=x^2+y^2+z^2$ subject to $g=x+y+z=1$ and $h=x-y+2z=2$. With $\nabla f=(2x,2y,2z)$, $\nabla g=(1,1,1)$, $\nabla h=(1,-1,2)$, the condition $\nabla f=\lambda\nabla g+\mu\nabla h$ gives

$$2x=\lambda+\mu,\qquad 2y=\lambda-\mu,\qquad 2z=\lambda+2\mu.$$

So $x=\tfrac{\lambda+\mu}{2}$, $y=\tfrac{\lambda-\mu}{2}$, $z=\tfrac{\lambda+2\mu}{2}$. Impose the two constraints. The sum $x+y+z=1$ gives $\tfrac{3\lambda+2\mu}{2}=1$, that is $3\lambda+2\mu=2$. For the second, $x-y=\mu$ and $2z=\lambda+2\mu$, so $x-y+2z=\mu+(\lambda+2\mu)=\lambda+3\mu=2$.

Solve the linear system $3\lambda+2\mu=2$, $\lambda+3\mu=2$. From the second, $\lambda=2-3\mu$; substituting into the first, $3(2-3\mu)+2\mu=2$, so $6-7\mu=2$ and $\mu=\tfrac{4}{7}$. Then $\lambda=2-3\cdot\tfrac{4}{7}=\tfrac{2}{7}$. The closest point is

$$x=\tfrac{\lambda+\mu}{2}=\tfrac{3}{7},\quad y=\tfrac{\lambda-\mu}{2}=-\tfrac{1}{7},\quad z=\tfrac{\lambda+2\mu}{2}=\tfrac{5}{7}.$$

A quick check: both $x+y+z=\tfrac{3-1+5}{7}=1$ and $x-y+2z=\tfrac{3+1+10}{7}=2$ hold, confirming the solution sits on the line.快速验证:$x+y+z=\tfrac{3-1+5}{7}=1$ 与 $x-y+2z=\tfrac{3+1+10}{7}=2$ 都成立,确认该解落在直线上。
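The arithmetic in this example can be reproduced exactly with rational numbers. This is a small sketch using Python's `fractions` module; the helper `solve_2x2` is a hypothetical name introduced here for illustration.

```python
from fractions import Fraction

def solve_2x2(a11, a12, b1, a21, a22, b2):
    """Solve a11*s + a12*t = b1, a21*s + a22*t = b2 by Cramer's rule."""
    det = a11 * a22 - a12 * a21
    if det == 0:
        raise ValueError("singular system")
    s = Fraction(b1 * a22 - a12 * b2, det)
    t = Fraction(a11 * b2 - b1 * a21, det)
    return s, t

# Multiplier equations obtained from the two constraints:
#   3*lam + 2*mu = 2   and   lam + 3*mu = 2
lam, mu = solve_2x2(3, 2, 2, 1, 3, 2)

# Recover the point from x = (lam+mu)/2, y = (lam-mu)/2, z = (lam+2*mu)/2
x, y, z = (lam + mu) / 2, (lam - mu) / 2, (lam + 2 * mu) / 2

print(lam, mu)                   # 2/7 4/7
print(x, y, z)                   # 3/7 -1/7 5/7
print(x + y + z, x - y + 2 * z)  # 1 2
```

Exact fractions avoid any rounding doubt when checking that the point satisfies both plane equations.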

Going deeper: why the gradient lies in the span of the constraint gradients深入探讨:为何梯度落在约束梯度张成的空间内

Suppose $g$ and $h$ are smooth with $\nabla g$ and $\nabla h$ linearly independent at a point $\mathbf{x}_0$ on the intersection curve $C=\{g=k_1,\ h=k_2\}$. By the implicit function theorem $C$ is a smooth curve near $\mathbf{x}_0$, and its tangent direction is $\mathbf{T}=\nabla g\times\nabla h$, since $\mathbf{T}$ is orthogonal to both surface normals.设 $g$ 与 $h$ 光滑,且在交线 $C=\{g=k_1,\ h=k_2\}$ 上的点 $\mathbf{x}_0$ 处 $\nabla g$ 与 $\nabla h$ 线性无关。由隐函数定理(implicit function theorem),$C$ 在 $\mathbf{x}_0$ 附近是光滑曲线,其切方向为 $\mathbf{T}=\nabla g\times\nabla h$,因为 $\mathbf{T}$ 与两个曲面法向量都正交。

Parametrize $C$ as $\mathbf{r}(t)$ with $\mathbf{r}(t_0)=\mathbf{x}_0$, and set $\phi(t)=f(\mathbf{r}(t))$. At a constrained extremum, $\phi'(t_0)=\nabla f\cdot\mathbf{r}'(t_0)=0$, so $\nabla f\perp\mathbf{T}$.将 $C$ 参数化为 $\mathbf{r}(t)$,其中 $\mathbf{r}(t_0)=\mathbf{x}_0$,并令 $\phi(t)=f(\mathbf{r}(t))$。在条件极值处 $\phi'(t_0)=\nabla f\cdot\mathbf{r}'(t_0)=0$,故 $\nabla f\perp\mathbf{T}$。

Now $\{\nabla g,\nabla h,\mathbf{T}\}$ is a basis for $\mathbb{R}^3$: $\nabla g$ and $\nabla h$ are linearly independent, and $\mathbf{T}=\nabla g\times\nabla h$ is nonzero and perpendicular to the plane they span. Any vector perpendicular to $\mathbf{T}$ must therefore lie in that plane. Since $\nabla f\perp\mathbf{T}$, we conclude $\nabla f=\lambda\nabla g+\mu\nabla h$ for some scalars $\lambda,\mu$. The independence of $\nabla g,\nabla h$ is exactly what guarantees that the plane is two-dimensional and that the multipliers are uniquely determined.
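The argument can be illustrated on Worked Example 5.3, where the constraint gradients are constant. The sketch below (helper names `cross`, `dot`, `grad_f` are illustrative) checks that $\nabla f$ is perpendicular to $\mathbf{T}$ at the closest point but not at another point of the same line.

```python
from fractions import Fraction

def cross(a, b):
    """Cross product of two 3-vectors."""
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def dot(a, b):
    """Dot product of two vectors of equal length."""
    return sum(p * q for p, q in zip(a, b))

def grad_f(p):
    """Gradient of f = x^2 + y^2 + z^2 at the point p."""
    return tuple(2 * c for c in p)

grad_g = (1, 1, 1)     # from g = x + y + z
grad_h = (1, -1, 2)    # from h = x - y + 2z
T = cross(grad_g, grad_h)   # tangent direction of the intersection line

closest = (Fraction(3, 7), Fraction(-1, 7), Fraction(5, 7))
other = tuple(c + t for c, t in zip(closest, T))  # another point on the line

print(T)                          # (3, -1, -2)
print(dot(grad_f(closest), T))    # 0: gradient is perpendicular to T
print(dot(grad_f(other), T))      # 28: not perpendicular, so not an extremum

# At the closest point the gradient is the stated combination of grad g, grad h.
lam, mu = Fraction(2, 7), Fraction(4, 7)
combo = tuple(lam * a + mu * b for a, b in zip(grad_g, grad_h))
print(combo == grad_f(closest))   # True
```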

Common error. With two constraints students sometimes write a single multiplier, $\nabla f=\lambda\nabla g$, ignoring $h$ entirely, or they impose the incorrect conditions $\nabla f\perp\nabla g$ and $\nabla f\perp\nabla h$, which would make $\nabla f$ parallel to the tangent $\mathbf{T}$ instead of perpendicular to it. Two constraints require two multipliers and the relation $\nabla f=\lambda\nabla g+\mu\nabla h$. A subtler error is applying the method where $\nabla g$ and $\nabla h$ are parallel (linearly dependent): there the intersection need not be a smooth curve and the multiplier representation can fail, so such points must be inspected on their own.
For extremizing $f$ subject to two constraints $g=k_1$ and $h=k_2$ in $\mathbb{R}^3$, the optimality condition is:在 $\mathbb{R}^3$ 中于两个约束 $g=k_1$ 与 $h=k_2$ 下对 $f$ 求极值,最优性条件是:
5.1
$\nabla f=\lambda\,\nabla g$ and $\nabla h=\mathbf{0}$.
$\nabla f=\lambda\,\nabla g+\mu\,\nabla h$ with both constraints satisfied.$\nabla f=\lambda\,\nabla g+\mu\,\nabla h$,且两个约束都满足。
$\nabla g=\nabla h$ at the optimal point.在最优点处 $\nabla g=\nabla h$。
$\nabla f$ is orthogonal to both $\nabla g$ and $\nabla h$.$\nabla f$ 与 $\nabla g$ 和 $\nabla h$ 都正交。
Correct. With two constraints the gradient of $f$ must be a linear combination of the two constraint gradients, plus both constraint equations must hold.正确。在两个约束下,$f$ 的梯度必须是两个约束梯度的线性组合,且两个约束方程都成立。
Two constraints require two multipliers: $\nabla f=\lambda\nabla g+\mu\nabla h$. The gradient of $f$ lies in the span of the constraint gradients, not orthogonal to them.两个约束需要两个乘数:$\nabla f=\lambda\nabla g+\mu\nabla h$。$f$ 的梯度落在约束梯度张成的空间内,而非与它们正交。

Applications应用

Key idea.核心思想。 Optimization with constraints models physical and economic problems: minimizing surface area for a fixed volume, fitting the largest box in a region, or maximizing output subject to a budget. The multiplier $\lambda$ itself carries meaning as a sensitivity, the rate of change of the optimal value with respect to the constraint level.带约束的最优化可对物理与经济问题建模:在体积固定下最小化表面积、在某区域中装入最大的盒子,或在预算下最大化产出。乘数 $\lambda$ 本身具有灵敏度含义,即最优值关于约束水平的变化率。
Shadow price interpretation of the multiplier乘数的影子价格诠释
$$\lambda=\frac{d}{dk}\,f^{*}(k),$$

where $f^{*}(k)$ is the optimal value of $f$ when the constraint is $g=k$. Thus $\lambda$ measures how much the optimum improves per unit relaxation of the constraint, the shadow price in economics.其中 $f^{*}(k)$ 是当约束为 $g=k$ 时 $f$ 的最优值。因此 $\lambda$ 度量约束每放松一个单位时最优值的改善量,即经济学中的影子价格(shadow price)。

Worked Example 6.1: minimal-surface box例题 6.1:最小表面积的盒子

Find the dimensions of the closed rectangular box of volume $V=32$ with minimum surface area.求体积 $V=32$ 的封闭长方体盒子中表面积最小者的尺寸。

Minimize $S=2(xy+yz+zx)$ subject to $g=xyz=32$. With $\nabla S=\lambda\nabla g$:在 $g=xyz=32$ 下最小化 $S=2(xy+yz+zx)$。由 $\nabla S=\lambda\nabla g$:

$$2(y+z)=\lambda yz,\quad 2(x+z)=\lambda xz,\quad 2(x+y)=\lambda xy.$$

Multiply the first equation by $x$, the second by $y$, and the third by $z$: each right-hand side becomes $\lambda xyz$, so $2x(y+z)=2y(x+z)=2z(x+y)$. The first equality gives $xz=yz$, hence $x=y$ because $z>0$; the second gives $xy=xz$, hence $y=z$ because $x>0$. So $x=y=z$, and the constraint becomes $x^3=32$, giving $x=y=z=32^{1/3}=2\sqrt[3]{4}\approx 3.17$. The minimal surface area is $S=6x^2=6\cdot 32^{2/3}\approx 60.5$. This is the expected result: among closed boxes of fixed volume, the cube has the least surface area.
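A quick numerical comparison supports the claim that the cube is optimal for $V=32$. This is a rough sketch: the comparison grid (sides from 0.5 to 8.0 in steps of 0.1) and the helper names are arbitrary illustrative choices.

```python
VOLUME = 32.0

def surface_area(x, y, z):
    """Surface area of a closed rectangular box."""
    return 2 * (x * y + y * z + z * x)

def box_with_volume(x, y, volume=VOLUME):
    """Dimensions of the box with base x by y and the given volume."""
    return x, y, volume / (x * y)

# Lagrange candidate: the cube with side 32^(1/3)
side = VOLUME ** (1 / 3)
cube_area = surface_area(side, side, side)

# Hypothetical comparison grid for the base dimensions
grid = [0.5 + 0.1 * i for i in range(76)]
best_area, best_x, best_y = min(
    (surface_area(*box_with_volume(x, y)), x, y) for x in grid for y in grid
)

print(round(side, 4), round(cube_area, 3))
print(round(best_area, 3), round(best_x, 1), round(best_y, 1))
```

No grid box should beat the cube, and the best grid box should have a nearly square base with side close to $32^{1/3}$.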

Worked Example 6.2: closest point on a plane例题 6.2:平面上最近的点

Find the point on the plane $x+2y+3z=6$ closest to the origin.求平面 $x+2y+3z=6$ 上离原点最近的点。

Minimize $f=x^2+y^2+z^2$ subject to $g=x+2y+3z=6$. The condition $\nabla f=\lambda\nabla g$ gives $2x=\lambda$, $2y=2\lambda$, $2z=3\lambda$, so $x=\tfrac{\lambda}{2}$, $y=\lambda$, $z=\tfrac{3\lambda}{2}$.在 $g=x+2y+3z=6$ 下最小化 $f=x^2+y^2+z^2$。条件 $\nabla f=\lambda\nabla g$ 给出 $2x=\lambda$、$2y=2\lambda$、$2z=3\lambda$,故 $x=\tfrac{\lambda}{2}$、$y=\lambda$、$z=\tfrac{3\lambda}{2}$。

Substituting into the constraint: $\tfrac{\lambda}{2}+2\lambda+\tfrac{9\lambda}{2}=6$, that is $7\lambda=6$, so $\lambda=\tfrac{6}{7}$. The closest point is代入约束:$\tfrac{\lambda}{2}+2\lambda+\tfrac{9\lambda}{2}=6$,即 $7\lambda=6$,故 $\lambda=\tfrac{6}{7}$。最近点为

$$\left(\tfrac{3}{7},\tfrac{6}{7},\tfrac{9}{7}\right),\qquad \text{distance}=\frac{6}{\sqrt{14}}.$$
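The same computation works for any plane $ax+by+cz=d$, since $\nabla f=\lambda\nabla g$ makes the closest point a multiple of the normal vector. A minimal sketch with exact fractions (variable names are illustrative):

```python
from fractions import Fraction
import math

normal = (1, 2, 3)   # coefficients of the plane x + 2y + 3z = 6
level = 6

# From grad f = lam * grad g: (x, y, z) = (lam / 2) * normal.
# Substituting into the plane gives (lam / 2) * |normal|^2 = level.
norm_sq = sum(c * c for c in normal)           # 14
lam = Fraction(2 * level, norm_sq)             # 6/7
point = tuple(lam / 2 * c for c in normal)     # closest point
distance = math.sqrt(sum(c * c for c in point))

print(lam)                                     # 6/7
print(", ".join(str(c) for c in point))        # 3/7, 6/7, 9/7
print(distance, 6 / math.sqrt(14))             # both are the distance to the origin
```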
Worked Example 6.3: maximum-volume box with fixed surface area例题 6.3:表面积固定下体积最大的盒子

An open-top rectangular box (no lid) is to be built from $12$ square metres of material. Find the dimensions that maximize the volume.用 $12$ 平方米材料制作一个无盖的长方体盒子。求使体积最大的尺寸。

Let the base be $x$ by $y$ and the height $z$. The volume is $V=xyz$. An open-top box has base area $xy$ and four sides of total area $2xz+2yz$, so the surface constraint is设底面为 $x$ 乘 $y$,高为 $z$。体积为 $V=xyz$。无盖盒子的底面积为 $xy$,四个侧面总面积为 $2xz+2yz$,故表面约束为

$$g=xy+2xz+2yz=12.$$

The system $\nabla V=\lambda\nabla g$ is $yz=\lambda(y+2z)$, $xz=\lambda(x+2z)$, $xy=\lambda(2x+2y)$. Subtracting the second from the first, $z(y-x)=\lambda(y-x)$, so $x=y$ or $z=\lambda$. The branch $z=\lambda$ is ruled out: the first equation would read $yz=z(y+2z)$, forcing $2z^2=0$ and a box of zero volume. With $x=y$ the third equation gives $x^2=4\lambda x$, so $\lambda=\tfrac{x}{4}$, and the first becomes $xz=\tfrac{x}{4}(x+2z)$, that is $4z=x+2z$, so $x=y=2z$. Substituting into the constraint, $4z^2+4z^2+4z^2=12$, that is $12z^2=12$, so $z=1$, $x=y=2$.

The maximal volume is $V=2\cdot 2\cdot 1=4$ cubic metres. Note that the base is square with side twice the height, the familiar optimal shape for an open-top box.
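One way to double-check the open-box answer is to eliminate $z$ with the constraint and search over the base dimensions. The sketch below assumes a simple grid search; the grid spacing of 0.01 and the function names are illustrative choices.

```python
def height_from_material(x, y, material=12.0):
    """Solve xy + 2xz + 2yz = material for the height z."""
    return (material - x * y) / (2 * (x + y))

def volume(x, y, material=12.0):
    """Volume of the open-top box with base x by y using all the material."""
    z = height_from_material(x, y, material)
    return x * y * z if z > 0 else 0.0

# Hypothetical search grid: base sides 0.01, 0.02, ..., 3.46
sides = [i / 100 for i in range(1, 347)]
best_v, best_x, best_y = max((volume(x, y), x, y) for x in sides for y in sides)
best_z = height_from_material(best_x, best_y)

print(best_v, best_x, best_y, best_z)
```

The search should land on the Lagrange solution $x=y=2$, $z=1$ with volume $4$, which also confirms that the critical point is a maximum and not a minimum or saddle.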

Worked Example 6.4: maximize utility on a budget (economics)例题 6.4:预算下最大化效用(经济学)

A consumer maximizes the Cobb-Douglas utility $U(x,y)=x^{1/2}y^{1/2}$ subject to the budget $p_x x+p_y y=I$, with prices $p_x=2$, $p_y=4$ and income $I=80$.一位消费者在预算 $p_x x+p_y y=I$ 下最大化柯布-道格拉斯效用(Cobb-Douglas utility)$U(x,y)=x^{1/2}y^{1/2}$,价格 $p_x=2$、$p_y=4$,收入 $I=80$。

The condition $\nabla U=\lambda\nabla g$ with $g=2x+4y$ gives取 $g=2x+4y$,条件 $\nabla U=\lambda\nabla g$ 给出

$$\tfrac12 x^{-1/2}y^{1/2}=2\lambda,\qquad \tfrac12 x^{1/2}y^{-1/2}=4\lambda.$$

Dividing the first by the second eliminates $\lambda$: $\dfrac{y}{x}=\dfrac{2}{4}=\dfrac12$, so $x=2y$. The budget $2(2y)+4y=80$ gives $8y=80$, so $y=10$ and $x=20$. The optimal bundle is $(20,10)$ with utility $U=\sqrt{200}=10\sqrt2$.第一式除以第二式消去 $\lambda$:$\dfrac{y}{x}=\dfrac{2}{4}=\dfrac12$,故 $x=2y$。预算 $2(2y)+4y=80$ 给出 $8y=80$,于是 $y=10$、$x=20$。最优组合为 $(20,10)$,效用 $U=\sqrt{200}=10\sqrt2$。

The multiplier here is the marginal utility of income: from the second equation $\lambda=\tfrac{1}{8}x^{1/2}y^{-1/2}=\tfrac{1}{8}\sqrt{20/10}=\tfrac{\sqrt2}{8}\approx 0.177$, the rate at which the maximal utility rises per extra dollar of income.
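The optimal bundle and the multiplier can be checked by scanning the budget line directly. This is a rough sketch; the scan step of 0.01 and the helper names are illustrative assumptions.

```python
import math

PX, PY, INCOME = 2.0, 4.0, 80.0

def utility(x, y):
    """Cobb-Douglas utility U = x^(1/2) * y^(1/2)."""
    return math.sqrt(x * y)

def y_on_budget(x):
    """Quantity of y that exhausts the budget once x is chosen."""
    return (INCOME - PX * x) / PY

# Hypothetical scan of the budget line: x = 0.01, 0.02, ..., 39.99
candidates = [i / 100 for i in range(1, 4000)]
best_u, best_x = max((utility(x, y_on_budget(x)), x) for x in candidates)
best_y = y_on_budget(best_x)

# Multiplier from the second first-order condition: lam = (1/8) * sqrt(x / y)
lam = 0.125 * math.sqrt(best_x / best_y)

print(best_x, best_y, best_u)
print(lam, math.sqrt(2) / 8)
```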

Going deeper: deriving the shadow-price interpretation of $\lambda$深入探讨:推导 $\lambda$ 的影子价格诠释

Let $\mathbf{x}^{*}(k)$ be the optimizer of $f$ subject to $g(\mathbf{x})=k$, and let $f^{*}(k)=f(\mathbf{x}^{*}(k))$ be the optimal value as the constraint level $k$ varies. Differentiate $f^{*}$ using the chain rule:设 $\mathbf{x}^{*}(k)$ 是 $f$ 在 $g(\mathbf{x})=k$ 下的最优解,$f^{*}(k)=f(\mathbf{x}^{*}(k))$ 是约束水平 $k$ 变化时的最优值。用链式法则对 $f^{*}$ 求导:

$$\frac{df^{*}}{dk}=\nabla f(\mathbf{x}^{*})\cdot\frac{d\mathbf{x}^{*}}{dk}.$$

At the optimizer the Lagrange condition gives $\nabla f=\lambda\nabla g$, so在最优解处,拉格朗日条件给出 $\nabla f=\lambda\nabla g$,故

$$\frac{df^{*}}{dk}=\lambda\,\nabla g(\mathbf{x}^{*})\cdot\frac{d\mathbf{x}^{*}}{dk}.$$

Now differentiate the constraint identity $g(\mathbf{x}^{*}(k))=k$ with respect to $k$: the left side is $\nabla g\cdot\tfrac{d\mathbf{x}^{*}}{dk}$ and the right side is $1$. Substituting,现在对约束恒等式 $g(\mathbf{x}^{*}(k))=k$ 关于 $k$ 求导:左边是 $\nabla g\cdot\tfrac{d\mathbf{x}^{*}}{dk}$,右边是 $1$。代入得

$$\frac{df^{*}}{dk}=\lambda\cdot 1=\lambda.$$

So the multiplier is exactly the sensitivity of the optimal value to a unit relaxation of the constraint, the shadow price. In Worked Example 6.4 this is the marginal utility of income; in a production problem it is the marginal value of one more unit of a scarce resource.所以乘数恰是最优值对约束放松一个单位的灵敏度,即影子价格。在例题 6.4 中它是收入的边际效用;在生产问题中它是多一单位稀缺资源的边际价值。
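The identity $\lambda=df^{*}/dk$ can be tested numerically on Worked Example 6.4 by treating income as the constraint level. The sketch below reuses the relations $x=2y$ and $8y=I$ from that example (valid for the prices $p_x=2$, $p_y=4$) and approximates the derivative with a central difference; the function names and step size are illustrative.

```python
import math

def optimal_utility(income):
    """Maximal utility for prices p_x = 2, p_y = 4, using x = 2y and 8y = income."""
    y = income / 8
    x = 2 * y
    return math.sqrt(x * y)

def central_difference(func, k, step=1e-3):
    """Symmetric finite-difference estimate of func'(k)."""
    return (func(k + step) - func(k - step)) / (2 * step)

slope = central_difference(optimal_utility, 80.0)   # estimate of dU*/dI at I = 80
lam = math.sqrt(2) / 8                              # multiplier from the example

print(slope, lam)
```

The two printed numbers should agree to many digits, which is the shadow-price statement in concrete form.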

Common error.常见错误。 In applied problems the usual failure is setting up the wrong objective or constraint, for example using the closed-box surface area $2(xy+yz+zx)$ when the problem says open-top, or maximizing what should be minimized. Read carefully which quantity is fixed and which is optimized. A second pitfall is dropping the physical domain restrictions ($x,y,z>0$): a Lagrange candidate with a negative dimension is not a valid box, and ignoring the positivity can let a spurious critical point masquerade as the answer.在应用题中,常见失误是把目标函数或约束设错,例如题目说无盖却用了封闭盒子的表面积 $2(xy+yz+zx)$,或把本该最小化的量去最大化。要仔细读清哪个量固定、哪个量被优化。第二个陷阱是丢掉物理定义域限制($x,y,z>0$):尺寸为负的拉格朗日候选不是合法盒子,忽略正性会让一个伪临界点冒充答案。
Let $f^{*}(k)$ be the optimal value of $f$ subject to the constraint $g=k$. The Lagrange multiplier $\lambda$ equals:
6.1
The optimal value $f^{*}(k)$ itself.最优值 $f^{*}(k)$ 本身。
The constraint level $k$.约束水平 $k$。
The rate of change $\dfrac{d f^{*}}{dk}$ of the optimum with respect to $k$.最优值关于 $k$ 的变化率 $\dfrac{d f^{*}}{dk}$。
Always zero at an interior optimum.在内部最优处恒为零。
Correct. The multiplier is the shadow price: it measures how the optimal value changes per unit change in the constraint level.正确。乘数即影子价格:它度量约束水平每变化一个单位时最优值的变化量。
The multiplier is a sensitivity, the derivative of the optimal value with respect to the constraint level, not the optimal value or the constraint level itself.乘数是一种灵敏度,即最优值对约束水平的导数,而非最优值或约束水平本身。

Going Deeper深入探讨

Key idea.核心思想。 The Lagrange method, the second derivatives test, and the closed-region procedure are unified by the bordered Hessian and by the theory of quadratic forms. Degenerate cases ($D=0$, or vanishing constraint gradient) require higher-order analysis or a direct argument.拉格朗日法、二阶导数判别法与闭区域流程被加边黑塞矩阵(bordered Hessian)和二次型理论统一起来。退化情形($D=0$,或约束梯度为零)需要更高阶的分析或直接论证。
Bordered Hessian (one constraint, two variables)加边黑塞矩阵(单约束,二元)
$$\bar H=\det\begin{pmatrix}0&g_x&g_y\\ g_x&f_{xx}-\lambda g_{xx}&f_{xy}-\lambda g_{xy}\\ g_y&f_{xy}-\lambda g_{xy}&f_{yy}-\lambda g_{yy}\end{pmatrix}.$$

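A constrained critical point is a local maximum when $\bar H>0$ and a local minimum when $\bar H<0$. The signs look reversed relative to the unconstrained test because expanding the determinant shows that $\bar H$ is the negative of the quadratic form of the Hessian of $f-\lambda g$ evaluated on the tangent direction $(-g_y,g_x)$ of the constraint curve: a negative second derivative along the curve (a maximum) makes $\bar H$ positive.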
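One way to see where this sign convention comes from is to expand the determinant directly. Writing $a=g_x$, $b=g_y$ and $L=f-\lambda g$ (shorthand introduced here only for this computation), cofactor expansion along the first row gives

$$\bar H=\det\begin{pmatrix}0&a&b\\ a&L_{xx}&L_{xy}\\ b&L_{xy}&L_{yy}\end{pmatrix}=-\left(b^2L_{xx}-2ab\,L_{xy}+a^2L_{yy}\right).$$

The bracket is the Hessian quadratic form of $L$ applied to the vector $\mathbf{t}=(-b,a)$, which is tangent to the constraint curve because $\nabla g\cdot\mathbf{t}=0$. So $\bar H>0$ exactly when this form is negative along the curve (a constrained maximum), and $\bar H<0$ exactly when it is positive (a constrained minimum).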

When the second derivatives test fails ($D=0$) the quadratic terms do not determine the shape and one must inspect $f$ along curves through the point. The function $f(x,y)=x^4+y^4$ has $D=0$ at the origin yet a clear minimum there, while $f(x,y)=x^3$ also has $D=0$ at the origin but no extremum, showing that both outcomes are possible.

Going deeper: a degenerate critical point深入探讨:一个退化的临界点

Consider $f(x,y)=x^2-y^4$ at the origin. The first partials $f_x=2x$, $f_y=-4y^3$ both vanish at $(0,0)$, so it is critical. The second partials give $f_{xx}=2$, $f_{yy}=-12y^2=0$, $f_{xy}=0$ at the origin, hence考察 $f(x,y)=x^2-y^4$ 在原点处。一阶偏导数 $f_x=2x$、$f_y=-4y^3$ 在 $(0,0)$ 处都为零,故为临界点。二阶偏导数在原点给出 $f_{xx}=2$、$f_{yy}=-12y^2=0$、$f_{xy}=0$,因此

$$D=(2)(0)-0^2=0,$$

and the test is inconclusive. Inspect directly: along the $x$-axis $f=x^2\ge 0$ rises, while along the $y$-axis $f=-y^4\le 0$ falls. Since $f$ takes both signs arbitrarily near the origin, the origin is a saddle even though $D=0$. The lesson is that a zero discriminant demands a direct path analysis.判别法无结论。直接考察:沿 $x$ 轴 $f=x^2\ge 0$ 上升,而沿 $y$ 轴 $f=-y^4\le 0$ 下降。由于 $f$ 在原点任意附近都取两种符号,尽管 $D=0$,原点仍是鞍点。教训是:判别式为零就要求直接的路径分析。

Worked Example 7.1: when Lagrange misses a cusp

Extremize $f(x,y)=x$ on the constraint $g=x^3-y^2=0$. The gradient $\nabla g=(3x^2,-2y)$ vanishes at the origin $(0,0)$, which lies on the curve. The Lagrange condition $\nabla f=\lambda\nabla g$ gives $1=3\lambda x^2$ and $0=-2\lambda y$. The first equation forces $\lambda\ne 0$, so the second gives $y=0$; the constraint then gives $x=0$, where the first equation reads $1=0$. The system therefore has no solution anywhere on the curve, and in particular none at the origin, where $\nabla g=\mathbf{0}$.

Yet the curve has a cusp at the origin and $x\ge 0$ along it (since $x^3=y^2\ge 0$), so $x=0$ is the constrained minimum, attained at the very point where $\nabla g=\mathbf{0}$. This shows the Lagrange method can miss extrema where the constraint gradient degenerates, so such points must be checked separately.然而曲线在原点有一个尖点,且沿曲线 $x\ge 0$(因为 $x^3=y^2\ge 0$),所以 $x=0$ 是条件极小值,恰好在 $\nabla g=\mathbf{0}$ 的那一点达到。这说明拉格朗日法会漏掉约束梯度退化处的极值,因此这类点必须单独检查。
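The failure can be seen concretely by walking along the curve. The sketch below assumes the parametrization $(t^2,t^3)$ of $x^3=y^2$ and a hypothetical helper `has_multiplier` that tests whether $(1,0)=\lambda\nabla g$ is solvable at a point.

```python
def grad_g(x, y):
    """Gradient of g = x^3 - y^2."""
    return (3 * x**2, -2 * y)

def curve_point(t):
    """Parametrization of the curve x^3 = y^2 as (t^2, t^3)."""
    return (t**2, t**3)

def has_multiplier(x, y):
    """True if grad f = (1, 0) equals lam * grad g for some scalar lam."""
    gx, gy = grad_g(x, y)
    return gy == 0 and gx != 0

# Sample the curve for t = -2.0, -1.9, ..., 2.0
points = [curve_point(i / 10) for i in range(-20, 21)]

min_x, min_point = min((p[0], p) for p in points)
lagrange_points = [p for p in points if has_multiplier(*p)]

print(min_point)                  # the smallest x occurs at the cusp
print(grad_g(*min_point))         # the constraint gradient vanishes there
print(lagrange_points)            # []: no sampled point satisfies the Lagrange system
```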

Worked Example 7.2: resolving a degenerate test by path analysis例题 7.2:用路径分析解决退化判别

The origin is a critical point of $f(x,y)=x^2+y^4$ and also of $h(x,y)=x^2-y^4$. Both have $D=0$ there, since $f_{yy}=12y^2=0$ and $h_{yy}=-12y^2=0$ at the origin. The discriminant cannot tell them apart, so analyse each directly.原点既是 $f(x,y)=x^2+y^4$ 的临界点,也是 $h(x,y)=x^2-y^4$ 的临界点。两者在那里都有 $D=0$,因为原点处 $f_{yy}=12y^2=0$、$h_{yy}=-12y^2=0$。判别式无法区分它们,故对每个直接分析。

For $f=x^2+y^4$: every term is nonnegative and $f(0,0)=0$, so $f\ge 0$ with equality only at the origin. The origin is a strict local (indeed global) minimum.对 $f=x^2+y^4$:每一项都非负且 $f(0,0)=0$,故 $f\ge 0$,仅在原点取等。原点是严格的局部(实为全局)极小值

For $h=x^2-y^4$: along the $x$-axis $h=x^2\ge 0$, but along the $y$-axis $h=-y^4\le 0$. The function takes both signs in every neighbourhood of the origin, so the origin is a saddle. Identical second-order data, opposite conclusions: this is the precise sense in which $D=0$ carries no information.对 $h=x^2-y^4$:沿 $x$ 轴 $h=x^2\ge 0$,但沿 $y$ 轴 $h=-y^4\le 0$。函数在原点的每个邻域内都取两种符号,故原点是鞍点。二阶数据完全相同,结论却相反:这正是 $D=0$ 不携带任何信息的确切含义。
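The contrast between the two functions shows up immediately if one records the signs they take on a small circle around the origin. This is a heuristic sketch only (sampling one circle suggests, but does not prove, the classification); the radius and sample count are arbitrary choices.

```python
import math

def f(x, y):
    return x**2 + y**4

def h(x, y):
    return x**2 - y**4

def signs_near_origin(func, radius=1e-2, samples=360):
    """Set of signs (-1, 0, 1) that func takes on a small circle around the origin."""
    signs = set()
    for i in range(samples):
        angle = 2 * math.pi * i / samples
        value = func(radius * math.cos(angle), radius * math.sin(angle))
        signs.add((value > 0) - (value < 0))
    return signs

print(signs_near_origin(f))   # only positive values: consistent with a minimum
print(signs_near_origin(h))   # both signs appear: consistent with a saddle
```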

Worked Example 7.3: confirming a constrained max with the bordered Hessian例题 7.3:用加边黑塞矩阵确认条件极大值

Maximize $f(x,y)=xy$ subject to $g=x+y=10$, and verify the candidate is a maximum using the bordered Hessian.在 $g=x+y=10$ 下最大化 $f(x,y)=xy$,并用加边黑塞矩阵验证该候选是极大值。

Lagrange: $\nabla f=(y,x)$, $\nabla g=(1,1)$, so $y=\lambda$, $x=\lambda$, and $x+y=10$ gives $x=y=5$, $\lambda=5$, with $f=25$. Now build the bordered Hessian. Here $g$ is linear so $g_{xx}=g_{yy}=g_{xy}=0$, and $f_{xx}=0$, $f_{yy}=0$, $f_{xy}=1$. With $g_x=g_y=1$,拉格朗日:$\nabla f=(y,x)$,$\nabla g=(1,1)$,故 $y=\lambda$、$x=\lambda$,而 $x+y=10$ 给出 $x=y=5$、$\lambda=5$,$f=25$。现在构造加边黑塞矩阵。此处 $g$ 是线性的,故 $g_{xx}=g_{yy}=g_{xy}=0$,且 $f_{xx}=0$、$f_{yy}=0$、$f_{xy}=1$。由 $g_x=g_y=1$,

$$\bar H=\det\begin{pmatrix}0&1&1\\ 1&0&1\\ 1&1&0\end{pmatrix}.$$

Expanding along the first row: $\bar H=0\cdot(0-1)-1\cdot(0-1)+1\cdot(1-0)=0+1+1=2>0$. By the sign convention ($\bar H>0$ means a constrained local maximum), $(5,5)$ is a maximum, confirming the obvious result that the product of two numbers with fixed sum is largest when they are equal.沿第一行展开:$\bar H=0\cdot(0-1)-1\cdot(0-1)+1\cdot(1-0)=0+1+1=2>0$。按符号约定($\bar H>0$ 表示条件局部极大值),$(5,5)$ 是极大值,验证了显然的结论:和固定的两数之积在相等时最大。
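The determinant is easy to evaluate mechanically. Below is a small sketch with hypothetical helpers `det3` and `bordered_hessian`; the arguments `lxx`, `lxy`, `lyy` stand for the second partials of $f-\lambda g$, which here reduce to those of $f$ because $g$ is linear.

```python
def det3(m):
    """Determinant of a 3x3 matrix by cofactor expansion along the first row."""
    (a, b, c), (d, e, f), (g, h, i) = m
    return a * (e * i - f * h) - b * (d * i - f * g) + c * (d * h - e * g)

def bordered_hessian(gx, gy, lxx, lxy, lyy):
    """Bordered Hessian determinant for one constraint in two variables."""
    return det3([[0, gx, gy],
                 [gx, lxx, lxy],
                 [gy, lxy, lyy]])

# Worked Example 7.3: f = xy, g = x + y, candidate (5, 5)
h_bar = bordered_hessian(gx=1, gy=1, lxx=0, lxy=1, lyy=0)
print(h_bar)   # 2

# Direct check along the constraint: (5 + t)(5 - t) = 25 - t^2 never exceeds 25
print(all((5 + t / 10) * (5 - t / 10) <= 25 for t in range(-50, 51)))   # True
```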

Common error. Two traps recur in the degenerate cases. First, reading the unconstrained sign convention into the bordered Hessian: there $\bar H>0$ signals a maximum and $\bar H<0$ a minimum, whereas in the ordinary $D$-test a positive $D$ only says that the point is an extremum and the sign of $f_{xx}$ decides which kind. Second, trusting Lagrange blindly at points where $\nabla g=\mathbf{0}$: as Worked Example 7.1 shows, the method silently skips such points, so any place where the constraint gradient vanishes (a cusp, a self-intersection) must be added to the candidate list by hand.
If the second derivatives test gives $D=0$ at a critical point, the correct conclusion is:若二阶导数判别法在某临界点给出 $D=0$,正确的结论是:
7.1
The test is inconclusive; analyze $f$ along curves through the point.判别法无结论;沿过该点的曲线分析 $f$。
The point is automatically a saddle.该点自动是鞍点。
The point is automatically a local minimum.该点自动是局部极小值。
The point cannot be a critical point.该点不可能是临界点。
Correct. A zero discriminant gives no information; the behavior must be probed directly along paths through the point.正确。判别式为零不给任何信息;必须沿过该点的路径直接探查其行为。
When $D=0$ the quadratic terms are degenerate, so the point may be a max, a min, or a saddle. Direct path analysis is required.当 $D=0$ 时二次项退化,故该点可能是极大、极小或鞍点。需要直接的路径分析。

Flashcards记忆卡片

Definition: a critical point of $f(x,y)$定义:$f(x,y)$ 的临界点
A point where $\nabla f=\mathbf{0}$ (both $f_x=0$ and $f_y=0$) or where a partial fails to exist.满足 $\nabla f=\mathbf{0}$(即 $f_x=0$ 且 $f_y=0$)的点,或某个偏导数不存在的点。
State the discriminant $D$ of the second derivatives test写出二阶导数判别法的判别式 $D$
$D=f_{xx}f_{yy}-f_{xy}^2$, the determinant of the Hessian.$D=f_{xx}f_{yy}-f_{xy}^2$,即黑塞矩阵的行列式。
Second derivatives test: classification rules二阶导数判别法:判别规则
$D>0,f_{xx}>0$: min. $D>0,f_{xx}<0$: max. $D<0$: saddle. $D=0$: inconclusive.$D>0,f_{xx}>0$:极小。$D>0,f_{xx}<0$:极大。$D<0$:鞍点。$D=0$:无结论。
What is the extreme value theorem in two variables?二元极值定理是什么?
A continuous function on a closed, bounded region attains an absolute maximum and an absolute minimum.闭有界区域上的连续函数必取得全局极大值与全局极小值。
Closed-region method for global extrema求全局极值的闭区域法
Compare interior critical values with boundary extreme values (including corners); pick the largest and smallest.比较内部临界值与边界极值(含角点);取其中最大与最小者。
Lagrange condition for one constraint $g=k$单约束 $g=k$ 的拉格朗日条件
$\nabla f=\lambda\nabla g$ together with $g(\mathbf{x})=k$, with $\nabla g\ne\mathbf{0}$.$\nabla f=\lambda\nabla g$ 连同 $g(\mathbf{x})=k$,且 $\nabla g\ne\mathbf{0}$。
Geometric meaning of $\nabla f=\lambda\nabla g$$\nabla f=\lambda\nabla g$ 的几何含义
At a constrained extremum the level set of $f$ is tangent to the constraint set, so their gradients are parallel.在条件极值处,$f$ 的等值集与约束集相切,故两者的梯度平行。
Lagrange condition for two constraints双约束的拉格朗日条件
$\nabla f=\lambda\nabla g+\mu\nabla h$ with $g=k_1$ and $h=k_2$; $\nabla f$ lies in the span of the constraint gradients.$\nabla f=\lambda\nabla g+\mu\nabla h$,且 $g=k_1$、$h=k_2$;$\nabla f$ 落在约束梯度张成的空间内。
Interpretation of the multiplier $\lambda$乘数 $\lambda$ 的诠释
The shadow price: $\lambda=df^{*}/dk$, the rate of change of the optimal value with respect to the constraint level.影子价格:$\lambda=df^{*}/dk$,即最优值关于约束水平的变化率。
When the constraint gradient $\nabla g=\mathbf{0}$ on the feasible set当可行集上约束梯度 $\nabla g=\mathbf{0}$ 时
Lagrange's method can miss extrema there; such degenerate points must be checked separately.拉格朗日法可能漏掉那里的极值;这类退化点必须单独检查。
Why does $\nabla f$ vanish at an interior extremum?为何 $\nabla f$ 在内部极值点为零?
Restricting $f$ to lines through the point reduces to single-variable Fermat, forcing $f_x=f_y=0$.把 $f$ 限制在过该点的直线上即化为单变量费马定理,迫使 $f_x=f_y=0$。
What does $D=0$ require?$D=0$ 要求什么?
Direct analysis of $f$ along curves through the point, since the quadratic terms give no information.沿过该点的曲线直接分析 $f$,因为二次项不提供任何信息。

Unit Quiz单元测验

The critical points of $f(x,y)=x^2-2x+y^2-4y$ are located where:$f(x,y)=x^2-2x+y^2-4y$ 的临界点位于何处:
Q1
$(0,0)$.
$(2,1)$.
$(1,2)$.
There are none.不存在。
Correct. $f_x=2x-2=0$ gives $x=1$ and $f_y=2y-4=0$ gives $y=2$, so the only critical point is $(1,2)$.正确。$f_x=2x-2=0$ 给出 $x=1$,$f_y=2y-4=0$ 给出 $y=2$,故唯一临界点是 $(1,2)$。
Set $f_x=2x-2=0$ and $f_y=2y-4=0$, which yield $x=1$ and $y=2$.令 $f_x=2x-2=0$ 与 $f_y=2y-4=0$,得 $x=1$、$y=2$。
At a critical point, $f_{xx}=-3$, $f_{yy}=-12$, $f_{xy}=0$. The point is a:在某临界点处,$f_{xx}=-3$,$f_{yy}=-12$,$f_{xy}=0$。该点是:
Q2
Local minimum.局部极小值。
Local maximum.局部极大值。
Saddle point.鞍点。
Inconclusive.无法判别。
Correct. $D=(-3)(-12)-0=36>0$ and $f_{xx}=-3<0$, so the point is a local maximum.正确。$D=(-3)(-12)-0=36>0$ 且 $f_{xx}=-3<0$,故该点是局部极大值。
Here $D=36>0$, so it is an extremum, and the negative $f_{xx}$ makes it a local maximum.此处 $D=36>0$,故为极值,而 $f_{xx}$ 为负使其成为局部极大值。
Using Lagrange multipliers, the maximum of $f(x,y)=2x+y$ on $x^2+y^2=5$ is:用拉格朗日乘数法,$f(x,y)=2x+y$ 在 $x^2+y^2=5$ 上的极大值是:
Q3
$\sqrt5$.
$3$.
$\sqrt{10}$.
$5$.
Correct. $\nabla f=(2,1)=\lambda(2x,2y)$ gives $x=2y$; with $x^2+y^2=5$, $y=\pm1$, $x=\pm2$, so the max of $2x+y$ is $2(2)+1=5$.正确。$\nabla f=(2,1)=\lambda(2x,2y)$ 给出 $x=2y$;由 $x^2+y^2=5$ 得 $y=\pm1$、$x=\pm2$,故 $2x+y$ 的极大值为 $2(2)+1=5$。
From $\nabla f=\lambda\nabla g$, $x=2y$. The constraint gives $(2y)^2+y^2=5$, so $y=\pm1$, $x=\pm2$; the maximum is at $(2,1)$, where $f=5$, and the minimum $f=-5$ is at $(-2,-1)$.
To find the absolute maximum of a continuous $f$ on a closed disk, you must check:要求连续函数 $f$ 在闭圆盘上的全局极大值,你必须检查:
Q4
Interior critical points and the boundary circle.内部临界点与边界圆周。
Only the center of the disk.仅圆盘中心。
Only the boundary circle.仅边界圆周。
Only where $D>0$.仅 $D>0$ 之处。
Correct. The global max sits at an interior critical point or on the boundary circle, so both must be examined.正确。全局极大值落在内部临界点或边界圆周上,故两者都必须考察。
The extreme value theorem guarantees an absolute max somewhere on the closed disk, which could be interior or on the boundary, so both sets of candidates are needed.极值定理保证闭圆盘上某处存在全局极大值,它可能在内部或边界,故两组候选都需要。
With two constraints $g=k_1$ and $h=k_2$, how many scalar unknowns appear in the Lagrange system in $\mathbb{R}^3$?在两个约束 $g=k_1$ 与 $h=k_2$ 下,$\mathbb{R}^3$ 中的拉格朗日方程组有多少个标量未知量?
Q5
Three: $x,y,z$.三个:$x,y,z$。
Four: $x,y,z,\lambda$.四个:$x,y,z,\lambda$。
Five: $x,y,z,\lambda,\mu$.五个:$x,y,z,\lambda,\mu$。
Six.六个。
Correct. Two constraints introduce two multipliers $\lambda,\mu$, giving five unknowns $x,y,z,\lambda,\mu$ and five equations.正确。两个约束引入两个乘数 $\lambda,\mu$,共五个未知量 $x,y,z,\lambda,\mu$ 与五个方程。
Each constraint contributes one multiplier, so $x,y,z$ plus $\lambda$ and $\mu$ make five unknowns.每个约束贡献一个乘数,故 $x,y,z$ 加上 $\lambda$ 与 $\mu$ 共五个未知量。
The Lagrange multiplier $\lambda$ at the optimum is best interpreted as:最优处的拉格朗日乘数 $\lambda$ 最好诠释为:
Q6
The number of constraints.约束的个数。
The sensitivity of the optimal value to the constraint level.最优值对约束水平的灵敏度。
The Hessian determinant.黑塞行列式。
The Euclidean distance to the origin.到原点的欧几里得距离。
Correct. The multiplier is the shadow price, equal to the derivative of the optimal value with respect to the constraint level.正确。乘数即影子价格,等于最优值关于约束水平的导数。
By the envelope theorem, $\lambda=df^{*}/dk$, the rate at which the optimum changes as the constraint is relaxed.由包络定理,$\lambda=df^{*}/dk$,即约束放松时最优值的变化率。

Readiness Checklist备考清单

Tap each item you can do without notes.