Proof of the Implicit Function Theorem
Optional (starred) - How a nonzero partial derivative forces a unique solution into existence - Kaplan §2.11
Prereq: §2.10 Implicit Functions
We know $F_y \neq 0$ guarantees a local function exists. But what's the actual mechanism? How does a nonzero partial derivative force a unique solution into existence?
The setup - monotonicity is everything
Let's start with a picture, not a proof. Imagine $F$ is temperature in a room. At our special point $(x_0, y_0)$, the temperature is exactly zero - $F(x_0, y_0) = 0$. And as you walk north (increase $y$), the temperature rises - $F_y > 0$.
So just south of our point, the temperature is negative. Just north, it's positive. Now pick any vertical line through a nearby $x$-value. As you walk north along it, you cross from cold to hot, passing through zero exactly once. That crossing IS the implicit function $y = f(x)$.
That's the entire proof in one breath. Let's formalize it.
Step 1: Find a rectangle where $F_y > 0$
Assume $F_y(x_0, y_0) > 0$. (If it's negative, just replace $F$ by $-F$ - everything works the same.) Since $F_y$ is continuous, there's a closed rectangle
$$E:\quad |x - x_0| \le \delta, \quad |y - y_0| \le \eta$$
throughout which $F_y > 0$. Inside this rectangle, for each fixed $x$, the function $y \mapsto F(x, y)$ is strictly increasing.
Step 2: Signs at the boundary
At our base point, $F(x_0, y_0) = 0$ and $F$ is strictly increasing in $y$. So:
- At the bottom of the rectangle: $F(x_0, y_0 - \eta) \lt 0$
- At the top: $F(x_0, y_0 + \eta) > 0$
Since $F$ is continuous in $x$ as well, for $x$ sufficiently close to $x_0$ (shrink $\delta$ if needed), we still have:
- $F(x, y_0 - \eta) \lt 0$ - still cold at the bottom
- $F(x, y_0 + \eta) > 0$ - still hot at the top
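To make "shrink $\delta$ if needed" concrete, here is a minimal numerical sketch. The function $F(x, y) = y + \sin(3x)$, the base point $(0, 0)$, and the starting values of $\delta$ and $\eta$ are all assumptions chosen for illustration, not part of Kaplan's text. Checking a finite grid of $x$-values is only a sanity check, not a proof.

```python
import math

def F(x, y):
    # Hypothetical example with base point (x0, y0) = (0, 0); F_y = 1 > 0 everywhere.
    return y + math.sin(3.0 * x)

def signs_ok(x0, y0, delta, eta, n=200):
    """Check F < 0 on the bottom edge and F > 0 on the top edge at n + 1 sample points."""
    for i in range(n + 1):
        x = x0 - delta + 2.0 * delta * i / n
        if not (F(x, y0 - eta) < 0.0 < F(x, y0 + eta)):
            return False
    return True

x0, y0, eta = 0.0, 0.0, 0.5
delta = 1.0
while not signs_ok(x0, y0, delta, eta):
    delta *= 0.5  # shrink delta, as the proof allows
print("delta =", delta)
```

The loop halves $\delta$ until the bottom edge is cold and the top edge is hot at every sampled $x$, which is the situation Step 3 needs.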
Step 3: Existence and uniqueness
For each such $x$, look at $y \mapsto F(x, y)$ on $[y_0 - \eta,\; y_0 + \eta]$:
- Intermediate Value Theorem: $F$ is continuous, negative at the bottom, positive at the top - so it crosses zero at least once.
- Strict monotonicity: $F_y > 0$ means $F$ is strictly increasing in $y$ - so it crosses zero at most once.
At least one + at most one = exactly one. Call this unique zero $y = f(x)$.
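The existence-and-uniqueness argument is also a recipe for computing $f(x)$: on each vertical slice, bisection homes in on the one sign change. The sketch below assumes the example $F(x, y) = x^2 + y^2 - 1$ with base point $(0, 1)$ and $\delta = \eta = 0.5$; these choices are illustrative, not from the text.

```python
def F(x, y):
    # Hypothetical example: the unit circle, with base point (x0, y0) = (0, 1).
    return x * x + y * y - 1.0

def implicit_y(x, y_lo, y_hi, tol=1e-12):
    """Find the unique zero of y -> F(x, y) on [y_lo, y_hi] by bisection."""
    f_lo, f_hi = F(x, y_lo), F(x, y_hi)
    # Step 2 of the proof: negative at the bottom, positive at the top.
    if not (f_lo < 0.0 < f_hi):
        raise ValueError("sign condition fails; shrink delta")
    while y_hi - y_lo > tol:
        mid = 0.5 * (y_lo + y_hi)
        if F(x, mid) < 0.0:
            y_lo = mid   # zero lies above mid
        else:
            y_hi = mid   # zero lies at or below mid
    return 0.5 * (y_lo + y_hi)

x0, y0, eta, delta = 0.0, 1.0, 0.5, 0.5
for x in (-0.5, -0.25, 0.0, 0.25, 0.5):
    y = implicit_y(x, y0 - eta, y0 + eta)
    print(f"x = {x:+.2f}  f(x) = {y:.6f}")
```

Bisection relies on the same two ingredients as the proof: the sign change guarantees a zero in the bracket, and strict monotonicity in $y$ guarantees there is only one zero to find.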
[Interactive figure: as $x$ varies, each vertical slice crosses $F = 0$ exactly once.]
The squeeze - bounding the solution curve
We know $y = f(x)$ exists and is unique near $(x_0, y_0)$. But is $f$ continuous? Differentiable? Here's where the squeeze argument earns its name.
The implicit derivative is bounded
Define the "slope function":
$$g(x, y) = -\frac{F_x(x, y)}{F_y(x, y)}$$
Since $F_y > 0$ throughout our rectangle $E$, and both $F_x$ and $F_y$ are continuous, $g$ is continuous on a closed, bounded set. So $g$ achieves a minimum $m$ and maximum $M$ on $E$.
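What does this mean? Wherever the solution curve lives inside $E$, its slope $f'(x) = g(x, f(x))$ is squeezed between $m$ and $M$. (We haven't shown yet that $f'$ exists. The rigorous version applies the mean value theorem to $F$ along the segment from $(x_0, y_0)$ to $(x, f(x))$, which shows that the chord slope $(f(x) - y_0)/(x - x_0)$ equals $g$ at some point of $E$.) The curve is trapped between two straight lines: for $x_0 \le x \le x_0 + \delta$,
$$y_0 + m(x - x_0) \le f(x) \le y_0 + M(x - x_0)$$
with the inequalities reversed for $x \lt x_0$.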
Shrink $\delta$ to stay inside the rectangle
We need the bounding lines to stay within the rectangle. Choose $\delta$ small enough that $|M| \cdot \delta \lt \eta$ and $|m| \cdot \delta \lt \eta$. Then both lines (and therefore the solution curve between them) remain within the vertical bounds $|y - y_0| \lt \eta$.
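Here is a rough numerical version of this step, again assuming the unit-circle example $F = x^2 + y^2 - 1$ at $(0, 1)$, so that $g = -x/y$ and the solution is known in closed form as $f(x) = \sqrt{1 - x^2}$. The bounds $m$ and $M$ are approximated by sampling $g$ on a grid, which is an assumption of the sketch rather than a rigorous minimum and maximum.

```python
import math

def g(x, y):
    # Slope function g = -F_x / F_y for the hypothetical example F = x^2 + y^2 - 1.
    return -(2.0 * x) / (2.0 * y)

def slope_bounds(x0, y0, delta, eta, n=100):
    """Approximate min and max of g on the rectangle E by sampling a grid."""
    values = [
        g(x0 - delta + 2.0 * delta * i / n, y0 - eta + 2.0 * eta * j / n)
        for i in range(n + 1)
        for j in range(n + 1)
    ]
    return min(values), max(values)

x0, y0, eta, delta = 0.0, 1.0, 0.5, 0.5
m, M = slope_bounds(x0, y0, delta, eta)
# Shrink delta until both bounding lines stay inside |y - y0| < eta.
while max(abs(m), abs(M)) * delta >= eta:
    delta *= 0.5
    m, M = slope_bounds(x0, y0, delta, eta)

def f(x):
    # Known solution curve for this example (upper half of the circle).
    return math.sqrt(1.0 - x * x)

def between_lines(x):
    """True if f(x) lies between the two bounding lines through (x0, y0)."""
    a = y0 + m * (x - x0)
    b = y0 + M * (x - x0)
    return min(a, b) <= f(x) <= max(a, b)

print("delta =", delta, " m =", m, " M =", M)
print(all(between_lines(x0 + delta * k / 10.0) for k in range(-10, 11)))
```

Using `min(a, b)` and `max(a, b)` handles both sides of $x_0$ at once, since the two lines swap roles when $x - x_0$ changes sign.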
Continuity and differentiability, for free
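Continuity: Both bounding lines pass through $(x_0, y_0)$, and $f(x)$ is squeezed between them, so $f(x) \to y_0 = f(x_0)$ as $x \to x_0$. That's continuity at $x_0$. The same argument works at any other point of the solution curve inside the rectangle, since $F_y > 0$ there too.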
Differentiability: Shrink the rectangle in both directions ($\delta \to 0$ and $\eta \to 0$). Since $g$ is continuous, the bounding slopes $m$ and $M$ both approach $g(x_0, y_0)$, so they can be made as close together as we like. In the limit, both bounds converge to the single slope $-F_x(x_0,y_0)/F_y(x_0,y_0)$. The solution curve, squeezed between them, must have this same slope. That's differentiability, with
$$f'(x_0) = -\frac{F_x(x_0, y_0)}{F_y(x_0, y_0)}$$
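To watch the squeeze tighten, the sketch below shrinks a square around a base point and reports the sampled slope bounds. It assumes the unit-circle example $F = x^2 + y^2 - 1$ with base point $(0.6, 0.8)$, where $g = -x/y$; the square (equal half-widths `r` in both directions) and the grid sampling are simplifications made for illustration.

```python
def g(x, y):
    # Slope function for the hypothetical example F = x^2 + y^2 - 1.
    return -x / y

def slope_bounds(x0, y0, r, n=50):
    """Approximate min and max of g on the square |x - x0| <= r, |y - y0| <= r."""
    values = [
        g(x0 - r + 2.0 * r * i / n, y0 - r + 2.0 * r * j / n)
        for i in range(n + 1)
        for j in range(n + 1)
    ]
    return min(values), max(values)

x0, y0 = 0.6, 0.8          # a point on the circle where the slope is nonzero
limit_slope = g(x0, y0)    # -F_x / F_y at the base point
for r in (0.1, 0.01, 0.001):
    m, M = slope_bounds(x0, y0, r)
    print(f"r = {r:<6} m = {m:.5f}  M = {M:.5f}  gap = {M - m:.5f}")
print("limit slope =", limit_slope)
```

The gap $M - m$ shrinks with `r`, and the value $g(x_0, y_0)$ always sits between the two bounds, which is the numerical shadow of the squeeze argument.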
[Interactive figure: as $\delta$ shrinks, the bounding lines narrow, squeezing the curve toward a single tangent slope.]
What the theorem actually guarantees
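Let's take stock. From a single condition - $F_y(x_0, y_0) \neq 0$ - we proved four things:
- Existence: near $(x_0, y_0)$, the equation $F(x, y) = 0$ has a solution $y = f(x)$.
- Uniqueness: inside the rectangle, it is the only solution.
- Continuity: $f$ is continuous.
- Differentiability: $f$ is differentiable, with $f'(x) = -F_x/F_y$ evaluated at $(x, f(x))$.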
The proof generalizes to systems of $m$ equations $F_1 = 0, \ldots, F_m = 0$ in $n + m$ variables by induction on $m$. The single condition $F_y \neq 0$ is replaced by: the Jacobian determinant $\partial(F_1, \ldots, F_m) / \partial(y_1, \ldots, y_m) \neq 0$. Same logic, bigger notation.
Practice Problems - §2.11
From Kaplan, problems after §2.11
Four thermodynamic variables $p$, $T$, $U$, $V$ are related by two equations (so two degrees of freedom). With $V$, $T$ as independent variables, one has:
$$\frac{\partial U}{\partial V}\bigg|_T - T\,\frac{\partial p}{\partial T}\bigg|_V + p = 0$$
Show this can be rewritten with $U$, $V$ as independent variables as:
$$\frac{\partial T}{\partial V}\bigg|_U + T\,\frac{\partial p}{\partial U}\bigg|_V - p\,\frac{\partial T}{\partial U}\bigg|_V = 0$$
Four thermodynamic variables related by two equations means two degrees of freedom. We can choose any two variables as independent - the underlying physics doesn't change. The IFT guarantees we can switch between coordinate systems whenever the Jacobian is nonzero.
With $(V, T)$ independent, we have $U = U(V, T)$ and $p = p(V, T)$. The given relation is (subscripts denote partials with the other variable held fixed): $$U_V - T \cdot p_T + p = 0$$
With $(U, V)$ independent, $T = T(U, V)$ and $p = p(U, V)$.
Converting $\partial U/\partial V|_T$: Along a path where $T$ is fixed, $dT = 0$, so differentiating $T(U, V) = \text{const}$ gives
$$\frac{\partial T}{\partial U}\bigg|_V dU + \frac{\partial T}{\partial V}\bigg|_U dV = 0 \quad\Longrightarrow\quad \frac{\partial U}{\partial V}\bigg|_T = -\frac{T_V}{T_U}$$
Converting $\partial p/\partial T|_V$: Use the chain rule at fixed $V$, together with $\,\partial U/\partial T|_V = 1/(\partial T/\partial U|_V) = 1/T_U\,$:
$$\frac{\partial p}{\partial T}\bigg|_V = \frac{\partial p}{\partial U}\bigg|_V \cdot \frac{\partial U}{\partial T}\bigg|_V = p_U \cdot \frac{1}{T_U} = \frac{p_U}{T_U}$$
(Here $p_U = \partial p/\partial U|_V$, $T_U = \partial T/\partial U|_V$, and $T_V = \partial T/\partial V|_U$.)
Plugging into $U_V - T \cdot p_T + p = 0$:
$$-\frac{T_V}{T_U} - T \cdot \frac{p_U}{T_U} + p = 0$$
Multiply through by $T_U$ (nonzero, since $T_U \cdot \partial U/\partial T|_V = 1$):
$$-T_V - T \cdot p_U + p \cdot T_U = 0$$
Multiply by $-1$ and rearrange:
$$T_V + T \cdot p_U - p \cdot T_U = 0 \quad\checkmark$$
Same physics, different coordinates. The Implicit Function Theorem provides the change-of-variables formulas that translate partial derivatives between coordinate systems. The key identity used was $\partial U/\partial V|_T = -(T_V)/(T_U)$, which comes directly from implicit differentiation of $T(U, V) = \text{const}$.
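As a sanity check on the rewritten relation, here is a small numerical sketch. It assumes a van der Waals gas with constant heat capacity, $p = RT/(V - b) - a/V^2$ and $U = cT - a/V$, which satisfies the original $(V, T)$ relation. This model and the dimensionless constants below are illustrative choices, not part of Kaplan's problem. Partial derivatives are approximated by central differences.

```python
R, a, b, c = 1.0, 1.0, 0.1, 1.5  # illustrative, dimensionless constants (assumed)

def T_of(U, V):
    # Invert U = c*T - a/V to get T as a function of (U, V).
    return (U + a / V) / c

def p_of(U, V):
    # van der Waals pressure, written in terms of (U, V).
    return R * T_of(U, V) / (V - b) - a / V**2

def partial(func, U, V, wrt, h=1e-5):
    """Central-difference partial derivative with the other variable held fixed."""
    if wrt == "U":
        return (func(U + h, V) - func(U - h, V)) / (2.0 * h)
    return (func(U, V + h) - func(U, V - h)) / (2.0 * h)

def residual(U, V):
    """Left-hand side of T_V + T * p_U - p * T_U = 0 at the point (U, V)."""
    T_V = partial(T_of, U, V, "V")
    T_U = partial(T_of, U, V, "U")
    p_U = partial(p_of, U, V, "U")
    return T_V + T_of(U, V) * p_U - p_of(U, V) * T_U

print(residual(2.0, 1.0))
print(residual(0.5, 2.0))
```

Both residuals should be zero up to finite-difference error, which is what the derivation predicts for any system obeying the original relation.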