Canonical Formalism and the Hamilton–Jacobi Equation

The Connection Between Optics and Mechanics

In the 1820s and 1830s, William Rowan Hamilton noticed a striking analogy: the mathematics of geometrical optics (the path of a light ray) formally coincides with that of classical mechanics. This led him to a unified formalism, which Jacobi later developed into the Hamilton–Jacobi equation. In 1926, Schrödinger showed that quantum mechanics passes into classical mechanics as $\hbar\to 0$ precisely through this equation: the phase of the wave function is the "quantum" analogue of the Jacobi action function. The HJ equation thus serves as a bridge between classical and quantum physics, and between variational calculus and PDE theory.

Legendre Transformation and the Hamiltonian

For the functional $J[y] = \int L(x, y, y') \, dx$ ($L$ is the Lagrangian):

Generalized momentum: $p = \partial L / \partial y'$, the variable conjugate to $y$.

Hamiltonian: $H(x, y, p) = p\cdot y' - L(x, y, y')$, where $y'$ is expressed through $p$ from $p = \partial L/\partial y'$.

This is the Legendre transformation: we move from the variable $y'$ to $p$.

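Canonical equations: $\dot{y} = \partial H / \partial p$, $\dot{p} = -\partial H / \partial y$, where the dot denotes differentiation with respect to the independent variable $x$ (which plays the role of time), so that $\dot{y} = y'$.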

Example: $L = (1/2)y'^2 - U(y)$ (one-dimensional mechanics with unit mass). Then $p = y'$ and $H = p\cdot y' - (1/2)y'^2 + U = (1/2)p^2 + U$. The canonical equations are $\dot{y} = p$, $\dot{p} = -U'(y)$; eliminating $p$ gives $\ddot{y} = -U'(y)$, which is Newton's equation.
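As a quick numerical check of the canonical equations, the sketch below integrates $\dot{y} = p$, $\dot{p} = -U'(y)$ with a leapfrog (Störmer–Verlet) step. The potential $U(y) = y^2/2$, the step size, and the function names are illustrative assumptions, not part of the text above.

```python
import math

def dU(y):
    # Assumed example potential U(y) = y**2 / 2, so U'(y) = y.
    return y

def hamiltonian(y, p):
    # H = p**2 / 2 + U(y) for the assumed potential.
    return 0.5 * p * p + 0.5 * y * y

def leapfrog(y, p, h, steps):
    """Integrate dy/dx = dH/dp = p, dp/dx = -dH/dy = -U'(y)."""
    for _ in range(steps):
        p -= 0.5 * h * dU(y)   # half step in momentum
        y += h * p             # full step in position
        p -= 0.5 * h * dU(y)   # second half step in momentum
    return y, p

y0, p0 = 1.0, 0.0
steps = 1000
h = 2.0 * math.pi / steps      # one full period of the unit oscillator
y1, p1 = leapfrog(y0, p0, h, steps)
print(y1, p1, hamiltonian(y1, p1) - hamiltonian(y0, p0))
```

After one period the state should return close to its starting point, and the value of $H$ should be nearly unchanged; this is the numerical counterpart of energy conservation for an $x$-independent Hamiltonian.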

The Jacobi Action Function

Consider a family of extremals emanating from a fixed point $A = (x_0, y_0)$. For each point $B = (x, y)$, define the action function: $S(x, y) =$ the value of the functional $J$ along the extremal from $A$ to $B$.

Jacobi’s Theorem: along an extremal, $p = \partial S / \partial y$ (momentum = partial derivative of the action with respect to the configuration).

A second relation: $\partial S /\partial x = -H(x, y, \partial S/\partial y)$.

The Hamilton–Jacobi Equation

Substituting $\partial S/\partial y$ for $p$ in the expression for $H$:

$ \frac{\partial S}{\partial x} + H\left(x, y, \frac{\partial S}{\partial y}\right) = 0 $

This is a nonlinear first-order PDE—the Hamilton–Jacobi equation (HJ).
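A simple case where the action function is known in closed form is the free Lagrangian $L = (1/2)y'^2$ (an assumed example with $U = 0$). The extremals from $(x_0, y_0)$ are straight lines, so $S(x, y) = (y - y_0)^2 / (2(x - x_0))$ and $H = p^2/2$. The sketch below checks by central differences that this $S$ satisfies the HJ equation; the sample points and the step `eps` are arbitrary choices.

```python
def action(x, y, x0=0.0, y0=0.0):
    # Action along the straight-line extremal from (x0, y0) to (x, y)
    # for the assumed Lagrangian L = y'**2 / 2 (U = 0).
    return (y - y0) ** 2 / (2.0 * (x - x0))

def hj_residual(x, y, eps=1e-5):
    # Central differences for dS/dx and dS/dy.
    S_x = (action(x + eps, y) - action(x - eps, y)) / (2.0 * eps)
    S_y = (action(x, y + eps) - action(x, y - eps)) / (2.0 * eps)
    # Residual of dS/dx + H(x, y, dS/dy) with H = p**2 / 2.
    return S_x + 0.5 * S_y ** 2

residuals = [abs(hj_residual(x, y))
             for x in (0.5, 1.0, 2.0)
             for y in (-1.0, 0.3, 1.5)]
print(max(residuals))
```

The residual should vanish up to finite-difference error. Note also that $\partial S/\partial y = (y - y_0)/(x - x_0)$ is the slope of the extremal, in agreement with $p = y'$ for this Lagrangian.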

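Geometric meaning: the level sets $S(x, y) = C$ are called "wavefronts." Extremals ("trajectories," "rays") cross the wavefronts transversally. In the optical case, where the Lagrangian is isotropic, transversality reduces to orthogonality: the rays are perpendicular to the wavefronts and point along $\nabla S$.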

Analogy Between Mechanics and Geometrical Optics

Fermat’s principle: light travels along the path that minimizes travel time. The functional: $J = \int n(x, y)\, ds$, where $n$ is the refractive index, $ds$ is the element of length.

Eikonal equation: $|\nabla S|^2 = n^2(x, y)$. This is a special case of HJ for optics.
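To see how the eikonal equation follows from the HJ equation, one can parametrize the ray as $y(x)$, an assumption that excludes rays turning back in $x$. Then $ds = \sqrt{1 + y'^2}\, dx$, and the Legendre transformation gives

$ L = n\sqrt{1 + y'^2}, \quad p = \frac{\partial L}{\partial y'} = \frac{n y'}{\sqrt{1 + y'^2}}, \quad H = p y' - L = -\frac{n}{\sqrt{1 + y'^2}} = -\sqrt{n^2 - p^2} $

Substituting $p = \partial S/\partial y$ into $\partial S/\partial x + H = 0$ gives $\partial S/\partial x = \sqrt{n^2 - (\partial S/\partial y)^2}$, and squaring yields $(\partial S/\partial x)^2 + (\partial S/\partial y)^2 = n^2$.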

Analogy:

Wavefronts = surfaces of equal phase
Rays = extremals (gradients of phase)
Refractive index $n$ = "slowness" (the analogue of $\sqrt{2m(E - U)}$ in mechanics)

Quantum mechanics: write the wave function as $\psi = \exp(iS/\hbar)$. Substituting into Schrödinger’s equation gives $\partial S/\partial t + H(x, \nabla S) = 0$ plus a correction term proportional to $\hbar$, which vanishes as $\hbar\to0$. What remains is the HJ equation. Quantum interference effects are a consequence of finite $\hbar$.
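The limit can be made explicit for the standard Hamiltonian $H = |p|^2/(2m) + U(x)$ (an assumed form; $m$ and $U$ are not specified in the text), treating $S$ as possibly complex-valued so that $\psi = \exp(iS/\hbar)$ is an exact substitution. Differentiating,

$ i\hbar\,\frac{\partial \psi}{\partial t} = -\frac{\partial S}{\partial t}\,\psi, \qquad -\frac{\hbar^2}{2m}\Delta\psi = \left(\frac{(\nabla S)^2}{2m} - \frac{i\hbar}{2m}\Delta S\right)\psi $

so Schrödinger’s equation $i\hbar\,\partial\psi/\partial t = -(\hbar^2/2m)\Delta\psi + U\psi$ becomes

$ \frac{\partial S}{\partial t} + \frac{(\nabla S)^2}{2m} + U(x) = \frac{i\hbar}{2m}\Delta S $

The right-hand side is the only term containing $\hbar$; dropping it leaves the HJ equation for this Hamiltonian.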

The Method of Characteristics

Viewed as a general first-order PDE $F(x, y, S, \partial S/\partial x, \partial S/\partial y) = 0$, the HJ equation is nonlinear. It is solved by the method of characteristics.

System of characteristics (5 ODEs with respect to parameter $s$):

$ \frac{dx}{ds} = \frac{\partial F}{\partial p}, \quad \frac{dy}{ds} = \frac{\partial F}{\partial q}, \quad \frac{dp}{ds} = -\frac{\partial F}{\partial x} - p\frac{\partial F}{\partial S}, \quad \frac{dq}{ds} = -\frac{\partial F}{\partial y} - q\frac{\partial F}{\partial S}, \quad \frac{dS}{ds} = p\frac{\partial F}{\partial p} + q\frac{\partial F}{\partial q} $

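where $p = \partial S/\partial x$, $q = \partial S/\partial y$ (in this section $p$ denotes $\partial S/\partial x$; the momentum of the earlier sections is now $q$). For the HJ equation, $F = p + H(x, y, q)$, so $dx/ds = 1$, $dy/ds = \partial H/\partial q$, $dq/ds = -\partial H/\partial y$. These are the canonical equations, so the projections of the characteristics onto the $(x, y)$ plane are precisely the extremals.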
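The characteristic system can be integrated directly. The sketch below does this for the eikonal equation written as $F = (p^2 + q^2 - n^2)/2 = 0$, using a classical fourth-order Runge–Kutta step. The refractive index $n(x, y) = 1 + 0.5y$, the launch angle, and the step size are illustrative assumptions. Since $F$ does not depend on $S$ here, the equations reduce to $dx/ds = p$, $dy/ds = q$, $dp/ds = n\,\partial n/\partial x$, $dq/ds = n\,\partial n/\partial y$, $dS/ds = p^2 + q^2$.

```python
import math

def n(x, y):
    # Assumed refractive index, increasing linearly with y.
    return 1.0 + 0.5 * y

def grad_n(x, y):
    # Gradient of the assumed index: (dn/dx, dn/dy).
    return 0.0, 0.5

def rhs(state):
    # Characteristic system for F = (p**2 + q**2 - n**2) / 2.
    x, y, p, q, S = state
    nx, ny = grad_n(x, y)
    nv = n(x, y)
    return (p, q, nv * nx, nv * ny, p * p + q * q)

def rk4_step(state, h):
    # One classical Runge-Kutta step for the 5-component state.
    def shifted(k, c):
        return tuple(s + c * ki for s, ki in zip(state, k))
    k1 = rhs(state)
    k2 = rhs(shifted(k1, 0.5 * h))
    k3 = rhs(shifted(k2, 0.5 * h))
    k4 = rhs(shifted(k3, h))
    return tuple(s + h / 6.0 * (a + 2 * b + 2 * c + d)
                 for s, a, b, c, d in zip(state, k1, k2, k3, k4))

angle = 0.3                      # launch angle in radians (assumed)
n0 = n(0.0, 0.0)
state = (0.0, 0.0, n0 * math.cos(angle), n0 * math.sin(angle), 0.0)
p_start = state[2]
for _ in range(200):
    state = rk4_step(state, 0.01)

x, y, p, q, S = state
residual = 0.5 * (p * p + q * q - n(x, y) ** 2)
print(x, y, S, residual)
```

Two checks are built in: $F$ should stay close to zero along the ray, and because the assumed $n$ does not depend on $x$, the component $p$ is conserved, which is Snell's law for a stratified medium.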

Complete Analysis: The Pendulum via HJ

Problem: free oscillations of a pendulum. $H = p^2/(2ml^2) + mgl(1 - \cos\theta)$.

HJ equation: $\partial S/\partial t + (\partial S/\partial \theta)^2/(2ml^2) + mgl(1 - \cos\theta) = 0$.

Since $H$ does not depend on $t$, the energy $E$ is conserved and the variables separate: $\partial S/\partial t = -E$. Then $\partial S/\partial \theta = p = \pm\sqrt{2ml^2(E - mgl(1 - \cos\theta))}$.

Action function: $S = -Et + \int p(\theta,E)\, d\theta$.

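Period of oscillations: differentiating the action over one full cycle with respect to the energy gives $T = \frac{\partial}{\partial E}\oint p\, d\theta = \oint \frac{d\theta}{|p|/(ml^2)} = 2ml^2 \int_{-\theta_0}^{\theta_0} \frac{d\theta}{\sqrt{2ml^2(E-mgl(1-\cos\theta))}}$, where $\pm\theta_0$ are the turning points, defined by $E = mgl(1 - \cos\theta_0)$.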

This is an elliptic integral! For small oscillations ($E \ll mgl$): $T \approx 2\pi\sqrt{l/g}$—the standard formula.
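The period integral is easy to evaluate numerically. With $E = mgl(1 - \cos\theta_0)$ the mass cancels and the formula becomes $T = \sqrt{2l/g}\int_{-\theta_0}^{\theta_0} d\theta/\sqrt{\cos\theta - \cos\theta_0}$. The sketch below uses the substitution $\theta = \theta_0\sin\varphi$ to remove the endpoint singularity and a midpoint rule; the function name, the default values of `l` and `g`, and the number of nodes are assumptions made for illustration.

```python
import math

def pendulum_period(theta0, l=1.0, g=9.81, n=2000):
    """Period for amplitude theta0 (radians), 0 < theta0 < pi."""
    # T = sqrt(2 l / g) * integral over [-theta0, theta0] of
    #     d(theta) / sqrt(cos(theta) - cos(theta0)).
    # The substitution theta = theta0 * sin(phi) removes the endpoint
    # singularity; the midpoint rule never evaluates phi = +-pi/2.
    h = math.pi / n
    total = 0.0
    for i in range(n):
        phi = -0.5 * math.pi + (i + 0.5) * h
        theta = theta0 * math.sin(phi)
        # cos(theta) - cos(theta0), written as a product to limit cancellation.
        diff = (2.0 * math.sin(0.5 * (theta0 + theta))
                * math.sin(0.5 * (theta0 - theta)))
        total += theta0 * math.cos(phi) / math.sqrt(diff)
    return math.sqrt(2.0 * l / g) * total * h

T_small = 2.0 * math.pi * math.sqrt(1.0 / 9.81)   # small-oscillation formula
ratio_small = pendulum_period(0.01) / T_small
ratio_large = pendulum_period(1.0) / T_small
print(ratio_small, ratio_large)
```

For a small amplitude the ratio to $2\pi\sqrt{l/g}$ should be very close to 1, while for an amplitude of 1 radian the period is noticeably longer, which is the amplitude dependence hidden in the elliptic integral.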

Dynamic Programming as Discrete HJ

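In discrete time, Bellman's principle of optimality reads $V(x,t) = \min_u \{f(x,u)\cdot dt + V(x + g(x,u)\cdot dt, t+dt)\}$, where $V$ is the optimal cost-to-go, $f$ is the running cost, and $g$ is the right-hand side of the dynamics $\dot{x} = g(x,u)$.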

Limit as $dt\to0$: $\partial V/\partial t + \min_u \{f(x,u) + \nabla V \cdot g(x,u)\} = 0$, the Hamilton–Jacobi–Bellman equation (HJB).

This is a nonlinear PDE—the equation of optimal control. Connection: variational calculus $\to$ Pontryagin’s principle $\to$ Hamilton–Jacobi–Bellman.
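The discrete Bellman recursion can be run directly on a small test problem. The sketch below assumes the scalar linear-quadratic problem $f(x,u) = x^2 + u^2$, $g(x,u) = u$ on the horizon $[0, 1]$ with zero terminal cost; this problem, the control grid, and the function name are illustrative choices, not taken from the text. By scaling, $V(x,t) = P(t)x^2$, so it suffices to apply the Bellman step at $x = 1$. For this problem the HJB equation reduces to $\dot{P} = P^2 - 1$, $P(1) = 0$, with solution $P(t) = \tanh(1 - t)$.

```python
import math

def bellman_lq(T=1.0, steps=100, n_controls=401):
    """Backward Bellman recursion for min of integral (x**2 + u**2) dt, dx/dt = u."""
    dt = T / steps
    # Assumed control grid on [-2, 0]; the optimal u is negative for x = 1.
    controls = [-2.0 + 2.0 * j / (n_controls - 1) for j in range(n_controls)]
    P = 0.0  # terminal condition V(x, T) = 0, with the ansatz V(x, t) = P(t) * x**2
    for _ in range(steps):
        # Bellman step evaluated at x = 1: running cost plus cost-to-go.
        P = min((1.0 + u * u) * dt + P * (1.0 + u * dt) ** 2 for u in controls)
    return P

P0 = bellman_lq()
print(P0, math.tanh(1.0))
```

The recursion is first-order accurate in $dt$, so the two printed numbers should agree to roughly two decimal places with the default settings and move closer as `steps` grows.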

Caustics and the Breakdown of Classical Solutions

As noted above, the method of characteristics reduces the HJ equation to a system of ODEs along the characteristics: the canonical Hamilton equations. A solution exists locally, but in general singularities may develop. Where characteristics intersect, they produce "focal points" and caustics (envelopes of the family of rays). This has a clear physical meaning: a caustic is a place where light energy concentrates (for example, the bright patches of sunlight on the bottom of a pool).

Viscosity Solutions

Because of these singularities, a classical (smooth) solution of the HJ equation does not always exist globally. Modern theory (Crandall–Lions, 1983) introduced the concept of viscosity solutions: generalized solutions that are unique under reasonable conditions. They arise as the limit as $\varepsilon \to 0$ of solutions to the "viscous" equation $V_t + H(x, \nabla V) = \varepsilon \Delta V$. Monotone numerical schemes such as Godunov-type upwind schemes converge to the viscosity solution, and the higher-order ENO and WENO schemes built on them are designed to remain robust at derivative discontinuities.
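A minimal illustration of how an upwind scheme picks out the viscosity solution: the stationary equation $|u'(x)| = 1$ on $[0, 1]$ with $u(0) = u(1) = 0$ has infinitely many sawtooth solutions that satisfy it almost everywhere, but its viscosity solution is the distance to the boundary, $u(x) = \min(x, 1 - x)$. The sketch below applies a Godunov-type upwind update with alternating sweeps; the test problem, grid size, and function name are assumptions chosen for illustration.

```python
def sweep_eikonal_1d(n_points=101, max_sweeps=10):
    """Solve |u'(x)| = 1 on [0, 1] with u(0) = u(1) = 0 by upwind sweeps."""
    h = 1.0 / (n_points - 1)
    big = float("inf")
    u = [big] * n_points
    u[0] = u[-1] = 0.0
    for _ in range(max_sweeps):
        changed = False
        # Sweep left to right, then right to left (Gauss-Seidel ordering).
        order = list(range(1, n_points - 1))
        for i in order + order[::-1]:
            # Godunov upwind update: take the smaller neighbour and add h.
            candidate = min(u[i - 1], u[i + 1]) + h
            if candidate < u[i]:
                u[i] = candidate
                changed = True
        if not changed:
            break
    return u, h

u, h = sweep_eikonal_1d()
exact = [min(i * h, 1.0 - i * h) for i in range(len(u))]
max_error = max(abs(a - b) for a, b in zip(u, exact))
print(max_error)
```

The update only ever takes information from the side with the smaller value, which mirrors the direction in which characteristics carry data from the boundary. The kink at $x = 1/2$, where characteristics from the two endpoints meet, is reproduced without any special treatment.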

Modern Applications

HJ is used in robotics (path planning with obstacles via level sets), in finance (option valuation, where the Black–Scholes equation is a linear second-order relative of the HJB equation for the pricing function), in image processing (anisotropic diffusion for edge detection), and in medical imaging (tomographic reconstruction through front propagation).