Module II·Article II·~5 min read
Fenchel Conjugate Functions
Duality and Optimality Conditions
Turn this article into a podcast
Pick voices, format, length — AI generates the audio
The Idea of the Legendre-Fenchel Transform
Imagine that you want to characterize a convex function not through its values at points, but through the hyperplanes supporting it. Each tangent line to a convex function is defined by its slope y and “intercept point.” The conjugate function f*(y) fixes this "intersection" for the hyperplane with slope y. This dual representation is extremely useful: it transforms multiplication into addition, turns complicated constraints into simple ones, and appears in dual optimization problems.
Definition of a Conjugate Function
Fenchel conjugate function (or the Legendre-Fenchel):
f*(y) = sup_{x ∈ dom(f)} {yᵀx − f(x)}
Meaning of each symbol:
- y — “dual variable,” direction (hyperplane slope)
- yᵀx — scalar product (linear function of x)
- f(x) — “subtract” the function
- sup — take the greatest value over all x
Geometrically: f*(y) is the maximal “gap” between the linear function yᵀx and f(x).
Key property: f* is always a convex function, even if the original f is non-convex! (As the supremum of affine functions over y.)
Fenchel-Young Inequality
It immediately follows from the definition:
f*(y) ≥ yᵀx − f(x) for all x, y
which is equivalent to the Fenchel-Young inequality:
f(x) + f*(y) ≥ yᵀx
Bipolar theorem: if f is a closed convex function, then f** = f (the double conjugate coincides with the original). For non-convex f: f** = cl(conv(f)) — closure of the convex hull.
Calculation Examples
Quadratic function f(x) = (1/2)‖x‖² (for n=1: f(x) = x²/2):
f*(y) = sup_x {yx − x²/2} = sup_x {−(x − y)²/2 + y²/2} = y²/2
Result: f*(y) = (1/2)‖y‖² — self-conjugate! The maximum is achieved at x = y.
L1-norm f(x) = ‖x‖₁ = Σᵢ|xᵢ|:
f*(y) = sup_x {yᵀx − ‖x‖₁} = δ_{‖y‖_∞ ≤ 1}(y)
(0 if ‖y‖_∞ = max|yᵢ| ≤ 1, otherwise +∞)
Calculation: if |yᵢ| > 1 for some i, take xᵢ → ±∞ → supremum = +∞. If |yᵢ| ≤ 1 for all i, then yᵀx ≤ ‖y‖_∞‖x‖₁ ≤ ‖x‖₁ → yᵀx − ‖x‖₁ ≤ 0, maximum = 0 (at x = 0).
Exponent f(x) = eˣ (n=1):
f*(y) = sup_x {yx − eˣ}. Derivative: y − eˣ = 0 → x = log y (for y > 0). Value: y·log y − e^{log y} = y·log y − y. For y ≤ 0: supremum = +∞.
Result: f*(y) = y log y − y for y > 0, +∞ for y ≤ 0.
Negative entropy f(x) = Σᵢ xᵢ log xᵢ (for xᵢ > 0):
f*(y) = log(Σᵢ eʸⁱ) — this is log-sum-exp! A soft approximation of the maximum.
Duality via Conjugate Functions
Dual problem for min_x {f(x) + g(Bx)}:
Via conjugates: max_y {−f*(−Bᵀy) − g*(y)}.
LASSO problem: min_x {(1/2)‖Ax−b‖² + λ‖x‖₁}
Conjugates: ((1/2)‖·‖²)* = (1/2)‖·‖², (λ‖·‖₁)* = λ δ_{‖·‖_∞ ≤ 1}.
Dual: max_y {bᵀy − (1/2)‖y‖²} subject to ‖Aᵀy‖_∞ ≤ λ — this is QP with L∞-constraint.
Complete Example Analysis: Computing f* for the Log-Barrier
Task: Find the conjugate for f(x) = −log x (x > 0).
f*(y) = sup_{x > 0} {yx − (−log x)} = sup_{x > 0} {yx + log x}
Step 1: Derivative with respect to x: y + 1/x = 0 → x* = −1/y (only for y < 0).
Step 2: For y ≥ 0: yx + log x → +∞ as x → +∞ (for y > 0) or x → +∞ for y = 0 → supremum = +∞.
Step 3: For y < 0: f*(y) = y·(−1/y) + log(−1/y) = −1 + log(−1/y) = −1 − log(−y).
Result: f*(y) = −1 − log(−y) for y < 0, +∞ for y ≥ 0.
Applications
Duality in optimization: the conjugate function automatically generates the dual problem. This is used in optimal portfolio calculations (duality of the Markowitz problem), SVM (kernel trick through duality), and in the ADMM method.
Prox-operator via conjugate: by Moreau's theorem, prox_{τf}(x) + τ prox_{f*/τ}(x/τ) = x. If computing the prox operator of one function is hard, compute the prox of the conjugate.
Properties of the Fenchel Transform
The conjugate transformation has remarkable properties:
- The conjugate function is always convex (even if f is not convex) — the transformation “convexifies” the function
- Double conjugate f** = f, if f is convex and closed; in general, f** is the convex hull of f
- Order correspondence: f₁ ≤ f₂ → f₂* ≤ f₁*
- Conjugate of the sum: (f₁ + f₂)* = f₁* □ f₂* (infimal convolution)
- Connection with subdifferential: y ∈ ∂f(x) ⟺ x ∈ ∂f*(y) ⟺ f(x) + f*(y) = xᵀy
Examples of Conjugate Pairs
- f(x) = (1/2)xᵀPx (quadratic) ↔ f*(y) = (1/2)yᵀP⁻¹y
- f(x) = exp(x) ↔ f*(y) = y log y − y (for y > 0)
- f(x) = log(1 + eˣ) (softplus) ↔ f*(y) = y log y + (1−y) log(1−y) (binary entropy) for y ∈ [0,1]
- f(x) = ‖x‖p ↔ f*(y) = δ{‖·‖_q ≤ 1}(y), where 1/p + 1/q = 1 (dual norms)
- f(x) = max(x₁,...,xₙ) ↔ f*(y) = δ_Δ(y) (indicator of the simplex)
Applications
The Fenchel transform is the foundation of Fenchel-Rockafellar duality, which generalizes Lagrange duality. In thermodynamics, Legendre conjugation connects Helmholtz and Gibbs free energy, internal energy and enthalpy. In convex analysis, Fenchel’s image is used to construct “dual” algorithms: proximal splitting, primal-dual hybrid gradient, Chambolle-Pock — all operate with conjugate functions for efficiently solving problems with non-separable terms.
§ Act · what next