Module II·Article II·~5 min read

Fenchel Conjugate Functions

Duality and Optimality Conditions

Turn this article into a podcast

Pick voices, format, length — AI generates the audio

The Idea of the Legendre-Fenchel Transform

Imagine that you want to characterize a convex function not through its values at points, but through the hyperplanes supporting it. Each tangent line to a convex function is defined by its slope y and “intercept point.” The conjugate function f*(y) fixes this "intersection" for the hyperplane with slope y. This dual representation is extremely useful: it transforms multiplication into addition, turns complicated constraints into simple ones, and appears in dual optimization problems.

Definition of a Conjugate Function

Fenchel conjugate function (or the Legendre-Fenchel):

f*(y) = sup_{x ∈ dom(f)} {yᵀx − f(x)}

Meaning of each symbol:

  • y — “dual variable,” direction (hyperplane slope)
  • yᵀx — scalar product (linear function of x)
  • f(x) — “subtract” the function
  • sup — take the greatest value over all x

Geometrically: f*(y) is the maximal “gap” between the linear function yᵀx and f(x).

Key property: f* is always a convex function, even if the original f is non-convex! (As the supremum of affine functions over y.)

Fenchel-Young Inequality

It immediately follows from the definition:

f*(y) ≥ yᵀx − f(x) for all x, y

which is equivalent to the Fenchel-Young inequality:

f(x) + f*(y) ≥ yᵀx

Bipolar theorem: if f is a closed convex function, then f** = f (the double conjugate coincides with the original). For non-convex f: f** = cl(conv(f)) — closure of the convex hull.

Calculation Examples

Quadratic function f(x) = (1/2)‖x‖² (for n=1: f(x) = x²/2):

f*(y) = sup_x {yx − x²/2} = sup_x {−(x − y)²/2 + y²/2} = y²/2

Result: f*(y) = (1/2)‖y‖² — self-conjugate! The maximum is achieved at x = y.

L1-norm f(x) = ‖x‖₁ = Σᵢ|xᵢ|:

f*(y) = sup_x {yᵀx − ‖x‖₁} = δ_{‖y‖_∞ ≤ 1}(y)

(0 if ‖y‖_∞ = max|yᵢ| ≤ 1, otherwise +∞)

Calculation: if |yᵢ| > 1 for some i, take xᵢ → ±∞ → supremum = +∞. If |yᵢ| ≤ 1 for all i, then yᵀx ≤ ‖y‖_∞‖x‖₁ ≤ ‖x‖₁ → yᵀx − ‖x‖₁ ≤ 0, maximum = 0 (at x = 0).

Exponent f(x) = eˣ (n=1):

f*(y) = sup_x {yx − eˣ}. Derivative: y − eˣ = 0 → x = log y (for y > 0). Value: y·log y − e^{log y} = y·log y − y. For y ≤ 0: supremum = +∞.

Result: f*(y) = y log y − y for y > 0, +∞ for y ≤ 0.

Negative entropy f(x) = Σᵢ xᵢ log xᵢ (for xᵢ > 0):

f*(y) = log(Σᵢ eʸⁱ) — this is log-sum-exp! A soft approximation of the maximum.

Duality via Conjugate Functions

Dual problem for min_x {f(x) + g(Bx)}:

Via conjugates: max_y {−f*(−Bᵀy) − g*(y)}.

LASSO problem: min_x {(1/2)‖Ax−b‖² + λ‖x‖₁}

Conjugates: ((1/2)‖·‖²)* = (1/2)‖·‖², (λ‖·‖₁)* = λ δ_{‖·‖_∞ ≤ 1}.

Dual: max_y {bᵀy − (1/2)‖y‖²} subject to ‖Aᵀy‖_∞ ≤ λ — this is QP with L∞-constraint.

Complete Example Analysis: Computing f* for the Log-Barrier

Task: Find the conjugate for f(x) = −log x (x > 0).

f*(y) = sup_{x > 0} {yx − (−log x)} = sup_{x > 0} {yx + log x}

Step 1: Derivative with respect to x: y + 1/x = 0 → x* = −1/y (only for y < 0).

Step 2: For y ≥ 0: yx + log x → +∞ as x → +∞ (for y > 0) or x → +∞ for y = 0 → supremum = +∞.

Step 3: For y < 0: f*(y) = y·(−1/y) + log(−1/y) = −1 + log(−1/y) = −1 − log(−y).

Result: f*(y) = −1 − log(−y) for y < 0, +∞ for y ≥ 0.

Applications

Duality in optimization: the conjugate function automatically generates the dual problem. This is used in optimal portfolio calculations (duality of the Markowitz problem), SVM (kernel trick through duality), and in the ADMM method.

Prox-operator via conjugate: by Moreau's theorem, prox_{τf}(x) + τ prox_{f*/τ}(x/τ) = x. If computing the prox operator of one function is hard, compute the prox of the conjugate.

Properties of the Fenchel Transform

The conjugate transformation has remarkable properties:

  • The conjugate function is always convex (even if f is not convex) — the transformation “convexifies” the function
  • Double conjugate f** = f, if f is convex and closed; in general, f** is the convex hull of f
  • Order correspondence: f₁ ≤ f₂ → f₂* ≤ f₁*
  • Conjugate of the sum: (f₁ + f₂)* = f₁* □ f₂* (infimal convolution)
  • Connection with subdifferential: y ∈ ∂f(x) ⟺ x ∈ ∂f*(y) ⟺ f(x) + f*(y) = xᵀy

Examples of Conjugate Pairs

  • f(x) = (1/2)xᵀPx (quadratic) ↔ f*(y) = (1/2)yᵀP⁻¹y
  • f(x) = exp(x) ↔ f*(y) = y log y − y (for y > 0)
  • f(x) = log(1 + eˣ) (softplus) ↔ f*(y) = y log y + (1−y) log(1−y) (binary entropy) for y ∈ [0,1]
  • f(x) = ‖x‖p ↔ f*(y) = δ{‖·‖_q ≤ 1}(y), where 1/p + 1/q = 1 (dual norms)
  • f(x) = max(x₁,...,xₙ) ↔ f*(y) = δ_Δ(y) (indicator of the simplex)

Applications

The Fenchel transform is the foundation of Fenchel-Rockafellar duality, which generalizes Lagrange duality. In thermodynamics, Legendre conjugation connects Helmholtz and Gibbs free energy, internal energy and enthalpy. In convex analysis, Fenchel’s image is used to construct “dual” algorithms: proximal splitting, primal-dual hybrid gradient, Chambolle-Pock — all operate with conjugate functions for efficiently solving problems with non-separable terms.

§ Act · what next