Convex Sets: Properties and Operations
Why Is Convexity Important?
Imagine you are searching for a path in the mountains. If the terrain is convex (no depressions or pockets), any downhill trail without dead ends leads you to the lowest point. This idea underlies convex analysis: in convex optimization problems every local minimum is a global minimum, and it can be found by reliable algorithms. Much of modern optimization, from machine learning to finance, relies on convex structure.
Definition of a Convex Set
A set $C \subseteq \mathbb{R}^n$ is called convex if for any two points $x, y \in C$ and any number $\theta \in [0,1]$ it holds that:
$\theta x + (1-\theta)y \in C$
What does this mean in words? Take any two points in the set. Connect them by a segment. If the entire segment lies inside the set — it is convex. The parameter $\theta$ "runs" from 0 to 1, describing all points between $x$ and $y$: at $\theta=1$ we get $x$, at $\theta=0$ — $y$, at $\theta=0.5$ — the midpoint of the segment.
Geometric test: sketch the figure. If you can find two points inside it whose connecting segment partially leaves the figure, the figure is nonconvex. The letter "C" is nonconvex; a disk, a square, and a triangle are convex.
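The segment test can be sketched numerically: sample pairs of points in a set and check that their convex combinations stay inside. This is a heuristic check rather than a proof, and the indicator functions below (a disk and a "C"-shaped annulus sector) are illustrative choices, not part of the article.

```python
import numpy as np

def seems_convex(indicator, points, n_theta=11):
    """Heuristic convexity check: for every pair of given points in the
    set, test whether all sampled convex combinations stay in the set."""
    thetas = np.linspace(0.0, 1.0, n_theta)
    for x in points:
        for y in points:
            for t in thetas:
                if not indicator(t * x + (1 - t) * y):
                    return False
    return True

# Disk of radius 1: convex.
disk = lambda p: np.linalg.norm(p) <= 1.0 + 1e-12
# A "C"-like annulus sector: nonconvex (it has a hole in the middle).
c_shape = lambda p: 0.5 <= np.linalg.norm(p) <= 1.0 and p[0] <= 0.8

angles = np.linspace(0, 2 * np.pi, 16, endpoint=False)
disk_pts = 0.9 * np.stack([np.cos(angles), np.sin(angles)], axis=1)
c_pts = np.array([[0.0, 0.75], [0.0, -0.75]])  # two points inside the "C"

print(seems_convex(disk, disk_pts))   # True
print(seems_convex(c_shape, c_pts))   # False: the midpoint (0, 0) is in the hole
```

A passing check never proves convexity, but a single failing combination, like the midpoint of the two tips of the "C", certifies nonconvexity.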
Classic Examples of Convex Sets
Hyperplane: $\{x \in \mathbb{R}^n : a^\top x = b\}$, where $a \neq 0$ is a fixed vector and $b$ is a number. This is an $(n-1)$-dimensional “plane” in space. Example in $\mathbb{R}^2$: the line $2x_1 + 3x_2 = 6$. Any two points on the line are connected by a segment lying on the line, so convexity is immediate.
Halfspace: $\{x : a^\top x \leq b\}$, everything on one "side" of the hyperplane. In $\mathbb{R}^2$ this is a halfplane on one side of a line.
Ball: $\{x : \|x - x_c\| \leq r\}$, all points at distance at most $r$ from the center $x_c$. Convexity: if two points lie in the ball (distance to $x_c$ at most $r$), then any point between them also lies in the ball; this follows from the triangle inequality.
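The triangle-inequality argument can be written out explicitly: for $x, y$ in the ball and $\theta \in [0,1]$,

```latex
\begin{aligned}
\|\theta x + (1-\theta)y - x_c\|
  &= \|\theta(x - x_c) + (1-\theta)(y - x_c)\| \\
  &\le \theta\|x - x_c\| + (1-\theta)\|y - x_c\|
   \le \theta r + (1-\theta) r = r.
\end{aligned}
```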
Ellipsoid: $\{x : (x-x_c)^\top P^{-1}(x-x_c) \leq 1\}$, where $P$ is a positive definite matrix. This is a ball "stretched" along different axes, widely used in control theory to describe admissible state regions.
Second-order cone (Lorentz cone): $\{(x, t) : \|x\| \leq t\}$, an "ice cream cone" in space. The surface of the cone consists of points with $\|x\| = t$, the interior of points with $\|x\| < t$. Optimization over such cones is called second-order cone programming (SOCP).
Set of positive semidefinite matrices: $S^n_+ = \{X \in \mathbb{R}^{n \times n} : X = X^\top,\ u^\top X u \geq 0 \text{ for all } u\}$. This is a convex cone in the space of symmetric matrices.
Operations Preserving Convexity
One of the key properties: many operations on convex sets yield new convex sets. This allows building complex convex structures from simple building blocks.
Intersection: if $C_1$ and $C_2$ are convex, then $C_1 \cap C_2$ is also convex. Proof: if $x, y \in C_1 \cap C_2$, then $x, y \in C_1$, so the segment $xy \subseteq C_1$; likewise $x, y \in C_2$, so $xy \subseteq C_2$; hence $xy \subseteq C_1 \cap C_2$. Consequence: a polyhedron (the intersection of finitely many halfspaces) is convex.
Image under an affine map: $f(C) = \{Ax + b : x \in C\}$ is convex if $C$ is convex. Affine transformations preserve convexity.
Preimage: if $f: \mathbb{R}^n \rightarrow \mathbb{R}^m$ is affine ($f(x) = Ax+b$) and $D$ is convex, then $f^{-1}(D) = \{x : f(x) \in D\}$ is convex.
Minkowski sum: $C_1 + C_2 = \{x + y : x \in C_1,\ y \in C_2\}$ is convex if $C_1$ and $C_2$ are convex.
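The intersection-of-halfspaces construction can be tried out numerically. The sketch below writes the unit square as a polyhedron $\{x : Ax \leq b\}$ (four halfspace constraints, an illustrative choice) and spot-checks that random segments between its points never leave it.

```python
import numpy as np

rng = np.random.default_rng(0)

# Polyhedron {x : A x <= b}: an intersection of halfspaces, hence convex.
# Here: the unit square [0,1]^2 written as four halfspace constraints
#   x1 <= 1, -x1 <= 0, x2 <= 1, -x2 <= 0.
A = np.array([[ 1.0,  0.0],
              [-1.0,  0.0],
              [ 0.0,  1.0],
              [ 0.0, -1.0]])
b = np.array([1.0, 0.0, 1.0, 0.0])

def in_polyhedron(x):
    return bool(np.all(A @ x <= b + 1e-12))

# Sample points of the square and check convex combinations stay inside.
pts = rng.random((50, 2))                 # uniform points in [0,1]^2
ok = True
for _ in range(200):
    x, y = pts[rng.integers(50)], pts[rng.integers(50)]
    t = rng.random()
    ok &= in_polyhedron(t * x + (1 - t) * y)
print(ok)  # True: no sampled segment leaves the intersection
```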
The Separating Hyperplane Theorem
This is one of the fundamental results of convex analysis, with deep geometric meaning and practical applications.
Theorem: if $C_1$ and $C_2$ are nonempty convex sets with empty intersection ($C_1 \cap C_2 = \varnothing$), then there exists a vector $a \neq 0$ and a number $b$ such that:
- $a^\top x \leq b$ for all $x \in C_1$
- $a^\top x \geq b$ for all $x \in C_2$
Meaning: the hyperplane ${x : a^\top x = b}$ “separates” the two sets. This is geometrically obvious in $\mathbb{R}^2$: two non-intersecting convex sets on the plane can always be separated by a line.
Supporting hyperplane: at a boundary point $x_0$ of a convex set $C$ there exists a vector $g \neq 0$ such that $g^\top(x - x_0) \leq 0$ for all $x \in C$. This is a "tangent" hyperplane to $C$ at $x_0$: the whole set lies on one side of it.
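For two disjoint disks, a separating hyperplane can be constructed in closed form: take the normal $a$ along the line between the centers and place the offset $b$ in the gap between the disks' extreme points along $a$. The disks below are an illustrative example, not from the article.

```python
import numpy as np

# Two disjoint disks (centers and radii are illustrative):
c1, r1 = np.array([0.0, 0.0]), 1.0
c2, r2 = np.array([4.0, 1.0]), 1.5

a = c2 - c1
a = a / np.linalg.norm(a)   # unit normal pointing from disk 1 toward disk 2
hi1 = a @ c1 + r1           # max of a^T x over disk 1
lo2 = a @ c2 - r2           # min of a^T x over disk 2
assert hi1 < lo2, "disks must be disjoint"
b = 0.5 * (hi1 + lo2)       # place the hyperplane {x : a^T x = b} in the gap

# Spot-check the separation on random points of each disk.
rng = np.random.default_rng(1)
def sample_disk(c, r, n=500):
    ang = rng.uniform(0, 2 * np.pi, n)
    rad = r * np.sqrt(rng.uniform(0, 1, n))
    return c + np.stack([rad * np.cos(ang), rad * np.sin(ang)], axis=1)

p1, p2 = sample_disk(c1, r1), sample_disk(c2, r2)
print(np.all(p1 @ a <= b), np.all(p2 @ a >= b))  # True True
```

For general convex sets no such closed form exists; the separating direction typically comes from an optimization problem (e.g., the closest pair of points between the sets).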
Projection onto a Convex Set
For a closed convex set $C$, the orthogonal projection of point $x$ is defined as:
$P_C(x) = \underset{y \in C}{\arg\min}\ \|y - x\|^2$
Existence and uniqueness: such a point always exists and is unique. Uniqueness is a consequence of the strict convexity of the function $\|y - x\|^2$ in $y$.
Characterization: $y^* = P_C(x)$ if and only if $(x - y^*)^\top(z - y^*) \leq 0$ for all $z \in C$. This means that the vector $x - y^*$ makes a non-acute (right or obtuse) angle with every direction from $y^*$ into $C$.
Practical importance: projected gradient descent is a fundamental algorithm for constrained optimization. At each step, take a gradient step, then "project" back onto the admissible set $C$.
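The gradient-step-then-project loop can be sketched in a few lines. As a toy instance (chosen to match the worked example below, not prescribed by the article), we minimize $\|y - x\|^2$ over the box $[0,1]^2$, whose projection is coordinate-wise clipping:

```python
import numpy as np

x = np.array([3.0, 4.0])

def grad(y):                      # gradient of ||y - x||^2
    return 2.0 * (y - x)

def project_box(y, lo=0.0, hi=1.0):
    return np.clip(y, lo, hi)     # projection onto the box [lo, hi]^n

y = np.zeros(2)                   # start at the origin
step = 0.25
for _ in range(100):
    # One iteration: gradient step, then project back onto the box.
    y = project_box(y - step * grad(y))

print(y)  # [1. 1.] -- the projection P_C(x) of (3, 4) onto the unit square
```

For this objective the minimizer over $C$ is exactly $P_C(x)$, so the loop reproduces the projection; with a general convex objective the same loop converges to a constrained minimizer for a suitable step size.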
Full Analysis of an Example
Problem: find the point closest to $x = (3, 4)$ in the square $C = \{y \in \mathbb{R}^2 : 0 \leq y_1 \leq 1,\ 0 \leq y_2 \leq 1\}$.
Solution: projection onto the square is a coordinate-wise cutoff: $P_C(x) = (\min(\max(x_1, 0), 1),\ \min(\max(x_2, 0), 1))$.
For $x = (3, 4)$: $y_1^* = \min(\max(3, 0), 1) = 1$, $y_2^* = \min(\max(4, 0), 1) = 1$. Answer: $P_C(3, 4) = (1, 1)$ — the corner point of the square.
Checking the condition: the vector $x - y^* = (2, 3)$. For any $z \in C$: $(x - y^*)^\top(z - y^*) = 2(z_1 - 1) + 3(z_2 - 1)$. Since $z_1 \leq 1$ and $z_2 \leq 1$, both terms are $\leq 0$. The condition is satisfied.
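The same check can be run numerically: compute $y^*$ by coordinate-wise cutoff and verify the optimality condition on random points of the square.

```python
import numpy as np

# Worked example: project x = (3, 4) onto the unit square and verify
# the characterization (x - y*)^T (z - y*) <= 0 for all z in C.
x = np.array([3.0, 4.0])
y_star = np.clip(x, 0.0, 1.0)            # coordinate-wise cutoff -> (1, 1)

rng = np.random.default_rng(2)
z = rng.random((1000, 2))                # random points of the unit square
inner = (z - y_star) @ (x - y_star)      # (x - y*)^T (z - y*) for each z
print(y_star, np.all(inner <= 1e-12))    # [1. 1.] True
```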
Real Applications
Convex sets appear everywhere. In Markowitz portfolio theory, the set of admissible portfolios (nonnegative weights summing to one) is convex. In machine learning, the set of neural-network weights under an $L_2$-norm constraint is a ball in parameter space. In control theory, admissible controls are often specified as convex polytopes (constraints on engine power, steering angle). Understanding convexity lets us guarantee that an optimization algorithm finds the global optimum rather than getting stuck in a local one.