Module I · Article III
Linear Programming as the Foundation of ILP
Fundamentals of Discrete Optimization and Combinatorics
The Bridge Between Continuous and Discrete
Integer optimization problems are discrete by their very nature. Yet the key tool for solving them is linear programming, a continuous problem. The idea: “relax” the integer constraints to continuous ones, solve a much simpler problem, and use the result to find the integer optimum. This is “LP relaxation”—the foundation of Branch-and-Bound methods, cutting planes, and approximation algorithms.
The Geometry of Linear Programming
Standard form of LP: min $c^\top x$ subject to $Ax \leq b$, $x \geq 0$.
The feasible set—the intersection of half-spaces—is a polytope (a convex polyhedron).
Key theorem: the optimal solution to an LP is always attained at a vertex of the polytope (provided an optimum exists). A vertex is a point where $n$ linearly independent constraints are “active” (hold as equalities).
Number of vertices: up to $\binom{m+n}{n}$, exponential in general (for $m \gg n$, roughly $m^n/n!$). But in practice the simplex method traverses only $O(m)$ vertices. Karmarkar’s method (1984) is polynomial, $O(n^{3.5})$, and moves “through the center” of the polytope.
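The vertex theorem already yields a hopelessly inefficient but instructive LP solver: enumerate every intersection of $n$ active constraints, discard the infeasible points, and keep the best vertex. A minimal Python sketch for $n=2$; the toy constraints and objective are invented for illustration:

```python
from itertools import combinations

# Half-spaces a·x <= b, with the bounds x, y >= 0 written as -x <= 0, -y <= 0.
# Toy problem: min -x - y subject to x + 2y <= 4, 3x + y <= 6, x, y >= 0.
halfspaces = [((1.0, 2.0), 4.0), ((3.0, 1.0), 6.0),
              ((-1.0, 0.0), 0.0), ((0.0, -1.0), 0.0)]
c = (-1.0, -1.0)  # minimize c·x

def vertices(halfspaces):
    """Candidate vertices: intersections of n = 2 active constraints
    that also satisfy every remaining inequality."""
    for (a1, b1), (a2, b2) in combinations(halfspaces, 2):
        det = a1[0] * a2[1] - a1[1] * a2[0]
        if abs(det) < 1e-12:
            continue                            # parallel constraints, no vertex
        x = (b1 * a2[1] - b2 * a1[1]) / det     # Cramer's rule for the 2x2 system
        y = (a1[0] * b2 - a2[0] * b1) / det
        if all(a[0] * x + a[1] * y <= b + 1e-9 for a, b in halfspaces):
            yield (x, y)

best = min(vertices(halfspaces), key=lambda v: c[0] * v[0] + c[1] * v[1])
# best == (1.6, 1.2): the vertex where x + 2y = 4 meets 3x + y = 6
```

Enumerating all $\binom{m+n}{n}$ intersections is exactly the exponential vertex count mentioned above; simplex avoids it by walking only along improving edges.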
LP Relaxation of ILP
ILP: min $c^\top x$ subject to $Ax \leq b$, $x \in \{0,1\}^n$.
LP relaxation: min $c^\top x$ subject to $Ax \leq b$, $0 \leq x \leq 1$ (the integrality constraint is dropped).
Key facts:
- LP optimum $\leq$ ILP optimum (a lower bound for minimization!)
- If the LP optimum is integer-valued, then it is the ILP optimum as well
- “Integrality gap”: $\text{OPT}_{\text{ILP}} / \text{OPT}_{\text{LP}}$, which measures the “loss” from allowing fractional solutions
Example: knapsack problem (a maximization, so here the relaxation gives an upper bound) with $W=5$, $n=3$: $(c_1, w_1)=(4,3)$, $(c_2, w_2)=(3,2)$, $(c_3, w_3)=(2,1)$.
ILP optimum: $x=(1,1,0)$, value $4+3=7$. LP relaxation: the greedy rule (take items in decreasing density $c_i/w_i$ and split the last one) gives $x=(2/3, 1, 1)$ with value $\tfrac{2}{3}\cdot 4 + 3 + 2 = 23/3 \approx 7.67$. The gap is $\approx 9.5\%$.
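The bound is easy to verify numerically for a small knapsack instance (values $(4,3,2)$, weights $(3,2,1)$, capacity $W=5$): brute force over all $2^n$ integer vectors gives the ILP optimum, while the greedy density rule solves the LP relaxation exactly.

```python
from itertools import product

values, weights, W = [4, 3, 2], [3, 2, 1], 5

# Exact ILP optimum by brute force over all 0/1 vectors (fine for n = 3).
ilp = max(sum(v * x for v, x in zip(values, xs))
          for xs in product([0, 1], repeat=len(values))
          if sum(w * x for w, x in zip(weights, xs)) <= W)

def lp_relaxation(values, weights, W):
    """The LP relaxation of knapsack is solved exactly by the greedy rule:
    take items in decreasing density value/weight, splitting the last one."""
    total, cap = 0.0, float(W)
    for v, w in sorted(zip(values, weights), key=lambda t: t[0] / t[1], reverse=True):
        take = min(1.0, cap / w)        # fraction of this item that still fits
        total += take * v
        cap -= take * w
        if cap <= 0:
            break
    return total

lp = lp_relaxation(values, weights, W)
# ilp == 7 (items 1 and 2), lp == 23/3 ≈ 7.67: an integrality gap of ≈ 9.5%
```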
Totally Unimodular Matrices
Definition: a matrix $A$ is called totally unimodular (TU) if the determinant of every square submatrix is in $\{-1, 0, +1\}$.
Theorem (Hoffman-Kruskal): if $A$ is a TU matrix and $b$ is integer, then the vertices of the polytope $\{x : Ax \leq b,\ x \geq 0\}$ are all integer. Thus, LP optimum = ILP optimum, and you can solve the ILP via LP!
Examples of TU matrices:
- The incidence matrix of a bipartite graph (rows—vertices, columns—edges)
- The node-arc incidence matrix of a directed graph (hence the constraint matrix of the maximum flow problem)
- Interval matrices (consecutive ones in each column), which arise in scheduling problems with ordered intervals
Corollary: maximum matching in bipartite graphs, maximum flow, and shortest paths can all be solved by LP, since the optimal vertex is automatically integer!
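The definition of TU can be checked directly, albeit in exponential time, which is enough for experimenting with small matrices. A brute-force sketch that tests every square submatrix, applied to the incidence matrix of a small bipartite graph (TU) and of a triangle (odd cycles are the standard non-TU example):

```python
from itertools import combinations

def det(M):
    """Integer determinant by cofactor expansion (fine for tiny matrices)."""
    if len(M) == 1:
        return M[0][0]
    return sum((-1) ** j * M[0][j] * det([row[:j] + row[j + 1:] for row in M[1:]])
               for j in range(len(M)))

def is_totally_unimodular(A):
    """Brute-force TU check straight from the definition: every square
    submatrix must have determinant -1, 0, or +1. Exponential time."""
    m, n = len(A), len(A[0])
    for k in range(1, min(m, n) + 1):
        for rows in combinations(range(m), k):
            for cols in combinations(range(n), k):
                sub = [[A[i][j] for j in cols] for i in rows]
                if det(sub) not in (-1, 0, 1):
                    return False
    return True

# Incidence matrix of a bipartite graph with parts {u1, u2}, {v1, v2} and
# edges u1v1, u1v2, u2v2 (rows are vertices, columns are edges): TU.
B = [[1, 1, 0],
     [0, 0, 1],
     [1, 0, 0],
     [0, 1, 1]]

# Incidence matrix of a triangle: its 3x3 determinant is -2, so not TU.
C3 = [[1, 1, 0],
      [1, 0, 1],
      [0, 1, 1]]
```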
LP Duality for Approximation Algorithms
LP duality: for the problem (P): min $c^\top x$ subject to $Ax \geq b$, $x \geq 0$, the dual (D) is: max $b^\top y$ subject to $A^\top y \leq c$, $y \geq 0$.
Weak duality: any feasible $y$ gives $b^\top y \leq c^\top x$ for any feasible $x$. Corollary: $b^\top y$ is a lower bound on the optimum of (P).
Primal-Dual Method: simultaneously construct a feasible integer solution $\hat{x}$ and a feasible dual solution $\hat{y}$. Guarantee: $c^\top \hat{x} \leq \rho \cdot b^\top \hat{y} \leq \rho \cdot OPT \rightarrow \rho$-approximation.
Example: Set Cover via Primal-Dual
Each element $u \in U$ must be covered by at least one selected set. ILP: min $\sum_j c_j x_j$ subject to $\sum_{j: u \in S_j} x_j \geq 1$, $x_j \in \{0,1\}$.
Dual: max $\sum_u y_u$ subject to $\sum_{u \in S_j} y_u \leq c_j$, $y_u \geq 0$.
Primal-Dual Algorithm: while there exists an uncovered $u$: increase $y_u$ until some dual constraint becomes “tight” → add this $S_j$ to the solution.
Guarantee: this primal-dual algorithm is an $f$-approximation, where $f$ is the maximum frequency of an element (the number of sets that contain it); the greedy algorithm gives the incomparable $H_n \approx \ln n$ guarantee.
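A sketch of this loop in Python (the instance is invented for illustration). The invariant: $\hat{y}$ stays dual-feasible while every purchased set is tight, which is what bounds the cost of the cover by $f \cdot \sum_u y_u$, with $f$ the maximum number of sets containing any one element:

```python
def primal_dual_set_cover(universe, sets, costs):
    """Raise y_u for an uncovered element u until some dual constraint
    sum_{u in S_j} y_u <= c_j becomes tight, then buy every tight set."""
    y = {u: 0 for u in universe}
    slack = list(costs)                # remaining slack c_j - sum of y_u over S_j
    chosen, covered = [], set()
    for u in universe:
        if u in covered:
            continue
        containing = [j for j, S in enumerate(sets) if u in S]
        delta = min(slack[j] for j in containing)   # how far y_u can rise
        y[u] = delta
        for j in containing:
            slack[j] -= delta
            if slack[j] == 0 and j not in chosen:   # constraint j is tight: buy S_j
                chosen.append(j)
                covered |= sets[j]
    return chosen, y

sets = [{1, 2}, {2, 3}, {1, 3}]
chosen, y = primal_dual_set_cover([1, 2, 3], sets, costs=[2, 3, 3])
# chosen == [0, 2] with cost 5; dual value sum(y) == 3, and 5 <= f * 3 with f = 2
```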
Full Analysis: Shortest Path Problem via LP
Graph: $V = \{1,2,3,4\}$, edges: $(1,2)=2$, $(1,3)=4$, $(2,3)=1$, $(2,4)=5$, $(3,4)=2$. Find the shortest path from 1 to 4.
Flow LP formulation: $x_{ij} \geq 0$—flow through edge $(i,j)$. Problem: min $\sum d_{ij} x_{ij}$ subject to flow-balance constraints (flow 1 from source, $-1$ at sink, 0 elsewhere).
Solution: simplex yields $x_{12}=1$, $x_{23}=1$, $x_{34}=1$ (path $1 \rightarrow 2 \rightarrow 3 \rightarrow 4$, cost = $2+1+2=5$). Solution is integer (network matrix is TU)!
Dual LP—vertex potentials: max $\pi_4-\pi_1$ subject to $\pi_j-\pi_i \leq d_{ij}$. Optimum: $\pi_1=0$, $\pi_2=2$, $\pi_3=3$, $\pi_4=5$. Dual optimum = 5 = primal ✓ (strong duality).
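Both sides of this duality can be checked in a few lines: Bellman-Ford distances from the source are exactly the optimal potentials $\pi$, dual feasibility $\pi_j - \pi_i \leq d_{ij}$ is the familiar relaxation test, and the tight edges recover the shortest path (complementary slackness). A sketch using the graph above:

```python
# Edges (i, j, d_ij) of the example graph; we want the shortest path 1 -> 4.
edges = [(1, 2, 2), (1, 3, 4), (2, 3, 1), (2, 4, 5), (3, 4, 2)]
nodes = [1, 2, 3, 4]

# Bellman-Ford: after |V| - 1 rounds of relaxation, pi[v] is the shortest
# distance from node 1, i.e. the optimal dual potential (with pi[1] = 0).
pi = {v: float("inf") for v in nodes}
pi[1] = 0
for _ in range(len(nodes) - 1):
    for i, j, d in edges:
        if pi[i] + d < pi[j]:
            pi[j] = pi[i] + d

# Dual feasibility holds on every edge; the tight ones (pi_j - pi_i = d_ij)
# are exactly the edges of the shortest path 1 -> 2 -> 3 -> 4.
assert all(pi[j] - pi[i] <= d for i, j, d in edges)
tight = [(i, j) for i, j, d in edges if pi[j] - pi[i] == d]
# pi == {1: 0, 2: 2, 3: 3, 4: 5}, tight == [(1, 2), (2, 3), (3, 4)]
```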
Simplex Method: Key Details
The Simplex Method (Dantzig, 1947) successively moves between vertices of the polytope, improving the objective at each step. Algorithm:
- Find a starting vertex (Phase I)
- Check optimality condition: reduced cost coefficients $\geq 0$
- If not—choose a variable with negative reduced cost (Bland’s, Dantzig’s, or steepest edge rule) and perform a pivot
In the worst case, simplex visits an exponential number of vertices (Klee-Minty example, 1972), but in practice typically requires only $O(m+n)$ steps. This phenomenon still lacks a full theoretical explanation—the Hirsch conjecture about polytope diameter was disproved only in 2010.
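The loop above fits in a short dense-tableau implementation. A minimal sketch, written for the maximization form $\max c^\top x$, $Ax \leq b$, $x \geq 0$, and assuming $b \geq 0$ so that the all-slack basis is a feasible starting vertex and Phase I can be skipped (Dantzig's rule, no anti-cycling safeguards):

```python
def simplex_max(A, b, c):
    """Maximize c·x subject to A x <= b, x >= 0, assuming b >= 0
    (so the all-slack basis is a feasible starting vertex)."""
    m, n = len(A), len(A[0])
    # Tableau: m constraint rows [A | I | b] plus the objective row [-c | 0 | 0].
    T = [list(map(float, A[i])) + [1.0 if j == i else 0.0 for j in range(m)]
         + [float(b[i])] for i in range(m)]
    T.append([-float(cj) for cj in c] + [0.0] * (m + 1))
    basis = list(range(n, n + m))               # slack variables start basic
    while True:
        # Optimality test: all reduced costs (objective row) nonnegative.
        col = min(range(n + m), key=lambda j: T[m][j])   # Dantzig's rule
        if T[m][col] >= -1e-9:
            break
        # Ratio test chooses the leaving row, keeping the next vertex feasible.
        ratios = [(T[i][-1] / T[i][col], i) for i in range(m) if T[i][col] > 1e-9]
        if not ratios:
            raise ValueError("LP is unbounded")
        _, row = min(ratios)
        basis[row] = col
        piv = T[row][col]
        T[row] = [v / piv for v in T[row]]      # pivot: normalize the pivot row...
        for i in range(m + 1):
            if i != row and T[i][col] != 0.0:
                f = T[i][col]
                T[i] = [u - f * w for u, w in zip(T[i], T[row])]  # ...and eliminate
    x = [0.0] * n
    for i, bv in enumerate(basis):
        if bv < n:
            x[bv] = T[i][-1]
    return x, T[m][-1]                          # optimal vertex and objective value
```

On the toy problem $\max 3x_1 + 5x_2$ with $x_1 \leq 4$, $2x_2 \leq 12$, $3x_1 + 2x_2 \leq 18$ it reaches the optimum $(2, 6)$ with value 36 in two pivots.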
Interior Methods
Karmarkar’s method (1984) and its descendants (primal-dual path-following methods) travel through the interior of the polytope, ensuring guaranteed polynomial complexity $O(n^{3.5} L)$, where $L$ is the input encoding size. In practice, for very large LPs (millions of variables) interior methods compete with simplex—the choice depends on the sparsity structure of matrix $A$.
LP in Machine Learning
LP underpins many methods: SVM with an L1 penalty reduces to an LP, optimal transport between discrete distributions is an LP, and many ranking problems are attacked via LP relaxation. Some modern neural architectures embed LP relaxations inside layers for differentiable combinatorial optimization.