Theory of Approximation: Guarantees and Boundaries

A Good Solution Instead of a Perfect One

If a problem is NP-hard, we cannot (presumably) find an optimal solution in polynomial time. But often we do not need perfect precision—"good enough" is sufficient. Approximation algorithms find solutions with guaranteed quality: no worse than $\rho$ times the optimum. For $\rho = 1.5$ this means "one and a half times worse than optimum"—often acceptable for practical problems. The theory of approximation studies how well NP-hard problems can be approximated.

Approximation Ratio

An algorithm $A$ is called $\rho$-approximation ($\rho \geq 1$), if for any input $I$:

$ \max\left(\frac{A(I)}{OPT(I)}, \frac{OPT(I)}{A(I)}\right) \leq \rho $

For minimization problems: $A(I) \leq \rho \cdot OPT(I)$. For maximization: $A(I) \geq OPT(I)/\rho$.

Examples:

$\rho = 1$: exact algorithm. Exists only for P-problems.
$\rho = 2$: solution no more than twice as bad as optimum.
$\rho = O(\log n)$: logarithmic approximation (characteristic for Set Cover).

Vertex Cover: 2-Approximation

Vertex Cover problem: graph $G = (V, E)$. Find a minimal cardinality set of vertices $C \subseteq V$ such that every edge $(u,v)$ has at least one endpoint in $C$.

Greedy 2-approximation algorithm:

$C = \emptyset$, $M = \emptyset$ ($M$ — matching)
While there exists an edge not covered by $C$:
- Select any uncovered edge $(u,v)$
- Add both endpoints to $C$: $C = C \cup {u,v}$
- Add edge to $M$: $M = M \cup {(u,v)}$
Return $C$

Proof of 2-approximation: $M$ is a matching (edges in $M$ have no common vertices). Therefore, $|M| \leq OPT$ (the optimum must cover all edges in $M$ $\rightarrow$ needs $\geq |M|$ vertices). $|C| = 2|M| \leq 2 \cdot OPT$. ✓

Example: $K_4$ (complete graph with 4 vertices). $OPT = 2$ (do any 2 vertices cover all 6 edges? No: 2 vertices cover 5 edges. $OPT = 3$). The algorithm picks 2 edges $\rightarrow$ 4 vertices. Approximation $= 4/3 < 2$. ✓

APX-hardness and Approximation Barriers

APX-hard problems: there is no PTAS (Polynomial-Time Approximation Scheme) if $P \neq NP$.

Unique Games Conjecture (UGC, Khot 2002): strong version of $P \neq NP$. If UGC is true:

Vertex Cover cannot be approximated better than $(2-\epsilon)$ for any $\epsilon > 0$
MAX-CUT cannot be approximated better than $0.878$

Max-3-SAT: APX-hard. Best known approximation: $7/8$ (randomized—assign each variable randomly). Impossible to do better if $P \neq NP$ (Håstad 2001).

Bin Packing: Algorithm Analysis

Problem: $n$ items with sizes $s_i \in (0,1]$. Minimal number of unit bins to fit all items.

First-Fit (FF): for each item, place in first bin that fits. If none—open new bin. Approximation: $\leq \frac{17}{10} OPT + 2$ — not very good.

First-Fit Decreasing (FFD): sort items in decreasing order, then apply FF.

Theorem (Johnson, 1974): $FFD(I) \leq \frac{11}{9} OPT(I) + 4$.

How it is proven: analyze how much "space" is wasted in each bin. When sorted in decreasing order, losses are minimized.

FPTAS for Bin Packing: if sizes are known in advance, there exists a $(1+\epsilon)$-approximation in $O(n \log n + f(1/\epsilon))$ — Fernandez de la Vega & Lueker (1981). Split items into "large" ($s_i > \epsilon$) and "small" ($s_i \leq \epsilon$), process differently.

Full Analysis: Metric Facility Location Problem

$k$-Median problem: given $n$ clients and $n$ "warehouse" points, metric $d$. Choose $k$ warehouses to minimize the sum of distances from clients to nearest warehouse.

Local Search algorithm (Arya et al., 2001): $(3+\epsilon)$-approximation.

Initial solution: pick $k$ arbitrary warehouses $S$
Improvement: for each swap (one warehouse in $S$ for one outside $S$):
- If the swap decreases cost—make the swap
Stop: if no improving swaps (local optimum)

Guarantee: in local optimum, the algorithm gives $\leq (3+\epsilon) \cdot OPT$. Proof: amortized analysis of swaps. Application: optimal server placement, regional center distribution.

Classification by Guarantees

Approximation algorithms are divided by strength of guarantee:

Constant approximation ($\rho$): $ALG \leq \rho \cdot OPT$ for some fixed $\rho$. For example, Vertex Cover—2-approximation (Bar-Yehuda, 1981).
Logarithmic ($O(\log n)$): Set Cover—$\ln n$-approximation via greedy (Johansson, Lovász, Chvátal).
PTAS (Polynomial-Time Approximation Scheme): for any $\epsilon > 0$ there exists an algorithm with guarantee $(1+\epsilon)$, polynomial in $n$ (but possibly exponential in $1/\epsilon$).
FPTAS (Fully PTAS): polynomial in both $n$ and $1/\epsilon$. Exists for knapsack, does not exist for TSP (if $P \neq NP$).

Lower Bounds of Approximability

Many problems have proven inapproximability thresholds (assuming $P \neq NP$):

MAX-3SAT: cannot approximate better than $7/8$ (Håstad, 1997)
Vertex Cover: not better than $1.36$ (Dinur-Safra)
Set Cover: not better than $(1-\epsilon)\ln n$ (Feige, Moskovitz)

These results are obtained through the PCP theorem—one of the deepest theorems in complexity theory.

Applications

Approximation algorithms are critical for big data, where even Branch & Bound is too slow: recommender systems ($k$-medoids, $k$-means via 2-approximation), social networks (influence maximization via greedy submodular algorithm with guarantee $1-1/e$), bioinformatics (sequence alignment).