Module IV·Article II·~5 min read

Central Limit Theorem



The Central Limit Theorem (CLT) is one of the most important results in mathematics. It explains why the normal distribution is ubiquitous: the properly normalized sum of a large number of independent random variables is approximately normally distributed.

Classical CLT

Theorem (Lindeberg-Levy): For i.i.d. $X_1,\ldots,X_n$ with $E[X_i]=\mu$, $Var[X_i]=\sigma^2 < \infty$: $(\bar{X}_n - \mu)/(\sigma/\sqrt{n}) \xrightarrow{d} N(0,1)$. Equivalently: $(S_n - n\mu)/(\sigma\sqrt{n}) \xrightarrow{d} N(0,1)$.

Rate of convergence (Berry-Esseen): $|P((S_n-n\mu)/(\sigma\sqrt{n}) \leq x) - \Phi(x)| \leq C\cdot\rho/(\sigma^3\sqrt{n})$, where $\rho = E[|X-\mu|^3]$ and $C \leq 0.4748$; the rate is $O(1/\sqrt{n})$.

Proof via characteristic functions: let $Y_i = (X_i-\mu)/\sigma$; then $\varphi_{(S_n-n\mu)/(\sigma\sqrt{n})}(t) = [\varphi_Y(t/\sqrt{n})]^n$. Expanding $\varphi_Y(s) = 1 - s^2/2 + o(s^2)$ for small $s$ gives $[1 - t^2/(2n) + o(1/n)]^n \rightarrow \exp(-t^2/2) = \varphi_{N(0,1)}(t)$, and Lévy's continuity theorem converts convergence of characteristic functions into convergence of the CDFs to the normal.
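The convergence can also be checked empirically. A minimal simulation sketch (standard library only; the choice of Exponential(1) summands, the sample sizes, and the seed are illustrative assumptions, not from the text):

```python
import math
import random

def standardized_mean(draw, n, mu, sigma, rng):
    """Draw n i.i.d. values and return the standardized sum (S_n - n*mu)/(sigma*sqrt(n))."""
    s = sum(draw(rng) for _ in range(n))
    return (s - n * mu) / (sigma * math.sqrt(n))

def phi(x):
    """Standard normal CDF via the error function."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

rng = random.Random(0)
# Exponential(1): mu = 1, sigma = 1 -- a skewed distribution, yet the CLT applies.
draws = [standardized_mean(lambda r: r.expovariate(1.0), 200, 1.0, 1.0, rng)
         for _ in range(20000)]

# The empirical CDF of the standardized sums should be close to Phi.
for x in (-1.0, 0.0, 1.0):
    emp = sum(d <= x for d in draws) / len(draws)
    print(f"x={x:+.1f}  empirical={emp:.3f}  Phi(x)={phi(x):.3f}")
```

Even though Exponential(1) is strongly skewed, with $n = 200$ the empirical CDF already tracks $\Phi$ to within about $0.01$.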

Generalizations of CLT

Lindeberg's Theorem: For independent (not necessarily identically distributed) $X_k$ with $E[X_k]=\mu_k$ and $B_n^2 = Var[S_n]$, under the Lindeberg condition: $(S_n - E[S_n])/B_n \xrightarrow{d} N(0,1)$. Lindeberg condition (asymptotic negligibility of individual terms): for every $\varepsilon > 0$, $\frac{1}{B_n^2}\sum_{k=1}^n E[(X_k-\mu_k)^2 \cdot 1_{\{|X_k-\mu_k|>\varepsilon B_n\}}] \rightarrow 0$.

Multivariate CLT: For i.i.d. random vectors $X_i$ with mean $\mu$ and covariance matrix $\Sigma = Cov(X_1)$: $n^{-1/2}(S_n - n\mu) \xrightarrow{d} N(0, \Sigma)$.

Applications of CLT

Normal approximation in statistics: the $t$-test and $z$-test are based on the CLT. Rule of thumb: for $n\geq30$, $\bar{X} \approx N(\mu, \sigma^2/n)$.

Insurance: Total losses $S_n = X_1+\ldots+X_n \approx N(n\mu, n\sigma^2)$. The company chooses a reserve $R$ so that $P(S_n > R) < \alpha$, giving $R = n\mu + z_\alpha \cdot \sigma\sqrt{n}$, where $z_\alpha$ is the upper $\alpha$-quantile of $N(0,1)$.
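A minimal sketch of the reserve formula (the portfolio numbers are made up for illustration and intentionally differ from the exercise below; `statistics.NormalDist` supplies the normal quantile):

```python
import math
from statistics import NormalDist

def reserve(n, mu, sigma, alpha):
    """Smallest reserve R with P(S_n > R) < alpha under S_n ~ N(n*mu, n*sigma^2)."""
    z = NormalDist().inv_cdf(1 - alpha)        # upper alpha-quantile of N(0, 1)
    return n * mu + z * sigma * math.sqrt(n)

# Hypothetical portfolio: 1,000 policies, mean claim 100, sd 30, alpha = 1%.
R = reserve(1000, 100.0, 30.0, 0.01)
print(f"reserve R ~ {R:,.0f}")
```

Note that the safety margin $z_\alpha\,\sigma\sqrt{n}$ grows only like $\sqrt{n}$, while expected losses grow like $n$: pooling risks makes the relative reserve smaller.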

Exercise: (a) $X \sim Bin(100, 0.3)$. Use the normal approximation to estimate $P(X\leq25)$ and compare with the exact value. Is a continuity correction needed? (b) A stock price compounds multiplicatively: $S_n = S_0\cdot\prod_i R_i$ with $R_i$ i.i.d. Apply the CLT to $\ln(S_n/S_0)$. What does this imply about the distribution of $S_n$? (c) $10{,}000$ insurance policies with $\mu=500$ rub. and $\sigma=200$ rub. per policy. What reserve is needed to cover total losses with probability $0.999$?
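For part (a), the exact binomial CDF is cheap to compute, so the approximation can be checked numerically. A sketch (standard library only):

```python
import math

def binom_cdf(k, n, p):
    """Exact Binomial(n, p) CDF as a sum of pmf terms."""
    return sum(math.comb(n, i) * p**i * (1 - p)**(n - i) for i in range(k + 1))

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

n, p, k = 100, 0.3, 25
mu, sigma = n * p, math.sqrt(n * p * (1 - p))       # 30 and ~4.583

exact = binom_cdf(k, n, p)
plain = phi((k - mu) / sigma)                       # no continuity correction
corrected = phi((k + 0.5 - mu) / sigma)             # with continuity correction

print(f"exact={exact:.4f}  plain={plain:.4f}  corrected={corrected:.4f}")
```

The continuity-corrected value lands much closer to the exact probability than the uncorrected one, which answers the question in (a).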

Rate of convergence in CLT

Berry-Esseen Theorem: Let $X_i$ be i.i.d. with $E[X]=0$, $E[X^2]=\sigma^2$, $E[|X|^3]=\rho<\infty$. Then $\sup_x |P(S_n/(\sigma\sqrt{n}) \leq x) - \Phi(x)| \leq C\rho/(\sigma^3\sqrt{n})$, where $C \leq 0.4748$; the rate of convergence is $O(1/\sqrt{n})$. For Bernoulli$(p)$: $\rho/\sigma^3 = (p^2+(1-p)^2)/\sqrt{p(1-p)}$, which blows up as $p \rightarrow 0$ or $p \rightarrow 1$: the normal approximation is worse for strongly asymmetric distributions.
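The Bernoulli ratio above is easy to tabulate; the parameter values below are illustrative:

```python
import math

def be_bound(p, n, C=0.4748):
    """Berry-Esseen bound C*rho/(sigma^3*sqrt(n)) for a sum of n Bernoulli(p) terms."""
    sigma = math.sqrt(p * (1 - p))
    rho = p * (1 - p) * (p**2 + (1 - p)**2)   # E|X - p|^3 for Bernoulli(p)
    return C * rho / (sigma**3 * math.sqrt(n))

for p in (0.5, 0.1, 0.01):
    print(f"p={p:5.2f}  bound for n=1000: {be_bound(p, 1000):.4f}")
```

At $p=0.5$ the ratio $\rho/\sigma^3$ equals $1$ (the best case); by $p=0.01$ the bound is roughly ten times larger at the same $n$.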

Edgeworth Expansion: Extends CLT, including corrections of order $n^{-1/2}$: $F_n(x) = \Phi(x) - \varphi(x)[\kappa_3/(6\sigma^3\sqrt{n})\cdot H_2(x)] + O(n^{-1})$, where $H_k$ are Hermite polynomials, $\kappa_3$ is the third cumulant. Allows construction of more precise confidence intervals (bootstrap studentized CI).

Multivariate CLT and Functional Limits

For vector $X = (X_1,\ldots,X_k)$ i.i.d. with covariance matrix $\Sigma$: $\sqrt{n}(\bar{X}_n - \mu) \rightarrow N_k(0, \Sigma)$ in distribution. Delta method: if $g: \mathbb{R}^k \rightarrow \mathbb{R}$ is differentiable, then $\sqrt{n}(g(\bar{X}_n) - g(\mu)) \rightarrow N(0, \nabla g(\mu)^\top \Sigma \nabla g(\mu))$. Application: asymptotic confidence intervals for functions of sample moments.
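A simulation sketch of the delta method, assuming $g = \ln$ and Exponential(1) data (both choices are illustrative): here $\mu = 1$, $\sigma^2 = 1$, $g'(\mu) = 1$, so the limiting variance of $\sqrt{n}(g(\bar{X}_n) - g(\mu))$ is $1$.

```python
import math
import random

rng = random.Random(1)
n, reps = 400, 5000
mu = 1.0                     # mean of Exp(1); its variance is also 1
g = math.log                 # g(x) = ln x, so g'(mu) = 1/mu = 1

vals = []
for _ in range(reps):
    xbar = sum(rng.expovariate(1.0) for _ in range(n)) / n
    vals.append(math.sqrt(n) * (g(xbar) - g(mu)))

mean = sum(vals) / reps
var = sum((v - mean) ** 2 for v in vals) / reps
print(f"mean (should be ~0): {mean:.3f}   variance (should be ~1): {var:.3f}")
```

The empirical variance of the transformed statistic matches $\nabla g(\mu)^\top \Sigma \nabla g(\mu) = 1$ up to simulation noise.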

Functional CLT (Donsker's theorem): Let $S_n(t) = (S_{\lfloor nt \rfloor} - \lfloor nt \rfloor\mu)/(\sigma\sqrt{n})$ be the normalized partial-sum process. As $n\rightarrow\infty$: $S_n(\cdot) \xrightarrow{d} W(\cdot)$ (Brownian motion); for the linearly interpolated version this holds in the space $C[0,1]$ with the topology of uniform convergence. This is the "continuous version" of the CLT: the limit is an entire process, not merely a limiting distribution.
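One consequence worth probing numerically: functionals of the path, such as the running maximum, converge to the corresponding Brownian functionals, and the reflection principle gives $P(\max_{[0,1]} W > a) = 2(1-\Phi(a))$. A sketch with a symmetric $\pm1$ walk (the step distribution, sizes, and seed are illustrative):

```python
import math
import random

def scaled_path_max(n, rng):
    """Max over t of the Donsker-scaled path S_[nt]/sqrt(n), for +-1 steps (sigma = 1)."""
    s, best = 0, 0
    for _ in range(n):
        s += rng.choice((-1, 1))
        best = max(best, s)
    return best / math.sqrt(n)

rng = random.Random(7)
n, reps = 400, 4000
frac = sum(scaled_path_max(n, rng) > 1.0 for _ in range(reps)) / reps

# Reflection principle for the Brownian limit: P(max W > 1) = 2(1 - Phi(1)).
target = 2 * (1 - 0.5 * (1 + math.erf(1 / math.sqrt(2))))
print(f"empirical={frac:.3f}  Brownian limit={target:.3f}")
```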

Poisson Approximation: Chen-Stein Theorem

The Stein-Chen method gives a bound on the rate of convergence to the Poisson distribution. For a sum of dependent indicators $S = \sum X_i$ with $\lambda = E[S]$: $d_{TV}(S, Poisson(\lambda)) \leq (b_1 + b_2)/\lambda$, where $b_1, b_2$ are expressed via pairwise dependencies. It is applied in the analysis of rare coincidences: the number of fixed points ("matches") in a random permutation, the number of cliques in a random graph.
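The classic matching example: the number of fixed points of a uniform random permutation is approximately Poisson(1), even though the indicators are dependent. A simulation sketch (sizes and seed are illustrative):

```python
import math
import random
from collections import Counter

rng = random.Random(3)
n, reps = 50, 20000
counts = Counter()
for _ in range(reps):
    perm = list(range(n))
    rng.shuffle(perm)
    counts[sum(i == perm[i] for i in range(n))] += 1   # number of fixed points

# Compare the empirical pmf with Poisson(1): e^{-1}/k!
for k in range(4):
    emp = counts[k] / reps
    pois = math.exp(-1) / math.factorial(k)
    print(f"k={k}  empirical={emp:.3f}  Poisson(1)={pois:.3f}")
```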

Generalized CLT and Stable Distributions

Stable distributions: $X$ is stable if for independent copies $X_1, X_2$ and any $a, b > 0$ there exist $c > 0$ and $d$ such that $aX_1 + bX_2 \overset{d}{=} cX + d$. The normal, Cauchy, and Lévy distributions are stable. Stability parameter $\alpha \in (0,2]$: $\alpha=2 \rightarrow$ normal. Domain of attraction: distributions with heavy tails $P(X>x) \sim Cx^{-\alpha}$ for $\alpha < 2$ are attracted to stable laws. Generalized CLT: Gnedenko-Kolmogorov (1954) gave a complete characterization of the possible limits of normalized sums of i.i.d. variables via stable laws.
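The $\alpha = 1$ case is easy to see numerically: the sample mean of i.i.d. Cauchy variables is again standard Cauchy for every $n$, so averaging never concentrates. A sketch (sizes and seed are illustrative; $P(|Cauchy| > 1) = 1/2$ exactly):

```python
import math
import random

def cauchy(rng):
    """Standard Cauchy via the inverse CDF: tan(pi*(U - 1/2))."""
    return math.tan(math.pi * (rng.random() - 0.5))

rng = random.Random(5)
reps = 20000
fracs = {}
for n in (1, 100):
    # alpha = 1: the mean of n i.i.d. Cauchy variables is again standard Cauchy,
    # so P(|mean| > 1) stays at 1/2 no matter how large n gets.
    fracs[n] = sum(
        abs(sum(cauchy(rng) for _ in range(n)) / n) > 1 for _ in range(reps)
    ) / reps
    print(f"n={n:3d}  P(|sample mean| > 1) ~ {fracs[n]:.3f}  (exact: 0.5)")
```

Contrast this with finite-variance data, where the same probability would shrink to $0$ as $n$ grows.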

Random Matrices and Spectral Distributions

Wigner's law (semicircular): for a symmetric $n \times n$ matrix $W_n$ with i.i.d. zero-mean entries above the diagonal, as $n\rightarrow\infty$ the empirical distribution of the suitably rescaled eigenvalues converges to the semicircular density $f(x) = (2/\pi)\sqrt{1-x^2}$ on $[-1,1]$. Marchenko-Pastur law: for rectangular matrices $X$ ($n\times p$, $p/n\rightarrow c$) the spectral distribution of $X^\top X/n$ converges to the Marchenko-Pastur density. Application: random matrices in PCA, where purely noisy eigenvalues follow the MP law.

Lévy's Continuity Theorem

A sequence of distributions $F_n$ converges weakly to a distribution $F$ if and only if the characteristic functions $\varphi_n(t)$ converge pointwise for all $t$ to a function $\varphi$ that is continuous at $0$; in that case $\varphi$ is the characteristic function of $F$. This is a powerful tool: it lets one prove the CLT and other limit theorems by analyzing characteristic functions, bypassing direct work with the distributions.
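For sums of uniforms the characteristic function is available in closed form, so the pointwise convergence in Lévy's theorem can be observed exactly, with no simulation. A sketch (the test points $t$ and values of $n$ are illustrative): the CF of $U - 1/2$ is $\sin(s/2)/(s/2)$.

```python
import math

def cf_standardized_uniform_sum(t, n):
    """Exact CF of (S_n - n/2)/sqrt(n/12), where S_n is a sum of n Uniform(0,1)."""
    s = t / math.sqrt(n / 12.0)
    base = 1.0 if s == 0 else math.sin(s / 2.0) / (s / 2.0)   # CF of U - 1/2
    return base ** n

for t in (0.5, 1.0, 2.0):
    limit = math.exp(-t * t / 2.0)                 # CF of N(0, 1)
    for n in (4, 36):
        print(f"t={t}  n={n:2d}: {cf_standardized_uniform_sum(t, n):.5f}"
              f"  ->  exp(-t^2/2) = {limit:.5f}")
```

Each fixed $t$ shows $\varphi_n(t) \rightarrow e^{-t^2/2}$, which by the continuity theorem is exactly the statement of the CLT for this family.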

Rate of Convergence in CLT: Berry-Esseen Inequality

Estimate of the rate of convergence: $\sup_x |P(\sqrt{n}(\bar{X}_n - \mu)/\sigma \leq x) - \Phi(x)| \leq C\cdot E[|X-\mu|^3]/(\sigma^3\sqrt{n})$. Constant $C \leq 0.4748$ (Shevtsova, 2011). Practical implication: the normal approximation is typically good for $n \geq 30$ for symmetric distributions and $n \geq 100$ for highly asymmetric ones. If the third moment is infinite, the Berry-Esseen rate no longer applies (though the CLT itself requires only finite variance); with infinite variance the classical CLT fails altogether.

Numerical Example: Normal Approximation of Sum

Problem: $X_1,\ldots,X_{36} \sim Uniform(0,1)$. Find $P(S_{36} > 20)$, where $S = X_1 + \ldots + X_{36}$.

Step 1: $E[X_i]=1/2$, $Var[X_i]=1/12$. For the sum: $E[S_{36}]=36 \cdot (1/2)=18$, $Var[S_{36}]=36 \cdot (1/12)=3$.

Step 2: By CLT: $S_{36} \approx N(18, 3)$. Standardize: $Z = (S_{36} - 18)/\sqrt{3}$.

Step 3: $P(S_{36}>20) = P(Z>(20-18)/1.732) = P(Z>1.155) = 1-\Phi(1.155) \approx 1-0.876 = 0.124$.

Step 4: Accuracy estimate via Berry-Esseen: $E[|X-1/2|^3]=1/32$; $\sigma^3=(1/12)^{3/2}\approx0.0241$; $n=36$. Error $\leq 0.4748\cdot(1/32)/(0.0241\cdot6) \approx 0.103$. This worst-case bound is conservative; the actual error of the normal approximation to the Irwin-Hall distribution at $n=36$ is far smaller.
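The four steps can be reproduced numerically (standard library only):

```python
import math

def phi(x):
    """Standard normal CDF."""
    return 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))

n = 36
mean = n * 0.5                       # E[S_36] = 18
var = n / 12.0                       # Var[S_36] = 3
z = (20 - mean) / math.sqrt(var)     # standardized threshold
p = 1 - phi(z)
print(f"z = {z:.3f},  P(S_36 > 20) ~ {p:.3f}")

# Berry-Esseen bound on the CDF approximation error:
rho = 1 / 32.0                       # E|X - 1/2|^3 for Uniform(0, 1)
sigma3 = (1 / 12.0) ** 1.5
bound = 0.4748 * rho / (sigma3 * math.sqrt(n))
print(f"Berry-Esseen bound ~ {bound:.3f}")
```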
