
Extreme Value Theory

Extreme Events and Tail Risks


Most classical statistical methods—estimation of means, variances, correlations—work “in the center” of the distribution. Risk management, however, is concerned with rare, extreme events: the “100-year flood,” a market crash like Black Monday, a catastrophe on the scale of Chernobyl. Extrapolating from the normal distribution fails here—real tails are far “fatter.” Extreme Value Theory (EVT) is the branch of probability theory that provides a universal apparatus for analyzing such events. Just as the Central Limit Theorem describes the behavior of sums and means, the theorems of Gnedenko and of Pickands–Balkema–de Haan describe the behavior of maxima and tails.

Fisher–Tippett–Gnedenko Theorem (GEV)

Problem. We have n iid observations $X_1, ..., X_n$. How is the maximum $M_n = \max(X_1, ..., X_n)$ distributed as $n \to \infty$?

Without normalization, $M_n \to +\infty$ (trivially). We normalize: seek sequences $a_n > 0$, $b_n$ such that $(M_n − b_n)/a_n$ converges to a non-degenerate distribution.

Gnedenko Theorem (1943): If the limit exists, it belongs to the generalized extreme value distribution (GEV):

$ G_\xi(x) = \exp\left(−(1 + \xi·x)^{−1/\xi}\right), \quad 1 + \xi·x > 0. $

For $\xi = 0$ (limit case): $G_0(x) = \exp(−e^{−x})$.

Three classes by the shape parameter $\xi$:

  • $\xi = 0$ (Gumbel): Light tails. Distributions: normal, log-normal, exponential, gamma.
  • $\xi > 0$ (Fréchet): Heavy tails, power-law decay $P(X > x) \sim x^{−1/\xi}$. Distributions: Pareto, Student’s t with finite df, Cauchy. Financial losses, floods, insurance claims.
  • $\xi < 0$ (Weibull): Bounded tail ($X ≤ x_\text{max}$). Distributions: uniform, beta. Rare in natural phenomena.

Parameterization with location $\mu$ and scale $\sigma$: $G_{\xi, \mu, \sigma}(x) = G_\xi((x − \mu)/\sigma)$.

Block Maxima Method

Algorithm.

  1. Split the data into blocks of length $k$ (e.g., annual maxima for natural data).
  2. Take the maximum in each block: $M_1, M_2, ..., M_n$.
  3. Fit GEV$(\xi, \mu, \sigma)$ via maximum likelihood to $\{M_i\}$.

Drawback. Uses only the $n$ block maxima out of $n·k$ data points—much information is discarded, especially with short time series.
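The three steps above can be sketched on synthetic data (an assumption for illustration: 50 “years” of i.i.d. exponential daily values, which lie in the Gumbel domain, so the fitted $\xi$ should be near 0). Note that scipy parameterizes GEV with shape $c = -\xi$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)

# Synthetic data: 50 "years" x 365 daily i.i.d. Exponential(1) values.
data = rng.exponential(scale=1.0, size=50 * 365)

# Steps 1-2: split into blocks of length k = 365, take each block's maximum.
block_maxima = data.reshape(50, 365).max(axis=1)

# Step 3: fit GEV by maximum likelihood. scipy uses the convention c = -xi,
# so c near 0 (xi near 0) indicates the Gumbel class, as expected here.
c, loc, scale = stats.genextreme.fit(block_maxima)
xi = -c
print(f"xi = {xi:.3f}, mu = {loc:.3f}, sigma = {scale:.3f}")
```

With only 50 block maxima the estimate of $\xi$ is noisy—exactly the drawback noted above.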

Generalized Pareto Distribution (GPD)

POT (Peaks Over Threshold) approach. Instead of keeping only block maxima, we use all observations above a threshold $u$—retaining far more of the data.

Pickands–Balkema–de Haan Theorem (1974–75): For a wide class of distributions, for sufficiently high $u$, the conditional distribution of exceedances $(X − u \mid X > u)$ converges to GPD:

$ H_{\xi, \sigma}(y) = 1 − (1 + \xi·y/\sigma)^{−1/\xi}, \quad y ≥ 0. $

For $\xi = 0$: $H_0(y) = 1 − e^{−y/\sigma}$ (exponential).

Connection between GEV and GPD: $\xi_\text{GPD} = \xi_\text{GEV}$. If maxima are Fréchet, exceedances are Pareto.
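This connection can be checked on simulated heavy-tailed data; a sketch using a Student’s $t$ sample with 4 df, whose tail index gives a theoretical $\xi = 1/4$ (scipy’s genpareto shape parameter $c$ is exactly $\xi$):

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)

# Heavy-tailed sample: Student's t with 4 df, so theoretically xi = 1/4.
x = rng.standard_t(df=4, size=100_000)

# POT: keep exceedances over a high threshold u (here the 95th percentile).
u = np.quantile(x, 0.95)
excess = x[x > u] - u

# Fit GPD to the excesses; fix the location at 0 as the theorem implies.
xi, _, sigma = stats.genpareto.fit(excess, floc=0)
print(f"xi = {xi:.2f} (theory: 0.25), sigma = {sigma:.2f}")
```

The estimate is biased slightly if $u$ is not high enough—the threshold-selection problem addressed by the mean excess function below.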

Tail VaR Estimation

With threshold exceedance probability $p_u = P(X > u)$ and GPD parameters $\xi$, $\sigma$:

$ \mathrm{VaR}_\alpha(X) = u + \frac{\sigma}{\xi} \left[ \left( \frac{p_u}{1 − \alpha} \right)^{\xi} − 1 \right] $ (for $\xi \neq 0$).

For $\xi = 0$: $\mathrm{VaR}_\alpha(X) = u + \sigma·\ln(p_u/(1 − \alpha))$.

$ \mathrm{CVaR}_\alpha(X) = \frac{\mathrm{VaR}_\alpha + \sigma − \xi·u}{1 − \xi} $ for $\xi < 1$.
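The three formulas above transcribe directly into code (a minimal sketch; all parameters are in the same units as the data):

```python
import numpy as np

def evt_var(alpha, u, xi, sigma, p_u):
    """VaR_alpha from GPD tail parameters, with p_u = P(X > u)."""
    if xi == 0.0:
        return u + sigma * np.log(p_u / (1 - alpha))
    return u + sigma / xi * ((p_u / (1 - alpha)) ** xi - 1)

def evt_cvar(alpha, u, xi, sigma, p_u):
    """CVaR_alpha = (VaR_alpha + sigma - xi*u) / (1 - xi), valid for xi < 1."""
    return (evt_var(alpha, u, xi, sigma, p_u) + sigma - xi * u) / (1 - xi)
```

For example, `evt_var(0.99, u=2.0, xi=0.18, sigma=1.42, p_u=0.0258)` returns roughly 3.5 (in percent).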

Numerical Example

Daily S&P 500 returns over 10 years (2014–2024), ~2520 points. Take losses ($−r$), threshold $u = 2%$ (i.e., days with loss $> 2%$).

Exceedances $n_u = 65$ (≈ 2.6% of days). $p_u = 65/2520 = 0.0258$.

GPD fit by MLE: $\hat{\xi} = 0.18$ (positive—heavy tail, typical for finance), $\hat{\sigma} = 1.42%$.

$\mathrm{VaR}_{0.99}$: $(1 − \alpha)/p_u = 0.01/0.0258 = 0.388$. $ \mathrm{VaR}_{0.99} = 2% + (1.42/0.18)·[0.388^{−0.18} − 1] = 2% + 7.89·[1.186 − 1] = 2% + 1.47% = 3.47% $

$\mathrm{VaR}_{0.999}$: $(1 − \alpha)/p_u = 0.001/0.0258 = 0.0388$. $ \mathrm{VaR}_{0.999} = 2% + 7.89·[0.0388^{−0.18} − 1] = 2% + 7.89·[1.795 − 1] = 2% + 6.27% = 8.27% $

$\mathrm{CVaR}_{0.99} = (3.47% + 1.42% − 0.18·2%)/(1 − 0.18) = 4.53%/0.82 = 5.52%$

Comparison with normal approximation. Standard deviation of historical returns $\approx 1.0%$. Normal $\mathrm{VaR}_{0.999} = 3.090·1% = 3.09%$—underestimating the real tail risk by a factor of about 2.7 (8.27% vs. 3.09%).
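The example can be recomputed directly in code (using the fitted values $u = 2%$, $\hat{\xi} = 0.18$, $\hat{\sigma} = 1.42%$, $p_u = 0.0258$; all quantities in percent):

```python
from scipy import stats

u, xi, sigma, p_u = 2.0, 0.18, 1.42, 0.0258   # percent units

# GPD tail VaR for two confidence levels, then CVaR at 0.99.
var = {a: u + sigma / xi * ((p_u / (1 - a)) ** xi - 1) for a in (0.99, 0.999)}
cvar99 = (var[0.99] + sigma - xi * u) / (1 - xi)

# Normal approximation with daily sigma ~ 1%.
normal_var = stats.norm.ppf(0.999) * 1.0

print(f"EVT VaR_0.99  = {var[0.99]:.2f}%")    # ~3.5%
print(f"EVT VaR_0.999 = {var[0.999]:.2f}%")   # ~8.3%
print(f"EVT CVaR_0.99 = {cvar99:.2f}%")       # ~5.5%
print(f"Normal VaR_0.999 = {normal_var:.2f}%, ratio {var[0.999] / normal_var:.1f}x")
```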

These are the “black swans”: events that are virtually impossible under the normal model, yet occur in reality.

Mean Excess Function

MEF: $e(u) = E[X − u | X > u]$. A graphical tool for threshold $u$ selection.

For GPD: $e(u) = (\sigma + \xi·u)/(1 − \xi)$—linear in $u$. If the empirical MEF becomes approximately linear above some $u_0$, then $u_0$ is a good threshold choice.
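A sanity check on simulated data: for a GPD sample with $\xi = 0.2$, $\sigma = 1$, the empirical MEF should track the theoretical line $(1 + 0.2u)/0.8$:

```python
import numpy as np
from scipy import stats

rng = np.random.default_rng(2)

# Sample from GPD(xi=0.2, sigma=1): theory gives e(u) = (1 + 0.2*u)/0.8.
x = stats.genpareto.rvs(c=0.2, scale=1.0, size=200_000, random_state=rng)

def mean_excess(x, thresholds):
    """Empirical mean excess function e(u) = mean of (x - u) over x > u."""
    return np.array([(x[x > u] - u).mean() for u in thresholds])

for u, e in zip(np.linspace(0, 3, 7), mean_excess(x, np.linspace(0, 3, 7))):
    print(f"u = {u:.1f}: e(u) = {e:.2f}  (theory {(1 + 0.2 * u) / 0.8:.2f})")
```

In practice one plots `mean_excess` against a grid of thresholds and looks for the onset of linearity.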

EVT Applications

1. Hydrology. The “100-year flood” is $\mathrm{VaR}_{0.99}$ for annual maximum water levels. Without EVT, a 50-year record contains on average only 0.5 exceedances of the 100-year level (typically zero or one in any given sample), so the quantile cannot be estimated empirically. EVT allows extrapolation into the tail.

Example: Design of the Three Gorges Dam in China used EVT to estimate 1-in-1000-year flood.

2. Catastrophe insurance. PML (Probable Maximum Loss)—losses for 1-in-200-year (Solvency II) or 1-in-250-year (Lloyd’s RDS) scenario. EVT calibration based on historical catastrophes (hurricanes, earthquakes) adjusted for inflation and exposure growth.

3. Hydrometeorology. Wave heights (for North Sea platform design), wind speed (for wind turbines), precipitation (for sewer design).

4. Cyber-security. Data breach size: Pareto distribution with very heavy tails ($\xi > 0.5$). Equifax 2017—147 million records, Yahoo 2016—3 billion.
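Returning to the hydrology case: the $T$-year return level is the $1 − 1/T$ quantile of the fitted annual-maximum GEV, obtained by inverting $G_{\xi,\mu,\sigma}$. A sketch with illustrative (made-up) parameters for annual water levels in meters:

```python
import numpy as np

def return_level(T, xi, mu, sigma):
    """z_T with annual exceedance probability 1/T: solves G(z_T) = 1 - 1/T."""
    y = -np.log(1 - 1 / T)          # intermediate quantity -log(1 - 1/T)
    if xi == 0.0:
        return mu - sigma * np.log(y)
    return mu + sigma / xi * (y ** (-xi) - 1)

# Illustrative (hypothetical) GEV fit: xi = 0.1, mu = 5 m, sigma = 0.8 m.
for T in (100, 1000):
    print(f"{T}-year level: {return_level(T, xi=0.1, mu=5.0, sigma=0.8):.2f} m")
```

With $\xi > 0$ the return level grows without bound as $T$ increases—the signature of a heavy-tailed hazard.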

Real-world Applications

  • Solvency II SCR. Insurers use EVT to calibrate 1-in-200-year catastrophe scenarios. RMS, AIR Worldwide—main model providers.
  • Basel FRTB. Stressed Expected Shortfall—calibration on stressed periods, often with EVT methods for the tail.
  • Reinsurance pricing (Munich Re, Swiss Re). Tail loss extrapolation—basis for CatXL and Stop Loss premiums.
  • Climate risk modelling. IPCC uses EVT to assess changes in extreme temperatures, precipitation under climate change.
  • Operational risk. Severe loss events (rogue trader Société Générale 2008—€4.9 billion, JPMorgan London Whale 2012—$6.2 billion)—heavy tail extrapolation via POT–EVT.

Assignment. Daily S&P 500 returns over 10 years (can be downloaded from Yahoo Finance or generated via GARCH(1,1) simulation). Take losses ($−r$). (a) Build the Mean Excess Plot $e(u)$ for $u ∈ [0, 5%]$ and pick a suitable threshold $u^*$. (b) Fit a GPD via MLE (scipy.stats.genpareto.fit) to the exceedances above $u^*$. (c) Compute $\mathrm{VaR}_{0.99}$, $\mathrm{VaR}_{0.999}$, $\mathrm{CVaR}_{0.99}$ by two methods: EVT and the normal approximation. (d) Compare the estimates and discuss the differences. (e) Bonus: backtesting—how many observations exceed $\mathrm{VaR}_{0.99}$ in the out-of-sample period? Does this match the 1% level?
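For the simulation route in the assignment, a minimal GARCH(1,1) generator (parameter values are illustrative, not estimated from data):

```python
import numpy as np

def simulate_garch11(n, omega=1e-6, alpha=0.1, beta=0.88, seed=0):
    """Simulate daily returns r_t = sigma_t * z_t, z_t ~ N(0,1), with
    variance recursion sigma_t^2 = omega + alpha*r_{t-1}^2 + beta*sigma_{t-1}^2."""
    rng = np.random.default_rng(seed)
    z = rng.standard_normal(n)
    r = np.empty(n)
    var = omega / (1 - alpha - beta)   # start at the unconditional variance
    for t in range(n):
        r[t] = np.sqrt(var) * z[t]
        var = omega + alpha * r[t] ** 2 + beta * var
    return r

returns = simulate_garch11(2520)  # ~10 years of daily data
losses = -returns
```

Volatility clustering in the GARCH path fattens the tails of the simulated returns, so the EVT/normal comparison in part (c) should show a gap similar to the worked example.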
