Module V·Article I·~5 min read
Stochastic Systems and the Kalman Filter
Stochastic Optimal Control
Real-world systems are subject to random disturbances (wind gusts, thermal noise, unmodeled dynamics), and measurements are corrupted by sensor noise. Deterministic models and Luenberger observers are insufficient here: the probabilistic nature of the uncertainty must be accounted for explicitly. The Kalman filter, published by Rudolf Kalman in 1960, provides the optimal (in the mean-squared-error sense) solution for linear stochastic systems. Without exaggeration, it is one of the most influential algorithms of the 20th century: without it, the Apollo program (according to NASA engineers, the Kalman filter was a critical component of navigation to the Moon), GPS, modern radar, and signal processing would not have been possible.
Linear Stochastic System
Discrete model: x_{t+1} = A·x_t + B·u_t + w_t (system noise), y_t = C·x_t + v_t (measurement noise).
Noises: w_t ~ N(0, Q), v_t ~ N(0, R), white, independent of each other and of the initial state x_0. Q (n×n) and R (p×p) are covariance matrices.
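The model above can be simulated directly; a minimal sketch with NumPy, using an illustrative scalar system (all matrices and values below are assumptions for the example):

```python
import numpy as np

rng = np.random.default_rng(0)

# Toy scalar system: one state, one input, one measurement.
A = np.array([[1.0]]); B = np.array([[1.0]]); C = np.array([[1.0]])
Q = np.array([[0.01]])   # process-noise covariance
R = np.array([[0.5]])    # measurement-noise covariance

x = np.zeros(1)
xs, ys = [], []
for t in range(50):
    u = np.array([0.1])                                # constant input
    w = rng.multivariate_normal(np.zeros(1), Q)        # system noise w_t
    v = rng.multivariate_normal(np.zeros(1), R)        # measurement noise v_t
    x = A @ x + B @ u + w                              # x_{t+1} = A x_t + B u_t + w_t
    y = C @ x + v                                      # y_t = C x_t + v_t
    xs.append(x.copy()); ys.append(y.copy())
```

The lists `xs` and `ys` then play the roles of the hidden trajectory and the observed measurements that the filter will process.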
Problem: Based on all measurements y_0, y_1, ..., y_t, reconstruct the best estimate x̂_{t|t} = E[x_t | y_0, ..., y_t].
In the Bayesian interpretation: compute the posterior distribution p(x_t | y_{0:t}). For linear-Gaussian systems, this remains Gaussian, and the filter reduces to recursive updating of the mean and covariance.
Kalman Filter Algorithm
Each step is divided into two sub-steps.
Prediction step (time update):
- x̂_{t|t−1} = A·x̂_{t−1|t−1} + B·u_{t−1} (mean propagated through the dynamics).
- P_{t|t−1} = A·P_{t−1|t−1}·Aᵀ + Q (covariance propagated through the dynamics, plus process noise).
Update step (measurement update):
- Innovation: r_t = y_t − C·x̂_{t|t−1}.
- Kalman gain: K_t = P_{t|t−1}·Cᵀ·(C·P_{t|t−1}·Cᵀ + R)⁻¹.
- Updated estimate: x̂_{t|t} = x̂_{t|t−1} + K_t·r_t.
- Updated covariance: P_{t|t} = (I − K_t·C)·P_{t|t−1}.
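The two sub-steps translate into a few lines of linear algebra; a sketch in NumPy (the function names `kf_predict` / `kf_update` are illustrative, not from the article):

```python
import numpy as np

def kf_predict(x, P, A, B, u, Q):
    """Time update: propagate mean and covariance through the dynamics."""
    x_pred = A @ x + B @ u                    # x̂_{t|t−1}
    P_pred = A @ P @ A.T + Q                  # P_{t|t−1}
    return x_pred, P_pred

def kf_update(x_pred, P_pred, y, C, R):
    """Measurement update: correct the prediction with the innovation."""
    r = y - C @ x_pred                        # innovation r_t
    S = C @ P_pred @ C.T + R                  # innovation covariance
    K = P_pred @ C.T @ np.linalg.inv(S)       # Kalman gain K_t
    x_new = x_pred + K @ r                    # x̂_{t|t}
    P_new = (np.eye(len(x_pred)) - K @ C) @ P_pred   # P_{t|t}
    return x_new, P_new, K
```

For instance, with a scalar system (A = C = 1, Q = 0, R = 1) and prior P = 1, a single update against a measurement y = 1 gives K = 0.5 and moves the estimate exactly halfway toward the measurement.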
Interpretation of K_t. The gain expresses how much we trust the measurement relative to the prediction. If the measurement noise R is small, K_t approaches C⁻¹ (when C is invertible): we rely almost entirely on y_t. If R is large, K_t ≈ 0: we trust the dynamic model more.
Properties
Optimality. Among all linear estimators, the Kalman filter minimizes E[||x_t − x̂_t||²] (BLUE — Best Linear Unbiased Estimator). For Gaussian noises, it is optimal among all (nonlinear) estimators.
Stationary filter. As t → ∞ (for time-invariant A, C with (A, C) detectable), the covariance P_{t|t} converges to P_∞, the solution of the filter's algebraic Riccati equation, and the gain converges to a constant K_∞. This yields a filter with fixed coefficients, which is extremely important for embedded systems.
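The stationary gain can be computed offline simply by iterating the covariance recursion to convergence; a sketch for an illustrative constant-velocity model (the matrices below are assumptions for the example):

```python
import numpy as np

# Toy model: scalar position + velocity, dt = 0.1, position measured.
A = np.array([[1.0, 0.1], [0.0, 1.0]])
C = np.array([[1.0, 0.0]])
Q = np.diag([0.0, 0.01])      # process noise enters the velocity
R = np.array([[1.0]])

# Iterate the prior covariance:
#   P <- A (P - P Cᵀ (C P Cᵀ + R)⁻¹ C P) Aᵀ + Q
# Its fixed point is the solution of the filter's algebraic Riccati equation.
P = np.eye(2)
for _ in range(500):
    S = C @ P @ C.T + R
    K = P @ C.T @ np.linalg.inv(S)
    P = A @ (P - K @ C @ P) @ A.T + Q

K_inf = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)   # constant gain K_∞
```

In production code one would instead solve the Riccati equation directly (e.g. with `scipy.linalg.solve_discrete_are`), but the fixed-point iteration makes the convergence P_{t|t−1} → P_∞ explicit.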
Numerical Example: Object Tracking
State x = (px, py, vx, vy) — position and velocity. Constant velocity dynamics with acceleration noise: x_{t+1} = [I_2, dt·I_2; 0, I_2]·x_t + w_t, dt = 0.1 s, Q = diag(0, 0, 0.01, 0.01). Measurements y = (px, py) (position only, e.g. GPS): C = [I_2, 0], R = diag(25, 25), i.e. measurement noise with a standard deviation of 5 m.
Initial estimate: x̂_{0|0} = (0, 0, 0, 0), P_{0|0} = diag(100, 100, 10, 10).
After 100 steps, the filter stabilizes: K_∞ ≈ [0.78, 0; 0, 0.78; 5.7, 0; 0, 5.7] — strong trust in measurements for position, velocity estimate is the derivative of the smoothed measurements. Root mean square error of position ≈ 1.5 m (instead of 5 m without filtering), velocity ≈ 0.3 m/s.
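The tracking example can be reproduced end to end; a sketch assuming the 5 m GPS standard deviation (so R = 25·I₂), an arbitrarily chosen true initial state, and a fixed random seed (the exact RMSE values depend on the noise realization):

```python
import numpy as np

rng = np.random.default_rng(1)
dt = 0.1
A = np.block([[np.eye(2), dt * np.eye(2)],
              [np.zeros((2, 2)), np.eye(2)]])     # constant-velocity dynamics
C = np.hstack([np.eye(2), np.zeros((2, 2))])      # position-only measurements
Q = np.diag([0.0, 0.0, 0.01, 0.01])               # acceleration noise
R = 25.0 * np.eye(2)                              # GPS noise, std 5 m

x_true = np.array([0.0, 0.0, 1.0, 0.5])           # assumed true initial state
x_hat = np.zeros(4)                               # x̂_{0|0}
P = np.diag([100.0, 100.0, 10.0, 10.0])           # P_{0|0}

err_filt, err_meas = [], []
for _ in range(500):
    x_true = A @ x_true + rng.multivariate_normal(np.zeros(4), Q)
    y = C @ x_true + rng.multivariate_normal(np.zeros(2), R)
    # prediction step
    x_hat = A @ x_hat
    P = A @ P @ A.T + Q
    # update step
    K = P @ C.T @ np.linalg.inv(C @ P @ C.T + R)
    x_hat = x_hat + K @ (y - C @ x_hat)
    P = (np.eye(4) - K @ C) @ P
    err_filt.append(np.linalg.norm((x_true - x_hat)[:2]))
    err_meas.append(np.linalg.norm(x_true[:2] - y))

rmse_filt = np.sqrt(np.mean(np.square(err_filt)))  # filtered position RMSE
rmse_meas = np.sqrt(np.mean(np.square(err_meas)))  # raw measurement RMSE
```

The filtered position RMSE comes out well below the raw measurement RMSE, matching the qualitative claim above.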
Nonlinear Extensions
Extended Kalman Filter (EKF). System: x_{t+1} = f(x_t, u_t) + w_t, y_t = h(x_t) + v_t. Linearization at each step: F_t = ∂f/∂x |_{x̂_{t|t}}, H_t = ∂h/∂x |_{x̂_{t|t−1}}. Apply the Kalman algorithm with F_t, H_t in place of A and C.
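One EKF cycle can be sketched as follows; `f`, `h` and their Jacobians are assumed to be supplied by the user (the function name `ekf_step` is illustrative):

```python
import numpy as np

def ekf_step(x, P, f, h, F_jac, H_jac, y, Q, R):
    """One EKF cycle: nonlinear propagation, linearized covariance update.

    f, h: dynamics and measurement functions;
    F_jac, H_jac: their Jacobians evaluated at a given state.
    """
    # predict: propagate the mean through the nonlinear dynamics,
    # the covariance through the linearization F_t
    x_pred = f(x)
    F = F_jac(x)
    P_pred = F @ P @ F.T + Q
    # update: linearize the measurement model at the prediction
    H = H_jac(x_pred)
    S = H @ P_pred @ H.T + R
    K = P_pred @ H.T @ np.linalg.inv(S)
    x_new = x_pred + K @ (y - h(x_pred))
    P_new = (np.eye(len(x)) - K @ H) @ P_pred
    return x_new, P_new
```

When f and h are linear and the Jacobians are the constant matrices A and C, the step reduces exactly to the standard Kalman filter above.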
EKF is suboptimal due to linearization errors. It may diverge in the presence of strong nonlinearities.
Unscented Kalman Filter (UKF, Julier and Uhlmann, 1997). Instead of linearization, a deterministic set of "sigma points" is propagated through the nonlinear f and h, and the output statistics are estimated from the transformed points. The posterior mean and covariance are captured accurately to third order in the Taylor expansion. Often outperforms the EKF at moderate additional computational cost.
Particle Filter (Sequential Monte Carlo). Approximation of the posterior distribution p(x_t | y_{0:t}) by a set of weighted "particles" {x_t^{(i)}, w_t^{(i)}}. Does not assume Gaussianity, works for any distributions. Computationally more expensive (1000–10,000 particles), but universal.
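One cycle of the bootstrap particle filter (propagate, reweight, resample) can be sketched as follows; the sampler `f_sample` and log-likelihood `loglik` are assumed user-supplied, and the function name is illustrative:

```python
import numpy as np

rng = np.random.default_rng(2)

def particle_filter_step(particles, weights, f_sample, loglik, y):
    """One bootstrap-filter cycle: propagate, reweight, resample.

    f_sample draws x_{t+1} given x_t (vectorized over particles);
    loglik(y, x) is the measurement log-likelihood.
    """
    particles = f_sample(particles)                   # propagate each particle
    logw = np.log(weights) + loglik(y, particles)     # reweight by likelihood
    logw -= logw.max()                                # numerical stabilization
    weights = np.exp(logw)
    weights /= weights.sum()
    # systematic resampling to counter weight degeneracy
    n = len(particles)
    positions = (rng.random() + np.arange(n)) / n
    idx = np.searchsorted(np.cumsum(weights), positions)
    return particles[idx], np.full(n, 1.0 / n)
```

After resampling, the weighted particle cloud approximates the posterior p(x_t | y_{0:t}); the estimate x̂_{t|t} is simply the (weighted) particle mean, with no Gaussian assumption anywhere.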
Real-world Applications
- GPS and inertial navigation (INS). Fusion of accelerometers (accurate over short intervals, but drifting) and GPS (drift-free, but updating only about once per second) via an EKF. Accuracy of 1–10 m in a smartphone, below 0.1 m in aviation.
- Apollo (1969) and the Space Shuttle. Navigation to the Moon was computed onboard with a Kalman filter. The software was developed at MIT under Richard Battin, and Kalman personally consulted on the project.
- Finance. Estimation of hidden volatility (stochastic volatility models) — nonlinear Kalman filter. Market-making trading strategies use the KF to estimate fair value.
- Robotics and SLAM. Simultaneous Localization and Mapping: a robot builds a map and simultaneously localizes itself. EKF-SLAM, GraphSLAM — core of autonomous Roomba vacuum cleaners and Amazon Kiva warehouse robots.
- Biomedicine. Wearable ECG monitors with motion artifact filtering. Estimation of blood oxygen saturation in pulse oximeters.
Assignment. A ship with state x = (px, py, vx, vy). Constant velocity dynamics + process noise Q = 0.1·I (on velocity). GPS measurements: y = (px, py) + N(0, R = 25·I). Step dt = 1 s, simulation for 200 s. True trajectory: circular motion of radius 100 m with angular velocity 0.05 rad/s (start at (100, 0)). (a) Implement the discrete Kalman filter. (b) Generate true data, noisy GPS measurements, and filter estimates. (c) Plot all three trajectories on a single graph. (d) Compute position RMSE for measurements and for the filter. (e) Experiment: change Q = 0.01·I ("we are confident" that velocity is constant) — how does this affect maneuver tracking?