Module II·Article III·~4 min read
Differential Games with a Finite Horizon
Zero-Sum Games and Minimax
Turn this article into a podcast
Pick voices, format, length — AI generates the audio
Terminal Moment: What Happens "at the End"
In problems with a finite horizon [0, T], the system state x(T) occupies a special status: the function g(x(T)) defines the "terminal payoff". This determines the "boundaries"—the sets of initial states x₀ from which P or E can guarantee the desired outcome. Mathematically: the value function is defined from the HJI with terminal condition V(x,T) = g(x) and is "computed backwards" down to t = 0.
Solution Structure via Backward Induction
The principle of optimality gives: V(x,t) = min_u max_v {F(x,u,v,t)dt + V(x+f·dt, t+dt)}.
This is the "Bellman equation" in reverse time: knowing V at t+dt, we compute V at t. We start from T (where V(x,T) = g(x)) and "move backward".
For LQ games: V(x,t) = xᵀP(t)x (quadratic in x), P(t) is the Riccati-Isaacs matrix satisfying an ODE.
LQ Game with Finite Horizon
ẋ = Ax + Bu + Cv, J = xᵀ(T)Gx(T) + ∫₀ᵀ [xᵀQx + uᵀRu − vᵀSv] dt.
Riccati-Isaacs equation:
−Ṗ = AᵀP + PA + Q − PBR⁻¹BᵀP + PCS⁻¹CᵀP, P(T) = G.
(Integrate backward in time from T to 0.)
Optimal strategies (linear feedbacks):
u*(t) = −R⁻¹Bᵀ P(t) x(t) v*(t) = S⁻¹Cᵀ P(t) x(t)
Solvability condition: the Riccati equation must have a solution over all [0, T]. For large T (or large C), the solution may "blow up"—the value of the game is unbounded.
Survival and Capture Zones
For problems with a terminal "target": T = the "target set" (for example, a ball around E).
V(x, T) = 1 if x ∈ T, else −1.
The value function takes values in {−1, +1}: P captured (±1 for P) or E escaped.
Reach-Avoid set: initial conditions x₀ from which P can guarantee capture. The boundary = barrier surface.
Computation: level-set method (Hamilton-Jacobi toolbox). V(x,t) = 0 — boundary. V(x,t) < 0 — P’s zone, V(x,t) > 0 — E’s zone.
Complete Analysis: Two-Dimensional Capture Game
System: ẋP = uP, ẋE = uE, |uP| ≤ αP, |uE| ≤ αE. Capture: |xP − xE| ≤ l.
Change of variable: r = xP − xE (relative position). ṙ = uP − uE.
J = 0 if |r(T)| ≤ l (capture), 1 otherwise (avoidance).
Simplest 1D case (r ∈ ℝ): HJI: ∂V/∂t + min_u max_v [(u−v) ∂V/∂r] = 0.
Value function V(r,t): V = 0 if |r| ≤ l, and V(r,t) = V(|r| − (αP−αE)(T−t), T) for |r| > l.
If αP > αE: P captures E if |r(0)| ≤ l + (αP−αE)T. The capture zone grows with T.
If αP < αE: capture zone is only |r(0)| ≤ l (already captured), E escapes.
Numerical Methods for HJI
Finite difference method: discretization in x on a grid, integration of HJI backwards in t.
Requirements: monotone (upwind) scheme for correct viscosity solution. Number of points: O((1/h)^n)—exponential in dimension!
Curse of dimensionality: for n=4 and step h=0.1: 10⁴ = 10,000 points—feasible. For n=6: 10⁶—hard. For n=10: impossible.
Alternatives:
- Deep learning for HJI (DeepReach, 2021): neural network approximates V(x,t) → works in high dimensions
- Linearization: V ≈ xᵀP(t)x (LQ-approximation)
- PINN (Physics-Informed Neural Networks): train a neural net to satisfy HJI as a physical constraint
Problems with Fixed Time
In games with finite horizon, several types of terminal conditions are important:
- Fixed T, free x(T): V(x, T) = g(x)—terminal-cost problem
- Fixed xT: achieve a given state—V(xT, T) = 0, V(x, T) = +∞ for x ≠ xT (degeneracy)
- Free T (stopping time): T is defined by the first moment when x(T) enters the target set—games with stopping time
Principle of Dynamic Programming
Bellman's principle of optimality for games: V(x, t) = inf_u sup_v {∫_t^{t+δ} F dτ + V(x(t+δ), t+δ)}. This local condition generates the HJI equation and connects values at different moments in time. Solution—a viscosity solution to the HJI with terminal condition V(x, T) = g(x), integrated backward from T to 0.
Reachability and Achievable Zones
For practical safety problems, the optimum is often less relevant than the reachability set—which states x the player can guarantee at time T against the opponent’s worst strategy. This is handled via signed distance: define V(x, T) as the signed distance to the target set and solve HJI backward. The level set {V ≤ 0} is the guaranteed reach zone.
Applications
Differential games with a finite horizon are foundational for safe navigation of quadcopters (Tomlin, Mitchell), maneuvering military aircraft (Air Combat Maneuvering), obstacle avoidance for autonomous transportation, reactor control on a finite time window. Libraries helperOC, hj_reachability implement these algorithms for dimensions up to 5–6.
§ Act · what next