330. Sample Average Approximation
Sample Average Approximation (SAA) — replace an intractable expectation in a stochastic program with a Monte Carlo average over scenarios.
330.1. Problem
Original stochastic program:
The expectation is hard (high-dimensional integral; complex distribution).
330.2. SAA approximation
Generate i.i.d. samples from the distribution of . Solve:
This is just a deterministic mathematical program — with scenarios. Solvable by standard tools (simplex, branch-and-bound, etc.).
330.3. Convergence
As :
with the optimal solution also converging to the true optimum . Rate: by CLT — the error variance decays as , standard error as .
330.4. Confidence intervals
The objective value from a single SAA run is a random estimate. Standard confidence interval:
where is the sample standard deviation of the recourse values .
330.5. Practical SAA algorithm
- Generate samples
- Solve SAA problem →
- Independently evaluate on fresh samples (out-of-sample, ) to estimate true
- Compare in-sample vs out-of-sample estimate. If similar, is a good solution. If much worse out-of-sample, increase .
This out-of-sample validation is essential — in-sample optimization can over-fit to the scenarios used.
330.6. Choosing
Trade-off:
- Larger : better approximation, but exploding problem size (number of recourse variables = )
- Smaller : faster solve, more sample error
Heuristics:
- Replication: solve SAA with several smaller ‘s, take the best. Cheaper than one huge .
- Adaptive sampling: start small, grow until confidence interval is tight enough
- Stratified sampling: sample more from high-impact regions of the distribution to reduce variance
330.7. Variance reduction
Plain Monte Carlo has variance . Tricks to do better:
- Antithetic variates — use and together
- Importance sampling — sample more from rare-but-impactful scenarios
- Common random numbers — across optimization iterations
- Quasi-Monte Carlo (Halton, Sobol sequences) — convergence instead of for smooth integrands
330.8. When SAA shines
- Continuous distributions of (chance-constrained SP can be cumbersome)
- Distributions without closed-form moments
- Black-box simulators — easy to sample, hard to integrate analytically