451. Bias–Variance Tradeoff
451.1. Bias–Variance Tradeoff
We have:
- $f(x)$: True function
- $y = f(x) + \varepsilon$: Observation with noise
- $\hat{f}(x)$: Model
Bias (systematic error)
- Train the same model on many different training sets
- Plot all the fitted curves together
- Compute the average of those fitted curves
- Compare that average curve to the true function $f(x)$.
The difference between the average curve and the true function is the bias. This error appears consistently across datasets.
Variance (random error)
- Train the same model on many different training sets.
- Plot all the fitted curves together
The spread of those curves around their average prediction at each $x$ is the variance.
A high-variance model typically shows:
- Low error on the training set
- High error on the test set
Its predictions are inconsistent across different datasets.
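The two procedures above (fit the same model on many training sets, average the curves, then measure the gap to the true function and the spread around the average) can be sketched in a few lines. This is only an illustration under stated assumptions: the true function `true_fn`, the noise level, the polynomial model, and the helper names `simulate_fits` and `bias_and_variance` are all hypothetical choices, not something these notes specify. The inputs are held fixed so that the datasets differ only through noise.

```python
import numpy as np

def true_fn(x):
    # ASSUMPTION: an arbitrary smooth true function chosen for illustration.
    return np.sin(2 * np.pi * x)

def simulate_fits(degree, n_datasets=200, n_points=30, noise_std=0.3, seed=0):
    """Fit a polynomial of the given degree to many noisy training sets.

    Returns the evaluation grid and an array of shape (n_datasets, grid.size)
    holding every fitted curve evaluated on that grid.
    """
    rng = np.random.default_rng(seed)
    x = np.linspace(0.0, 1.0, n_points)   # fixed inputs; only the noise changes
    grid = np.linspace(0.0, 1.0, 50)
    curves = np.empty((n_datasets, grid.size))
    for d in range(n_datasets):
        y = true_fn(x) + rng.normal(0.0, noise_std, n_points)
        coeffs = np.polyfit(x, y, degree)
        curves[d] = np.polyval(coeffs, grid)
    return grid, curves

def bias_and_variance(grid, curves):
    """Pointwise bias (average curve minus truth) and pointwise variance."""
    avg_curve = curves.mean(axis=0)
    bias = avg_curve - true_fn(grid)
    variance = curves.var(axis=0)          # spread around the average curve
    return bias, variance

grid, curves_simple = simulate_fits(degree=1)
_, curves_flex = simulate_fits(degree=9)

bias_simple, var_simple = bias_and_variance(grid, curves_simple)
bias_flex, var_flex = bias_and_variance(grid, curves_flex)

mean_sq_bias_simple = float(np.mean(bias_simple ** 2))
mean_sq_bias_flex = float(np.mean(bias_flex ** 2))
avg_var_simple = float(np.mean(var_simple))
avg_var_flex = float(np.mean(var_flex))

print(f"degree 1: mean squared bias={mean_sq_bias_simple:.4f}, avg variance={avg_var_simple:.4f}")
print(f"degree 9: mean squared bias={mean_sq_bias_flex:.4f}, avg variance={avg_var_flex:.4f}")
```

Under these assumptions the straight-line model should show the larger bias and the degree-9 polynomial the larger variance, which is the pattern the two definitions describe.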
Example
True Function:
Observations:
Observations differ due to noise.
| Dataset 1 | Dataset 2 | Dataset 3 |
| 1.1 | 0.5 | 1.2 |
| −0.2 | 0.3 | −0.1 |
| 1.05 | 1.2 | 0.7 |
Average across datasets
Bias
Bias at each point
Overall Bias
Mean Square Bias
Variance
Deviations
Overall Variance
Average Variance
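The quantities listed above can be computed directly from the three-dataset table. The sketch below assumes that each table row is one input point and each column one dataset, and that variance is the population variance (dividing by the number of datasets). The true-function values are not recoverable from these notes, so `f_true_placeholder` is a purely hypothetical stand-in used only to show how the bias terms would be formed.

```python
import numpy as np

# Values copied from the table: rows are input points, columns are Dataset 1..3.
values = np.array([
    [1.10, 0.5, 1.2],
    [-0.20, 0.3, -0.1],
    [1.05, 1.2, 0.7],
])

# HYPOTHETICAL placeholder: the notes do not give the true function values.
# Replace with the real f(x) at each point before interpreting the bias.
f_true_placeholder = np.array([1.0, 0.0, 1.0])

# Average across datasets (one value per input point).
avg = values.mean(axis=1)

# Bias at each point, overall (mean) bias, and mean squared bias.
bias = avg - f_true_placeholder
overall_bias = float(bias.mean())
mean_sq_bias = float(np.mean(bias ** 2))

# Deviations of each dataset from the per-point average, then variance.
deviations = values - avg[:, None]
variance = np.mean(deviations ** 2, axis=1)   # variance at each point
avg_variance = float(variance.mean())         # average over the points

print("average per point:  ", np.round(avg, 3))
print("variance per point: ", np.round(variance, 4))
print("average variance:   ", round(avg_variance, 4))
print("bias (placeholder f):", np.round(bias, 3))
```

With the table values, the per-point averages are roughly 0.933, 0.000, and 0.983, and the average variance is roughly 0.062. The bias figures depend entirely on the placeholder and carry no meaning until the real true-function values are supplied.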
Overfitting
| Model Complexity | Bias | Variance | Training Error | Test Error | Situation |
| Too Simple | High | Low | High | High | Underfitting |
| Just Right | Moderate | Moderate | Low | Lowest | Good Generalization |
| Too Complex | Low | High | Very Low | High | Overfitting |
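The pattern in the table can be reproduced with a small experiment that compares training and test error as model complexity grows. This is a sketch under assumptions, not a prescribed method: the true function `true_fn`, the noise level, the sample sizes, the polynomial degrees (1, 3, 12), and the helper `average_errors` are all hypothetical choices made for illustration. Errors are averaged over many trials so that a single lucky or unlucky draw does not dominate.

```python
import numpy as np

def true_fn(x):
    # ASSUMPTION: an arbitrary true function chosen only for illustration.
    return np.sin(2 * np.pi * x)

def average_errors(degree, n_trials=100, n_train=20, n_test=200,
                   noise_std=0.3, seed=0):
    """Average training and test MSE of a polynomial fit over many trials."""
    rng = np.random.default_rng(seed)
    x_train = np.linspace(0.0, 1.0, n_train)
    train_mse, test_mse = [], []
    for _ in range(n_trials):
        y_train = true_fn(x_train) + rng.normal(0.0, noise_std, n_train)
        x_test = rng.uniform(0.0, 1.0, n_test)
        y_test = true_fn(x_test) + rng.normal(0.0, noise_std, n_test)
        coeffs = np.polyfit(x_train, y_train, degree)
        train_mse.append(np.mean((np.polyval(coeffs, x_train) - y_train) ** 2))
        test_mse.append(np.mean((np.polyval(coeffs, x_test) - y_test) ** 2))
    return float(np.mean(train_mse)), float(np.mean(test_mse))

# Degree 1 ~ too simple, degree 3 ~ about right, degree 12 ~ too complex.
results = {degree: average_errors(degree) for degree in (1, 3, 12)}
for degree, (train, test) in results.items():
    print(f"degree={degree:2d}  train MSE={train:.3f}  test MSE={test:.3f}")
```

Under these assumptions, training error keeps falling as the degree rises, while test error is lowest for the middle model and rises again for the most complex one.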
Bias–Variance Tradeoff