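The bias–variance tradeoff describes two sources of prediction error: models too simple to capture the underlying pattern (high bias) and models that memorize noise in the training data (high variance). A good model balances both so that it performs well on unseen data.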
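The tradeoff is often written as a decomposition of expected squared error. The notation below is an assumption added for illustration and does not come from these notes: the data follow y = f(x) + ε with zero-mean noise of variance σ², f̂ is the fitted model, and the expectation is taken over training sets and noise.

```latex
\mathbb{E}\big[(y - \hat{f}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{f}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{f}(x) - \mathbb{E}[\hat{f}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{irreducible noise}}
```

Only the first two terms depend on the model. The noise term sets a floor that no model can beat.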
High bias (underfitting)
- Training error and test error both high
- Model too simple for the pattern (e.g., a straight line fit to curved data)
High variance (overfitting)
- Training error low, test error much higher
- Typical causes: too many features, deep trees without regularization, too little training data
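The two failure modes above can be reproduced on a toy problem. This is a minimal sketch under assumptions that are not in the notes: the data are synthetic (y = x² plus Gaussian noise), a least-squares straight line stands in for the high-bias model, and a 1-nearest-neighbour predictor stands in for the high-variance one. All function names are hypothetical.

```python
import random

random.seed(0)  # fixed seed so the run is repeatable

def make_data(n):
    """Toy curved target y = x^2 plus Gaussian noise (made-up data)."""
    xs = [random.uniform(-3.0, 3.0) for _ in range(n)]
    ys = [x * x + random.gauss(0.0, 1.0) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Least-squares straight line: too simple for a curve (high bias)."""
    mx = sum(xs) / len(xs)
    my = sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = my - slope * mx
    return lambda x: slope * x + intercept

def fit_1nn(xs, ys):
    """1-nearest-neighbour: memorizes every training point (high variance)."""
    points = list(zip(xs, ys))
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of a fitted model on a dataset."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

x_train, y_train = make_data(40)
x_test, y_test = make_data(200)

results = {}
for name, fit in [("straight line", fit_line), ("1-NN", fit_1nn)]:
    model = fit(x_train, y_train)
    results[name] = (mse(model, x_train, y_train), mse(model, x_test, y_test))
    train_err, test_err = results[name]
    print(f"{name:>13}: train MSE = {train_err:.2f}, test MSE = {test_err:.2f}")
```

The straight line should show large error on both sets, while 1-NN reaches zero training error (each training point is its own nearest neighbour) and a clearly higher test error.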
Mitigations
- More relevant data, better features
- Regularization (L1/L2), pruning, early stopping
- A simpler model, or an ensemble that averages several models, with validation error monitored throughout
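As one illustration of the regularization item, the sketch below refits a one-parameter model on many small noisy datasets and compares how much the fitted slope varies with and without an L2 penalty. The setup is assumed for the example, not taken from the notes: a true slope of 2.0, five fixed inputs, unit-variance noise, a no-intercept model, and a penalty strength of 2.5.

```python
import random
import statistics

random.seed(1)  # fixed seed so the run is repeatable

TRUE_SLOPE = 2.0
XS = [-1.0, -0.5, 0.0, 0.5, 1.0]  # fixed inputs; only the noise changes
SXX = sum(x * x for x in XS)

def fit_slope(ys, lam):
    """L2-penalized slope for y = w*x: w = sum(x*y) / (sum(x^2) + lam)."""
    sxy = sum(x * y for x, y in zip(XS, ys))
    return sxy / (SXX + lam)

def simulate(lam, datasets):
    """Bias and variance of the fitted slope across many training sets."""
    slopes = [fit_slope(ys, lam) for ys in datasets]
    bias = statistics.mean(slopes) - TRUE_SLOPE
    variance = statistics.pvariance(slopes)
    return bias, variance

# 500 small training sets drawn from the same true line with fresh noise
datasets = [
    [TRUE_SLOPE * x + random.gauss(0.0, 1.0) for x in XS]
    for _ in range(500)
]

bias_ols, var_ols = simulate(0.0, datasets)      # no penalty
bias_ridge, var_ridge = simulate(2.5, datasets)  # L2 penalty

print(f"no penalty: bias = {bias_ols:+.3f}, variance = {var_ols:.3f}")
print(f"L2 penalty: bias = {bias_ridge:+.3f}, variance = {var_ridge:.3f}")
```

The penalized slope varies less from dataset to dataset but is pulled toward zero, so variance falls while bias grows. In practice the penalty strength would be chosen on validation data.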
Learning curves
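Plot training and validation error against training set size. If the two curves stay far apart (low training error, much higher validation error), variance is likely high. If both errors are high and close together, suspect high bias or noisy labels.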
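The numbers behind such a plot can be computed with a short loop. The sketch below is an assumption-laden toy rather than a recipe: it uses synthetic data (y = x² plus noise), a straight line as the high-bias model and 1-nearest-neighbour as the high-variance model, and prints the errors rather than plotting them. All names are hypothetical.

```python
import random

random.seed(2)  # fixed seed so the run is repeatable

def make_data(n):
    """Toy curved target y = x^2 plus Gaussian noise (made-up data)."""
    xs = [random.uniform(-3.0, 3.0) for _ in range(n)]
    ys = [x * x + random.gauss(0.0, 1.0) for x in xs]
    return xs, ys

def fit_line(xs, ys):
    """Least-squares straight line (high-bias model for this data)."""
    mx, my = sum(xs) / len(xs), sum(ys) / len(ys)
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return lambda x: slope * (x - mx) + my

def fit_1nn(xs, ys):
    """1-nearest-neighbour predictor (high-variance model)."""
    points = list(zip(xs, ys))
    return lambda x: min(points, key=lambda p: abs(p[0] - x))[1]

def mse(model, xs, ys):
    """Mean squared error of a fitted model on a dataset."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

x_pool, y_pool = make_data(160)  # training pool, used in growing prefixes
x_val, y_val = make_data(200)    # fixed validation set

curve = []
for n in [10, 20, 40, 80, 160]:
    xs, ys = x_pool[:n], y_pool[:n]
    row = {"n": n}
    for name, fit in [("line", fit_line), ("1nn", fit_1nn)]:
        model = fit(xs, ys)
        row[name] = (mse(model, xs, ys), mse(model, x_val, y_val))
    curve.append(row)
    print(
        f"n={n:>3}  line train/val = {row['line'][0]:.2f}/{row['line'][1]:.2f}"
        f"  1-NN train/val = {row['1nn'][0]:.2f}/{row['1nn'][1]:.2f}"
    )
```

For the straight line, both errors should stay high however much data is added, which is the high-bias pattern. For 1-NN, training error stays at zero while validation error remains well above it, which is the persistent gap that signals high variance.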
Important interview questions and answers
- Q: Underfitting sign?
A: Poor performance on both train and validation.
- Q: Overfitting sign?
A: Great train metrics, poor validation/test metrics.
Self-check
- What is underfitting?
- What is overfitting?
- Name one way to reduce variance.
Tip: Adding more features can increase variance, so watch held-out (validation/test) performance.
Interview prep
- Q: Overfitting?
A: Low train error and high test error, which indicates high variance.