Metrics translate model outputs into decisions. Pick metrics that match the cost of each error type: a false alarm (false positive) versus a missed detection (false negative). Business KPIs (revenue, safety incidents) should align with ML metrics, not replace them blindly.
Classification metrics
- Accuracy — correct / total (misleading if imbalanced)
- Precision — of predicted positives, how many correct
- Recall — of actual positives, how many caught
- F1 — harmonic mean of precision and recall
- ROC-AUC — rank quality across thresholds
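The definitions above can be sketched in plain Python. This is a minimal illustration, not a library implementation: the helper names (`safe_div`, `classification_metrics`, `roc_auc`) and the data are hypothetical, and returning 0.0 when a ratio is undefined is a convention assumed here for simplicity.

```python
def safe_div(num, den):
    """Return num / den, or 0.0 when the denominator is zero (assumed convention)."""
    return num / den if den else 0.0

def classification_metrics(tp, fp, fn, tn):
    """Compute accuracy, precision, recall, and F1 from confusion-matrix counts."""
    accuracy = safe_div(tp + tn, tp + fp + fn + tn)
    precision = safe_div(tp, tp + fp)
    recall = safe_div(tp, tp + fn)
    f1 = safe_div(2 * precision * recall, precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

def roc_auc(scores, labels):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

# Hypothetical imbalanced data: 85 positives in 1000 rows, model always predicts negative.
always_negative = classification_metrics(tp=0, fp=0, fn=85, tn=915)
print(always_negative)  # high accuracy, zero recall

# Hypothetical scores: one positive is ranked below one negative.
auc = roc_auc([0.9, 0.8, 0.7, 0.3], [1, 0, 1, 0])
print(f"auc={auc:.2f}")
```

The always-negative model reaches 91.5% accuracy while catching no positives, which is why accuracy alone misleads on imbalanced data.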
Regression metrics
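MAE, RMSE, MAPE: choose based on whether large errors are disproportionately costly. RMSE squares each error, so it penalizes large misses more heavily than MAE does. MAPE expresses error as a percentage of the actual value and is undefined when an actual value is zero.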
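To see how one large miss affects each regression metric differently, here is a small sketch using only the standard library. The function names and the numbers in `y_true` and `y_pred` are made up for illustration.

```python
import math

def mae(y_true, y_pred):
    """Mean absolute error: every unit of error counts the same."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def rmse(y_true, y_pred):
    """Root mean squared error: squaring makes large errors dominate."""
    return math.sqrt(sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true))

def mape(y_true, y_pred):
    """Mean absolute percentage error; fails if any actual value is zero."""
    return 100 * sum(abs((t - p) / t) for t, p in zip(y_true, y_pred)) / len(y_true)

# Hypothetical data: three small misses of 10 and one large miss of 100.
y_true = [100, 200, 300, 400]
y_pred = [110, 190, 310, 300]

mae_value = mae(y_true, y_pred)
rmse_value = rmse(y_true, y_pred)
mape_value = mape(y_true, y_pred)
print(f"MAE={mae_value:.2f} RMSE={rmse_value:.2f} MAPE={mape_value:.2f}%")
```

With these numbers RMSE comes out well above MAE, because the single error of 100 is squared before averaging.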
Threshold thinking
# Confusion matrix cells (conceptual counts)
tp, fp, fn, tn = 80, 10, 5, 905
precision = tp / (tp + fp)
recall = tp / (tp + fn)
print(f"precision={precision:.2f} recall={recall:.2f}")
Practice: Optional snippets use pandas-style pseudocode. Run them locally with pandas if you want hands-on practice.
Important interview questions and answers
- Q: When should you prioritize high recall?
A: When missing a positive is dangerous or expensive, as in medical screening or high-cost fraud.
- Q: What is the precision vs recall trade-off?
A: Lowering the decision threshold usually increases recall but lowers precision; raising it usually does the opposite.
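The threshold trade-off can be checked on a toy example. The sketch below assumes hypothetical model scores and labels, and a helper named `precision_recall_at` invented for this illustration; it treats a score at or above the threshold as a positive prediction.

```python
def precision_recall_at(threshold, scores, labels):
    """Precision and recall when scores >= threshold are predicted positive."""
    predicted = [s >= threshold for s in scores]
    tp = sum(1 for p, y in zip(predicted, labels) if p and y == 1)
    fp = sum(1 for p, y in zip(predicted, labels) if p and y == 0)
    fn = sum(1 for p, y in zip(predicted, labels) if not p and y == 1)
    precision = tp / (tp + fp) if (tp + fp) else 0.0  # assumed convention
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall

# Hypothetical scores (higher means more likely positive) and true labels.
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.20]
labels = [1, 1, 0, 1, 1, 0, 1, 0]

results = {t: precision_recall_at(t, scores, labels) for t in (0.85, 0.50, 0.25)}
for t, (p, r) in results.items():
    print(f"threshold={t:.2f} precision={p:.2f} recall={r:.2f}")
```

On this data, recall rises from 0.40 to 1.00 as the threshold drops from 0.85 to 0.25, while precision falls from 1.00 to about 0.71. Real models do not always move this cleanly, hence "usually".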
Self-check
- When is accuracy misleading?
- Define precision and recall in plain language.
Tip: Pair precision/recall with the cost of false positives vs false negatives.
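One way to act on this tip is to attach a cost to each error type and compare thresholds by total cost. The sketch below is illustrative only: the thresholds, error counts, cost values, and function names are all hypothetical, and a linear cost per error is an assumption.

```python
# Hypothetical (false positives, false negatives) observed at three thresholds.
candidates = {0.85: (0, 3), 0.50: (1, 1), 0.25: (2, 0)}

def total_cost(fp, fn, cost_fp, cost_fn):
    """Linear cost model (assumed): each error type has a fixed unit cost."""
    return fp * cost_fp + fn * cost_fn

def cheapest_threshold(candidates, cost_fp, cost_fn):
    """Return the threshold whose error counts give the lowest total cost."""
    return min(candidates, key=lambda t: total_cost(*candidates[t], cost_fp, cost_fn))

# Missed detections are 10x as costly as false alarms (e.g. screening).
screening_choice = cheapest_threshold(candidates, cost_fp=1.0, cost_fn=10.0)

# False alarms are 10x as costly as missed detections (e.g. costly manual review).
review_choice = cheapest_threshold(candidates, cost_fp=10.0, cost_fn=1.0)

print(f"costly misses -> threshold {screening_choice}")
print(f"costly false alarms -> threshold {review_choice}")
```

The same model ends up with a low threshold when misses are expensive and a high one when false alarms are expensive, which is the point of pairing the metrics with costs.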
Interview prep
- Q: Precision vs recall?
A: Precision: of predicted positives, how many are correct. Recall: of actual positives, how many are caught.
- Q: When is accuracy misleading?
A: Under class imbalance, where the majority class dominates the metric.