Machine learning is a toolbox inside data science—useful when patterns in historical data predict future outcomes. Not every problem needs a model.
When modeling helps
- Spam detection, churn prediction, recommendation ranking
- Forecasting demand with stable patterns
- Image/text classification with labeled examples
When simpler methods win
- Executive needs one accurate KPI from SQL
- Sample size too small—variance dominates
- Rules and domain expertise already explain outcomes
Explore and visualize the data first; reach for sklearn only once a simpler method has clearly fallen short.
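As a rough illustration of the KPI case above, a single aggregate query can answer the question with no model at all. This is only a sketch: the `orders` table, its columns, and the rows are hypothetical, and an in-memory SQLite database stands in for whatever warehouse would actually hold the data.

```python
import sqlite3

# In-memory database with a hypothetical orders table (illustrative rows only).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE orders (customer_id INTEGER, amount REAL, refunded INTEGER)"
)
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [(1, 20.0, 0), (2, 35.0, 1), (1, 15.0, 0), (3, 50.0, 0)],
)

# One KPI, one query: the share of orders that were refunded.
(refund_rate,) = conn.execute("SELECT AVG(refunded) FROM orders").fetchone()
conn.close()

print(f"Refund rate: {refund_rate:.0%}")
```

If the number the stakeholder needs is an aggregate like this, a model adds complexity without adding accuracy.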
Leakage warning
Using future information in training features inflates metrics. Much of the rigor in data science is preventing this kind of self-deception.
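One way to picture the warning is a feature built from a time series. The sketch below is illustrative only: the `demand` values and the helper names `leaky_feature`, `safe_feature`, and `time_split` are made up for this example. It contrasts a feature that sees the whole series with one that only sees the past, and splits train and test by time rather than at random.

```python
from statistics import mean

# Hypothetical daily demand, oldest first (illustrative values only).
demand = [10, 12, 11, 15, 14, 18, 17, 21]


def leaky_feature(series, i):
    """Mean of the whole series: includes days after i, so it leaks the future."""
    return mean(series)


def safe_feature(series, i):
    """Mean of days strictly before i: only what is known at prediction time."""
    past = series[:i]
    return mean(past) if past else None


def time_split(series, train_fraction=0.75):
    """Split chronologically so the test set lies entirely after the training set."""
    cut = int(len(series) * train_fraction)
    return series[:cut], series[cut:]


train, test = time_split(demand)
print("leaky:", leaky_feature(demand, 3), "safe:", safe_feature(demand, 3))
```

The leaky version would look better in offline evaluation because it has already seen the days it is asked to predict. The safe version reflects what would actually be available in production.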
Important interview questions and answers
- Q: Leakage?
A: Training uses information unavailable at prediction time: metrics look great, production fails.
- Q: Baseline model?
A: Simple heuristic (always predict majority class) sets minimum bar to beat.
Self-check
- Give one problem that may not need ML.
- What is label leakage?
Tip: Always define a simple baseline before complex models.
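A majority-class baseline like the one mentioned above takes only a few lines. This is a minimal sketch using the standard library; the churn labels are invented, and `majority_baseline` and `accuracy` are hypothetical helper names rather than part of any library.

```python
from collections import Counter


def majority_baseline(train_labels):
    """Return a predictor that always outputs the most common training label."""
    majority, _count = Counter(train_labels).most_common(1)[0]
    return lambda _features=None: majority


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


# Hypothetical churn labels (illustrative only).
train_labels = ["stay"] * 8 + ["churn"] * 2
test_labels = ["stay", "stay", "churn", "stay", "stay"]

predict = majority_baseline(train_labels)
baseline_acc = accuracy(test_labels, [predict() for _ in test_labels])
print(f"Baseline accuracy to beat: {baseline_acc:.2f}")
```

Any model that cannot clearly beat this number on the same test set is not yet earning its complexity. On imbalanced data the baseline's accuracy can look high, so it is worth comparing on a metric suited to the minority class as well.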