Teams choose between building custom models, buying vendor APIs, or self-hosting open-source weights. The decision hinges on differentiation, data moat, compliance, and total cost, not hype.
Build when
- Proprietary data is the competitive advantage
- Strict latency or cost targets, or air-gapped deployment
- Regulators require full lineage and control
Buy / API when
- Commodity capability (speech-to-text, generic chat)
- Speed to market beats marginal quality gains
- Vendor SLA and safety filters are acceptable
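The build and buy criteria above can be read as a checklist. The sketch below is one hypothetical way to encode it: the `Requirements` fields and the `recommend` function are illustrative names, and the equal weighting of every signal is an assumption, not a rule from these notes. A real decision would weight the signals and treat some (such as air-gapped deployment) as hard constraints.

```python
from dataclasses import dataclass


@dataclass
class Requirements:
    # Build signals (taken from the "Build when" list)
    proprietary_data_advantage: bool = False
    strict_latency_or_cost: bool = False
    air_gapped: bool = False
    regulator_requires_lineage: bool = False
    # Buy signals (taken from the "Buy / API when" list)
    commodity_capability: bool = False
    speed_to_market_priority: bool = False
    vendor_sla_and_filters_ok: bool = False


def recommend(req: Requirements) -> str:
    """Return "build", "buy", or "undecided" by counting signals.

    Assumption: each signal counts equally. This is a simplification
    for illustration only.
    """
    build_signals = sum([
        req.proprietary_data_advantage,
        req.strict_latency_or_cost,
        req.air_gapped,
        req.regulator_requires_lineage,
    ])
    buy_signals = sum([
        req.commodity_capability,
        req.speed_to_market_priority,
        req.vendor_sla_and_filters_ok,
    ])
    if build_signals > buy_signals:
        return "build"
    if buy_signals > build_signals:
        return "buy"
    return "undecided"


if __name__ == "__main__":
    # A commodity feature that must ship quickly leans toward buying.
    print(recommend(Requirements(commodity_capability=True,
                                 speed_to_market_priority=True)))
```

A tie returns "undecided" on purpose: mixed signals are where the open-source, self-hosted middle path is worth a closer look.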
Hidden costs
- Labeling, GPU, MLOps headcount for build
- Per-token fees, vendor lock-in, data residency for buy
- Security review and prompt injection testing for both
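To make the cost comparison concrete, here is a rough break-even sketch. All function names and every number in the usage example are hypothetical placeholders, not real vendor prices or salaries. It also assumes the build cost is a fixed monthly amount, which ignores that GPU needs usually grow with traffic.

```python
def monthly_api_cost(tokens_per_month: float,
                     price_per_million_tokens: float) -> float:
    """Per-token fees for the buy path."""
    return tokens_per_month / 1_000_000 * price_per_million_tokens


def monthly_build_cost(gpu_hours: float, gpu_hourly_rate: float,
                       staff_cost: float, labeling_cost: float) -> float:
    """GPU, MLOps headcount, and labeling for the build path."""
    return gpu_hours * gpu_hourly_rate + staff_cost + labeling_cost


def break_even_tokens(fixed_monthly_build_cost: float,
                      price_per_million_tokens: float) -> float:
    """Monthly token volume at which API fees equal the fixed build cost."""
    if price_per_million_tokens <= 0:
        raise ValueError("price_per_million_tokens must be positive")
    return fixed_monthly_build_cost / price_per_million_tokens * 1_000_000


if __name__ == "__main__":
    # Hypothetical inputs, chosen only to show the arithmetic.
    build = monthly_build_cost(gpu_hours=720, gpu_hourly_rate=2.0,
                               staff_cost=20_000.0, labeling_cost=1_000.0)
    api = monthly_api_cost(tokens_per_month=500_000_000,
                           price_per_million_tokens=10.0)
    print(f"build: {build:.2f}  api: {api:.2f}")
    print(f"break-even tokens/month: {break_even_tokens(build, 10.0):.0f}")
```

Below the break-even volume the API is cheaper on these line items alone. Costs that apply to both paths, such as security review and prompt injection testing, cancel out of the comparison and are left out here.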
Important interview questions and answers
- Q: Is there an open-source middle path?
A: Yes. Host Llama-class models yourself: you control the data path but still operate the infrastructure.
- Q: What about vendor API data retention?
A: Read the terms. Some providers train on customer content unless you opt out.
Self-check
- Name two reasons to build rather than buy.
- What hidden cost applies to both paths?
Tip: Read vendor data retention terms before sending customer content to APIs.
Interview prep
- Q: Build when?
A: Proprietary data moat, strict control, or air-gapped deployment needs.
- Q: Buy when?
A: Commodity capability and speed to market outweigh marginal quality gains.