Production RAG fails in predictable ways—catalog them in runbooks instead of calling the model "random."
Failure catalog
- Missed retrieval — wrong embedding model or chunk size
- Poisoned corpus — outdated policy still indexed
- Contradictory chunks — model blends incompatible versions
- Overlong context — right chunk drowned by noise
- Injection via docs — malicious instructions in ingested HTML
Monitoring
Log retrieval IDs, scores, latency, and thumbs-down feedback. Alert when citation rate drops week over week.
Fallbacks
Escalate to human, show search results only, or narrow to a single verified FAQ when confidence is low.
Important interview questions and answers
- Q: Contradictory chunks fix?
A: Version corpus, dedupe, or retrieve from single source-of-truth index per product.
Self-check
- List three RAG failure modes.
- One monitoring signal?
Tip: Alert when citation rate drops—often stale index, not "model got worse."
Interview prep
- Stale corpus?
Old policy chunks cause confident wrong answers—version and re-embed.
- Monitor citation rate?
Drop often signals retrieval/index issues, not random model noise.