Interviewers test whether you can design grounded assistants, not whether you can recite parameter counts.
Common topics
- RAG architecture and failure modes
- Prompt injection mitigations
- Token/cost/latency trade-offs
- Eval metrics for retrieval and factuality
- When not to use an LLM
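For the retrieval side of the eval-metrics topic, it can help to have one or two metrics you can write out from memory. The sketch below uses recall@k and reciprocal rank, two commonly used retrieval metrics; the notes above do not name specific metrics, so treat this choice as an assumption. The chunk IDs and function names are hypothetical.

```python
from typing import Sequence, Set


def recall_at_k(retrieved: Sequence[str], relevant: Set[str], k: int) -> float:
    """Fraction of the relevant chunk IDs that appear in the top-k results."""
    if not relevant:
        return 0.0
    top_k = set(retrieved[:k])
    return len(top_k & relevant) / len(relevant)


def reciprocal_rank(retrieved: Sequence[str], relevant: Set[str]) -> float:
    """1 / rank of the first relevant result, or 0.0 if none was retrieved."""
    for rank, chunk_id in enumerate(retrieved, start=1):
        if chunk_id in relevant:
            return 1.0 / rank
    return 0.0


# Hypothetical labeled example: ranked retriever output vs. human-marked relevant chunks.
retrieved = ["c7", "c2", "c9", "c4"]
relevant = {"c2", "c4"}

print(recall_at_k(retrieved, relevant, k=3))  # 0.5 (only c2 is in the top 3)
print(reciprocal_rank(retrieved, relevant))   # 0.5 (first relevant hit is at rank 2)
```

Averaging reciprocal rank over a set of labeled questions gives mean reciprocal rank (MRR). Factuality usually needs a separate check, such as human review of answers against the cited chunks.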
60-second story template
Problem → baseline → RAG/prompt design → eval metric → incident you prevented with guardrails.
Whiteboard prompt
"Design a support bot for 10k PDF policies"—expect chunking, index, moderation, human escalation, and cost model.
Important interview questions and answers
- Q: How do you explain hallucinations?
A: They are fluent but ungrounded outputs. Mitigate them by improving retrieval, requiring citations, and adding refusal paths.
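One way to make "citations and refusal paths" concrete is to gate the prompt on retrieval quality. This is a minimal sketch, not a definitive implementation: the `Hit` type, the `min_score` threshold of 0.6, and the assumption that scores are similarities in [0, 1] are all hypothetical and would need tuning against a real retriever.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class Hit:
    chunk_id: str
    text: str
    score: float  # assumed: similarity in [0, 1], higher is better


# Hypothetical refusal message used when nothing is grounded well enough.
REFUSAL = "I can't find this in the policy documents, so I'm routing you to a human agent."


def build_grounded_prompt(question: str, hits: List[Hit],
                          min_score: float = 0.6, max_hits: int = 3) -> Optional[str]:
    """Return a prompt with citable context, or None to signal the refusal path."""
    ranked = sorted(hits, key=lambda h: h.score, reverse=True)
    usable = [h for h in ranked if h.score >= min_score][:max_hits]
    if not usable:
        return None  # nothing trustworthy retrieved: refuse instead of guessing
    context = "\n".join(f"[{h.chunk_id}] {h.text}" for h in usable)
    return (
        "Answer using only the context below and cite chunk IDs in brackets. "
        "If the context does not contain the answer, say you do not know.\n\n"
        f"Context:\n{context}\n\nQuestion: {question}"
    )


hits = [Hit("p12-c3", "Refunds are issued within 14 days.", 0.82),
        Hit("p40-c1", "Office hours are 9 to 5.", 0.31)]
prompt = build_grounded_prompt("How long do refunds take?", hits)
print(prompt if prompt is not None else REFUSAL)
```

The point to make in an interview is that the refusal decision happens before the model is called, so a weak retrieval result never reaches generation.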
Self-check
- List four interview topics.
- What belongs in a 60-second story?
Tip: Practice drawing one RAG diagram on paper; whiteboard design rounds are common in Gen AI interview loops.
Interview prep
- Design question?
Expect to cover a RAG diagram, eval metrics, injection defenses, and a cost model.
- 60-second story?
Problem, baseline, design, metric, and the incident your guardrails prevented.
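For the design question, the chunking step and the cost model are the two parts that are easiest to sketch in code. The example below is a rough illustration under stated assumptions: it chunks by whitespace-separated words as a stand-in for real tokenization, and every number in the usage section (chunk size, overlap, query volume, token counts, prices) is a hypothetical placeholder, not a real vendor price.

```python
from typing import List


def chunk_words(text: str, size: int = 200, overlap: int = 40) -> List[str]:
    """Split text into overlapping word windows (words approximate tokens here)."""
    if overlap >= size:
        raise ValueError("overlap must be smaller than size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def monthly_cost(queries: int, prompt_tokens: int, completion_tokens: int,
                 price_in_per_1k: float, price_out_per_1k: float) -> float:
    """Estimated monthly generation cost from per-1k-token prices."""
    per_query = (prompt_tokens / 1000) * price_in_per_1k \
        + (completion_tokens / 1000) * price_out_per_1k
    return queries * per_query


# Tiny chunking demo: 10 words, windows of 4 with 1 word of overlap.
demo = " ".join(f"w{i}" for i in range(10))
print(chunk_words(demo, size=4, overlap=1))

# Placeholder inputs only; substitute measured traffic and actual pricing.
print(monthly_cost(queries=1000, prompt_tokens=2000, completion_tokens=500,
                   price_in_per_1k=0.5, price_out_per_1k=1.5))
```

In a whiteboard answer, the useful observation is that prompt tokens grow with the number and size of retrieved chunks, so chunk size and top-k are cost levers as well as quality levers.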