High-stakes flows (medical, legal, money movement, public communications) need human approval before action—not just post-hoc analytics.
Patterns
- Draft → reviewer edit → send
- Confidence gating: route low-confidence outputs to a human review queue
- Active learning: humans label failures, and those labels are used to improve retrieval
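The confidence-gating pattern above can be sketched as a small router. This is a minimal illustration, not a reference implementation: the `Draft` and `Router` names, the `high_stakes` flag, and the 0.85 threshold are all assumptions made for the example, and the confidence score is assumed to be calibrated.

```python
from dataclasses import dataclass, field
from typing import List

# Hypothetical threshold; in practice it would be tuned on labeled review data.
REVIEW_THRESHOLD = 0.85


@dataclass
class Draft:
    draft_id: str
    text: str
    confidence: float  # assumed to be a calibrated score in [0, 1]
    high_stakes: bool = False  # e.g. medical, legal, money movement


@dataclass
class Router:
    threshold: float = REVIEW_THRESHOLD
    review_queue: List[Draft] = field(default_factory=list)
    auto_approved: List[Draft] = field(default_factory=list)

    def route(self, draft: Draft) -> str:
        """Queue high-stakes or low-confidence drafts for a human; pass the rest."""
        if draft.high_stakes or draft.confidence < self.threshold:
            self.review_queue.append(draft)
            return "review"
        self.auto_approved.append(draft)
        return "auto"


router = Router()
decisions = [
    router.route(Draft("d1", "Refund issued for order 123.", 0.97)),
    router.route(Draft("d2", "Your claim is likely covered.", 0.62)),
    router.route(Draft("d3", "Wire transfer scheduled.", 0.99, high_stakes=True)),
]
print(decisions)  # ['auto', 'review', 'review']
```

Note that the high-stakes draft goes to review even at 0.99 confidence: in this sketch the stakes of the action, not only the score, decide whether a human must approve.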
UX
Show sources, highlight uncertainty, make editing faster than rewriting from scratch—otherwise reviewers rubber-stamp.
Metrics
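Track override rate (how often reviewers change or reject a draft), time-to-approve, and error catch rate (the share of faulty drafts that reviewers actually correct), not only offline model metrics such as BLEU.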
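As a rough sketch, the three review metrics could be computed from a log of review decisions. The `Review` record and its fields are hypothetical, and the definitions used here are one reasonable reading rather than a standard: override rate as the fraction of drafts the reviewer changed or rejected, time-to-approve as the median decision time, and error catch rate as the fraction of known-faulty drafts that the reviewer changed. The `draft_had_error` field assumes ground truth from a later audit.

```python
from dataclasses import dataclass
from statistics import median
from typing import List


@dataclass
class Review:
    edited: bool              # reviewer changed or rejected the draft
    seconds_to_decide: float  # time from draft shown to decision
    draft_had_error: bool     # assumed ground truth from a later audit


def override_rate(reviews: List[Review]) -> float:
    """Fraction of reviewed drafts that the reviewer changed or rejected."""
    if not reviews:
        raise ValueError("no reviews to score")
    return sum(r.edited for r in reviews) / len(reviews)


def median_time_to_approve(reviews: List[Review]) -> float:
    """Median seconds a reviewer spent before deciding."""
    if not reviews:
        raise ValueError("no reviews to score")
    return median(r.seconds_to_decide for r in reviews)


def error_catch_rate(reviews: List[Review]) -> float:
    """Of the drafts known to be faulty, the fraction the reviewer changed."""
    faulty = [r for r in reviews if r.draft_had_error]
    if not faulty:
        raise ValueError("no audited errors to score against")
    return sum(r.edited for r in faulty) / len(faulty)


reviews = [
    Review(edited=True, seconds_to_decide=40, draft_had_error=True),
    Review(edited=False, seconds_to_decide=5, draft_had_error=True),   # missed
    Review(edited=False, seconds_to_decide=12, draft_had_error=False),
    Review(edited=True, seconds_to_decide=30, draft_had_error=False),
]
print(override_rate(reviews))           # 0.5
print(median_time_to_approve(reviews))  # 21.0
print(error_catch_rate(reviews))        # 0.5
```

A very short decision time paired with a low catch rate, as in the second record, is the kind of signal that suggests rubber-stamping.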
Important interview questions and answers
- Q: What is automation bias?
A: Reviewers tend to accept drafts that are fluent but wrong. Counter it with mandatory source checks before approval.
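One way to make source checks mandatory is to block approval until the reviewer has marked every cited source as checked. The sketch below assumes a hypothetical `ReviewItem` type and `SourcesNotCheckedError` exception. It only illustrates the gating idea and says nothing about how a real review tool would track what the reviewer actually opened.

```python
from dataclasses import dataclass, field
from typing import List, Set


class SourcesNotCheckedError(Exception):
    """Raised when approval is attempted with unchecked sources."""


@dataclass
class ReviewItem:
    draft: str
    sources: List[str]
    checked: Set[str] = field(default_factory=set)

    def mark_checked(self, source: str) -> None:
        if source not in self.sources:
            raise KeyError(f"unknown source: {source}")
        self.checked.add(source)

    def unchecked(self) -> List[str]:
        return [s for s in self.sources if s not in self.checked]

    def approve(self) -> str:
        """Approve only once every cited source has been marked as checked."""
        missing = self.unchecked()
        if missing:
            raise SourcesNotCheckedError(f"unchecked sources: {missing}")
        return "approved"


item = ReviewItem("Dose is 5 mg twice daily.", ["label.pdf", "guideline.html"])
item.mark_checked("label.pdf")

blocked = None
try:
    item.approve()  # one source is still unchecked, so this raises
except SourcesNotCheckedError as exc:
    blocked = str(exc)

item.mark_checked("guideline.html")
status = item.approve()
print(blocked)
print(status)
```

A gate like this adds friction, so it is probably best reserved for the high-stakes flows and paired with a UI that keeps the sources one click away.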
Self-check
- Name two human-in-the-loop patterns.
- What metric shows reviewers catch errors?
Pitfall: If reviewing and editing in the UI is slower than blindly accepting, reviewers will blindly accept. Design for fast edits with sources visible.
Interview prep
- Q: Automation bias?
A: Reviewers accept fluent wrong drafts; show sources and uncertainty.
- Q: Override rate?
A: It signals how often humans correct the model, and those corrections feed improvement loops.