Q.You've trained a model that's great on eval but your PM says users hate it. Walk me through your next week.

Two candidates in the last month reported some version of this. It's designed to separate the researcher who optimises a metric from the engineer who ships a product.

What Coach listens for

How you define 'hate'
Do you ask for data before redesigning the model? Coach listens for whether you jump to solutions or start with diagnosis.
Eval–reality gap
Name two specific reasons eval can look great while users churn. Specifics beat generalities here.
Weekly plan
Day-by-day beats week-by-week. They want to hear prioritisation under uncertainty.
What you'd ship on Friday
Every answer should end with a concrete ship artifact — a dataset, a prompt, a guardrail, a rollback plan.

One click away from drilling Anthropic.

It's four minutes. No credit card. Free forever while you're looking.

Want the full library? Browse all drills