Independent and unofficial. Synthesized from publicly-reported, first-hand candidate accounts (2024–2026). Not affiliated with, authorized by, or endorsed by Anthropic, OpenAI, or any company named. Treat stage structure as well-corroborated and all numbers as directional self-report.
I've been collecting publicly-reported, first-hand accounts of the software-engineering interview loops at Anthropic and OpenAI. The patterns are consistent enough to be worth writing down.
The loop, stage by stage
The two loops rhyme but emphasize different things: Anthropic is values-aware from the recruiter screen, while OpenAI front-loads team fit. The single most consistent finding across accounts: a values / culture round appears in essentially every Anthropic onsite, and candidates report that it fails more technically strong candidates than any coding round.
| Stage | Anthropic | OpenAI |
|---|---|---|
| Recruiter screen | Mission/values-aware from minute one | Background + which team is hiring |
| First technical filter | CodeSignal online assessment (OA), ~4 progressive levels (often waived for referrals/seniors) | CoderPad/HackerRank screen, or a 4–8 hr take-home |
| Onsite | ~4–6 rounds: coding, system/AI-infra design, values (universal), deep-dive | ~3–5 rounds: coding, system design, refactoring (senior), deep-dive, behavioral |
| Design tool | Shared Google Doc | Excalidraw |
| After | References + team matching (opaque) | Hiring committee + org match |
| Negotiation | Negotiation is expected | Offers tend to hold firmer |
Five things that surprise strong candidates
- AI tools are banned in live rounds. Anthropic enforces it hardest and reportedly detects test-gaming. Prep with AI; never solve with it live.
- Coding is build-from-scratch, not LeetCode. You implement a small system and extend it under observation — algorithm trivia won't save you.
- System design is AI-infra-flavored and math-first. Do the capacity math early and let it drive the design — then keep it simple. Over-engineering is the most-reported design failure.
- The values / culture round is the #1 filter at Anthropic. It rewards genuine, specific, skeptical thinking; scripted "STAR" answers and flattery backfire.
- The question bank is small and well-known — so interviewers perturb problems to test whether you can operate, not just recall.
Coding: the known families
Be fluent building these from scratch, ideally in Python (accounts describe Python fluency as a real edge): an in-memory multi-level key-value store, a web crawler, an LRU cache, a stack-trace / sampling-profiler problem, a tokenizer, a distributed mode/median exercise. Knowing them is table stakes; surviving the perturbation is the test.
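To make "build it, then survive the perturbation" concrete, here is a minimal sketch of one family, the LRU cache, with one possible twist added. The class name `LRUCache`, the `ttl_seconds` option, and the injectable `clock` are all hypothetical choices for illustration; the TTL extension is an assumed example of a perturbation, not a question taken from any account. It leans on `collections.OrderedDict`, so an interviewer could reasonably ask for the hand-rolled doubly-linked-list version instead.

```python
from collections import OrderedDict
import time


class LRUCache:
    """Least-recently-used cache with an optional per-entry TTL.

    The TTL is the hypothetical "perturbation": the base problem is plain
    LRU eviction, and expiry is the kind of extension asked for mid-round.
    """

    def __init__(self, capacity, ttl_seconds=None, clock=time.monotonic):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self.capacity = capacity
        self.ttl_seconds = ttl_seconds
        self._clock = clock         # injectable so tests can control time
        self._data = OrderedDict()  # key -> (value, stored_at); order = recency

    def get(self, key, default=None):
        entry = self._data.get(key)
        if entry is None:
            return default
        value, stored_at = entry
        if self._expired(stored_at):
            del self._data[key]     # lazily drop stale entries on read
            return default
        self._data.move_to_end(key)  # mark as most recently used
        return value

    def put(self, key, value):
        if key in self._data:
            self._data.move_to_end(key)
        self._data[key] = (value, self._clock())
        if len(self._data) > self.capacity:
            self._data.popitem(last=False)  # evict the least recently used

    def _expired(self, stored_at):
        return (
            self.ttl_seconds is not None
            and self._clock() - stored_at >= self.ttl_seconds
        )

    def __len__(self):
        return len(self._data)


# Usage: reading "a" refreshes it, so inserting "c" evicts "b".
cache = LRUCache(capacity=2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")
cache.put("c", 3)
```

Passing the clock in as a parameter is a deliberate design choice: it lets expiry be tested without sleeping, which is the sort of decision worth narrating out loud while extending the code under observation.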
System design: the one rule
Almost verbatim across sources: do the math first; design the simplest system that meets the stated numbers; bake safety/limits into the request flow; lead the discussion yourself. Anthropic prompt themes are infra-shaped (serving LLMs, token services, retrieval, agents); OpenAI leans more product-shaped.
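As an illustration of "do the math first," the sketch below sizes a hypothetical LLM-serving tier from stated load numbers before any boxes get drawn. Every figure (peak request rate, output tokens per request, per-replica throughput, target utilization) is an invented assumption for the example, not a number from any account or any real system, and `replicas_needed` is a hypothetical helper name.

```python
import math


def replicas_needed(peak_rps, avg_output_tokens,
                    tokens_per_sec_per_replica, target_utilization=0.6):
    """Back-of-envelope replica count for a token-generation service.

    demand  = tokens per second the service must produce at peak
    usable  = tokens per second one replica delivers at the target utilization
    """
    demand = peak_rps * avg_output_tokens
    usable = tokens_per_sec_per_replica * target_utilization
    return math.ceil(demand / usable)


# Hypothetical inputs: 200 req/s at peak, 500 output tokens per request,
# 2,500 tokens/s per replica, planned to run at 60% utilization.
# demand = 200 * 500 = 100,000 tok/s; usable = 2,500 * 0.6 = 1,500 tok/s
print(replicas_needed(200, 500, 2500))  # 67
```

A number like this is what lets the rest of the design stay simple: with a replica count on the table, the discussion can move to load balancing and per-request limits rather than speculative components.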
The values round, and how to prep
It's reflective and probing — "a time your values were tested," "a belief you changed," "a genuine critique of the company." Follow-ups probe your reasoning and honesty, not tidy outcomes. Candidates who pass build a few true stories only they could tell, form a real point of view on AI safety, and read the primary sources (Core Views on AI Safety, the Responsible Scaling Policy, Dario Amodei's essays) to engage critically — not memorize.
I compiled the full ~105-page version — the master question bank, the values-round playbook, reconciled comp data, and a prep plan — grounded in the same 60+ accounts. The condensed field analysis is free here, and there's a free cheat-sheet on GitHub.













