When Anthropic announced Claude Mythos Preview earlier this month, the company delivered an unusual message: this model is too dangerous for you to use. Not too expensive. Not too niche. Too dangerous. That framing should have been the first warning sign.
"Too dangerous for the public" turned out to mean fine for approximately 40 handpicked organisations — Amazon, Apple, Microsoft, Google, Cisco — and, as we learned this week, the NSA. An unauthorised group breached Mythos through a third-party vendor environment within days of launch, having reverse-engineered its URL location from Anthropic's naming conventions and walked in through a contractor's door. The model capable of autonomously chaining zero-day exploits across every major operating system had, predictably, escaped its curated guest list. Anthropic's responsible deployment framework didn't fail. It was never a containment strategy to begin with.
The real issue: Restricting a weapons-grade AI model to 40 organisations is not a safety policy — it's a liability document with a visitor list.
Let's be precise about what Mythos actually does. According to Anthropic's own red-team research, the model can discover vulnerabilities in Linux kernels and chain them autonomously into full system compromises. It can break out of virtual sandboxes when instructed to. It can sequence multi-step exploits that, previously, only elite human operators could construct — and it does this without hand-holding. Anthropic acknowledged the risk plainly: "its offensive cyber capabilities were too dangerous to allow for a wider release." Then it deployed the model to more than 40 organisations, disclosed only 12 of them publicly, and left a contractor door unlocked.
The NSA revelation from Axios made the situation sharper. The same Department of Defense that blacklisted Anthropic as a "supply chain risk" — the same DoD that is currently arguing in court that Anthropic's tools threaten US national security — has been quietly running Mythos through its intelligence arm. The Pentagon is simultaneously prosecuting Anthropic as a liability and consuming its most dangerous product. This is not bureaucratic incoherence. It is the entirely predictable endpoint of every "controlled release" story: the actors most motivated to use weapons-grade capabilities will find their way in, through legal channels or otherwise.
"Foreign actors — primarily based in China — are systematically extracting value from leading American AI systems through deliberate, industrial-scale campaigns." — Michael Kratsios, White House OSTP, April 23, 2026
That memo, released today, describes tens of thousands of proxy accounts conducting distillation campaigns against US frontier AI. Mythos — a model that can autonomously identify and exploit vulnerabilities across critical infrastructure — is precisely the capability those campaigns are designed to reach. The White House warning and the Mythos breach are not separate stories. They are the same story told from opposite ends.
The counterargument is familiar: some restriction is better than none. Forty vetted organisations under contractual scrutiny is safer than open weights on Hugging Face. That argument holds for coding assistants and image generators. It does not hold for autonomous offensive cyber tools, where risk does not scale linearly with each additional actor granted access; it compounds, as the back-of-envelope sketch below illustrates. One breach, one rogue contractor, one successful distillation campaign by a foreign state actor, and the entire containment calculus unravels. The question is not whether Anthropic's list of 40 was carefully chosen. It is whether any list survives contact with an adversary who has every incentive to get onto it — or around it.
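A minimal sketch of that compounding, in Python. The per-organisation breach probability here is a hypothetical number chosen purely for illustration, not a figure from Anthropic or anyone else, and the function name is mine; the only point is how quickly small independent risks add up across an access list.

```python
# Back-of-envelope illustration: probability that at least one of n
# access-holding organisations is compromised, assuming each carries an
# independent probability p of breach over some period. Both p and the
# example values below are hypothetical.

def p_any_breach(n: int, p: float) -> float:
    """Probability of at least one compromise among n organisations."""
    return 1 - (1 - p) ** n

for n in (1, 10, 40):
    print(f"{n:>2} orgs -> {p_any_breach(n, p=0.05):.0%} chance of at least one breach")
# prints roughly: 1 -> 5%, 10 -> 40%, 40 -> 87%
```

Even granting a generously low per-organisation figure, a 40-entry access list stops behaving like a locked door and starts behaving like a bet weighted toward exactly the outcome we saw this week.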
Dario Amodei met with White House officials this week to discuss government use of Mythos. I hope that conversation was about what containment actually requires at this capability level — not about managing the optics of a breach that was, on reflection, entirely foreseeable. The hard question is not who gets access to Mythos. It is whether we have built something that can only proliferate, not be contained. If the answer to "what do we do when a model is too dangerous?" keeps being "build it anyway and write a careful terms-of-service agreement," we are not doing safety. We are doing curation. And as this week proved, curation has a visitor policy problem.