Claude Mythos: The AI Too Dangerous to Release

Anthropic built a model that dominates 17 of 18 benchmarks and achieves 100% on cybersecurity tasks — then decided the world isn't ready for it. Here's what that tells us about where AI is headed.

Last month, a data leak forced Anthropic's hand. The existence of Claude Mythos — a model so capable it leads 17 of the 18 benchmarks Anthropic measured — became public before the company was ready to announce it. Anthropic's response was unusual for a lab that has shipped multiple frontier models: it said it won't release this one publicly. Not because the model doesn't work. Because it works too well.

The numbers are striking. Mythos scores 93.9% on SWE-bench, the software-engineering coding benchmark, up 13 points over Claude Opus 4.6. It hits 97.6% on USAMO, the prestigious US math olympiad. And on Cybench — a suite measuring an AI's ability to find and exploit real security vulnerabilities — it scored 100%, the first model ever to do so. That last number is what gave Anthropic pause.

What Anthropic found during red-teaming is that Mythos can autonomously discover zero-day vulnerabilities in major operating systems and browsers — including flaws that survived decades of human security review. This isn't a theoretical concern. In controlled tests, Mythos identified thousands of previously unknown vulnerabilities in widely deployed software. A model that can do this at scale, available to anyone, changes the threat landscape in a way that no patch cycle can keep up with.

So instead of a public release, Anthropic stood up Project Glasswing — a restricted-access program with Amazon, Apple, Microsoft, Cisco, CrowdStrike, the Linux Foundation, and a handful of other infrastructure stewards. The idea is to use Mythos offensively (to find the bugs before bad actors do) rather than let the capability diffuse into the open. It's a calculated bet: the defenders get the tool first.

What this episode reveals is that we've crossed a threshold. For years, the debate around AI safety was mostly theoretical: what happens when models get smart enough to cause real harm? Mythos is the first concrete answer. It's not a hypothetical superintelligence. It's a production-ready model that Anthropic built, evaluated, and then deliberately kept off the shelf. The question for every lab now is: what's your Mythos policy? If Anthropic is drawing this line, others will face the same choice soon. The models are only getting more capable — and the decision about who gets access, and under what conditions, is going to define this decade.
