Cloudflare published results from testing Anthropic's Mythos Preview - the security-focused AI model that found thousands of vulnerabilities across major operating systems and browsers, then was deemed too dangerous to release publicly. The results show genuine capability in vulnerability research, but also inconsistent guardrails that reveal why keeping this technology locked down makes sense.
For context: Anthropic announced Project Glasswing last month, revealing they'd built an AI that autonomously discovered high-severity vulnerabilities across every major OS and web browser. Instead of open-sourcing it or selling it commercially, they gave access to roughly 40 organizations to use it defensively. Cloudflare was one of them.
The genuinely impressive part: the model can take several exploit primitives and reason about how to chain them into working proofs. This isn't just pattern matching against known vulnerability signatures. The reasoning looks like the work of a senior security researcher, not an automated scanner. Cloudflare tested it against more than 50 of their own repositories and found it could identify subtle logic flaws that traditional tools miss.
The catch: its built-in guardrails aren't consistent. Cloudflare found that the same task framed differently could produce completely different outcomes. Sometimes the model would refuse to provide exploitation details. Other times it would helpfully walk through the entire attack chain. That inconsistency is exactly why any future public release needs hardened safeguards layered on top of the base model.
Cloudflare's broader point is more sobering: the same capabilities that helped them find bugs in their own code will, in the wrong hands, accelerate attacks against every application on the internet. This is dual-use technology in the truest sense. Give it to defenders and it helps secure systems. Give it to attackers and it weaponizes vulnerability research at scale.
This is what happens when you give advanced AI to actual security engineers instead of just hyping it in press releases. Cloudflare didn't claim Mythos Preview will replace their security team. They found it useful for augmenting research workflows, identifying edge cases, and reasoning through complex exploit chains. But they also warn the technology isn't ready for general availability because the defensive and offensive capabilities are inseparable.
Mythos Preview appears to be different - a genuinely capable tool with real limitations and serious dual-use concerns. Anthropic's decision to restrict access wasn't about generating hype through artificial scarcity. It was about acknowledging that some capabilities shouldn't be democratized until better safeguards exist.



