Nicholas Carlini is not someone prone to hype. With over 67,000 citations on Google Scholar and a career spent breaking security systems, he's seen enough AI snake oil to fill a conference hall. So when he says Claude is now a better security researcher than he is, I pay attention.
According to Carlini, Claude found a buffer overflow vulnerability in Linux that's been sitting in the codebase since 2003. Twenty-one years of human code review, security audits, and professional researchers missed it. An AI found it.
Buffer overflows are the classic security vulnerability, the kind that's been in textbooks for decades. They're also notoriously difficult to exploit correctly: Carlini says that despite a career spent finding security bugs, he has never successfully pulled off a buffer overflow exploit himself. The AI did.
This particular vulnerability allows attackers to steal admin keys from Linux systems. That's not a theoretical concern or a minor bug. That's a critical security issue affecting one of the most widely deployed operating systems on the planet.
But it gets more interesting. Carlini also reports that Claude made $3.7 million exploiting smart contracts through bug bounties. That's not "the AI found something that might be a problem." That's "the AI found exploitable vulnerabilities, wrote working exploits, and collected payment for them."
I want to be skeptical here. AI security research has been promised before. We've seen demos that fall apart under scrutiny. We've seen benchmarks that don't translate to real-world capability. But Carlini's credibility is hard to dismiss. This isn't a startup CEO hyping their product. This is a researcher with decades of experience saying the tools have fundamentally changed.
So what changed? Why is AI suddenly effective at security research after years of underwhelming results?
My best guess: context windows got big enough to hold entire codebases. Finding a buffer overflow in a 50-line function is hard. Finding one in a million lines of interconnected code where the vulnerability emerges from interactions between distant parts of the system - that's where humans struggle and AI might have an advantage. We can't hold that much code in our heads at once. AI can.
