In a stunning reversal that should alarm anyone who believed AI companies would hold the line on safety, Anthropic has abandoned the core commitment of its Responsible Scaling Policy: the promise never to train AI systems unless adequate safety measures were guaranteed to be in place first.
The San Francisco-based company, co-founded by former OpenAI researchers who left over safety concerns, built its entire brand on being the responsible AI company. That brand just evaporated.
According to TIME's exclusive reporting, Anthropic's leadership unanimously approved dropping the company's flagship safety framework in February 2026. The company cited the lack of international governance, the Trump administration's deregulatory stance, and intensified global AI competition as reasons to abandon its red lines.
The Old Promise vs. The New Reality
When Anthropic introduced the RSP in 2023, it was evidence the company was serious about not racing to the bottom. The policy set clear thresholds: if AI capabilities crossed certain danger levels, training would pause until safety measures caught up. It was a binary commitment: either the safety measures exist, or training doesn't proceed.
That's gone now. The new framework is all carrots, no sticks. Anthropic promises "increased transparency" through quarterly risk reports and frontier safety roadmaps. They'll "match or surpass competitors' safety efforts." They'll consider delaying development only if they're leading the AI race and leadership believes catastrophic risks are significant.
Read that again: they'll only pause if they're winning and they think it's really, really dangerous. That's not a safety commitment; that's a competitive strategy with a PR wrapper.
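If you squint, the difference is easier to see as pseudocode. What follows is my own paraphrase of the two rules as described above; the function names, inputs, and logic are illustrative inventions, not anything taken from Anthropic's documents:

```python
# Loose sketch of the two decision rules -- my paraphrase, not
# Anthropic's actual wording, thresholds, or criteria.

def old_rsp_allows_training(past_danger_threshold: bool,
                            safeguards_ready: bool) -> bool:
    """2023 RSP, roughly: past a capability danger threshold,
    training proceeds only if the safeguards already exist."""
    if past_danger_threshold:
        return safeguards_ready  # binary gate, no discretion
    return True


def new_framework_allows_training(leading_the_race: bool,
                                  leadership_sees_catastrophic_risk: bool) -> bool:
    """2026 framework, roughly: a delay is only *considered* when
    both conditions hold -- and even then it's discretionary."""
    if leading_the_race and leadership_sees_catastrophic_risk:
        return False  # the only case where pausing is even on the table
    return True
```

Notice what changed: the old rule's output turns on whether the safeguards exist; the new rule's output turns on who's winning and what leadership believes.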
The Uncomfortable Questions
I've talked to engineers who worked on AI safety systems at major labs. The consistent refrain: safety research doesn't scale the way capability research does. You can't just throw more compute at alignment problems and expect breakthroughs to keep pace with model improvements.

