Anthropic is publicly objecting to other AI companies using Claude's outputs to train their models—despite Claude itself being trained on vast amounts of copyrighted material without permission. The hypocrisy isn't lost on observers: AI companies want their outputs protected while claiming fair use for their inputs.
The AI industry wants to have it both ways—scraping the entire internet to train models while demanding protection for their own outputs. Anthropic calling out competitors for the same behavior is peak tech doublethink.
What Anthropic Is Claiming
According to Yahoo reporting, Anthropic has been vocal about competitors using Claude's responses to improve their own models—a practice called "distillation" or "model-to-model training."
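To make the term concrete, here is a minimal sketch of what distillation looks like in practice: query the "teacher" model for answers, then save the prompt/response pairs as training data for a "student" model. The prompt list, model name, and JSONL output format below are illustrative assumptions rather than any company's actual pipeline; the sketch assumes the official anthropic Python SDK and an API key in the environment.

```python
# Minimal distillation sketch (illustrative, not any lab's real pipeline).
# Assumes: `pip install anthropic` and ANTHROPIC_API_KEY set in the environment.
import json
import anthropic

client = anthropic.Anthropic()  # reads ANTHROPIC_API_KEY from the environment

# Hypothetical prompts a competitor might want "teacher" answers for.
prompts = [
    "Explain the difference between a process and a thread.",
    "Summarize the fair use doctrine in two sentences.",
]

with open("distillation_data.jsonl", "w") as f:
    for prompt in prompts:
        # Ask the teacher model (Claude) for a response.
        message = client.messages.create(
            model="claude-3-5-sonnet-latest",  # model name is an example
            max_tokens=512,
            messages=[{"role": "user", "content": prompt}],
        )
        answer = message.content[0].text
        # Store the pair in a generic JSONL fine-tuning format for the student.
        f.write(json.dumps({"prompt": prompt, "completion": answer}) + "\n")
```

Scaled up to millions of prompts, a dataset like this can be used to fine-tune a competing model on Claude's behavior, which is exactly the practice Anthropic objects to.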
The argument goes like this: Claude's outputs are proprietary intellectual property. Using them to train competing models is theft. Anthropic invested billions in research, compute, and safety work. Competitors shouldn't get to free-ride on that investment.
It's a reasonable position—until you remember how Anthropic trained Claude in the first place.
The Inconvenient Truth
Claude, like every other large language model, was trained on:
• Books scraped from piracy sites and digital libraries
• News articles taken without compensation or permission
• Code repositories with various open source licenses
• Social media posts from millions of users who never consented
• Academic papers behind paywalls
• Copyrighted content across the entire internet
Anthropic's position has consistently been that this is fair use—that training AI models on copyrighted material is transformative and doesn't constitute infringement.
