A transition is underway on the internet whose full economic consequences almost nobody is talking about: AI-generated bot traffic has overtaken human web traffic. The web was built on a model that assumed humans were on the other end, eyeballs reading ads that paid for the content. That model is now broken, and nobody has a workable replacement.
According to The Register, AI crawlers and bots are scraping content across the web at scales that are straining server infrastructure and fundamentally undermining the economics of online publishing. This is not a future risk; it is happening right now, and the companies building the crawlers have not developed a compensation model that works for publishers.
The mechanics are straightforward. Companies building AI systems — whether training new models or running retrieval-augmented generation for AI assistants — need access to current web content. They deploy crawlers that hit pages constantly. A single AI company's crawlers can generate traffic equivalent to thousands of human readers. Unlike human readers, they do not see ads. They do not subscribe. They do not generate any revenue for the sites they scrape.
For a small publisher running on tight margins, the situation is perverse: your most popular and most frequently updated content gets hit hardest by crawlers, driving up your hosting and bandwidth costs while generating zero revenue. You are subsidizing the AI companies' training data pipeline and getting nothing in return.
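To make the asymmetry concrete, here is a rough back-of-envelope sketch in Python. Every number in it is an assumption chosen for illustration (page weight, crawler hit rate, egress pricing), not a measurement, but the shape of the result holds: real bandwidth costs on one side, zero offsetting revenue on the other.

```python
# Back-of-envelope estimate of the monthly bandwidth cost a small publisher
# absorbs from AI crawlers. All figures below are assumptions for
# illustration, not measurements.

AVG_PAGE_SIZE_MB = 2.5          # assumed average page weight (HTML + assets)
CRAWLER_HITS_PER_DAY = 50_000   # assumed aggregate daily hits from AI crawlers
EGRESS_COST_PER_GB = 0.09       # assumed cloud egress price, USD per GB

monthly_gb = AVG_PAGE_SIZE_MB * CRAWLER_HITS_PER_DAY * 30 / 1024
monthly_cost = monthly_gb * EGRESS_COST_PER_GB

print(f"Crawler egress: {monthly_gb:,.0f} GB/month")
print(f"Bandwidth cost attributable to crawlers: ${monthly_cost:,.2f}/month")
print("Ad or subscription revenue from those same requests: $0.00")
```

Plug in your own traffic numbers and the conclusion rarely changes: the requests scale with how often your content is worth scraping, and none of them convert.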
robots.txt is clearly not solving this. The protocol that lets site owners signal which pages they do not want crawled is voluntary, unenforceable, and widely ignored by aggressive crawlers. OpenAI and Google have faced allegations of violating robots.txt restrictions. Smaller crawlers often have no policy at all.
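Honoring robots.txt is trivial for a crawler that wants to. The sketch below uses only Python's standard library; the directives and bot names are illustrative, and the point is that nothing in the protocol enforces the check: a crawler that skips it simply fetches the page anyway.

```python
# How a well-behaved crawler honors robots.txt, using only Python's standard
# library. The directives and user-agent names here are illustrative.
from urllib.robotparser import RobotFileParser

# The kind of rules publishers now add to ask AI crawlers to stay out.
robots_txt = """\
User-agent: ExampleAIBot
Disallow: /

User-agent: *
Allow: /
""".splitlines()

parser = RobotFileParser()
parser.parse(robots_txt)

for agent in ("ExampleAIBot", "OrdinaryCrawler"):
    allowed = parser.can_fetch(agent, "https://example.com/article/123")
    print(f"{agent}: {'may crawl' if allowed else 'asked not to crawl'}")
```

Compliance amounts to one if-statement; the protocol's weakness is that it is an if-statement each crawler writes, or declines to write, for itself.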
Larger publishers have started pursuing licensing deals. The New York Times sued OpenAI. Reddit struck a deal with Google for training data access. The Associated Press signed agreements with multiple AI companies. These deals matter because they establish two things: that web content has economic value to AI companies, and that some of that value should flow back to creators.
But the licensing model only works for publishers with enough scale and legal resources to negotiate. The long tail of the web — independent bloggers, niche publications, local news sites — has no leverage and no realistic path to compensation.
The deeper problem is structural. The web's economic model was built around the assumption that valuable traffic is human traffic. Every analytics system, every ad platform, every subscription funnel is optimized for human visitors. Bot traffic at the scale AI companies are generating breaks those assumptions at a fundamental level.
There are realistic solutions, but none of them are easy. A micropayment system for AI crawler access — where each page request by a commercial AI company triggers a small payment — would require industry-wide coordination that has historically failed to materialize. An API model, where publishers offer curated access to their content for fees, works for large publishers but creates barriers for the open web. Government regulation of crawler economics is possible but slow.
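To make "micropayments for crawler access" less abstract, here is a deliberately hypothetical sketch of the publisher-side half of such a system: identify requests from commercial AI crawlers and meter them for billing. No such billing standard exists today; the user-agent tokens are real crawler names used purely as examples, and the per-request price and ledger are invented for illustration.

```python
# Hypothetical per-request metering of AI crawler traffic on the publisher
# side. The bot list, price, and ledger are assumptions for illustration;
# no industry-standard billing mechanism like this exists today.
from collections import Counter

AI_CRAWLER_TOKENS = {"GPTBot", "ClaudeBot"}   # example user-agent tokens
PRICE_PER_REQUEST_USD = 0.001                 # assumed micro-fee per page request

ledger = Counter()

def record_request(user_agent: str, path: str) -> None:
    """Tally billable requests from identified AI crawlers; ignore other traffic."""
    for token in AI_CRAWLER_TOKENS:
        if token.lower() in user_agent.lower():
            ledger[token] += 1
            return

# Simulated access-log entries standing in for real traffic.
record_request("Mozilla/5.0 (compatible; GPTBot/1.0)", "/article/123")
record_request("Mozilla/5.0 (Windows NT 10.0) Chrome/120.0", "/article/123")
record_request("Mozilla/5.0 (compatible; ClaudeBot/1.0)", "/archive/2019")

for bot, hits in ledger.items():
    print(f"{bot}: {hits} requests -> invoice ${hits * PRICE_PER_REQUEST_USD:.3f}")
```

The hard part is not the code. It is getting every AI company, every CDN, and millions of publishers to agree on identification, pricing, and settlement, which is exactly the kind of industry-wide coordination that has historically failed to materialize.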
What is not sustainable is the current state: AI companies benefiting enormously from the accumulated knowledge and writing of humanity, paying almost nothing for it, and then charging users to access the result. The internet's publishing economy did not need much help breaking. Giving it over to bot traffic without a compensation model might finish the job.




