The AI industry has a problem it doesn't want to talk about: the environmental cost of all those impressive video demos. A new report reveals that generating just 5 seconds of AI video consumes roughly the same amount of energy as running a microwave for an hour.
Let that sink in. Five seconds.
This isn't speculation or theoretical modeling—these are measured numbers from production systems that millions of users are accessing daily. While companies like OpenAI, Google, and Runway race to release increasingly sophisticated video generation tools, they've been remarkably quiet about the infrastructure costs.
The energy consumption stems from the massive computational requirements of diffusion models, which build a video through many repeated denoising passes of a large neural network. Unlike text generation, which might require a few seconds of GPU time, video synthesis demands sustained processing across hundreds of frames, all of which must stay temporally coherent with one another.
The scale problem gets worse when you consider how these tools are actually used. A typical user might generate dozens of clips before getting one they like. That's not one hour of microwave energy; that's dozens of hours. Multiply that by millions of users experimenting with Sora, Runway Gen-3, or Google's Veo, and you're looking at energy consumption equivalent to powering small cities.
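To make that multiplication concrete, here is a back-of-the-envelope sketch. Every constant in it is an illustrative assumption rather than a figure from the report: a microwave drawing about 1 kW, 24 attempts per user standing in for "dozens," and one million users standing in for "millions." The function names are hypothetical.

```python
# Back-of-the-envelope sketch. All constants are illustrative assumptions,
# not measurements from the report.
MICROWAVE_POWER_KW = 1.0  # assumed typical microwave power draw
HOURS_PER_CLIP = 1.0      # the article's comparison: one 5-second clip ~ one microwave-hour


def clip_energy_kwh(power_kw: float = MICROWAVE_POWER_KW,
                    hours: float = HOURS_PER_CLIP) -> float:
    """Energy for one generated clip, in kWh (power times time)."""
    return power_kw * hours


def session_energy_kwh(attempts: int, per_clip_kwh: float) -> float:
    """Energy for one user who generates `attempts` clips, in kWh."""
    return attempts * per_clip_kwh


def fleet_energy_mwh(users: int, attempts_per_user: int, per_clip_kwh: float) -> float:
    """Energy across all users, converted from kWh to MWh."""
    return users * attempts_per_user * per_clip_kwh / 1000.0


if __name__ == "__main__":
    per_clip = clip_energy_kwh()
    print("per clip (kWh):", per_clip)
    print("one user, 24 attempts (kWh):", session_energy_kwh(24, per_clip))
    print("1M users, 24 attempts each (MWh):", fleet_energy_mwh(1_000_000, 24, per_clip))
```

Change the assumed wattage, attempt count, or user count and the totals scale linearly, which is the point: the per-clip figure matters far less than how many clips get generated.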
What's particularly frustrating is that this was entirely predictable. When Sam Altman first demoed Sora in February 2024, engineers immediately started calculating the computational requirements. The math was straightforward: video has dramatically higher dimensionality than text or images, and diffusion models scale poorly with dimensionality.
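A rough version of that math can be sketched in a few lines. The numbers here are assumptions chosen for illustration (a 1280x720 frame, 24 frames per second, 16x16-pixel patches), and the quadratic term assumes a model that applies full self-attention across every patch in the clip, which real systems may avoid through latent compression or factorized attention. None of this describes any specific product's architecture.

```python
# Illustrative scaling sketch. Resolution, frame rate, and patch size are
# assumptions, not details of any real video model.
def num_values(frames: int, height: int, width: int, channels: int = 3) -> int:
    """Raw number of values in a clip (an image is a clip with frames=1)."""
    return frames * height * width * channels


def num_patches(frames: int, height: int, width: int, patch: int = 16) -> int:
    """Number of patch tokens, assuming non-overlapping patch x patch tiles per frame."""
    return frames * (height // patch) * (width // patch)


def attention_pairs(tokens: int) -> int:
    """Token pairs compared by full self-attention (grows with the square of tokens)."""
    return tokens * tokens


if __name__ == "__main__":
    image_tokens = num_patches(1, 720, 1280)
    video_tokens = num_patches(5 * 24, 720, 1280)  # 5 seconds at an assumed 24 fps
    print("values ratio:", num_values(120, 720, 1280) // num_values(1, 720, 1280))
    print("token ratio:", video_tokens // image_tokens)
    print("attention-pair ratio:",
          attention_pairs(video_tokens) // attention_pairs(image_tokens))
```

Under these assumptions a 5-second clip carries 120 times the data of a single frame, but the pairwise attention work grows by the square of that factor, which is one way to read "scales poorly with dimensionality."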
But here's what the press releases never mention: most of this energy is wasted. The majority of generated videos are discarded. Users iterate until they get something acceptable, burning energy with each attempt. There's no penalty for waste, no feedback loop that discourages inefficient use.
The report also highlights the disparity between AI video and traditional video processing. Producing a clip from existing footage uses a fraction of the energy; we're talking orders of magnitude less. Generative AI isn't just inefficient; it's fundamentally wasteful compared to the tools it's trying to replace.

