A troubling pattern is emerging in open source: developers are using LLMs to rewrite GPL-licensed code, then relicensing it under permissive terms like MIT or Apache. It's effectively laundering copyleft protections through AI, and it raises fundamental questions about whether software licensing can survive contact with large language models.
The practice came to prominence recently when the chardet Python library—a character encoding detection tool originally licensed under LGPL—was rewritten using an LLM and relicensed under MIT. The new version preserves the functionality while claiming the rewritten code is no longer a derivative work subject to the original copyleft license.
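For readers unfamiliar with the library: chardet guesses the character encoding of raw bytes. The real library uses statistical byte-frequency models, but the core idea can be sketched with a toy trial-decode heuristic (the function name and candidate list here are illustrative, not chardet's actual API):

```python
def guess_encoding(data: bytes, candidates=("utf-8", "windows-1251", "latin-1")):
    """Toy encoding guesser: return the first candidate that decodes cleanly.

    UTF-8 is strict about byte sequences, so a successful decode is strong
    evidence; latin-1 accepts any byte, so it acts as a last-resort fallback.
    This is a sketch of the problem chardet solves, not its real algorithm.
    """
    for enc in candidates:
        try:
            data.decode(enc)
            return enc
        except UnicodeDecodeError:
            continue
    return None

guess_encoding("héllo".encode("utf-8"))            # → "utf-8"
guess_encoding("Привет".encode("windows-1251"))    # → "windows-1251"
```

The point is that the library's observable behavior is a simple mapping from bytes to an encoding name, which is exactly why a rewrite can reproduce it perfectly while sharing no literal text with the original.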
Legally and ethically, this is a minefield.
Copyleft licenses like GPL exist for a specific reason: to ensure that improvements to shared code remain shared. If you modify GPL software, your modifications must also be GPL-licensed. This creates a commons where contributions benefit everyone, not just the first person to close off a fork.
The theory behind the relicensing is that an LLM rewrite produces genuinely new code. Yes, the functionality is identical. Yes, the algorithms may be conceptually similar. But if the actual lines of code are different—generated fresh by a model rather than copied verbatim—it might not be a derivative work under copyright law.
That argument falls apart under scrutiny.
First, the LLMs doing the rewriting were themselves trained on GPL code. Codex, GitHub Copilot, and other code-generation models were trained on scrapes of millions of open source repositories, including GPL-licensed projects. The models internalized those patterns and now reproduce them with slight variations. Calling the output "original work" is like arguing an essay isn't plagiarized because every sentence was rephrased.
Second, even if the rewritten code is technically new, it's functionally identical. Copyright law has the concept of "substantial similarity": you can't avoid infringement just by rewriting something in different words if the structure and expression remain the same. Courts have applied this to software before, and there's no obvious reason an LLM rewrite would be treated differently from a human rewrite.
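To make the substantial-similarity point concrete, here is a deliberately trivial, hypothetical illustration (neither function comes from chardet or any GPL project): two implementations with different identifiers and loop shapes, but identical algorithm, structure, and behavior.

```python
# Imagine this version carries a copyleft license:
def gcd(a, b):
    # Euclid's algorithm, compact form
    while b:
        a, b = b, a % b
    return a

# An LLM-style "rewrite": fresh names, restructured loop body,
# yet the same algorithm, control flow, and observable behavior.
def greatest_common_divisor(x, y):
    while y != 0:
        remainder = x % y
        x = y
        y = remainder
    return x
```

No line matches verbatim, yet no one would call the second function independent creation; the structure and expression are the same, which is precisely what the substantial-similarity doctrine tests.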
Third, and most fundamentally, this completely undermines the social contract of open source. The reason people contribute to GPL projects is the assurance that their work won't be appropriated and closed off. If LLMs provide a trivial way to strip copyleft protections, that assurance evaporates.