The rapid advancement of generative AI has sparked debate over the blurring line between original and reproduced content, posing potential economic threats to creative professionals. The issue is particularly pertinent to the training of AI models like ChatGPT and Claude, which rely on massive text datasets. Anthropic, the company behind Claude, recently faced accusations of using copyrighted material without permission. The tide turned in its favor, however, when a judge ruled that its training methods qualify as “fair use” under U.S. copyright law.
In a landmark decision on June 24, 2025, Judge William Alsup of the Northern District of California ruled that Anthropic’s use of legally purchased and digitized books to train its AI models constitutes fair use. The judge emphasized that transforming text into AI training knowledge, rather than copying or redistributing it, aligns with fair use principles.
However, the ruling didn’t grant carte blanche to all of Anthropic’s practices. The judge held the company accountable for using pirated books obtained from unauthorized sources such as Books3 and LibGen. Although the transformative purpose of the training was acknowledged, the illegal sourcing of the data was not excused.
Judge Alsup ordered a separate trial to address Anthropic’s use of pirated content and determine potential damages. He noted that downloading books from pirate sites is difficult to justify when lawful copies could have been purchased.
The case sets a precedent for the AI industry: training on legally obtained books is permissible, but acquiring data through piracy is not defensible. It opens the door for authors to pursue piracy claims and establishes clearer guidelines for the ethical training of AI models.