Judge Rules in Favor of Anthropic on AI Training—but Pirated Books Case Heads to Trial
In a major legal moment for the AI industry, a federal judge has ruled that Anthropic, the company behind the AI chatbot Claude, did not violate copyright law by training its model on millions of books. But the court isn’t letting the company off the hook entirely.
The ruling, handed down Monday by U.S. District Judge William Alsup in San Francisco, dismissed the core copyright infringement claim brought by three authors, agreeing that Anthropic’s use of the books was “quintessentially transformative” and thus protected under fair use. However, the court also ruled that Anthropic must stand trial this December over how it obtained those books—specifically, from shadowy online repositories of pirated content.
“Anthropic had no entitlement to use pirated copies for its central library,” Judge Alsup wrote. “That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability.”
What’s the Case About?
The lawsuit was filed last year by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson, who accused Anthropic of “large-scale theft” by using their books—along with countless others—as raw material to build its chatbot.
They allege that Anthropic “strip-mined” their creative work to train Claude and now profits from their expression without permission or payment. While books are known to be a crucial part of the data diet for training advanced AI, the problem lies in where those books came from.
Court documents revealed that Anthropic employees downloaded pirated versions of copyrighted books from so-called “shadow libraries”—illegal online archives—raising internal concerns about legality. Eventually, the company changed course and began purchasing physical books in bulk, removing the bindings, scanning every page, and feeding the data into its model.
Still, the judge made it clear: that about-face doesn’t erase past wrongdoing.
Why This Matters
This ruling sets a first-of-its-kind legal benchmark in the growing wave of lawsuits against AI companies like OpenAI and Meta, which have also been accused of training their models on copyrighted content without proper licensing.
The court’s message is nuanced but important:
- Training an AI model on copyrighted books can qualify as fair use when the use is transformative (i.e., the model generates new text rather than reproducing the originals).
- But using pirated material to do it is not. And that’s the part that could cost Anthropic in damages when the case goes to trial.
Anthropic’s Response
Anthropic responded to the ruling by emphasizing the judge’s fair use conclusion.
“We’re pleased the court recognized that AI training is transformative and aligns with copyright’s purpose of fostering creativity and scientific progress,” the company said in a statement.
It notably did not address the allegations of piracy.
Anthropic, which was founded in 2021 by former OpenAI executives, has worked hard to brand itself as a more ethical, safety-focused alternative in the AI arms race. But this lawsuit challenges that image.
The authors behind the suit have argued that Anthropic’s actions undermine its high-minded mission and that its product is built on stolen intellectual property.
“Anthropic’s actions have made a mockery of its lofty goals,” their original complaint said.
Attorneys for the authors declined to comment following the judge’s ruling.
What Comes Next?
The case now heads to trial in December, where the court will decide whether Anthropic is liable for downloading pirated books, and—if so—how much it owes in damages.
With more copyright lawsuits looming across the AI industry, this case could shape how companies acquire training data moving forward, potentially redrawing the boundaries between innovation and infringement.
Stay tuned. This one’s far from over.
Source: AP News – Anthropic wins ruling on AI training in copyright lawsuit but must face trial on pirated books