In a landmark ruling that could shape the future of AI and copyright law, a US federal judge has held that AI company Anthropic did not violate copyright laws by using books to train its large language model. However, the court found the company at fault for storing pirated versions of those books.
Judge William Alsup of the San Francisco federal court ruled that Anthropic’s use of works by authors Andrea Bartz, Charles Graeber, and Kirk Wallace Johnson to train its Claude AI model constituted fair use. He likened the process to that of “a reader aspiring to be a writer,” stating that the model used the books “not to race ahead and replicate or supplant them” but to “turn a hard corner and create something different.”
While affirming the legality of AI training on copyrighted works under fair use, the judge drew a firm line when it came to Anthropic’s methods of acquiring the training material. He ruled that the company’s copying and storage of over 7 million pirated books in a central library was an infringement of copyright law.
“That Anthropic later bought a copy of a book it earlier stole off the internet will not absolve it of liability for the theft,” Judge Alsup wrote, “but it may affect the extent of statutory damages.”
The ruling allows the case to move forward to trial in December to determine how much Anthropic owes the plaintiffs for the infringement. Under US law, willful copyright infringement can result in statutory damages of up to $150,000 per work.
The decision, the first to address the fair use doctrine in the context of generative AI, is likely to influence several similar cases pending across the United States. “Judge Alsup’s decision here will be something those other courts must consider in their own case,” said John Strand, a copyright lawyer at Wolf Greenfield.
An Anthropic spokesperson expressed satisfaction with the ruling, stating the company was “pleased the court recognised its AI training was transformative and consistent with copyright’s purpose in enabling creativity and fostering scientific progress.”
However, Judge Alsup noted that Anthropic’s internal “central library of all the books in the world” was not used exclusively for training, and that storing pirated works could not be justified under fair use.
The case is part of a growing legal battle between AI firms and copyright holders, including authors, publishers, and media organizations, who argue their work is being used without consent or compensation. More rulings in the coming months—and potentially a future decision from the US Supreme Court—are expected to clarify the boundaries of AI training and copyright law.