Judge Sides With Meta In Copyright Battle With Authors

Siding with Meta Platforms, a federal judge in California ruled that the company did not infringe copyright by downloading books by Sarah Silverman, Junot Díaz and other authors, and then using the texts to train the large language model Llama.

But U.S. District Court Judge Vince Chhabria emphasized in the 40-page decision that his ruling was limited to the particular facts of the dispute, and that in most cases, copying material in order to train generative artificial intelligence would likely be illegal.

The ruling marked the second major decision this week regarding tech companies' use of copyrighted material for training purposes.

On Monday, U.S. District Court Judge William Alsup in the Northern District of California ruled that Anthropic did not infringe copyright by digitizing books it had purchased and then using the text to train the chatbot Claude.

Alsup specifically said using copyrighted material to train artificial intelligence was "quintessentially transformative," and therefore protected by fair use principles.

Chhabria's ruling was significantly narrower than Alsup's, and appears to conflict with it on a key point.

While Chhabria agreed that using books to train large language models was transformative, he also said doing so would likely infringe copyright "in most cases" because generative artificial intelligence "has the potential to flood the market with endless amounts of images, songs, articles, books, and more."

"There is certainly no rule that when your use of a protected work is 'transformative,' this automatically inoculates you from a claim of copyright infringement," Chhabria wrote.

"By training generative AI models with copyrighted works, companies are creating something that often will dramatically undermine the market for those works, and thus dramatically undermine the incentive for human beings to create things the old-fashioned way," he added.

Despite that broad language, Chhabria sided against the authors, ruling that they failed to present evidence that Meta "copied their works to create a product that will likely flood the market with similar works, causing market dilution."

"In the grand scheme of things, the consequences of this ruling are limited," Chhabria wrote, adding that the lawsuit was not a class action, and therefore wouldn't affect the rights of other authors.

"This ruling does not stand for the proposition that Meta’s use of copyrighted materials to train its language models is lawful. It stands only for the proposition that these plaintiffs made the wrong arguments and failed to develop a record in support of the right one," he wrote.

The decision stemmed from a lawsuit brought against Meta in July 2023 by authors Richard Kadrey, Sarah Silverman, and Christopher Golden, and later joined by others -- including Ta-Nehisi Coates, Junot Díaz, and Laura Lippman. They alleged that Meta downloaded books from free online "shadow libraries" that contained pirated material, and used the content to train Llama.

Meta countered that using copyrighted work to train large language models was a fair use, arguing that large language models can benefit "billions of people."

Large language models like Llama "can use the building blocks of language in remarkable ways, including to generate creative text, solve mathematical theorems, predict protein structure, answer reading comprehension questions, and more," Meta argued.

Chhabria's ruling left open the possibility that Meta may have infringed copyright by allegedly re-uploading some of the same books it downloaded from the shadow libraries.
