Artificial intelligence company OpenAI engaged in “flagrant and harmful” infringement by using copyrighted works of fiction to train chatbots like ChatGPT that generate content, the Authors Guild and a group of best-selling authors claim in a new lawsuit.
OpenAI “copied plaintiffs’ works wholesale, without permission or consideration,” the authors write in a complaint brought Tuesday in U.S. District Court for the Southern District of New York. Novelists who are suing include Game of Thrones writer George R.R. Martin, My Sister's Keeper author Jodi Picoult and Presumed Innocent writer Scott Turow.
OpenAI (and affiliated companies) then “fed” the writers' books into their large language model algorithms, which generate responses to user queries, the complaint continues.
“These algorithms are at the heart of defendants’ massive commercial enterprise. And at the heart of these algorithms is systematic theft on a mass scale,” the complaint alleges.
OpenAI already faces several similar lawsuits, among them suits brought in June by authors including Paul Tremblay and Sarah Silverman.
The company suggested in those cases that it will argue it's protected by fair use principles.
“At the heart of plaintiffs’ complaints are copyright claims,” OpenAI wrote last month in motions filed in Tremblay's and Silverman's lawsuits. “Those claims, however, misconceive the scope of copyright, failing to take into account the limitations and exceptions (including fair use) that properly leave room for innovations like the large language models now at the forefront of artificial intelligence.”
Apparently anticipating that argument, the Authors Guild says in its new complaint that OpenAI's conduct was unfair.
The complaint alleges that OpenAI could have used works in the public domain to train the large language models, or could have paid a “reasonable licensing fee” to use copyrighted books.
“What defendants could not do was evade the Copyright Act altogether to power their lucrative commercial endeavor, taking whatever datasets of relatively recent books they could get their hands on without authorization,” the authors allege. “There is nothing fair about this.”