
Artificial intelligence company OpenAI engaged in “flagrant and harmful” infringement by using copyrighted works of fiction to train chatbots like ChatGPT that generate content, the Authors Guild and a group of best-selling authors claim in a new lawsuit.
OpenAI “copied plaintiffs’ works wholesale, without permission or consideration,” the authors write in a complaint brought Tuesday in U.S. District Court for the
Southern District of New York. Novelists who are suing include Game of Thrones writer George R.R. Martin, My Sister's Keeper author Jodi Picoult and Presumed Innocent writer
Scott Turow.
OpenAI (and affiliated companies) then “fed” the writers' books into the large language model algorithms, which can generate responses to queries by users, the
complaint continues.
“These algorithms are at the heart of defendants’ massive commercial enterprise. And at the heart of these algorithms is systematic theft on a mass
scale,” the complaint alleges.
OpenAI already faces several similar lawsuits, including ones brought in June by authors such as Paul Tremblay and Sarah Silverman.
The company suggested in those cases that it will argue it's protected by fair use principles.
“At the heart of plaintiffs’ complaints are copyright claims,” OpenAI wrote
last month in motions filed in Tremblay's and Silverman's lawsuits. “Those claims, however, misconceive the scope of copyright, failing to take into account the limitations and exceptions
(including fair use) that properly leave room for innovations like the large language models now at the forefront of artificial intelligence.”
Apparently anticipating that argument, the
Authors Guild says in its new complaint that OpenAI's conduct was unfair.
The complaint alleges that OpenAI could have used works in the public domain to train the large language models, or
could have paid a “reasonable licensing fee” to use copyrighted books.
“What defendants could not do was evade the Copyright Act altogether to power their lucrative
commercial endeavor, taking whatever datasets of relatively recent books they could get their hands on without authorization,” the authors allege. “There is nothing fair about
this.”