Google Engaged In 'Mass Theft' To Train AI, Lawsuit Alleges

Google was sued Tuesday for allegedly engaging in “mass theft of personal information” to develop artificial intelligence products such as the chatbot Bard.

“Google has been secretly stealing everything ever created and shared on the internet by hundreds of millions of Americans,” eight people allege in a class-action complaint brought in U.S. District Court for the Northern District of California. The complaint doesn't name the litigants, but describes one as a best-selling author and journalist based in Texas, and another as a professor and actor from California.

The 90-page complaint alleges that Google harvested “all our personal and professional information, our creative and copywritten works, our photographs, and even our emails -- virtually the entirety of our digital footprint” to further its artificial intelligence initiative.

The lawsuit includes claims that Google infringed copyright, and that it violated users' privacy.

Many of the allegations draw on published reports, including an April 19 Washington Post article about the datasets that were used to train various chatbots. According to that article, the copyright symbol appeared more than 200 million times in one of the datasets.

The complaint also notes that Google on July 1 updated its privacy policy to say the company uses publicly available data to train artificial intelligence products -- including Bard and Google Translate. The prior policy also allowed Google to use publicly available information to train “language models,” but didn't reference Bard by name.

The privacy policy “essentially presents internet users worldwide with a dystopian ultimatum: either use the internet and surrender all your personal and copyrighted information ... or avoid the internet entirely,” the complaint alleges. “In our modern world, the latter is untenable, as the internet is an essential tool for professional, educational, and social engagement.”

Google general counsel Halimah DeLaine Prado stated through a spokesperson that the company has "been clear for years" that it uses data from public sources to train artificial intelligence models.

"American law supports using public information to create new beneficial uses, and we look forward to refuting these baseless claims," Prado stated.

Google isn't the only company facing litigation over artificial intelligence. OpenAI currently faces several lawsuits -- including one that accuses the company of violating web users' privacy, and at least two by authors who say the company infringed their copyright. One of the copyright cases was brought last month by Paul Tremblay and Mona Awad, and one was filed late last week by Sarah Silverman, Christopher Golden and Richard Kadrey.
