Commentary

Google Licensing Reddit Data In Expanded Partnership

Reddit posts have increasingly shown up in Google Search results during the past month, so it's not surprising to learn that the two companies have signed a deal that lets Google use the social media platform's data for a variety of purposes, including training AI models.

Google, per Reuters, will pay Reddit $60 million annually to license its data from posts on the platform.

Last year, Reddit CEO Steve Huffman told The Verge that charging for data access could become a potential revenue stream for the company.

"We’ve had a longstanding relationship with Reddit for many years, and today we’re sharing a number of ways that we’re deepening our partnership across the company," Google wrote in a post published Thursday. "Reddit plays a unique role on the open internet as a large platform with an incredible breadth of authentic, human conversations and experiences, and we’re excited to partner to make it even easier for people to benefit from that useful information."

For starters, the partnership will bring more Reddit content and information into Google products such as Search, making them more helpful for users and making it easier to participate in Reddit communities and conversations, the company said.

Google now has access to Reddit's Data API, which delivers real-time, structured, unique content from Reddit's large and dynamic platform. That access could be tied to the $60 million a year that Reuters reported, but Google did not put a price tag on the partnership.

With the Reddit Data API, Google will have access to "real-time" information, similar to its partnership with Twitter around 2009, as well as enhanced signals that will help the company better understand Reddit content and display, train on, and otherwise use it in accurate and relevant ways.

This expanded partnership does not change Google's use of publicly available, crawlable content for indexing, training, or display in Google products, the company said.

Matthew Sag, professor of law, AI, ML and data science at Emory University Law School, held a virtual conversation with analysts at New Street Research on Thursday, where he said any community-based platform that facilitates conversation and produces content could see AI training on its data as an existential threat.

“You will see companies like Reddit telling the AI companies no, you cannot come to our site and collect information,” he said. “It’s my understanding that AI companies are respecting it.”

Sag also believes the industry will begin to see Reddit and Twitter in a position to charge for access to AI training data, which they apparently have already begun to do.

Twitter in April 2023 said it would start charging a minimum of $42,000 per month to users of its API, and in the following month accused Microsoft of using Twitter data without permission to train its chatbots. Elon Musk sent Microsoft CEO Satya Nadella a letter focused on Microsoft's alleged unauthorized use of Twitter data collected through an API.

Reddit is preparing to make its initial public offering filing this week, possibly as soon as today, which would detail its financials to potential investors for the first time, according to Reuters. The news service also reports that makers of AI models have been busy striking deals with content owners to diversify training data beyond large scrapes of the internet.

The company, which was valued at about $10 billion in a funding round in 2021, is seeking to sell about 10% of its shares in the offering, Reuters previously reported.
