YouTube CEO Warns OpenAI Against Using Its Data

YouTube CEO Neal Mohan doesn’t know whether OpenAI has used the company’s videos to train its AI models like Sora, but wants to make it perfectly clear that doing so would be a “clear violation” of YouTube’s terms of use.

When asked during an interview by Emily Chang, host of Bloomberg Originals, Mohan said that “from a creator’s perspective, when a creator uploads their hard work to our platform, they have certain expectations.”

“One of those expectations is that the terms of service is going to be abided by,” he added. “Our terms of service allows for some YouTube content, such as the title of the video, channel name or creator’s name, to be scraped, because that’s how you allow the content to show up in other search engines.”

The terms do not allow transcripts or video bits to be scraped, and doing so would be a clear violation of YouTube’s terms of service, he said. Mohan added that those are the rules governing content on the platform.

The debate follows remarks by OpenAI CTO Mira Murati, who expressed uncertainty about whether Sora, the company's text-to-video AI tool, was trained on user-generated content from platforms like YouTube.

Reddit and Google have disclosed a deal that lets Google use Reddit’s social-media data for a variety of purposes, including training AI models.

Most AI tools are trained with publicly available data, similar to the way search engine crawlers scrape data from across the web.

Mohan said YouTube works to protect creators by ensuring that everyone follows the core terms of service. That ultimately makes creators successful on the platform and builds “magical experiences for viewers.”

AI companies training their large language models on creative work without compensation or permission is not new. The New York Times initiated a lawsuit against AI creators Microsoft and OpenAI, alleging they used its copyrighted work to train their AI models.
