Commentary

Plugging The Crawl Space: Publishers Offered A Tool For Stopping Content Scrapers

As Cloudflare CEO Matthew Prince promised weeks ago, his firm has launched a way of blocking those evil AI crawlers that scrape content without paying for it. And as Prince boasted, several publishers are on board, including Gannett Media, Fortune, Condé Nast, BuzzFeed and Dotdash Meredith.

The goal is to give publishers “the control they deserve and build a new economic model that works for everyone – creators, consumers, tomorrow’s AI founders, and the future of the web itself,” Prince says.  

The old internet model in which search engines indexed content and linked users to the original websites, resulting in traffic and ad revenue, is broken, Cloudflare asserts. These days, AI crawlers collect text, articles, and images without sending people seeking answers to the original source, it says.

“AI companies, search engines, researchers, and anyone else crawling sites have to be who they say they are,” adds Steve Huffman, co-founder and CEO of Reddit. “And any platform on the web should have a say in who is taking their content for what.”  

Can technology really solve the crawl problem?

Prince says it can—by default.

What does Cloudflare’s new method do?

To start with, website owners can decide whether they want AI crawlers to access their content and, if so, how the crawlers can use it. The AI firms must state their purpose (training, inference, or search), and publishers will determine whether to let them in.
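For readers who want to picture how such a rule might work, here is a minimal sketch in Python. It is not Cloudflare's API or implementation: the `CrawlerPolicy` class, the `permit` method, and the crawler name `LicensedBot` are all hypothetical, invented purely for illustration. The only things taken from the announcement are the three declared purposes and the idea that crawlers are blocked unless the publisher says otherwise.

```python
from dataclasses import dataclass, field

# The three purposes a crawler must declare, per the announcement.
PURPOSES = {"training", "inference", "search"}


@dataclass
class CrawlerPolicy:
    """Hypothetical publisher-side policy; not Cloudflare's actual API."""

    # Block by default: crawlers not listed below get no access.
    default_allow: bool = False
    # Maps a crawler name to the set of purposes the publisher permits.
    allowed: dict[str, set[str]] = field(default_factory=dict)

    def permit(self, crawler: str, purpose: str) -> bool:
        # Reject any purpose outside the three declared categories.
        if purpose not in PURPOSES:
            return False
        # A listed crawler is limited to the purposes granted to it.
        if crawler in self.allowed:
            return purpose in self.allowed[crawler]
        # Everyone else falls back to the default.
        return self.default_allow


# Illustrative example: one licensed partner may use content for
# search and inference, but not for training.
policy = CrawlerPolicy(allowed={"LicensedBot": {"search", "inference"}})
```

Under this sketch, `policy.permit("LicensedBot", "search")` is allowed, while a training request from the same crawler, or any request from an unlisted crawler, is refused.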

This obviously requires human oversight even with automation. But it has to be cheaper than filing lawsuits against AI companies after the fact. Publishers ranging from The New York Times to the Center for Investigative Reporting are fighting a long case with OpenAI over its alleged transgressions. And that firm is now being sued in Denmark. 

At the same time, some of the publishers using Cloudflare have paid licensing arrangements with AI companies. The new Cloudflare offering ensures that their content sharing will be limited to the firms with which they have these deals, such as OpenAI. 

How do publishers feel?

Roger Lynch, CEO of Condé Nast, calls the Cloudflare offering a “critical step toward creating a fair value exchange on the Internet.”

“Cloudflare’s innovative approach to block AI crawlers is a game-changer for publishers and sets a new standard for how content is respected online,” Lynch says. “When AI companies can no longer take anything they want for free, it opens the door to sustainable innovation built on permission and partnership.”  

Renn Turiano, chief consumer and product officer of Gannett Media, agrees. 

"As the largest publisher in the country, comprised of USA TODAY and over 200 local publications throughout the USA TODAY Network, blocking unauthorized scraping and the use of our original content without fair compensation is critically important,” Turiano says. “As our industry faces these challenges, we are optimistic the Cloudflare technology will help combat the theft of valuable IP.”

A similar view is held at Dotdash Meredith. 

“We have long said that AI platforms must fairly compensate publishers and creators to use our content,” says Neil Vogel, CEO of Dotdash Meredith. “We can now limit access to our content to those AI partners willing to engage in fair arrangements.”

Support for this mission also comes from The Associated Press.

“The information landscape continues to change rapidly but the value of accurate, factual, nonpartisan journalism has never been more essential,” says Kristin Heitmann, chief revenue officer, The Associated Press.

Of course, this isn’t the only relevant Cloudflare service, although it may be the most advanced. The firm’s AI Labyrinth links crawlers that ignore “no crawl” directives to a series of fake, AI-generated pages that are “convincing enough to entice a crawler to traverse them,” the company said in a blog post earlier this year. 

It is rare to see a new product like this receiving a plethora of advance endorsements. Even the News/Media Alliance, a publishing trade group, has weighed in. 

“The rise of AI presents exciting opportunities, but in order for the industry to grow sustainably, it must do so in cooperation with publishers,” says Danielle Coffey, president & CEO, News/Media Alliance. “Cloudflare's tools provide a strong framework for a more equitable exchange, offering a path for both industries to grow and thrive together.”

The takeaway? Prince concludes that this is about “safeguarding the future of a free and vibrant Internet with a new model that works for everyone.” 

 
