Large language model (LLM) crawlers are harvesting content at scale for training and redistribution, often resulting in IP theft, SEO dilution and lost revenue.
Yet 68% of media and publishing sites are failing to detect basic bots, a 20% increase YoY, according to DataDome's 2025 Global Bot Security Report.
Fewer than 30% are partially protected (down 16% YoY), while less than 2% are fully protected, meaning they detect both basic and sophisticated AI-powered bots (down 4% YoY).
LLM crawler volume rose from 2.6% of verified bot traffic in January 2025 to 10.1% by August across DataDome's customer base.
“AI agents are rewriting the rules of online
engagement,” says Jérôme Segura, vice president of threat research at DataDome. “They mimic human behavior, spawn synthetic browsers, bypass CAPTCHAs, and adapt in real
time.”
Segura adds: “Traditional defenses, built to spot static automation, are collapsing under this complexity. Businesses can’t tell if the AI traffic they’re
seeing is good or bad, which leaves them both exposed to fraud and blind to opportunity.”
The study presents these findings:
Of the domains studied, “88.9% disallow
GPTBot in their robots.txt files, yet this measure offers little real protection,” the study notes. “AI-powered crawlers and browsers ignore these directives, rendering static blocking
strategies obsolete.”
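For reference, the robots.txt opt-out the study describes is a purely advisory directive. A minimal sketch looks like the following (GPTBot is the user agent named in the report; the wildcard record is illustrative):

```
# robots.txt — static opt-out directives
# Compliant crawlers honor these; the study notes AI-powered
# crawlers and browsers can simply ignore them.
User-agent: GPTBot
Disallow: /

# Illustrative: allow all other crawlers full access
User-agent: *
Disallow:
```

Because enforcement is entirely voluntary on the crawler's side, this is exactly the "static blocking" the report deems obsolete.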
At the same time, legacy defenses are failing: A mere 2.8% of websites were fully protected this year, versus 8.4% in 2024.
And AI-driven traffic doesn't stop at scraping. This year, 64% of AI bot traffic reached forms, 23% reached login pages, and 5% reached checkout flows. The result: new vectors for fraud, account takeover and compliance risk.
Who is suffering the most? High-risk sectors such as Government, Nonprofit and Telecom. In contrast, Travel & Hospitality, Gambling, and Real Estate had the highest combined rates of full and partial protection.
Overall, detection rates for sites with bot security tools deployed “topped out at just 42%, with some detecting only 6% of bot traffic, revealing
major gaps in real-world effectiveness even among providers claiming bot mitigation as a core capability,” the study states. “For media and publishing, who have mostly defaulted to
blocking AI-traffic, this is a significant issue.”