It should come as no surprise that other businesses besides publishing are beset by AI bots scraping their sites.
But media organizations are most likely to block the offenders,
judging by a study from ImmuniWeb. Here is the share of organizations in each category that block AI bots:
- Encyclopedia Britannica’s list of the World Newspapers and Magazines: 83%
- Top Academic Journals in the World: 74%
- Top Academic Research Databases: 73%
- Legal 500 Top Law Firms in the United States 2025: 64%
- Legal 500 Top Law Firms in England 2025: 63%
- Forbes: World’s Best Management Consulting Firms 2025: 52%
- Forbes World Best Banks 2025: 43%
- Legal 500 Top Law Firms in France 2025: 38%
- Shanghai Academic Ranking of World Universities 2025: 36%
Of 98 surveyed from Encyclopedia Britannica’s list of world newspapers and magazines, 83% are blocking AI bots, according to a study by ImmuniWeb.
Who is getting blocked?
Here is the roster of the AI platforms getting blocked across all industries:
- Copilot by Microsoft: 34.7%
- Claude by Anthropic: 27.2%
- GPTBot by OpenAI: 20.8%
- AmazonBot by Amazon: 17.7%
- Meta AI by Meta: 12.4%
- Apple Intelligence by Apple: 11.9%
- Perplexity by Perplexity AI: 11.9%
- Gemini by Google: 8.6%
Please do not infer any wrongdoing by the firms on this list. Some are facing litigation and will have their day in court.
But the scraping problem has developed rather quickly, the study notes:
“Back in mid-2022, automated scraping of data from websites was rather a niche problem, well known only
in some industries like ecommerce, where competitors’ bots were scraping web data such as prices or discounts to gain an edge by offering more attractive deals to their customers,” it
says. “The situation has radically changed after the launch of ChatGPT by OpenAI in November 2022.”
How so?
“Today, the problem of massive proliferation of
unauthorized data scraping by AI companies and their suppliers continuously dominates the global media headlines,” the study notes.
But there is a solution:
“Ultimately, today, site owners and authors of creative content have no other viable option to protect the fruits of their intellectual labor, but to deploy a set of security and technical
controls to ban automated traffic from bots, eventually changing how the modern Internet works.”
That sounds right. But there is one other option: Today, there are “over 250
pending lawsuits only in the US against major AI vendors for, among other things, copyright infringement, unwarranted data scraping and even exploitation of pirated content for AI training
purposes,” the study reports.
Better staff up with both engineers and lawyers.