
Amazon reported to the National Center for Missing and
Exploited Children (NCMEC) that it detected child sexual abuse material in its AI training data.
The NCMEC recently began tracking the number of reports tied to AI products and their
development. The organization was established by Congress to field tips about child sexual abuse and share them with law enforcement.
Amazon did remove the content before using it to train its models, the NCMEC said, but the finding could upend the assumption among brands that major commercial AI models are inherently “safe” or “clean.”
The organization saw at least a fifteen-fold increase in these AI-related reports last year, Bloomberg reported. The “vast majority” came from Amazon.
Overall, the NCMEC received 485,000
reports of AI-generated child sexual abuse material in the first half of 2025, up from 67,000 in all of 2024.
“About 35 tech companies now report AI-generated images of child sexual
abuse to NCMEC,” John Shehan of NCMEC wrote in a Facebook post in July 2025.
Companies often use data scraped from publicly available sources, such as the open web, to train their AI
models, Bloomberg reported.
To prevent child sexual abuse material from entering large language models (LLMs), companies attempt to use a “Safety by Design” approach across the model's entire lifecycle. It is a framework meant to put safety at the core of the model's architecture and development.
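In practice, screening training data for known abuse material typically relies on hash matching: each file is fingerprinted and compared against lists of hashes of previously identified content, such as those maintained by NCMEC. The sketch below is a minimal Python illustration of that idea; the hash set, paths, and function names are hypothetical, not any company's actual pipeline.

```python
import hashlib
from pathlib import Path

# Hypothetical blocklist of SHA-256 digests of known abuse material,
# as would be loaded from a clearinghouse hash list (e.g., NCMEC's).
KNOWN_BAD_HASHES: set[str] = set()

def fingerprint(path: Path) -> str:
    """Exact cryptographic fingerprint of a file's bytes."""
    digest = hashlib.sha256()
    with path.open("rb") as f:
        for chunk in iter(lambda: f.read(1 << 20), b""):
            digest.update(chunk)
    return digest.hexdigest()

def screen_training_data(root: str) -> list[Path]:
    """Return files matching the blocklist, to be removed from the
    training set and reported before any model training begins."""
    return [
        p for p in Path(root).rglob("*")
        if p.is_file() and fingerprint(p) in KNOWN_BAD_HASHES
    ]
```

Exact hashing only catches byte-identical copies, which is one reason scanning at scale still misses material; production systems generally add perceptual hashes (such as PhotoDNA) that tolerate resizing and re-compression, at the cost of some false positives.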
Attackers sometimes use sophisticated prompt engineering to trick a model into ignoring its safety training.
Techniques include jailbreak attacks, which attempt to get the model to produce malicious responses, and an approach known as “crescendo,” which begins with an innocent prompt and gradually pivots the conversation toward harmful content.
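Crescendo-style attacks work precisely because each individual turn can look harmless when judged in isolation. One defensive idea is to score the accumulated conversation as well as the newest message; this sketch assumes a hypothetical classify_risk callable standing in for a real moderation model that returns a score from 0 (benign) to 1 (harmful).

```python
from typing import Callable

def should_block(
    history: list[str],
    new_message: str,
    classify_risk: Callable[[str], float],  # hypothetical moderation model
    threshold: float = 0.7,
) -> bool:
    """Block if either the newest turn or the dialogue as a whole is risky."""
    # Per-turn check: catches overtly harmful single messages.
    if classify_risk(new_message) >= threshold:
        return True
    # Conversation-level check: a crescendo attack's gradual pivot only
    # becomes visible when the full trajectory is scored together.
    transcript = "\n".join(history + [new_message])
    return classify_risk(transcript) >= threshold
```

The second check is the point: filtering turn by turn alone lets a conversation drift toward harmful territory while every step stays below the threshold.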
There are also false negatives, limitations in filtering, and a lack of context: filters scan for specific keywords or depictions of body parts but fail to pick up nuanced clues about the context of a query.
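A toy example makes the false-negative problem concrete: an exact-match keyword filter is defeated by trivial obfuscation or euphemism, because it matches surface strings rather than meaning. The blocked terms below are benign placeholders, not a real blocklist.

```python
import re

# Benign placeholder terms standing in for a real blocklist.
BLOCKED_TERMS = {"forbidden", "restricted"}

def keyword_filter(query: str) -> bool:
    """Return True if the query should be blocked (exact word match only)."""
    words = set(re.findall(r"[a-z]+", query.lower()))
    return bool(words & BLOCKED_TERMS)

assert keyword_filter("show me forbidden content")            # caught
assert not keyword_filter("show me f0rbidden content")        # misses obfuscation
assert not keyword_filter("show me that thing from before")   # misses euphemism
```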
Other large technology companies
have also scanned their training data and reported potentially exploitative material to NCMEC.
However, the clearinghouse pointed to “glaring differences” between Amazon and its
peers.
The other companies collectively made just “a handful of reports,” and provided more detail on the origin of the material, an NCMEC official told Bloomberg.
“We
take a deliberately cautious approach to scanning foundation model training data, including data from the public web, to identify and remove known [child sexual abuse material] and protect our
customers,” an Amazon spokesperson told Bloomberg.
OpenAI submitted reports to NCMEC an estimated 80 times as frequently during the first six months of 2025 as in the same period a year earlier, Wired reported.
Aside from Amazon and OpenAI, other companies that have reported material to the NCMEC include
Google, Meta, Anthropic, and Microsoft.
Amazon accounted for most of the more than one million AI-related reports of child sexual abuse material submitted to NCMEC in 2025, the
organization said.
In 2024, Amazon's subsidiaries collectively submitted 64,195 reports to NCMEC.