Commentary

AI Brand Recommendations: Chaotic, Inconsistent


AI recommendations vary even when prompts are identical. New data from analytics research firm SparkToro shows that intent is handled consistently, but the order of recommendations is mostly random, which makes tracking them error-prone.

When someone goes to ChatGPT, Claude, or Google’s AI and asks for brand or product recommendations, the AI engines almost never return the same list twice — and almost never in the same order, according to Rand Fishkin, CEO and co-founder of SparkToro.

Fishkin, along with Patrick O’Donnell, CTO and co-founder of Gumshoe.ai, analyzed whether generative AI recommendations could be consistently measured. As a side note, Fishkin has a habit of creating companies and selling them. He co-founded Moz, which was acquired by iContact Marketing in June 2021.

Inbound.org, another company co-founded by Fishkin, was sold to HubSpot in 2013 for no profit, because it was a "good for the world" side project that never operated as a traditional for-profit business, according to AI Mode. 

In this research from SparkToro, 600 volunteers ran 12 identical prompts through ChatGPT, Claude, and Google’s AI a total of 2,961 times to see how often the same answers were served. To make the results comparable, each response was normalized into an ordered list of brands or products. The team then compared those lists for overlap, order, and repetition.
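For readers who want to try this kind of comparison themselves, here is a minimal Python sketch of the idea described above: normalize each response into an ordered list, then check every pair of responses for a matching set of brands and a matching order. It is not the researchers' code. The function names, the normalization rules (lowercasing, trimming, de-duplicating), and the placeholder brand names are all assumptions made for illustration.

```python
from itertools import combinations


def normalize(response_items):
    """Lowercase and trim each brand name, dropping repeats while keeping order.

    Assumption: this simple rule stands in for whatever normalization
    the study actually applied.
    """
    seen = []
    for item in response_items:
        name = item.strip().lower()
        if name and name not in seen:
            seen.append(name)
    return seen


def compare_runs(runs):
    """Return the share of response pairs with the same brand set
    and the share with the same list in the same order."""
    lists = [normalize(r) for r in runs]
    pairs = list(combinations(lists, 2))
    if not pairs:
        return {"same_set": 0.0, "same_order": 0.0}
    same_set = sum(set(a) == set(b) for a, b in pairs)
    same_order = sum(a == b for a, b in pairs)
    return {
        "same_set": same_set / len(pairs),
        "same_order": same_order / len(pairs),
    }


# Hypothetical responses to one prompt (placeholder names, not study data).
runs = [
    ["Brand A", "Brand B", "Brand C"],
    ["brand b", "Brand A", "Brand C"],
    ["Brand A", "Brand B", "Brand C"],
    ["Brand A", "Brand D"],
]
result = compare_runs(runs)
print(result)
```

In this toy data, half of the response pairs name the same brands, but only one pair in six lists them in the same order, which mirrors the gap between "same list" and "same order" that the study reports.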

If you ask an AI tool for brand or product recommendations a hundred times, nearly every response will be unique in three ways: the list presented, the order of the recommendations, and the number of items on the list. Sometimes the AI gave as few as two or three recommendations, and equally often 10 or more.

It turns out there is less than a 1-in-100 chance that ChatGPT or Google’s AI, asked the same question 100 times, will give the same list of brands in any two responses. Claude is slightly more likely to provide the same list twice in a hundred runs, but even less likely to do so in the same order.

The data shows AI tool responses are so random that it takes closer to 1,000 runs before someone might see two lists in the same order. The group did not try to collect data on how the AIs described each brand or on whether the sentiment around each recommendation was positive or negative.

“When it comes to trusting AI answers, researchers have shown compelling numbers for factual statements in topics like news, politics, science, history, etc. in the high 90%s,” Fishkin wrote in the report. “But I could find no such similar analysis around the recommendations AI gives when asked for the best brands or products in a sector.”

There is a way to make order from this chaotic ordering system, he wrote. His initial hypothesis, that AI brand answer lists are so random that tracking them is entirely useless, was likely wrong.

When AI was asked to recommend digital marketing consultants with expertise in ecommerce, Smartsites agency appeared in 85 out of the 95 responses. So, it appears that the topic matters.

Smaller markets also seem to have more stable results. Prompts about regional service providers or niche B2B tools consistently brought up a few familiar names.

The data also suggests that how often a brand shows up in a topic area may have less to do with its relative prominence in that space, or even with the model itself, and more to do with how many potential recommendations the AI has to choose from.

“There are only a few ‘Cloud Computing providers for SaaS startups’ that AI tools considered, and thus the pairwise correlation (a measure of response similarity we took directly from the Carnegie Mellon researchers’ process) is relatively high, while the average rank difference is pretty low,” Fishkin wrote in the blog post. “That held true across ChatGPT, Claude, and Google AI.”
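The two measures Fishkin mentions can be illustrated with a rough Python sketch. The article does not spell out the exact formula taken from the Carnegie Mellon process, so this example swaps in Jaccard overlap (shared brands divided by all brands named) as a stand-in for response similarity, and defines average rank difference as the mean absolute gap in list position for brands that two responses share. Both definitions, the function names, and the sample data are assumptions, not the study's method.

```python
from itertools import combinations
from statistics import mean


def jaccard(a, b):
    """Stand-in similarity: shared brands divided by all brands named."""
    set_a, set_b = set(a), set(b)
    union = set_a | set_b
    return len(set_a & set_b) / len(union) if union else 1.0


def avg_rank_difference(a, b):
    """Mean absolute position gap for brands both lists contain.

    Returns None when the lists share no brands.
    """
    shared = set(a) & set(b)
    if not shared:
        return None
    return mean(abs(a.index(x) - b.index(x)) for x in shared)


def summarize(runs):
    """Average both measures over every pair of responses."""
    pairs = list(combinations(runs, 2))
    sims = [jaccard(a, b) for a, b in pairs]
    diffs = [
        d for d in (avg_rank_difference(a, b) for a, b in pairs)
        if d is not None
    ]
    return {
        "mean_similarity": mean(sims),
        "mean_rank_diff": mean(diffs) if diffs else None,
    }


# Hypothetical normalized responses (placeholder names, not study data).
runs = [
    ["p", "q", "r"],
    ["q", "p", "r"],
    ["p", "q", "s"],
]
summary = summarize(runs)
print(summary)
```

A small pool of candidates, as in the cloud computing example, would tend to push the similarity score up and the rank difference down, because there are fewer names to shuffle among.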
