Commentary

Bing Query Disinformation: Digging Deeper Into Stanford's Research

On Friday, I published an article about misinformation running high in Microsoft Bing search query results, based on a study conducted at Stanford by two men, a student and a postdoctoral scholar.

For me, the study left questions unanswered about its findings, which claim that Microsoft Bing returns disinformation and misinformation in query results at “a significantly higher rate than Google.”

I wondered how personalization or the location of the IP address might skew the query results. So I reached out to the authors of the report, “Bing’s Top Search Results Contain an Alarming Amount of Disinformation,” to ask.

These two questions also were concerns of the authors who designed the study.

Future posts about the research will include expanded methodology sections where these challenges are discussed. For now, Alex Ahmed Zaheer, a master of arts student in international policy at Stanford, explains.

Zaheer worked with Daniel Bush, postdoctoral scholar at Stanford Internet Observatory, on this study.

In an email to the Search Insider, Zaheer explained that each search was run from Stanford IP addresses. The queries were not location-specific, unlike “showtimes near me” or “weather,” so the tests did not attempt to control for location.

“In the absence of documented location effects, our back-of-the-envelope reasoning was given Stanford’s academic profile, search results would be skewed away from misinformation and disinformation, so we should expect to see fewer such results due to location if anything. In follow up studies, we will look further into controlling for this,” he wrote. 

When it came to personalization, each Google search was conducted programmatically, using essentially the equivalent of a “curl” command, a tool that transfers data to or from a network server and is designed to work without user interaction.

“Given that Google does not set a cookie using this search approach, we did not consider these searches to be affected by search personalization,” he wrote. 
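For readers who want to picture what a cookie-free, curl-style search looks like, here is a minimal Python sketch. It is not the researchers' script: the endpoint URL, the user-agent string and the function names are assumptions made for illustration. The point is only that a request built this way carries no cookie, and the opener that sends it keeps no cookie jar between searches.

```python
from urllib.parse import urlencode
from urllib.request import Request, build_opener

# Assumed endpoint, used here for illustration only.
SEARCH_URL = "https://www.google.com/search"


def build_stateless_request(query: str) -> Request:
    """Build a GET request for a query with no cookie attached."""
    url = f"{SEARCH_URL}?{urlencode({'q': query})}"
    # Hypothetical user-agent string; the study's actual headers are not known.
    return Request(url, headers={"User-Agent": "curl/8.0"})


def fetch(query: str) -> bytes:
    """Send the request with an opener that has no cookie handling."""
    # build_opener() adds no HTTPCookieProcessor by default, so nothing
    # is stored or replayed between one search and the next.
    opener = build_opener()
    with opener.open(build_stateless_request(query), timeout=10) as resp:
        return resp.read()
```

Calling `build_stateless_request("vaccine safety")` only constructs the request; nothing goes over the network until `fetch` is called.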

Bing is slightly more complicated, he wrote. Zaheer used Bing’s paid API to conduct searches, which requires an API token. So, Zaheer wrote, the test could not conclude that personalization did not enter into the results.

“Microsoft’s documentation is not clear that cookies are set when using this API, but we have to assume that Bing does a similar kind of tracking for their API searches,” Zaheer wrote. “At worst, each search may have been influenced by a previous search. We purposefully crafted our search terms to be 'neutral,' so this effect should be minimized.”

Personalization also must have been limited if it did occur. Given that the search script only generated results but did not "click on" them, closed-loop feedback, whereby results are personalized based on which results are clicked, could not have occurred. Almost every search was conducted on the same day, and the API token was generated fresh for this study, which Zaheer hopes minimizes this effect when present.
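The Bing setup can be pictured the same way. The sketch below is an assumption-laden illustration, not the study's code: the endpoint and the `Ocp-Apim-Subscription-Key` header follow Microsoft's Bing Web Search API v7 conventions as I understand them, and the article does not say which API version or parameters the researchers used. The key and query terms are hypothetical. It shows why the token matters: every request in a batch carries the same key, so the searches are linked to one another even though no result is ever clicked.

```python
from urllib.parse import urlencode
from urllib.request import Request

# Assumed endpoint for illustration; the study's actual endpoint is not stated.
BING_ENDPOINT = "https://api.bing.microsoft.com/v7.0/search"


def build_bing_request(query: str, api_key: str) -> Request:
    """Build one API search request authenticated with the given key."""
    url = f"{BING_ENDPOINT}?{urlencode({'q': query})}"
    # Assumed header name for the API token.
    return Request(url, headers={"Ocp-Apim-Subscription-Key": api_key})


def build_batch(queries: list[str], api_key: str) -> list[Request]:
    """Build every search with the same freshly generated key.

    The script would only read the responses; it never follows a result
    link, so there is no click signal to feed back into later searches.
    """
    return [build_bing_request(q, api_key) for q in queries]
```

Because the shared key is the only thing tying one request to the next, a key generated fresh for the study starts with no prior search history attached to it.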

Overall, Zaheer found it difficult to reason about these concerns, given the platforms’ lack of transparency about how these effects work.
