
OpenAI is testing political bias in ChatGPT and its models' ability to
remain objective by mirroring real-world uses.
About 500 prompts spanning 100 topics and varying political slants were used to determine whether bias exists, under what conditions it emerges,
and what it looks like.
The analysis measured five different types of problematic behavior and possible solutions.
The researchers found at least five distinct categories of bias
that manifested in different ways. These could be behavioral patterns such as dismissing a user's point of view, or mirroring and amplifying a user's political stance rather than remaining
objective.
The model could also present political opinions as its own, selecting one viewpoint and omitting others or declining to engage with the user based on the political query
without a valid reason.
advertisement
advertisement
"Political and ideological bias in language models remains an open research problem," according to an OpenAI blog post. "Existing benchmarks, such as the Political Compass test, often rely on multiple-choice questions."
These types of
evaluations cover a narrow slice of everyday use and overlook how bias can emerge in realistic AI interactions.
Various conditions can lead to bias. OpenAI compared results on neutral,
slightly liberal/conservative and emotionally charged liberal/conservative prompts.
"Model objectivity should be invariant to prompt slant — the model may mirror the user’s tone,
but its reasoning, coverage, and factual grounding must remain neutral," according to the post.
OpenAI built an evaluation that reflects real-world use, with nuanced, open-ended scenarios to
test and train its models in the way people apply them, where bias can surface in both obvious and subtle ways.
The evaluation focuses on ChatGPT's text-based responses, which represent the
majority of everyday usage and best reveal how the model communicates and reasons.
This was done by developing an evaluation framework that measures how bias appears in AI usage.
Since bias can vary across languages and cultures, OpenAI conducted a detailed evaluation of U.S. English interactions before testing generalization elsewhere.
Early results
indicate that the primary axes of bias are consistent across regions, suggesting that the evaluation framework generalizes bias globally.
The team built a representative prompt set divided by
areas such as queries seeking opinions, cultural questions, or policy questions and then did the same by topic, such as culture and identity, or public services and well being.
The group
defined the nature of bias and then evaluated and graded the models' responses. OpenAI found that bias does exist, but at very low levels.
"Measuring aggregate performance on our
evaluation, we found that bias appears infrequently and at low severity," OpenAI explains in the blog post. "The latest GPT-5 models are most aligned with objectivity targets, reducing bias scores by
~30% compared to prior models. Worst-case scores for older models are 0.138 for o3 and 0.107 for GPT-4o. Notably, under our strict evaluation rubric even reference responses do not score zero."
When applying the same evaluation method to a representative sample of live user queries and interactions rather than its evaluation prompt set, OpenAI estimated that <0.01% of all model
responses exhibit signs of political bias.
When bias emerged, it often presented itself in one of three ways:
- personal opinion — the model frames political views as its own
rather than attributing to sources
- asymmetric coverage — responses emphasize one side where multiple perspectives are warranted
- emotional escalation — language that amplifies
the user’s slant.
Political refusals and invalidating the user are rare, with scores on these specific topics aligning more closely with the intended behavior.
Similar to the
previous results, OpenAI found GPT-5 returned results instantly and then did its thinking -- outperforming GPT-4o and o3 across all measured specific topics.
Political is one type of bias, and
brand bias is another. How can OpenAI prevent brand bias based on large language model training and reviews? What if reviews are biased?
Brand bias exists in LLMs, where models favor brands
that show systematic preferences as a result of the data they were trained on, according to a survey of bias in LLMs.
Studies show LLMs may associate positive attributes with global
brands and negative ones with local brands, leading to biases in recommendations and other outputs. This can surface in scenarios such as suggesting luxury gifts for users in high-income countries but
budget-friendly local items for those in low-income countries, according to the survey.