Google, Microsoft, and xAI plan to give the U.S. government early access to new artificial intelligence (AI) models before releasing them publicly under a new deal.
The arrangement would allow the Center for AI Standards and Innovation (CAISI) at the Department of Commerce to evaluate the models for national security risks before deployment.
The rise of advanced AI systems like Anthropic's Mythos has raised questions about whether government oversight is necessary before their public release, including how to guard against the models generating misinformation or being used to hack other systems.
The full text of the classified agreements between Google, OpenAI, and the U.S. Department of War has not been made publicly
available.
OpenAI confirmed its agreement in a blog post published in March 2026.
Security risks are real, even ones that seem benign. Last week, for example, OpenAI revealed in a post that its models had begun “developing a strange habit”: they increasingly mentioned goblins, gremlins, and other creatures in their metaphors.
“One ‘little goblin’ in an answer could be harmless, even charming,” OpenAI explained, but across model generations the mentions kept multiplying. The company wrote that model behavior is shaped by many small incentives; in this case, one of those incentives came from training the model for its personality customization feature, specifically the “Nerdy” personality. Developers who trained the personalities “unknowingly” gave high rewards to responses that used creature metaphors, and from there the goblins spread.
OpenAI retired the “Nerdy” personality in March after launching GPT-5.4. During training, the company adjusted training data containing creature words, making goblins less likely to over-appear or show up in inappropriate contexts.
Unfortunately, GPT-5.5 began training before developers found the root cause of the goblin behavior, so a developer-prompt instruction was added to resolve the issue.
The goblin issue may seem inconsequential, but other potential issues are more serious. Anthropic's developers, for example, considered the advanced AI model Mythos too dangerous to release. Following Anthropic's announcement of Mythos, OpenAI last month released GPT-5.4-Cyber, a variant of its latest model tuned specifically for defensive cybersecurity work.
Anthropic has been in discussions with the Pentagon about safety guardrails in its AI tools. The move follows a 2024 agreement with OpenAI and Anthropic under the Biden administration, when CAISI was known as the U.S. Artificial Intelligence Safety Institute, the government's hub for AI model testing.
The Pentagon said it had reached agreements last week with companies including Amazon Web Services, Nvidia, OpenAI, SpaceX, Oracle, and Microsoft.