Commentary

Google Wraps AI Future In Imperceptible Watermarks

Google's SynthID text watermarking technology -- a tool Google created to make AI-generated text easier to identify -- was recently open-sourced as part of a larger family of tools in the Google Responsible Generative AI Toolkit.

The watermark is embedded automatically into AI-generated text, images, audio, and video. Some call it a "game-changer" for advertisers, marketers and copyright holders. Others point to immediate challenges and changes across the media industry for everything earned, paid or owned.

Saul Marquez, CEO and founder of the agency Outcomes Rocket, said regulatory changes and requirements might not be the most immediate impact, but they are coming.


“Data use rights, privacy, IP infringement and hallucinations, and risks when using GAI in highly regulated areas will see major changes,” he says.

How many ways are there to write a sentence correctly? Marquez says, "We're in a new world. Our clients are mostly in health care" -- a highly regulated sector with businesses and professionals who care whether open- or closed-loop LLMs are used.

Some shortcomings have been identified for text: detection is less reliable when the content runs three sentences or fewer, or when it is highly factual, leaving less room to vary word choice without changing meaning.

What if Google DeepMind's technology is wrong in identifying AI-generated text? That's a risk, Marquez says. "To what degree do we accept what it says, and how trustworthy is the technology? They are genius in making it open source, but there are challenges."

The watermark can identify "with a high probability" which AI model -- OpenAI's, Gemini or others -- generated the content, Marquez says.

It's not a silver bullet, because there can be false positives, but it's a start toward identifying AI-generated content.

SynthID embeds an invisible watermark into text as it is generated, by adjusting how the AI model predicts the next words or phrases.

To an extent, this is how the technology detects whether content was generated by generative AI (GAI) -- not just from Google's platforms, but from others, too, according to Google DeepMind.

Large language models (LLMs) break words down into “tokens” that can be one character, a word or part of a word, and then predict the token that will most likely follow others.

Each candidate token gets a probability score for how likely it is to appear next in the sentence. SynthID subtly adjusts these scores throughout generation; the combined pattern of adjusted scores across the text forms the watermark, which the detector can later compare against the expected pattern to judge whether the text was created by GAI.
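The mechanism can be illustrated with a toy sketch: a keyed pseudo-random function scores each candidate token, generation slightly prefers higher-scoring tokens, and the detector checks whether the average score is suspiciously high. Everything here -- the key, the scoring scheme, the candidate pools -- is an illustrative assumption, not Google's actual algorithm.

```python
import hashlib
import random

def g_value(context, token):
    """Keyed pseudo-random score in [0, 1) for a candidate token given its
    context; generator and detector recompute it with the same key."""
    key = "demo-key"  # hypothetical shared watermarking key
    digest = hashlib.sha256(f"{key}|{context}|{token}".encode()).digest()
    return int.from_bytes(digest[:8], "big") / 2**64

def watermarked_pick(context, candidates):
    """Among roughly equally plausible next tokens, prefer the one with the
    highest g-value, nudging output toward a detectable statistical bias."""
    return max(candidates, key=lambda tok: g_value(context, tok))

def detect_score(tokens):
    """Mean g-value over a token sequence; watermarked text scores above the
    ~0.5 average expected for unwatermarked text."""
    scores = [g_value(" ".join(tokens[:i]), tok) for i, tok in enumerate(tokens)]
    return sum(scores) / len(scores)

# Toy "generation": pick each token from a small random candidate pool,
# standing in for the model's top next-token predictions.
random.seed(0)
vocab = ["alpha", "beta", "gamma", "delta", "epsilon"]
watermarked = []
for _ in range(50):
    candidates = random.sample(vocab, 3)
    watermarked.append(watermarked_pick(" ".join(watermarked), candidates))

plain = [random.choice(vocab) for _ in range(50)]
print(f"watermarked: {detect_score(watermarked):.2f}  plain: {detect_score(plain):.2f}")
```

This also shows why very short or highly constrained text is harder to flag: with few tokens, or no freedom to choose among candidates, the score gap between watermarked and plain text shrinks.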

Microsoft and Meta are said to use watermarks and metadata to label GAI-created content, but IEEE Spectrum -- the magazine of the Institute of Electrical and Electronics Engineers, dedicated to engineering and applied sciences -- ran an article in March highlighting the most obvious weakness of Meta's approach.

It “will work only if bad actors creating deepfakes use tools that put watermarks -- that is, hidden or visible information about the origin of digital content -- into their images,” the article says, adding that most unsecured open-source generative AI tools don’t produce watermarks at all.

SynthID is generally considered better at detecting AI-generated content because it embeds imperceptible watermarks directly into content during generation, while Meta's system may have limitations in identifying content from non-watermarked AI tools.

"By open-sourcing the code, more people will be able to use the tool to watermark and determine whether text outputs have come from their own LLMs - making it easier to build AI responsibly," Google DeepMind wrote on X. 

A text prompt gives the AI model instructions on what to create, and the model generates content matching the description. SynthID is dropped in as a safeguard: humans cannot see the watermark, but technology can identify it.

Advertisers can download SynthID Text from Google's updated Responsible GenAI Toolkit and from the AI platform Hugging Face. It is available free to developers and businesses that want to identify AI-generated content.

Google DeepMind says SynthID is imperceptible to the human eye, and is inserted directly into the pixels of an AI-generated image or to each frame of an AI-generated video.

For images, it is available to those using Vertex AI's text-to-image models, Imagen 3 and Imagen 2, which create high-quality images in a variety of styles.

SynthID technology also watermarks the image outputs on ImageFX, a free AI-powered image generator from Google Labs that allows users to create images from text prompts.

It also has been integrated into Veo, a video-generation model available to select creators on VideoFX.
