Commentary

Google Gen AI Video Model Creates Creators

Veo, Google's generative AI video creation tool announced at I/O on Tuesday, can produce high-quality 1080p video from text, image, and video prompts.

The model's advanced understanding of natural language lets it interpret cinematic terms like “time lapse” or “aerial shots of a landscape.”

Users can direct the output with text, image, and video prompts to create realistic movement for people, animals, and objects. Creators can refine the video results with additional prompts. Google is also exploring features that will enable Veo to produce storyboards and longer scenes.

"Innovations like Veo and Imagen 3 will empower marketers to become creators," said Chris Long, vice president of marketing at Go Fish Digital. "With these tools, even without formal design skills, marketers can create rich media that can be used in their campaigns."

Long added that marketers will be able to produce media with faster turnaround times and lower production costs. Whether it's video for paid social ads or new branding creative, they will be able to create content in ways that were not possible before.

Google partnered with Donald Glover in his creative studio to demonstrate the model's capabilities.

In a brief video, Glover and his crew use text prompts to create footage of a convertible arriving at a European home and a sailboat gliding through the ocean.

"At the heart of this is storytelling," Glover said. "The closer we are to being able to tell each other our stories, the more we will understand each other."

But will people want to see an algorithmically created piece of art? OpenAI introduced Sora in February. The text-to-video model aims to "understand and simulate the physical world in motion," the company said.

Google also announced Imagen 3, the latest in the tech giant’s Imagen generative AI model family. It is said to understand the text prompts it translates into images more accurately than Imagen 2, the prior version. Google believes this version is more creative and detailed, and that the model produces fewer distracting images and errors.

Imagen 3 uses SynthID, an approach developed by DeepMind to apply invisible, cryptographic watermarks to media.

Both Veo and Imagen 3 will become available to use in a private preview through VideoFX from Google Labs.
