Adaptive learning has been around for years, but now this technology is seen in multimodal robotics. It appears that a similar technology is being integrated into AI agents and advertising at companies like Google and Microsoft. However, companies have been quiet about the possibilities. It is very early in the process, but the change is coming.
Last weekend I spent a few days in Las Vegas and finally attended a movie showing of "Postcard from Earth" at the Sphere by Darren Aronofsky. The movie -- an immersive experience through the lens of an 18-K camera -- was fascinating, but the humanoid robots in the atrium area created by Engineered Arts in the UK caught my attention.
Something Aura the robot said to me in response to the questions "how far can you see" and "are you continually updated" triggered my curiosity as to how Google might integrate some of Google DeepMind robotic technology into advertising and AI agents.
advertisement
advertisement
Generative AI has shown promise in advertising and robotics based on multimodal and navigation technologies. In November, DeepMind released information on two new AI systems, ALOHA Unleashed and DemoStart, helping robots learn to perform complex tasks requiring dexterous movement.
Google describes multimodal as a machine learning (ML) model capable of processing information from different modalities, such as images, videos, and text. For example, Google's multimodal model Gemini can receive a photo of a plate of cookies and generate a written recipe as a response.
Multimodality gives AI the ability to process and understand different sensory modes.
Users -- and in this case advertisers -- are not limited to one input or output type and can prompt a model with virtually any input to generate virtually any content type. It also learns from the person's behavior that interacts with the model or robot, which is key to understanding or imagining the possible future of advertising and AI agents.
Think about how performance, measurement and metrics are closely connected to adaptive learning with multimodal capabilities and how this will play a role in advertising.
According to Google, Gemini is a multimodal model from the team at Google DeepMind that can be prompted with images, text, code, and video.
Gemini was designed to reason in terms of these inputs and outputs. Gemini on Vertex AI can use prompts to extract text from images, convert image text to JSON, and generate answers about uploaded images.
The humanoids from Engineered Arts are multimodal -- something Google has been working on for advertising and Google DeepMind for robotics.
Google’s and Microsoft’s AI agents have vision. Shouvik Paul, COO of Copyleaks, an AI-based text analysis platform, said where this becomes interesting is the technology can make inferences based on what it sees -- adding personalization.
Paul mentioned advancements in education, but when asked about advertising, he said changes in active learning focus on behavior. Platforms like Amazon are not as smart as they could be.
“It should have known that as a dad I’m probably not ordering dragon jewelry for me,” he said, pointing to the repeated ads about dragon jewelry after he bought some for his daughter to give to a friend for her birthday. “Amazon knew I had already bought the jewelry, but the platform kept targeting the ads to me. They should have connected the dots.”
Not only should they connect the dots and realize he has a daughter between the ages of X and Y, but Amazon Web Services (AWS) should have the technology through its AI offerings to identify this behavior.
He said that with multimodal advertising, consumers will be less annoyed with advertising. “Multimodal technology can see, listen and read, and that is really interesting,” he said. “This can look at you or something on your computer and identify it by researching it on LinkedIn or other websites. Based on the information, including listening to your voice, it connects the dots to make assumptions in milliseconds.”
When I reached out to Google about adaptive learning in advertising, the media relations person asked for clarification to my questions. When I clarified, she did not respond.