Google Gemini launched Wednesday, the long-awaited debut of the company’s most powerful artificial intelligence large language model (LLM). The model is multimodal, capable of reasoning across text, images, audio, video and code.
The company said it has started to experiment with Gemini in Search, where it is making Google’s Search Generative Experience (SGE) faster for users, with a 40% reduction in latency in English in the U.S.
It will also roll out across services such as Search, Ads, Chrome and Duet AI.
Google says the model delivers state-of-the-art performance across a broad set of benchmarks. Beginning December 13, developers and enterprise customers can access Gemini Pro via the Gemini API in Google AI Studio or Google Cloud Vertex AI.
Google plans to license Gemini to customers through Google Cloud for use in their own applications.
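For developers, a call to the launch-era Gemini API can be sketched as follows. This is a minimal illustration, not code from Google's announcement: the REST endpoint path, the `gemini-pro` model name, and the response shape are assumptions based on Google's public API documentation, and a real key is read from the environment rather than hard-coded.

```python
import json
import os
import urllib.request

# Assumed launch-era REST surface of the Gemini API (v1beta,
# generateContent route on the gemini-pro model).
API_URL = ("https://generativelanguage.googleapis.com/v1beta/"
           "models/gemini-pro:generateContent")


def build_request(prompt: str) -> dict:
    """Build the JSON body the generateContent endpoint expects:
    a list of contents, each holding text parts."""
    return {"contents": [{"parts": [{"text": prompt}]}]}


def generate(prompt: str, api_key: str) -> str:
    """Send the prompt and return the first candidate's text reply."""
    body = json.dumps(build_request(prompt)).encode("utf-8")
    req = urllib.request.Request(
        f"{API_URL}?key={api_key}",
        data=body,
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        data = json.load(resp)
    # Assumed response shape: candidates -> content -> parts -> text.
    return data["candidates"][0]["content"]["parts"][0]["text"]


if __name__ == "__main__":
    prompt = "Explain multimodal reasoning in one sentence."
    print(build_request(prompt))
    key = os.environ.get("GEMINI_API_KEY")
    if key:  # only hit the network when a key is configured
        print(generate(prompt, key))
```

The same request body works whether the key comes from Google AI Studio or from a Vertex AI project, which is one reason the payload is built in a separate helper here.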
The company trained Gemini using a new generation of its powerful cloud-based Tensor Processing Units (TPUs), which can collectively train large AI models nearly three times faster than the prior generation.
That technology could mean a significant boost for the wider AI industry and advertisers, making AI training more accessible.
“With a score of 90.0%, Gemini Ultra is the first model to outperform human experts on massive multitask language understanding, which uses a combination of 57 subjects such as math, physics, history, law, medicine and ethics for testing both world knowledge and problem-solving abilities,” Demis Hassabis, CEO and co-founder of Google DeepMind, wrote in a blog post on behalf of the Gemini team.
While Gemini Ultra is designed for extremely demanding workloads, such as those running in data centers, Gemini Nano is compact enough to run on Google smartphones.
Gemini -- which will run on everything from data centers to mobile devices -- will be available in three versions: Ultra, Pro and Nano, each designed for different uses. All are multimodal, with the ability to handle a range of input types. The technology will work only in English to start, but Google said other languages will follow.
Early next year, Google will also launch Bard Advanced, a new AI experience that gives users access to its best models and capabilities, starting with Gemini Ultra.
Gemini’s sophisticated multimodal reasoning capabilities can help make sense of complex written and visual information.
The biggest challenge -- and the likely reason Gemini's launch was delayed -- was ensuring that the multimodal technology, which can handle more than one type of input (such as text and images) simultaneously, would not produce dangerous or offensive output.