Chinese technology company Alibaba on Friday launched Tongyi Wanxiang, an artificial intelligence tool that can generate images from prompts, following U.S. companies like Microsoft into the fray.
Microsoft in March announced that Bing AI could generate images based on natural-language prompts. Using the Bing Image Creator, powered by OpenAI’s DALL-E, users can ask Bing to generate an image from a text description.
Tongyi Wanxiang accepts prompts in Chinese and English and generates images in various styles, such as 3D cartoon or sketch.
Alibaba’s cloud division launched the product in beta for enterprise customers in China. It was developed using Composer, Alibaba Cloud’s proprietary large model, which gives users greater control over the final image output, such as spatial layout and palette.
“With the release of Tongyi Wanxiang, high-quality generative AI imagery will become more accessible, facilitating the development of innovative AI art and creative expressions for businesses across a wide range of sectors, including e-commerce, gaming, design and advertising,” says Jingren Zhou, CTO of Alibaba Cloud Intelligence.
Alibaba is one of the world’s largest retailers and e-commerce companies, but in 2020, it was also rated as the fifth-largest artificial intelligence company.
The tool is powered by Alibaba Cloud’s technologies in knowledge arrangement, visual AI and natural-language processing (NLP), and the model is trained on multilingual materials. Its semantic comprehension capability results in more accurate and contextually relevant image generation, according to the company.
Generative AI (GAI) refers to a type of artificial intelligence that is trained on a massive amount of data and can generate new content, such as human-like text or images, in response to a range of prompts and questions.
In the U.S., Google launched its GAI chatbot Bard, and Microsoft uses technology from OpenAI. China’s Baidu released Ernie Bot.
By optimizing the high-resolution diffusion process based on the signal-to-noise ratio, Alibaba’s GAI model can strike a balance between composition accuracy and detail sharpness, improving its ability to generate high-contrast images with clean backgrounds.
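For readers unfamiliar with the term, the sketch below shows how the signal-to-noise ratio is commonly defined at each step of a diffusion process. It is only an illustration under stated assumptions: the linear noise schedule, the step count and the function names are textbook-style choices and are not taken from Alibaba’s model, whose schedule has not been described here.

```python
import math

def linear_betas(num_steps, beta_start=1e-4, beta_end=0.02):
    """Linearly spaced per-step noise variances (an assumed, textbook-style schedule)."""
    step = (beta_end - beta_start) / (num_steps - 1)
    return [beta_start + i * step for i in range(num_steps)]

def snr_schedule(betas):
    """Return SNR(t) = alpha_bar_t / (1 - alpha_bar_t) for every timestep."""
    snrs = []
    alpha_bar = 1.0
    for beta in betas:
        alpha_bar *= 1.0 - beta  # cumulative share of the original signal variance kept
        snrs.append(alpha_bar / (1.0 - alpha_bar))
    return snrs

snrs = snr_schedule(linear_betas(1000))
for t in (0, 250, 500, 999):
    # Log-SNR falls as more noise is added; early steps are mostly signal.
    print(f"t={t:4d}  log-SNR={math.log(snrs[t]):8.3f}")
```

In such a schedule, steps with a high ratio mainly affect fine detail, while steps with a low ratio mainly determine overall composition, which is why the ratio is a natural quantity to tune when trading one off against the other.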
Alibaba Cloud also unveiled ModelScopeGPT, a framework that harnesses the power of large language models (LLMs). It uses LLMs as a controller to connect an extensive array of domain-specific expert models in the ModelScope open-source community, and it leverages various AI capabilities offered on Alibaba Cloud.
Enterprises and developers can access ModelScopeGPT for free and use its models to perform sophisticated AI tasks based on users’ requests, such as developing multilingual videos.
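The controller idea can be pictured with a toy sketch like the one below. Everything in it is hypothetical: the expert registry, the task names and the keyword rules are stand-ins invented for illustration, and a simple keyword match takes the place of the LLM that would interpret the request. It does not use or represent the ModelScopeGPT API.

```python
from typing import Callable, Dict

# Hypothetical registry: task name -> stand-in "expert model".
EXPERTS: Dict[str, Callable[[str], str]] = {
    "translation": lambda text: f"[translated] {text}",
    "text-to-speech": lambda text: f"[audio for] {text}",
    "image-generation": lambda text: f"[image of] {text}",
}

# Keyword rules stand in for the LLM controller's choice of task (assumption).
KEYWORDS: Dict[str, str] = {
    "translate": "translation",
    "speak": "text-to-speech",
    "draw": "image-generation",
}

def choose_task(request: str) -> str:
    """Pick the expert task whose keyword appears in the request."""
    lowered = request.lower()
    for keyword, task in KEYWORDS.items():
        if keyword in lowered:
            return task
    raise ValueError(f"no expert model registered for request: {request!r}")

def handle(request: str) -> str:
    """Route the request to the chosen expert and label the result."""
    task = choose_task(request)
    return f"{task}: {EXPERTS[task](request)}"

print(handle("Please draw a cat in a 3D cartoon style"))
```

A multi-step request, such as a multilingual video, would chain several such experts in sequence rather than calling just one.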