
MusicLM is Google’s new artificial intelligence (AI) system built to generate music in any genre from simple text descriptions, such as “song played at the end of a sad movie” or “arcade game sounds.”
Here’s the catch: Google says it has no immediate plans to release the technology, citing potential risks.
There are plenty of AI music generators on the market, such as Riffusion, which composes music by visualizing it; Boomy, which has already been used to create millions of original songs; and OpenAI’s Jukebox, which has already caused conflict in the music industry due to its ability to rewrite existing music and produce deepfake-style covers in the voice of famous artists.
What excites tech folks about MusicLM in particular is its potential to produce songs that are more complex in composition and sound, something other programs have not yet been able to accomplish.
An academic paper describing the system says MusicLM can generate high-fidelity music from specific, detailed descriptions like “a calming violin melody backed by a distorted guitar riff.”
MusicLM “generates music at 24 kHz that remains consistent over several minutes,” the paper reads. “Our experiments show that MusicLM outperforms previous systems both in audio quality and adherence to the text description.”
Furthermore, MusicLM can be conditioned on both text and a melody. In other words, it can transform whistled and hummed melodies according to the style described in a text caption.
The system was trained on a dataset of 280,000 hours of music to learn to generate coherent songs.
The samples often include unwanted distortion, and the lyrics remain basic or difficult to understand. Even so, MusicLM’s samples sound notably high-quality compared with those of other AI music generators. Here is a list of examples.
However, unlike OpenAI, which released its ChatGPT system without taking into account consequences such as its dire effects on early education, Google has noted ethical challenges posed by such technology, including the incorporation of copyrighted material.
This concern stems from the finding that about 1% of the music MusicLM generated was directly replicated from the songs it was trained on.
“We acknowledge the risk of potential misappropriation of creative content associated to the use case,” the co-authors of the paper wrote. “We strongly emphasize the need for more future work in tackling these risks associated to music generation.”