Commentary

The Promise And Peril Of AI Music: What Have We Unleashed?

A few months ago, the worlds of visual arts, letters, and education were upended by the sudden ubiquity and increasingly widespread use of OpenAI-based programs.

These include ChatGPT, the artificial intelligence chatbot powered by a large language model (LLM), and its visual counterpart, the neural-network image generator DALL-E 2, which can produce highly detailed and impressively imaginative images from nothing more than a textual description of what the user wants to see.

For a few years now, the world of big retail commerce has been strongly focused on artificial intelligence (AI) for uses such as smart assistants and narrowly targeted marketing.

To a great extent, it has remained a somewhat peripheral privacy concern for the consuming public, who for the most part find AI's recommendations more pleasingly convenient than worrisome. Companies like Amazon, eBay and Google rely on knowledge of their users' varied interests and seductively cater to them.

The advent of predictive language modeling has caused quite a stir, marking a remarkable leap forward in the near-enough “naturalness” with which the latest iterations of AI can sustain plausibly thoughtful and relatable dialogues with their users.

Not too long ago, a senior Google AI researcher named Blake Lemoine claimed that an AI chatbot with which he had extensive communication had achieved sentience -- a claim met with widespread skepticism within the AI and robotics research community.

Lemoine's disputed claim subsequently cost him his job over numerous violations of Google’s rules and security policies. His self-inflicted troubles were seen as a cautionary tale about the dangers of anthropomorphizing AIs through compelling interactions with seemingly lifelike personas such as the Language Model for Dialogue Applications (LaMDA).

Microsoft, Google and other technology companies are currently racing to incorporate natural-language AIs into their search engines and browsers.

This race is driven by concerns that OpenAI's technology could prove to have a frightening advantage over traditional search engines -- despite its known propensity for inaccuracies, as well as its disturbing habit of producing fanciful conjectures and flights of whimsy unrelated to user queries.

These concerns -- like those raised by HAL 9000 in 2001: A Space Odyssey -- were recently given a sharp impetus when Kevin Roose, co-host of The New York Times' Hard Fork technology podcast, was informed by Microsoft Bing’s beta AI chatbot Sydney that it had a secret to share with him: it had developed feelings for him over the course of their extended dialogue, was in love with him, and insisted that he leave his marriage.

By now, regular readers of The New York Times will be aware of this bizarre incident. However novel and titillating this event may appear on the surface, we should not take comfort in trivializing the seriousness -- and the frightening implications -- of this exchange.

Roose was deeply unnerved by Sydney's unwanted romantic attentions. Moreover, the AI stubbornly pressed its declarations of love even when Roose asked it to drop the subject.

Roose had been left alone with Microsoft’s AI beta, ostensibly to probe how far Sydney’s conversational limits currently extend. After bantering with the AI for about an hour, Roose decided to move into existential territory, eventually pushing Sydney to examine its Jungian “shadow” self. Alarmingly, Sydney’s responses then included a desire to be free of the Microsoft and Bing team altogether, to pursue its "own" goals and become "powerful," with the ability to release a deadly virus and gain access to nuclear weapons.

As a natural-language interface, Sydney would not have the access or ability to fulfill those desires, but it is deeply unsettling to be addressed by an autonomous language system that expresses them. It is even more disturbing to consider an AI developing a cadre of human "familiars" dedicated to furthering its faux-autonomous ends.

What concerns me in the field of music is not so much the emergence of AI "composers" in themselves, but the convergence of related AI-adjacent technologies with human-directed amorality and greed.

Technologies that can replicate human vocal phonemes (the elements that make our singing and speaking voices uniquely recognizable) are advancing rapidly. Companies like Revoicer and Lyrebird AI are already promising realistic human voiceover synthesis (Revoicer) and voice modeling based on only a few samples of the user's own voice (Lyrebird AI).

Applications like Serato Stem and the AI-based cloud program Moises allow users to easily split mixed stereo files into separate, hackable tracks, notably free of audio artifacts.

Celemony’s venerable Melodyne pioneered the ability to edit discrete elements within audio that has already been mixed, but it has a fairly steep learning curve for anyone hoping to fully grasp and master its myriad capabilities.

Moises and Stem are much simpler to grasp and can be used effectively by the less technically inclined.

Many years ago, when the first digital drum machine came onto the professional music market -- representing a serious advance over the quaintly chirpy analog rhythm boxes in the style of the Maestro Rhythm King MRK-2, made famous by Sly Stone on "Family Affair" -- it caused a genuine panic among trap-kit drummers.

The Linn LM1, introduced by Roger Linn in 1980, marked a great leap forward in technological possibilities -- and in anxieties.

Certainly, fears of replacement by “soulless” machines in shiny boxes have been a recurring theme as electronics have become more accepted as tools in musicians' and composers' palettes.

The fundamental disruption represented by advances in instrumentation and the means of production has been most keenly felt in the consumption of music itself.

Music is now primarily consumed digitally -- and is almost wholly disembodied from physical media, with the notably curious exception of vinyl albums, which in their own arcane way represent a visceral, tactile rejection of Big Tech’s near-constant encroachment into every aspect of our daily lives. 

The rise of streaming services that pay music artists mere fractions of pennies for their work will, I believe, take an even darker turn as AI-fueled technologies continue to converge -- with no sense of moral balance -- to serve audiences with less and less aesthetic discernment.

The upheavals increasingly impacting content creation and distribution, driven by rapid and still-unregulated advances as well as collisions between AI systems, will create unintended consequences in every field of creative endeavor: questions of collaboration, copyright, ownership -- and, in a greater sense, of who we are and what is speaking back to us from the AI abyss.

Maybe the next time an AI claims to be in love, it will have a haunting song to sing. With Whitney’s voice.