Co-Creating Music with AI

Combining human-made music with AI-generated material widens artistic ideation. Participants in the AI Song Contest [1] explored this idea with a variety of techniques.

Nearly all teams used AI to generate lyrics and melodies, and more than half created vocals with AI. Lyrics were the most common use of AI, largely because high-performing models like GPT-2, along with fine-tuning scripts, were readily available [2].

Teams with professional musicians often chose to generate only lyrics and melodies, leaving themselves the creative space to decide how those pieces would be arranged into a song in their own style [2].

Teams with more ML expertise often used multi-part models, which generate melody, harmony, bass, and drums together in a coherent way, giving them larger building blocks to work with.

Many teams stitched the AI-generated music together with their own. Some tweaked AI-generated material to conform to a style. One musician who wrote a bassline said it "seemed more or less implied by the AI material."

Many teams fed the output of one model into another. One team, for example, used GPT-2 to generate lyrics and then a lyric-conditioned model to generate melodies. Such pipeline approaches let teams refine outputs at intermediate stages.
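The pipeline shape described above can be sketched as follows. The two functions here are toy stand-ins for the real models (GPT-2 and a lyric-conditioned melody model, neither of which is reproduced here); only the chaining of stages, with room for human refinement in between, is the point.

```python
import random

def generate_lyrics(prompt: str, seed: int = 0) -> list[str]:
    """Toy stand-in for a GPT-2 lyric generator (hypothetical)."""
    rng = random.Random(seed)
    words = ["love", "night", "fire", "rain", "heart"]
    return [f"{prompt} {rng.choice(words)}" for _ in range(4)]

def generate_melody(lyric_line: str) -> list[int]:
    """Toy stand-in for a lyric-conditioned melody model: maps each
    word to a MIDI pitch so the melody length follows the lyric."""
    return [60 + (len(word) % 12) for word in lyric_line.split()]

# Stage 1: lyrics. A team could hand-edit these before stage 2.
lyrics = generate_lyrics("city lights and")

# Stage 2: a melody conditioned on each (possibly edited) lyric line.
melodies = [generate_melody(line) for line in lyrics]
```

In the real setting, the intermediate `lyrics` list is where humans step in: rejecting weak lines or rewriting them before melody generation.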

Teams often used models to generate a large volume of samples to choose from. One team used LSTMs and convolutional neural networks to generate over 450 melodies and basslines; another generated 10,000 lines of death metal lyrics.
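The generate-many-then-choose workflow is simple to sketch. The sampler below is a toy random walk standing in for the LSTM/CNN generators the teams actually used; what matters is producing a large candidate pool cheaply and selecting from it afterwards.

```python
import random

def sample_melody(rng: random.Random, length: int = 8) -> list[int]:
    """Toy melody sampler: a bounded random walk over MIDI pitches,
    standing in for a trained generative model."""
    pitch = 60
    notes = []
    for _ in range(length):
        pitch = max(48, min(72, pitch + rng.choice([-2, -1, 0, 1, 2])))
        notes.append(pitch)
    return notes

rng = random.Random(42)
# Generate a large pool, as the teams did (450+ melodies in one case).
candidates = [sample_melody(rng) for _ in range(450)]
# A human (or an automated filter, see below in spirit) then picks
# favourites from the pool.
```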

Some teams automated the filtering, training "catchiness" classifiers on human rankings of songs to pick out the best generations.

Some teams co-created with AI in a more blended manner, with each side's output influencing the other's next move. One team played AI-generated chords and hummed along until they found a compatible melody; a participant described it as jamming with another musician.

Working with machine learning (ML) models had clear limitations. Given the black-box nature of most large-scale generative models, they are hard to steer precisely.

Many teams fine-tuned their generative models on a smaller dataset of music in the mood they wanted. Another common technique was to feed in one type of input, such as risers or short note sequences, and generate a wider variety of the same kind.
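The second technique, seeding with one type of input to get more of the same kind, can be illustrated with a toy example of my own (the jittering logic is an assumption, not any team's method): start from one riser, a rising note sequence, and produce varied risers that keep the upward contour.

```python
import random

def riser_variations(pattern: list[int], n: int, rng: random.Random) -> list[list[int]]:
    """Given one example riser, produce n varied versions of the same
    kind by jittering pitches, then re-sorting to preserve the
    characteristic upward shape."""
    out = []
    for _ in range(n):
        jittered = [p + rng.choice([-1, 0, 1]) for p in pattern]
        out.append(sorted(jittered))  # keep the riser rising
    return out

rng = random.Random(7)
riser = [48, 52, 55, 60, 64]          # one seed input of the desired type
variants = riser_variations(riser, 5, rng)
```

A real model fine-tuned on risers would do this with learned structure rather than jitter, but the workflow is the same: one example in, many same-kind candidates out.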

It was difficult to produce whole, coherently arranged, structurally meaningful songs, because the models have little long-range context. Teams often generated in sections and added contrast between them by varying the starting notes or melodies they fed the model for the verse and chorus.
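The section-by-section workflow can be sketched as follows. The generator here is a toy stand-in; the point is that priming each section with a different starting note creates contrast, after which the sections are stitched into one arrangement by hand.

```python
import random

def generate_section(seed_note: int, length: int, rng: random.Random) -> list[int]:
    """Toy stand-in for a generative model primed with a starting note."""
    pitch = seed_note
    notes = [pitch]
    for _ in range(length - 1):
        pitch += rng.choice([-2, 0, 2])
        notes.append(pitch)
    return notes

rng = random.Random(3)
# Different seed notes for verse and chorus create contrast between
# sections, compensating for the model's lack of long-range structure.
song = {
    "verse": generate_section(60, 8, rng),
    "chorus": generate_section(67, 8, rng),
}

# The human then arranges the generated sections into a song form.
arrangement = song["verse"] + song["chorus"] + song["verse"]
```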

Teams felt that the model-wrangling needed to produce holistically cohesive songs interfered with the rapid-iteration cycle of creativity.

Though co-creating with AI carries the burden of finding work-arounds to the models' constraints, it augments the human's creative versatility.

References:
1. The AI Song Contest, https://www.aisongcontest.com/
2. Huang et al., 2020


