Music and AI - Part 4

Last Updated:

Just some closing thoughts on this topic, along with some random notes and questions I’ve left for my future self.

In the second post on AI and Music, we discussed why it’s difficult to generate music with AI. Along the way, we presented different learning architectures and the strategies used to apply them to the problem. Here is a quick summary of everything we went through:

List of Architectures

  1. Feedforward
  2. Autoencoder
  3. Variational Autoencoder
  4. Restricted Boltzmann Machine
  5. Recurrent Networks
  6. Convolutional Networks
  7. Conditioning Convolutional Networks
  8. Generative Adversarial Networks
  9. Reinforcement Learning

List of Strategies

  1. Single step feedforward
  2. Decoder feedforward
  3. Sampling based methods
  4. Iterative feedforward
  5. Input manipulation
  6. Reinforcement
  7. Unit Selection

List of Challenges

  1. Creatio ex nihilo (creation from nothing)
  2. Length variability (music of different lengths)
  3. Content variability (is it deterministic?)
  4. Expressiveness (dynamics and feeling)
  5. Melody-harmony consistency
  6. Control (user ability to affect outcome)
  7. Style transfer (applicable to images, but can it apply to music?)
  8. Structure (song formats, AABA, etc.)
  9. Originality (not just repeating information from training data)
  10. Incrementality
  11. Adaptability
  12. Explainability

Recurrent vs Convolutional Networks

Convolutional networks have not been explored thoroughly as a way to generate or recognize music. Perhaps this is because musical representations are complex enough that it’s hard to see how a CNN would apply.

That being said, convolutional networks do have two advantages:

  1. They are faster to train and easier to parallelize than recurrent networks.
  2. By nature of their implementation, convolutional networks effectively increase the volume of data.
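One possible reading of these two points, sketched in plain Python. This is my own toy illustration rather than anything from the earlier posts: the melody, the kernel, and the `conv1d` helper are all hypothetical. The assumption is that "increasing the volume of data" refers to weight sharing: the same small kernel is applied to every window of the input, so a single sequence yields many examples for the same weights. Each window is also computed independently of the others, which is what makes the operation easy to parallelize, unlike a recurrent network that must process steps in order.

```python
# Hypothetical toy melody, written as MIDI pitch numbers (60 = middle C).
melody = [60, 62, 64, 65, 67, 65, 64, 62]


def conv1d(sequence, kernel):
    """Slide one shared kernel over the sequence (stride 1, no padding).

    As in most deep learning libraries, this is technically a
    cross-correlation: the kernel is not flipped.
    """
    width = len(kernel)
    return [
        # Each output depends only on its own window, so all windows
        # could be computed in parallel.
        sum(k * x for k, x in zip(kernel, sequence[i:i + width]))
        for i in range(len(sequence) - width + 1)
    ]


# A two-tap difference kernel: each output is the interval, in semitones,
# between neighbouring notes.
interval_kernel = [-1, 1]
intervals = conv1d(melody, interval_kernel)

# One 8-note melody gives the same 2 weights 7 windows to learn from.
num_windows = len(intervals)
print(intervals, num_windows)
```

A real CNN would learn the kernel values instead of fixing them by hand, but the window counting works the same way.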

Transfer Learning

Transfer learning for music is another area that hasn’t been explored in depth, but it could be enormously useful if the (more developed) field of image generation is a sign of things to come.
