Original Music Generation using Recurrent Neural Networks with Self-Attention

Publication Date

1-1-2022

Document Type

Conference Proceeding

Publication Title

Proceedings - 4th IEEE International Conference on Artificial Intelligence Testing, AITest 2022

DOI

10.1109/AITest55621.2022.00017

First Page

56

Last Page

63

Abstract

A recent trend in deep learning is the use of state-of-the-art models to generate human art forms. Using such 'intelligent' models to generate novel musical compositions is a thriving area of research. The motivation is to use the capacity of deep learning architectures and training techniques to learn musical styles from arbitrary musical corpora automatically and then generate samples from the estimated distribution. We focus on two popular state-of-the-art components used in deep generative learning of music, namely recurrent neural networks (RNNs) and the self-attention mechanism. We not only provide a systematic evaluation of state-of-the-art models used in generative deep learning for music but also contribute novel architectures and compare them to the established baselines. The models are trained on piano compositions encoded in MIDI format from Google's MAESTRO dataset. A major challenge in such learning tasks is evaluating their outcome, since art is highly subjective and hard to assess quantitatively. Therefore, in addition to the experimental evaluation, we also conduct a blind user study. We conclude that a double-stacked RNN model with a self-attention layer had the shortest training time, and the pieces generated by a triple-stacked RNN model with self-attention layers were deemed the most subjectively appealing and authentic.
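The abstract's core architecture, a stacked RNN followed by a self-attention layer predicting the next MIDI token, can be sketched roughly as follows. This is a minimal illustration in PyTorch under assumed hyperparameters (vocabulary of 128 pitch tokens, 256-dimensional hidden states, 4 attention heads); the class name and layer ordering are hypothetical and not the authors' exact model.

```python
import torch
import torch.nn as nn

class StackedRNNWithSelfAttention(nn.Module):
    """Illustrative sketch: a double-stacked LSTM with one self-attention
    layer over its hidden states, emitting next-token logits. All sizes
    here are assumptions, not the paper's reported configuration."""

    def __init__(self, vocab_size=128, embed_dim=256, hidden_dim=256, num_heads=4):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, embed_dim)   # MIDI event/pitch tokens
        # Two stacked recurrent layers ("double-stacked RNN").
        self.rnn = nn.LSTM(embed_dim, hidden_dim, num_layers=2, batch_first=True)
        # Self-attention applied to the sequence of RNN hidden states.
        self.attn = nn.MultiheadAttention(hidden_dim, num_heads, batch_first=True)
        self.out = nn.Linear(hidden_dim, vocab_size)       # next-token logits

    def forward(self, tokens):
        x = self.embed(tokens)        # (batch, time, embed_dim)
        h, _ = self.rnn(x)            # (batch, time, hidden_dim)
        a, _ = self.attn(h, h, h)     # self-attention: query = key = value = h
        return self.out(a)            # (batch, time, vocab_size)

model = StackedRNNWithSelfAttention()
dummy = torch.randint(0, 128, (2, 16))   # batch of 2 token sequences, 16 steps
logits = model(dummy)
print(tuple(logits.shape))                # (2, 16, 128)
```

Trained with a cross-entropy loss on next-token prediction, such a model can generate new pieces by sampling tokens autoregressively; a triple-stacked variant would simply use `num_layers=3` and additional attention layers.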

Keywords

Generative deep learning, MAESTRO, MIDI, music generation, piano-roll, RNN, Self-Attention

Department

Computer Engineering
