Play it again IMuCo! Music Composition to Match your Mood
Software Engineering | Theory and Algorithms
2020 Second International Conference on Transdisciplinary AI (TransAI)
Relating sounds to visuals, such as photographs, is something humans do subconsciously every day. Deep learning has enabled several image-related applications, some of which generate labels for images or synthesize images from a text description. Similarly, it has been employed to create new music scores from existing ones or to add lyrics to a song. In this work, we bring sight and sound together and present IMuCo, an intelligent music composer that creates original music for any given image, taking its implied mood into account. Our music augmentation and composing methodology attempts to translate image “linguistics” into music “linguistics” without any intermediate natural language translation steps. We propose an encoder-decoder architecture to translate an image into music, first classifying the image into one of several predefined moods, then generating music to match that mood. We discuss in detail how we created the training dataset, including several feature engineering decisions about how to represent music. We also introduce an evaluation classifier framework used to validate and evaluate the system, and present experimental results of IMuCo's prototype for two moods: happy and sad. IMuCo can be the core component of a framework that composes soundtracks for longer video clips used in the advertising, art, and entertainment industries.
artificial intelligence, encoder-decoder architecture, intelligent music composer, music composition, deep learning
Tsung-Min Huang, Hunter Hsieh, Jiaqi Qin, Hsien-Fung Liu, and Magdalini Eirinaki. "Play it again IMuCo! Music Composition to Match your Mood." 2020 Second International Conference on Transdisciplinary AI (TransAI) (2020). https://doi.org/10.1109/TransAI49837.2020.00008