Play it again IMuCo! Music Composition to Match your Mood

Publication Date


Document Type

Conference Proceeding

Publication Title

2020 Second International Conference on Transdisciplinary AI (TransAI)

Conference Location

Irvine, CA/Virtual




Relating sounds to visuals, such as photographs, is something humans do subconsciously every day. Deep learning has enabled several image-related applications, some of which generate labels for images or synthesize images from a text description. Similarly, it has been employed to create new music scores from existing ones, or to add lyrics to a song. In this work, we bring sight and sound together and present IMuCo, an intelligent music composer that creates original music for any given image, taking its implied mood into account. Our music augmentation and composing methodology attempts to translate image “linguistics” into music “linguistics” without any intermediate natural language translation steps. We propose an encoder-decoder architecture to translate an image into music, first classifying the image into one of a set of predefined moods, then generating music to match that mood. We discuss in detail how we created the training dataset, including several feature engineering decisions about how to represent music. We also introduce an evaluation classifier framework used for validation and evaluation of the system, and present experimental results of IMuCo's prototype for two moods: happy and sad. IMuCo can be the core component of a framework that composes the soundtrack for longer video clips used in the advertising, art, and entertainment industries.


artificial intelligence, encoder-decoder architecture, intelligent music composer, music composition, deep learning


Computer Engineering