Analysis of Multilanguage Regional Music Tracks Using Representation Learning Techniques in Lower Dimensions

Publication Date

6-30-2024

Document Type

Conference Proceeding

Publication Title

Proceedings of the Tenth International Conference on Mathematics and Computing

Volume

964 LNNS

DOI

10.1007/978-981-97-2066-8_14

First Page

151

Last Page

163

Abstract

Machine understanding of music requires representing the music digitally with meaningful features and then analyzing those features. The work in this paper is unique in using representation learning techniques in lower dimensions to analyze the effectiveness of mel-spectrogram features of assorted music tracks in multiple languages. The features are plotted in three different transformed feature spaces for visual inspection of the fine-grained attributes of the music rendition, such as the vocal artist, their gender, language, and standing in the industry. The analysis of the music tracks in a chosen dataset using spectral and non-spectral algorithms such as Principal Component Analysis (PCA), t-distributed Stochastic Neighbor Embedding (t-SNE), and Uniform Manifold Approximation and Projection (UMAP) provides valuable insights into the representation learning of the selected music tracks. UMAP performs better than the other two algorithms and reasonably discerns the subtler aspects of a music rendition.
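As a rough illustration of the lower-dimensional projection step described in the abstract, the following is a minimal sketch of PCA applied to per-track feature vectors. It is not the authors' implementation: the function name `pca_project`, the 40 x 128 matrix shape, and the random data standing in for mel-spectrogram features are all assumptions made for the example.

```python
import numpy as np


def pca_project(features, n_components=2):
    """Project rows of `features` onto the top principal components.

    Hypothetical helper for illustration; returns the projected points
    and the fraction of variance explained by each kept component.
    """
    # Center each feature column so the components capture variance.
    centered = features - features.mean(axis=0)
    # SVD of the centered data: rows of `vt` are the principal directions,
    # ordered by decreasing singular value.
    _, singular_values, vt = np.linalg.svd(centered, full_matrices=False)
    projected = centered @ vt[:n_components].T
    explained = singular_values ** 2 / np.sum(singular_values ** 2)
    return projected, explained[:n_components]


# Synthetic stand-in for mel-spectrogram features (assumption):
# 40 tracks, each summarized by 128 mel-band values.
rng = np.random.default_rng(0)
tracks = rng.normal(size=(40, 128))

embedding, ratio = pca_project(tracks, n_components=2)
print(embedding.shape)  # one 2-D point per track, ready for a scatter plot
```

Each row of `embedding` could then be plotted and colored by an attribute such as artist or language. t-SNE and UMAP would replace the linear projection with nonlinear embeddings, typically through third-party libraries, which this sketch does not cover.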

Keywords

Dimensionality Reduction, Feature Transformation, Machine Learning, Mel-spectrogram, t-distributed Stochastic Neighbor Embedding, Uniform Manifold Approximation and Projection

Department

Applied Data Science
