Exploring Music Representation Learning for Detection of Finer-Grained Details
Publication Date
2-25-2026
Document Type
Article
Publication Title
Multimedia Tools and Applications
Volume
85
Issue
3
DOI
10.1007/s11042-026-21302-w
Abstract
The foundational tonic in Indian classical music presents a crucial element for algorithmic understanding. This research investigates the efficacy of modern machine learning techniques for detecting this fine-grained musical attribute from diverse audio representations. Addressing limitations in prior work that used specialized techniques, this study pioneers the integration of audio representations with non-linear dimensionality reduction and machine learning classifiers for tonic classification. The methodology employs various machine learning algorithms on music represented by audio features such as Mel-frequency cepstral coefficients and Mel spectrograms. Spectral analysis using Uniform Manifold Approximation and Projection (UMAP) offers novel insights into music representation learning. The results from tonic classification using permutations of classifiers and dimensionality reduction techniques demonstrate the feasibility of automated detection, with UMAP-reduced data achieving a peak accuracy of 99.56%, significantly outperforming Principal Component Analysis (76.82%) and full data without any dimensionality reduction (55.19%). Experiments are also performed on a public dataset with the authors’ renditions for comparison and to increase the diversity and representation of Indian classical music data.
Keywords
Boosting classifiers, Cepstral coefficients, Machine learning, Random forests, Support vector machines, Uniform manifold approximation and projection
Department
Applied Data Science
Recommended Citation
Vishnu Pendyala, Samhita Konduri, and Kriti Pendyala. "Exploring Music Representation Learning for Detection of Finer-Grained Details" Multimedia Tools and Applications (2026). https://doi.org/10.1007/s11042-026-21302-w
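
Illustrative note
The Mel-frequency cepstral coefficients and Mel spectrograms named in the abstract both rest on the mel scale, a perceptual warping of frequency. The following is a minimal sketch of that warping only, assuming the widely used HTK-style formula; the abstract does not say which mel variant or toolkit the authors used, and the function names and the band-spacing helper are hypothetical, chosen here for illustration.

```python
import math


def hz_to_mel(freq_hz: float) -> float:
    """Convert Hz to mels (HTK-style formula; an assumption, not the paper's stated choice)."""
    return 2595.0 * math.log10(1.0 + freq_hz / 700.0)


def mel_to_hz(mel: float) -> float:
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_band_centers(f_min: float, f_max: float, n_bands: int) -> list[float]:
    """Return n_bands center frequencies (Hz) spaced evenly on the mel scale.

    Hypothetical helper: the points are equally spaced in mels strictly
    between f_min and f_max, so they grow farther apart in Hz as
    frequency rises.
    """
    low, high = hz_to_mel(f_min), hz_to_mel(f_max)
    step = (high - low) / (n_bands + 1)
    return [mel_to_hz(low + step * (i + 1)) for i in range(n_bands)]


if __name__ == "__main__":
    # Print ten band centers between 0 Hz and 8 kHz.
    for center in mel_band_centers(0.0, 8000.0, 10):
        print(f"{center:.1f} Hz")
```

Because the centers are evenly spaced in mels rather than in Hz, low frequencies are sampled more densely than high ones. A full feature extractor would add framing, a filter bank, and a cepstral transform, all of which are outside this sketch.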