E-SMOTE: Entropy Based Minority Oversampling for Heart Failure and AIDS Clinical Trails Analysis
Publication Date
1-1-2024
Document Type
Conference Proceeding
Publication Title
Proceedings - 2024 IEEE 48th Annual Computers, Software, and Applications Conference, COMPSAC 2024
DOI
10.1109/COMPSAC61105.2024.00291
First Page
1841
Last Page
1846
Abstract
Machine Learning (ML) algorithms often exhibit reduced performance in the presence of class imbalance, leading to biased results that favor the majority class in a dataset. This imbalance can be addressed through various sampling techniques, including oversampling of the minority class, undersampling of the majority class, or a combination of both. However, these techniques utilize the entire set of samples in a dataset. In this paper, we introduce E-SMOTE, an Entropy-based Synthetic Minority Oversampling Technique (SMOTE), which extends the traditional SMOTE method. E-SMOTE is a novel oversampling technique designed to utilize only a subset of the minority class samples for the oversampling process. We employ entropy as a guiding metric to identify influential minority class instances located near decision boundaries. By generating additional instances near these boundaries within a binary classification system, E-SMOTE strengthens the decision boundary during the ML classifier training process. We conducted experiments on two datasets, Heart Failure Records and AIDS Clinical Trial Records, to demonstrate the effectiveness of E-SMOTE compared to traditional SMOTE. Our experimental results illustrate that E-SMOTE outperforms the baseline classifier on both the Heart Failure and AIDS clinical trial datasets. Additionally, using only a subset of the data, it provides reasonable performance comparable to that of the SMOTE oversampling technique applied to the entire dataset.
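The general idea described in the abstract (score minority instances by entropy, keep those near the decision boundary, and interpolate new samples from them) can be illustrated with a minimal sketch. This is not the authors' implementation: the abstract does not say how entropy is computed, so the sketch assumes the Shannon entropy of the class labels among each minority point's k nearest neighbours. The function names, the `keep` fraction, and the toy data are all hypothetical.

```python
import math
import random


def binary_entropy(p):
    """Shannon entropy in bits of a two-class distribution with P(minority) = p."""
    if p <= 0.0 or p >= 1.0:
        return 0.0
    return -(p * math.log2(p) + (1 - p) * math.log2(1 - p))


def nearest_indices(i, candidates, X, k):
    """Indices of the k candidates closest to X[i] (Euclidean), excluding i itself."""
    others = [j for j in candidates if j != i]
    return sorted(others, key=lambda j: math.dist(X[i], X[j]))[:k]


def entropy_guided_oversample(X, y, minority=1, k=3, keep=0.5, n_new=4, seed=0):
    """Assumed scheme: pick high-entropy minority seeds, then interpolate SMOTE-style."""
    rng = random.Random(seed)
    all_idx = list(range(len(X)))
    min_idx = [i for i in all_idx if y[i] == minority]

    # Score each minority point by the label entropy of its neighbourhood.
    # Mixed neighbourhoods (close to the boundary) score high, pure ones score 0.
    scored = []
    for i in min_idx:
        neigh = nearest_indices(i, all_idx, X, k)
        p = sum(1 for j in neigh if y[j] == minority) / len(neigh)
        scored.append((binary_entropy(p), i))

    # Keep only the highest-entropy fraction of the minority class as seeds.
    scored.sort(key=lambda s: s[0], reverse=True)
    n_seeds = max(1, int(len(scored) * keep))
    seed_idx = [i for _, i in scored[:n_seeds]]

    # Standard SMOTE interpolation step, restricted to the selected seeds:
    # move a random fraction of the way toward a nearby minority neighbour.
    synthetic = []
    for _ in range(n_new):
        s = rng.choice(seed_idx)
        mate = rng.choice(nearest_indices(s, min_idx, X, k))
        gap = rng.random()
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(X[s], X[mate])))
    return [X[i] for i in seed_idx], synthetic


# Hypothetical toy data: six majority points (0) and four minority points (1),
# two of which sit next to the majority cluster.
X = [(0.0, 0.0), (0.2, 0.1), (0.1, 0.3), (0.3, 0.2), (0.0, 0.4), (0.4, 0.0),
     (1.0, 1.0), (0.9, 1.1), (0.45, 0.35), (0.5, 0.5)]
y = [0, 0, 0, 0, 0, 0, 1, 1, 1, 1]

seeds, synthetic = entropy_guided_oversample(X, y)
```

In this toy setup, the two minority points adjacent to the majority cluster have mixed neighbourhoods and are selected as seeds, while the two isolated minority points have pure neighbourhoods and are skipped. A classifier's predicted class probabilities could be substituted for the neighbourhood label frequencies; that would be a different, equally assumed, way of computing the entropy score.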
Keywords
Entropy, Imbalanced class, Machine Learning, Sampling, SMOTE
Department
Applied Data Science
Recommended Citation
Sainath Veerla, Anbu Valluvan Devadasan, Mohammad Masum, Mohammed Chowdhury, and Hossain Shahriar. "E-SMOTE: Entropy Based Minority Oversampling for Heart Failure and AIDS Clinical Trails Analysis" Proceedings - 2024 IEEE 48th Annual Computers, Software, and Applications Conference, COMPSAC 2024 (2024): 1841-1846. https://doi.org/10.1109/COMPSAC61105.2024.00291