Impact of Data Heterogeneity on AI/ML Model Accuracy in Assisting Pneumonia Type Prediction
Publication Date
1-1-2024
Document Type
Conference Proceeding
Publication Title
Proceedings of the 2024 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology, IAICT 2024
DOI
10.1109/IAICT62357.2024.10617531
First Page
253
Last Page
258
Abstract
Pneumonia is the fourth most common cause of mortality, resulting in more than 50,000 deaths in the U.S. alone every year. Cases of this respiratory infection have only been exacerbated by the COVID-19 pandemic, as the virus tends to attack the airways and gas exchange regions. The diagnosis of COVID-19 pneumonia depends on various factors, including the severity and the type of the disease, which physicians attempt to determine preliminarily by analyzing chest X-ray scans. Given the enormous amount of X-ray data, one can use an automated procedure to identify defects in scanned images that, in conjunction with other clinical diagnostics, can help verify the presence of disease. Machine learning has emerged as a powerful tool to enable high-accuracy medical diagnostics. In the current work, various neural network algorithms, including the convolutional neural network (CNN), CNN+DenseNet121, CNN+EfficientNetB7, and CNN+ResNet50, were employed to classify chest X-ray images as one of the following diagnoses: Negative for COVID-19 pneumonia, Mild Atypical COVID-19 pneumonia, Moderate Atypical COVID-19 pneumonia, Severe Atypical COVID-19 pneumonia, Mild Indeterminate COVID-19 pneumonia, Moderate Indeterminate COVID-19 pneumonia, Severe Indeterminate COVID-19 pneumonia, Mild Typical COVID-19 pneumonia, Moderate Typical COVID-19 pneumonia, and Severe Typical COVID-19 pneumonia. The CNN, CNN+DenseNet121, CNN+EfficientNetB7, and CNN+ResNet50 models achieved training accuracies of 47.62%, 84.08%, 64.08%, and 74.30%, and validation accuracies of 42.29%, 50.25%, 53.98%, and 43.28%, respectively. The moderate classification performance across all four models suggests that data heterogeneity, particularly the presence of ten similar diagnostic scenarios, greatly limits the potential of machine learning in medical diagnostics. Nevertheless, data manipulations and advanced modeling are being studied further to overcome this barrier.
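The reported figures can be tabulated with a short sketch such as the one below. It is not the authors' code: the variable and function names are hypothetical, the ten labels are rebuilt from the wording of the abstract, and the only numbers used are the accuracies quoted above. The "gap" is simply training accuracy minus validation accuracy, a rough indicator of how well each model generalizes.

```python
from itertools import product

# Accuracies in percent as quoted in the abstract: (training, validation).
REPORTED = {
    "CNN": (47.62, 42.29),
    "CNN+DenseNet121": (84.08, 50.25),
    "CNN+EfficientNetB7": (64.08, 53.98),
    "CNN+ResNet50": (74.30, 43.28),
}

# Hypothetical names for the two axes that make up the nine positive classes.
APPEARANCES = ["Atypical", "Indeterminate", "Typical"]
SEVERITIES = ["Mild", "Moderate", "Severe"]


def build_labels():
    """Rebuild the ten diagnosis labels in the order the abstract lists them."""
    labels = ["Negative for COVID-19 pneumonia"]
    labels += [
        f"{severity} {appearance} COVID-19 pneumonia"
        for appearance, severity in product(APPEARANCES, SEVERITIES)
    ]
    return labels


def generalization_gaps(reported):
    """Return training minus validation accuracy (percentage points) per model."""
    return {
        name: round(train - val, 2) for name, (train, val) in reported.items()
    }


LABELS = build_labels()
GAPS = generalization_gaps(REPORTED)
BEST_VALIDATION = max(REPORTED, key=lambda name: REPORTED[name][1])
LARGEST_GAP = max(GAPS, key=GAPS.get)

if __name__ == "__main__":
    print(f"{len(LABELS)} diagnosis classes")
    for name, (train, val) in REPORTED.items():
        print(f"{name:20s} train={train:6.2f}  val={val:6.2f}  gap={GAPS[name]:6.2f}")
    print("Highest validation accuracy:", BEST_VALIDATION)
    print("Largest train/validation gap:", LARGEST_GAP)
```

On these numbers, CNN+EfficientNetB7 has the highest validation accuracy, while CNN+DenseNet121 shows the widest gap between training and validation accuracy.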
Keywords
Chest X-ray Images, COVID-19, Machine Learning, Neural Network, Pneumonia
Department
Mechanical Engineering
Recommended Citation
Manasvi Pinnaka, Krishnaveni Parvataneni, and Sohail Zaidi. "Impact of Data Heterogeneity on AI/ML Model Accuracy in Assisting Pneumonia Type Prediction" Proceedings of the 2024 IEEE International Conference on Industry 4.0, Artificial Intelligence, and Communications Technology, IAICT 2024 (2024): 253-258. https://doi.org/10.1109/IAICT62357.2024.10617531