Publication Date

Fall 2023

Degree Type

Master's Project

Degree Name

Master of Science in Data Science (MSDS)

Department

Computer Science

First Advisor

Ching-Seh Wu

Second Advisor

Navrati Saxena

Third Advisor

Fabio di Troia

Keywords

Gesture Recognition, Machine Learning, Computer Vision, Deep Learning, Image Processing Filters, Ensemble Models, Sign Language Translation

Abstract

With the rising incidence of hearing loss, effective sign language recognition has become crucial for enhancing communication for individuals with hearing impairments. Traditional sensor-based recognition systems have been challenged by the complexities of real-world settings, prompting a shift toward more adaptable vision-based recognition systems. Distinct from previous studies, this work pioneers the use of ensemble methods with advanced filtering techniques on the Sign Language MNIST dataset, offering a novel perspective on sign language recognition. This research combines machine learning and image processing to develop a robust framework for sign language recognition. A range of filters, including Sobel, Canny, and the Hough transform, was employed in preprocessing to optimize feature extraction across various machine learning models: Convolutional Neural Networks (CNN), XGBoost, LightGBM, CatBoost, Support Vector Machines (SVM), and VGG16. Our findings reveal that while the SVM and XGBoost models require considerable training time because of the complexity of the image data, the ensemble models, particularly those pairing CNN with SVM, XGBoost, LightGBM, and CatBoost, exhibit a synergistic effect that balances training efficiency with high accuracy. The Sobel filter combined with the SVM ensemble was the fastest to train, indicating its potential for real-time applications. Accuracy assessments demonstrated that the CNN and CNN+SVM models achieved perfect scores, signifying their exceptional categorization capabilities, whereas models such as VGG16 need refinement to mitigate overfitting or dataset-related limitations. The predictive efficiency of these models was systematically classified, with some exhibiting remarkably swift prediction times, underscoring their suitability for systems where computational resources are a constraint.
Ultimately, this study comprehensively evaluates various machine learning models, highlights the trade-offs between computational demand and accuracy, and underscores the promise of ensemble approaches in sign language recognition technology.
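
As an illustration of the filter-based preprocessing mentioned above, the following is a minimal sketch of a Sobel edge filter applied to a single grayscale image and flattened into a feature vector. It is not the project's code: the function names `sobel_magnitude` and `flatten`, the zero-valued border handling, and the toy input are assumptions made for this example, and the 28x28 image size is assumed from the public Sign Language MNIST dataset rather than stated in the abstract.

```python
import math

# Standard 3x3 Sobel kernels for horizontal and vertical gradients.
SOBEL_X = [[-1, 0, 1], [-2, 0, 2], [-1, 0, 1]]
SOBEL_Y = [[-1, -2, -1], [0, 0, 0], [1, 2, 1]]


def sobel_magnitude(image):
    """Return the Sobel gradient magnitude of a 2D grayscale image.

    `image` is a list of equal-length rows of numbers. Border pixels are
    left at 0.0 because the 3x3 window does not fit there (an assumption
    of this sketch; other border policies are possible).
    """
    height, width = len(image), len(image[0])
    out = [[0.0] * width for _ in range(height)]
    for r in range(1, height - 1):
        for c in range(1, width - 1):
            gx = gy = 0.0
            for dr in range(3):
                for dc in range(3):
                    pixel = image[r + dr - 1][c + dc - 1]
                    gx += SOBEL_X[dr][dc] * pixel
                    gy += SOBEL_Y[dr][dc] * pixel
            out[r][c] = math.hypot(gx, gy)
    return out


def flatten(image):
    """Flatten a 2D image into a 1D feature vector for a classifier."""
    return [value for row in image for value in row]


# Hypothetical toy input: a 28x28 image with a dark left half and a
# bright right half, so it contains exactly one vertical edge.
toy = [[0] * 14 + [255] * 14 for _ in range(28)]
edges = sobel_magnitude(toy)
features = flatten(edges)
```

A vector like `features` could then be passed to a classical model such as an SVM or a gradient-boosted tree; which filter and model pairings the project actually used, and with what settings, is described in the full report rather than here.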
