Publication Date

Spring 2023

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Nada Attar

Second Advisor

Ching-Seh Wu

Third Advisor

Mashhour Solh

Keywords

classification task, data augmentation, deep learning models, ensemble techniques, sign language recognition, skeletal keypoints, pose extraction

Abstract

Sign language recognition (SLR) has long been a studied subject and research field within the Computer Vision domain. Appearance-based and pose-based approaches are two ways to tackle SLR tasks. Various models from traditional to current state-of-the-art including HOG-based features, Convolutional Neural Network, Recurrent Neural Network, Transformer, and Graph Convolutional Network have been utilized to tackle the area of SLR. While classifying alphabet letters in sign language has shown high accuracy rates, recognizing words presents its set of difficulties including the large vocabulary size, the subtleties in body motions and hand orientations, and regional dialects and variations. The emergence of deep learning has created opportunities for improved word-level sign recognition, but challenges such as overfitting and limited training data remain. Techniques such as data augmentation, feature engineering, hyperparameter tuning, optimization, and ensemble methods have been used to overcome these challenges and improve the accuracy and generalization ability of ASL classification models. We explore various methods to improve the accuracy and performance in this project. From the approach, we were able to first reproduce a baseline accuracy of 43.02% on the WLASL dataset and further achieve an improvement in accuracy at 55.96%. We also extended the work to a different dataset to gain a comprehensive understanding of our work.

Recommended Citation

Luong, Shayla, "Video Sign Language Recognition using Pose Extraction and Deep Learning Models" (2023). Master's Projects. 1251.
DOI: https://doi.org/10.31979/etd.jm4c-myd4
https://scholarworks.sjsu.edu/etd_projects/1251

Download

Included in

Artificial Intelligence and Robotics Commons, Graphics and Human Computer Interfaces Commons

COinS

DOI

https://doi.org/10.31979/etd.jm4c-myd4

Master's Projects

Video Sign Language Recognition using Pose Extraction and Deep Learning Models

Publication Date

Degree Type

Degree Name

Department

First Advisor

Second Advisor

Third Advisor

Keywords

Abstract

Recommended Citation

Included in

DOI

Search

Browse All

Links

Master's Projects

Video Sign Language Recognition using Pose Extraction and Deep Learning Models

Author

Publication Date

Degree Type

Degree Name

Department

First Advisor

Second Advisor

Third Advisor

Keywords

Abstract

Recommended Citation

Included in

Share

DOI

Search

Browse All

Links