Publication Date: Fall 2023
Degree Type: Master's Project
Degree Name: Master of Science in Computer Science (MSCS)
Department: Computer Science
First Advisor: Chris Tseng
Second Advisor: Faranak Abri
Third Advisor: Hang Zhang
Keywords: Sign Language Translation, Video ResNet, r3d_18 model, r(2+1)d_18 model, mc3_18 model, Kinetics-400, GRIT
Abstract
Sign languages, vital for communication among deaf and hard-of-hearing (DHH) people, present a significant linguistic-diversity challenge: more than 200 distinct sign languages are in use worldwide. Bridging this communication gap is a priority, yet traditional tools such as human interpreters and costly translation devices have limitations. This project applies deep learning techniques to develop a model capable of recognizing sign language from short videos. Our model not only recognizes the sign in a single video clip but can also predict consecutive pairs of signs. To achieve zero-shot gesture-sequence recognition, we propose a novel temporal dilation strategy that converts a static video-classification model into one that accepts a video of a gesture sequence and emits a sequence of predictions. Our model achieves 99% accuracy on the gesture recognition dataset (GRIT) and 73.18% accuracy on the gesture-sequence recognition task. This advancement aims to break down barriers and enhance opportunities for DHH individuals, fostering greater inclusivity in education, employment, sports, and social activities. The source code and pretrained models are publicly available.
Recommended Citation
Yang, Xiaoqian, "Temporal Dilation in Video ResNet for Sign Language Translation" (2023). Master's Projects. 1326.
DOI: https://doi.org/10.31979/etd.9nk9-qh4u
https://scholarworks.sjsu.edu/etd_projects/1326