Publication Date
Fall 2022
Degree Type
Master's Project
Degree Name
Master of Science (MS)
Department
Computer Science
First Advisor
Nada Attar
Second Advisor
Mike Wu
Third Advisor
Noha Elfiky
Keywords
Facial Expression Recognition, Deep Learning, ResNet, SENet, GANs
Abstract
Human beings express themselves through words, signs, gestures, and facial expressions. Previous research with pre-trained convolutional models froze the entire network and ran the models without any image processing techniques. In this research, we attempt to improve the accuracy of several deep CNN architectures, such as ResNet and SENet, using a variety of image processing techniques, including Image Data Generator, Histogram Equalization, and Unsharp Mask. We used FER-2013, a dataset of grayscale face images labeled with seven emotion classes. Beyond preprocessing, we also modified the models themselves in an attempt to improve their accuracy.
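As an illustration of two of the preprocessing steps named above, the following is a minimal pure-Python sketch of histogram equalization and unsharp masking on a grayscale image stored as a list of rows. It is not the project's code: the function names are hypothetical, and the 3x3 box blur stands in for the Gaussian blur that unsharp masking usually uses.

```python
def equalize_histogram(image, levels=256):
    """Spread the intensities of a grayscale image across the full range."""
    pixels = [p for row in image for p in row]
    # Count how often each intensity occurs.
    hist = [0] * levels
    for p in pixels:
        hist[p] += 1
    # Build the cumulative distribution of intensities.
    cdf, total = [], 0
    for count in hist:
        total += count
        cdf.append(total)
    cdf_min = next(c for c in cdf if c > 0)
    n = len(pixels)
    if n == cdf_min:
        # Constant image: there is nothing to equalize.
        return [row[:] for row in image]
    # Map each intensity through the normalized cumulative distribution.
    lut = [max(0, round((c - cdf_min) * (levels - 1) / (n - cdf_min)))
           for c in cdf]
    return [[lut[p] for p in row] for row in image]


def box_blur(image):
    """3x3 mean blur; border pixels are replicated (an assumed simplification)."""
    h, w = len(image), len(image[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            total = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), h - 1)
                    xx = min(max(x + dx, 0), w - 1)
                    total += image[yy][xx]
            row.append(total / 9)
        out.append(row)
    return out


def unsharp_mask(image, amount=1.0, levels=256):
    """Sharpen by adding back the difference between the image and its blur."""
    blurred = box_blur(image)
    return [
        [min(levels - 1, max(0, round(p + amount * (p - b))))
         for p, b in zip(row, blurred_row)]
        for row, blurred_row in zip(image, blurred)
    ]
```

For example, equalize_histogram([[52, 55], [61, 59]]) stretches four closely spaced intensities to span the full 0 to 255 range, and unsharp_mask raises the contrast between a bright pixel and its darker neighbors.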
While working on this research, we were introduced to another concept in deep learning: Generative Adversarial Networks (GANs). GANs are generative deep learning models built from two CNN models: a Generator and a Discriminator. The Generator produces images from random noise and passes them to the Discriminator, which compares each generated image with real input images and accepts or rejects it based on how similar it appears. Over the years, several notable GAN architectures, such as CycleGAN and StyleGAN, have made it possible not only to generate images resembling the original input but also to alter them and generate different images. For example, CycleGAN can change the season of a landscape from summer to winter or change the emotion on a person's face from happy to sad. Although these models are powerful, a GAN consists of two deep neural networks, which creates problems with hyperparameter tuning and overfitting.
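The interplay between the two networks can be sketched through the losses they are trained on. The block below is a minimal sketch assuming the common binary cross-entropy formulation with a non-saturating generator loss; the abstract does not state which loss the project used, and the function names are hypothetical. The inputs are the Discriminator's estimated probabilities that each image is real.

```python
import math


def bce(prediction, target, eps=1e-12):
    """Binary cross-entropy for one probability and one 0/1 target."""
    p = min(max(prediction, eps), 1 - eps)  # clamp to avoid log(0)
    return -(target * math.log(p) + (1 - target) * math.log(1 - p))


def discriminator_loss(d_real, d_fake):
    """Low when real images score near 1 and generated images score near 0."""
    real_term = sum(bce(p, 1.0) for p in d_real) / len(d_real)
    fake_term = sum(bce(p, 0.0) for p in d_fake) / len(d_fake)
    return real_term + fake_term


def generator_loss(d_fake):
    """Low when the Discriminator scores generated images near 1."""
    return sum(bce(p, 1.0) for p in d_fake) / len(d_fake)
```

The two objectives pull in opposite directions on the generated images: the Discriminator is rewarded for rejecting them and the Generator for getting them accepted. Because each network's training signal depends on the other, the two must be tuned together, which is one reason hyperparameter tuning is harder than for a single CNN.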
Recommended Citation
Sudhakar, Sriramm Muthyala, "A Study on Human Face Expressions using Convolutional Neural Networks and Generative Adversarial Networks" (2022). Master's Projects. 1207.
DOI: https://doi.org/10.31979/etd.v3ha-qydc
https://scholarworks.sjsu.edu/etd_projects/1207