Publication Date
Fall 2021
Degree Type
Master's Project
Degree Name
Master of Science (MS)
Department
Computer Science
First Advisor
Nada Attar
Second Advisor
Philip Heller
Third Advisor
Bilal Alsallakh
Abstract
The nitrogenase iron protein (NifH) is extensively used to study nitrogen fixation, the ecologically vital process of reducing atmospheric nitrogen to a bioavailable form. The discovery rate of novel NifH sequences is high, and there is an ongoing need for software tools to mine NifH records from the GenBank repository. Because record annotations contain errors and are therefore unreliable, classifiers based on sequence alone are required. The ARBitrator classifier is highly successful but requires extensive manual effort to initialize. A deep learning approach could substantially reduce manual intervention. However, attempts to build a character-based deep learning NifH classifier were unsuccessful. We hypothesized that we could generate visual representations of protein sequences and use a convolutional neural network (CNN) to classify the representations. Here we present the resulting classifier, which achieved false positive and false negative rates of 0.19% and 0.22%, respectively.
Recommended Citation
Rez, Amer, "Nitrogenase Iron Protein Classification using CNN Neural Network" (2021). Master's Projects. 1049.
DOI: https://doi.org/10.31979/etd.wum7-btdc
https://scholarworks.sjsu.edu/etd_projects/1049