Publication Date

Spring 2019

Degree Type

Master's Project

Degree Name

Master of Science (MS)


Department

Computer Science

First Advisor

Philip Heller

Second Advisor

Sami Khuri

Third Advisor

Nada Attar


Keywords

nitrogen fixation, annotation errors, sequence classifier


Abstract

Nitrogenase iron protein, encoded by the nifH gene, is a key component of the enzyme responsible for nitrogen fixation. Microbes carrying the nifH gene inject reduced nitrogen into the biosphere, which is essential for all living things. Obtaining nifH sequences from the GenBank database is problematic because of annotation errors, nomenclature variation, and paralogues. One possible solution is to retrieve candidate sequences from GenBank and use a sequence classifier to label them. In this research, we convert sequences to images and build a nifH sequence classifier using image processing and a convolutional neural network. The resulting classification model labels sequences with an accuracy of around 99%.
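The abstract does not say how sequences are converted to images, so the following is only a minimal sketch of one common option, assumed here for illustration: a one-hot encoding in which each nucleotide becomes a column of a 4-row grayscale image. The function name `sequence_to_image`, the fixed `width` parameter, and the 0/255 pixel values are all hypothetical choices and are not taken from the project.

```python
# Hypothetical illustration only: one-hot encoding is an assumed scheme,
# not necessarily the sequence-to-image conversion used in the project.
BASES = "ACGT"  # one image row per nucleotide


def sequence_to_image(seq, width):
    """Encode a DNA sequence as a 4 x width grid of 0/255 pixel values.

    Sequences longer than `width` are truncated; shorter ones are padded
    with 'N', which (like any ambiguous base) yields an all-zero column.
    """
    seq = seq.upper()[:width].ljust(width, "N")
    return [[255 if ch == base else 0 for ch in seq] for base in BASES]


if __name__ == "__main__":
    # Print the image for a short example sequence, one row per base.
    for base, row in zip(BASES, sequence_to_image("ATGGCN", 8)):
        print(base, row)
```

A fixed width is used because convolutional networks generally expect inputs of a uniform shape; the resulting grid could then be passed to an image classifier as a single-channel image.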