Publication Date

Spring 2019

Degree Type

Master's Project

Degree Name

Master of Science (MS)


Department

Computer Science

First Advisor

Philip Heller

Second Advisor

Katerina Potika

Third Advisor

Thomas Austin


Keywords

COI gene, Classification, Profile Hidden Markov Models


Abstract

Traditional classification systems for living organisms, such as the Linnaean taxonomy, classify species by their morphological features. These systems are being replaced by molecular approaches that use gene sequences. The COI gene, also known as the "DNA barcode" because it is unique to every species, can be used to identify organisms and thus classify them. Classification by gene sequence has many advantages, including correct identification of cryptic species (individuals that appear similar but belong to different species) and of species that are extremely small. In this project, I worked on assigning COI sequences of unknown species to a genus using Profile Hidden Markov Models.

(Taxonomy Ranks: Kingdom → Phylum → Class → Order → Family → Genus → Species)