Publication Date

Fall 2024

Degree Type

Master's Project

Degree Name

Master of Science in Computer Science (MSCS)

Department

Computer Science

First Advisor

William Andreopoulos

Second Advisor

Leonard P Wesley

Third Advisor

Wendy Lee

Keywords

ESMFold, ColabFold, Toxin-Antitoxin Systems, Fine-tuning, Mutated Sequence Analysis, Protein Structure Prediction, Data Augmentation, Bacterial Genomics, Structural Bioinformatics

Abstract

Generative AI models have a wide range of applications; one critical application, explored in this study, is protein structure prediction. The 3D structure of a protein determines its function. Our study focuses on using generative AI models such as ESMFold and ColabFold to predict and examine the structures of naturally occurring and mutated sequences. The workflow begins with collecting antimicrobial resistance (AMR) and toxin-antitoxin (TA) protein data. The sequences are passed to pretrained models to predict protein structures. The models are then fine-tuned on original and mutated target datasets. Model performance is compared using metrics such as root mean square error, predicted template modeling (pTM) score, and predicted local distance difference test (pLDDT) score. This work lays the foundation for future studies that leverage generative AI models for novel protein structure prediction and drug discovery.

Available for download on Saturday, December 20, 2025
