Publication Date

Spring 2025

Degree Type

Master's Project

Degree Name

Master of Science in Computer Science (MSCS)

Department

Computer Science

First Advisor

Teng Moh

Second Advisor

Melody Moh

Third Advisor

Navrati Saxena

Keywords

Bot Detection, GraphSage, BERT

Abstract

This project details a novel bot detection system developed to battle the ever-changing challenge of disinformation, misinformation, and other bot-generated content.

The methodology combines the text-analysis strength of BERT (Bidirectional Encoder Representations from Transformers) with the ability of GraphSage (Graph Sample and Aggregation) to analyze network structures. The BERT and GraphSage vectors are concatenated into an 896-dimensional feature embedding that blends network and text features. A Support Vector Machine (SVM) classifies the concatenated embeddings, as SVMs work well with high-dimensional data. The model was evaluated on two datasets, Cresci-15 and Twibot-22. On Cresci-15, it outperformed all other models with an accuracy of 98.68%. On the more challenging Twibot-22 dataset, it reached an accuracy of 74.62%, which still outperformed some state-of-the-art models. The results point to an efficient and scalable system that can handle large-scale datasets with intricate network structures.
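The concatenation step described above can be illustrated with a minimal sketch. It is not the project's code: the 768/128 split of the 896 dimensions is an assumption (768 is the hidden size of BERT-base; the abstract does not state the GraphSage output size), the function names `fuse_embeddings` and `train_svm` are hypothetical, and the RBF kernel is an arbitrary choice for illustration. The SVM helper uses scikit-learn's `SVC`, imported lazily so the fusion step runs without it.

```python
BERT_DIM = 768    # assumption: hidden size of BERT-base
GRAPH_DIM = 128   # assumption: chosen so that 768 + 128 = 896
FUSED_DIM = BERT_DIM + GRAPH_DIM


def fuse_embeddings(text_vec, graph_vec):
    """Concatenate one account's text embedding and graph embedding."""
    if len(text_vec) != BERT_DIM or len(graph_vec) != GRAPH_DIM:
        raise ValueError("unexpected embedding size")
    return list(text_vec) + list(graph_vec)


def train_svm(fused_vectors, labels):
    """Fit an SVM on fused vectors (requires scikit-learn)."""
    from sklearn.svm import SVC  # lazy import: only needed for training

    classifier = SVC(kernel="rbf")  # kernel choice is an assumption
    classifier.fit(fused_vectors, labels)
    return classifier


# Placeholder vectors standing in for real BERT and GraphSage outputs.
text_vec = [0.1] * BERT_DIM
graph_vec = [0.2] * GRAPH_DIM
fused = fuse_embeddings(text_vec, graph_vec)
print(len(fused))  # 896
```

In practice, one fused vector would be built per account and the resulting list passed to `train_svm` together with bot/human labels.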

Available for download on Saturday, February 28, 2026
