Publication Date
Spring 2025
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
Teng Moh
Second Advisor
Melody Moh
Third Advisor
Navrati Saxena
Keywords
Bot Detection, GraphSage, BERT
Abstract
This project details a novel bot detection system developed to combat the ever-changing challenge of disinformation, misinformation, and other bot-generated content.
The methodology combines the text-analysis strength of BERT (Bidirectional Encoder Representations from Transformers) with the strength of GraphSage (Graph Sample and Aggregation) in analyzing network structures. The project concatenates the BERT and GraphSage vectors into an 896-dimensional feature embedding that blends network and text features. A Support Vector Machine (SVM) classifies the concatenated embeddings, as SVMs work well with high-dimensional data. The model was evaluated on two datasets, Cresci-15 and Twibot-22. It outperformed all other models on the Cresci-15 dataset with an accuracy of 98.68%. Despite the challenges, the model achieved an accuracy of 74.62% on the Twibot-22 dataset, which still outperformed some state-of-the-art models. The results highlight an efficient and scalable system that can handle large-scale datasets with intricate network structures.
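The concatenation step described in the abstract can be illustrated with a minimal sketch. The abstract gives only the 896 total; the 768 + 128 split below is an assumption (768 is the hidden size of BERT-base, and 128 is a guess for the GraphSage output that makes the sum work). The function name `fuse_embeddings` and the random placeholder vectors are hypothetical and are not taken from the project.

```python
import random

# Assumed split: the abstract states only the 896 total. 768 is the hidden
# size of BERT-base; 128 for GraphSage is a guess that makes the sum work.
BERT_DIM = 768
GRAPH_DIM = 128
FUSED_DIM = BERT_DIM + GRAPH_DIM  # 896


def fuse_embeddings(text_vec, graph_vec):
    """Concatenate one account's text and graph embeddings into one vector."""
    if len(text_vec) != BERT_DIM:
        raise ValueError(f"expected {BERT_DIM} text features, got {len(text_vec)}")
    if len(graph_vec) != GRAPH_DIM:
        raise ValueError(f"expected {GRAPH_DIM} graph features, got {len(graph_vec)}")
    # Text features come first, graph features second.
    return list(text_vec) + list(graph_vec)


# Random placeholders stand in for real BERT and GraphSage outputs.
rng = random.Random(0)
text_vec = [rng.gauss(0.0, 1.0) for _ in range(BERT_DIM)]
graph_vec = [rng.gauss(0.0, 1.0) for _ in range(GRAPH_DIM)]

fused = fuse_embeddings(text_vec, graph_vec)
print(len(fused))
```

In a full pipeline, one such fused vector per account would be passed to an SVM classifier (for example, scikit-learn's `SVC`); that step is omitted here to keep the sketch dependency-free.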
Recommended Citation
Deshmukh, Abhishek, "Bot Detection in Social Media using GraphSage and BERT" (2025). Master's Projects. 1465.
https://scholarworks.sjsu.edu/etd_projects/1465