Publication Date
Spring 2024
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
Saptarshi Sengupta
Second Advisor
Nada Attar
Third Advisor
William Andreopoulos
Keywords
Fake news, News articles, XGBoost, Naive Bayes, BERT, Machine learning
Abstract
This project uses machine learning to build a sequential, two-phase model that detects and categorizes fake news, with the aim of limiting its spread online. In the first phase, classification algorithms (Naïve Bayes, XGBoost, and Random Forest) distinguish true news stories from false ones. In the second phase, Naïve Bayes, XGBoost, Random Forest, and the Transformer-based BERT (Bidirectional Encoder Representations from Transformers) model categorize the news into specific topics.
The methodology comprises data acquisition, preprocessing, feature extraction, and model training, followed by evaluation using accuracy, F1 score, and a detailed classification report. Together, these steps are intended to produce a model that not only identifies fake news but also categorizes it by topic alongside legitimate articles.
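As a rough illustration of the two-phase flow and the evaluation step described above, the following Python sketch trains one classifier for veracity and a second for topic, then scores both with accuracy and F1. It is only a minimal sketch under stated assumptions: the toy articles, the `TinyNaiveBayes` class, and the `classify`, `accuracy`, and `f1_score` helpers are hypothetical names invented for this example, and the hand-written Naïve Bayes over raw word counts stands in for the project's actual features and models (which also include XGBoost, Random Forest, and BERT). Scores on this toy data say nothing about the project's results.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace (a stand-in for real preprocessing)."""
    return text.lower().split()


class TinyNaiveBayes:
    """Multinomial Naive Bayes over word counts with add-one smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)
        self.word_counts = defaultdict(Counter)
        for text, label in zip(texts, labels):
            self.word_counts[label].update(tokenize(text))
        self.vocab = {w for counts in self.word_counts.values() for w in counts}
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        best_label, best_score = None, -math.inf
        for label, n_docs in self.class_counts.items():
            counts = self.word_counts[label]
            denom = sum(counts.values()) + len(self.vocab)
            score = math.log(n_docs / total_docs)  # log prior
            for word in tokenize(text):
                if word in self.vocab:  # ignore words never seen in training
                    score += math.log((counts[word] + 1) / denom)
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Invented toy articles: (text, veracity label, topic label).
TRAIN = [
    ("scientists confirm vaccine trial results published in journal", "real", "health"),
    ("miracle pill cures all disease doctors shocked", "fake", "health"),
    ("senate passes budget bill after long debate", "real", "politics"),
    ("shocking secret plot senator is alien exposed", "fake", "politics"),
    ("team wins championship final after overtime goal", "real", "sports"),
    ("shocking miracle athlete runs marathon in ten minutes", "fake", "sports"),
]
texts = [t for t, _, _ in TRAIN]
stage1 = TinyNaiveBayes().fit(texts, [v for _, v, _ in TRAIN])  # phase 1: real vs. fake
stage2 = TinyNaiveBayes().fit(texts, [c for _, _, c in TRAIN])  # phase 2: topic


def classify(text):
    """Run both phases in sequence and return (veracity, topic)."""
    return stage1.predict(text), stage2.predict(text)


def accuracy(y_true, y_pred):
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def f1_score(y_true, y_pred, positive):
    """F1 for one class, treating `positive` as the positive label."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    total = precision + recall
    return 2 * precision * recall / total if total else 0.0


# Invented held-out articles for the evaluation step.
HELD_OUT = [
    ("miracle pill cures disease", "fake", "health"),
    ("senate passes budget bill", "real", "politics"),
    ("team wins final after overtime", "real", "sports"),
    ("shocking secret plot exposed", "fake", "politics"),
]
predictions = [classify(t) for t, _, _ in HELD_OUT]
veracity_true = [v for _, v, _ in HELD_OUT]
veracity_pred = [p[0] for p in predictions]
topic_true = [c for _, _, c in HELD_OUT]
topic_pred = [p[1] for p in predictions]

print("phase 1 accuracy:", accuracy(veracity_true, veracity_pred))
print("phase 1 F1 (fake):", f1_score(veracity_true, veracity_pred, "fake"))
print("phase 2 accuracy:", accuracy(topic_true, topic_pred))
```

Both phases here are trained on the same articles with different label columns, so every article receives a veracity label and a topic label. Whether the project passes only some articles to the second phase is not stated in the abstract, so this sketch does not assume it.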
Recommended Citation
Adusumilli, Snegdha, "A Two-stage Machine Learning Approach for Fake News Detection and News Article Categorization" (2024). Master's Projects. 1356.
DOI: https://doi.org/10.31979/etd.3gfk-b34t
https://scholarworks.sjsu.edu/etd_projects/1356