Publication Date
Fall 2024
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
Ching-seh Wu
Second Advisor
William Andreopoulos
Third Advisor
Robert Chun
Keywords
Machine learning, classification, natural language processing, cyberbullying detection, neural networks
Abstract
Cyberbullying is a growing problem, driven by online anonymity and by the fewer repercussions that online platforms impose. This research proposes proactive measures to detect and prevent such behavior before it reaches the victim. Using data from several social media platforms and machine learning techniques, we propose a system that identifies and stops cyberbullying incidents preemptively. Existing methods have focused primarily on predicting and detecting cyberbullying incidents, leaving a significant gap in research on prevention strategies. This project addresses that gap by combining machine learning, natural language processing (NLP), and software development techniques. Its approach implements blocking and warning mechanisms that intervene before harmful content reaches the intended victim, fostering a safer online environment. We also conduct an extensive comparison of five feature engineering methods and nine machine learning algorithms: three ensemble methods, four statistical methods, and two deep learning algorithms, each with two variations. Because user behavior may differ across platforms, we integrate data from multiple online sources, namely Twitter, Wikipedia comments, Kaggle, and YouTube. Across the different algorithms, the achieved accuracy ranges from 87.2% to 95.5%. This report also discusses metrics other than accuracy that are relevant to text classification.
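The blocking and warning idea described in the abstract can be illustrated with a minimal Python sketch. It is not the project's implementation: the thresholds, the `Decision` type, the `moderate` function, and the toy word-list scorer are all hypothetical names and values chosen for illustration. A real system would replace `toy_score` with a trained classifier that returns the estimated probability that a message is bullying.

```python
from dataclasses import dataclass
from typing import Callable

# Hypothetical thresholds; the abstract does not state any values.
WARN_THRESHOLD = 0.5
BLOCK_THRESHOLD = 0.8


@dataclass
class Decision:
    action: str   # "deliver", "warn", or "block"
    score: float  # estimated probability that the message is bullying


def moderate(message: str,
             score_fn: Callable[[str], float],
             warn_at: float = WARN_THRESHOLD,
             block_at: float = BLOCK_THRESHOLD) -> Decision:
    """Decide what to do with a message before it reaches the recipient."""
    score = score_fn(message)
    if score >= block_at:
        return Decision("block", score)
    if score >= warn_at:
        return Decision("warn", score)
    return Decision("deliver", score)


# Stand-in scorer for demonstration only: the fraction of tokens that appear
# in a tiny made-up word list. This is an assumption, not the project's model.
TOY_LEXICON = {"stupid", "loser", "ugly"}


def toy_score(message: str) -> float:
    tokens = [t.strip(".,!?").lower() for t in message.split()]
    if not tokens:
        return 0.0
    hits = sum(t in TOY_LEXICON for t in tokens)
    return hits / len(tokens)
```

Calling `moderate(text, toy_score)` before delivery returns a `Decision` whose `action` tells the caller whether to pass the message through, show the sender a warning, or withhold the message. Keeping the scorer as a parameter means any of the compared classifiers could be plugged in without changing the decision logic.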
Recommended Citation
Hegde, Chinmayi Lokeshwar, "MULTI-PLATFORM CYBERBULLYING DETECTION USING NLP AND MACHINE LEARNING" (2024). Master's Projects. 1428.
https://scholarworks.sjsu.edu/etd_projects/1428