Publication Date

Fall 2024

Degree Type

Master's Project

Degree Name

Master of Science in Computer Science (MSCS)

Department

Computer Science

First Advisor

Ching-seh Wu

Second Advisor

William Andreopoulos

Third Advisor

Robert Chun

Keywords

Machine learning, classification, natural language processing, cyberbullying detection, neural networks

Abstract

Cyberbullying is a growing problem, driven by online anonymity and by the fewer repercussions that online platforms carry. This research proposes proactive measures to detect and prevent such behavior before it reaches the victim. Using data from various social media platforms and machine learning techniques, it proposes a system aimed at identifying and stopping cyberbullying incidents preemptively. Existing methods have focused primarily on predicting and detecting cyberbullying incidents, leaving a significant gap in research on prevention strategies. This project addresses that gap by combining machine learning, natural language processing (NLP), and software development techniques. Its approach implements blocking and warning mechanisms that intervene before harmful content reaches the intended victim, fostering a safer online environment. We also conducted an extensive comparison of five feature engineering methods and nine machine learning algorithms. These algorithms comprise three ensemble methods, four statistical methods, and two deep learning algorithms, each with two variations. Because user behavior may differ across platforms, we integrate data from multiple online sources (Twitter, Wikipedia comments, Kaggle, and YouTube) to capture that variation. The accuracy achieved across the different algorithms ranges from 87.2% to 95.5%. This report also discusses other metrics relevant to text classification, apart from accuracy.

Recommended Citation

Hegde, Chinmayi Lokeshwar, "MULTI-PLATFORM CYBERBULLYING DETECTION USING NLP AND MACHINE LEARNING" (2024). Master's Projects. 1428.
https://scholarworks.sjsu.edu/etd_projects/1428
