Publication Date
Spring 2023
Degree Type
Master's Project
Degree Name
Master of Science (MS)
Department
Computer Science
First Advisor
Robert Chun
Second Advisor
William Andreopoulos
Third Advisor
Manika Makam
Keywords
YouTube comment, spam classification, logistic regression, SVM, MLP, BERT
Abstract
This paper proposes a new way of classifying comments on the video-sharing website YouTube as spam or ham (legitimate). Comments that are contextually irrelevant to a particular video or that have a commercial motive constitute spam. In the past few years, the spread of advertising to new arenas such as social media has created a lucrative platform for many. Social media is now widely used, but this growth comes with its own problems. Malicious users have exploited these platforms with the aid of automated bots that can deploy well-coordinated spam across multiple streams in a matter of seconds. This can seriously disrupt a user's social media experience and greatly tarnish a channel's reputation.
Presently, the only approach YouTube has applied to tackle such comments is blocking those that contain links. This measure is often futile, as spammers are known to quickly circumvent such obstacles. Standard machine learning algorithms may help to a certain extent, but the issue can only be properly addressed by an approach built for higher accuracy. Our aim in this paper is to propose a method for detecting these comments using machine learning models such as Logistic Regression, Multilayer Perceptron, Random Forest, Support Vector Machine, an ensemble model, and BERT, which have been shown to detect and limit spam effectively on these platforms.
Recommended Citation
Kotta, Priyusha, "Spam Comments Detection in YouTube Videos" (2023). Master's Projects. 1276.
DOI: https://doi.org/10.31979/etd.uabw-u22e
https://scholarworks.sjsu.edu/etd_projects/1276