Publication Date

Spring 2023

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Fabio Di Troia

Second Advisor

William Andreopoulos

Third Advisor

Robert Chun

Keywords

Indian Hate Speech, fastText, GloVe, distilBERT, MuRIL

Abstract

Social media is a great place to share one’s thoughts and to express oneself. Very often, however, the same social media platforms become a means for spewing hatred. The large amount of data shared on these platforms makes it difficult to moderate the content posted by users. In a diverse country like India, hate is present on social media in all regional languages, which makes it even harder to detect because there is not enough data to train machine learning and deep learning models on these languages. This work is our attempt at tackling hate speech in Hindi. We experiment with embeddings such as fastText and GloVe combined with machine learning classifiers such as logistic regression and decision trees. We also experiment with transformer-based embeddings such as distilBERT and MuRIL. The transformer-based models perform better on our task, and we achieve an F1 score of 0.73 with MuRIL embeddings.
