Master of Science (MS)
Fabio Di Troia
Indian Hate Speech, fastText, GloVe, distilBERT, MuRIL
Social media is a great place to share one’s thoughts and to express oneself. Very often, however, the same social media platforms become a means for spewing hatred. The large amount of data shared on these platforms makes it difficult to moderate user content. In a diverse country like India, hate is present on social media in all regional languages, which makes it even harder to detect because there is not enough data to train machine learning and deep learning models to understand these languages. This work is our attempt at tackling hate speech in Hindi. We experiment with embeddings such as fastText and GloVe combined with machine learning classifiers such as logistic regression and decision trees. We also experiment with transformer-based embeddings such as distilBERT and MuRIL. The transformer-based models perform better on our task, and we achieve an F1 score of 0.73 with MuRIL embeddings.
Bansod, Pranjali Prakash, "Hate Speech Detection in Hindi" (2023). Master's Projects. 1265.