Author

Farhan Ansari

Publication Date

Spring 2025

Degree Type

Master's Project

Degree Name

Master of Science in Computer Science (MSCS)

Department

Computer Science

First Advisor

Navrati Saxena

Second Advisor

William Andreopoulos

Third Advisor

Aditya Kulkarni

Keywords

Sentiment Analysis, CNN, Bi-LSTM, Tokenization, Twitter, Bagging, Boosting, Stacking.

Abstract

The widespread use of social media platforms has amplified the expression of public opinion on the Internet in languages such as English, Hindi, and Spanish. Advances in machine learning make it possible to analyze opinions posted online and gauge public sentiment. Organizations and businesses are interested in evaluating these sentiments because this type of data can be used to assess opinion about a product, a restaurant, a candidate, and so on. In this study, we perform a comparative analysis of three popular ensemble learning methods (boosting, bagging, and stacking) built on multiple base learner models (Bidirectional Long Short-Term Memory, Support Vector Machine, Convolutional Neural Network, Gated Recurrent Unit, and Recurrent Neural Network) for sentiment analysis. Publicly available multilingual datasets are used to measure the effectiveness of the models. The results of the comparison show that the stacking ensemble method outperforms bagging and boosting at identifying the correct sentiment in multilingual data.

Available for download on Tuesday, May 26, 2026
