Publication Date
Spring 2025
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
Navrati Saxena
Second Advisor
William Andreopoulos
Third Advisor
Aditya Kulkarni
Keywords
Sentiment Analysis, CNN, Bi-LSTM, Tokenization, Twitter, Bagging, Boosting, Stacking.
Abstract
The widespread use of social media platforms has amplified the expression of public opinion on the Internet in languages such as English, Hindi and Spanish. With the aid of advances in machine learning, we can analyze opinions posted online and gauge public sentiment. Organizations and businesses are interested in evaluating these sentiments because this type of data can be used to obtain opinions about a product, a restaurant, a candidate, etc. In this study, we perform a comparative analysis of three popular ensemble learning methodologies (Boosting, Bagging and Stacking) built on multiple base learner models (Bidirectional Long Short-Term Memory, Support Vector Machine, Convolutional Neural Network, Gated Recurrent Unit, Recurrent Neural Network) for sentiment analysis. Publicly available multilingual datasets are used to measure the effectiveness of the models. The comparative study shows that the stacking ensemble method outperforms bagging and boosting at identifying the correct sentiment in multilingual data.
Recommended Citation
Ansari, Farhan, "Multilingual Sentiment Analysis Using Ensemble Learning" (2025). Master's Projects. 1565.
DOI: https://doi.org/10.31979/etd.ueb4-dngs
https://scholarworks.sjsu.edu/etd_projects/1565