Publication Date
Fall 2020
Degree Type
Thesis - Campus Access Only
Degree Name
Master of Science (MS)
Department
Computer Engineering
Advisor
Mahima Agumbe Suresh
Keywords
aspect based sentiment analysis, domain specific, machine learning, natural language processing, semantics, word embeddings
Subject Areas
Computer engineering; Computer science; Linguistics
Abstract
Customer reviews are a rich, abundant source of valuable information that could predict commercial success or failure. Product designers, in particular, benefit significantly if they can better understand customer requirements. Through aspect-based sentiment analysis, we can analyze large amounts of online reviews for customer sentiment towards specific product features or components. However, most such machine learning models require large amounts of aspect-annotated data to train before commercial use is viable. Further, product design is a very industry-specific process. Any algorithm attempting to learn such a product's features will need to train on semantically similar data. These dependencies pose challenges since domain-specific data for a particular product could be extremely hard to find. On the other hand, a machine learning practitioner may wonder whether gathering hard-to-come-by text data discussing a limited set of topics is worth the time and resources it takes; after all, machine learning algorithms trained to generalize across different distributions of data are more robust. In the interest of thoroughness, we gathered large amounts of text data from various generic, domain-related, and topic-specific sources before conducting extensive experimentation on model training. We then compared the results of models trained on the various text data distributions across three different product categories. Our findings clearly show the advantages of gathering text data that are semantically similar to the data we ultimately analyze and evaluate, even if the latter's domain cannot be exactly matched by the former. We also gained valuable insights along the way that could help machine learning practitioners in this field make informed decisions when designing systems.
Recommended Citation
Mokadam, Aashay, "A Study on the Contributions of Domain-Specific Semantics Towards Aspect-Based Sentiment Analysis" (2020). Master's Theses. 5155.
DOI: https://doi.org/10.31979/etd.5tn4-xapw
https://scholarworks.sjsu.edu/etd_theses/5155