Publication Date

Spring 2016

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Melody Moh

Second Advisor

Teng Moh

Third Advisor

John Cribbs

Keywords

Text Classification Sentiment Analysis

Abstract

Digital information available on the Internet is increasing day by day. As a result of this, the demand for tools that help people in finding and analyzing all these resources are also growing in number. Text Classification, in particular, has been very useful in managing the information. Text Classification is the process of assigning natural language text to one or more categories based on the content. It has many important applications in the real world. For example, finding the sentiment of the reviews, posted by people on restaurants, movies and other such things are all applications of Text classification. In this project, focus has been laid on Sentiment Analysis, which identifies the opinions expressed in a piece of text. It involves categorizing opinions in text into categories like 'positive' or 'negative'. Existing works in Sentiment Analysis focused on determining the polarity (Positive or negative) of a sentence. This comes under binary classification, which means classifying the given set of elements into two groups. The purpose of this research is to address a different approach for Sentiment Analysis called Multi Class Sentiment Classification. In this approach the sentences are classified under multiple sentiment classes like positive, negative, neutral and so on. Classifiers are built on the Predictive Model, that consists of multiple phases. Analysis of different sets of features on the data set, like stemmers, n-grams, tf-idf and so on, will be considered for classification of the data. Different classification models like Bayesian Classifier, Random Forest and SGD classifier are taken into consideration for classifying the data and their results are compared. Frameworks like Weka, Apache Mahout and Scikit are used for building the classifiers.

Recommended Citation

Gangireddy, Siva Charan Reddy, "Supervised Learning for Multi-Domain Text Classification" (2016). Master's Projects. 483.
DOI: https://doi.org/10.31979/etd.yvh3-uzqb
https://scholarworks.sjsu.edu/etd_projects/483

Download

Included in

Artificial Intelligence and Robotics Commons

COinS

DOI

https://doi.org/10.31979/etd.yvh3-uzqb

Master's Projects

Supervised Learning for Multi-Domain Text Classification

Publication Date

Degree Type

Degree Name

Department

First Advisor

Second Advisor

Third Advisor

Keywords

Abstract

Recommended Citation

Included in

DOI

Search

Browse All

Links

Master's Projects

Supervised Learning for Multi-Domain Text Classification

Author

Publication Date

Degree Type

Degree Name

Department

First Advisor

Second Advisor

Third Advisor

Keywords

Abstract

Recommended Citation

Included in

Share

DOI

Search

Browse All

Links