Publication Date

2007

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

Abstract

These days blogs are becoming increasingly popular because it allows anyone to share their personal diary, opinions, and comments on the World Wide Wed. Many blogs contain valuable information, but it is a difficult task to extract this information from a high number of blog comments. The goal is to analyze a high number of blog comments by clustering all blog comments by their similarity based on keyword relevance into smaller groups. TF-IDF weight has been used in classifying documents by measuring appearance frequency of each keyword in a document, but it is not effective in differentiating semantic similarities between words. By applying fuzzy semantic to TF-IDF, TF-IDF becomes fuzzy TF-IDF and has the ability to rank semantic relevancy. Fuzzy VSM can be effective in exploring hidden relationship between blog comments by adapting fuzzy TF-IDF and fuzzy semantic for extending Vector Space Model to fuzzy VSM. Therefore, fuzzy VSM can cluster a high number of blog comments into small number of groups based on document similarity and semantic relevancy.

Share

COinS