Ever since Twitter has been widely accepted and has become an immensely popular micro blogging website, it is being used as a primary source of news; be it related to sports, entertainment, politics or technology by several users. It has been proven earlier that the elimination of stop words has a positive impact on the clustering of technology related tweets. The focus of this paper is to enhance the quality of clustering of the technology related Tweets by developing a semi-automated approach to eliminating stop words and by making use of a combination of Canopy and K-means clustering algorithms. The paper also details an algorithmic approach to determine the threshold values for Canopy clustering.
Gopal, Ananth, "Enhanced Clustering of Technology Tweets" (2013). Master's Projects. 349.