Publication Date

2006

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

Abstract

The Web contains a massive number of documents from across the globe, to the point where classifying them manually has become impossible. This project’s goal is to find a new method for clustering documents that comes as close to human classification as possible while also reducing the size of the document data. The project combines Latent Semantic Indexing (LSI), computed with Singular Value Decomposition (SVD), and Support Vector Machine (SVM) classification. With SVD, data sets are decomposed and can be truncated to reduce their size. The reduced data set is then used for clustering. With SVM, the clustered data set is used for training so that new data can be classified based on the SVM’s prediction. The project’s results show that the method of combining SVD and SVM is able to reduce data size and classify documents reasonably well compared to human classification.
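
The SVD truncation step described in the abstract can be illustrated with a minimal sketch. It is not the project's implementation: the toy term-document matrix, the choice of k = 2, and the helper names `truncated_svd`, `fold_in`, and `cosine` are all assumptions made for illustration, and the clustering and SVM training steps are omitted.

```python
import numpy as np

# Hypothetical toy term-document matrix (not from the project):
# rows are terms, columns are documents.
# Documents 0-1 use "computing" terms, documents 2-3 use "cooking" terms.
A = np.array([
    [2.0, 1.0, 0.0, 0.0],  # "cluster"
    [1.0, 2.0, 0.0, 0.0],  # "vector"
    [0.0, 0.0, 2.0, 1.0],  # "recipe"
    [0.0, 0.0, 1.0, 2.0],  # "oven"
    [1.0, 1.0, 1.0, 1.0],  # "the" (appears in every document)
])


def truncated_svd(matrix, k):
    """Return the k largest singular triplets of matrix (U_k, s_k, Vt_k)."""
    U, s, Vt = np.linalg.svd(matrix, full_matrices=False)
    return U[:, :k], s[:k], Vt[:k, :]


def fold_in(doc_vector, U_k, s_k):
    """Project a term vector into the reduced space: d^T U_k S_k^{-1}."""
    return (doc_vector @ U_k) / s_k


def cosine(a, b):
    """Cosine similarity between two vectors."""
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))


k = 2  # assumed number of retained dimensions
U_k, s_k, Vt_k = truncated_svd(A, k)

# Each document is now described by k coordinates instead of 5 term weights.
doc_coords = Vt_k.T

# A new, unseen document that uses only "computing" terms and "the".
new_doc = np.array([1.0, 1.0, 0.0, 0.0, 1.0])
new_coords = fold_in(new_doc, U_k, s_k)

sim_to_computing = cosine(new_coords, doc_coords[0])
sim_to_cooking = cosine(new_coords, doc_coords[2])
```

In this sketch the rows of `doc_coords` are the reduced representations that a clustering step could consume, and `fold_in` shows one common way to map a new document into the same space before handing it to a trained classifier.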

Recommended Citation

Ngo, Tam P., "Clustering High Dimensional Data Using SVM" (2006). Master's Projects. 33.
DOI: https://doi.org/10.31979/etd.ns2s-ejvc
https://scholarworks.sjsu.edu/etd_projects/33

Included in

Computer Sciences Commons


Master's Projects

Clustering High Dimensional Data Using SVM
