Publication Date

Spring 2016

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

T. Y. Lin

Second Advisor

Jon Pearce

Third Advisor

Thomas Austin

Keywords

Web Mining, Clustering, Homology

Abstract

As more and more data is mined from the Internet, data science has become an important field of computing for making that data useful. It allows people to turn raw data into structured knowledge that is easy to use, validate, and understand. Many established methods exist for analyzing data, but this project focuses on a recently introduced one: analyzing text data with homology, a tool from mathematics, to understand the relationships between keyword-sets.

Taking structures from algebraic topology as a starting point, the project represents each keyword-set in the text as a simplex determined by its keywords and its length. These simplexes combine into clustered simplicial complexes, which lay the groundwork for homology. Computing the homology of these simplicial complexes gives a clearer picture of the relations between keyword-sets. Previous work on analyzing text data through homology established these relationships in real space; this project extends that work to integer space so that the homology can reveal more detail about those relationships.
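
As an illustration of the construction described in the abstract, here is a minimal Python sketch. It is not the project's implementation. It assumes that a keyword-set of n keywords is treated as an (n-1)-simplex together with all of its non-empty subsets, and that "real space" and "integer space" refer to the coefficients used for homology. The function names (`closure`, `boundary_matrix`, `rank`, `betti_numbers`) and the sample keywords are hypothetical. The sketch computes Betti numbers from boundary-matrix ranks over the rationals, which give the same ranks as real coefficients. Integer homology can additionally contain torsion, which would require reducing the integer boundary matrices (for example to Smith normal form) and is not shown here.

```python
from fractions import Fraction
from itertools import combinations


def closure(keyword_sets):
    """Return {dimension: sorted simplices}, closing each keyword-set under subsets."""
    simplices = set()
    for ks in keyword_sets:
        vertices = tuple(sorted(set(ks)))
        for r in range(1, len(vertices) + 1):
            simplices.update(combinations(vertices, r))
    by_dim = {}
    for s in simplices:
        by_dim.setdefault(len(s) - 1, []).append(s)
    return {d: sorted(ss) for d, ss in by_dim.items()}


def boundary_matrix(faces, simplices):
    """Integer boundary matrix: rows index the faces, columns index the simplices."""
    index = {f: i for i, f in enumerate(faces)}
    mat = [[0] * len(simplices) for _ in faces]
    for j, s in enumerate(simplices):
        for k in range(len(s)):
            face = s[:k] + s[k + 1:]          # drop the k-th vertex
            mat[index[face]][j] = (-1) ** k   # alternating sign
    return mat


def rank(mat):
    """Matrix rank over the rationals, by exact Gaussian elimination."""
    rows = [[Fraction(x) for x in r] for r in mat]
    ncols = len(rows[0]) if rows else 0
    rk = 0
    for c in range(ncols):
        pivot = next((i for i in range(rk, len(rows)) if rows[i][c] != 0), None)
        if pivot is None:
            continue
        rows[rk], rows[pivot] = rows[pivot], rows[rk]
        for i in range(rk + 1, len(rows)):
            if rows[i][c] != 0:
                factor = rows[i][c] / rows[rk][c]
                rows[i] = [a - factor * b for a, b in zip(rows[i], rows[rk])]
        rk += 1
    return rk


def betti_numbers(keyword_sets):
    """Betti numbers b_0, b_1, ... of the complex built from the keyword-sets."""
    cx = closure(keyword_sets)
    top = max(cx)
    ranks = {d: rank(boundary_matrix(cx[d - 1], cx[d])) for d in range(1, top + 1)}
    # b_d = (number of d-simplices) - rank(boundary_d) - rank(boundary_{d+1})
    return [len(cx[d]) - ranks.get(d, 0) - ranks.get(d + 1, 0) for d in range(top + 1)]


# Three keyword pairs that co-occur pairwise but never all together:
hollow = [("data", "mining"), ("mining", "web"), ("data", "web")]
print(betti_numbers(hollow))   # [1, 1]

# Adding the three-keyword set fills in the triangle and removes the loop:
filled = hollow + [("data", "mining", "web")]
print(betti_numbers(filled))   # [1, 0, 0]
```

In this toy example the first Betti number distinguishes three keywords that only appear in pairs (one loop) from three keywords that also appear together as a single keyword-set (no loop).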

Recommended Citation

Nam, Eric, "Analyzing Clustered Web Concepts with Homology" (2016). Master's Projects. 496.
DOI: https://doi.org/10.31979/etd.bg5q-z2x8
https://scholarworks.sjsu.edu/etd_projects/496

Included in

Databases and Information Systems Commons
