Publication Date

Spring 5-28-2015

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

T. Y. Lin

Second Advisor

Sami Khuri

Third Advisor

Howard Ho

Abstract

Data mining is necessary to make fruitful use of big data. There are two well-known methods: one is based on the Apriori principle, and the other is based on the FP-tree. In this project we explore a new approach based on the simplicial complex, a combinatorial form of polyhedron used in algebraic topology. Like the FP-tree, our approach is top-down; at the same time, it is based on the Apriori principle in geometric form, called the closed condition of a simplicial complex. Our method is almost 300 times faster than FP-growth on a real-world database, measured on an SJSU laptop. The database, provided by the hospital of National Taiwan University, has 65,536 transactions and 1,257 columns in bit form. Our major work is mining concepts from big text data; this project is the core engine of the concept-based semantic search engine.
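
The link between the Apriori principle and the closed condition can be illustrated with a small sketch. It is not the project's algorithm: it only checks, by brute force on invented toy data, that the family of frequent itemsets contains every non-empty subset (face) of each of its members, which is the property a simplicial complex requires. The column values, the names `support`, `frequent_itemsets`, and `is_closed`, and the threshold of 3 are all assumptions made for illustration. Columns are stored in bit form, so the support of an itemset is the number of 1 bits in the AND of its columns.

```python
from itertools import combinations

# Hypothetical toy data: 3 item columns over 6 transactions, in bit form.
# Bit t of a column is 1 when transaction t contains that item.
columns = {
    "a": 0b111011,
    "b": 0b110110,
    "c": 0b011011,
}


def support(itemset):
    """Number of transactions containing every item in the itemset."""
    items = list(itemset)
    mask = columns[items[0]]
    for item in items[1:]:
        mask &= columns[item]  # keep only transactions that have all items
    return bin(mask).count("1")


def frequent_itemsets(min_support):
    """Brute-force enumeration; adequate only for toy data."""
    names = sorted(columns)
    found = set()
    for size in range(1, len(names) + 1):
        for combo in combinations(names, size):
            if support(combo) >= min_support:
                found.add(frozenset(combo))
    return found


def is_closed(family):
    """Closed condition: every non-empty proper subset of a member is a member."""
    for simplex in family:
        for size in range(1, len(simplex)):
            for face in combinations(sorted(simplex), size):
                if frozenset(face) not in family:
                    return False
    return True


frequent = frequent_itemsets(min_support=3)
print(sorted("".join(sorted(s)) for s in frequent))
print(is_closed(frequent))
```

With these toy columns the frequent itemsets are {a}, {b}, {c}, {a, b}, and {a, c}. Read geometrically, the items are vertices and the two pairs are edges, and the family is closed because both endpoints of each edge are themselves frequent.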
