Master of Science (MS)
T. Y. Lin
Making fruitful use of big data requires data mining. Two well-known methods exist: one is based on the apriori principle, and the other on the FP-tree. In this project we explore a new approach based on the simplicial complex, a combinatorial form of the polyhedron used in algebraic topology. Like the FP-tree, our approach is top-down; at the same time, it is based on the apriori principle in geometric form, called the closed condition in a simplicial complex. Our method is almost 300 times faster than FP-growth on a real-world database, run on an SJSU laptop. The database was provided by the hospital of National Taiwan University; it has 65536 transactions and 1257 columns in bit form. Our major work is mining concepts from big text data; this project is the core engine of the concept-based semantic search engine.
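As a rough illustration of how the apriori principle corresponds to the closed condition, the sketch below treats each frequent itemset as a simplex and checks that every face of a simplex (the itemset with one item removed) is also frequent. It is not the project's algorithm: the enumeration is brute force, and the toy transactions, the threshold, and all function names are assumptions made for this example. Only the bit-form row encoding follows the abstract.

```python
from itertools import combinations

def support(transactions, itemset):
    """Count the rows (bitmask integers) that contain every item in itemset."""
    mask = 0
    for item in itemset:
        mask |= 1 << item
    return sum(1 for row in transactions if (row & mask) == mask)

def frequent_itemsets(transactions, n_items, min_support):
    """Brute-force enumeration of frequent itemsets; suitable for toy sizes only."""
    found = set()
    for size in range(1, n_items + 1):
        for combo in combinations(range(n_items), size):
            if support(transactions, combo) >= min_support:
                found.add(frozenset(combo))
    return found

def faces(simplex):
    """Faces obtained by dropping one vertex; a single vertex has none."""
    if len(simplex) <= 1:
        return []
    return [simplex - {v} for v in simplex]

def satisfies_closed_condition(simplices):
    """True when every face of every simplex is itself in the collection."""
    return all(face in simplices for s in simplices for face in faces(s))

# Hypothetical toy data: 5 transactions over 4 items, bit i set means item i is present.
transactions = [0b0111, 0b0111, 0b0011, 0b1100, 0b0101]
frequent = frequent_itemsets(transactions, n_items=4, min_support=2)
closed = satisfies_closed_condition(frequent)
```

On this toy data the frequent itemsets form a filled triangle on items 0, 1 and 2, so `closed` is true. A collection such as `{frozenset({0, 1})}` alone would fail the check, because its vertices are missing.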
Yang, Jingjing, "MINING CONCEPT IN BIG DATA" (2015). Master's Projects. 411.