Publication Date
Spring 2016
Degree Type
Master's Project
Degree Name
Master of Science (MS)
Department
Computer Science
First Advisor
T. Y. Lin
Second Advisor
Robert Chun
Third Advisor
Eric Louie
Keywords
Concept Web Search Engine
Abstract
Data on the internet is growing exponentially. There are billions of documents on the World Wide Web, and each document contains multiple concepts (a concept being an abstract or general idea inferred from specific instances).
In this paper, we show how we created and implemented an algorithm for extracting concepts from a set of documents. A search engine can use these concepts to generate results that cater to the needs of the user. The search results will then be more targeted than those of the usual keyword search.
The main problem was to extract concepts from a set of documents. Each page could have thousands of combinations that are potential concepts, and an average document could have millions of concepts. Combined with the vast amount of data on the web, this amounts to an enormous dataset. As a result, the main areas of concern are main memory constraints and the time complexity of the algorithm.
This paper introduces an algorithm that is scalable, is independent of main memory, and has linear time complexity.
Recommended Citation
Rastogi, Aishwarya, "Concept Based Search Engine: Concept Creation" (2016). Master's Projects. 462.
DOI: https://doi.org/10.31979/etd.b8xv-3u8u
https://scholarworks.sjsu.edu/etd_projects/462