Pradeep Roy

Publication Date

Spring 2014

Degree Type

Master's Project


Computer Science


Search engines are today the most relied-upon means of finding information on the World Wide Web (W3). With the advent of Big Data, traditional search engines are becoming inadequate at returning relevant pages. It has become increasingly difficult to locate meaningful results in the long lists typically returned for a search query. Keywords alone often cannot capture the intended concept with high precision. These and related issues with current search engines call for a more powerful and holistic search capability. This project presents a new approach to this widely relevant problem: a concept-based search engine. It is known that a collection of concepts naturally forms a polyhedron. Combinatorial topology is therefore used to manipulate the polyhedron of concepts mined from W3. Based on this triangulated polyhedron, the concepts are clustered around primitive concepts, which geometrically are simplexes of maximal dimension. Such clustering differs from conventional clustering in that the clusters of the proposed model may overlap. Based on this clustering, the search results can be categorized, and users can select the category best suited to their needs. The displayed results follow this categorization, yielding more tightly grouped and thus semantically related, relevant information.
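The clustering idea described above can be illustrated with a minimal sketch. It is not the project's implementation: it assumes, purely for illustration, that each document has already been reduced to a set of keywords, that each such set is treated as a simplex, and that a cluster is formed around every maximal simplex (a keyword set not properly contained in any other). The names `maximal_simplexes`, `cluster_documents`, and the sample data `docs` are hypothetical.

```python
def maximal_simplexes(simplexes):
    """Return the simplexes that are not proper faces of any other simplex."""
    unique = {frozenset(s) for s in simplexes}
    return {s for s in unique if not any(s < t for t in unique)}


def cluster_documents(doc_keywords):
    """Group documents under every maximal simplex that contains their keyword set.

    doc_keywords: dict mapping a document id to its set of keywords
    (an assumed input format, not taken from the project).
    """
    maximal = maximal_simplexes(doc_keywords.values())
    clusters = {m: set() for m in maximal}
    for doc, keywords in doc_keywords.items():
        face = frozenset(keywords)
        for m in maximal:
            # A document joins every cluster whose maximal simplex has
            # its keyword set as a face, so clusters may overlap.
            if face <= m:
                clusters[m].add(doc)
    return clusters


# Hypothetical toy data: four documents and their mined keywords.
docs = {
    "d1": {"simplicial", "complex", "topology"},
    "d2": {"simplicial", "complex"},
    "d3": {"complex", "analysis"},
    "d4": {"complex"},
}

clusters = cluster_documents(docs)
for simplex, members in clusters.items():
    print(sorted(simplex), "->", sorted(members))
```

In this toy data, document `d4` carries only the keyword "complex", which is a face of both maximal simplexes, so it lands in both clusters. This is the kind of overlap that distinguishes the approach from conventional partition-style clustering.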