Publication Date

Fall 2014

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

T. Y. Lin

Second Advisor

Suneuy Kim

Third Advisor

Eric Louie

Abstract

The internet is a vast collection of billions of web pages, containing terabytes of information and hosted as HTML across thousands of servers. The sheer size of this collection is a formidable obstacle to retrieving relevant information, which has made search engines an important part of our lives. Search engines strive to retrieve the information most relevant to the user. One of the building blocks of a search engine is the web crawler: a bot that traverses the internet, collecting data and storing it in a database for further analysis and arrangement.

The project aims to create a smart web crawler for a concept based semantic search engine. The crawler aims not only to crawl the World Wide Web and bring back data, but also to perform an initial analysis that identifies unnecessary data before the data is stored. We aim to improve the efficiency of the Concept Based Semantic Search Engine by using the smart crawler.

Recommended Citation

Kancherla, Vinay, "A Smart Web Crawler for a Concept Based Semantic Search Engine" (2014). Master's Projects. 380.
DOI: https://doi.org/10.31979/etd.ubfy-s3es
https://scholarworks.sjsu.edu/etd_projects/380

Included in

Databases and Information Systems Commons
