Master of Science (MS)
Yioop, Web Search Engine, Result Caching, Static-Topic- Dynamic Cache
Caches are the most effective mechanism utilized by web search engines to optimize the performance of search queries. Search engines employ caching at multiple levels to improve its performance, for example, caching posting list and caching result set. Caching query results reduces overhead of processing frequent queries and thus saves a lot of time and computing power. Yioop is an open-source web search engine which utilizes result cache to optimize searches. The current implementation utilizes a single dynamic cache based on Marker’s algorithm. The goal of the project is to improve the performance of cache in Yioop. To choose a new caching system, Static-Dynamic cache along with its different variations Machine Learning Static- Dynamic Cache, Static-Semistatic-Dynamic Cache, and Static-Topic-Dynamic Cache were evaluated. Based on these experiments, Static-Topic-Dynamic was implemented in Yioop. Static-Dynamic cache exploits temporal locality by dividing cache into a static part which stores most popular queries and a dynamic part which captures the bursty behavior of queries. Static-Topic-Dynamic adds topical cache section in Static-Dynamic Cache which captures queries that are neither too popular to be in static cache nor too frequent to be in dynamic cache by creating dedicated cache for each topic. To extract topic from the queries, 𝑘-means algorithm was chosen as topic model. The results of Static-Dynamic Cache and Static-Topic-Dynamic cache showed the improvement of 2.3% and 1% over the initial performance of the cache.
Padia, Rushikesh, "Robust Cache System for Web Search Engine Yioop" (2023). Master's Projects. 1258.