Author

Sakshi Garg

Publication Date

Spring 2025

Degree Type

Master's Project

Degree Name

Master of Science in Computer Science (MSCS)

Department

Computer Science

First Advisor

William Andreopoulos

Second Advisor

Genya Ishigaki

Third Advisor

Thomas Austin

Keywords

Retrieval-Augmented Generation, Knowledge Graph, MultiHop-RAG, Vector Database, Summarization, Semantic Chunking, Question Answering, Large Language Models (LLMs).

Abstract

With vast amounts of information on the internet spread across many lengthy documents, finding relevant information has become both more important and more challenging. The goal of this project is to develop advanced techniques for retrieving information from long texts, delivering accurate and relevant results while remaining fast and efficient. This paper presents a custom Retrieval-Augmented Generation (RAG) framework designed to improve contextual retrieval in long-document and multi-document settings and to address the particular difficulties posed by large and complex documents. We employ techniques such as summarization, semantic chunking, vectorization, and knowledge graph construction to enhance query understanding and reasoning. We use the MultiHop-RAG dataset to evaluate multi-hop retrieval and question-answering scenarios in which the evidence for a query is distributed across multiple documents.

Available for download on Monday, May 25, 2026
