Lightweight Relevance Grader in RAG

Publication Date

1-1-2025

Document Type

Conference Proceeding

Publication Title

Proceedings - 2025 8th International Conference on Information and Computer Technologies, ICICT 2025

DOI

10.1109/ICICT64582.2025.00037

First Page

198

Last Page

203

Abstract

Retrieval-augmented generation (RAG) addresses limitations of large language models (LLMs) by leveraging a vector database to provide more accurate and up-to-date information. When a user submits a query, RAG executes a vector search to find relevant documents, which are then used to generate a response. However, ensuring the relevance of retrieved documents remains a challenge. To address this, a secondary model, known as a relevance grader, is used to verify document relevance. To reduce computational requirements, a lightweight small language model can serve as the relevance grader. This work improves the capability of such a model, achieving a significant increase in precision from 0.1038 to 0.7750 using llama-3.2-1b, outperforming llama-3.1-70b and gpt-4o-mini. Our code is available at https://github.com/taeheej/Lightweight-Relevance-Grader-in-RAG.
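The pipeline the abstract describes (vector search, then a relevance-grading pass before generation) can be sketched as follows. This is a minimal illustration, not the paper's implementation: the retriever and the grader are stood in for by simple word-overlap heuristics, whereas the paper uses embedding-based vector search and a fine-tuned llama-3.2-1b as the grader. All function names and the `0.5` threshold are assumptions for the sketch.

```python
def vector_search(query: str, corpus: list[str], k: int = 3) -> list[str]:
    # Placeholder retriever: rank documents by word overlap with the query.
    # A real RAG system would embed the query and query a vector database.
    q = set(query.lower().split())
    ranked = sorted(corpus, key=lambda d: -len(q & set(d.lower().split())))
    return ranked[:k]

def grade_relevance(query: str, doc: str) -> bool:
    # Placeholder relevance grader returning a binary judgment.
    # In the paper, a lightweight LLM (fine-tuned llama-3.2-1b) plays this
    # role, filtering out retrieved documents that do not match the query.
    q = set(query.lower().split())
    overlap = len(q & set(doc.lower().split())) / max(len(q), 1)
    return overlap >= 0.5  # assumed cutoff for this toy heuristic

def retrieve_relevant(query: str, corpus: list[str], k: int = 3) -> list[str]:
    # Retrieve candidates, then keep only those the grader accepts;
    # the surviving documents would be passed to the LLM for generation.
    return [d for d in vector_search(query, corpus, k)
            if grade_relevance(query, d)]

corpus = [
    "retrieval augmented generation uses a vector database",
    "bananas are a yellow fruit",
    "vector search finds documents similar to the query",
]
docs = retrieve_relevant("vector database retrieval", corpus)
```

The point of the second stage is precision: vector search alone returns the top-k nearest documents regardless of whether they actually answer the query, and the grader discards near-misses before they pollute the generation context.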

Keywords

Fine-tuning, Large language models, Relevance grader, Retrieval-augmented generation, Vector database, Vector search

Department

Applied Data Science; Computer Engineering
