Code Reviews on a Budget: Memory-Efficient Fine-Tuning with QLoRA and RAG for Big Code Applications
Publication Date
1-1-2026
Document Type
Conference Proceeding
Publication Title
Lecture Notes in Computer Science
Volume
16324 LNCS
DOI
10.1007/978-3-032-14107-1_31
First Page
367
Last Page
381
Abstract
In this technological era, where Artificial Intelligence and Machine Learning are revolutionizing various domains, Large Language Models (LLMs) are emerging as powerful tools for managing and analyzing large-scale data, including software codebases. In software development, reliable code reviews are essential for ensuring code security, maintaining quality, and managing large code repositories. This paper surveys existing methodologies for building efficient code review automation agents and investigates the suitability of various methods for fine-tuning open-source models in the context of code review automation. Parameter-efficient fine-tuning (PEFT) methodologies, such as Low-Rank Adaptation (LoRA) and Quantized Low-Rank Adaptation (QLoRA), are explored, with an additional focus on a hybrid model that combines QLoRA with Retrieval Augmented Generation (RAG) to reduce the memory required for fine-tuning without degrading inference quality. The experiments use a general-purpose LLM, specifically Meta’s Llama 3.2 3B model. Results show that the hybrid approach reduces memory utilization by nearly 17% while achieving low entropy values, and that it outperforms baseline systems in both efficiency and inference stability, highlighting the potential of this hybrid technique for real-world code review automation.
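To give a sense of why LoRA-style adapters cut fine-tuning memory, the toy sketch below (not the paper's implementation) counts trainable parameters for a single square weight matrix when the base weights are frozen and only two low-rank factors are trained. The dimension 3072 and rank 8 are illustrative assumptions, not values reported in the paper.

```python
# Toy illustration of LoRA's parameter savings for one d x d layer.
# Full fine-tuning updates all d*d weights; a LoRA adapter freezes them
# and trains only A (d x r) and B (r x d), i.e. 2*d*r parameters.

def trainable_params(d: int, r: int) -> tuple[int, int]:
    """Return (full fine-tuning params, LoRA adapter params) for a d x d layer."""
    full = d * d      # every weight is trainable
    lora = 2 * d * r  # only the low-rank factors A and B are trainable
    return full, lora

full, lora = trainable_params(d=3072, r=8)  # hypothetical hidden size and rank
print(f"full: {full:,}  lora: {lora:,}  ratio: {lora / full:.4%}")
```

At rank 8 the adapter trains roughly half a percent of the layer's parameters; QLoRA pushes memory further by also storing the frozen base weights in 4-bit precision.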
Keywords
Code Review Automation, Large Language Models, LoRA, QDyLoRA, QLoRA, Retrieval Augmented Generation
Department
Computer Science
Recommended Citation
Sumukh Naveen Aradhya, Melody Moh, and Teng Sheng Moh. "Code Reviews on a Budget: Memory-Efficient Fine-Tuning with QLoRA and RAG for Big Code Applications" Lecture Notes in Computer Science (2026): 367-381. https://doi.org/10.1007/978-3-032-14107-1_31