Publication Date

Fall 2024

Degree Type

Master's Project

Degree Name

Master of Science in Computer Science (MSCS)

Department

Computer Science

First Advisor

William Andreopoulos

Second Advisor

Robert Chun

Third Advisor

Vidya Rangasayee

Keywords

Code Documentation, Large Language Models, Llama2, Llama3, CodeLlama, PEFT, LoRA, QLoRA

Abstract

In the fast-moving world of software development, maintaining code quality and developer productivity is crucial, and good code documentation supports both. SDEBuddy uses the latest generation of Large Language Models (LLMs) and fine-tuning procedures to generate code documentation. In this project, state-of-the-art models such as Llama2 and Llama3 are employed to interpret given code and produce its documentation. The models are adapted to various programming languages and documentation formats using the LoRA and QLoRA fine-tuning approaches, and are evaluated with the BLEU score, the ROUGE score, and manual assessment. The project also examines how different fine-tuning techniques affect model performance. The system aims to streamline the documentation process so that developers can focus more on development and less on documentation. Finally, a custom metric is proposed to capture the effect of fine-tuning on these models; it considers the model's effectiveness, the resources required, and the quality of the generated documentation to measure the potential benefits and drawbacks of fine-tuning.
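The low-rank update at the heart of the LoRA approach mentioned above can be sketched in a few lines. This is a minimal NumPy illustration, not the project's implementation: the dimensions, rank, and scaling factor below are hypothetical toy values, and real Llama layers are orders of magnitude larger.

```python
import numpy as np

# Hypothetical toy dimensions for illustration; r is the LoRA rank, r << d_in.
d_out, d_in, r = 8, 8, 2

rng = np.random.default_rng(0)
W = rng.standard_normal((d_out, d_in))      # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01   # trainable down-projection
B = np.zeros((d_out, r))                    # trainable up-projection, zero-init

def lora_forward(x, alpha=16):
    # Output = frozen path + scaled low-rank update (B @ A) applied to x
    return W @ x + (alpha / r) * (B @ (A @ x))

x = rng.standard_normal(d_in)
# Because B starts at zero, the adapted layer initially matches the frozen one.
assert np.allclose(lora_forward(x), W @ x)

# Only A and B are trained: r*(d_in + d_out) parameters instead of d_in*d_out.
print(A.size + B.size, "trainable vs", W.size, "frozen")
```

Because only the small matrices A and B receive gradients, the memory and compute cost of fine-tuning drops sharply relative to full fine-tuning; QLoRA pushes this further by keeping the frozen weights in a quantized format.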

Available for download on Wednesday, December 31, 2025
