Publication Date
Fall 2024
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
William Andreopoulos
Second Advisor
Robert Chun
Third Advisor
Vidya Rangasayee
Keywords
Code Documentation, Large Language Models, Llama2, Llama3, CodeLlama, PEFT, LoRA, QLoRA
Abstract
In the fast-moving world of software development, maintaining code quality and developer productivity is crucial, and good code documentation contributes to both. SDEBuddy uses the latest generation of Large Language Models (LLMs) and fine-tuning procedures to generate code documentation. In this project, state-of-the-art models such as Llama2 and Llama3 are employed to understand the behavior of the given code and produce documentation. These models are adapted to various programming languages and documentation formats using the LoRA and QLoRA fine-tuning approaches, and they are evaluated with BLEU and ROUGE scores as well as manual assessment. The project also examines how different fine-tuning techniques affect model performance. The system aims to streamline the documentation process so that developers can focus more on development and less on documentation. Finally, a custom metric is proposed to capture the effect of fine-tuning on these models; it combines the model's effectiveness, the resources required, and the quality of the generated documentation to measure the potential benefits and drawbacks of fine-tuning.
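To make the fine-tuning approach named in the abstract and keywords more concrete, the sketch below shows one common way to attach a QLoRA adapter to a code-oriented Llama checkpoint with the Hugging Face PEFT library. The checkpoint name, LoRA hyperparameters, and target modules are illustrative assumptions, not the project's actual configuration.

```python
# Minimal QLoRA setup sketch (assumed configuration, not the project's exact recipe).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

base_model = "codellama/CodeLlama-7b-hf"  # assumed base checkpoint

# 4-bit quantization (the "Q" in QLoRA) keeps the frozen base weights small in memory.
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.bfloat16,
)

tokenizer = AutoTokenizer.from_pretrained(base_model)
model = AutoModelForCausalLM.from_pretrained(
    base_model,
    quantization_config=bnb_config,
    device_map="auto",
)
model = prepare_model_for_kbit_training(model)

# LoRA adapter: only these low-rank matrices are trained; the base model stays frozen.
lora_config = LoraConfig(
    r=16,                                  # assumed rank
    lora_alpha=32,
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],   # assumed attention projections
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, lora_config)
model.print_trainable_parameters()  # typically well under 1% of the base weights
```

A plain LoRA run would look the same without the 4-bit quantization step, which is the main trade-off between the two approaches: QLoRA lowers the memory needed for fine-tuning at some cost in compute and numerical precision.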
Recommended Citation
Nagendra, Nischay, "SDEBuddy - Code Documentation using Large Language Models" (2024). Master's Projects. 1443.
https://scholarworks.sjsu.edu/etd_projects/1443