Publication Date
Fall 2024
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
William Andreopoulos
Second Advisor
Saptarshi Sengupta
Third Advisor
Chung-Wen Tsao
Keywords
Qwen2.5-Coder, Parameter-Efficient Fine Tuning, LoRA
Abstract
The main objective of this research is to improve the quality of software code produced by the Qwen2.5-Coder model, specifically in terms of maintainability, complexity, and reliability. Our approach applies the Parameter-Efficient Fine-Tuning (PEFT) framework, using Low-Rank Adaptation (LoRA) combined with quantization. This involves fine-tuning only a small subset of the model's parameters to adapt it to software programming tasks while leaving the general structure of the model largely intact. In this paper, SonarQube is used to quantify the improvements in the code generated by the model. The base dataset for this project is “starcoderdata”, part of a larger dataset comprising programming discussions and code snippets drawn from open-source projects on GitHub. Because it includes many real-world programming tasks, it is well suited for training the model to improve its coding ability. Following the recommendations in the Qwen2.5-Coder paper, the model is further refined with LoRA and bitsandbytes quantization, adapting it to the task at hand while keeping it efficient and effective across a range of software development tasks. In this project, we explore how the baseline model can be fine-tuned to generate higher-quality code output, and we apply a set of software quality metrics to compare the code quality of the baseline model against that of the fine-tuned model.
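The low-rank update at the heart of LoRA can be sketched independently of any fine-tuning library. The dimensions, rank, and scaling factor below are illustrative assumptions, not Qwen2.5-Coder's actual configuration: LoRA freezes the pretrained weight W and learns only the small matrices A and B, so the trainable-parameter count drops from d_out·d_in to r·(d_in + d_out).

```python
import numpy as np

# LoRA: W' = W + (alpha / r) * B @ A, where A is (r, d_in) and
# B is (d_out, r). Only A and B are trained; W stays frozen.
rng = np.random.default_rng(0)
d_in, d_out, r, alpha = 512, 512, 8, 16  # illustrative sizes only

W = rng.standard_normal((d_out, d_in))     # frozen pretrained weight
A = rng.standard_normal((r, d_in)) * 0.01  # trainable, small random init
B = np.zeros((d_out, r))                   # trainable, zero init => W' = W at start

def lora_forward(x, W, A, B, alpha, r):
    """Forward pass through the frozen weight plus the scaled LoRA branch."""
    return x @ W.T + (alpha / r) * (x @ A.T) @ B.T

x = rng.standard_normal((4, d_in))
# With B zero-initialized, the LoRA branch contributes nothing yet,
# so the adapted model starts out identical to the base model:
assert np.allclose(lora_forward(x, W, A, B, alpha, r), x @ W.T)

full_params = d_out * d_in          # parameters updated by full fine-tuning
lora_params = r * (d_in + d_out)    # parameters updated by LoRA
print(f"LoRA trains {lora_params} params vs {full_params} for full fine-tuning")
```

The zero initialization of B is what makes LoRA safe to bolt onto a pretrained model: training starts from the base model's behavior and only gradually learns the task-specific update.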
Recommended Citation
Nagaraja, Lohith, "Enhancing Qwen2.5-Coder: A Deep Dive into Fine-Tuning using PEFT for Superior Code Outputs" (2024). Master's Projects. 1444.
https://scholarworks.sjsu.edu/etd_projects/1444