Publication Date
Fall 2024
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
William Andreopoulos
Second Advisor
Genya Ishigaki
Third Advisor
Nishanth Uchil
Keywords
Text-to-Image Models, Generative AI, Stable Diffusion, Parameter-Efficient Fine-Tuning (PEFT), LoRA, BitFit, runwayml/stable-diffusion, CompVis/stable-diffusion-v1-4
Abstract
Rapid advances in Generative AI, particularly in Text-to-Image (T2I) models, have opened up new possibilities for personalized image generation. Fine-tuning large T2I models for specific downstream tasks is a key approach to achieving tailored outputs. In recent years, Parameter-Efficient Fine-Tuning (PEFT) techniques have gained significant attention as a cost-effective way to fine-tune large models. PEFT techniques were initially developed for large language models (LLMs) and have been extensively studied and compared on language tasks. In the T2I domain, however, comparably exhaustive and detailed literature on PEFT is lacking. This research project aims to empirically explore a few PEFT techniques in the T2I domain and to present a comparative analysis of two of the most prominent PEFT methods from the LLM domain: BitFit and LoRA. Through this analysis, we aim to achieve two things: first, to provide a starting point that aids in selecting an initial PEFT method for fine-tuning; and second, to provide a framework for comparing PEFT techniques in the T2I domain that the community can build upon.
Recommended Citation
Jain, Mit Ramesh, "Personalizing Image Generation from Prompts Using Generative AI" (2024). Master's Projects. 1433.
https://scholarworks.sjsu.edu/etd_projects/1433