Publication Date
Fall 2024
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
William Andreopoulos
Second Advisor
Nada Attar
Third Advisor
Robert Chun
Keywords
Text-to-Image synthesis, Lexical-driven image generation, Image feature preservation, Generative model, Semantic image refinement, Image transformation
Abstract
This research project proposes a novel approach to user-driven image editing via natural language descriptions. The aim is to change the features of an image named in the descriptive text accurately while preserving, with equal care, the integrity of the parts of the image that the description does not affect. The task is particularly relevant to fields such as content creation, personalized design, and automated image editing, which require both a coherent visual scene and agreement with the textual description. We propose a generative model, LexiGen, that integrates natural language descriptions with their corresponding visual changes within an image. The proposed model works in two main stages: first, it identifies and selects the regions of the image relevant to the input description and associates them with the corresponding semantic features; second, it refines the resulting changes for consistency and coherence with the original image. Moreover, we provide an evaluation framework that gives equal weight to the addition of new aspects specified by the text and to the preservation of the parts unaffected by the input. Extensive experiments on publicly available data show that our model generates high-quality, semantically well-aligned images that are markedly better than those of existing methods.
Recommended Citation
Chincholkar, Sangram Prashant, "LexiGen: Lexical-Driven Image Generation" (2024). Master's Projects. 1432.
https://scholarworks.sjsu.edu/etd_projects/1432