Publication Date
8-27-2025
Document Type
Conference Proceeding
Publication Title
2025 Silicon Valley Cybersecurity Conference, SVCC 2025
DOI
10.1109/SVCC65277.2025.11133622
Abstract
Malware detection is a critical task for protecting our assets from attacks. However, traditional approaches to malware detection often struggle with limited datasets, which hinder the effectiveness of machine learning models. In this paper, we propose a malware image generation system designed to craft high-quality synthetic malware image samples, addressing the challenges posed by small datasets in malware detection. The proposed system utilizes two popular generative models, WGAN-GP and Diffusion, to generate synthetic malware images. It converts malware binary files into image files using four different color spaces: monochrome, grayscale, RGB, and CMYK. These images are then evaluated based on key performance metrics such as accuracy, precision, and recall. A cosine similarity evaluation system is employed to filter high-quality samples and enhance data quality. The experimental results demonstrate that the system generates synthetic samples that improve the performance of malware detection models. The findings indicate that synthetic image-based representations are effective for malware detection tasks.
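The two mechanical steps named in the abstract, mapping a binary's bytes to pixels and scoring samples by cosine similarity, can be illustrated with a minimal sketch. It is not the authors' implementation. The function names, the fixed image width, the zero padding, the channel mapping (1 byte per pixel for grayscale, 3 for RGB, 4 for CMYK), and the 0.9 threshold are all assumptions made for illustration; the abstract does not specify any of them, and monochrome conversion is omitted because its thresholding rule is not stated.

```python
import math


def bytes_to_pixels(data: bytes, width: int, channels: int = 1):
    """Lay raw bytes out row by row as pixels with `channels` bytes each.

    Assumed mapping (hypothetical): channels=1 grayscale, 3 RGB, 4 CMYK.
    The last row is zero-padded so every row has `width` pixels.
    """
    if width <= 0 or channels <= 0:
        raise ValueError("width and channels must be positive")
    stride = width * channels  # bytes consumed by one image row
    rows = math.ceil(len(data) / stride)
    padded = data + bytes(rows * stride - len(data))
    image = []
    for r in range(rows):
        row = padded[r * stride:(r + 1) * stride]
        image.append([tuple(row[c:c + channels]) for c in range(0, stride, channels)])
    return image


def flatten(image):
    """Flatten an image into a single list of channel values."""
    return [value for row in image for pixel in row for value in pixel]


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def filter_by_similarity(synthetic, reference, threshold=0.9):
    """Keep synthetic vectors whose best match among the reference vectors
    reaches the threshold. The selection rule is an assumed one."""
    return [
        s for s in synthetic
        if max(cosine_similarity(s, r) for r in reference) >= threshold
    ]
```

For example, `bytes_to_pixels(data, width=64, channels=3)` would treat every three consecutive bytes as one RGB pixel, and `filter_by_similarity` would then be applied to the flattened synthetic and real images.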
Funding Number
2244597
Funding Sponsor
National Science Foundation
Keywords
color space, Diffusion, Image-Based Malware Classification, Malware, Wasserstein Generative Adversarial Network with Gradient Penalty (WGAN-GP)
Department
Computer Engineering; Computer Science
Comments
© 2025 IEEE. Personal use of this material is permitted. Permission from IEEE must be obtained for all other uses, in any current or future media, including reprinting/republishing this material for advertising or promotional purposes, creating new collective works, for resale or redistribution to servers or lists, or reuse of any copyrighted component of this work in other works.