Publication Date
Spring 2025
Degree Type
Master's Project
Degree Name
Master of Science in Computer Science (MSCS)
Department
Computer Science
First Advisor
Fabio Di Troia
Second Advisor
Amith Kamath Belman
Third Advisor
Sayma Akther
Keywords
Malware Classification, Convolutional Neural Networks (CNN), Generative Adversarial Network (GAN), StyleGAN2-ADA, Synthetic Data Generation, Data Augmentation, Transfer Learning, Image-Based Malware Detection
Abstract
Malware classification is a critical component of cybersecurity. Accurate identification of a malware family can enable timely threat detection and response. In this project, we propose a robust image-based malware classification pipeline that uses Convolutional Neural Networks (CNNs) and Generative Adversarial Networks (GANs), with a focus on improving performance for underrepresented malware families. We train a baseline CNN on the Malimg dataset, which spans 25 malware families, and observe misclassifications in classes with limited data and overlapping visual features. To address this, we apply targeted augmentations and generate class-specific synthetic data using StyleGAN2-ADA. A CNN trained on the combination of real and synthetic data outperforms both the baseline CNN and fine-tuned transfer learning models. This project highlights the impact of data augmentation and the importance of preserving malware patterns during preprocessing to enhance malware classification performance.
Recommended Citation
Pathak, Milind Anand, "Enhancing Robustness of CNN Model for Malware Detection using GAN-Based Data Augmentation and Transfer Learning" (2025). Master's Projects. 1557.
DOI: https://doi.org/10.31979/etd.u65q-pbv9
https://scholarworks.sjsu.edu/etd_projects/1557