Poster: Synthetic Malware Generation using Generative Models
Abstract
Malware poses significant challenges to cybersecurity, and these challenges are exacerbated by the scarcity of high-quality datasets. This paper explores the use of diffusion models and Wasserstein Generative Adversarial Networks with Gradient Penalty (WGAN-GP) to generate synthetic malware samples as opcode sequences. The synthetic data, validated through classification metrics, improves malware detection accuracy across families, particularly in addressing zero-day attacks. Results show that the diffusion model achieves up to 99.6% accuracy in multi-class classification, outperforming WGAN-GP, which reaches up to 96.1%. Incorporating synthetic samples improves detection accuracy for rare malware families by up to 100%, underscoring the potential of generative models to enhance malware detection.