Poster: Synthetic Malware Generation using Generative Models

Publication Date

1-1-2025

Document Type

Conference Proceeding

Publication Title

2025 Silicon Valley Cybersecurity Conference, SVCC 2025

DOI

10.1109/SVCC65277.2025.11133626

Abstract

Malware poses significant challenges to cybersecurity, exacerbated by the scarcity of high-quality datasets. This paper explores the use of diffusion models and Wasserstein Generative Adversarial Networks with Gradient Penalty (WGAN-GP) to generate synthetic malware samples as opcode sequences. The synthetic data, validated through classification metrics, improves malware detection accuracy across families, particularly for zero-day attacks. Results show that the diffusion model achieves up to 99.6% accuracy in multi-class classification, outperforming WGAN-GP, which reaches up to 96.1%. Incorporating synthetic samples improves detection accuracy for rare malware families by up to 100%, underscoring the potential of generative models in enhancing malware detection.
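
The abstract names WGAN-GP but does not state its training objective. For orientation, the critic loss in the standard WGAN-GP formulation is sketched below; whether the poster uses exactly this form, and with which penalty coefficient, is an assumption. Here D is the critic, P_r is the distribution of real samples (opcode sequences in this setting), P_g is the generator's distribution, and lambda is the penalty coefficient; none of these symbols appear in the record itself.

```latex
% Standard WGAN-GP critic loss (assumed form, not taken from the poster)
L_D = \mathbb{E}_{\tilde{x} \sim P_g}\big[ D(\tilde{x}) \big]
    - \mathbb{E}_{x \sim P_r}\big[ D(x) \big]
    + \lambda \, \mathbb{E}_{\hat{x} \sim P_{\hat{x}}}
      \Big[ \big( \lVert \nabla_{\hat{x}} D(\hat{x}) \rVert_2 - 1 \big)^2 \Big],
\qquad
\hat{x} = \epsilon x + (1 - \epsilon)\tilde{x}, \quad \epsilon \sim U[0, 1]
```

The first two terms estimate the Wasserstein distance between real and generated samples. The third term penalizes the critic when its gradient norm departs from 1 at points interpolated between a real and a generated sample, which is what "gradient penalty" refers to.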

Funding Number

2244597

Funding Sponsor

National Science Foundation

Keywords

Diffusion, Generative Models, Machine Learning, Malware, WGAN-GP

Department

Computer Science; Computer Engineering
