Publication Date

Spring 2024

Degree Type

Master's Project

Degree Name

Master of Science in Computer Science (MSCS)

Department

Computer Science

First Advisor

Fabio Di Troia

Second Advisor

Nada Attar

Third Advisor

Katerina Potika

Keywords

Malware · Fake malware generation · GAN · Word embedding · Machine learning.

Abstract

As malware threats have increased, they have also grown more complex and subtle. Consequently, incorporating cutting-edge machine learning into cybersecurity defenses has never been more crucial. Building resilient machine learning models nevertheless remains a significant challenge, because diverse and complete malware datasets are scarce. This project addresses that difficulty by employing a Generative Adversarial Network (GAN) to produce artificial malware samples represented as Application Programming Interface (API) call sequences. While generative modeling has traditionally been applied mainly to image-based fields, we apply it to a less conventional domain: malware signature generation in the form of API call sequences. The goal is to generate synthetic malware that closely resembles real malware; several GAN architectures specializing in sequence generation are employed to generate “fake” malware that appears authentic. Such synthetic datasets can offset the scarcity of training data for machine learning models, and they can also help detection models learn to recognize samples that mimic “real” malware. Preliminary results are promising: the synthetic samples exhibit enough realistic traits that current models do not detect them accurately, which lays a foundation for more robust and adaptable malware detection systems.

Available for download on Friday, May 23, 2025
