Publication Date

Spring 2023

Degree Type

Master's Project

Degree Name

Master of Science (MS)

Department

Computer Science

First Advisor

Fabio Di Troia

Second Advisor

William Andreopoulos

Third Advisor

Thomas Austin

Keywords

Graph Convolution Network, Graph Attention Network, GraphSAGE, Word2Vec

Abstract

Word embeddings are widely recognized as important in natural language processing for capturing semantic relationships between words. In this study, we conduct experiments to explore the effectiveness of word embedding techniques in classifying malware. Specifically, we evaluate the performance of Graph Neural Networks (GNNs) applied to knowledge graphs constructed from opcode sequences of malware files. In the first set of experiments, a Graph Convolution Network (GCN) is applied to knowledge graphs built with different word embedding techniques: Bag-of-words, TF-IDF, and Word2Vec. Our results indicate that Word2Vec produces the most effective word embeddings, serving as a baseline for comparison with three GNN models: Graph Convolution Network, Graph Attention Network (GAT), and GraphSAGE. For the next set of experiments, we generate vector embeddings of various lengths using Word2Vec and construct knowledge graphs with these embeddings as node features. Through performance comparison of the GNN models, we show that larger vector embeddings improve the models’ performance in classifying the malware files into their respective families. Our experiments demonstrate that word embedding techniques can enhance feature engineering in malware analysis.
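
As a rough illustration of the pipeline summarized in the abstract, the sketch below builds a small graph from an opcode sequence and applies a single GCN-style propagation step using only the Python standard library. The edge rule (linking consecutive opcodes), the placeholder node features standing in for Word2Vec vectors, the weight values, and all function names are assumptions made for illustration; they are not taken from the project itself.

```python
import math

def build_opcode_graph(opcodes):
    """Nodes are unique opcodes; an undirected edge links opcodes that
    appear consecutively (an assumed construction rule)."""
    nodes = sorted(set(opcodes))
    index = {op: i for i, op in enumerate(nodes)}
    n = len(nodes)
    adj = [[0.0] * n for _ in range(n)]
    for a, b in zip(opcodes, opcodes[1:]):
        i, j = index[a], index[b]
        if i != j:
            adj[i][j] = adj[j][i] = 1.0
    return nodes, adj

def normalize_adjacency(adj):
    """Symmetric normalization D^{-1/2} (A + I) D^{-1/2} used by GCN layers."""
    n = len(adj)
    a_hat = [[adj[i][j] + (1.0 if i == j else 0.0) for j in range(n)]
             for i in range(n)]
    deg = [sum(row) for row in a_hat]
    return [[a_hat[i][j] / math.sqrt(deg[i] * deg[j]) for j in range(n)]
            for i in range(n)]

def matmul(x, y):
    """Plain matrix product of two lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*y)]
            for row in x]

def gcn_layer(norm_adj, features, weights):
    """One propagation step: ReLU(norm_adj @ features @ weights)."""
    hidden = matmul(matmul(norm_adj, features), weights)
    return [[max(0.0, v) for v in row] for row in hidden]

def mean_pool(node_vectors):
    """Average node vectors into one graph-level vector for classification."""
    n = len(node_vectors)
    return [sum(col) / n for col in zip(*node_vectors)]

# Toy opcode sequence (hypothetical; real inputs would be disassembled files).
opcodes = ["push", "mov", "call", "mov", "pop", "ret"]
nodes, adj = build_opcode_graph(opcodes)
norm_adj = normalize_adjacency(adj)

# Placeholder 2-dimensional node features: [occurrence count, bias term].
# In the study these would be Word2Vec embeddings of each opcode.
features = [[float(opcodes.count(op)), 1.0] for op in nodes]

# Fixed toy weights; a trained model would learn these.
weights = [[0.5, -0.25], [0.1, 0.3]]

node_out = gcn_layer(norm_adj, features, weights)
graph_vector = mean_pool(node_out)
print(nodes)
print(graph_vector)
```

In practice a library such as PyTorch Geometric would supply the GCN, GAT, and GraphSAGE layers, and the pooled graph vector would feed a classifier over malware families; the sketch only shows how node features propagate along opcode edges.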

Recommended Citation

Mananjaya, Manasa, "Malware Classification using Graph Neural Networks" (2023). Master's Projects. 1268.
DOI: https://doi.org/10.31979/etd.68ya-hj74
https://scholarworks.sjsu.edu/etd_projects/1268

Included in

Artificial Intelligence and Robotics Commons
