Malware Detection through Contextualized Vector Embeddings

Publication Date

1-1-2023

Document Type

Conference Proceeding

Publication Title

2023 Silicon Valley Cybersecurity Conference, SVCC 2023

DOI

10.1109/SVCC56964.2023.10165170

Abstract

Detecting malware is an integral part of system security. In recent years, machine learning models have been applied successfully to this challenging problem. The aim of this research is to apply context-dependent word embeddings to classify malware. We extract opcodes from malware samples and use them to generate the embeddings that train the classifiers. Transformers are a neural network architecture that uses self-attention to handle long-range dependencies. Several transformer architectures, namely BERT, DistilBERT, ALBERT, and RoBERTa, are implemented in this work to generate context-dependent word embeddings. In addition to the transformer models, we also experiment with ELMo, a bidirectional language model that can generate contextualized opcode embeddings. These embeddings are used to train our machine learning models to classify samples from different malware families. We compare our contextualized results with context-free embeddings generated by the Word2Vec and HMM2Vec algorithms. The classifiers trained on our embeddings include a ResNet-18 CNN, Random Forests, Support Vector Machines (SVMs), and k-Nearest Neighbours (k-NNs).
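
The pipeline described above can be illustrated with a minimal sketch: treat each sample's opcode sequence as text, obtain contextualized embeddings from a pretrained transformer, and train a downstream classifier on the pooled vectors. The opcode sequences, family labels, model name ("bert-base-uncased"), and mean-pooling step below are illustrative assumptions, not the authors' exact preprocessing, model variants, or training setup.

```python
# Hypothetical sketch: contextualized opcode embeddings fed to an SVM.
# Assumes the `transformers`, `torch`, and `scikit-learn` packages.
import torch
from transformers import AutoTokenizer, AutoModel
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# Toy opcode sequences (space-separated mnemonics) with placeholder family labels.
samples = [
    "mov push call pop ret",
    "xor jmp cmp jne add",
    "push mov sub call ret",
    "lea xor test jz mov",
]
labels = [0, 0, 1, 1]

# A generic pretrained BERT stands in for the BERT / DistilBERT / ALBERT /
# RoBERTa variants compared in the paper.
tokenizer = AutoTokenizer.from_pretrained("bert-base-uncased")
model = AutoModel.from_pretrained("bert-base-uncased")
model.eval()

def embed(opcode_seq: str) -> torch.Tensor:
    """Mean-pool the last hidden layer to get one vector per sample."""
    inputs = tokenizer(opcode_seq, return_tensors="pt",
                       truncation=True, max_length=512)
    with torch.no_grad():
        hidden = model(**inputs).last_hidden_state  # shape (1, seq_len, 768)
    return hidden.mean(dim=1).squeeze(0)

# Embed all samples, then train and evaluate a simple SVM classifier.
X = torch.stack([embed(s) for s in samples]).numpy()
X_train, X_test, y_train, y_test = train_test_split(
    X, labels, test_size=0.5, random_state=0)

clf = SVC(kernel="rbf").fit(X_train, y_train)
print("accuracy:", accuracy_score(y_test, clf.predict(X_test)))
```

The same pooled vectors could equally be passed to the other classifiers mentioned in the abstract (Random Forest, k-NN, or a ResNet-18 CNN on reshaped embeddings); the SVM here is just one of those options.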

Keywords

ALBERT, BERT, ELMo, Malware detection, RoBERTa, Transformer

Department

Computer Science
