BERT for Malware Classification
Contribution to a Book
Advances in Information Security
In this paper, we perform malware classification using word embeddings. Specifically, we train machine learning models on word embeddings generated by BERT, extracting the “words” (opcodes) directly from the malware samples to achieve multi-class classification. The attention mechanism of a pre-trained BERT model captures information about the relation between each opcode and every other opcode in a sample belonging to a specific malware family, and these contextual representations can be leveraged for classification. As a means of comparison, we repeat the same experiments with Word2Vec. Unlike BERT, Word2Vec generates static word embeddings in which words that appear in similar contexts lie closer together, allowing malware samples to be classified based on similarity. As classification algorithms, we use and compare Support Vector Machines (SVM), Logistic Regression, Random Forests, and Multi-Layer Perceptron (MLP). We find that classifiers trained on the word embeddings generated by BERT are effective at detecting malware samples, and achieve higher accuracy than those trained on embeddings created by Word2Vec.
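The pipeline described above can be sketched as follows. This is a minimal, hypothetical illustration, not the authors' implementation: it fabricates a toy opcode vocabulary and random stand-in embedding vectors (in the paper these would come from BERT or Word2Vec), mean-pools each sample's opcode embeddings into a fixed-length feature vector, and trains one of the classifiers named in the abstract (a linear SVM) to separate two synthetic "families" with different opcode-usage patterns.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.svm import SVC

rng = np.random.default_rng(0)

# Toy stand-in for word embeddings: each opcode maps to a dense vector.
# In the paper these vectors would be produced by BERT or Word2Vec.
opcode_vocab = ["mov", "push", "pop", "call", "jmp", "add", "xor", "ret"]
embed_dim = 16
embeddings = {op: rng.normal(size=embed_dim) for op in opcode_vocab}

def sample_to_vector(opcodes):
    """Mean-pool a sample's opcode embeddings into one fixed-length vector."""
    return np.mean([embeddings[op] for op in opcodes], axis=0)

def make_sample(family, length=50):
    """Fabricate an opcode sequence; each toy family favors different opcodes."""
    if family == 0:
        probs = [0.3, 0.2, 0.2, 0.1, 0.05, 0.05, 0.05, 0.05]
    else:
        probs = [0.05, 0.05, 0.05, 0.1, 0.2, 0.2, 0.3, 0.05]
    return rng.choice(opcode_vocab, size=length, p=probs)

# 200 synthetic samples, alternating between the two families.
y = np.array([i % 2 for i in range(200)])
X = np.array([sample_to_vector(make_sample(label)) for label in y])

X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0)
clf = SVC(kernel="linear").fit(X_tr, y_tr)
acc = clf.score(X_te, y_te)
print(f"test accuracy: {acc:.2f}")
```

Mean pooling is only one way to collapse a variable-length opcode sequence into a classifier-ready vector; swapping in embeddings from an actual pre-trained BERT model, or replacing `SVC` with `LogisticRegression`, `RandomForestClassifier`, or `MLPClassifier`, reproduces the other comparisons the abstract describes.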
Joel Alvares and Fabio Di Troia. "BERT for Malware Classification." Advances in Information Security (2022): 161-181. https://doi.org/10.1007/978-3-030-97087-1_7