Publication Date
Fall 2019
Degree Type
Master's Project
Degree Name
Master of Science (MS)
Department
Computer Science
First Advisor
Mark Stamp
Second Advisor
Katerina Potika
Third Advisor
Samanvitha Basole
Keywords
Malware detection, multi-family detection, model fusion techniques
Abstract
A fundamental problem in malware research is malware detection, that is, distinguishing malware samples from benign samples. This problem becomes more challenging when we consider multiple malware families. A typical approach to this multi-family detection problem is to train a machine learning model for each malware family and score each sample against all models. The resulting scores are then used for classification. We refer to this approach as “cold fusion,” since we combine previously trained models; no retraining of these base models is required when additional malware families are considered. An alternative approach is to train a single model on samples from multiple malware families. We refer to this latter approach as “hot fusion,” since we must completely retrain the model whenever an additional family is included in our training set. In this research, we compare hot fusion and cold fusion, in terms of both accuracy and efficiency, as a function of the number of malware families considered. We use features based on opcodes and a variety of machine learning techniques.
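The distinction between the two fusion strategies can be illustrated with a minimal Python sketch. It is not the project's implementation: the centroid "model", the cosine score, the choice of taking the maximum per-family score, and all names (`ColdFusion`, `HotFusion`, `add_family`, the toy opcode lists) are assumptions made purely for illustration. The only point it tries to show is that cold fusion adds one model per new family, while hot fusion retrains a single model on the pooled data.

```python
from collections import Counter
from math import sqrt

def opcode_histogram(opcodes):
    """Relative frequency of each opcode in one sample."""
    counts = Counter(opcodes)
    total = sum(counts.values())
    return {op: n / total for op, n in counts.items()}

def centroid(histograms):
    """Mean histogram; stands in for a trained model (an assumption)."""
    summed = {}
    for hist in histograms:
        for op, freq in hist.items():
            summed[op] = summed.get(op, 0.0) + freq
    return {op: v / len(histograms) for op, v in summed.items()}

def cosine(a, b):
    """Cosine similarity between two sparse histograms."""
    dot = sum(v * b.get(op, 0.0) for op, v in a.items())
    norm_a = sqrt(sum(v * v for v in a.values()))
    norm_b = sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

class ColdFusion:
    """One model per family; existing models are never retrained."""
    def __init__(self):
        self.models = {}

    def add_family(self, name, samples):
        # Only the new family's model is trained.
        self.models[name] = centroid([opcode_histogram(s) for s in samples])

    def score(self, sample):
        # Combining per-family scores with max() is an illustrative choice.
        hist = opcode_histogram(sample)
        return max(cosine(hist, m) for m in self.models.values())

class HotFusion:
    """A single model, retrained from scratch whenever a family is added."""
    def __init__(self):
        self.training = {}
        self.model = None

    def add_family(self, name, samples):
        self.training[name] = samples
        pooled = [opcode_histogram(s)
                  for family in self.training.values() for s in family]
        self.model = centroid(pooled)  # full retrain on all families

    def score(self, sample):
        return cosine(opcode_histogram(sample), self.model)

# Toy opcode sequences (hypothetical, not real malware data).
family_a = [["mov", "mov", "xor", "jmp"], ["mov", "xor", "xor", "jmp"]]
family_b = [["push", "call", "push", "call"], ["push", "call", "call", "ret"]]
benign = ["add", "sub", "add", "mul"]

cold, hot = ColdFusion(), HotFusion()
for name, samples in [("A", family_a), ("B", family_b)]:
    cold.add_family(name, samples)
    hot.add_family(name, samples)

cold_malware, cold_benign = cold.score(family_a[0]), cold.score(benign)
hot_malware, hot_benign = hot.score(family_a[0]), hot.score(benign)
```

In this sketch a sample would be flagged as malware when its score exceeds a chosen threshold; the threshold, like the scoring rule, is left open here.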
Recommended Citation
Bichkar, Snehal, "Hot Fusion vs Cold Fusion for Malware Detection" (2019). Master's Projects. 902.
DOI: https://doi.org/10.31979/etd.q2kz-y82c
https://scholarworks.sjsu.edu/etd_projects/902