Document Type

Conference Proceeding

Publication Date


Publication Title

Proceedings of the 15th International Joint Conference on e-Business and Telecommunications (ICETE 2018) - Volume 1: DCNET, ICE-B, OPTICS, SIGMAP and WINSYS


Christian Callegari, Marten van Sinderen, Paulo Novais, Panagiotis Sarigiannidis, Sebastiano Battiato, Ángel Serrano Sánchez de León, Pascal Lorenz, and Mohammad S. Obaidat

First Page


Last Page



Malware detection based on machine learning typically involves training and testing models for each malware family under consideration. While such an approach can generally achieve good accuracy, it requires many classification steps, resulting in a slow, inefficient, and potentially impractical process. In contrast, classifying samples as malware or benign based on more generic “families” would be far more efficient. However, extracting common features from extremely general malware families will likely result in a model that is too generic to be useful. In this research, we perform controlled experiments to determine the tradeoff between generality and accuracy—over a variety of machine learning techniques—based on n-gram features.



Creative Commons License

This work is licensed under a Creative Commons Attribution-Noncommercial-No Derivative Works 4.0 License.