Publication Date

Fall 2013

Degree Type

Master's Project


Department

Computer Science


Malware is software developed with malicious intent, and it is a rapidly evolving threat to the computing community. Although many techniques for malware classification have been proposed, there is still no comprehensible and useful taxonomy for classifying malware samples. Previous research has shown that hidden Markov model (HMM) analysis is useful for detecting certain types of malware. In this research, we consider the related problem of malware classification based on HMMs. We train HMMs for a variety of malware generators and a variety of compilers. More than 9000 malware samples are then scored against each of these models, and the samples are separated into clusters based on the resulting scores. We analyze the clusters and show that they correspond to certain characteristics of malware. These results indicate that HMMs are an effective tool for the challenging task of automatically classifying malware.
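The score-then-cluster idea described in the abstract can be illustrated with a minimal sketch. Everything concrete here is an assumption made for illustration and is not taken from the project: the two hand-set models stand in for trained HMMs, the integer sequences stand in for features extracted from samples, the score is the log-likelihood per symbol from the scaled forward algorithm, and plain k-means is used as the clustering step.

```python
import math

def log_likelihood_per_symbol(obs, pi, A, B):
    """Scaled forward algorithm: log P(obs | model) divided by len(obs)."""
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate through the transition matrix, then apply emissions.
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")
        alpha = [a / scale for a in alpha]  # rescale to avoid underflow
        log_prob += math.log(scale)
    return log_prob / len(obs)

def kmeans(points, k, iterations=20):
    """Plain k-means; centroids start at the first k points (an assumption)."""
    centroids = [list(p) for p in points[:k]]
    labels = [0] * len(points)
    for _ in range(iterations):
        for idx, p in enumerate(points):
            labels[idx] = min(
                range(k),
                key=lambda c: sum((x - y) ** 2 for x, y in zip(p, centroids[c])),
            )
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels

# Hypothetical stand-ins for trained models: (pi, A, B) over a 2-symbol alphabet.
transitions = [[0.9, 0.1], [0.1, 0.9]]
model_a = ([0.5, 0.5], transitions, [[0.9, 0.1], [0.8, 0.2]])  # favours symbol 0
model_b = ([0.5, 0.5], transitions, [[0.1, 0.9], [0.2, 0.8]])  # favours symbol 1
models = [model_a, model_b]

# Hypothetical observation sequences standing in for malware samples.
samples = [
    [0, 0, 0, 1, 0, 0, 0, 0],
    [1, 1, 1, 1, 0, 1, 1, 1],
    [0, 0, 1, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 1, 1, 1],
]

# Each sample becomes a vector with one score per model.
score_vectors = [
    [log_likelihood_per_symbol(s, *m) for m in models] for s in samples
]

# Samples with similar score vectors end up in the same cluster.
labels = kmeans(score_vectors, k=2)
print(labels)
```

Normalizing the log-likelihood by sequence length keeps scores comparable across samples of different sizes, which matters once every sample is represented by its vector of scores against all models.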