Publication Date

Spring 2018

Degree Type

Master's Project

Degree Name

Master of Science (MS)


Computer Science


With the ever-increasing use of rapidly growing volumes of data, machine learning systems that require minimal human oversight are crucial for classification and analysis tasks. Machine learning algorithms used for such purposes have revolutionized the way we sort, classify, and analyze data. The accuracy of any machine learning algorithm depends heavily on the data it is trained on. In some circumstances, an attacker can attempt to poison the training data to subvert a machine learning system. In this research, we analyze the effects of training data poisoning attacks on hidden Markov models (HMMs) in the context of malware classification. As the percentage of poisoned training data increases, the HMM is still able to classify most files correctly. Hence, we find that HMMs remain effective classifiers at both low and high levels of training data poisoning.