Publication Date


Degree Type

Master's Project

Degree Name

Master of Science (MS)


Department

Computer Science


Abstract

For the past two decades, computer viruses have been a constant security threat. A computer virus is a type of malware that may damage computer systems by destroying data, crashing the system, or through other malicious activity. Metamorphic viruses are among the most difficult types to detect because they change their internal structure with each mutation, making signature-based detection infeasible. Many construction kits are available that make it easy to generate metamorphic strains of any given virus. Previous work has shown that metamorphic viruses can be detected using hidden Markov models (HMMs). In an HMM-based approach, a model is trained on the instruction opcode sequences of a given virus family and is then used to score unknown files. These opcodes are obtained by disassembling the binary executable file. However, disassembly is time-consuming, which limits the practicality of the approach. In this project, we develop and demonstrate a technique to derive an approximate opcode sequence directly from the executable file, which generally takes less time than a standard disassembly process.
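As an illustration of the scoring step in an HMM-based detector, the following minimal sketch computes a log-likelihood per opcode with the scaled forward algorithm. It is not the project's implementation: the function name, the four-opcode vocabulary, and all model parameters are hypothetical toy values, and both training (for example with Baum-Welch) and the extraction of opcodes from an executable are omitted.

```python
import math

def log_likelihood_per_opcode(obs, pi, A, B):
    """Score an opcode sequence against a discrete HMM (scaled forward algorithm).

    obs : list of opcode indices (hypothetical encoding, see OPCODES below)
    pi  : initial state probabilities
    A   : A[i][j] = probability of moving from state i to state j
    B   : B[i][k] = probability of emitting opcode k in state i
    Returns log P(obs | model) divided by len(obs).
    """
    if not obs:
        raise ValueError("empty opcode sequence")
    n = len(pi)
    alpha = [pi[i] * B[i][obs[0]] for i in range(n)]
    log_prob = 0.0
    for t in range(len(obs)):
        if t > 0:
            # Propagate the state distribution, then weight by the emission.
            alpha = [
                sum(alpha[i] * A[i][j] for i in range(n)) * B[j][obs[t]]
                for j in range(n)
            ]
        scale = sum(alpha)
        if scale == 0.0:
            return float("-inf")  # sequence is impossible under this model
        # Rescale to avoid underflow; the logs of the scales sum to log P(obs).
        alpha = [a / scale for a in alpha]
        log_prob += math.log(scale)
    return log_prob / len(obs)

# Toy model (assumed values, not trained on real virus data).
OPCODES = ["mov", "add", "jmp", "call"]
PI = [0.6, 0.4]
A = [[0.7, 0.3],
     [0.4, 0.6]]
B = [[0.6, 0.2, 0.1, 0.1],
     [0.1, 0.5, 0.2, 0.2]]

def encode(names):
    """Map opcode mnemonics to integer indices."""
    return [OPCODES.index(name) for name in names]

family_like = encode(["mov", "mov", "add", "mov", "add", "add", "mov"])
other = encode(["jmp", "call", "jmp", "call", "jmp", "call", "jmp"])

score_family = log_likelihood_per_opcode(family_like, PI, A, B)
score_other = log_likelihood_per_opcode(other, PI, A, B)
print(score_family > score_other)
```

In a detector of this kind, a file whose score falls above a threshold chosen on held-out data would be flagged as a likely member of the family. Normalizing by sequence length keeps scores comparable across files of different sizes.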