Publication Date
Fall 2010
Degree Type
Master's Project
Degree Name
Master of Science (MS)
Department
Computer Science
First Advisor
Mark Stamp
Second Advisor
Sami Khuri
Third Advisor
Robert Chun
Keywords
anit-virus dynamic programming disassembly
Abstract
Most commercial anti-virus software uses signature based techniques to detect whether a file is infected by a virus or not. However, signature based detection systems are unable to detect metamorphic viruses, since such viruses change their internal structure from generation to generation. Previous work has shown that hidden Markov models (HMMs) can be used to detect metamorphic viruses. In this technique, the code is disassembled and the resulting opcode sequences are used for training and detection. Due to the disassembly step, this process is not efficient enough to use when a decision has to be made in real time.
In this project, we explore whether dynamic programming can be used to speed up the process of disassembling, with minimal loss of accuracy. Dynamic programming is generally used to solve problems having two key attributes: optimal substructure and overlapping sub problems. During each iteration our algorithm reads part of the input stream from the executable file and determines assembly instructions, thus dividing problems into sub problems.
We have created a score matrix representing digraphs of the most common opcode instructions and we have implanted a dynamic program based on this scoring matrix. For various file sizes, we determine the time taken by our dynamic program and we show that our approach is significantly faster than a standard disassembler (OllyDbg). Finally, we analyze the accuracy of our results.
Recommended Citation
Shah, Abhishek, "Approximate Disassembly using Dynamic Programming" (2010). Master's Projects. 8.
DOI: https://doi.org/10.31979/etd.8m5n-rxtz
https://scholarworks.sjsu.edu/etd_projects/8