Arjun Shah

Publication Date

Fall 2015

Degree Type

Master's Project


Computer Science


Metamorphic software is famous for changing the internal structure of the code while keeping the functionality same. In order to escape the signature detection along with some advanced detection techniques, many malware writers have used metamorphism as the means. On the other hand, code morphing technique increases the diversity of the software which is considered to be a potential security advantage. In our paper, we have developed a metamorphic code generator based on the LLVM framework. The architecture of LLVM has a three-phase compiler design which includes the front end, the optimizer and the back end. It also gives assistance to various source languages and designs which can be considered as a target. LLVM Intermediate Representation(IR) is the most important aspect of LLVM that uses a common IR bytecode within its optimizer. As a result of this, the compilation process of LLVM can transform any high-level language to its IR bytecode. The metamorphic code generator that we have developed works at this IR bytecode level. Leveraging on the dead code obfuscation technique from the previous research, we have implemented a much more difficult technique of instruction substitution at the IR bytecode level. Hence this paper discusses the implementation of obfuscation techniques like dead code insertion, subroutine reordering, and instruction substitution. The effectiveness of these techniques have been tested by using the Hidden Markov Model.