Emulation Versus Instrumentation for Android Malware Detection
Contribution to a Book
Advanced Sciences and Technologies for Security Applications
In resource-constrained devices, malware detection is typically based on offline analysis using emulation. An alternative to such emulation is malware analysis based on code that is executed on an actual device. In this research, we collect features from a corpus of Android malware using both emulation and on-phone instrumentation. We train machine learning models on the emulator-based features and, separately, on the features collected via instrumentation, and we compare the results obtained in these two cases. We obtain strong detection and classification results, and our results improve slightly on previous work. Consistent with previous work, we find that emulation fails for a significant percentage of malware applications. However, we also find that emulation fails to extract useful features from an even larger percentage of benign applications. We show that for applications that are amenable to emulation, malware detection and classification rates based on emulation are consistently within 1% of those obtained using more intrusive and costly on-phone analysis. We also show that emulation failures are easily explainable and appear to have little to do with malware writers employing anti-emulation techniques, contrary to claims made in previous research. Among other contributions, this work points to a lack of sophistication in Android malware.
Anukriti Sinha, Fabio Di Troia, Philip Heller, and Mark Stamp. "Emulation Versus Instrumentation for Android Malware Detection" Advanced Sciences and Technologies for Security Applications (2021): 1-20. https://doi.org/10.1007/978-3-030-60425-7_1