The number of malicious applications targeting the Android system has exploded in recent years. While the security community, well aware of this fact, has proposed several methods for detection of Android malware, most of these are based on permission and API usage or the identification of expert features. Unfortunately, many of these approaches are susceptible to instruction-level obfuscation techniques. Previous research on classic desktop malware has shown that some high-level characteristics of the code, such as function call graphs, can be used to find similarities between samples while being more robust against certain obfuscation strategies. However, the identification of similarities in graphs is a non-trivial problem whose complexity hinders the use of these features for malware detection. In this paper, we explore how recent developments in machine learning classification of graphs can be efficiently applied to this problem. We propose a method for malware detection based on efficient embeddings of function call graphs with an explicit feature map inspired by a linear-time graph kernel. In an evaluation with 12,158 malware samples, our method, purely based on structural features, outperforms several related approaches and detects 89% of the malware with few false alarms, while also making it possible to pinpoint malicious code structures within Android applications.
Adagio contains several modules that implement the method described in the paper: http://user.informatik.uni-goettingen.de/~hgascon/docs/2013b-aisec.pdf
These modules extract and label the call graphs from a set of Android APKs or DEX files and apply an explicit feature map that captures their structural relationships. The analysis module provides classes to design a binary or multiclass classification experiment using the vector representation and support vector machines.
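To give a rough idea of what an explicit feature map over a labeled call graph can look like, the following is a minimal sketch and not Adagio's actual code. It assumes a neighborhood-hash style scheme: each function node carries a fixed-width bit label, one hashing step combines a node's rotated label with the labels of its callees by XOR, and the graph is then embedded as a histogram of the resulting hash values. The names `rot1`, `neighborhood_hash`, and `embed`, the 15-bit label width, and the toy graph are all assumptions made for illustration.

```python
from collections import Counter

LABEL_BITS = 15                  # assumed label width, for illustration only
MASK = (1 << LABEL_BITS) - 1


def rot1(label):
    """Rotate a LABEL_BITS-wide label left by one bit."""
    return ((label << 1) | (label >> (LABEL_BITS - 1))) & MASK


def neighborhood_hash(graph, labels):
    """One hashing step: h(v) = ROT1(l(v)) XOR l(z1) XOR ... XOR l(zk),
    where z1..zk are the callees of v (outgoing edges; an assumption)."""
    hashed = {}
    for node, callees in graph.items():
        h = rot1(labels[node])
        for callee in callees:
            h ^= labels[callee]
        hashed[node] = h
    return hashed


def embed(graph, labels, iterations=1):
    """Embed a labeled call graph as a histogram of hashed node labels."""
    for _ in range(iterations):
        labels = neighborhood_hash(graph, labels)
    return Counter(labels.values())


# Hypothetical toy call graph: caller -> list of callees.
graph = {"main": ["send", "read"], "send": [], "read": ["send"]}
# Hypothetical node labels (e.g. bits marking instruction categories).
labels = {"main": 0b001, "send": 0b010, "read": 0b100}

features = embed(graph, labels)
print(dict(features))
```

Each distinct hash value acts as one dimension of a sparse vector, so two graphs can be compared with a plain dot product of their histograms, and the same vectors can be passed to a linear support vector machine. Computing one hashing step touches every node and edge once, which keeps the embedding linear in the size of the graph.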
For latest Module v-0.1 Dev