A Framework for Identifying Obfuscation Techniques applied to Android Apps using Machine Learning

1 Dankook University, Yongin, Korea
{parkminjae, geunhayou, sjcho}@dankook.ac.kr
2 Konkuk University, Chungju, Korea
{minkyup, schan}@kku.ac.kr

Abstract
Malicious app writers tend to employ code
obfuscation techniques to prevent their malicious code from being easily
reverse engineered and analyzed. To analyze malicious Android apps
effectively, it is necessary to identify which code obfuscation techniques
have been applied to them. Existing studies have devised approaches
that identify app-level obfuscation. However, recent obfuscators can apply
different obfuscation techniques class by class rather than to the app as a
whole. In such cases, app-level obfuscation identification may be
ineffective. In this paper, we propose a new framework that identifies the
class-level obfuscation techniques used in Android apps. The proposed
framework vectorizes the decompiled code of each class of an Android app using
a paragraph vector. The output vectors are then fed to a machine learning
classifier to identify which obfuscation technique is applied to each class.
We use four machine learning classifiers: Random Forest, AdaBoost, Extra
Trees, and Linear SVM, and compare the performance of the classifiers for
each obfuscation technique.

Keywords: Android app, Obfuscation technique, Class-level obfuscation, Machine learning.

+: Corresponding author: Sangchul Han, Department of Software Technology, Konkuk University, 268 Chungwondaero, Chungju-si, Chungcheongbuk-do, 27478, Korea, Tel: +82-43-840-3605
Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications (JoWUA)
Vol. 10, No. 4, pp. 22-30, December 2019
Received: November 1, 2019; Accepted: December 7, 2019; Published: December 31, 2019
DOI: 10.22667/JOWUA.2019.12.31.022
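
The pipeline outlined in the abstract (vectorize each decompiled class, then classify the vector) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes scikit-learn is available, and it substitutes a hashed bag-of-tokens vector for the paragraph vector so that the example stays small; a Doc2Vec model would be closer to what the abstract describes. The toy class sources, the label names, and the helper functions are all hypothetical.

```python
from sklearn.ensemble import (
    AdaBoostClassifier,
    ExtraTreesClassifier,
    RandomForestClassifier,
)
from sklearn.feature_extraction.text import HashingVectorizer
from sklearn.svm import LinearSVC

# Hypothetical toy data: one decompiled-code string per class, paired with
# the obfuscation technique assumed to have been applied to that class.
CLASS_SOURCES = [
    "class a { void a ( ) { b ( ) ; } void b ( ) { } }",
    "class b { int a ; void c ( ) { a = 1 ; } }",
    "class Login { void check ( ) { String s = decrypt ( k1 ) ; } }",
    "class Net { void send ( ) { String u = decrypt ( k2 ) ; } }",
    "class Main { void run ( ) { switch ( state ) { case 0 : state = 3 ; } } }",
    "class Task { void go ( ) { switch ( state ) { case 1 : state = 2 ; } } }",
]
LABELS = [
    "renaming", "renaming",
    "string_encryption", "string_encryption",
    "control_flow", "control_flow",
]

# Stand-in for the paragraph vector: a fixed-length hashed token-count vector.
VECTORIZER = HashingVectorizer(
    n_features=64, token_pattern=r"\S+", lowercase=False, alternate_sign=False
)


def vectorize(sources):
    """Map each class's decompiled code to a fixed-length dense vector."""
    return VECTORIZER.transform(sources).toarray()


def make_classifiers():
    """The four classifier families named in the abstract."""
    return {
        "Random Forest": RandomForestClassifier(n_estimators=50, random_state=0),
        "AdaBoost": AdaBoostClassifier(n_estimators=50, random_state=0),
        "Extra Trees": ExtraTreesClassifier(n_estimators=50, random_state=0),
        "Linear SVM": LinearSVC(random_state=0),
    }


def train_classifiers(sources, labels):
    """Fit every classifier on the per-class vectors."""
    features = vectorize(sources)
    return {name: clf.fit(features, labels)
            for name, clf in make_classifiers().items()}


def predict_all(fitted, sources):
    """Predict one obfuscation label per class, for each classifier."""
    features = vectorize(sources)
    return {name: list(clf.predict(features)) for name, clf in fitted.items()}


fitted = train_classifiers(CLASS_SOURCES, LABELS)
for name, predicted in predict_all(fitted, CLASS_SOURCES[:2]).items():
    print(name, predicted)
```

Because each class is vectorized and labeled independently, a single app can yield different predicted techniques for different classes, which is the case that app-level identification cannot express. With six toy samples the predictions carry no evidential weight; a real comparison would need a labeled corpus and held-out evaluation.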