Efficient Android Malware Detection Using API Rank and Machine Learning

Jaemin Jung¹, Hyunjin Kim¹, Seong-je Cho¹, Sangchul Han²⁺, and Kyoungwon Suh³

¹Dankook University, Yongin, Republic of Korea
{snorlax, khj0417, sjcho}@dankook.ac.kr

²Konkuk University, Chungju, Republic of Korea
schan@kku.ac.kr

³Illinois State University, Normal, IL, United States of America
kwsuh@ilstu.edu

Abstract

As increasingly sophisticated Android malware appears in online markets, accurate malware detection has become an important issue in the Android ecosystem. This paper proposes a machine-learning-based Android malware detection technique that uses ranked Android APIs as features. Our technique first extracts API invocation information from APK files and then produces two ranked lists of the APIs most frequently used by benign apps and by malware, respectively. After filtering out the APIs common to both lists, we merge the two lists into a single list. Using the merged list as the feature set, we apply three classifiers, random forest (RF), k-nearest neighbors (k-NN), and logistic regression (LR), to a dataset of 60,243 apps. Our evaluation results show that, among the three, the RF classifier achieves the highest accuracy (97.47 ~ 98.87%) with a very low false positive rate (0.99 ~ 2.38%).
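The feature-construction steps described in the abstract (rank, filter common APIs, merge) can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes that "frequently used" means the number of apps that invoke an API at least once, that each list is cut off at a hypothetical `top_n`, and that each app is encoded as a binary vector over the merged list. The function names and the toy API sets are invented for illustration only.

```python
from collections import Counter


def rank_apis(apps, top_n):
    """Return up to top_n APIs ordered by how many apps invoke them.

    Assumption: rank by per-app occurrence count, not by total call count.
    """
    counts = Counter()
    for api_set in apps:
        counts.update(set(api_set))  # count each API at most once per app
    # Sort by descending count, then by name so that ties are deterministic.
    ordered = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [api for api, _ in ordered[:top_n]]


def merge_ranked_lists(benign_rank, malware_rank):
    """Drop APIs that appear in both ranked lists, then concatenate the rest."""
    common = set(benign_rank) & set(malware_rank)
    merged = [api for api in benign_rank if api not in common]
    merged += [api for api in malware_rank if api not in common]
    return merged


def to_feature_vector(app_apis, features):
    """Encode one app as a 0/1 vector over the merged feature list (assumed encoding)."""
    invoked = set(app_apis)
    return [1 if api in invoked else 0 for api in features]


# Hypothetical toy data: each app is the set of APIs it invokes.
benign_apps = [
    {"Activity.onCreate", "Log.d", "View.setOnClickListener"},
    {"Activity.onCreate", "Log.d"},
]
malware_apps = [
    {"Activity.onCreate", "SmsManager.sendTextMessage", "TelephonyManager.getDeviceId"},
    {"Activity.onCreate", "SmsManager.sendTextMessage"},
]

benign_rank = rank_apis(benign_apps, top_n=3)
malware_rank = rank_apis(malware_apps, top_n=3)
features = merge_ranked_lists(benign_rank, malware_rank)
vector = to_feature_vector(malware_apps[1], features)
```

In this toy example `Activity.onCreate` appears in both ranked lists and is therefore filtered out, leaving only APIs that are distinctive for one class. Vectors built this way could then be passed to any standard classifier, for example a random forest.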

Keywords: API call, Benign APIs, Malicious APIs, Android malware, Machine Learning, Ranked API list

+: Corresponding author: Sangchul Han
Department of Software Technology, Konkuk University, 268, Chungwon-daero, Chungju-si, Chungcheongbuk-do, Republic of Korea, Tel: +82-43-840-3605

Journal of Internet Services and Information Security (JISIS), 9(1): 48-59, February 2019

DOI: 10.22667/JISIS.2019.02.28.048