Efficient Android Malware Detection Using API Rank and Machine Learning

Jaemin Jung1, Hyunjin Kim1, Seong-je Cho1, Sangchul Han2+, and Kyoungwon Suh3
 

1Dankook University, Yongin, Republic of Korea
{snorlax, khj0417, sjcho}@dankook.ac.kr

2Konkuk University, Chungju, Republic of Korea

schan@kku.ac.kr

3Illinois State University, Normal IL, United States of America

kwsuh@ilstu.edu

 

Abstract

As increasingly sophisticated Android malware appears in online markets, accurate malware detection has become an important issue in the Android ecosystem. This paper proposes a machine-learning-based Android malware detection technique that uses ranked Android APIs as features. First, our technique extracts API invocation information from APK files, then produces two ranked lists of the APIs most frequently used by benign apps and by malware, respectively. After filtering out the APIs common to both lists, we merge the two lists into a single list. Using the merged list as the feature set, we apply three classifiers, random forests (RF), k-nearest neighbor (k-NN), and logistic regression (LR), to a dataset of 60,243 apps. Our evaluation results show that, of the three, the RF classifier achieves the highest accuracy (97.47 ~ 8.87%) with a very low false positive rate (0.99 ~ 2.38%).
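The list-building steps described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation. It assumes that "frequently used" means the number of apps that invoke an API at least once, that each list is cut at a hypothetical `top_n`, and that each app is encoded as a binary vector over the merged list. The function names and the toy API names are illustrative only, and the extraction of API calls from APK files is assumed to have been done already.

```python
from collections import Counter


def rank_apis(apps, top_n):
    """Return the top_n API names, ranked by how many apps invoke them.

    `apps` is a list of apps, each given as a list of API names.
    (Assumption: rank by per-app usage count; the paper may rank differently.)
    """
    counts = Counter()
    for api_calls in apps:
        counts.update(set(api_calls))  # count each API once per app
    # Descending count, then name, so ties are broken deterministically.
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    return [api for api, _ in ranked[:top_n]]


def merged_feature_list(benign_apps, malware_apps, top_n):
    """Build the two ranked lists, drop APIs common to both, and merge."""
    benign_top = rank_apis(benign_apps, top_n)
    malware_top = rank_apis(malware_apps, top_n)
    common = set(benign_top) & set(malware_top)
    return ([api for api in benign_top if api not in common]
            + [api for api in malware_top if api not in common])


def to_feature_vector(api_calls, features):
    """Encode one app as 0/1 flags over the merged list (assumed encoding)."""
    used = set(api_calls)
    return [1 if api in used else 0 for api in features]


# Toy data with hypothetical API names, for illustration only.
benign = [
    ["Log.d", "Activity.onCreate"],
    ["Log.d", "View.setOnClickListener"],
    ["Activity.onCreate", "Log.d"],
]
malware = [
    ["SmsManager.sendTextMessage", "Activity.onCreate"],
    ["SmsManager.sendTextMessage", "TelephonyManager.getDeviceId",
     "Activity.onCreate"],
]

features = merged_feature_list(benign, malware, top_n=2)
vectors = [to_feature_vector(app, features) for app in benign + malware]
```

In this toy run, `Activity.onCreate` appears in both top-2 lists and is filtered out, leaving one benign-leaning and one malware-leaning API as features. The resulting vectors, together with benign/malware labels, could then be passed to any standard classifier implementation, for example scikit-learn's `RandomForestClassifier`, `KNeighborsClassifier`, or `LogisticRegression`.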

Keywords: API call, Benign APIs, Malicious APIs, Android malware, Machine Learning, Ranked API list

 

+: Corresponding author: Sangchul Han
Department of Software Technology, Konkuk University, 268, Chungwon-daero, Chungju-si,

Chungcheongbuk-do, Republic of Korea, Tel: +82-43-840-3605

 

Journal of Internet Services and Information Security (JISIS), 9(1): 48-59, February 2019

DOI: 10.22667/JISIS.2019.02.28.048