Effects of Code Obfuscation on Android App Similarity Analysis

Jonghwa Park1, Hyojung Kim1, Younsik Jeong1, Seong-je Cho1, Sangchul Han2, and Minkyu Park2,+
 

1Dankook University, Yongin, Korea

{72150262, 72150251, jeongyousik, sjcho}@dankook.ac.kr

 

2Konkuk University, Chungju, Korea

{schan, minkyup}@kku.ac.kr

 

 

Abstract

Code obfuscation transforms a program into a functionally equivalent one that is harder to reverse engineer and understand. On Android, well-known obfuscation techniques include shrinking, optimization, renaming, string encryption, and control flow transformation. However, adversaries can also misuse obfuscation to hide pirated or stolen software, and an obfuscated pirated copy is difficult to identify as stolen. One possible approach to detecting illegal software transformed by obfuscation is to measure the similarity between the original and obfuscated programs and to determine from it whether the obfuscated version is an illegal copy of the original. In this paper, we empirically analyze the effects of code obfuscation on Android app similarity analysis. We conducted measurements on five different Android apps obfuscated with the DashO obfuscator. Experimental results show that similarity measures at the bytecode level are more effective than those at the source code level for analyzing software similarity.

Keywords: code obfuscation, Android app, software similarity, software birthmark, reverse engineering

 

+: Corresponding author: Minkyu Park
Department of Computer Engineering, Konkuk University, 268 Chungwondaero, Chungju-si, Chungcheongbuk-do, 27478, Korea, Tel: +82-43-840-3559

 

Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications (JoWUA),

Vol. 6, No. 4, pp. 86-98, December 2015