Open Source Software Detection
using Function-level Static Software Birthmark


Dongjin Kim1, Seong-je Cho1, Sangchul Han2+, Minkyu Park2, and Ilsun You3
 

1Dankook University, Yongin 448-701, Korea

{kdjorang, sjcho}@dankook.ac.kr 

2Konkuk University, Chungbuk 380-701, Korea

{schan, minkyup}@kku.ac.kr 

3Korean Bible University, Seoul 138-791, Korea

isyou@bible.ac.kr 

 

 

Abstract

As open-source software (OSS) becomes widely used, many IT organizations adopt it without complying with some of the guidelines in open-source license agreements. To reduce the risks related to open-source licenses, organizations must meet the requirements of the OSS licenses that apply to them. Because some OSS components may be delivered by major upstream suppliers in binary form, it is very hard to verify whether a binary program contains OSS components used in violation of their licenses. In this paper, we propose a novel technique for determining whether a binary includes particular OSS components, so that uses that do not respect the OSS licensing terms can be identified. Our technique employs a function-level static software birthmark to detect code clones in binaries. The birthmark is a sequence of the sizes of the arguments and local variables of the functions inside a binary, and the similarity between birthmarks is computed using either semi-global sequence alignment or the k-gram method. We evaluate the effectiveness of the proposed technique through experiments with several binaries and OSS components.

Keywords: Open-source software, Static analysis, Software birthmark, Sequence alignment
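
The two similarity measures named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes each birthmark element is a hypothetical pair (argument bytes, local-variable bytes) for one function, that k-gram similarity is the Jaccard index over k-gram sets, and that the alignment uses scores of +1 for a match and -1 for a mismatch or gap. The abstract specifies none of these details.

```python
from typing import Sequence, Set, Tuple

# Hypothetical birthmark element: (argument bytes, local-variable bytes) of one function.
Element = Tuple[int, int]


def kgrams(seq: Sequence[Element], k: int) -> Set[Tuple[Element, ...]]:
    """Return the set of all contiguous length-k windows of seq."""
    return {tuple(seq[i:i + k]) for i in range(len(seq) - k + 1)}


def kgram_similarity(a: Sequence[Element], b: Sequence[Element], k: int = 3) -> float:
    """Jaccard index of the two k-gram sets (an assumed similarity measure)."""
    ga, gb = kgrams(a, k), kgrams(b, k)
    if not ga or not gb:
        return 0.0
    return len(ga & gb) / len(ga | gb)


def semi_global_score(pattern: Sequence[Element], text: Sequence[Element],
                      match: int = 1, mismatch: int = -1, gap: int = -1) -> int:
    """Best score for aligning all of pattern against any contiguous part of text.

    Gaps before and after the aligned region of text are free, which is
    what distinguishes semi-global from global alignment.
    """
    prev = [0] * (len(text) + 1)  # row 0: skipping a prefix of text costs nothing
    for i in range(1, len(pattern) + 1):
        curr = [i * gap] + [0] * len(text)  # column 0: unaligned pattern elements cost gaps
        for j in range(1, len(text) + 1):
            step = match if pattern[i - 1] == text[j - 1] else mismatch
            curr[j] = max(prev[j - 1] + step,  # align pattern[i-1] with text[j-1]
                          prev[j] + gap,       # gap in text
                          curr[j - 1] + gap)   # gap in pattern
        prev = curr
    return max(prev)  # skipping a suffix of text costs nothing


# Made-up birthmarks: an OSS component and a binary that embeds it.
oss = [(8, 16), (4, 0), (12, 32), (0, 8)]
binary = [(0, 0), (8, 16), (4, 0), (12, 32), (0, 8), (16, 64)]
print(semi_global_score(oss, binary))
print(kgram_similarity(oss, binary, k=2))
```

Semi-global alignment suits this setting because the OSS component is expected to occupy only part of a larger binary, so the functions surrounding it should not lower the score.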

 

+: Corresponding author: Sangchul Han
Dept. of Computer Engineering, 268 Chungwondaero, Chungju-si, Chungcheongbuk-do, 380-701,

Tel: +82-(0)43-840-3605

 

Journal of Internet Services and Information Security (JISIS), 4(4): 25-37, November 2014