Detecting Information Leakage via
a HTTP Request Based on the Edit Distance
Kazuki Chiba, Yoshiaki Hori*, and Kouichi Sakurai
Institute of Systems, Information Technologies and
Nanotechnologies /
Kyushu University Fukuoka, Japan
{chiba, hori}@itslab.inf.kyushu-u.ac.jp,
sakurai@csce.kyushu-u.ac.jp
Abstract
Information leakage has become a frequent problem. Among the many routes of
leakage, the Internet accounts for approximately half of all leakage victims.
Leakage via the Internet is caused either by human action, such as a user
writing on a bulletin board, or by malware such as spyware. A technical
countermeasure against spyware is especially needed. In any event, existing
countermeasures against information leakage via the Internet cannot be
trusted completely.
When a web browser communicates with a server, it sends an HTTP request, and
the server replies with the information specified in that request. Some
spyware takes advantage of the HTTP request: once installed, it collects the
user's information, embeds it in an HTTP request, and sends it to an
attacker's server. Filtering packets by TCP or UDP port number is not a
practical defense because HTTP is a main communication protocol and its
traffic cannot simply be blocked. A signature-based technique is often used
as a countermeasure against such spyware: if the data of some software
matches a signature stored in a database, the software is regarded as
spyware. This technique can detect most spyware whose data has been stored,
but it is ineffective against spyware whose data has not been stored.
We therefore propose a leakage detection system that is independent of a
database. The system targets leakage caused by both human action and malware.
In existing research, the edit distance between the previous HTTP request and
the new HTTP request is calculated. Because many HTTP requests share common
characters, this edit distance is much smaller than the number of characters
in the request. Leakage is then easy to detect: information that is sent
repeatedly is disregarded, while new information that is sent suddenly is
quantified and its value stands out. We propose and evaluate a technique that
uses not only the immediately preceding HTTP request but also earlier HTTP
requests, so that more unnecessary information is ignored.
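The idea of comparing a new request against several earlier ones can be
sketched as follows. This is a minimal illustration, not the system
evaluated in the paper: it assumes the plain Levenshtein distance, requests
represented as strings, and a score defined as the smallest distance to any
request in a fixed-size history. The names `edit_distance`, `RequestScorer`,
and `window` are hypothetical.

```python
from collections import deque


def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance: insertions, deletions, substitutions cost 1."""
    if len(a) < len(b):
        a, b = b, a  # keep the inner row as short as possible
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


class RequestScorer:
    """Scores a request by its smallest distance to recent requests.

    Assumption: taking the minimum over the history is one way to
    "ignore" content that appeared in any earlier request.
    """

    def __init__(self, window: int = 5):
        self.history = deque(maxlen=window)

    def score(self, request: str) -> int:
        if self.history:
            value = min(edit_distance(request, old) for old in self.history)
        else:
            value = len(request)  # nothing to compare against yet
        self.history.append(request)
        return value


scorer = RequestScorer(window=2)
first = scorer.score("GET /a?id=1")
changed = scorer.score("GET /b?id=1")
repeated = scorer.score("GET /a?id=1")
leaky = scorer.score("GET /a?id=1&secret=hunter2")
```

With a window of 1 the scorer reduces to comparing against the previous
request only; a larger window lets a request that alternates between a few
familiar forms keep a low score, while appended data still raises it.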
Furthermore, we propose a system that raises an alert when there is a danger
of information leakage. When an abnormal value is detected in the continuous
series of numerical values, the system judges that leakage may have occurred.
Assuming that a certain quantity of information is leaked, the detection rate
is higher than 90% in some cases.
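The abstract does not state how an abnormal value is recognized. As one
possible reading, the sketch below flags a score that exceeds the mean of
recent scores by `k` standard deviations. The rule itself, the class name
`LeakAlert`, and the parameters `window`, `k`, and `min_samples` are all
assumptions made for illustration, not the authors' method.

```python
from collections import deque
from statistics import mean, pstdev


class LeakAlert:
    """Flags a score that stands out from recent scores (assumed rule)."""

    def __init__(self, window: int = 20, k: float = 3.0, min_samples: int = 5):
        self.scores = deque(maxlen=window)
        self.k = k
        self.min_samples = min_samples

    def check(self, score: float) -> bool:
        alert = False
        if len(self.scores) >= self.min_samples:
            mu = mean(self.scores)
            # Floor the spread at 1.0 (an assumption) so that a perfectly
            # constant history does not make every small change an alert.
            sigma = max(pstdev(self.scores), 1.0)
            alert = score > mu + self.k * sigma
        if not alert:
            # Only normal scores update the baseline.
            self.scores.append(score)
        return alert


detector = LeakAlert()
normal = [detector.check(s) for s in [2, 3, 2, 4, 3, 2, 3]]
spike = detector.check(40)
after = detector.check(3)
```

Scores that trigger an alert are kept out of the baseline here, so that a
burst of leaked data does not raise the threshold for the requests that
follow it.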
Keywords: HTTP, information leakage, edit distance, behavior-based detection
*Corresponding author: Kyushu University, 744 Motooka, Nishi-ku, Fukuoka,
819-0395, Japan, Tel: +81-92-802-3666,
Email: hori@inf.kyushu-u.ac.jp,
Web: http://itslab.inf.kyushu-u.ac.jp/~hori/index.html
Journal of Internet Services and Information
Security (JISIS), 2(3/4): 18-28, November 2012