Preventing Data Loss by Harnessing Semantic Similarity and Relevance

1 King Saud University, Riyadh, Saudi Arabia
2 University of Victoria, Victoria, BC, Canada, itraore@ece.uvic.ca
3 Ryerson University, Toronto, ON, Canada, iwoungan@cs.ryerson.ca

Abstract

Malicious insiders are considered among the most dangerous threat actors faced by organizations that maintain security-sensitive data. Data loss prevention (DLP) systems are designed primarily to detect and/or prevent any illicit data loss or leakage out of the organization by both authorized and unauthorized users. However, existing DLP systems face several challenges related to performance and efficiency, especially when skillful malicious insiders transfer critical data after altering it syntactically but not semantically. In this paper, we propose a new approach for matching and detecting similarities between monitored and transferred data by employing conceptual and relational semantics, including extracting explicit relationships and inferring implicit relationships. In our novel approach, we detect altered sensitive data leakage effectively by combining semantic similarity and semantic relevance metrics, both of which are based on an ontology. Our experimental results show that our system achieves, on average, a relatively high detection rate (DR) and a low false positive rate (FPR).

Keywords: Data loss prevention, Threat actors, Malicious insiders, Similarities, Data leakage, Detection rate

Corresponding author: Isaac Woungang, 350 Victoria Street, Toronto, Ontario, M5B 2K3, Canada, Tel: +1-416-979-5000 ext. 6972

Journal of Internet Services and Information Security (JISIS), 11(2): 78-99, May 2021
Received: February 15, 2021; Accepted: April 20, 2021; Published: May 31, 2021
DOI: 10.22667/JISIS.2021.05.31.078
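
The abstract reports results in terms of detection rate (DR) and false positive rate (FPR) without defining them. Assuming the usual confusion-matrix definitions, DR = TP / (TP + FN) and FPR = FP / (FP + TN), the two metrics can be sketched as follows. The function names and the counts in the example are hypothetical illustrations, not values or code from the paper.

```python
def detection_rate(tp: int, fn: int) -> float:
    """Share of actual leaks that were flagged: TP / (TP + FN).

    Assumes the standard definition; returns 0.0 when there are no actual leaks.
    """
    total = tp + fn
    return tp / total if total else 0.0


def false_positive_rate(fp: int, tn: int) -> float:
    """Share of benign transfers wrongly flagged: FP / (FP + TN).

    Assumes the standard definition; returns 0.0 when there are no benign transfers.
    """
    total = fp + tn
    return fp / total if total else 0.0


if __name__ == "__main__":
    # Hypothetical counts for illustration only (not the paper's results).
    print(detection_rate(tp=45, fn=5))
    print(false_positive_rate(fp=2, tn=98))
```

A high DR paired with a low FPR means that most altered sensitive documents are caught while few benign transfers are blocked, which is the trade-off the abstract refers to.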