Towards Detecting and Classifying Malicious URLs Using Deep Learning


Clayton Johnson1, Bishal Khadka1, Ram B. Basnet1+, and Tenzin Doleck2
 

1Colorado Mesa University, Grand Junction, CO 81501, USA

{cpjohnson, bkhadka}@mavs.coloradomesa.edu, rbasnet@coloradomesa.edu

 

2University of Southern California, Los Angeles, CA 90007, USA

doleck@usc.edu

 

Abstract

Emails containing Uniform Resource Locators (URLs) pose substantial risks to organizations: general and spear-phishing campaigns aimed at employees can compromise both credentials and network security. The detection and classification of malicious URLs is therefore an important research problem with practical applications. With an appropriate machine learning model, an organization may protect itself by filtering incoming emails and the websites its employees visit based on the maliciousness of the URLs contained in those emails and web pages. In this work, we compare the performance of traditional machine learning algorithms, such as Random Forest, CART, and kNN, against models built with popular deep learning frameworks, such as Fast.ai and Keras-TensorFlow, across CPU, GPU, and TPU architectures. Using the publicly available ISCX-URL-2016 dataset, we report the models’ performance in binary and multiclass classification experiments. Based on the accuracy and timing metrics we collected, the Random Forest, Keras-TensorFlow, and Fast.ai models performed comparably and achieved the highest accuracies, above 96%, in both the detection and classification of malicious URLs, with Random Forest being the preferable model given time, performance, and complexity constraints. Additionally, by ranking features and applying feature selection techniques, we determine that using the top 5-10 features yields better performance than using all the features provided in the dataset.

Keywords: Malicious URLs, Phishing URLs, Deep Learning, Web Security, Machine Learning

 

+: Corresponding author: Ram B. Basnet
Department of Computer Science and Engineering, Colorado Mesa University, Grand Junction, CO 81501, USA,
Tel: +1-970-248-1400

 

Journal of Wireless Mobile Networks, Ubiquitous Computing, and Dependable Applications (JoWUA), Vol. 11, No. 4, pp. 31-48, December 2020

Received: September 30, 2020; Accepted: December 11, 2020; Published: December 31, 2020

DOI: 10.22667/JOWUA.2020.12.31.031