Open Access



Enhancing Detection of Malicious URLs Using Boosting and Lexical Features

Mohammad Atrees*, Ashraf Ahmad, Firas Alghanim

Princess Sumaya University for Technology, Amman, 11941, Jordan

* Corresponding Author: Mohammad Atrees. Email: email

Intelligent Automation & Soft Computing 2022, 31(3), 1405-1422.


A malicious URL is a link created to spread spam, phishing, malware, ransomware, spyware, and similar threats. By clicking on an infected URL, a user may download malware that adversely affects the computer, or may be persuaded to provide confidential information to a fraudulent website, causing serious losses. These threats must be identified and handled promptly and effectively. Detection is traditionally done with blacklists, which rely on keyword matching against previously known malicious domain names stored in a repository. This method is fast and easy to implement, and it has a low false-positive rate for previously recognized malicious URLs. However, it cannot recognize newly created malicious URLs. To solve this problem, many machine-learning models have been used. In this paper, we introduce an effective machine-learning approach that uses an ensemble learning algorithm called AdaBoost (Adaptive Boosting), combined with different algorithms, to enhance detection. For dataset filtering, we used the CfsSubsetEval technique, an algorithm that searches for a subset of features that work well together. The datasets were collected from the UNB repository, divided into four categories (spam, phishing, malware, and defacement URLs), and combined with benign URLs. The dataset content is based on lexical features. The experimental results indicate that the proposed approach enhanced the detection accuracy of malicious URLs and lowered false-positive rates for all experimental algorithms.
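To make the idea of lexical features concrete, the following minimal Python sketch derives a few numeric attributes from the URL string alone, without fetching the page. The function name `lexical_features` and the specific attributes are illustrative assumptions; they are not the feature set of the UNB datasets or of this paper.

```python
from urllib.parse import urlparse


def lexical_features(url: str) -> dict:
    """Compute a few illustrative lexical features from a URL string.

    Assumption: these attributes are hypothetical examples chosen for
    illustration, not the features used in the paper's datasets.
    """
    parsed = urlparse(url)
    host = parsed.hostname or ""
    return {
        # Overall length of the raw URL string
        "url_length": len(url),
        # Length of the host name (domain) part
        "host_length": len(host),
        # Length of the path component
        "path_length": len(parsed.path),
        # Number of dots in the host name
        "dot_count": host.count("."),
        # Number of digit characters anywhere in the URL
        "digit_count": sum(ch.isdigit() for ch in url),
        # Number of '&'-separated query parameters
        "query_param_count": len(parsed.query.split("&")) if parsed.query else 0,
        # Whether the URL contains an '@' character
        "has_at_symbol": "@" in url,
    }


features = lexical_features("http://example.com/login.php?id=123&next=home")
```

A feature dictionary of this kind, computed for every URL, could then serve as one row of input to a feature-selection step and a boosted classifier.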


Cite This Article

M. Atrees, A. Ahmad and F. Alghanim, "Enhancing detection of malicious URLs using boosting and lexical features," Intelligent Automation & Soft Computing, vol. 31, no. 3, pp. 1405–1422, 2022.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.