Open Access
ARTICLE
Advances in Machine Learning for Explainable Intrusion Detection Using Imbalance Datasets in Cybersecurity with Harris Hawks Optimization
1 Artificial Intelligence & Data Analytics Lab, CCIS, Prince Sultan University, Riyadh, 11586, Saudi Arabia
2 Department of Computer Sciences, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, 11671, Saudi Arabia
3 Department of Information Systems, College of Computer and Information Sciences, Princess Nourah bint Abdulrahman University, Riyadh, 11671, Saudi Arabia
* Corresponding Author: Amjad Rehman. Email:
(This article belongs to the Special Issue: Advances in Machine Learning and Artificial Intelligence for Intrusion Detection Systems)
Computers, Materials & Continua 2026, 86(1), 1-15. https://doi.org/10.32604/cmc.2025.068958
Received 10 June 2025; Accepted 12 September 2025; Issue published 10 November 2025
Abstract
Modern intrusion detection systems (MIDS) face persistent challenges in coping with the rapid evolution of cyber threats, high-volume network traffic, and imbalanced datasets. Traditional models often lack the robustness and explainability required to detect novel and sophisticated attacks effectively. This study introduces an advanced, explainable machine learning framework for multi-class IDS using the KDD99 and IDS datasets, which reflect real-world network behavior through a blend of normal and diverse attack classes. The methodology begins with data preprocessing that incorporates both RobustScaler and QuantileTransformer to address outliers and skewed feature distributions, ensuring standardized, model-ready inputs. Dimensionality reduction is achieved via the Harris Hawks Optimization (HHO) algorithm, a nature-inspired metaheuristic modeled on hawks' hunting strategies. HHO identifies the most informative features by optimizing a fitness function based on classification performance. Following feature selection, the Synthetic Minority Over-sampling Technique (SMOTE) is applied to the training data to resolve class imbalance by synthetically augmenting underrepresented attack types. A stacked architecture is then employed, combining the strengths of XGBoost, support vector machine (SVM), and random forest (RF) as base learners. This layered approach improves prediction robustness and generalization by balancing bias and variance across diverse classifiers. The model was evaluated using standard classification metrics: precision, recall, F1-score, and overall accuracy. The best overall performance was an accuracy of 99.44% on UNSW-NB15, demonstrating the model's effectiveness. After balancing, the model showed a clear improvement in detecting attacks. The model was tested on four datasets to demonstrate the effectiveness of the proposed approach, and an ablation study was performed to check the effect of each parameter. The proposed model is also computationally efficient.
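As an illustration of the balancing step described above, the following is a minimal sketch of SMOTE-style oversampling written with the Python standard library only. It is not the authors' implementation: the function name `smote_oversample`, the neighbour count `k`, and the toy data are assumptions chosen for illustration. The core idea is that each synthetic sample is placed on the line segment between a minority-class sample and one of its nearest minority-class neighbours, and that this is done on the training split only.

```python
import random
from math import dist


def smote_oversample(minority, n_new, k=3, seed=0):
    """Return n_new synthetic points, each interpolated between a minority
    sample and one of its k nearest minority-class neighbours."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        i = rng.randrange(len(minority))
        base = minority[i]
        # Indices of the k nearest neighbours of `base`, excluding itself.
        neighbours = sorted(
            (j for j in range(len(minority)) if j != i),
            key=lambda j: dist(base, minority[j]),
        )[:k]
        other = minority[rng.choice(neighbours)]
        gap = rng.random()  # interpolation factor in [0, 1)
        synthetic.append([b + gap * (o - b) for b, o in zip(base, other)])
    return synthetic


# Hypothetical toy training split: 4 minority (attack) samples versus
# 10 majority samples; only the minority class is augmented.
train_minority = [[1.0, 1.0], [1.2, 0.9], [0.8, 1.1], [1.1, 1.3]]
train_majority_count = 10
new_points = smote_oversample(
    train_minority, train_majority_count - len(train_minority)
)
balanced_minority = train_minority + new_points
```

Because the test split is left untouched, evaluation metrics still reflect the original class distribution; in practice a maintained library implementation would normally be preferred over a hand-written one.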
To support transparency and trust in decision-making, explainable AI (XAI) techniques are incorporated that provide both global and local insight into feature contributions and offer intuitive visualizations for individual predictions. This makes the framework suitable for practical deployment in cybersecurity environments that demand both precision and accountability.
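The abstract does not name the specific XAI technique used. As one possible illustration of a local feature attribution, the sketch below computes exact Shapley values for a single prediction by enumerating feature subsets. The function `shapley_values`, the scoring function `toy_score`, and the all-zero baseline are hypothetical and stand in for a trained classifier and a reference input; exact enumeration is only feasible for a handful of features.

```python
from itertools import combinations
from math import factorial


def shapley_values(predict, x, baseline):
    """Exact Shapley attribution of predict(x) relative to predict(baseline).

    Features outside a coalition are replaced by their baseline value."""
    n = len(x)

    def value(subset):
        z = [x[i] if i in subset else baseline[i] for i in range(n)]
        return predict(z)

    phi = [0.0] * n
    for i in range(n):
        others = [j for j in range(n) if j != i]
        for size in range(n):
            # Shapley weight for coalitions of this size.
            weight = factorial(size) * factorial(n - size - 1) / factorial(n)
            for subset in combinations(others, size):
                s = set(subset)
                phi[i] += weight * (value(s | {i}) - value(s))
    return phi


def toy_score(z):
    # Hypothetical attack score: one additive term plus one interaction term.
    return 2.0 * z[0] + 1.0 * z[1] * z[2]


x = [1.0, 2.0, 3.0]
baseline = [0.0, 0.0, 0.0]
phi = shapley_values(toy_score, x, baseline)
```

The attributions sum to the difference between the prediction for `x` and the prediction for the baseline, which is what makes them readable as per-feature contributions to one decision. Averaging absolute attributions over many samples gives a global view of feature importance.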
Copyright © 2026 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.