Open Access
ARTICLE
A Hybrid Feature Selection Method for Advanced Persistent Threat Detection
1 Cyber Threat Intelligence Lab, Faculty of Computing, Universiti Teknologi Malaysia, Johor, 81310, Malaysia
2 College of Computing, Birmingham City University, Birmingham, B4 7RQ, UK
3 School of Computing, University of Portsmouth, Buckingham Building, Lion Terrace, Portsmouth, PO1 3HE, UK
* Corresponding Author: Yussuf Ahmed. Email:
(This article belongs to the Special Issue: Advanced Algorithms for Feature Selection in Machine Learning)
Computers, Materials & Continua 2025, 84(3), 5665-5691. https://doi.org/10.32604/cmc.2025.063451
Received 15 January 2025; Accepted 25 June 2025; Issue published 30 July 2025
Abstract
Advanced Persistent Threats (APTs) are among the most complex and dangerous categories of cyber-attack, characterised by stealthy behaviour, long-term persistence, and the ability to bypass traditional detection systems. The complexity of real-world network data makes them difficult to detect. Machine learning models have shown promise in detecting APTs; however, their performance often suffers when they are trained on large datasets with redundant or irrelevant features. This study presents a novel hybrid feature selection method designed to improve APT detection by reducing dimensionality while preserving the informative characteristics of the data. It combines Mutual Information (MI), Symmetric Uncertainty (SU), and Minimum Redundancy Maximum Relevance (mRMR). MI and SU assess the relevance of each feature, while mRMR maximises relevance and minimises redundancy among the selected features, so that the most impactful features are prioritised and the detection model becomes both more efficient and more effective. The proposed method was evaluated on real-world APT data. Multiple classifiers, including Random Forest, Support Vector Machine (SVM), Gradient Boosting, and Neural Networks, were used to assess classification performance. The results demonstrate that the proposed feature selection method significantly enhances detection accuracy compared to baseline models trained on the full feature set. The Random Forest algorithm achieved the highest performance, with near-perfect accuracy, precision, recall, and F1 scores (99.97%). The adaptive thresholding algorithm proposed within the selection method allows each classifier to benefit from a reduced and optimised feature space, resulting in improved training and predictive performance. This research offers a scalable and classifier-agnostic solution for dimensionality reduction in cybersecurity applications.
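To make the three measures named in the abstract concrete, the following is a minimal sketch of MI, SU, and a greedy mRMR selection on discrete features, using only the Python standard library. It is an illustration under stated assumptions, not the authors' implementation: the abstract does not specify how the three measures are combined or how the adaptive threshold is computed, so the sketch uses the common difference form of mRMR (relevance minus mean redundancy) and omits thresholding. The function names and the toy data are hypothetical.

```python
from collections import Counter
from math import log2


def entropy(xs):
    """Shannon entropy (in bits) of a sequence of discrete values."""
    n = len(xs)
    return -sum((c / n) * log2(c / n) for c in Counter(xs).values())


def mutual_information(xs, ys):
    """I(X;Y) = H(X) + H(Y) - H(X,Y) for two discrete sequences."""
    return entropy(xs) + entropy(ys) - entropy(list(zip(xs, ys)))


def symmetric_uncertainty(xs, ys):
    """SU(X,Y) = 2 * I(X;Y) / (H(X) + H(Y)), a normalised MI in [0, 1]."""
    denom = entropy(xs) + entropy(ys)
    return 0.0 if denom == 0 else 2.0 * mutual_information(xs, ys) / denom


def mrmr_select(features, target, k):
    """Greedy mRMR, difference form (an assumed variant): repeatedly pick
    the feature whose MI with the target, minus its mean MI with the
    features already selected, is largest."""
    relevance = {name: mutual_information(col, target)
                 for name, col in features.items()}
    selected, remaining = [], list(features)
    while remaining and len(selected) < k:
        def score(name):
            if not selected:
                return relevance[name]
            redundancy = sum(mutual_information(features[name], features[s])
                             for s in selected) / len(selected)
            return relevance[name] - redundancy
        best = max(remaining, key=score)  # ties go to the earliest feature
        selected.append(best)
        remaining.remove(best)
    return selected


# Hypothetical toy data: the label is (a OR b); a_dup copies a; c is noise.
a = [0, 0, 0, 0, 1, 1, 1, 1]
b = [0, 0, 1, 1, 0, 0, 1, 1]
c = [0, 1, 0, 1, 0, 1, 0, 1]
y = [0, 0, 1, 1, 1, 1, 1, 1]
features = {"a": a, "a_dup": list(a), "b": b, "c": c}

print(mrmr_select(features, y, 2))  # -> ['a', 'b']
```

In this toy case `a`, `a_dup`, and `b` are equally relevant to the label, but once `a` is chosen the duplicate is penalised by its full redundancy with `a`, so `b` is selected second. This is the redundancy-handling behaviour the abstract attributes to mRMR.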
Copyright © 2025 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

