Open Access
ARTICLE
FSFS: A Novel Statistical Approach for Fair and Trustworthy Impactful Feature Selection in Artificial Intelligence Models
1 Department of Computer Engineering, Ankara University, Ankara, 06830, Türkiye
2 Center for Theoretical Physics, Khazar University, Baku, Az1096, Azerbaijan
3 Department of Cyber Security, Taibah University, Medina, 42353, Saudi Arabia
4 Department of Artificial Intelligence, Ankara University, Ankara, 06830, Türkiye
* Corresponding Author: Ali Hamid Farea. Email:
Computers, Materials & Continua 2025, 84(1), 1457-1484. https://doi.org/10.32604/cmc.2025.064872
Received 26 February 2025; Accepted 08 May 2025; Issue published 09 June 2025
Abstract
Feature selection (FS) is a pivotal pre-processing step in developing data-driven models, influencing reliability, performance, and optimization. Although existing FS techniques can yield high-performance metrics for certain models, they do not invariably guarantee the extraction of the most critical or impactful features. Prior literature underscores the significance of equitable FS practices and has proposed diverse methodologies for the identification of appropriate features. However, the challenge of discerning the most relevant and influential features persists, particularly in the context of the exponential growth and heterogeneity of big data—a challenge that is increasingly salient in modern artificial intelligence (AI) applications. In response, this study introduces an innovative, automated statistical method termed Farea Similarity for Feature Selection (FSFS). The FSFS approach computes a similarity metric for each feature by benchmarking it against the record-wise mean, thereby identifying feature dependencies and mitigating the influence of outliers that could potentially distort evaluation outcomes. Features are subsequently ranked according to their similarity scores, with the threshold established at the average similarity score. Notably, lower FSFS values indicate higher similarity and stronger data correlations, whereas higher values suggest lower similarity. The FSFS method is designed not only to yield reliable evaluation metrics but also to reduce data complexity without compromising model performance. Comparative analyses were performed against several established techniques, including Chi-squared (CS), Correlation Coefficient (CC), Genetic Algorithm (GA), Exhaustive Approach, Greedy Stepwise Approach, Gain Ratio, and Filtered Subset Eval, using a variety of datasets such as the Experimental Dataset, Breast Cancer Wisconsin (Original), KDD CUP 1999, NSL-KDD, UNSW-NB15, and Edge-IIoT.
In the absence of the FSFS method, the highest classifier accuracies observed were 60.00%, 95.13%, 97.02%, 98.17%, 95.86%, and 94.62% for the respective datasets. When the FSFS technique was integrated with data normalization, encoding, balancing, and feature importance selection processes, accuracies improved to 100.00%, 97.81%, 98.63%, 98.94%, 94.27%, and 98.46%, respectively. The FSFS method, with a computational complexity of O(fn log n), demonstrates robust scalability and is well-suited for large datasets, ensuring efficient processing even when the number of features is substantial. By automatically eliminating outliers and redundant data, FSFS reduces computational overhead, resulting in faster training and improved model performance. Overall, the FSFS framework not only optimizes performance but also enhances the interpretability and explainability of data-driven models, thereby facilitating more trustworthy decision-making in AI applications.
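The selection procedure described in the abstract—score each feature against the record-wise mean, then keep features whose score falls at or below the average score—can be sketched in a few lines. The abstract does not state the exact similarity formula, so this sketch assumes the mean absolute deviation from the per-record mean as a stand-in metric; the function name `fsfs_select` is hypothetical and not from the paper.

```python
def fsfs_select(rows):
    """Illustrative sketch of FSFS-style selection (assumed metric).

    rows: list of records, each a list of f numeric feature values.
    Returns (selected_feature_indices, per_feature_scores), where a
    LOWER score means HIGHER similarity to the record-wise mean.
    """
    n = len(rows)           # number of records
    f = len(rows[0])        # number of features
    scores = [0.0] * f
    for row in rows:
        m = sum(row) / f    # record-wise (row) mean
        for j, v in enumerate(row):
            scores[j] += abs(v - m)          # deviation of feature j
    scores = [s / n for s in scores]          # average over records
    threshold = sum(scores) / f               # threshold = mean score
    # Keep features at or below the threshold (more similar ones).
    selected = [j for j, s in enumerate(scores) if s <= threshold]
    return selected, scores


# Toy example: the third feature deviates most from each row's mean,
# so it scores above the threshold and is dropped.
sel, sc = fsfs_select([[1, 2, 10], [2, 3, 11], [3, 4, 12]])
print(sel)  # → [0, 1]
```

Ranking the scores (an O(f log f) sort after the O(fn) scan) reproduces the ordering step the abstract mentions; the paper's own metric and complexity analysis should be consulted for the exact procedure.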
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.