Open Access
ARTICLE
Contribution Tracking Feature Selection (CTFS) Based on the Fusion of Sparse Autoencoder and Mutual Information
College of Information Science and Engineering, Northeastern University, Shenyang, 110004, China
* Corresponding Author: Dazhi Wang. Email:
Computers, Materials & Continua 2024, 81(3), 3761-3780. https://doi.org/10.32604/cmc.2024.057103
Received 08 August 2024; Accepted 30 September 2024; Issue published 19 December 2024
Abstract
For data mining tasks on large-scale data, feature selection is a pivotal stage that removes redundant or irrelevant features while improving classifier performance. Traditional wrapper feature selection methods typically require extensive model training and evaluation, and so cannot deliver the desired outcomes within a reasonable computing time. In this paper, a wrapper approach termed Contribution Tracking Feature Selection (CTFS) is proposed for feature selection on large-scale data; it locates informative features without population-level evolution and therefore needs fewer evaluations than evolutionary methods. First, a refined sparse autoencoder assesses the prominence of each feature for use in the subsequent wrapper method. Next, an enhanced wrapper feature selection technique merges Mutual Information (MI) with individual feature contributions. Finally, a fine-tuning contribution tracking mechanism, operating via dominance accumulation, discerns informative features within the optimal feature subset. Experimental results on multiple classification performance metrics show that, compared with state-of-the-art algorithms, the proposed method yields smaller feature subsets without degrading classification performance, within an acceptable runtime, on most large-scale benchmark datasets.
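As a rough illustration of the idea of merging Mutual Information with per-feature contributions, the following minimal sketch ranks discrete features by the product of their MI with the class label and a contribution weight. It is not the CTFS algorithm itself: the function names, the toy data, the `contributions` values (standing in for scores that a sparse autoencoder might produce), and the choice of a simple product as the merging rule are all assumptions made for illustration.

```python
import math
from collections import Counter


def mutual_information(xs, ys):
    """Mutual information I(X; Y) in nats for two equal-length discrete sequences."""
    n = len(xs)
    px = Counter(xs)            # marginal counts of X
    py = Counter(ys)            # marginal counts of Y
    pxy = Counter(zip(xs, ys))  # joint counts of (X, Y)
    mi = 0.0
    for (x, y), c in pxy.items():
        # p(x, y) * log(p(x, y) / (p(x) * p(y))), written in terms of counts
        mi += (c / n) * math.log(c * n / (px[x] * py[y]))
    return mi


def rank_features(columns, labels, contributions):
    """Score each feature as MI(feature; label) * contribution, then sort descending.

    The product is a hypothetical merging rule chosen for illustration only.
    """
    scores = [
        mutual_information(col, labels) * weight
        for col, weight in zip(columns, contributions)
    ]
    order = sorted(range(len(columns)), key=lambda j: scores[j], reverse=True)
    return order, scores


# Toy data: four samples, three discrete features (one list per feature).
labels = [0, 0, 1, 1]
columns = [
    [0, 0, 1, 1],  # feature 0: identical to the label
    [0, 1, 0, 1],  # feature 1: independent of the label
    [0, 0, 0, 1],  # feature 2: partially informative
]
# Hypothetical contribution weights, e.g. derived from autoencoder weights.
contributions = [0.5, 0.9, 0.8]

order, scores = rank_features(columns, labels, contributions)
print(order)
```

In this toy case, the feature that is independent of the label receives a score of zero regardless of its contribution weight, which is one reason a multiplicative merge can be convenient; other merging rules are equally plausible.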
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.