Open Access
ARTICLE
A Software Defect Prediction Method Using a Multivariate Heterogeneous Hybrid Deep Learning Algorithm
1 College of Computer Science and Technology, Harbin Engineering University, Harbin, 150001, China
2 Information Technology Research Department, Jiangsu Automation Research Institute, Lianyungang, 222062, China
3 School of Computer Science and Technology, Harbin Institute of Technology, Shenzhen, 518055, China
* Corresponding Author: Qi Fei. Email:
Computers, Materials & Continua 2025, 82(2), 3251-3279. https://doi.org/10.32604/cmc.2024.058931
Received 24 September 2024; Accepted 05 December 2024; Issue published 17 February 2025
Abstract
Software defect prediction plays a critical role in software development and quality assurance processes. Effective defect prediction enables testers to accurately prioritize testing efforts and enhance defect detection efficiency. Additionally, this technology provides developers with a means to quickly identify errors, thereby improving software robustness and overall quality. However, current research in software defect prediction often faces challenges, such as relying on a single data source or failing to adequately account for the characteristics of multiple coexisting data sources. This approach may overlook the differences and potential value of various data sources, affecting the accuracy and generalization performance of prediction results. To address this issue, this study proposes a multivariate heterogeneous hybrid deep learning algorithm for defect prediction (DP-MHHDL). Initially, Abstract Syntax Tree (AST), Code Dependency Network (CDN), and code static quality metrics are extracted from source code files and used as inputs to ensure data diversity. Subsequently, for the three types of heterogeneous data, the study employs a graph convolutional network optimization model based on adjacency and spatial topologies, a Convolutional Neural Network-Bidirectional Long Short-Term Memory (CNN-BiLSTM) hybrid neural network model, and a TabNet model to extract data features. These features are then concatenated and processed through a fully connected neural network for defect prediction. Finally, the proposed framework is evaluated using ten promise defect repository projects, and performance is assessed with three metrics: F1, Area under the curve (AUC), and Matthews correlation coefficient (MCC). The experimental results demonstrate that the proposed algorithm outperforms existing methods, offering a novel solution for software defect prediction.Keywords
Cite This Article

This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.