Open Access

ARTICLE

Software Defect Prediction Based on Semantic Views of Metrics: Clustering Analysis and Model Performance Analysis

Baishun Zhou1,2, Haijiao Zhao3, Yuxin Wen2, Gangyi Ding1, Ying Xing3,*, Xinyang Lin4, Lei Xiao5

1 School of Computer Science and Technology, Beijing Institute of Technology, Beijing, 100081, China
2 School of Computer Science, China University of Labor Relations, Beijing, 100048, China
3 School of Intelligent Engineering and Automation, Beijing University of Posts and Telecommunications, Beijing, 100876, China
4 Xiamen Zhonglian Century Co., Ltd., Xiamen, 361013, China
5 School of Computer and Information Engineering, Xiamen University of Technology, Xiamen, 361024, China

* Corresponding Author: Ying Xing.

Computers, Materials & Continua 2025, 84(3), 5201-5221. https://doi.org/10.32604/cmc.2025.065726

Abstract

In recent years, the growing scale and complexity of software systems have produced an ever larger number of software metrics. Defect prediction methods built on software metrics depend heavily on metric data, yet redundant metric data hinders efficient defect prediction and poses a serious challenge to current defect prediction tasks. To address this issue, this paper focuses on the rational clustering of software metric data. First, multiple software projects are evaluated to determine the preset number of clusters for software metrics, and several clustering methods are applied to cluster the metrics. Next, a co-occurrence matrix is designed to quantify how many times metrics appear in the same category across the clustering results. Based on these combined results, the software metrics are divided into two semantic views containing different metrics, which makes it possible to analyze the semantic information behind the metrics. The paper then analyzes in depth how the different semantic views of metrics affect defect prediction results, and how various classification models perform under these semantic views. Experiments show that using the two semantic views jointly can significantly improve model performance in software defect prediction, offering a new understanding of, and approach to, metric-based defect prediction research at the semantic view level.
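The co-occurrence step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the metric names, the cluster labels, and the majority-vote rule used to form views are all hypothetical assumptions chosen only to make the counting idea concrete.

```python
from itertools import combinations


def co_occurrence(labelings):
    """Count, for every pair of metrics, how many clusterings place them together."""
    names = sorted(labelings[0])
    counts = {m: {n: 0 for n in names} for m in names}
    for labels in labelings:
        for a, b in combinations(names, 2):
            if labels[a] == labels[b]:
                counts[a][b] += 1
                counts[b][a] += 1
    return counts


def majority_views(counts, n_clusterings):
    """Assumed grouping rule: a metric joins a view only if it co-occurs with
    every member of that view in more than half of the clusterings."""
    views = []
    for metric in counts:
        for view in views:
            if all(counts[metric][other] * 2 > n_clusterings for other in view):
                view.append(metric)
                break
        else:
            views.append([metric])
    return views


# Hypothetical cluster labels from three clustering methods over four metrics.
labelings = [
    {"loc": 0, "cyclomatic": 0, "cbo": 1, "rfc": 1},
    {"loc": 1, "cyclomatic": 1, "cbo": 0, "rfc": 0},
    {"loc": 0, "cyclomatic": 1, "cbo": 1, "rfc": 1},
]

matrix = co_occurrence(labelings)
views = majority_views(matrix, len(labelings))
print(matrix["loc"]["cyclomatic"], views)
```

Cluster label values are arbitrary within each clustering, so the sketch compares labels only inside one clustering at a time and never across methods.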

Keywords

Software defect prediction; software engineering; semantic views; clustering; interpretability

Cite This Article

APA Style
Zhou, B., Zhao, H., Wen, Y., Ding, G., Xing, Y. et al. (2025). Software Defect Prediction Based on Semantic Views of Metrics: Clustering Analysis and Model Performance Analysis. Computers, Materials & Continua, 84(3), 5201–5221. https://doi.org/10.32604/cmc.2025.065726
Vancouver Style
Zhou B, Zhao H, Wen Y, Ding G, Xing Y, Lin X, et al. Software Defect Prediction Based on Semantic Views of Metrics: Clustering Analysis and Model Performance Analysis. Comput Mater Contin. 2025;84(3):5201–5221. https://doi.org/10.32604/cmc.2025.065726
IEEE Style
B. Zhou et al., “Software Defect Prediction Based on Semantic Views of Metrics: Clustering Analysis and Model Performance Analysis,” Comput. Mater. Contin., vol. 84, no. 3, pp. 5201–5221, 2025. https://doi.org/10.32604/cmc.2025.065726



Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.