Open Access

ARTICLE

Explainable Ensemble Learning Approach for Ovarian Cancer Diagnosis Using Clinical Data

Daniyal Asif1,*, Nabil Kerdid2, Muhammad Shoaib Arif3, Mairaj Bibi4

1 Skolkovo Institute of Science and Technology (Skoltech), Moscow, Russia
2 Department of Mathematics and Statistics, College of Science, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, Saudi Arabia
3 Department of Mathematics and Sciences, College of Sciences and Humanities, Prince Sultan University, Riyadh, Saudi Arabia
4 Department of Mathematics, COMSATS University Islamabad, Park Road, Islamabad, Pakistan

* Corresponding Author: Daniyal Asif. Email: email

(This article belongs to the Special Issue: Artificial Intelligence Models in Healthcare: Challenges, Methods, and Applications)

Computer Modeling in Engineering & Sciences 2026, 146(3), 38 https://doi.org/10.32604/cmes.2026.077334

Abstract

Ovarian cancer (OC) is one of the leading causes of death from gynecological cancer, largely because early diagnosis is difficult and tumor biomarkers are heterogeneous. Machine learning (ML) can process complex datasets and support decision-making in OC diagnosis. Nevertheless, traditional ML models are prone to bias, overfitting, sensitivity to noise, and poor generalization. Moreover, their black-box nature reduces interpretability and limits their clinical applicability. In this study, we introduce an explainable ensemble learning (EL) model, TreeX-Stack, based on a stacking architecture that employs tree-based learners such as Decision Tree (DT), Random Forest (RF), Gradient Boosting (GB), and Extreme Gradient Boosting (XGBoost) as base learners, and Logistic Regression (LR) as the meta-learner, to improve OC diagnosis. Local Interpretable Model-Agnostic Explanations (LIME) are used to explain individual predictions, making the model outputs more clinically interpretable and applicable. The model is trained on a dataset that includes demographic information, blood tests, general chemistry, and tumor markers. Preprocessing includes handling missing data by iterative imputation with Bayesian Ridge and addressing multicollinearity by removing features with correlation coefficients above 0.7. Relevant features are then selected using the Boruta feature selection method. To obtain robust and unbiased performance estimates during hyperparameter tuning, nested cross-validation (CV) with grid search is employed, and all experiments are repeated five times to ensure statistical reliability. TreeX-Stack demonstrates excellent diagnostic performance, achieving an accuracy of 0.9027, a precision of 0.8673, a recall of 0.9391, and an F1-score of 0.9012. Feature-importance analyses using LIME and permutation importance highlight Human Epididymis Protein 4 (HE4) as the most significant biomarker for OC. The combination of high predictive performance and interpretability makes TreeX-Stack a reliable tool for clinical decision support in OC diagnosis.
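The pipeline summarized in the abstract can be sketched roughly as follows. This is only an illustrative sketch under several assumptions, not the authors' implementation: it uses scikit-learn alone, so XGBoost, Boruta feature selection, and LIME are left out; the data are synthetic (`make_classification` with randomly injected missing values) rather than the clinical dataset; and the `CorrelationFilter` class, the hyperparameter grid, and the fold counts are hypothetical choices made for brevity.

```python
import numpy as np
from sklearn.base import BaseEstimator, TransformerMixin
from sklearn.datasets import make_classification
from sklearn.ensemble import (GradientBoostingClassifier,
                              RandomForestClassifier, StackingClassifier)
from sklearn.experimental import enable_iterative_imputer  # noqa: F401
from sklearn.impute import IterativeImputer
from sklearn.linear_model import BayesianRidge, LogisticRegression
from sklearn.model_selection import (GridSearchCV, StratifiedKFold,
                                     cross_val_score)
from sklearn.pipeline import Pipeline
from sklearn.tree import DecisionTreeClassifier


class CorrelationFilter(BaseEstimator, TransformerMixin):
    """Hypothetical helper: drop a feature if its |Pearson r| with an
    already-kept feature exceeds `threshold` (0.7 in the abstract)."""

    def __init__(self, threshold=0.7):
        self.threshold = threshold

    def fit(self, X, y=None):
        corr = np.abs(np.corrcoef(X, rowvar=False))
        keep = []
        for j in range(corr.shape[0]):
            if all(corr[j, k] <= self.threshold for k in keep):
                keep.append(j)
        self.keep_ = np.array(keep)
        return self

    def transform(self, X):
        return np.asarray(X)[:, self.keep_]


# Synthetic stand-in for the clinical data, with about 5% missing values.
X, y = make_classification(n_samples=200, n_features=10, n_informative=4,
                           n_redundant=3, random_state=0)
rng = np.random.default_rng(0)
X[rng.random(X.shape) < 0.05] = np.nan

# Stacking: tree-based base learners, logistic regression meta-learner.
# (XGBoost is omitted here to avoid an extra dependency.)
stack = StackingClassifier(
    estimators=[
        ("dt", DecisionTreeClassifier(random_state=0)),
        ("rf", RandomForestClassifier(n_estimators=25, random_state=0)),
        ("gb", GradientBoostingClassifier(n_estimators=25, random_state=0)),
    ],
    final_estimator=LogisticRegression(max_iter=1000),
    cv=3,
)

# Imputation and filtering live inside the pipeline so that they are
# refit on each training fold and never see the held-out data.
pipe = Pipeline([
    ("impute", IterativeImputer(estimator=BayesianRidge(), random_state=0)),
    ("decorrelate", CorrelationFilter(threshold=0.7)),
    ("stack", stack),
])

# Nested CV: the inner loop tunes hyperparameters by grid search,
# the outer loop estimates performance of the tuned model.
param_grid = {"stack__rf__max_depth": [3, None]}  # illustrative grid only
inner = StratifiedKFold(n_splits=3, shuffle=True, random_state=0)
outer = StratifiedKFold(n_splits=3, shuffle=True, random_state=1)
search = GridSearchCV(pipe, param_grid, cv=inner, scoring="accuracy")
scores = cross_val_score(search, X, y, cv=outer, scoring="accuracy")
print(f"nested CV accuracy: {scores.mean():.3f} +/- {scores.std():.3f}")
```

Keeping every preprocessing step inside the `Pipeline` is a design choice of this sketch: it prevents information from the validation folds leaking into imputation or feature filtering, which is the kind of optimistic bias that nested CV is meant to avoid.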

Keywords

Ovarian cancer; ensemble learning; machine learning; stacking; explainable artificial intelligence; medical data analysis; clinical data; HE4

Cite This Article

APA Style
Asif, D., Kerdid, N., Arif, M.S., Bibi, M. (2026). Explainable Ensemble Learning Approach for Ovarian Cancer Diagnosis Using Clinical Data. Computer Modeling in Engineering & Sciences, 146(3), 38. https://doi.org/10.32604/cmes.2026.077334
Vancouver Style
Asif D, Kerdid N, Arif MS, Bibi M. Explainable Ensemble Learning Approach for Ovarian Cancer Diagnosis Using Clinical Data. Comput Model Eng Sci. 2026;146(3):38. https://doi.org/10.32604/cmes.2026.077334
IEEE Style
D. Asif, N. Kerdid, M. S. Arif, and M. Bibi, “Explainable Ensemble Learning Approach for Ovarian Cancer Diagnosis Using Clinical Data,” Comput. Model. Eng. Sci., vol. 146, no. 3, pp. 38, 2026. https://doi.org/10.32604/cmes.2026.077334



Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.