ELM-APDPs: An Explainable Ensemble Learning Method for Accurate Prediction of Druggable Proteins

Mujeebu Rehman; Qinghua Liu; Ali Ghulam; Tariq Ahmad; Jawad Khan; Dildar Hussain; Yeong Gu

doi:10.32604/cmes.2025.067412

Open Access

ARTICLE

Mujeebu Rehman¹, Qinghua Liu¹, Ali Ghulam², Tariq Ahmad³, Jawad Khan⁴,*, Dildar Hussain⁵,*, Yeong Hyeon Gu⁵

1 School of Information and Communication Engineering, Guilin University of Electronic Technology, Guilin, 541004, China
2 Information Technology Centre, Sindh Agriculture University, Tandojam, 70060, Pakistan
3 School of Electrical and Information Engineering, Hunan University, Changsha, 410082, China
4 School of Computing, Gachon University, Seongnam, 13120, Republic of Korea
5 Department of AI and Data Science, Sejong University, Seoul, 05006, Republic of Korea

* Corresponding Authors: Jawad Khan; Dildar Hussain

(This article belongs to the Special Issue: Recent Developments on Computational Biology-II)

Computer Modeling in Engineering & Sciences 2025, 145(1), 779-805. https://doi.org/10.32604/cmes.2025.067412

Received 02 May 2025; Accepted 09 September 2025; Issue published 30 October 2025

Abstract

Identifying druggable proteins, those capable of binding therapeutic compounds, remains a critical and resource-intensive challenge in drug discovery. To address this, we propose CEL-IDP (Comparison of Ensemble Learning Methods for Identification of Druggable Proteins), a computational framework that combines three feature extraction methods, Dipeptide Deviation from Expected Mean (DDE), Enhanced Amino Acid Composition (EAAC), and Enhanced Grouped Amino Acid Composition (EGAAC), with three ensemble learning strategies (Bagging, Boosting, and Stacking) to classify druggable proteins from sequence data. DDE captures dipeptide frequency deviations, EAAC encodes positional amino acid information, and EGAAC groups residues by physicochemical properties to generate discriminative feature vectors. These features were analyzed with ensemble models to overcome the limitations of single classifiers. EGAAC outperformed DDE and EAAC: Random Forest (Bagging) and XGBoost (Boosting) achieved the highest accuracy of 71.66%, indicating that EGAAC features capture critical biochemical patterns more effectively. Stacking showed intermediate results (68.33%), while EAAC- and DDE-based models yielded lower accuracies (56.66%–66.87%). CEL-IDP streamlines large-scale druggability prediction, reduces reliance on costly experimental screening, and aligns with global initiatives such as Target 2035 to expand the set of actionable drug targets. This work advances machine learning-driven drug discovery by systematizing feature engineering and ensemble model optimization, providing a scalable workflow to accelerate target identification and validation.
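
To make the EGAAC description concrete, the following is a minimal sketch of a sliding-window grouped composition encoder. The five residue groups and the default window size of 5 are assumptions taken from the commonly used iFeature convention; the abstract does not state the grouping or window size the authors used, and the function name `egaac` is hypothetical.

```python
from typing import Dict, List

# Assumed physicochemical grouping (iFeature-style convention);
# the abstract does not specify the groups the authors used.
GROUPS: Dict[str, str] = {
    "aliphatic": "GAVLMI",
    "aromatic": "FYW",
    "positive": "KRH",
    "negative": "DE",
    "uncharged": "STCPNQ",
}


def egaac(sequence: str, window: int = 5) -> List[float]:
    """Encode a protein sequence as per-window group frequencies.

    For every sliding window of length `window`, emit one value per
    residue group: the fraction of window residues in that group.
    """
    if window < 1 or len(sequence) < window:
        raise ValueError("sequence must be at least as long as the window")

    features: List[float] = []
    for start in range(len(sequence) - window + 1):
        segment = sequence[start:start + window]
        for members in GROUPS.values():
            count = sum(1 for residue in segment if residue in members)
            features.append(count / window)
    return features


if __name__ == "__main__":
    vector = egaac("GAVLMIK", window=5)
    print(len(vector), vector[-5:])
```

Because the vector length depends on the sequence length, an encoder like this is normally applied to sequences trimmed or padded to a common length before the vectors are passed to an ensemble classifier such as Random Forest or XGBoost.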

Keywords

Druggable proteins; ensemble learning; computational drug discovery; pharmacological target identification; machine learning; feature extraction

Cite This Article

APA Style

Rehman, M., Liu, Q., Ghulam, A., Ahmad, T., Khan, J. et al. (2025). ELM-APDPs: An Explainable Ensemble Learning Method for Accurate Prediction of Druggable Proteins. Computer Modeling in Engineering & Sciences, 145(1), 779–805. https://doi.org/10.32604/cmes.2025.067412

Vancouver Style

Rehman M, Liu Q, Ghulam A, Ahmad T, Khan J, Hussain D, et al. ELM-APDPs: An Explainable Ensemble Learning Method for Accurate Prediction of Druggable Proteins. Comput Model Eng Sci. 2025;145(1):779–805. https://doi.org/10.32604/cmes.2025.067412

IEEE Style

M. Rehman et al., “ELM-APDPs: An Explainable Ensemble Learning Method for Accurate Prediction of Druggable Proteins,” Comput. Model. Eng. Sci., vol. 145, no. 1, pp. 779–805, 2025. https://doi.org/10.32604/cmes.2025.067412

Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
