Open Access

ARTICLE

GLM-EP: An Equivariant Graph Neural Network and Protein Language Model Integrated Framework for Predicting Essential Proteins in Bacteriophages

Jia Mi1, Zhikang Liu1, Chang Li2, Jing Wan1,*
1 College of Information Science and Technology, Beijing University of Chemical Technology, Beijing, 100029, China
2 School of Computer Science and Technology, Harbin Institute of Technology, Harbin, 150001, China
* Corresponding Author: Jing Wan

Computer Modeling in Engineering & Sciences https://doi.org/10.32604/cmes.2025.074364

Received 09 October 2025; Accepted 19 November 2025; Published online 09 December 2025

Abstract

Identifying essential proteins in bacteriophages is fundamental to understanding their replication and survival mechanisms and contributes to advances in phage-based antibacterial therapies. Despite notable progress, existing computational techniques struggle to represent the interplay between sequence-derived and structure-dependent protein features. To overcome this limitation, we introduce GLM-EP, a unified framework that fuses protein language models with equivariant graph neural networks. By merging semantic embeddings extracted from amino acid sequences with geometry-aware graph representations, GLM-EP builds a richer representation of phage proteins and improves essential protein identification. Evaluation on diverse benchmark datasets confirms that GLM-EP surpasses conventional sequence-based and standalone deep-learning methods, achieving higher F1 and AUROC scores. Component-wise analysis demonstrates that GCNII, EGNN, and the gated multi-head attention mechanism function in a complementary manner to encode complex molecular attributes. In summary, GLM-EP serves as a robust and efficient tool for bacteriophage genomic analysis and provides valuable methodological perspectives for the discovery of therapeutic targets relevant to antibiotic resistance. The corresponding code repository is available at: https://github.com/MiJia-ID/GLM-EP (accessed on 01 November 2025).

Keywords

Essential proteins; bacteriophages; protein language models; graph neural networks