Open Access

REVIEW

A Comprehensive Review of Multimodal Deep Learning for Enhanced Medical Diagnostics

Aya M. Al-Zoghby1,2, Ahmed Ismail Ebada1,*, Aya S. Saleh1, Mohammed Abdelhay3, Wael A. Awad1

1 Computer Science Department, Faculty of Computers and Artificial Intelligence, Damietta University, New Damietta, 34517, Egypt
2 Faculty of Computer Science and Engineering, New Mansoura University, Dakhlia, 35516, Egypt
3 Computer Science Department, Faculty of Graduate Studies for Statistical Research, Cairo University, Giza, 12613, Egypt

* Corresponding Author: Ahmed Ismail Ebada

(This article belongs to the Special Issue: Multi-Modal Deep Learning for Advanced Medical Diagnostics)

Computers, Materials & Continua 2025, 84(3), 4155-4193. https://doi.org/10.32604/cmc.2025.065571

Abstract

Multimodal deep learning has emerged as a key paradigm in contemporary medical diagnostics, advancing precision medicine by enabling integration and learning from diverse data sources. The exponential growth of high-dimensional healthcare data, encompassing genomic, transcriptomic, and other omics profiles, as well as radiological imaging and histopathological slides, makes this approach increasingly important because, when examined separately, these data sources offer only a fragmented picture of intricate disease processes. Multimodal deep learning leverages the complementary properties of multiple data modalities to enable more accurate prognostic modeling, more robust disease characterization, and improved treatment decision-making. This review provides a comprehensive overview of the current state of multimodal deep learning approaches in medical diagnosis. We classify and examine important application domains, such as (1) radiology, where automated report generation and lesion detection are facilitated by image-text integration; (2) histopathology, where fusion models improve tumor classification and grading; and (3) multi-omics, where molecular subtypes and latent biomarkers are revealed through cross-modal learning. For each domain, we provide an overview of representative research, methodological advancements, and clinical implications. Additionally, we critically analyze the fundamental issues preventing wider adoption, including computational complexity (particularly in training scalable, multi-branch networks), data heterogeneity (resulting from modality-specific noise, resolution variations, and inconsistent annotations), and the challenge of preserving meaningful cross-modal correlations during fusion. Beyond limiting performance and generalizability, these problems impede interpretability, which is crucial for clinical trust and use. Lastly, we outline important areas for future research, including the development of standardized protocols for harmonizing data, the creation of lightweight and interpretable fusion architectures, the integration of real-time clinical decision support systems, and the promotion of cooperation for federated multimodal learning. Through this review, our goal is to provide researchers and clinicians with a concise overview of the field’s present state, enduring constraints, and exciting directions for further research.

Keywords

Multimodal deep learning; medical diagnostics; multimodal healthcare fusion; healthcare data integration

Cite This Article

APA Style
Al-Zoghby, A.M., Ismail Ebada, A., Saleh, A.S., Abdelhay, M., Awad, W.A. (2025). A Comprehensive Review of Multimodal Deep Learning for Enhanced Medical Diagnostics. Computers, Materials & Continua, 84(3), 4155–4193. https://doi.org/10.32604/cmc.2025.065571
Vancouver Style
Al-Zoghby AM, Ismail Ebada A, Saleh AS, Abdelhay M, Awad WA. A Comprehensive Review of Multimodal Deep Learning for Enhanced Medical Diagnostics. Comput Mater Contin. 2025;84(3):4155–4193. https://doi.org/10.32604/cmc.2025.065571
IEEE Style
A. M. Al-Zoghby, A. Ismail Ebada, A. S. Saleh, M. Abdelhay, and W. A. Awad, “A Comprehensive Review of Multimodal Deep Learning for Enhanced Medical Diagnostics,” Comput. Mater. Contin., vol. 84, no. 3, pp. 4155–4193, 2025. https://doi.org/10.32604/cmc.2025.065571



Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.