Transforming Healthcare with State-of-the-Art Medical-LLMs: A Comprehensive Evaluation of Current Advances Using Benchmarking Framework

Himadri Saha; Dipanwita Bhattacharya; Sancharita Dutta; Arnab Bera; Srutorshi Basuray; Satyasaran Changdar; Saptarshi Banerjee; Jon Turdiev

doi:10.32604/cmc.2025.070507

Open Access

REVIEW

Himadri Nath Saha¹, Dipanwita Chakraborty Bhattacharya²,*, Sancharita Dutta³, Arnab Bera³, Srutorshi Basuray⁴, Satyasaran Changdar⁵, Saptarshi Banerjee⁶, Jon Turdiev⁷

1 Department of Computer Science, SNEC, University of Calcutta, Kolkata, 700073, India
2 Department of Computer Science, PRTGC, West Bengal State University, Barasat, 700126, India
3 Department of Computer Science & Engineering, The Neotia University, Kolkata, 743368, India
4 Department of Computer Science & Engineering, University College of Science and Technology, University of Calcutta, Kolkata, 700009, India
5 Department of Food Science, University of Copenhagen, Copenhagen, 1165, Denmark
6 Department of Computer Science, Illinois Institute of Technology, 10 West 35th Street, Chicago, IL 60616, USA
7 Department of Computer Science, San Francisco State University, 1600 Holloway Avenue, San Francisco, CA 94132, USA

* Corresponding Author: Dipanwita Chakraborty Bhattacharya.

Computers, Materials & Continua 2026, 86(2), 1-56. https://doi.org/10.32604/cmc.2025.070507

Received 17 July 2025; Accepted 16 September 2025; Issue published 09 December 2025

Abstract

The emergence of Medical Large Language Models (Med-LLMs) has significantly transformed healthcare. Med-LLMs enhance clinical practice through applications in decision support, documentation, and diagnostics. This review evaluates the performance of leading Med-LLMs, including GPT-4Med, Med-PaLM, MEDITRON, PubMedGPT, and MedAlpaca, across diverse medical datasets and provides graphical comparisons of their effectiveness in distinct healthcare domains. The study introduces a domain-specific categorization system that aligns these models with optimal applications in clinical decision-making, documentation, drug discovery, research, patient interaction, and public health. The paper also addresses the deployment challenges of Med-LLMs, emphasizing trustworthiness and explainability as essential requirements for healthcare AI. It presents current evaluation techniques that improve model transparency in high-stakes medical contexts and analyzes regulatory frameworks using benchmarking datasets such as MedQA, MedMCQA, PubMedQA, and MIMIC. By identifying ongoing challenges in bias mitigation, reliability, and ethical compliance, this work serves as a resource for selecting appropriate Med-LLMs and outlines future directions in the field. The analysis offers a roadmap for developing Med-LLMs that balance technological innovation with the trust and transparency required for clinical integration, a perspective often overlooked in existing literature.

Keywords

Medical large language models (Med-LLM); AI in healthcare; natural language processing (NLP) in medicine; fine-tuning medical LLMs; retrieval-augmented generation (RAG) in medicine; multi-modal learning in healthcare; explainability and transparency in medical AI; FDA regulations for AI in medicine; evaluation and benchmarking of medical large language models

Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
