Open Access

ARTICLE

Evaluating Spanish Medical Entity Recognition: Large Language Models with Prompting versus Fine-Tuning

Ronghao Pan1, Tomás Bernal-Beltrán1, Alejandro Rodríguez-González2,3, Ernestina Menasalvas-Ruíz2,3, Rafael Valencia-García1,*

1 Departamento de Informática y Sistemas, Universidad de Murcia, Campus de Espinardo, Murcia, Murcia, Spain
2 Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Campus de Montegancedo, Pozuelo de Alarcón, Madrid, Spain
3 Escuela Técnica Superior de Ingenieros Informáticos, Universidad Politécnica de Madrid, Campus de Montegancedo, Pozuelo de Alarcón, Madrid, Spain

* Corresponding Author: Rafael Valencia-García.

Computers, Materials & Continua 2026, 87(3), 105. https://doi.org/10.32604/cmc.2026.077501

Abstract

The digitization of healthcare has produced large amounts of structured and unstructured clinical data, creating a need for accurate and efficient named entity recognition (NER) to support medical applications. This study evaluates and compares three approaches to NER in the Spanish medical domain: Large Language Models (LLMs) with in-context learning techniques (Zero-Shot, Few-Shot, and Chain-of-Thought); fine-tuning of LLMs; and fine-tuning of encoder-only models. Experiments were conducted on the Meddocan, Meddoprof, Meddoplace, and Symptemist benchmark datasets. Fine-tuned encoder-only models achieve the best performance across all datasets, reaching macro-F1 scores of up to 76.71 on Meddocan, 71.51 on Meddoplace, 66.07 on Meddoprof, and 63.50 on Symptemist. While LLMs with prompting offer flexibility and require no task-specific training, their performance varies significantly depending on the entity type. We also evaluated fine-tuning of LLMs with QLoRA, but the improvements were limited because the small amount of training data available per entity type made model adaptation less effective.
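To illustrate the prompting setup the abstract refers to, the sketch below assembles a few-shot NER prompt for a chat-style LLM and parses a JSON answer. This is a minimal illustration only: the instruction wording, the label set (SINTOMA, PROFESION, LUGAR), the example sentence, and the JSON output format are assumptions for this sketch, not the authors' actual prompts or the benchmarks' official tag sets.

```python
# Minimal sketch of a few-shot prompting setup for Spanish medical NER.
# Labels, examples, and the output format are illustrative assumptions.
import json

LABELS = ["SINTOMA", "PROFESION", "LUGAR"]  # hypothetical tag set

FEW_SHOT_EXAMPLES = [
    ("El paciente refiere cefalea intensa desde ayer.",
     [{"text": "cefalea intensa", "label": "SINTOMA"}]),
]

def build_prompt(sentence: str) -> str:
    """Assemble a few-shot NER prompt ending at the slot the model must fill."""
    lines = [
        "Extrae las entidades médicas de la frase y devuélvelas como JSON.",
        f"Etiquetas permitidas: {', '.join(LABELS)}.",
    ]
    for text, ents in FEW_SHOT_EXAMPLES:
        lines.append(f"Frase: {text}")
        lines.append(f"Entidades: {json.dumps(ents, ensure_ascii=False)}")
    lines.append(f"Frase: {sentence}")
    lines.append("Entidades:")
    return "\n".join(lines)

def parse_response(raw: str) -> list:
    """Extract the JSON list from a model reply, tolerating surrounding prose."""
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end == -1:
        return []
    try:
        ents = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return []
    # Keep only entities whose label is in the allowed set.
    return [e for e in ents if isinstance(e, dict) and e.get("label") in LABELS]
```

A zero-shot variant would simply omit `FEW_SHOT_EXAMPLES`; a Chain-of-Thought variant would additionally ask the model to reason before emitting the JSON, which `parse_response` tolerates because it scans for the bracketed list.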

Keywords

Named entity recognition; medical entity detection; large language models; transformers; prompt-tuning; fine-tuning; in-context learning; natural language processing

Cite This Article

APA Style
Pan, R., Bernal-Beltrán, T., Rodríguez-González, A., Menasalvas-Ruíz, E., &amp; Valencia-García, R. (2026). Evaluating Spanish Medical Entity Recognition: Large Language Models with Prompting versus Fine-Tuning. Computers, Materials &amp; Continua, 87(3), 105. https://doi.org/10.32604/cmc.2026.077501
Vancouver Style
Pan R, Bernal-Beltrán T, Rodríguez-González A, Menasalvas-Ruíz E, Valencia-García R. Evaluating Spanish Medical Entity Recognition: Large Language Models with Prompting versus Fine-Tuning. Comput Mater Contin. 2026;87(3):105. https://doi.org/10.32604/cmc.2026.077501
IEEE Style
R. Pan, T. Bernal-Beltrán, A. Rodríguez-González, E. Menasalvas-Ruíz, and R. Valencia-García, “Evaluating Spanish Medical Entity Recognition: Large Language Models with Prompting versus Fine-Tuning,” Comput. Mater. Contin., vol. 87, no. 3, p. 105, 2026. https://doi.org/10.32604/cmc.2026.077501



Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.