Open Access

ARTICLE

Evaluating Spanish Medical Entity Recognition: Large Language Models with Prompting versus Fine-Tuning

Ronghao Pan1, Tomás Bernal-Beltrán1, Alejandro Rodríguez-González2,3, Ernestina Menasalvas-Ruíz2,3, Rafael Valencia-García1,*

1 Departamento de Informática y Sistemas, Universidad de Murcia, Campus de Espinardo, Murcia, Murcia, Spain
2 Centro de Tecnología Biomédica, Universidad Politécnica de Madrid, Campus de Montegancedo, Pozuelo de Alarcón, Madrid, Spain
3 Escuela Técnica Superior de Ingenieros Informáticos, Universidad Politécnica de Madrid, Campus de Montegancedo, Pozuelo de Alarcón, Madrid, Spain

* Corresponding Author: Rafael Valencia-García.

Computers, Materials & Continua 2026, 87(3), 105. https://doi.org/10.32604/cmc.2026.077501

Abstract

The digitization of healthcare has produced large amounts of structured and unstructured clinical data, creating a need for accurate and efficient named entity recognition (NER) to support medical applications. This study evaluates and compares three approaches to NER in the Spanish medical domain: Large Language Models (LLMs) with in-context learning techniques (Zero-Shot, Few-Shot, and Chain-of-Thought); fine-tuning of LLMs; and fine-tuning of encoder-only models. Experiments were conducted on the Meddocan, Meddoprof, Meddoplace, and Symptemist benchmark datasets. Fine-tuned encoder-only models achieve the best performance across all datasets, reaching macro-F1 scores of up to 76.71 on Meddocan, 71.51 on Meddoplace, 66.07 on Meddoprof, and 63.50 on Symptemist. While LLMs with prompting offer flexibility and require no task-specific training, their performance varies significantly depending on the entity type. We also evaluated fine-tuning of LLMs with QLoRA, but the improvements were limited because the small amount of training data available per entity type made model adaptation less effective.
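To illustrate the prompting setup the abstract refers to, the sketch below assembles a few-shot NER prompt for a chat-style LLM and parses a JSON answer. This is a minimal illustration only: the instruction wording, the label set (SINTOMA, PROFESION, LUGAR), the example sentence, and the JSON output format are assumptions for this sketch, not the authors' actual prompts or the benchmarks' official tag sets.

```python
# Minimal sketch of a few-shot prompting setup for Spanish medical NER.
# Labels, examples, and the output format are illustrative assumptions.
import json

LABELS = ["SINTOMA", "PROFESION", "LUGAR"]  # hypothetical tag set

FEW_SHOT_EXAMPLES = [
    ("El paciente refiere cefalea intensa desde ayer.",
     [{"text": "cefalea intensa", "label": "SINTOMA"}]),
]

def build_prompt(sentence: str) -> str:
    """Assemble a few-shot NER prompt ending at the slot the model must fill."""
    lines = [
        "Extrae las entidades médicas de la frase y devuélvelas como JSON.",
        f"Etiquetas permitidas: {', '.join(LABELS)}.",
    ]
    for text, ents in FEW_SHOT_EXAMPLES:
        lines.append(f"Frase: {text}")
        lines.append(f"Entidades: {json.dumps(ents, ensure_ascii=False)}")
    lines.append(f"Frase: {sentence}")
    lines.append("Entidades:")
    return "\n".join(lines)

def parse_response(raw: str) -> list:
    """Extract the JSON list from a model reply, tolerating surrounding prose."""
    start, end = raw.find("["), raw.rfind("]")
    if start == -1 or end == -1:
        return []
    try:
        ents = json.loads(raw[start:end + 1])
    except json.JSONDecodeError:
        return []
    # Keep only entities whose label is in the allowed set.
    return [e for e in ents if isinstance(e, dict) and e.get("label") in LABELS]
```

A zero-shot variant would simply omit `FEW_SHOT_EXAMPLES`; a Chain-of-Thought variant would additionally ask the model to reason before emitting the JSON, which `parse_response` tolerates because it scans for the bracketed list.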

Keywords

Named entity recognition; medical entity detection; large language models; transformers; prompt-tuning; fine-tuning; in-context learning; natural language processing

Cite This Article

APA Style
Pan, R., Bernal-Beltrán, T., Rodríguez-González, A., Menasalvas-Ruíz, E., &amp; Valencia-García, R. (2026). Evaluating Spanish Medical Entity Recognition: Large Language Models with Prompting versus Fine-Tuning. Computers, Materials &amp; Continua, 87(3), 105. https://doi.org/10.32604/cmc.2026.077501
Vancouver Style
Pan R, Bernal-Beltrán T, Rodríguez-González A, Menasalvas-Ruíz E, Valencia-García R. Evaluating Spanish Medical Entity Recognition: Large Language Models with Prompting versus Fine-Tuning. Comput Mater Contin. 2026;87(3):105. https://doi.org/10.32604/cmc.2026.077501
IEEE Style
R. Pan, T. Bernal-Beltrán, A. Rodríguez-González, E. Menasalvas-Ruíz, and R. Valencia-García, “Evaluating Spanish Medical Entity Recognition: Large Language Models with Prompting versus Fine-Tuning,” Comput. Mater. Contin., vol. 87, no. 3, p. 105, 2026. https://doi.org/10.32604/cmc.2026.077501



Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.