Open Access

ARTICLE

Effective Data Balancing and Fine-Tuning Techniques for Medical sLLMs in Resource-Constrained Domains

Seohyun Yoo, Joonseo Hyeon, Jaehyuk Cho*

Department of Software Engineering & Division of Electronics and Information Engineering, Jeonbuk National University, Jeonju-si, Republic of Korea

* Corresponding Author: Jaehyuk Cho.

(This article belongs to the Special Issue: Bridging the Gap: AutoML and Explainable AI for Industrial and Healthcare Innovations)

Computers, Materials & Continua 2026, 87(3), 20. https://doi.org/10.32604/cmc.2026.077579

Abstract

Despite remarkable advances in medical large language models (LLMs), their deployment in real clinical settings remains impractical due to prohibitive computational requirements and privacy regulations that restrict cloud-based solutions. Small LLMs (sLLMs) offer a promising alternative for on-premise deployment, yet they require domain-specific fine-tuning that still exceeds the hardware capacity of most healthcare institutions. Furthermore, the impact of multilingual data composition on medical sLLM performance remains poorly understood. We present a resource-efficient fine-tuning pipeline that integrates Quantized Low-Rank Adaptation (QLoRA), Fully Sharded Data Parallelism (FSDP), and sequence packing, validated across two model scales: MedGemma 4B for efficiency analysis and LLaMA 3.3 70B for data-balance experiments. Our approach achieves a 58.3% reduction in video random access memory (VRAM) usage (from 48 GB to 20 GB) and a 5× training speedup on MedGemma 4B using NVIDIA L40S GPUs. Critically, experiments on LLaMA 3.3 70B reveal that English-heavy data mixing (a 10:3 ratio) degrades Korean medical-law performance by 1.23 percentage points (pp) while providing only a marginal English gain (+1.49 pp), demonstrating catastrophic forgetting in multilingual medical fine-tuning. Our work provides three contributions: (1) a practical fine-tuning pipeline operable within 20 GB of VRAM, (2) empirical evidence that data balance, not volume, determines multilingual medical QA performance, and (3) actionable guidelines for deploying medical sLLMs in non-English clinical environments.
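The techniques named above are standard and composable; as a rough sketch only (not the authors' reported configuration), the following shows how a QLoRA-style 4-bit fine-tuning setup is typically assembled with the Hugging Face transformers and peft libraries. The checkpoint identifier, LoRA rank, and target modules are placeholder assumptions.

    # Minimal QLoRA setup sketch; all hyperparameters are illustrative assumptions.
    import torch
    from transformers import AutoModelForCausalLM, BitsAndBytesConfig
    from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training

    MODEL_ID = "your-org/medical-sllm-base"  # placeholder; the paper fine-tunes MedGemma 4B

    # 4-bit NF4 quantization (the "Q" in QLoRA): base weights are stored in 4 bits
    # while compute runs in bfloat16, which is what drives the VRAM reduction.
    bnb_config = BitsAndBytesConfig(
        load_in_4bit=True,
        bnb_4bit_quant_type="nf4",
        bnb_4bit_use_double_quant=True,
        bnb_4bit_compute_dtype=torch.bfloat16,
    )

    model = AutoModelForCausalLM.from_pretrained(MODEL_ID, quantization_config=bnb_config)
    model = prepare_model_for_kbit_training(model)

    # Low-rank adapters (the "LoRA" part): only these small matrices receive
    # gradients and optimizer state; the quantized base model stays frozen.
    lora_config = LoraConfig(
        r=16,                 # assumed rank
        lora_alpha=32,
        target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
        lora_dropout=0.05,
        task_type="CAUSAL_LM",
    )
    model = get_peft_model(model, lora_config)
    model.print_trainable_parameters()  # typically well under 1% of parameters train

In practice, the remaining two components would be layered on top of this setup, for example sequence packing via TRL's SFTConfig(packing=True) and multi-GPU sharding via an accelerate FSDP launch configuration; the abstract does not specify the exact settings used. The data-balance experiment can likewise be approximated by interleaving language subsets at a fixed sampling ratio, as sketched below with hypothetical file names.

    # Sketch of a 10:3 English-to-Korean mix using Hugging Face datasets.
    from datasets import load_dataset, interleave_datasets

    ds_en = load_dataset("json", data_files="medqa_en.jsonl", split="train")  # hypothetical file
    ds_ko = load_dataset("json", data_files="medqa_ko.jsonl", split="train")  # hypothetical file

    # Sample at a 10:3 ratio; per the abstract, this English-heavy mix is the
    # condition that degrades Korean medical-law performance.
    mixed = interleave_datasets([ds_en, ds_ko], probabilities=[10 / 13, 3 / 13], seed=42)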

Keywords

Medical LLM; sLLM; QLoRA; FSDP; sequence packing; data balancing; efficient fine-tuning

Cite This Article

APA Style
Yoo, S., Hyeon, J., & Cho, J. (2026). Effective Data Balancing and Fine-Tuning Techniques for Medical sLLMs in Resource-Constrained Domains. Computers, Materials & Continua, 87(3), 20. https://doi.org/10.32604/cmc.2026.077579
Vancouver Style
Yoo S, Hyeon J, Cho J. Effective Data Balancing and Fine-Tuning Techniques for Medical sLLMs in Resource-Constrained Domains. Comput Mater Contin. 2026;87(3):20. https://doi.org/10.32604/cmc.2026.077579
IEEE Style
S. Yoo, J. Hyeon, and J. Cho, “Effective Data Balancing and Fine-Tuning Techniques for Medical sLLMs in Resource-Constrained Domains,” Comput. Mater. Contin., vol. 87, no. 3, p. 20, 2026. https://doi.org/10.32604/cmc.2026.077579



Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.