
Open Access

ARTICLE

H-LoRA: Rethinking Rank Selection for Controllable Knowledge Retention in Edge AI

Darren Chai Xin Lun, Lim Tong Ming*
Centre for Business Incubation and Entrepreneurial Ventures, Tunku Abdul Rahman University of Management and Technology, Jalan Genting Kelang, Setapak, Kuala Lumpur, Malaysia
* Corresponding Author: Lim Tong Ming. Email: email
(This article belongs to the Special Issue: Advanced Edge Computing and Artificial Intelligence in Smart Environment)

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.080068

Received 02 February 2026; Accepted 30 March 2026; Published online 23 April 2026

Abstract

The deployment of specialized language models in resource-constrained edge environments (≤1B parameters, ≤2 GB memory, ≤100 ms latency) faces a critical challenge: Supervised Fine-Tuning (SFT) achieves domain expertise but suffers from irreversible catastrophic forgetting, while traditional Low-Rank Adaptation (LoRA) with conservative ranks (r ≤ 64) often underperforms due to insufficient adaptation capacity. This work introduces H-LoRA (High-Rank LoRA) for edge-deployable models and establishes a fundamental distinction between destructive forgetting and controllable knowledge retention. Through comprehensive experiments on compact models (0.12B Minimind and Qwen-0.5B) across three domains (Human Resources, Medical, Mathematics) using 29,647 samples, we demonstrate that while both SFT and H-LoRA exhibit general capability degradation, they differ fundamentally: SFT completely destroys the original knowledge structure (1% topic retention), while H-LoRA maintains knowledge integrity with 90% topic retention—an 89 percentage point improvement—enabling post-deployment capability recovery. H-LoRA employs simplified scaling and strategic high-rank adaptation at approximately two-thirds of the model’s hidden dimension (r = 512 for d = 768), achieving SFT-level domain performance (99.81% precision) with 5× greater parameter efficiency (20.35% trainable parameters) and robust cross-domain generalization (93.5 ± 6.8% average precision). In addition, H-LoRA reduces over-the-air (OTA) update size from 1.4 GB to 96 MB (a 93% reduction), enabling practical and frequent deployment of specialized models in bandwidth-limited edge environments. Beyond demonstrating effectiveness, this work establishes the first comprehensive framework for characterizing specialization-retention trade-offs in parameter-efficient fine-tuning, providing practical guidance for method selection in real-world deployments.
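To make the high-rank configuration described above concrete, the sketch below shows a minimal LoRA-style linear layer with the rank set to roughly two-thirds of the hidden dimension (r = 512 for d = 768). This is an illustrative assumption, not the paper's implementation: the exact form of H-LoRA's "simplified scaling" is not given in the abstract, so a plain constant scale factor is used here, and all names (HighRankLoRALinear, lora_A, lora_B) are hypothetical.

```python
import torch
import torch.nn as nn


class HighRankLoRALinear(nn.Module):
    """Minimal LoRA-style adapter with a high rank (r ≈ 2/3 of the hidden size).

    Hedged sketch: the frozen base weight stands in for a pretrained layer, and
    a single constant `scale` stands in for the paper's simplified scaling.
    """

    def __init__(self, in_features=768, out_features=768, rank=512, scale=1.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)  # pretrained weight stays frozen
        # Low-rank factors: only A and B receive gradients during fine-tuning.
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        self.scale = scale

    def forward(self, x):
        # y = W x + scale * B (A x); the frozen path preserves original knowledge,
        # the low-rank path carries the domain adaptation.
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)


if __name__ == "__main__":
    layer = HighRankLoRALinear()
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    # Per-layer count only; the paper's 20.35% figure refers to the whole model.
    print(f"trainable parameters in this layer: {trainable:,} of {total:,}")
```

Because only the two low-rank factors are trained and shipped, an over-the-air update needs to carry just these adapter weights rather than the full model, which is the mechanism behind the reported reduction from 1.4 GB to 96 MB.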

Keywords

LoRA; edge AI; knowledge retention; domain adaptation; parameter-efficient fine-tuning; catastrophic forgetting