
Open Access

ARTICLE

H-LoRA: Rethinking Rank Selection for Controllable Knowledge Retention in Edge AI

Darren Chai Xin Lun, Lim Tong Ming*
Centre for Business Incubation and Entrepreneurial Ventures, Tunku Abdul Rahman University of Management and Technology, Jalan Genting Kelang, Setapak, Kuala Lumpur, Malaysia
* Corresponding Author: Lim Tong Ming. Email: email
(This article belongs to the Special Issue: Advanced Edge Computing and Artificial Intelligence in Smart Environment)

Computers, Materials & Continua https://doi.org/10.32604/cmc.2026.080068

Received 02 February 2026; Accepted 30 March 2026; Published online 23 April 2026

Abstract

The deployment of specialized language models in resource-constrained edge environments (≤1B parameters, ≤2 GB memory, ≤100 ms latency) faces a critical challenge: Supervised Fine-Tuning (SFT) achieves domain expertise but suffers from irreversible catastrophic forgetting, while traditional Low-Rank Adaptation (LoRA) with conservative ranks (r ≤ 64) often underperforms due to insufficient adaptation capacity. This work introduces H-LoRA (High-Rank LoRA) for edge-deployable models and establishes a fundamental distinction between destructive forgetting and controllable knowledge retention. Through comprehensive experiments on compact models (0.12B Minimind and Qwen-0.5B) across three domains (Human Resources, Medical, Mathematics) using 29,647 samples, we demonstrate that while both SFT and H-LoRA exhibit general capability degradation, they differ fundamentally: SFT completely destroys the original knowledge structure (1% topic retention), while H-LoRA maintains knowledge integrity with 90% topic retention—an 89 percentage point improvement—enabling post-deployment capability recovery. H-LoRA employs simplified scaling and strategic high-rank adaptation at approximately two-thirds of the model’s hidden dimension (r = 512 for d = 768), achieving SFT-level domain performance (99.81% precision) with 5× greater parameter efficiency (20.35% trainable parameters) and robust cross-domain generalization (93.5 ± 6.8% average precision). In addition, H-LoRA reduces over-the-air (OTA) update size from 1.4 GB to 96 MB (a 93% reduction), enabling practical and frequent deployment of specialized models in bandwidth-limited edge environments. Beyond demonstrating effectiveness, this work establishes the first comprehensive framework for characterizing specialization-retention trade-offs in parameter-efficient fine-tuning, providing practical guidance for method selection in real-world deployments.
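To make the high-rank configuration described above concrete, the sketch below shows a minimal LoRA-style linear layer with the rank set to roughly two-thirds of the hidden dimension (r = 512 for d = 768). This is an illustrative assumption, not the paper's implementation: the exact form of H-LoRA's "simplified scaling" is not given in the abstract, so a plain constant scale factor is used here, and all names (HighRankLoRALinear, lora_A, lora_B) are hypothetical.

```python
import torch
import torch.nn as nn


class HighRankLoRALinear(nn.Module):
    """Minimal LoRA-style adapter with a high rank (r ≈ 2/3 of the hidden size).

    Hedged sketch: the frozen base weight stands in for a pretrained layer, and
    a single constant `scale` stands in for the paper's simplified scaling.
    """

    def __init__(self, in_features=768, out_features=768, rank=512, scale=1.0):
        super().__init__()
        self.base = nn.Linear(in_features, out_features, bias=False)
        self.base.weight.requires_grad_(False)  # pretrained weight stays frozen
        # Low-rank factors: only A and B receive gradients during fine-tuning.
        self.lora_A = nn.Parameter(torch.randn(rank, in_features) * 0.01)
        self.lora_B = nn.Parameter(torch.zeros(out_features, rank))
        self.scale = scale

    def forward(self, x):
        # y = W x + scale * B (A x); the frozen path preserves original knowledge,
        # the low-rank path carries the domain adaptation.
        return self.base(x) + self.scale * (x @ self.lora_A.T @ self.lora_B.T)


if __name__ == "__main__":
    layer = HighRankLoRALinear()
    trainable = sum(p.numel() for p in layer.parameters() if p.requires_grad)
    total = sum(p.numel() for p in layer.parameters())
    # Per-layer count only; the paper's 20.35% figure refers to the whole model.
    print(f"trainable parameters in this layer: {trainable:,} of {total:,}")
```

Because only the two low-rank factors are trained and shipped, an over-the-air update needs to carry just these adapter weights rather than the full model, which is the mechanism behind the reported reduction from 1.4 GB to 96 MB.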

Keywords

LoRA; edge AI; knowledge retention; domain adaptation; parameter-efficient fine-tuning; catastrophic forgetting