DyLoRA-TAD: Dynamic Low-Rank Adapter for End-to-End Temporal Action Detection

Jixin Wu; Mingtao Zhou; Di Wu; Wenqi Ren; Jiatian Mei; Shu Zhang

doi:10.32604/cmc.2025.072964

Open Access

ARTICLE

Jixin Wu^1,2, Mingtao Zhou^2,3, Di Wu^2,3, Wenqi Ren^4, Jiatian Mei^2,3, Shu Zhang^1,*

1 School of Information Science and Technology, Yunnan Normal University, Kunming, 650500, China
2 Yunnan Key Laboratory of Smart Education, Yunnan Normal University, Kunming, 650500, China
3 Key Laboratory of Education Informatization for Nationalities, Ministry of Education, Yunnan Normal University, Kunming, 650500, China
4 School of Cyber Science and Technology, Sun Yat-sen University, Guangzhou, 510275, China

* Corresponding Author: Shu Zhang.

(This article belongs to the Special Issue: Advances in Action Recognition: Algorithms, Applications, and Emerging Trends)

Computers, Materials & Continua 2026, 86(3), 92 https://doi.org/10.32604/cmc.2025.072964

Received 08 September 2025; Accepted 12 November 2025; Issue published 12 January 2026

Abstract

End-to-end Temporal Action Detection (TAD) has achieved remarkable progress in recent years, driven by innovations in model architectures and the emergence of Video Foundation Models (VFMs). However, existing TAD methods that fully fine-tune pretrained video models incur substantial computational costs, which become particularly pronounced on long video sequences. Moreover, the need for precise temporal boundary annotations makes data labeling extremely expensive, and in low-resource settings where annotated samples are scarce, direct fine-tuning tends to overfit. To address these challenges, we introduce Dynamic Low-Rank Adapter (DyLoRA), a lightweight fine-tuning framework tailored specifically to the TAD task. Built upon the Low-Rank Adaptation (LoRA) architecture, DyLoRA adapts only the key layers of the pretrained model via low-rank decomposition, reducing the number of trainable parameters to less than 5% of those updated in full fine-tuning. This significantly lowers memory consumption and mitigates overfitting in low-resource settings. Notably, DyLoRA enhances the temporal modeling capability of pretrained models by optimizing temporal dimension weights, thereby alleviating the representation misalignment of temporal features. Experimental results show that DyLoRA-TAD achieves 73.9% mAP on THUMOS14, 39.52% on ActivityNet-1.3, and 28.2% on Charades, substantially surpassing the best traditional feature-based methods.
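
The low-rank decomposition that the abstract builds on can be illustrated with a minimal sketch of a standard LoRA linear layer. This is not the authors' DyLoRA implementation, and it omits the dynamic and temporal components that the abstract describes. The class name `LoRALinear`, the helper `trainable_fraction`, the initialization scale, and the example sizes (768-dimensional layer, rank 8) are all assumptions chosen for illustration.

```python
import random


def matmul(a, b):
    """Multiply two matrices stored as lists of rows."""
    return [[sum(x * y for x, y in zip(row, col)) for col in zip(*b)] for row in a]


def transpose(m):
    """Return the transpose of a matrix stored as a list of rows."""
    return [list(col) for col in zip(*m)]


class LoRALinear:
    """Frozen weight W (d_out x d_in) plus a trainable low-rank update B @ A.

    Hypothetical illustration of standard LoRA, not the paper's DyLoRA.
    """

    def __init__(self, d_in, d_out, rank, alpha=1.0, seed=0):
        rng = random.Random(seed)
        # Pretrained weight: kept frozen during adaptation.
        self.W = [[rng.gauss(0.0, 0.02) for _ in range(d_in)] for _ in range(d_out)]
        # Trainable factors: A is (rank x d_in), B is (d_out x rank).
        self.A = [[rng.gauss(0.0, 0.02) for _ in range(d_in)] for _ in range(rank)]
        # B starts at zero, so the adapted layer initially equals the frozen one.
        self.B = [[0.0] * rank for _ in range(d_out)]
        self.scale = alpha / rank

    def forward(self, x):
        """Compute y = x W^T + scale * (x A^T) B^T for a batch x of shape (n, d_in)."""
        base = matmul(x, transpose(self.W))
        update = matmul(matmul(x, transpose(self.A)), transpose(self.B))
        return [
            [b + self.scale * u for b, u in zip(base_row, update_row)]
            for base_row, update_row in zip(base, update)
        ]


def trainable_fraction(d_in, d_out, rank):
    """Ratio of LoRA parameters (A and B) to the frozen weight's parameters."""
    return rank * (d_in + d_out) / (d_in * d_out)


if __name__ == "__main__":
    layer = LoRALinear(d_in=4, d_out=3, rank=2)
    print(layer.forward([[1.0, 2.0, 3.0, 4.0]]))
    print(trainable_fraction(768, 768, 8))
```

For a single 768 x 768 layer at rank 8, the two factors hold 8 x (768 + 768) = 12,288 parameters against 589,824 in the frozen weight, or about 2.1%. That per-layer ratio only illustrates why low-rank adapters shrink the trainable parameter count; it does not reproduce the model-level figure of less than 5% reported in the abstract.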

Keywords

Temporal action detection; end-to-end training; dynamic low-rank adapter; parameter-efficient fine-tuning; video understanding

Cite This Article

APA Style

Wu, J., Zhou, M., Wu, D., Ren, W., Mei, J. et al. (2026). DyLoRA-TAD: Dynamic Low-Rank Adapter for End-to-End Temporal Action Detection. Computers, Materials & Continua, 86(3), 92. https://doi.org/10.32604/cmc.2025.072964

Vancouver Style

Wu J, Zhou M, Wu D, Ren W, Mei J, Zhang S. DyLoRA-TAD: Dynamic Low-Rank Adapter for End-to-End Temporal Action Detection. Comput Mater Contin. 2026;86(3):92. https://doi.org/10.32604/cmc.2025.072964

IEEE Style

J. Wu, M. Zhou, D. Wu, W. Ren, J. Mei, and S. Zhang, “DyLoRA-TAD: Dynamic Low-Rank Adapter for End-to-End Temporal Action Detection,” Comput. Mater. Contin., vol. 86, no. 3, pp. 92, 2026. https://doi.org/10.32604/cmc.2025.072964

Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
