Robust Swin Transformer for Vehicle Re-Identification with Dynamic Feature Fusion

Saifullah Tumrani; Abdul Siddiqui

doi:10.32604/cmc.2025.075152

Open Access

ARTICLE

Saifullah Tumrani^1,2,*, Abdul Jabbar Siddiqui^2,3,*

1 BioQuant, Ruprecht-Karls-Universität Heidelberg (Uni Heidelberg), Heidelberg, 69120, Germany
2 SDAIA-KFUPM Joint Research Center for Artificial Intelligence, King Fahd University of Petroleum & Minerals (KFUPM), Dhahran, 31261, Saudi Arabia
3 Computer Engineering Department, King Fahd University of Petroleum & Minerals (KFUPM), Dhahran, 31261, Saudi Arabia

* Corresponding Authors: Saifullah Tumrani; Abdul Jabbar Siddiqui

Computers, Materials & Continua 2026, 87(2), 25 https://doi.org/10.32604/cmc.2025.075152

Received 26 October 2025; Accepted 17 December 2025; Issue published 12 March 2026

Abstract

Vehicle re-identification (ReID) is a challenging task in intelligent transportation and urban surveillance systems because of variations in camera viewpoints, vehicle scales, and environmental conditions. Although recent transformer-based approaches have shown impressive performance by exploiting global dependencies, these models struggle with aspect ratio distortions and may overlook fine-grained local attributes that are crucial for distinguishing visually similar vehicles. We introduce a framework based on Swin Transformers that addresses these challenges with three components. First, to improve feature robustness and maintain vehicle proportions, our Aspect Ratio-Aware Swin Transformer (AR-Swin) preserves the native aspect ratio via letterboxing, uses a non-square (16 × 8) patch-embedding stem, and keeps fixed 7 × 7 token windows. Second, we introduce a Dynamic Feature Fusion Network (DFFNet) that adaptively integrates global Swin features with local attribute embeddings, such as color and vehicle type, enabling more discriminative representations. Third, our Regional Attention Blocks incorporate regional masks into the transformer’s windowed attention mechanism, highlighting critical details such as manufacturer logos or lights. On VeRi-776, we obtain 82.55 mAP, 97.26 Rank-1, and 99.23 Rank-5; on VehicleID, we obtain 91.8 Rank-1 and 97.75 Rank-5. The design is a drop-in modification for Swin backbones and emphasizes robustness without increasing architectural complexity. Code: https://github.com/sft110/Swinvreid.
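To make the aspect-ratio arithmetic in the abstract concrete, the following is a minimal sketch of letterbox geometry and of the token grid produced by a non-square patch stem. It is not the authors' implementation. The function names `letterbox_geometry` and `patch_grid`, the 448 × 224 canvas, the 640 × 360 source crop, and the reading of "16 × 8" as 16 pixels wide by 8 pixels tall are all assumptions made for illustration.

```python
def letterbox_geometry(src_w, src_h, dst_w, dst_h):
    """Resize-and-pad geometry that keeps the source aspect ratio.

    Returns (new_w, new_h, pad_left, pad_top, pad_right, pad_bottom).
    Hypothetical helper; the canvas size is chosen by the caller.
    """
    # One uniform scale for both axes, so the vehicle is never stretched.
    scale = min(dst_w / src_w, dst_h / src_h)
    new_w = max(1, round(src_w * scale))
    new_h = max(1, round(src_h * scale))
    # Leftover canvas area is filled with padding, split as evenly as possible.
    pad_w = dst_w - new_w
    pad_h = dst_h - new_h
    pad_left = pad_w // 2
    pad_top = pad_h // 2
    return new_w, new_h, pad_left, pad_top, pad_w - pad_left, pad_h - pad_top


def patch_grid(img_w, img_h, patch_w, patch_h):
    """Number of tokens (columns, rows) from a non-overlapping patch stem."""
    if img_w % patch_w or img_h % patch_h:
        raise ValueError("canvas must be divisible by the patch size")
    return img_w // patch_w, img_h // patch_h


if __name__ == "__main__":
    # Assumed example: a 640x360 crop placed on a 448x224 canvas.
    print(letterbox_geometry(640, 360, 448, 224))
    # Assumed orientation: patches 16 px wide and 8 px tall.
    cols, rows = patch_grid(448, 224, 16, 8)
    print(cols, rows)
    # Fixed 7x7 token windows tile the grid only if both sides divide by 7.
    print(cols % 7 == 0 and rows % 7 == 0)
```

Under these assumed sizes, a 2:1 canvas combined with a 2:1 patch yields a square token grid, which is one way fixed square windows could tile a non-square input without modification. Whether the paper uses these exact dimensions is not stated in the abstract.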

Keywords

Vehicle ReID; swin transformer; aspect ratio robustness; multi-attribute learning

Cite This Article

APA Style

Tumrani, S., Siddiqui, A.J. (2026). Robust Swin Transformer for Vehicle Re-Identification with Dynamic Feature Fusion. Computers, Materials & Continua, 87(2), 25. https://doi.org/10.32604/cmc.2025.075152

Vancouver Style

Tumrani S, Siddiqui AJ. Robust Swin Transformer for Vehicle Re-Identification with Dynamic Feature Fusion. Comput Mater Contin. 2026;87(2):25. https://doi.org/10.32604/cmc.2025.075152

IEEE Style

S. Tumrani and A. J. Siddiqui, “Robust Swin Transformer for Vehicle Re-Identification with Dynamic Feature Fusion,” Comput. Mater. Contin., vol. 87, no. 2, pp. 25, 2026. https://doi.org/10.32604/cmc.2025.075152

Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
