Open Access

ARTICLE


Hybrid HRNet-Swin Transformer: Multi-Scale Feature Fusion for Aerial Segmentation and Classification

Asaad Algarni1, Aysha Naseer2, Mohammed Alshehri3, Yahya AlQahtani4, Abdulmonem Alshahrani4, Jeongmin Park5,*

1 Department of Computer Sciences, Faculty of Computing and Information Technology, Northern Border University, Rafha, 91911, Saudi Arabia
2 Department of Computer Science, Air University, Islamabad, 44000, Pakistan
3 Department of Computer Science, King Khalid University, Abha, 61421, Saudi Arabia
4 Department of Informatics and Computer Systems, King Khalid University, Abha, 61421, Saudi Arabia
5 Department of Computer Engineering, Tech University of Korea, 237 Sangidaehak-ro, Siheung-si, 15073, Gyeonggi-do, Republic of Korea

* Corresponding Author: Jeongmin Park. Email: email

Computers, Materials & Continua 2025, 85(1), 1981-1998. https://doi.org/10.32604/cmc.2025.064268

Abstract

Remote sensing plays a pivotal role in environmental monitoring, disaster relief, and urban planning, where accurate scene classification of aerial images is essential. However, conventional convolutional neural networks (CNNs) struggle with long-range dependencies and preserving high-resolution features, limiting their effectiveness in complex aerial image analysis. To address these challenges, we propose a Hybrid HRNet-Swin Transformer model that synergizes the strengths of HRNet-W48 for high-resolution segmentation and the Swin Transformer for global feature extraction. This hybrid architecture ensures robust multi-scale feature fusion, capturing fine-grained details and broader contextual relationships in aerial imagery. Our methodology begins with preprocessing steps, including normalization, histogram equalization, and noise reduction, to enhance input data quality. The HRNet-W48 backbone maintains high-resolution feature maps throughout the network, enabling precise segmentation, while the Swin Transformer leverages hierarchical self-attention to model long-range dependencies efficiently. By integrating these components, our model achieves superior performance in segmentation and classification tasks compared to traditional CNNs and standalone transformer models. We evaluate our approach on two benchmark datasets: UC Merced and WHU-RS19. Experimental results demonstrate that the proposed hybrid model outperforms existing methods, achieving state-of-the-art accuracy while maintaining computational efficiency. Specifically, it excels in preserving fine spatial details and contextual understanding, critical for applications like land-use classification and disaster assessment.
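The abstract lists normalization, histogram equalization, and noise reduction as the preprocessing steps applied before feature extraction. The paper's own implementation is not reproduced here; the following is a minimal NumPy sketch of what such a pipeline could look like for an 8-bit grayscale aerial image, with a 3x3 mean filter standing in as an assumed, illustrative choice of noise-reduction step.

```python
import numpy as np

def preprocess_aerial_image(img: np.ndarray) -> np.ndarray:
    """Illustrative preprocessing for an 8-bit grayscale image:
    histogram equalization, 3x3 mean-filter smoothing, and
    normalization to [0, 1] as network input."""
    # Histogram equalization: remap intensities via the cumulative
    # distribution function so the histogram is roughly uniform.
    hist = np.bincount(img.ravel(), minlength=256)
    cdf = hist.cumsum()
    cdf_min = cdf[cdf > 0].min()
    lut = np.round((cdf - cdf_min) / (cdf[-1] - cdf_min) * 255).astype(np.uint8)
    equalized = lut[img]

    # Simple 3x3 mean filter as a stand-in for noise reduction.
    padded = np.pad(equalized.astype(np.float64), 1, mode="edge")
    h, w = img.shape
    smoothed = sum(
        padded[dy:dy + h, dx:dx + w]
        for dy in range(3) for dx in range(3)
    ) / 9.0

    # Normalize intensities to [0, 1].
    return smoothed / 255.0
```

In practice a library routine (e.g. OpenCV's equalization and denoising functions) would replace these hand-rolled steps, but the sketch shows the order of operations the abstract describes.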

Keywords

Remote sensing; computer vision; aerial imagery; scene classification; feature extraction; transformer

Cite This Article

APA Style
Algarni, A., Naseer, A., Alshehri, M., AlQahtani, Y., Alshahrani, A. et al. (2025). Hybrid HRNet-Swin Transformer: Multi-Scale Feature Fusion for Aerial Segmentation and Classification. Computers, Materials & Continua, 85(1), 1981–1998. https://doi.org/10.32604/cmc.2025.064268
Vancouver Style
Algarni A, Naseer A, Alshehri M, AlQahtani Y, Alshahrani A, Park J. Hybrid HRNet-Swin Transformer: Multi-Scale Feature Fusion for Aerial Segmentation and Classification. Comput Mater Contin. 2025;85(1):1981–1998. https://doi.org/10.32604/cmc.2025.064268
IEEE Style
A. Algarni, A. Naseer, M. Alshehri, Y. AlQahtani, A. Alshahrani, and J. Park, “Hybrid HRNet-Swin Transformer: Multi-Scale Feature Fusion for Aerial Segmentation and Classification,” Comput. Mater. Contin., vol. 85, no. 1, pp. 1981–1998, 2025. https://doi.org/10.32604/cmc.2025.064268



Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.