Open Access

ARTICLE

A Hybrid Deep Learning Approach Using Vision Transformer and U-Net for Flood Segmentation

Cyreneo Dofitas1, Yong-Woon Kim2, Yung-Cheol Byun3,*
1 Department of Institute of Information Science and Technology, Jeju National University, Jeju-si, 63243, Republic of Korea
2 Department of Computer Engineering, Jeju National University, Jeju, 63243, Republic of Korea
3 Department of Computer Engineering, Major of Electronic Engineering, Jeju National University, Food Tech Center (FTC), Jeju National University, Jeju, 63243, Republic of Korea
* Corresponding Author: Yung-Cheol Byun

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.069374

Received 21 June 2025; Accepted 27 August 2025; Published online 30 October 2025

Abstract

Recent advances in deep learning have significantly improved flood detection and segmentation from aerial and satellite imagery. However, conventional convolutional neural networks (CNNs) often struggle in complex flood scenarios involving reflections, occlusions, or indistinct boundaries because of their limited contextual modeling. To address these challenges, we propose a hybrid flood segmentation framework that integrates a Vision Transformer (ViT) encoder with a U-Net decoder, enhanced by a novel Flood-Aware Refinement Block (FARB). The FARB module improves boundary delineation and suppresses noise by combining residual smoothing with spatial-channel attention mechanisms. We evaluate our model on a UAV-acquired flood imagery dataset and show that the proposed ViT-UNet+FARB architecture outperforms existing CNN and Transformer-based models in terms of accuracy and mean Intersection over Union (mIoU). Detailed ablation studies further validate the contribution of each component, confirming that the FARB design significantly enhances segmentation quality. Owing to its better performance and computational efficiency, the proposed framework is well-suited for flood monitoring and disaster response applications, particularly in resource-constrained environments.
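
For readers unfamiliar with the mIoU metric named above, the following is a minimal sketch of its standard definition: for each class, IoU is the number of pixels where prediction and ground truth agree on that class, divided by the number of pixels where either assigns it, and mIoU is the average over classes. The abstract does not specify the evaluation protocol, so the function name `mean_iou`, the binary labeling (0 = background, 1 = flood), and the toy masks are assumptions made for illustration only.

```python
from typing import Sequence


def mean_iou(pred: Sequence[Sequence[int]],
             target: Sequence[Sequence[int]],
             num_classes: int = 2) -> float:
    """Mean IoU over the classes present in the prediction or the ground truth.

    Assumes integer class labels in [0, num_classes); here 0 = background
    and 1 = flood (an illustrative convention, not taken from the paper).
    """
    intersection = [0] * num_classes  # true positives per class
    union = [0] * num_classes         # TP + FP + FN per class
    for pred_row, target_row in zip(pred, target):
        for p, t in zip(pred_row, target_row):
            if p == t:
                intersection[p] += 1
                union[p] += 1
            else:
                union[p] += 1  # false positive for the predicted class
                union[t] += 1  # false negative for the true class
    # Skip classes that appear in neither mask to avoid dividing by zero.
    ious = [i / u for i, u in zip(intersection, union) if u > 0]
    return sum(ious) / len(ious) if ious else 0.0


# Toy 2x3 masks (hypothetical data).
pred_mask = [[1, 1, 0],
             [0, 1, 0]]
true_mask = [[1, 0, 0],
             [0, 1, 1]]
print(mean_iou(pred_mask, true_mask))
```

In this toy case each class has two correctly labeled pixels out of four in its union, so both per-class IoU values, and therefore their mean, are 0.5.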

Keywords

Flood detection; vision transformer (ViT); U-Net segmentation; image processing; deep learning; artificial intelligence