Open Access

ARTICLE

Transformer-Based Fusion of Infrared and Visible Imagery for Smoke Recognition in Commercial Areas

Chongyang Wang1, Qiongyan Li1, Shu Liu2, Pengle Cheng1,*, Ying Huang3

1 School of Technology, Beijing Forestry University, Beijing, 100083, China
2 HES Technology Group Co., Ltd., Beijing, 100071, China
3 Department of Civil, Construction, and Environmental Engineering, North Dakota State University, Fargo, ND 58102, USA

* Corresponding Author: Pengle Cheng. Email: email

(This article belongs to the Special Issue: New Trends in Image Processing)

Computers, Materials & Continua 2025, 84(3), 5157-5176. https://doi.org/10.32604/cmc.2025.067367

Abstract

With rapid urbanization, fires pose significant challenges for urban governance. Traditional fire detection methods often struggle to detect smoke in complex urban scenes because of environmental interference and variations in viewing angle. This study proposes a novel multimodal smoke detection method that fuses infrared and visible imagery using a transformer-based deep learning model. By capturing both thermal and visual cues, our approach significantly enhances the accuracy and robustness of smoke detection in business park scenes. We first established a dual-view dataset comprising infrared and visible-light videos, then implemented an innovative image feature fusion strategy and designed a deep learning model based on the transformer architecture and attention mechanism for smoke classification. Experimental results demonstrate that our method outperforms existing methods: with multi-view input, it achieves an accuracy of 90.88%, a precision of 98.38%, a recall of 92.41%, and false positive and false negative rates both below 5%, underlining the effectiveness of the proposed multimodal and multi-view fusion approach. The attention mechanism plays a crucial role in improving detection performance, particularly in identifying subtle smoke features.

Keywords

Multimodal image processing; smoke recognition; urban safety; environmental monitoring

Cite This Article

APA Style
Wang, C., Li, Q., Liu, S., Cheng, P., Huang, Y. (2025). Transformer-Based Fusion of Infrared and Visible Imagery for Smoke Recognition in Commercial Areas. Computers, Materials & Continua, 84(3), 5157–5176. https://doi.org/10.32604/cmc.2025.067367
Vancouver Style
Wang C, Li Q, Liu S, Cheng P, Huang Y. Transformer-Based Fusion of Infrared and Visible Imagery for Smoke Recognition in Commercial Areas. Comput Mater Contin. 2025;84(3):5157–5176. https://doi.org/10.32604/cmc.2025.067367
IEEE Style
C. Wang, Q. Li, S. Liu, P. Cheng, and Y. Huang, “Transformer-Based Fusion of Infrared and Visible Imagery for Smoke Recognition in Commercial Areas,” Comput. Mater. Contin., vol. 84, no. 3, pp. 5157–5176, 2025. https://doi.org/10.32604/cmc.2025.067367



Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.