Transformer-Based Fusion of Infrared and Visible Imagery for Smoke Recognition in Commercial Areas

Chongyang Wang; Qiongyan Li; Shu Liu; Pengle Cheng; Ying Huang

doi:10.32604/cmc.2025.067367

Open Access

ARTICLE

Chongyang Wang¹, Qiongyan Li¹, Shu Liu², Pengle Cheng¹,*, Ying Huang³

1 School of Technology, Beijing Forestry University, Beijing, 100083, China
2 HES Technology Group Co., Ltd., Beijing, 100071, China
3 Department of Civil, Construction, and Environmental Engineering, North Dakota State University, Fargo, ND 58102, USA

* Corresponding Author: Pengle Cheng

(This article belongs to the Special Issue: New Trends in Image Processing)

Computers, Materials & Continua 2025, 84(3), 5157-5176. https://doi.org/10.32604/cmc.2025.067367

Received 01 May 2025; Accepted 13 June 2025; Issue published 30 July 2025

Abstract

With rapid urbanization, fires pose significant challenges for urban governance. Traditional fire detection methods often struggle to detect smoke in complex urban scenes because of environmental interference and variations in viewing angle. This study proposes a novel multimodal smoke detection method that fuses infrared and visible imagery using a transformer-based deep learning model. By capturing both thermal and visual cues, our approach significantly enhances the accuracy and robustness of smoke detection in business park scenes. We first established a dual-view dataset comprising infrared and visible-light videos, then implemented an image feature fusion strategy and designed a deep learning model based on the transformer architecture and attention mechanism for smoke classification. Experimental results demonstrate that our method outperforms existing methods: with multi-view input, it achieves an accuracy of 90.88%, a precision of 98.38%, and a recall of 92.41%, with false positive and false negative rates both below 5%, underlining the effectiveness of the proposed multimodal, multi-view fusion approach. The attention mechanism plays a crucial role in improving detection performance, particularly in identifying subtle smoke features.
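
For readers less familiar with the metrics quoted above, the following is a minimal sketch of how accuracy, precision, recall, and the false positive and false negative rates are conventionally computed for a binary smoke/non-smoke classifier. The names `ConfusionCounts` and `classification_metrics` and the example counts are hypothetical and chosen only for illustration; the abstract does not state the paper's exact definitions or averaging scheme, which may differ.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ConfusionCounts:
    """Binary confusion-matrix counts, with 'smoke' as the positive class."""
    tp: int  # smoke samples predicted as smoke
    fp: int  # non-smoke samples predicted as smoke
    tn: int  # non-smoke samples predicted as non-smoke
    fn: int  # smoke samples predicted as non-smoke


def _ratio(numerator: int, denominator: int) -> float:
    """Return numerator / denominator, or 0.0 when the denominator is zero."""
    return numerator / denominator if denominator else 0.0


def classification_metrics(c: ConfusionCounts) -> dict:
    """Compute the conventional binary-classification metrics from counts."""
    total = c.tp + c.fp + c.tn + c.fn
    return {
        "accuracy": _ratio(c.tp + c.tn, total),
        "precision": _ratio(c.tp, c.tp + c.fp),
        "recall": _ratio(c.tp, c.tp + c.fn),
        "false_positive_rate": _ratio(c.fp, c.fp + c.tn),
        "false_negative_rate": _ratio(c.fn, c.fn + c.tp),
    }


if __name__ == "__main__":
    # Hypothetical counts for illustration only; not taken from the paper.
    counts = ConfusionCounts(tp=90, fp=5, tn=95, fn=10)
    for name, value in classification_metrics(counts).items():
        print(f"{name}: {value:.4f}")
```

Working from raw counts rather than precomputed rates keeps all five metrics derived from the same confusion matrix, which makes them easier to compare across single-view and multi-view inputs.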

Keywords

Multimodal image processing; smoke recognition; urban safety; environmental monitoring

Cite This Article

APA Style

Wang, C., Li, Q., Liu, S., Cheng, P., Huang, Y. (2025). Transformer-Based Fusion of Infrared and Visible Imagery for Smoke Recognition in Commercial Areas. Computers, Materials & Continua, 84(3), 5157–5176. https://doi.org/10.32604/cmc.2025.067367

Vancouver Style

Wang C, Li Q, Liu S, Cheng P, Huang Y. Transformer-Based Fusion of Infrared and Visible Imagery for Smoke Recognition in Commercial Areas. Comput Mater Contin. 2025;84(3):5157–5176. https://doi.org/10.32604/cmc.2025.067367

IEEE Style

C. Wang, Q. Li, S. Liu, P. Cheng, and Y. Huang, “Transformer-Based Fusion of Infrared and Visible Imagery for Smoke Recognition in Commercial Areas,” Comput. Mater. Contin., vol. 84, no. 3, pp. 5157–5176, 2025. https://doi.org/10.32604/cmc.2025.067367

Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
