Open Access



Traffic Scene Captioning with Multi-Stage Feature Enhancement

Dehai Zhang*, Yu Ma, Qing Liu, Haoxing Wang, Anquan Ren, Jiashu Liang

School of Software, Yunnan University, Kunming, 650091, China

* Corresponding Author: Dehai Zhang.

(This article belongs to the Special Issue: Transport Resilience and Emergency Management in the Era of Artificial Intelligence)

Computers, Materials & Continua 2023, 76(3), 2901-2920.


Traffic scene captioning automatically generates one or more sentences that describe the content of an input traffic scene image, helping to ensure road safety and providing an important decision-making aid for sustainable transportation. To describe complex traffic scenes comprehensively and reasonably, this paper proposes a traffic scene semantic captioning model with multi-stage feature enhancement. The model follows an encoder-decoder structure. First, visual features at multiple levels of granularity are used for feature enhancement during encoding, which enables the model to learn more detailed content in the traffic scene image. Second, a scene knowledge graph is applied during decoding: the semantic features it provides enhance the features learned by the decoder a second time, so that the model learns the attributes of objects in the traffic scene and the relationships between them, and generates more reasonable captions. Extensive experiments on the challenging MS-COCO dataset, evaluated with five standard automatic evaluation metrics, show that the proposed model improves significantly on all metrics compared with state-of-the-art methods, notably achieving a CIDEr-D score of 129.0. These results indicate that the proposed model can provide a more reasonable and comprehensive description of traffic scenes.
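The two enhancement stages described above can be illustrated with a minimal sketch. It is not the authors' implementation: the use of scaled dot-product attention, the residual (additive) fusion, and all function names (`attend`, `enhance`, `encode`, `decode_step`) are assumptions made only to show where fine-grained visual features and knowledge-graph embeddings could enter an encoder-decoder pipeline.

```python
import math

def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def attend(query, keys):
    """Scaled dot-product attention: a weighted average of `keys` for `query`."""
    scale = math.sqrt(len(query))
    weights = softmax([dot(query, k) / scale for k in keys])
    dim = len(keys[0])
    return [sum(w * k[i] for w, k in zip(weights, keys)) for i in range(dim)]

def enhance(base, context):
    """Residual enhancement (an assumption): add the attended context to the base."""
    return [b + c for b, c in zip(base, context)]

def encode(region_feats, grid_feats):
    """Stage 1 (encoder): enhance coarse region features with fine-grained grid features."""
    return [enhance(r, attend(r, grid_feats)) for r in region_feats]

def decode_step(hidden, visual_feats, kg_embeddings):
    """Stage 2 (decoder): attend to visual features, then enhance the result
    again with (hypothetical) scene knowledge-graph node embeddings."""
    h = enhance(hidden, attend(hidden, visual_feats))
    return enhance(h, attend(h, kg_embeddings))

# Toy usage with 2-dimensional features.
visual = encode([[1.0, 0.0], [0.0, 1.0]], [[0.5, 0.5], [1.0, -1.0]])
state = decode_step([0.1, 0.2], visual, [[0.3, 0.3], [0.0, 1.0]])
```

In a real captioning model the enhanced decoder state would feed a vocabulary classifier at each time step; the sketch stops at the feature level because the abstract does not specify those details.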



This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.