Traffic Scene Captioning with Multi-Stage Feature Enhancement

Dehai Zhang; Yu Ma; Qing Liu; Haoxing Wang; Anquan Ren; Jiashu Liang

doi:10.32604/cmc.2023.038264

Open Access

ARTICLE

School of Software, Yunnan University, Kunming, 650091, China

* Corresponding Author: Dehai Zhang. Email: email

Computers, Materials & Continua 2023, 76(3), 2901-2920. https://doi.org/10.32604/cmc.2023.038264

Received 05 December 2022; Accepted 10 April 2023; Issue published 08 October 2023

Abstract

Traffic scene captioning automatically generates one or more sentences that describe the content of an input traffic scene image, which helps ensure road safety and provides an important decision-making function for sustainable transportation. To describe complex traffic scenes comprehensively and reasonably, this paper proposes a traffic scene semantic captioning model with multi-stage feature enhancement. The model follows an encoder-decoder structure. First, during encoding, visual features at multiple levels of granularity are used for feature enhancement, which enables the model to learn more detailed content from the traffic scene image. Second, during decoding, the semantic features provided by a scene knowledge graph enhance the features learned by the decoder a second time, so that the model learns the attributes of objects in the traffic scene and the relationships between them and can generate more reasonable captions. Extensive experiments on the challenging MS-COCO dataset, evaluated with five standard automatic evaluation metrics, show that the proposed model improves significantly on all metrics compared with state-of-the-art methods; in particular, it achieves a score of 129.0 on the CIDEr-D metric. These results indicate that the proposed model can provide a more reasonable and comprehensive description of traffic scenes.
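
The two enhancement stages described in the abstract can be illustrated with a minimal NumPy sketch. This is not the authors' implementation: the abstract does not specify how the features are fused, so the use of scaled dot-product attention with residual addition, the function names (attend, enhance_encoder, enhance_decoder), the feature sizes, and the random inputs are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    """Numerically stable softmax along the given axis."""
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def attend(queries, keys, values):
    """Scaled dot-product attention: one context vector per query row."""
    scores = queries @ keys.T / np.sqrt(keys.shape[-1])
    return softmax(scores, axis=-1) @ values


def enhance_encoder(region_feats, grid_feats):
    """Stage 1 (assumed form): coarse region features gather fine grid detail."""
    return region_feats + attend(region_feats, grid_feats, grid_feats)


def enhance_decoder(hidden, visual_feats, graph_feats):
    """Stage 2 (assumed form): the decoder state is enriched a second time
    with semantic features taken from scene knowledge graph nodes."""
    h = hidden[None, :]  # treat the single decoder state as one query row
    visual_ctx = attend(h, visual_feats, visual_feats)
    graph_ctx = attend(h, graph_feats, graph_feats)
    return (h + visual_ctx + graph_ctx)[0]


# Hypothetical inputs: random stand-ins for real image and graph features.
rng = np.random.default_rng(0)
d = 8                                      # feature dimension (assumed)
region_feats = rng.normal(size=(5, d))     # object-level features
grid_feats = rng.normal(size=(49, d))      # 7x7 grid-level features
graph_feats = rng.normal(size=(6, d))      # knowledge graph node embeddings
hidden = rng.normal(size=d)                # decoder state at one time step

enhanced = enhance_encoder(region_feats, grid_feats)
state = enhance_decoder(hidden, enhanced, graph_feats)
print(enhanced.shape, state.shape)
```

In this sketch the residual additions keep the original features intact while mixing in the extra context, which is one common way to realize "enhancement"; the published model may combine the features differently.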

Keywords

Traffic scene captioning; sustainable transportation; feature enhancement; encoder-decoder structure; multi-level granularity; scene knowledge graph

Cite This Article

APA Style

Zhang, D., Ma, Y., Liu, Q., Wang, H., Ren, A. et al. (2023). Traffic Scene Captioning with Multi-Stage Feature Enhancement. Computers, Materials & Continua, 76(3), 2901–2920. https://doi.org/10.32604/cmc.2023.038264

Vancouver Style

Zhang D, Ma Y, Liu Q, Wang H, Ren A, Liang J. Traffic Scene Captioning with Multi-Stage Feature Enhancement. Comput Mater Contin. 2023;76(3):2901–2920. https://doi.org/10.32604/cmc.2023.038264

IEEE Style

D. Zhang, Y. Ma, Q. Liu, H. Wang, A. Ren, and J. Liang, “Traffic Scene Captioning with Multi-Stage Feature Enhancement,” Comput. Mater. Contin., vol. 76, no. 3, pp. 2901–2920, 2023. https://doi.org/10.32604/cmc.2023.038264

Copyright © 2023 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
