Search Results (3)
  • Open Access


    Enhancing Cross-Lingual Image Description: A Multimodal Approach for Semantic Relevance and Stylistic Alignment

    Emran Al-Buraihy, Dan Wang*

    CMC-Computers, Materials & Continua, Vol.79, No.3, pp. 3913-3938, 2024, DOI:10.32604/cmc.2024.048104

    Abstract Cross-lingual image description, the task of generating image captions in a target language from images and descriptions in a source language, is addressed in this study through a novel approach that combines neural network models and semantic matching techniques. Experiments conducted on the Flickr8k and AraImg2k benchmark datasets, featuring images and descriptions in English and Arabic, show substantial performance improvements over state-of-the-art methods. Our model, equipped with the Image & Cross-Language Semantic Matching module and the Target Language Domain Evaluation module, significantly enhances the semantic relevance of generated image descriptions. For English-to-Arabic and Arabic-to-English cross-language…

  • Open Access


    Enhancing Image Description Generation through Deep Reinforcement Learning: Fusing Multiple Visual Features and Reward Mechanisms

    Yan Li, Qiyuan Wang*, Kaidi Jia

    CMC-Computers, Materials & Continua, Vol.78, No.2, pp. 2469-2489, 2024, DOI:10.32604/cmc.2024.047822

    Abstract The image description task lies at the intersection of computer vision and natural language processing and has important applications, including helping computers understand images and conveying visual information to the visually impaired. This study presents an innovative approach employing deep reinforcement learning to enhance the accuracy of natural language descriptions of images. Our method focuses on refining the reward function in deep reinforcement learning, facilitating the generation of precise descriptions by aligning visual and textual features more closely. Our approach comprises three key architectures. Firstly, it utilizes Residual Network 101 (ResNet-101) and Faster Region-based Convolutional Neural Network…

  • Open Access


    Feedback LSTM Network Based on Attention for Image Description Generator

    Zhaowei Qu1,*, Bingyu Cao1, Xiaoru Wang1, Fu Li2, Peirong Xu1, Luhan Zhang1

    CMC-Computers, Materials & Continua, Vol.59, No.2, pp. 575-589, 2019, DOI:10.32604/cmc.2019.05569

    Abstract Images are complex multimedia data that contain rich semantic information. Most current image description generation algorithms produce only plain descriptions, failing to distinguish between primary and secondary objects, which leads to insufficient high-level semantics and low accuracy under public evaluation criteria. The major issue is the lack of an effective network for generating high-level semantic sentences that describe the motion and state of the principal object in detail. To address this issue, this paper proposes the Attention-based Feedback Long Short-Term Memory Network (AFLN). Based on the existing encoder-decoder framework, there are two independent subtasks in…
