Search Results (1)
  • Open Access

    ARTICLE

    Triple Multimodal Cyclic Fusion and Self-Adaptive Balancing for Video Q&A Systems

    Xiliang Zhang1, Jin Liu1,*, Yue Li1, Zhongdai Wu2,3, Y. Ken Wang4

    CMC-Computers, Materials & Continua, Vol.73, No.3, pp. 6407-6424, 2022, DOI:10.32604/cmc.2022.027097

    Abstract: The performance of Video Question and Answer (VQA) systems relies on capturing key information from both visual images and natural language in context to generate relevant answers to questions. However, traditional linear combinations of multimodal features focus only on shallow feature interactions and fall far short of the deep feature fusion that is needed. Attention mechanisms have been used to perform deep fusion, but most of them can only assign weights to single-modal information, leading to attention imbalance across modalities. To address the above problems, we propose a novel VQA model based on Triple Multimodal feature Cyclic Fusion (TMCF) and Self-Adaptive Multimodal Balancing…
