WMA: A Multi-Scale Self-Attention Feature Extraction Network Based on Weight Sharing for VQA

doi:10.32604/jbd.2021.017169

Open Access

ARTICLE

Yue Li, Jin Liu*, Shengjie Shang

Shanghai Maritime University, Shanghai, 201306, China

* Corresponding Author: Jin Liu. Email: email

Journal on Big Data 2021, 3(3), 111-118. https://doi.org/10.32604/jbd.2021.017169

Received 22 January 2021; Accepted 20 June 2021; Issue published 22 November 2021

Abstract

Visual Question Answering (VQA) has attracted extensive research attention and has recently become a hot topic in deep learning. Advances in computer vision and natural language processing have driven progress in this area. The keys to improving the performance of a VQA system lie in its feature extraction, multimodal fusion, and answer prediction modules. One unsolved issue is that popular VQA image feature extraction modules have difficulty extracting fine-grained features from objects of different scales. In this paper, we design a novel feature extraction network that combines multi-scale convolution and self-attention branches to address this problem. Our approach achieves state-of-the-art single-model performance on the Pascal VOC 2012, VQA 1.0, and VQA 2.0 datasets.

Keywords

VQA; feature extraction; self-attention; fine-grained

Cite This Article

APA Style

Li, Y., Liu, J., & Shang, S. (2021). WMA: A multi-scale self-attention feature extraction network based on weight sharing for VQA. Journal on Big Data, 3(3), 111-118. https://doi.org/10.32604/jbd.2021.017169

Vancouver Style

Li Y, Liu J, Shang S. WMA: A multi-scale self-attention feature extraction network based on weight sharing for VQA. J Big Data. 2021;3(3):111-118. https://doi.org/10.32604/jbd.2021.017169

IEEE Style

Y. Li, J. Liu, and S. Shang, "WMA: A Multi-Scale Self-Attention Feature Extraction Network Based on Weight Sharing for VQA," J. Big Data, vol. 3, no. 3, pp. 111-118, 2021. https://doi.org/10.32604/jbd.2021.017169

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
