Open Access iconOpen Access



Multimodal Sentiment Analysis Based on a Cross-Modal Multihead Attention Mechanism

Lujuan Deng, Boyi Liu*, Zuhe Li

School of Computer and Communication Engineering, Zhengzhou University of Light Industry, Zhengzhou, 450002, China

* Corresponding Author: Boyi Liu. Email: email

Computers, Materials & Continua 2024, 78(1), 1157-1170.


Multimodal sentiment analysis aims to understand people’s emotions and opinions from diverse data. Concatenating or multiplying various modalities is a traditional multi-modal sentiment analysis fusion method. This fusion method does not utilize the correlation information between modalities. To solve this problem, this paper proposes a model based on a multi-head attention mechanism. First, after preprocessing the original data. Then, the feature representation is converted into a sequence of word vectors and positional encoding is introduced to better understand the semantic and sequential information in the input sequence. Next, the input coding sequence is fed into the transformer model for further processing and learning. At the transformer layer, a cross-modal attention consisting of a pair of multi-head attention modules is employed to reflect the correlation between modalities. Finally, the processed results are input into the feedforward neural network to obtain the emotional output through the classification layer. Through the above processing flow, the model can capture semantic information and contextual relationships and achieve good results in various natural language processing tasks. Our model was tested on the CMU Multimodal Opinion Sentiment and Emotion Intensity (CMU-MOSEI) and Multimodal EmotionLines Dataset (MELD), achieving an accuracy of 82.04% and F1 parameters reached 80.59% on the former dataset.


Cite This Article

L. Deng, B. Liu and Z. Li, "Multimodal sentiment analysis based on a cross-modal multihead attention mechanism," Computers, Materials & Continua, vol. 78, no.1, pp. 1157–1170, 2024.

cc This work is licensed under a Creative Commons Attribution 4.0 International License , which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
  • 196


  • 69


  • 0


Share Link