Chenquan Gan1,2,*, Xu Liu1, Yu Tang2, Xianrong Yu3, Qingyi Zhu1, Deepak Kumar Jain4
CMC-Computers, Materials & Continua, Vol.85, No.3, pp. 5399-5421, 2025, DOI:10.32604/cmc.2025.068126
- 23 October 2025
Abstract Multimodal sentiment analysis aims to understand emotions from text, speech, and video data. However, current methods often overlook the dominant role of text and suffer from feature loss during integration. Because the importance of each modality varies with context, a central challenge in multimodal sentiment analysis is to make full use of rich intra-modal features while minimizing information loss during fusion. To address these limitations, we propose a novel framework that integrates spatial position encoding and fusion embedding modules. In our model, text is…