Open Access

ARTICLE


GLAMSNet: A Gated-Linear Aspect-Aware Multimodal Sentiment Network with Alignment Supervision and External Knowledge Guidance

Dan Wang1, Zhoubin Li1, Yuze Xia1,2,*, Zhenhua Yu1,*

1 College of Artificial Intelligence & Computer Science, Xi’an University of Science and Technology, Xi’an, 710054, China
2 Institute of Systems Engineering, Macau University of Science and Technology, Macau, 999078, China

* Corresponding Authors: Yuze Xia; Zhenhua Yu

(This article belongs to the Special Issue: Sentiment Analysis for Social Media Data: Lexicon-Based and Large Language Model Approaches)

Computers, Materials & Continua 2025, 85(3), 5823-5845. https://doi.org/10.32604/cmc.2025.071656

Abstract

Multimodal Aspect-Based Sentiment Analysis (MABSA) aims to detect sentiment polarity toward specific aspects by leveraging both textual and visual inputs. However, existing models suffer from weak aspect-image alignment, modality imbalance dominated by textual signals, and limited reasoning for implicit or ambiguous sentiments that require external knowledge. To address these issues, we propose a unified framework named the Gated-Linear Aspect-Aware Multimodal Sentiment Network (GLAMSNet). First, an input encoding module constructs modality-specific and aspect-aware representations. Subsequently, we introduce an image–aspect correlation matching module that provides hierarchical supervision for visual–textual alignment. Building upon these components, we further design a Gated-Linear Aspect-Aware Fusion (GLAF) module to enhance aspect-aware representation learning by adaptively filtering irrelevant textual information and refining semantic alignment under aspect guidance. Additionally, an External Language Model Knowledge-Guided mechanism incorporates sentiment-aware prior knowledge from GPT-4o, enabling robust semantic reasoning, especially under noisy or ambiguous inputs. Experiments on the Twitter-15 and Twitter-17 datasets demonstrate that the proposed model outperforms most state-of-the-art methods, achieving 79.36% accuracy and 74.72% F1-score, and 74.31% accuracy and 72.01% F1-score, respectively.
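The gated fusion idea described above can be illustrated with a minimal sketch. This is not the paper's GLAF module: the function name, the scalar (rather than per-dimension) gate, and the plain-list vectors are all simplifying assumptions made here for illustration. The sketch shows only the core mechanism: an aspect-conditioned gate decides how much of the textual vs. visual representation survives into the fused vector.

```python
import math


def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))


def gated_fusion(text_vec, visual_vec, aspect_vec, w_gate, b_gate):
    """Aspect-conditioned gated fusion (illustrative sketch, not GLAF itself).

    A scalar gate g is computed from the concatenated text and aspect
    vectors; the fused output is a convex combination g*text + (1-g)*visual,
    so the aspect steers how much each modality contributes.
    """
    concat = text_vec + aspect_vec  # list concatenation: [text; aspect]
    z = sum(w * x for w, x in zip(w_gate, concat)) + b_gate
    g = sigmoid(z)
    return [g * t + (1.0 - g) * v for t, v in zip(text_vec, visual_vec)]


# Toy usage: with zero gate weights, g = 0.5 and the fusion is an average.
text = [1.0, 0.0]
visual = [0.0, 1.0]
aspect = [0.0, 0.0]
fused = gated_fusion(text, visual, aspect, w_gate=[0.0] * 4, b_gate=0.0)
print(fused)  # → [0.5, 0.5]
```

In the full model the gate would be a learned, per-dimension vector and the inputs would be encoder hidden states; the convex-combination structure, however, is what lets the network suppress irrelevant textual signal under aspect guidance.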

Keywords

Sentiment analysis; multimodal aspect-based sentiment analysis; cross-modal alignment; multimodal sentiment classification; large language model

Cite This Article

APA Style
Wang, D., Li, Z., Xia, Y., & Yu, Z. (2025). GLAMSNet: A Gated-Linear Aspect-Aware Multimodal Sentiment Network with Alignment Supervision and External Knowledge Guidance. Computers, Materials & Continua, 85(3), 5823–5845. https://doi.org/10.32604/cmc.2025.071656
Vancouver Style
Wang D, Li Z, Xia Y, Yu Z. GLAMSNet: A Gated-Linear Aspect-Aware Multimodal Sentiment Network with Alignment Supervision and External Knowledge Guidance. Comput Mater Contin. 2025;85(3):5823–5845. https://doi.org/10.32604/cmc.2025.071656
IEEE Style
D. Wang, Z. Li, Y. Xia, and Z. Yu, “GLAMSNet: A Gated-Linear Aspect-Aware Multimodal Sentiment Network with Alignment Supervision and External Knowledge Guidance,” Comput. Mater. Contin., vol. 85, no. 3, pp. 5823–5845, 2025. https://doi.org/10.32604/cmc.2025.071656



Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.