Open Access
ARTICLE
SYMPHONIA: Enhanced Multimodal Emotion Recognition with Dual-Branch Dynamic Attention and Hierarchical Adaptive Fusion
1 Department of Computer Engineering, Gachon University, Sujeong-Gu, Seongnam-si, Gyeonggi-Do, Republic of Korea
2 Department of Computer Systems, Tashkent University of Information Technologies Named after Muhammad Al-Khwarizmi, Tashkent, Uzbekistan
3 Department of Industrial Management and Digital Technologies, Nordic International University, Tashkent, Uzbekistan
4 Department of Applied Informatics, Kimyo International University in Tashkent, Tashkent, Uzbekistan
5 Department of Information Processing and Management Systems, Tashkent State Technical University, Tashkent, Uzbekistan
6 Department of General Education Disciplines and Distance Education, Nukus State Pedagogical Institute Named after Ajiniyaz, Nukus, Uzbekistan
* Corresponding Author: Young-Im Cho. Email:
(This article belongs to the Special Issue: Deep Learning for Emotion Recognition)
Computers, Materials & Continua 2026, 88(1), 74 https://doi.org/10.32604/cmc.2026.077057
Received 01 December 2025; Accepted 03 March 2026; Issue published 08 May 2026
Abstract
Human emotions are intricate and difficult to decipher across modalities. Current methodologies frequently employ inflexible fusion strategies that ignore the dynamic, context-sensitive character of emotional expression in visual and textual media. This paper presents SYMPHONIA (Synchronizing Facial and Textual Modalities for Emotion Understanding), an architecture engineered to capture and integrate emotional signals from facial expressions and language while remaining attuned to context and cross-modal interactions. SYMPHONIA comprises two branches: a Facial Emotion Branch built on Vision Transformers and facial landmarks, and a Textual Emotion Branch built on RoBERTa embeddings and graph-based reasoning. A Dual-Branch Dynamic Attention Mechanism and a Hierarchical Adaptive Fusion Module connect the two branches. SYMPHONIA outperformed state-of-the-art models on four datasets: IEMOCAP, MELD, CMU-MOSI, and CMU-MOSEI. On IEMOCAP it achieved 80.9% accuracy and an 80.1% F1-score, surpassing DualGATs (74.8%) and EmoCLIP (75.3%); on MELD it reached 74.2% accuracy and a 73.5% F1-score. For sentiment prediction, it exceeded competing models with Pearson correlations of 0.86 on MOSI and 0.83 on MOSEI. Cross-dataset experiments demonstrated generalization: trained on IEMOCAP and tested on MELD, SYMPHONIA attained 66.9% accuracy, exceeding all baselines. These results indicate that SYMPHONIA recognizes emotions and analyzes sentiment robustly across diverse settings.
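To make the fusion idea concrete, the following is a minimal, dependency-free sketch of adaptive modality fusion in the spirit described above. It is not the authors' implementation: the gating vector, feature dimensions, and scoring rule are all illustrative assumptions. The point is only that fusion weights are computed from the inputs themselves (here via a softmax over dot-product scores) rather than fixed in advance.

```python
import math

def softmax(xs):
    """Numerically stable softmax over a list of scores."""
    m = max(xs)
    es = [math.exp(x - m) for x in xs]
    s = sum(es)
    return [e / s for e in es]

def adaptive_fusion(face_feat, text_feat, gate):
    """Fuse two modality vectors with data-dependent weights.

    `gate` is a hypothetical learned scoring vector: each modality's
    fusion weight is the softmax of its dot product with the gate,
    so the blend shifts per input rather than being fixed.
    """
    score_f = sum(g * f for g, f in zip(gate, face_feat))
    score_t = sum(g * t for g, t in zip(gate, text_feat))
    a_f, a_t = softmax([score_f, score_t])
    fused = [a_f * f + a_t * t for f, t in zip(face_feat, text_feat)]
    return fused, (a_f, a_t)

face = [0.2, -0.1, 0.7, 0.0]   # stand-in for ViT/landmark features
text = [0.5, 0.3, -0.2, 0.1]   # stand-in for RoBERTa/graph features
gate = [1.0, 0.5, -0.5, 0.2]   # illustrative gating parameters
fused, (a_f, a_t) = adaptive_fusion(face, text, gate)
print(round(a_f + a_t, 6))  # fusion weights sum to 1
```

In the paper's full model this role is played by the Hierarchical Adaptive Fusion Module operating on learned deep features; the sketch only shows the gating mechanism in isolation.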
Copyright © 2026 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

