Open Access

ARTICLE

NeuroVision: Multimodal Emotion Recognition via Dynamic Frame Enhancement and EEG-Guided Fusion

Ramakrishna Gandi1,*, Geetha A.1, Ramasubbareddy B.2

1 Computer Science & Engineering Department, Annamalai University, Annamalainagar, India
2 Department of CSE, Mohan Babu University, Tirupati, India

* Corresponding Author: Ramakrishna Gandi. Email: email

Computers, Materials & Continua 2026, 88(2), 100 https://doi.org/10.32604/cmc.2026.077569

Abstract

Emotion recognition is crucial in affective computing, human-computer interaction, and psychological evaluation. Unimodal systems, whether visual or physiological, usually fail to capture the complexity of emotional states. This paper proposes NeuroVision, a multimodal emotion recognition system that combines information from facial video frames with electroencephalogram (EEG) signals to improve accuracy and stability. The system applies ResNet50 to the spatial information in facial expressions, a Vision Transformer (ViT) to the temporal movements in the video, and an EEG-MLP encoder that reads the EEG signal without preprocessing to capture raw neural patterns. The fused features are passed to fully connected layers with a softmax output, which classify them into three emotional states: Happy, Sad, and Neutral. The model was evaluated on the LUMED-2 dataset, where it reached a classification accuracy of 96.9%, higher than that reported for existing unimodal and multimodal systems. Extensive evaluation, including learning curves, confusion matrices, and benchmarking, indicates that NeuroVision is effective, generalizes well, and is suitable for real-time adaptive systems such as mental health tracking and responsive interfaces.
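The classification stage described in the abstract (fused features, a fully connected layer, softmax over three classes) can be illustrated with a minimal sketch. It is not the authors' implementation. The fusion is assumed here to be plain concatenation, the feature sizes (8, 6, 4) are hypothetical, the random vectors stand in for the ResNet50, ViT, and EEG-MLP encoder outputs, and the weights are untrained placeholders.

```python
import math
import random

LABELS = ["Happy", "Sad", "Neutral"]  # the three classes named in the abstract


def softmax(logits):
    """Numerically stable softmax over a list of floats."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def init_linear(n_in, n_out, rng):
    """Random weights and zero biases (untrained placeholders)."""
    scale = 1.0 / math.sqrt(n_in)
    weights = [[rng.uniform(-scale, scale) for _ in range(n_in)]
               for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def linear(x, weights, biases):
    """Fully connected layer: y = W x + b."""
    return [sum(w * v for w, v in zip(row, x)) + b
            for row, b in zip(weights, biases)]


def classify(spatial_feat, temporal_feat, eeg_feat, weights, biases):
    """Concatenate the three feature vectors (assumed fusion), then FC + softmax."""
    fused = list(spatial_feat) + list(temporal_feat) + list(eeg_feat)
    probs = softmax(linear(fused, weights, biases))
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs


rng = random.Random(0)
# Hypothetical feature sizes; random stand-ins for the three encoder outputs.
spatial = [rng.gauss(0, 1) for _ in range(8)]
temporal = [rng.gauss(0, 1) for _ in range(6)]
eeg = [rng.gauss(0, 1) for _ in range(4)]
W, b = init_linear(len(spatial) + len(temporal) + len(eeg), len(LABELS), rng)
label, probs = classify(spatial, temporal, eeg, W, b)
print(label, [round(p, 3) for p in probs])
```

Because the weights are random, the printed label carries no meaning. The sketch only shows how three modality-specific feature vectors can feed one softmax classifier.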

Keywords

Multimodal emotion recognition; EEG; facial expression analysis; vision transformer; ResNet50; MLP encoder; human-computer interaction; affective computing; deep learning; neurophysiological signals

Cite This Article

APA Style
Gandi, R., A., G., B., R. (2026). NeuroVision: Multimodal Emotion Recognition via Dynamic Frame Enhancement and EEG-Guided Fusion. Computers, Materials & Continua, 88(2), 100. https://doi.org/10.32604/cmc.2026.077569
Vancouver Style
Gandi R, A. G, B. R. NeuroVision: Multimodal Emotion Recognition via Dynamic Frame Enhancement and EEG-Guided Fusion. Comput Mater Contin. 2026;88(2):100. https://doi.org/10.32604/cmc.2026.077569
IEEE Style
R. Gandi, G. A., and R. B., “NeuroVision: Multimodal Emotion Recognition via Dynamic Frame Enhancement and EEG-Guided Fusion,” Comput. Mater. Contin., vol. 88, no. 2, pp. 100, 2026. https://doi.org/10.32604/cmc.2026.077569



Copyright © 2026 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.