
Deep Learning for Emotion Recognition

Submission Deadline: 30 June 2026

Guest Editor(s)

Dr. Thuseethan Selvarajah

Email: thuseethan.selvarajah@cdu.edu.au

Affiliation: Faculty of Science and Technology, Charles Darwin University, Casuarina, NT 0810, Australia


Research Interests: deep learning, emotion recognition



Dr. Md Rafiqul Islam

Email: mdrafiqul.islam@cdu.edu.au

Affiliation: Faculty of Science and Technology, Charles Darwin University, Casuarina, NT 0810, Australia


Research Interests: data visualization, machine learning, pattern mining, deep learning



Summary

The field of emotion recognition is experiencing a paradigm shift with the rise of deep learning. Neural architectures now enable greater accuracy, robustness, and real-world applicability, moving beyond traditional rule-based and shallow learning approaches. Deep learning solutions capture complex patterns across diverse modalities, including facial expressions, speech, physiological signals, and text, enabling multimodal systems that can better interpret human affective states.

Recent advances, such as transformer-based models, contrastive learning, and self-supervised representations, are driving scalable and generalizable emotion recognition. Multimodal deep learning, integrating vision, audio, text, and physiological data, is proving especially effective in capturing nuanced emotional cues for applications in healthcare, education, entertainment, and human–computer interaction. Emerging trends like lightweight architectures for real-time use, federated learning for privacy-preserving analysis, and explainable AI are further improving the practicality and trustworthiness of deep learning-driven emotion recognition. Nonetheless, challenges remain, including data imbalance, cultural variability, annotation subjectivity, and ethical concerns.
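To make the multimodal fusion mentioned above concrete, the sketch below shows late fusion, one of the simplest fusion strategies: each modality's classifier produces a probability distribution over emotion classes, and the distributions are combined by weighted averaging. The modality names, weights, and scores are illustrative assumptions, not taken from any paper in this issue.

```python
# Minimal late-fusion sketch for multimodal emotion recognition.
# All names, weights, and probability values are illustrative assumptions.

EMOTIONS = ["anger", "happiness", "neutral", "sadness"]

def late_fusion(modality_probs, weights):
    """Weighted average of per-modality class-probability vectors.

    modality_probs: dict mapping modality name -> list of class probabilities
    weights:        dict mapping modality name -> non-negative weight
    """
    total = sum(weights[m] for m in modality_probs)
    fused = [0.0] * len(EMOTIONS)
    for m, probs in modality_probs.items():
        w = weights[m] / total  # normalize so the fused vector stays a distribution
        for i, p in enumerate(probs):
            fused[i] += w * p
    return fused

# Example: the face and speech classifiers disagree; speech is weighted higher.
probs = {
    "face":   [0.10, 0.60, 0.20, 0.10],
    "speech": [0.50, 0.10, 0.20, 0.20],
}
weights = {"face": 0.4, "speech": 0.6}
fused = late_fusion(probs, weights)
prediction = EMOTIONS[fused.index(max(fused))]  # -> "anger"
```

More sophisticated schemes, such as attention-based or hierarchical fusion as in the papers below, learn these weights per sample rather than fixing them, but the core idea of combining modality-specific evidence is the same.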

This Special Issue aims to showcase the transformative potential of deep learning for emotion recognition by presenting recent advancements, innovative frameworks, and applied case studies. Contributions addressing methodological challenges, cross-cultural generalization, and integration of multimodal data are highly encouraged. Topics of interest include, but are not limited to, the following:
· Deep learning architectures for emotion recognition;
· Multimodal emotion recognition from speech, facial expressions, text, and physiological signals;
· Transformer-based and self-supervised approaches for affective computing;
· Contrastive learning and representation learning for emotion analysis;
· Lightweight and efficient deep learning models for real-time emotion recognition;
· Federated learning and privacy-preserving emotion recognition systems;
· Explainable and interpretable deep learning models in emotion recognition;
· Cross-cultural and domain adaptation in emotion recognition;
· Ethical considerations and responsible deployment of emotion recognition systems.


Keywords

Multimodal Emotion Recognition, Deep Learning Architectures, Transformer Models, Self-Supervised Learning, Federated Learning, Explainable AI (XAI), Affective Computing

Published Papers


  • Open Access

    ARTICLE

    SYMPHONIA–Enhanced Multimodal Emotion Recognition with Dual-Branch Dynamic Attention and Hierarchical Adaptive Fusion

    Akmalbek Abdusalomov, Mukhriddin Mukhiddinov, Kamola Abdurashidova, Alpamis Kutlimuratov, Avazjon Marakhimov, Kuanishbay Seytnazarov, Young-Im Cho
    CMC-Computers, Materials & Continua, DOI:10.32604/cmc.2026.077057
    (This article belongs to the Special Issue: Deep Learning for Emotion Recognition)
    Abstract: Human emotions are intricate and difficult to decipher through various modalities. Current methodologies frequently employ inflexible fusion strategies that do not consider the dynamic and context-sensitive characteristics of emotional expressions in both visual and textual mediums. This paper presents SYMPHONIA (Synchronizing Facial and Textual Modalities for Emotion Understanding), an innovative architecture engineered to capture and amalgamate emotional signals from facial expressions and language, attuned to contextual and modality interactions. There are two parts to SYMPHONIA: a Facial Emotion Branch that uses Vision Transformers and facial landmarks, and a Textual Emotion Branch that uses RoBERTa embeddings…

  • Open Access

    ARTICLE

    Quantum-Inspired Complex-Valued Fusion Framework: Optimizing Intra-Modal Semantics and Inter-Modal Fusion in Multimodal Sarcasm Detection

    Dong Zhang, Lianhe Shao, Weijie Xu, Xihan Wang, Quanli Gao
    CMC-Computers, Materials & Continua, DOI:10.32604/cmc.2026.078074
    (This article belongs to the Special Issue: Deep Learning for Emotion Recognition)
    Abstract: With the popularization of multimodal content on social media, accurately identifying sarcastic intent is of great significance for understanding public attitudes and grasping public opinion trends. However, sarcastic expressions rely on context, exhibit inconsistencies in multimodal information, and have implicitly contradictory semantics. These characteristics pose challenges to traditional single-text modality methods. Existing multimodal methods, due to their default assumption of symmetric modal interactions and difficulty in capturing the subtlety of sarcasm and modal contradictions, yield limited detection performance. Therefore, this paper proposes a quantum-inspired complex-valued fusion framework to optimize the intra-modal semantics and inter-modal fusion…
