Open Access

ARTICLE

Speech Emotion Recognition Based on the Adaptive Acoustic Enhancement and Refined Attention Mechanism

Jun Li¹, Chunyan Liang¹,*, Zhiguo Liu¹, Fengpei Ge²
¹ School of Computer Science and Technology, Shandong University of Technology, Zibo, 255000, China
² Library, Beijing University of Posts and Telecommunications, Beijing, 100876, China
* Corresponding Author: Chunyan Liang

Computers, Materials & Continua https://doi.org/10.32604/cmc.2025.071011

Received 29 July 2025; Accepted 10 November 2025; Published online 08 December 2025

Abstract

To improve speech emotion recognition, this study proposes a model that integrates adaptive acoustic mixup (AAM) with improved coordinate and shuffle attention (ICASA). AAM refines data augmentation by combining a sample selection strategy with dynamic interpolation coefficients, fusing speech data of different emotions at the acoustic level. ICASA strengthens feature extraction by dynamically fusing improved coordinate attention (ICA) and shuffle attention (SA). ICA reduces computational overhead by using depthwise separable convolution and the h-swish activation function, and it uses attention weights to capture long-range dependencies in multi-scale time-frequency features. SA promotes feature interaction through channel shuffling, which helps the model learn richer and more discriminative emotional features. Compared with the baseline model, the proposed model improves weighted accuracy by 5.42% and 4.54% and unweighted accuracy by 3.37% and 3.85% on the IEMOCAP and RAVDESS datasets, respectively. Independent-samples t-tests confirm that these improvements are statistically significant, supporting the model's reliability and applicability in real-world emotion-aware speech systems.
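
As a rough illustration of two building blocks named in the abstract, the sketch below shows plain mixup interpolation and a ShuffleNet-style channel shuffle in standard-library Python. It is not the authors' implementation: the abstract does not specify the AAM sample selection strategy or how the interpolation coefficient is made dynamic, so the coefficient here is simply drawn from a Beta(alpha, alpha) distribution as in ordinary mixup. The function names (`mixup`, `sample_lambda`, `channel_shuffle`) and the default `alpha=0.4` are hypothetical choices made for this example.

```python
import random


def mixup(x_a, x_b, y_a, y_b, lam):
    """Plain mixup: convex combination of two feature vectors and their one-hot labels."""
    if not 0.0 <= lam <= 1.0:
        raise ValueError("lam must lie in [0, 1]")
    x = [lam * a + (1.0 - lam) * b for a, b in zip(x_a, x_b)]
    y = [lam * a + (1.0 - lam) * b for a, b in zip(y_a, y_b)]
    return x, y


def sample_lambda(alpha=0.4, rng=random):
    """Draw the coefficient from Beta(alpha, alpha).

    Assumption: this is ordinary mixup sampling, not the paper's dynamic rule.
    """
    return rng.betavariate(alpha, alpha)


def channel_shuffle(channels, groups):
    """Interleave channels across groups.

    Equivalent to reshaping to (groups, n // groups), transposing, and flattening.
    """
    n = len(channels)
    if n % groups != 0:
        raise ValueError("number of channels must be divisible by groups")
    per_group = n // groups
    return [channels[g * per_group + i]
            for i in range(per_group)
            for g in range(groups)]


if __name__ == "__main__":
    # Mix two toy feature vectors that carry different emotion labels.
    x, y = mixup([1.0, 3.0], [3.0, 1.0], [1.0, 0.0], [0.0, 1.0], lam=0.25)
    print(x, y)

    # Channels 0-2 form group 0 and channels 3-5 form group 1.
    print(channel_shuffle([0, 1, 2, 3, 4, 5], groups=2))
```

With two groups, `channel_shuffle([0, 1, 2, 3, 4, 5], groups=2)` returns `[0, 3, 1, 4, 2, 5]`, so each adjacent pair of channels now comes from different groups; this is the kind of cross-group interaction that channel shuffling is meant to provide.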

Keywords

Speech emotion recognition; adaptive acoustic mixup enhancement; improved coordinate attention; shuffle attention; attention mechanism; deep learning