Search Results (11)
  • Open Access

    ARTICLE

    Multi-Objective Equilibrium Optimizer for Feature Selection in High-Dimensional English Speech Emotion Recognition

    Liya Yue1, Pei Hu2, Shu-Chuan Chu3, Jeng-Shyang Pan3,4,*

    CMC-Computers, Materials & Continua, Vol.78, No.2, pp. 1957-1975, 2024, DOI:10.32604/cmc.2024.046962

    Abstract Speech emotion recognition (SER) uses acoustic analysis to find features for emotion recognition and examines variations in voice that are caused by emotions. The number of features acquired with acoustic analysis is extremely high, so we introduce a hybrid filter-wrapper feature selection algorithm based on an improved equilibrium optimizer for constructing an emotion recognition system. The proposed algorithm implements multi-objective emotion recognition with the minimum number of selected features and maximum accuracy. First, we use the information gain and Fisher Score to sort the features extracted from signals. Then, we employ a multi-objective ranking method to evaluate these features and…

  • Open Access

    ARTICLE

    Exploring Sequential Feature Selection in Deep Bi-LSTM Models for Speech Emotion Recognition

    Fatma Harby1, Mansor Alohali2, Adel Thaljaoui2,3,*, Amira Samy Talaat4

    CMC-Computers, Materials & Continua, Vol.78, No.2, pp. 2689-2719, 2024, DOI:10.32604/cmc.2024.046623

    Abstract Machine Learning (ML) algorithms play a pivotal role in Speech Emotion Recognition (SER), although they encounter a formidable obstacle in accurately discerning a speaker’s emotional state. The examination of the emotional states of speakers holds significant importance in a range of real-time applications, including but not limited to virtual reality, human-robot interaction, emergency centers, and human behavior assessment. Accurately identifying emotions in the SER process relies on extracting relevant information from audio inputs. Previous studies on SER have predominantly utilized short-time characteristics such as Mel Frequency Cepstral Coefficients (MFCCs) due to their ability to capture the periodic nature of audio…

  • Open Access

    ARTICLE

    Improved Speech Emotion Recognition Focusing on High-Level Data Representations and Swift Feature Extraction Calculation

    Akmalbek Abdusalomov1, Alpamis Kutlimuratov2, Rashid Nasimov3, Taeg Keun Whangbo1,*

    CMC-Computers, Materials & Continua, Vol.77, No.3, pp. 2915-2933, 2023, DOI:10.32604/cmc.2023.044466

    Abstract The performance of a speech emotion recognition (SER) system is heavily influenced by the efficacy of its feature extraction techniques. The study was designed to advance the field of SER by optimizing feature extraction techniques, specifically through the incorporation of high-resolution Mel-spectrograms and the expedited calculation of Mel Frequency Cepstral Coefficients (MFCC). This initiative aimed to refine the system’s accuracy by identifying and mitigating the shortcomings commonly found in current approaches. Ultimately, the primary objective was to elevate both the intricacy and effectiveness of our SER model, with a focus on augmenting its proficiency in the accurate identification of emotions…

  • Open Access

    ARTICLE

    Using Speaker-Specific Emotion Representations in Wav2vec 2.0-Based Modules for Speech Emotion Recognition

    Somin Park1, Mpabulungi Mark1, Bogyung Park2, Hyunki Hong1,*

    CMC-Computers, Materials & Continua, Vol.77, No.1, pp. 1009-1030, 2023, DOI:10.32604/cmc.2023.041332

    Abstract Speech emotion recognition is essential for frictionless human-machine interaction, where machines respond to human instructions with context-aware actions. The properties of individuals’ voices vary with culture, language, gender, and personality. These variations in speaker-specific properties may hamper the performance of standard representations in downstream tasks such as speech emotion recognition (SER). This study demonstrates the significance of speaker-specific speech characteristics and how they can be leveraged to improve the performance of SER models. In the proposed approach, two wav2vec-based modules (a speaker-identification network and an emotion classification network) are trained with the Arcface loss. The speaker-identification network has a…

  • Open Access

    ARTICLE

    A Multi-Modal Deep Learning Approach for Emotion Recognition

    H. M. Shahzad1,3, Sohail Masood Bhatti1,3,*, Arfan Jaffar1,3, Muhammad Rashid2

    Intelligent Automation & Soft Computing, Vol.36, No.2, pp. 1561-1570, 2023, DOI:10.32604/iasc.2023.032525

    Abstract In recent years, research on facial expression recognition (FER) under masks has been trending. Wearing a mask for protection from COVID-19 has become compulsory, and because it hides facial expressions, FER under a mask is a difficult task. The prevailing unimodal techniques for facial recognition do not produce good results for masked faces; however, a multimodal technique can be employed to generate better results. We proposed a multimodal methodology based on deep learning for facial recognition under a masked face using facial and vocal expressions. The multimodal has been…

  • Open Access

    ARTICLE

    A Multi-Level Circulant Cross-Modal Transformer for Multimodal Speech Emotion Recognition

    Peizhu Gong1, Jin Liu1, Zhongdai Wu2, Bing Han2, Y. Ken Wang3, Huihua He4,*

    CMC-Computers, Materials & Continua, Vol.74, No.2, pp. 4203-4220, 2023, DOI:10.32604/cmc.2023.028291

    Abstract Speech emotion recognition, as an important component of human-computer interaction technology, has received increasing attention. Recent studies have treated emotion recognition of speech signals as a multimodal task, due to its inclusion of the semantic features of two different modalities, i.e., audio and text. However, existing methods often fail to effectively represent features and capture correlations. This paper presents a multi-level circulant cross-modal Transformer (MLCCT) for multimodal speech emotion recognition. The proposed model can be divided into three steps: feature extraction, interaction, and fusion. Self-supervised embedding models are introduced for feature extraction, which give a more powerful representation of the…

  • Open Access

    ARTICLE

    Performance Analysis of a Chunk-Based Speech Emotion Recognition Model Using RNN

    Hyun-Sam Shin1, Jun-Ki Hong2,*

    Intelligent Automation & Soft Computing, Vol.36, No.1, pp. 235-248, 2023, DOI:10.32604/iasc.2023.033082

    Abstract Recently, artificial-intelligence-based automatic customer response systems have been widely used instead of customer service representatives. It is therefore important for automatic customer service to promptly recognize emotions in a customer’s voice and provide the appropriate service accordingly. To this end, we analyzed emotion recognition (ER) accuracy as a function of simulation time using the proposed chunk-based speech ER (CSER) model. The proposed CSER model divides voice signals into 3-s-long chunks to efficiently recognize characteristically inherent emotions in the customer’s voice. We evaluated the performance of the ER of voice signal chunks by applying four RNN techniques: long…

  • Open Access

    ARTICLE

    The Efficacy of Deep Learning-Based Mixed Model for Speech Emotion Recognition

    Mohammad Amaz Uddin1, Mohammad Salah Uddin Chowdury1, Mayeen Uddin Khandaker2,*, Nissren Tamam3, Abdelmoneim Sulieman4

    CMC-Computers, Materials & Continua, Vol.74, No.1, pp. 1709-1722, 2023, DOI:10.32604/cmc.2023.031177

    Abstract Human speech indirectly represents the mental state or emotion of others. The use of Artificial Intelligence (AI)-based techniques may bring about a revolution in this modern era by recognizing emotion from speech. In this study, we introduced a robust method for emotion recognition from human speech using a well-performing preprocessing technique together with a deep learning-based mixed model consisting of Long Short-Term Memory (LSTM) and Convolutional Neural Network (CNN). About 2800 audio files were extracted from the Toronto emotional speech set (TESS) database for this study. A high-pass filter and a Savitzky-Golay filter were used to obtain noise-free as well as…

  • Open Access

    ARTICLE

    Multilayer Neural Network Based Speech Emotion Recognition for Smart Assistance

    Sandeep Kumar1, MohdAnul Haq2, Arpit Jain3, C. Andy Jason4, Nageswara Rao Moparthi1, Nitin Mittal5, Zamil S. Alzamil2,*

    CMC-Computers, Materials & Continua, Vol.74, No.1, pp. 1523-1540, 2023, DOI:10.32604/cmc.2023.028631

    Abstract Biometric-based systems play an increasingly vital role in our daily lives. This paper proposes an intelligent assistant intended to identify emotions via voice messages. A biometric system has been developed to detect human emotions based on voice recognition and to control a few electronic peripherals for alert actions. The proposed smart assistant aims to provide support to people through buzzer and light-emitting diode (LED) alert signals, and it also keeps track of places such as households, hospitals, and remote areas. The proposed approach is able to detect seven emotions: worry, surprise, neutral, sadness, happiness, hate…

  • Open Access

    ARTICLE

    Design of Hierarchical Classifier to Improve Speech Emotion Recognition

    P. Vasuki*

    Computer Systems Science and Engineering, Vol.44, No.1, pp. 19-33, 2023, DOI:10.32604/csse.2023.024441

    Abstract Automatic Speech Emotion Recognition (SER) is used to recognize emotion from speech automatically. Speech emotion recognition works well in a laboratory environment, but real-time emotion recognition is influenced by variations in the gender, age, and cultural and acoustical background of the speaker. The acoustical resemblance between emotional expressions further increases the complexity of recognition. Many recent research works concentrate on addressing these effects individually. Instead of addressing every influencing attribute individually, we would like to design a system that reduces the effect arising from any factor. We propose a two-level hierarchical classifier named Interpreter of responses…
