Search Results (8)
  • Open Access


    Using Speaker-Specific Emotion Representations in Wav2vec 2.0-Based Modules for Speech Emotion Recognition

    Somin Park, Mpabulungi Mark, Bogyung Park, Hyunki Hong

    CMC-Computers, Materials & Continua, Vol.77, No.1, pp. 1009-1030, 2023, DOI:10.32604/cmc.2023.041332

    Abstract Speech emotion recognition (SER) is essential for frictionless human-machine interaction, where machines respond to human instructions with context-aware actions. The properties of individuals’ voices vary with culture, language, gender, and personality. These variations in speaker-specific properties may hamper the performance of standard representations in downstream tasks such as SER. This study demonstrates the significance of speaker-specific speech characteristics and how they can be leveraged to improve the performance of SER models. In the proposed approach, two wav2vec-based modules (a speaker-identification network and an emotion classification network) are trained with the ArcFace loss. The speaker-identification network has a…
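The ArcFace loss named in this abstract adds an angular margin to the target-class angle before softmax scaling, which pushes same-class embeddings closer together on the hypersphere. A minimal sketch of a single per-class logit, assuming illustrative margin and scale values (the paper's hyperparameters are not given in this excerpt):

```python
import math

def arcface_logit(cos_theta, margin=0.5, scale=30.0, is_target=True):
    """ArcFace (additive angular margin) logit for one class.

    For the target class, the angle between embedding and class weight is
    penalized by `margin` before scaling, so the model must close a larger
    angular gap to score well on the true class.
    """
    theta = math.acos(max(-1.0, min(1.0, cos_theta)))  # clamp for safety
    if is_target:
        theta += margin
    return scale * math.cos(theta)

# The margin lowers the target-class logit relative to the plain cosine logit
print(arcface_logit(1.0), arcface_logit(1.0, is_target=False))
```

The margin and scale shown (0.5, 30.0) are common defaults from the ArcFace literature, not values taken from this paper.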

  • Open Access


    A Multi-Modal Deep Learning Approach for Emotion Recognition

    H. M. Shahzad, Sohail Masood Bhatti, Arfan Jaffar, Muhammad Rashid

    Intelligent Automation & Soft Computing, Vol.36, No.2, pp. 1561-1570, 2023, DOI:10.32604/iasc.2023.032525

    Abstract In recent years, research on facial expression recognition (FER) under masks has been trending. Wearing a mask for protection from COVID-19 has become compulsory, and because a mask hides facial expressions, FER under a mask is a difficult task. The prevailing unimodal techniques for facial recognition do not produce good results for masked faces; however, a multimodal technique can be employed to generate better results. We proposed a multimodal methodology based on deep learning for facial recognition under a masked face using facial and vocal expressions. The multimodal has been…

  • Open Access


    A Multi-Level Circulant Cross-Modal Transformer for Multimodal Speech Emotion Recognition

    Peizhu Gong, Jin Liu, Zhongdai Wu, Bing Han, Y. Ken Wang, Huihua He

    CMC-Computers, Materials & Continua, Vol.74, No.2, pp. 4203-4220, 2023, DOI:10.32604/cmc.2023.028291

    Abstract Speech emotion recognition, as an important component of human-computer interaction technology, has received increasing attention. Recent studies have treated emotion recognition of speech signals as a multimodal task, owing to its inclusion of the semantic features of two different modalities, i.e., audio and text. However, existing methods often fail to effectively represent features and capture correlations. This paper presents a multi-level circulant cross-modal Transformer (MLCCT) for multimodal speech emotion recognition. The proposed model can be divided into three steps: feature extraction, interaction, and fusion. Self-supervised embedding models are introduced for feature extraction, which give a more powerful representation of the…
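Cross-modal interaction in such a Transformer is typically built on attention in which queries come from one modality (e.g., text) and keys/values from the other (e.g., audio). A dependency-free sketch of that core operation; the circulant, multi-level structure of MLCCT is omitted, and all names are illustrative:

```python
import math

def cross_modal_attention(queries, keys, values):
    """Scaled dot-product attention across two modalities.

    Each query attends over the other modality's keys; the output is a
    softmax-weighted mix of that modality's value vectors.
    """
    d = len(queries[0])
    out = []
    for q in queries:
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d)
                  for k in keys]
        m = max(scores)                       # subtract max for stability
        exps = [math.exp(s - m) for s in scores]
        total = sum(exps)
        weights = [e / total for e in exps]
        out.append([sum(w * v[j] for w, v in zip(weights, values))
                    for j in range(len(values[0]))])
    return out

# One text-side query attending over two audio-side key/value pairs
att = cross_modal_attention([[1.0, 0.0]],
                            [[1.0, 0.0], [0.0, 1.0]],
                            [[1.0, 2.0], [3.0, 4.0]])
print(att)  # output leans toward the first value vector
```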

  • Open Access


    Performance Analysis of a Chunk-Based Speech Emotion Recognition Model Using RNN

    Hyun-Sam Shin, Jun-Ki Hong

    Intelligent Automation & Soft Computing, Vol.36, No.1, pp. 235-248, 2023, DOI:10.32604/iasc.2023.033082

    Abstract Recently, artificial-intelligence-based automatic customer response systems have been widely used in place of customer service representatives, so it is important for automatic customer service to promptly recognize emotions in a customer’s voice and provide the appropriate service accordingly. We therefore analyzed emotion recognition (ER) accuracy as a function of simulation time using the proposed chunk-based speech ER (CSER) model. The proposed CSER model divides voice signals into 3-s chunks to efficiently recognize the emotions inherent in the customer’s voice. We evaluated the ER performance on voice signal chunks by applying four RNN techniques: long…
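The chunking step described above is straightforward to sketch. Assuming a 16 kHz sample rate (not stated in this excerpt) and zero-padding of the final partial chunk so every chunk has equal length:

```python
def split_into_chunks(signal, sample_rate=16000, chunk_seconds=3.0):
    """Split a 1-D list of samples into fixed-length chunks.

    The last partial chunk is zero-padded, which lets a recurrent model
    process chunks as a fixed-shape batch.
    """
    chunk_len = int(sample_rate * chunk_seconds)
    chunks = []
    for start in range(0, len(signal), chunk_len):
        chunk = signal[start:start + chunk_len]
        if len(chunk) < chunk_len:
            chunk = chunk + [0.0] * (chunk_len - len(chunk))  # pad tail
        chunks.append(chunk)
    return chunks

# 7 seconds of audio -> three 3-second chunks (the last one padded)
chunks = split_into_chunks([0.0] * (16000 * 7))
print(len(chunks), len(chunks[0]))  # 3 48000
```

Function and parameter names are illustrative, not taken from the paper's code.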

  • Open Access


    The Efficacy of Deep Learning-Based Mixed Model for Speech Emotion Recognition

    Mohammad Amaz Uddin, Mohammad Salah Uddin Chowdury, Mayeen Uddin Khandaker, Nissren Tamam, Abdelmoneim Sulieman

    CMC-Computers, Materials & Continua, Vol.74, No.1, pp. 1709-1722, 2023, DOI:10.32604/cmc.2023.031177

    Abstract Human speech indirectly represents the mental state or emotion of the speaker. The use of Artificial Intelligence (AI)-based techniques may bring a revolution in this modern era by recognizing emotion from speech. In this study, we introduced a robust method for emotion recognition from human speech using a well-performing preprocessing technique together with a deep learning-based mixed model consisting of Long Short-Term Memory (LSTM) and a Convolutional Neural Network (CNN). About 2800 audio files were extracted from the Toronto emotional speech set (TESS) database for this study. A high-pass and a Savitzky-Golay filter have been used to obtain noise-free as well as…
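The high-pass stage of that preprocessing can be illustrated with a first-order filter. This is a generic stand-in, not the paper's exact filter, and the Savitzky-Golay smoothing step is omitted:

```python
def high_pass(signal, alpha=0.95):
    """First-order high-pass filter.

    Attenuates slow drift and DC offset while passing rapid changes;
    the coefficient 0.95 is illustrative.
    """
    out = [signal[0]]
    for i in range(1, len(signal)):
        out.append(alpha * (out[-1] + signal[i] - signal[i - 1]))
    return out

# A constant (DC) signal decays toward zero after the first sample
dc = high_pass([1.0] * 10)
print(round(dc[-1], 3))  # 0.95**9, about 0.63
```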

  • Open Access


    Multilayer Neural Network Based Speech Emotion Recognition for Smart Assistance

    Sandeep Kumar, Mohd Anul Haq, Arpit Jain, C. Andy Jason, Nageswara Rao Moparthi, Nitin Mittal, Zamil S. Alzamil

    CMC-Computers, Materials & Continua, Vol.74, No.1, pp. 1523-1540, 2023, DOI:10.32604/cmc.2023.028631

    Abstract Day by day, biometric-based systems play a vital role in our daily lives. This paper proposes an intelligent assistant intended to identify emotions via voice message. A biometric system has been developed to detect human emotions based on voice recognition and to control a few electronic peripherals for alert actions. The proposed smart assistant aims to provide support to people through buzzer and light-emitting diode (LED) alert signals, and it also keeps track of places such as households, hospitals, and remote areas. The proposed approach is able to detect seven emotions: worry, surprise, neutral, sadness, happiness, hate…

  • Open Access


    Design of Hierarchical Classifier to Improve Speech Emotion Recognition

    P. Vasuki

    Computer Systems Science and Engineering, Vol.44, No.1, pp. 19-33, 2023, DOI:10.32604/csse.2023.024441

    Abstract Automatic Speech Emotion Recognition (SER) is used to recognize emotion from speech automatically. SER works well in a laboratory environment, but real-time emotion recognition is influenced by variations in the gender, age, and cultural and acoustical background of the speaker. The acoustical resemblance between emotional expressions further increases the complexity of recognition. Much recent research has concentrated on addressing these effects individually. Instead of addressing every influencing attribute individually, we design a system that reduces the effect arising from any single factor. We propose a two-level hierarchical classifier named Interpreter of responses…
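A two-level hierarchical classifier of this kind routes each input through a coarse first-level decision before a specialized second-level one. A toy sketch under assumed groupings; the actual levels, features, and thresholds in the paper may differ:

```python
def hierarchical_classify(features, level1, level2_by_group):
    """Two-level decision: level 1 picks a broad emotion group,
    then the group's own level-2 classifier picks the emotion."""
    group = level1(features)
    return level2_by_group[group](features)

# Illustrative rule-based levels (a real system would use trained models)
level1 = lambda f: "high" if f["energy"] > 0.5 else "low"   # arousal split
level2 = {
    "high": lambda f: "anger" if f["pitch"] > 0.5 else "happiness",
    "low": lambda f: "sadness" if f["pitch"] < 0.5 else "neutral",
}

pred = hierarchical_classify({"energy": 0.9, "pitch": 0.8}, level1, level2)
print(pred)  # anger
```

Splitting the decision this way lets each second-level classifier specialize in a narrower, more acoustically homogeneous set of emotions.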

  • Open Access


    1D-CNN: Speech Emotion Recognition System Using a Stacked Network with Dilated CNN Features

    Mustaqeem, Soonil Kwon

    CMC-Computers, Materials & Continua, Vol.67, No.3, pp. 4039-4059, 2021, DOI:10.32604/cmc.2021.015070

    Abstract Emotion recognition from speech data is an active and emerging area of research that plays an important role in numerous applications, such as robotics, virtual reality, behavior assessments, and emergency call centers. Recently, researchers have developed many techniques in this field to improve accuracy by utilizing several deep learning approaches, but the recognition rate is still not convincing. Our main aim is to develop a new technique that increases the recognition rate at a reasonable computational cost. In this paper, we suggest a new technique, a one-dimensional dilated convolutional neural network (1D-DCNN), for…
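Dilation widens a convolution's receptive field without adding parameters by spacing the kernel taps apart. A minimal, framework-free sketch of the operation underlying a 1D-DCNN layer (function name and valid-mode choice are illustrative):

```python
def dilated_conv1d(x, kernel, dilation=1):
    """Valid-mode 1-D convolution with a dilation factor.

    With dilation d, tap i of the kernel reads input position t + i*d,
    so a length-k kernel spans (k-1)*d + 1 input samples.
    """
    k = len(kernel)
    span = (k - 1) * dilation + 1
    out = []
    for t in range(len(x) - span + 1):
        out.append(sum(kernel[i] * x[t + i * dilation] for i in range(k)))
    return out

x = [1, 2, 3, 4, 5, 6]
print(dilated_conv1d(x, [1, 1], dilation=2))  # [4, 6, 8, 10]
```

In a real model these layers are stacked with increasing dilation, so deep units see long stretches of the waveform or spectrogram cheaply.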
