Search Results (94)
  • Open Access

    ARTICLE

    An Adaptive Hate Speech Detection Approach Using Neutrosophic Neural Networks for Social Media Forensics

    Yasmine M. Ibrahim1,2, Reem Essameldin3, Saad M. Darwish1,*

    CMC-Computers, Materials & Continua, Vol.79, No.1, pp. 243-262, 2024, DOI:10.32604/cmc.2024.047840

    Abstract Detecting hate speech automatically in social media forensics has emerged as a highly challenging task due to the complex nature of language used in such platforms. Currently, several methods exist for classifying hate speech, but they still suffer from ambiguity when differentiating between hateful and offensive content, and they also lack accuracy. The work suggested in this paper uses a combination of the Whale Optimization Algorithm (WOA) and Particle Swarm Optimization (PSO) to adjust the weights of two Multi-Layer Perceptrons (MLPs) for neutrosophic set classification. During the training process of the MLP, the WOA is employed to explore and determine…
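    The paper's hybrid WOA/PSO weight-tuning scheme is not reproduced here; as a rough illustration of the swarm-based half of that idea, the sketch below runs plain PSO over a flat parameter vector, with a toy quadratic loss standing in for an MLP training loss (all hyperparameters are illustrative, not the authors' settings):

    ```python
    import numpy as np

    def pso_minimize(loss, dim, n_particles=20, iters=50, seed=0):
        """Minimal Particle Swarm Optimization over a flat weight vector.

        `loss` stands in for an MLP training loss; the paper's hybrid
        WOA/PSO exploration scheme is not reproduced here.
        """
        rng = np.random.default_rng(seed)
        x = rng.uniform(-1, 1, (n_particles, dim))   # particle positions (weights)
        v = np.zeros_like(x)                          # velocities
        pbest = x.copy()                              # per-particle best positions
        pbest_f = np.array([loss(p) for p in x])
        gbest = pbest[pbest_f.argmin()].copy()        # global best position
        w, c1, c2 = 0.7, 1.5, 1.5                     # inertia / cognitive / social
        for _ in range(iters):
            r1, r2 = rng.random(x.shape), rng.random(x.shape)
            v = w * v + c1 * r1 * (pbest - x) + c2 * r2 * (gbest - x)
            x = x + v
            f = np.array([loss(p) for p in x])
            improved = f < pbest_f
            pbest[improved], pbest_f[improved] = x[improved], f[improved]
            gbest = pbest[pbest_f.argmin()].copy()
        return gbest, float(pbest_f.min())

    # Stand-in quadratic loss; a real setup would evaluate MLP error on data.
    best_weights, best_loss = pso_minimize(lambda p: float(np.sum(p ** 2)), dim=10)
    ```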

  • Open Access

    ARTICLE

    Audio-Text Multimodal Speech Recognition via Dual-Tower Architecture for Mandarin Air Traffic Control Communications

    Shuting Ge1,2, Jin Ren2,3,*, Yihua Shi4, Yujun Zhang1, Shunzhi Yang2, Jinfeng Yang2

    CMC-Computers, Materials & Continua, Vol.78, No.3, pp. 3215-3245, 2024, DOI:10.32604/cmc.2023.046746

    Abstract In air traffic control communications (ATCC), misunderstandings between pilots and controllers could result in fatal aviation accidents. Fortunately, advanced automatic speech recognition technology has emerged as a promising means of preventing miscommunications and enhancing aviation safety. However, most existing speech recognition methods merely incorporate external language models on the decoder side, leading to insufficient semantic alignment between speech and text modalities during the encoding phase. Furthermore, it is challenging to model acoustic context dependencies over long distances because speech sequences are much longer than their text counterparts, especially for the extended ATCC data. To address these issues, we propose a speech-text multimodal…

  • Open Access

    ARTICLE

    Multi-Objective Equilibrium Optimizer for Feature Selection in High-Dimensional English Speech Emotion Recognition

    Liya Yue1, Pei Hu2, Shu-Chuan Chu3, Jeng-Shyang Pan3,4,*

    CMC-Computers, Materials & Continua, Vol.78, No.2, pp. 1957-1975, 2024, DOI:10.32604/cmc.2024.046962

    Abstract Speech emotion recognition (SER) uses acoustic analysis to find features for emotion recognition and examines variations in voice that are caused by emotions. The number of features acquired with acoustic analysis is extremely high, so we introduce a hybrid filter-wrapper feature selection algorithm based on an improved equilibrium optimizer for constructing an emotion recognition system. The proposed algorithm implements multi-objective emotion recognition with the minimum number of selected features and maximum accuracy. First, we use the information gain and Fisher Score to sort the features extracted from signals. Then, we employ a multi-objective ranking method to evaluate these features and…
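    One of the filter criteria the abstract names, the Fisher Score, can be sketched directly: it ranks each feature by between-class variance over within-class variance. This is a generic illustration on toy data, not the paper's combined information-gain/Fisher pipeline:

    ```python
    import numpy as np

    def fisher_score(X, y):
        """Fisher Score per feature: between-class scatter divided by
        within-class scatter. Higher means more class-discriminative."""
        classes = np.unique(y)
        mu = X.mean(axis=0)
        num = np.zeros(X.shape[1])
        den = np.zeros(X.shape[1])
        for c in classes:
            Xc = X[y == c]
            num += len(Xc) * (Xc.mean(axis=0) - mu) ** 2
            den += len(Xc) * Xc.var(axis=0)
        return num / (den + 1e-12)   # small epsilon avoids division by zero

    # Toy data: feature 0 separates the classes, feature 1 is noise.
    X = np.array([[0.0, 5.0], [0.1, -3.0], [1.0, 4.0], [1.1, -2.0]])
    y = np.array([0, 0, 1, 1])
    scores = fisher_score(X, y)
    ranking = np.argsort(scores)[::-1]   # most discriminative feature first
    ```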

  • Open Access

    ARTICLE

    Exploring Sequential Feature Selection in Deep Bi-LSTM Models for Speech Emotion Recognition

    Fatma Harby1, Mansor Alohali2, Adel Thaljaoui2,3,*, Amira Samy Talaat4

    CMC-Computers, Materials & Continua, Vol.78, No.2, pp. 2689-2719, 2024, DOI:10.32604/cmc.2024.046623

    Abstract Machine Learning (ML) algorithms play a pivotal role in Speech Emotion Recognition (SER), although they encounter a formidable obstacle in accurately discerning a speaker’s emotional state. The examination of the emotional states of speakers holds significant importance in a range of real-time applications, including but not limited to virtual reality, human-robot interaction, emergency centers, and human behavior assessment. Accurately identifying emotions in the SER process relies on extracting relevant information from audio inputs. Previous studies on SER have predominantly utilized short-time characteristics such as Mel Frequency Cepstral Coefficients (MFCCs) due to their ability to capture the periodic nature of audio…
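    The sequential feature selection the title refers to is, in its forward variant, a simple greedy loop: repeatedly add the feature that most improves a score. The sketch below uses an arbitrary toy scorer; in the paper's setting the scorer would wrap a Bi-LSTM validation metric, which is not reproduced here:

    ```python
    def sequential_forward_selection(score_fn, n_features, k):
        """Greedy sequential forward selection: at each step, add the
        single feature whose inclusion yields the highest score."""
        selected = []
        remaining = list(range(n_features))
        while len(selected) < k and remaining:
            best_f, best_s = None, float("-inf")
            for f in remaining:
                s = score_fn(selected + [f])
                if s > best_s:
                    best_f, best_s = f, s
            selected.append(best_f)
            remaining.remove(best_f)
        return selected

    # Toy scorer: each feature has a fixed, additive value (illustrative only).
    feature_value = {0: 3.0, 1: 0.5, 2: 5.0, 3: 1.0}
    score = lambda subset: sum(feature_value[f] for f in subset)
    chosen = sequential_forward_selection(score, n_features=4, k=2)
    ```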

  • Open Access

    ARTICLE

    Improved Speech Emotion Recognition Focusing on High-Level Data Representations and Swift Feature Extraction Calculation

    Akmalbek Abdusalomov1, Alpamis Kutlimuratov2, Rashid Nasimov3, Taeg Keun Whangbo1,*

    CMC-Computers, Materials & Continua, Vol.77, No.3, pp. 2915-2933, 2023, DOI:10.32604/cmc.2023.044466

    Abstract The performance of a speech emotion recognition (SER) system is heavily influenced by the efficacy of its feature extraction techniques. The study was designed to advance the field of SER by optimizing feature extraction techniques, specifically through the incorporation of high-resolution Mel-spectrograms and the expedited calculation of Mel Frequency Cepstral Coefficients (MFCC). This initiative aimed to refine the system’s accuracy by identifying and mitigating the shortcomings commonly found in current approaches. Ultimately, the primary objective was to elevate both the intricacy and effectiveness of our SER model, with a focus on augmenting its proficiency in the accurate identification of emotions…
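    The final step of a standard MFCC pipeline, which the abstract's expedited calculation would also have to perform, is an orthonormal DCT-II over the log-Mel filterbank energies of each frame. A minimal NumPy-only sketch on hypothetical data (this is the textbook step, not the paper's optimized implementation):

    ```python
    import numpy as np

    def dct2(x):
        """Orthonormal DCT-II along the last axis (the transform that turns
        log-Mel energies into cepstral coefficients)."""
        n = x.shape[-1]
        k = np.arange(n)
        basis = np.cos(np.pi / n * (k[:, None] + 0.5) * k[None, :])  # (n_in, n_out)
        out = 2.0 * x @ basis
        # orthonormal scaling (matches the common norm='ortho' convention)
        out[..., 0] *= np.sqrt(1.0 / (4 * n))
        out[..., 1:] *= np.sqrt(1.0 / (2 * n))
        return out

    def mfcc_from_log_mel(log_mel, n_mfcc=13):
        """Keep the first n_mfcc cepstral coefficients per frame."""
        return dct2(log_mel)[..., :n_mfcc]

    # Hypothetical log-Mel energies: 4 frames x 26 Mel bands.
    rng = np.random.default_rng(0)
    log_mel = rng.standard_normal((4, 26))
    coeffs = mfcc_from_log_mel(log_mel)
    ```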

  • Open Access

    ARTICLE

    Joint On-Demand Pruning and Online Distillation in Automatic Speech Recognition Language Model Optimization

    Soonshin Seo1,2, Ji-Hwan Kim2,*

    CMC-Computers, Materials & Continua, Vol.77, No.3, pp. 2833-2856, 2023, DOI:10.32604/cmc.2023.042816

    Abstract Automatic speech recognition (ASR) systems have emerged as indispensable tools across a wide spectrum of applications, ranging from transcription services to voice-activated assistants. To enhance the performance of these systems, it is important to deploy efficient models capable of adapting to diverse deployment conditions. In recent years, on-demand pruning methods have gained significant attention within the ASR domain due to their adaptability to various deployment scenarios. However, these methods often confront substantial trade-offs, particularly unstable accuracy when reducing the model size. To address these challenges, this study introduces two crucial empirical findings. Firstly, it proposes the incorporation of…
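    The core mechanic behind most on-demand pruning schemes is magnitude pruning: zero out the smallest-magnitude weights until a target sparsity is reached. A minimal sketch follows; the sparsity level and the weight matrix are illustrative, and the paper's distillation stage (which recovers accuracy after pruning) is not shown:

    ```python
    import numpy as np

    def magnitude_prune(weights, sparsity):
        """Zero out (at least) the `sparsity` fraction of weights with the
        smallest absolute values."""
        flat = np.abs(weights).ravel()
        k = int(sparsity * flat.size)
        if k == 0:
            return weights.copy()
        threshold = np.partition(flat, k - 1)[k - 1]  # k-th smallest magnitude
        mask = np.abs(weights) > threshold
        return weights * mask

    W = np.array([[0.9, -0.05, 0.4],
                  [0.01, -0.7, 0.2]])
    pruned = magnitude_prune(W, sparsity=0.5)   # drops the 3 smallest weights
    ```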

  • Open Access

    ARTICLE

    Recognition of Human Actions through Speech or Voice Using Machine Learning Techniques

    Oscar Peña-Cáceres1,2,*, Henry Silva-Marchan3, Manuela Albert4, Miriam Gil1

    CMC-Computers, Materials & Continua, Vol.77, No.2, pp. 1873-1891, 2023, DOI:10.32604/cmc.2023.043176

    Abstract The development of artificial intelligence (AI) and smart home technologies has driven the need for speech recognition-based solutions. This demand stems from the quest for more intuitive and natural interaction between users and smart devices in their homes. Speech recognition allows users to control devices and perform everyday actions through spoken commands, eliminating the need for physical interfaces or touch screens and enabling specific tasks such as turning lights on or off, adjusting the heating, or lowering the blinds. The purpose of this study is to develop a speech-based classification model for recognizing human actions in the smart home. It seeks…

  • Open Access

    ARTICLE

    A Robust Conformer-Based Speech Recognition Model for Mandarin Air Traffic Control

    Peiyuan Jiang1, Weijun Pan1,*, Jian Zhang1, Teng Wang1, Junxiang Huang2

    CMC-Computers, Materials & Continua, Vol.77, No.1, pp. 911-940, 2023, DOI:10.32604/cmc.2023.041772

    Abstract This study aims to address the deviation in downstream tasks caused by inaccurate recognition results when applying Automatic Speech Recognition (ASR) technology in the Air Traffic Control (ATC) field. This paper presents a novel cascaded model architecture, namely Conformer-CTC/Attention-T5 (CCAT), to build a highly accurate and robust ATC speech recognition model. To tackle the challenges posed by noise and fast speech rate in ATC, the Conformer model is employed to extract robust and discriminative speech representations from raw waveforms. On the decoding side, the Attention mechanism is integrated to facilitate precise alignment between input features and output characters. The Text-To-Text…

  • Open Access

    ARTICLE

    Using Speaker-Specific Emotion Representations in Wav2vec 2.0-Based Modules for Speech Emotion Recognition

    Somin Park1, Mpabulungi Mark1, Bogyung Park2, Hyunki Hong1,*

    CMC-Computers, Materials & Continua, Vol.77, No.1, pp. 1009-1030, 2023, DOI:10.32604/cmc.2023.041332

    Abstract Speech emotion recognition is essential for frictionless human-machine interaction, where machines respond to human instructions with context-aware actions. The properties of individuals’ voices vary with culture, language, gender, and personality. These variations in speaker-specific properties may hamper the performance of standard representations in downstream tasks such as speech emotion recognition (SER). This study demonstrates the significance of speaker-specific speech characteristics and how considering them can be leveraged to improve the performance of SER models. In the proposed approach, two wav2vec-based modules (a speaker-identification network and an emotion classification network) are trained with the Arcface loss. The speaker-identification network has a…
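    The ArcFace loss mentioned in the abstract modifies ordinary softmax logits by adding an angular margin to the true class before scaling, which pushes embeddings of the same class closer on the hypersphere. A minimal NumPy sketch of the logit computation (the scale and margin values are the commonly cited defaults, not necessarily the paper's):

    ```python
    import numpy as np

    def arcface_logits(embeddings, class_centres, labels, scale=30.0, margin=0.5):
        """Additive angular margin (ArcFace): add `margin` radians to the
        angle between each embedding and its true class centre.
        Shapes: embeddings (N, D), class_centres (C, D), labels (N,)."""
        e = embeddings / np.linalg.norm(embeddings, axis=1, keepdims=True)
        w = class_centres / np.linalg.norm(class_centres, axis=1, keepdims=True)
        cos = np.clip(e @ w.T, -1.0, 1.0)          # cosine similarities
        theta = np.arccos(cos)
        rows = np.arange(len(labels))
        cos_m = cos.copy()
        cos_m[rows, labels] = np.cos(theta[rows, labels] + margin)
        return scale * cos_m                        # feed into softmax + CE loss

    # Hypothetical 2-D embedding and two class centres.
    emb = np.array([[1.0, 0.1]])
    centres = np.array([[1.0, 0.0], [0.0, 1.0]])
    logits = arcface_logits(emb, centres, labels=np.array([0]))
    ```

    The margin only penalizes the true-class logit; non-target logits are the plain scaled cosines, so the classifier must separate classes by more than the margin to score well.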

  • Open Access

    ARTICLE

    Speech Recognition via CTC-CNN Model

    Wen-Tsai Sung1, Hao-Wei Kang1, Sung-Jung Hsiao2,*

    CMC-Computers, Materials & Continua, Vol.76, No.3, pp. 3833-3858, 2023, DOI:10.32604/cmc.2023.040024

    Abstract In the speech recognition system, the acoustic model is an important underlying model, and its accuracy directly affects the performance of the entire system. This paper introduces the construction and training process of the acoustic model in detail, studies the Connectionist Temporal Classification (CTC) algorithm, which plays an important role in the end-to-end framework, and establishes a convolutional neural network (CNN) acoustic model combined with CTC to improve the accuracy of speech recognition. This study uses a sound sensor, ReSpeaker Mic Array v2.0.1, to convert the collected speech signals into text or corresponding speech signals to…
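    The decoding step that typically follows a CTC-trained acoustic model is easy to sketch: take the per-frame argmax labels, collapse consecutive repeats, then drop the blank symbol. This is the generic greedy CTC decode, not necessarily the paper's decoding pipeline:

    ```python
    def ctc_greedy_decode(frame_labels, blank=0):
        """Greedy CTC decoding: collapse consecutive repeats, then remove
        blanks. `frame_labels` is the per-frame argmax of the acoustic
        model's output distribution (label indices are illustrative)."""
        out = []
        prev = None
        for label in frame_labels:
            if label != prev and label != blank:
                out.append(label)
            prev = label
        return out

    # Frames "- c c a - a t t" with 0 as the blank symbol: the blank between
    # the two a's preserves the genuine repeat, giving "c a a t".
    decoded = ctc_greedy_decode([0, 3, 3, 1, 0, 1, 20, 20])
    ```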

Displaying 1-10 of 94 results (page 1).