Search Results (9)
  • Open Access

    ARTICLE

    TC-Net: A Modest & Lightweight Emotion Recognition System Using Temporal Convolution Network

    Muhammad Ishaq1, Mustaqeem Khan1,2, Soonil Kwon1,*

    Computer Systems Science and Engineering, Vol.46, No.3, pp. 3355-3369, 2023, DOI:10.32604/csse.2023.037373

    Abstract Speech signals play an essential role in communication and provide an efficient way to exchange information between humans and machines. Speech Emotion Recognition (SER) is one of the critical sources for human evaluation, which is applicable in many real-world applications such as healthcare, call centers, robotics, safety, and virtual reality. This work developed a novel TCN-based emotion recognition system using speech signals through a spatial-temporal convolution network to recognize the speaker’s emotional state. The authors designed a Temporal Convolutional Network (TCN) core block to recognize long-term dependencies in speech signals and then feed these temporal cues to a dense network…

  • Open Access

    ARTICLE

    Implementation of Hybrid Deep Reinforcement Learning Technique for Speech Signal Classification

    R. Gayathri1,*, K. Sheela Sobana Rani2

    Computer Systems Science and Engineering, Vol.46, No.1, pp. 43-56, 2023, DOI:10.32604/csse.2023.032491

    Abstract Classification of speech signals is a vital part of speech signal processing systems. With the advent of speech coding and synthesis, the classification of speech signals has become more accurate and faster. Conventional methods are considered inaccurate due to the uncertainty and diversity of speech signals in the case of real speech signal classification. In this paper, we perform efficient speech signal classification using a series of neural network classifiers with reinforcement learning operations. Prior to classification of speech signals, the study extracts the essential features from the speech signal using Cepstral Analysis. The features are extracted by converting the speech…

  • Open Access

    ARTICLE

    Nonlinear Dynamic System Identification of ARX Model for Speech Signal Identification

    Rakesh Kumar Pattanaik1, Mihir N. Mohanty1,*, Srikanta Ku. Mohapatra2, Binod Ku. Pattanayak3

    Computer Systems Science and Engineering, Vol.46, No.1, pp. 195-208, 2023, DOI:10.32604/csse.2023.029591

    Abstract System identification is crucial in the field of nonlinear and dynamic systems or practical systems. As most practical systems lack prior information about the system behaviour, mathematical modelling is required. The authors have proposed a stacked Bidirectional Long Short-Term Memory (Bi-LSTM) model to handle the problem of nonlinear dynamic system identification in this paper. The proposed model is capable of faster learning and accurate modelling as it can be trained in both forward and backward directions. The main advantage of Bi-LSTM over other algorithms is that it processes inputs in two ways: one from the past…

  • Open Access

    REVIEW

    Challenges and Limitations in Speech Recognition Technology: A Critical Review of Speech Signal Processing Algorithms, Tools and Systems

    Sneha Basak1, Himanshi Agrawal1, Shreya Jena1, Shilpa Gite2,*, Mrinal Bachute2, Biswajeet Pradhan3,4,5,*, Mazen Assiri4

    CMES-Computer Modeling in Engineering & Sciences, Vol.135, No.2, pp. 1053-1089, 2023, DOI:10.32604/cmes.2022.021755

    Abstract Speech recognition systems have become a unique human-computer interaction (HCI) family. Speech is one of the most naturally developed human abilities; speech signal processing opens up a transparent and hands-free computation experience. This paper aims to present a retrospective yet modern approach to the world of speech recognition systems. The development journey of ASR (Automatic Speech Recognition) has seen quite a few milestones and breakthrough technologies that have been highlighted in this paper. A step-by-step rundown of the fundamental stages in developing speech recognition systems has been presented, along with a brief discussion of various modern-day developments and applications in…

  • Open Access

    ARTICLE

    Speech Encryption with Fractional Watermark

    Yan Sun1,2,*, Cun Zhu1, Qi Cui3

    CMC-Computers, Materials & Continua, Vol.73, No.1, pp. 1817-1825, 2022, DOI:10.32604/cmc.2022.029408

    Abstract Research on the features of speech and image signals is carried out from two perspectives: the time domain and the frequency domain. Speech and image signals are non-stationary, so FT is not used for the non-stationary characteristics of the signal. When short-term stable speech is obtained by windowing and framing, the subsequent processing of the signal is completed by the Discrete Fourier Transform (DFT). The Fast Discrete Fourier Transform is a commonly used analysis method for speech and image signal processing in the frequency domain. It has the problem of adjusting the window size for a desired resolution.…

  • Open Access

    ARTICLE

    Parkinson's Detection Using RNN-Graph-LSTM with Optimization Based on Speech Signals

    Ahmed S. Almasoud1, Taiseer Abdalla Elfadil Eisa2, Fahd N. Al-Wesabi3,4, Abubakar Elsafi5, Mesfer Al Duhayyim6, Ishfaq Yaseen7, Manar Ahmed Hamza7,*, Abdelwahed Motwakel7

    CMC-Computers, Materials & Continua, Vol.72, No.1, pp. 871-886, 2022, DOI:10.32604/cmc.2022.024596

    Abstract Early detection of Parkinson's Disease (PD) using the PD patients’ voice changes would avoid the intervention before the identification of physical symptoms. Various machine learning algorithms were developed to detect PD. Nevertheless, these ML methods lack generalization and show reduced classification performance due to subject overlap. To overcome these issues, the proposed work applies a graph long short-term memory (GLSTM) model to classify the dynamic features of the PD patient speech signal. The proposed classification model has been further improved by implementing the recurrent neural network (RNN) in the batch normalization layer of GLSTM and optimized with adaptive moment…

  • Open Access

    ARTICLE

    Enhancing Parkinson’s Disease Diagnosis Accuracy Through Speech Signal Algorithm Modeling

    Omar M. El-Habbak1, Abdelrahman M. Abdelalim1, Nour H. Mohamed1, Habiba M. Abd-Elaty1, Mostafa A. Hammouda1, Yasmeen Y. Mohamed1, Mohanad A. Taifor1, Ali W. Mohamed2,3,*

    CMC-Computers, Materials & Continua, Vol.70, No.2, pp. 2953-2969, 2022, DOI:10.32604/cmc.2022.020109

    Abstract Parkinson’s disease (PD), one of whose symptoms is dysphonia, is a prevalent neurodegenerative disease. The use of outdated diagnosis techniques, which yield inaccurate and unreliable results, continues to represent an obstacle in early-stage detection and diagnosis for clinical professionals in the medical field. To solve this issue, the study proposes using machine learning and deep learning models to analyze processed speech signals of patients’ voice recordings. Datasets of these processed speech signals were obtained and experimented on by random forest and logistic regression classifiers. Results were highly successful, with 90% accuracy produced by the random forest classifier and 81.5% by…

  • Open Access

    ARTICLE

    Superposition of Functional Contours Based Prosodic Feature Extraction for Speech Processing

    Shahid Ali Mahar1, Mumtaz Hussain Mahar1, Javed Ahmed Mahar1, Mehedi Masud2, Muneer Ahmad3, NZ Jhanjhi4,*, Mirza Abdur Razzaq1

    Intelligent Automation & Soft Computing, Vol.29, No.1, pp. 183-197, 2021, DOI:10.32604/iasc.2021.015755

    Abstract Speech signal analysis for the extraction of speech elements is viable in natural language applications. Rhythm, intonation, stress, and tone are the elements of prosody. These features are essential in emotional speech, speech to speech, speech recognition, and other applications. The current study attempts to extract the pitch and duration from historical Sindhi sound clips using the functional contours model’s superposition. The sampled sound clips contained the speech of 273 undergraduates living in 5 districts of the Sindhi province. Several Python libraries are available for the application of this model. We used these libraries for the extraction of prosodic data…

  • Open Access

    ARTICLE

    A Novel System for Recognizing Recording Devices from Recorded Speech Signals

    Yongqiang Bao1,*, Qi Shao1, Xuxu Zhang1, Jiahui Jiang1, Yue Xie1, Tingting Liu1, Weiye Xu2

    CMC-Computers, Materials & Continua, Vol.65, No.3, pp. 2557-2570, 2020, DOI:10.32604/cmc.2020.011241

    Abstract The field of digital audio forensics aims to detect threats and fraud in audio signals. Contemporary audio forensic techniques use digital signal processing to detect the authenticity of recorded speech, recognize speakers, and recognize recording devices. User-generated audio recordings from mobile phones are very helpful in a number of forensic applications. This article proposed a novel method for recognizing recording devices based on recorded audio signals. First, a database of the features of various recording devices was constructed using 32 recording devices (20 mobile phones of different brands and 12 kinds of recording pens) in various environments. Second, the audio…
