Search Results (29)
  • Open Access

    ARTICLE

    Automated Speech Recognition System to Detect Babies’ Feelings through Feature Analysis

    Sana Yasin1, Umar Draz2,3,*, Tariq Ali4, Kashaf Shahid1, Amna Abid1, Rukhsana Bibi1, Muhammad Irfan5, Mohammed A. Huneif6, Sultan A. Almedhesh6, Seham M. Alqahtani6, Alqahtani Abdulwahab6, Mohammed Jamaan Alzahrani6, Dhafer Batti Alshehri6, Alshehri Ali Abdullah7, Saifur Rahman5

    CMC-Computers, Materials & Continua, Vol.73, No.2, pp. 4349-4367, 2022, DOI:10.32604/cmc.2022.028251

    Abstract Diagnosing a baby’s feelings poses a challenge for both doctors and parents because babies cannot explain their feelings through expression or speech. Understanding the emotions of babies and their associated expressions during different sensations, such as hunger and pain, is a complicated task. In infancy, all communication and feelings are conveyed through cry-speech, which is a natural phenomenon. Several clinical methods can be used to diagnose a baby’s diseases, but nonclinical methods of diagnosing a baby’s feelings are lacking. As such, in this study, we aimed to identify babies’ feelings and emotions through their cry using a nonclinical method. Changes…
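    The truncated abstract does not give the exact feature set or classifier; as a rough illustration of a nonclinical cry-analysis pipeline of this kind, the sketch below summarises each recording with MFCC statistics and trains a small classifier. File names, labels, and parameter values are placeholders, not details from the paper.

```python
# Hypothetical sketch: MFCC summary features from cry recordings + an SVM classifier.
import numpy as np
import librosa
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

def cry_features(path, sr=16000, n_mfcc=13):
    """Summarise one cry recording as a fixed-length MFCC statistics vector."""
    y, sr = librosa.load(path, sr=sr)
    mfcc = librosa.feature.mfcc(y=y, sr=sr, n_mfcc=n_mfcc)       # (n_mfcc, frames)
    return np.concatenate([mfcc.mean(axis=1), mfcc.std(axis=1)])

# Placeholder (path, label) pairs; labels such as "hunger" or "pain".
samples = [("cry_001.wav", "hunger"), ("cry_002.wav", "pain")]   # ...
X = np.stack([cry_features(p) for p, _ in samples])
labels = [lab for _, lab in samples]

X_tr, X_te, y_tr, y_te = train_test_split(X, labels, test_size=0.2, random_state=0)
clf = SVC(kernel="rbf").fit(X_tr, y_tr)
print("held-out accuracy:", clf.score(X_te, y_te))
```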

  • Open Access

    ARTICLE

    Cross-Language Transfer Learning-based Lhasa-Tibetan Speech Recognition

    Zhijie Wang1, Yue Zhao1,*, Licheng Wu1, Xiaojun Bi1, Zhuoma Dawa2, Qiang Ji3

    CMC-Computers, Materials & Continua, Vol.73, No.1, pp. 629-639, 2022, DOI:10.32604/cmc.2022.027092

    Abstract Tibetan is one of China’s minority languages, and until recently its speech recognition technology had not been researched as extensively as Chinese or English. This, along with the relatively small Tibetan corpus, has resulted in unsatisfactory performance of end-to-end Tibetan speech recognition models. This paper aims to achieve accurate Tibetan speech recognition using a small amount of Tibetan training data. We demonstrate effective methods of Tibetan end-to-end speech recognition via cross-language transfer learning from three aspects: modeling unit selection, transfer learning method, and source language selection. Experimental results show that the Chinese-Tibetan multi-language learning method using…
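    A minimal sketch of the cross-language transfer-learning idea described here, assuming a PyTorch encoder pre-trained on the source language is reused and fine-tuned on the small Tibetan corpus. The checkpoint path, number of modelling units, and layer-freezing choice are illustrative assumptions, not the paper's configuration.

```python
# Illustrative transfer-learning sketch (not the paper's code): reuse an acoustic
# encoder pre-trained on a large source language (e.g. Chinese), swap the output
# layer for the Tibetan modelling units, and fine-tune on the small target corpus.
import torch
import torch.nn as nn

class ASRModel(nn.Module):
    def __init__(self, n_units, feat_dim=80, hidden=512):
        super().__init__()
        self.encoder = nn.LSTM(feat_dim, hidden, num_layers=4, batch_first=True)
        self.output = nn.Linear(hidden, n_units)     # targets = chosen modelling units

    def forward(self, feats):
        enc, _ = self.encoder(feats)
        return self.output(enc)

# 1) Start from encoder weights trained on the source language (placeholder checkpoint).
model = ASRModel(n_units=215)                         # unit count is illustrative
source = torch.load("chinese_pretrained.pt")          # placeholder path
model.encoder.load_state_dict(source["encoder"])      # transfer the encoder only

# 2) Optionally freeze the lowest LSTM layers and fine-tune the rest on Tibetan data.
for name, p in model.encoder.named_parameters():
    if name.endswith("l0") or name.endswith("l1"):    # keep first two layers fixed
        p.requires_grad = False
optim = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()), lr=1e-4)
```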

  • Open Access

    ARTICLE

    Speak-Correct: A Computerized Interface for the Analysis of Mispronounced Errors

    Kamal Jambi1,*, Hassanin Al-Barhamtoshy1, Wajdi Al-Jedaibi1, Mohsen Rashwan2, Sherif Abdou3

    Computer Systems Science and Engineering, Vol.43, No.3, pp. 1155-1173, 2022, DOI:10.32604/csse.2022.024967

    Abstract Any natural language may have dozens of accents. Even when a word has the same phonemic form, speakers pronouncing it in different accents produce audio signals that are distinct from one another. Among the most common issues in speech processing are discrepancies in pronunciation, accent, and enunciation. This study examines the detection, correction, and summarisation of accent defects in the English speech of average Arabic speakers. The article then discusses the key approaches and structure used to address both accent flaws and pronunciation issues. The proposed SpeakCorrect computerized interface employs a…
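    The abstract is cut off before it describes SpeakCorrect's internals; purely as a generic illustration, the sketch below flags pronunciation defects by aligning the phonemes recognised from a speaker against a reference pronunciation and reporting the differences. The phoneme sequences are invented examples, not the paper's data.

```python
# Hypothetical mispronunciation check: align recognised phonemes against a reference
# pronunciation and report substitutions/insertions/deletions.
from difflib import SequenceMatcher

def pronunciation_errors(reference, recognised):
    """Return (reference_phonemes, spoken_phonemes) pairs that differ."""
    errors = []
    sm = SequenceMatcher(a=reference, b=recognised)
    for op, i1, i2, j1, j2 in sm.get_opcodes():
        if op != "equal":
            errors.append((reference[i1:i2], recognised[j1:j2]))
    return errors

# Example: "three" /TH R IY/ spoken as /S R IY/ (a common TH -> S substitution).
print(pronunciation_errors(["TH", "R", "IY"], ["S", "R", "IY"]))
# -> [(['TH'], ['S'])]
```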

  • Open Access

    ARTICLE

    An Innovative Approach Utilizing Binary-View Transformer for Speech Recognition Task

    Muhammad Babar Kamal1, Arfat Ahmad Khan2, Faizan Ahmed Khan3, Malik Muhammad Ali Shahid4, Chitapong Wechtaisong2,*, Muhammad Daud Kamal5, Muhammad Junaid Ali6, Peerapong Uthansakul2

    CMC-Computers, Materials & Continua, Vol.72, No.3, pp. 5547-5562, 2022, DOI:10.32604/cmc.2022.024590

    Abstract Advancements in deep learning have greatly improved the performance of speech recognition systems, and most recent systems are based on the Recurrent Neural Network (RNN). The RNN works well on short sequences but suffers from the vanishing gradient problem on long sequences. Transformer networks have overcome this issue and have shown state-of-the-art results on sequential and speech-related data. Generally, in speech recognition, the input audio is converted into an image using a Mel-spectrogram to represent frequencies and intensities, and this image is classified by a machine learning model to generate a transcript. However, the audio…
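    A minimal sketch of the front-end described here: converting audio into a log-Mel-spectrogram "image" of frequencies and intensities that a Transformer (or any image classifier) can consume. The parameter values and file name are typical placeholders, not the paper's exact configuration.

```python
# Convert an utterance into a log-Mel-spectrogram (frequency x time "image").
import numpy as np
import librosa

def log_mel_image(path, sr=16000, n_mels=80, n_fft=400, hop_length=160):
    y, _ = librosa.load(path, sr=sr)
    mel = librosa.feature.melspectrogram(
        y=y, sr=sr, n_fft=n_fft, hop_length=hop_length, n_mels=n_mels
    )                                             # (n_mels, frames)
    return librosa.power_to_db(mel, ref=np.max)   # intensities in dB, image-like

img = log_mel_image("utterance.wav")              # placeholder file name
print(img.shape)                                  # e.g. (80, ~100 frames per second)
```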

  • Open Access

    ARTICLE

    Enhanced Marathi Speech Recognition Facilitated by Grasshopper Optimisation-Based Recurrent Neural Network

    Ravindra Parshuram Bachate1, Ashok Sharma2, Amar Singh3, Ayman A. Aly4, Abdulaziz H. Alghtani4, Dac-Nhuong Le5,6,*

    Computer Systems Science and Engineering, Vol.43, No.2, pp. 439-454, 2022, DOI:10.32604/csse.2022.024214

    Abstract Communication is a fundamental part of being human. Languages come in many varieties, and even speakers of the same language may struggle to communicate effectively across different accents. Numerous application fields such as education, mobility, smart systems, security, and health care make extensive use of speech and voice recognition models. However, most studies focus on Arabic, Asian, and English languages while ignoring other significant languages such as Marathi, which motivates broader research in regional languages. It is necessary to understand…

  • Open Access

    ARTICLE

    Hybrid In-Vehicle Background Noise Reduction for Robust Speech Recognition: The Possibilities of Next Generation 5G Data Networks

    Radek Martinek1, Jan Baros1, Rene Jaros1, Lukas Danys1,*, Jan Nedoma2

    CMC-Computers, Materials & Continua, Vol.71, No.3, pp. 4659-4676, 2022, DOI:10.32604/cmc.2022.019904

    Abstract This pilot study focuses on the use of a hybrid LMS-ICA system for in-vehicle background noise reduction. Modern vehicles increasingly support voice commands, which are one of the pillars of autonomous and SMART vehicles. Robust speaker recognition for context-aware in-vehicle applications is limited to a certain extent by in-vehicle background noise. This article presents a new concept of a hybrid system implemented as a virtual instrument. The highly modular concept of the virtual car, used in combination with real recordings of various driving scenarios, enables effective testing of the investigated in-vehicle background noise reduction methods. The study…
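    A minimal NumPy sketch of the LMS half of such a hybrid LMS-ICA canceller, assuming a separate reference microphone captures the cabin noise: the reference is adaptively filtered and subtracted from the speech-plus-noise channel. The signals, filter order, and step size below are synthetic placeholders.

```python
# LMS adaptive noise cancellation on synthetic signals.
import numpy as np

def lms_cancel(primary, reference, order=32, mu=0.01):
    """Return e = primary - filtered(reference), i.e. the cleaned speech estimate."""
    w = np.zeros(order)
    out = np.zeros_like(primary)
    for n in range(order - 1, len(primary)):
        x = reference[n - order + 1:n + 1][::-1]   # current + past reference samples
        y = w @ x                                  # filter output (noise estimate)
        e = primary[n] - y                         # cleaned sample
        w += 2 * mu * e * x                        # LMS weight update
        out[n] = e
    return out

rng = np.random.default_rng(0)
noise = rng.standard_normal(16000)                              # reference channel
speech = np.sin(2 * np.pi * 440 * np.arange(16000) / 16000)     # stand-in "speech"
cleaned = lms_cancel(speech + 0.5 * noise, noise)
```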

  • Open Access

    ARTICLE

    End-to-End Speech Recognition of Tamil Language

    Mohamed Hashim Changrampadi1,*, A. Shahina2, M. Badri Narayanan2, A. Nayeemulla Khan3

    Intelligent Automation & Soft Computing, Vol.32, No.2, pp. 1309-1323, 2022, DOI:10.32604/iasc.2022.022021

    Abstract Research in speech recognition is progressing, with numerous state-of-the-art results in recent times. However, relatively little research is being carried out in Automatic Speech Recognition (ASR) for low-resource languages. We present a method to develop a speech recognition model with minimal resources using the Mozilla DeepSpeech architecture. We have utilized freely available online computational resources for training, enabling similar research on low-resourced languages in financially constrained environments. We also present novel ways to build an efficient language model from publicly available web resources to improve ASR accuracy. The proposed ASR model…
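    The abstract mentions building a language model from publicly available web text. As a generic illustration of that idea only (DeepSpeech itself typically pairs its acoustic model with a KenLM-based external scorer), the sketch below trains a toy add-alpha-smoothed bigram model from scraped sentences and uses it to score candidate transcripts. The corpus lines are placeholders.

```python
# Toy bigram language model built from (placeholder) web-crawled sentences.
import math
from collections import Counter

def train_bigram_lm(sentences):
    unigrams, bigrams = Counter(), Counter()
    for s in sentences:
        tokens = ["<s>"] + s.split() + ["</s>"]
        unigrams.update(tokens)
        bigrams.update(zip(tokens, tokens[1:]))
    return unigrams, bigrams

def score(sentence, unigrams, bigrams, alpha=1.0):
    """Add-alpha smoothed log-probability of a candidate transcript."""
    tokens = ["<s>"] + sentence.split() + ["</s>"]
    V = len(unigrams)
    return sum(
        math.log((bigrams[(a, b)] + alpha) / (unigrams[a] + alpha * V))
        for a, b in zip(tokens, tokens[1:])
    )

corpus = ["vanakkam ulagam", "vanakkam nanba"]        # placeholder text lines
uni, bi = train_bigram_lm(corpus)
print(score("vanakkam ulagam", uni, bi) > score("ulagam vanakkam", uni, bi))  # True
```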

  • Open Access

    ARTICLE

    Deep Learning-Based Approach for Arabic Visual Speech Recognition

    Nadia H. Alsulami1,*, Amani T. Jamal1, Lamiaa A. Elrefaei2

    CMC-Computers, Materials & Continua, Vol.71, No.1, pp. 85-108, 2022, DOI:10.32604/cmc.2022.019450

    Abstract Lip-reading technologies are rapidly progressing following the breakthrough of deep learning. They play a vital role in many applications, such as human-machine communication and security. In this paper, we propose an effective lip-reading model for Arabic visual speech recognition based on deep learning algorithms. The collected Arabic visual datasets contain 2400 records of Arabic digits and 960 records of Arabic phrases from 24 native speakers. The primary purpose is to provide a high-performance model by enhancing the preprocessing phase. Firstly, we extract keyframes from our dataset. Secondly, we produce…
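    The first preprocessing step named in the abstract is keyframe extraction. One simple, generic way to do this (not necessarily the paper's method) is to keep only frames that differ sufficiently from the previously kept frame, as sketched below with OpenCV; the file name and threshold are placeholders.

```python
# Keep frames whose content changes noticeably relative to the last kept frame.
import cv2

def extract_keyframes(video_path, diff_threshold=20.0):
    cap = cv2.VideoCapture(video_path)
    keyframes, prev = [], None
    while True:
        ok, frame = cap.read()
        if not ok:
            break
        gray = cv2.cvtColor(frame, cv2.COLOR_BGR2GRAY)
        if prev is None or cv2.absdiff(gray, prev).mean() > diff_threshold:
            keyframes.append(frame)
            prev = gray
    cap.release()
    return keyframes

frames = extract_keyframes("arabic_digit_clip.mp4")   # placeholder file name
print(len(frames), "keyframes kept")
```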

  • Open Access

    ARTICLE

    Speech Recognition-Based Automated Visual Acuity Testing with Adaptive Mel Filter Bank

    Shibli Nisar1, Muhammad Asghar Khan2,*, Fahad Algarni3, Abdul Wakeel1, M. Irfan Uddin4, Insaf Ullah2

    CMC-Computers, Materials & Continua, Vol.70, No.2, pp. 2991-3004, 2022, DOI:10.32604/cmc.2022.020376

    Abstract One of the most commonly reported disabilities is vision loss, which is diagnosed by an ophthalmologist in order to assess the visual system of a patient. This procedure, however, usually requires an appointment with an ophthalmologist, which is a time-consuming and expensive process. Other issues that can arise include a lack of appropriate equipment and trained practitioners, especially in rural areas. Centered on a cognitively motivated attribute extraction and speech recognition approach, this paper proposes a novel idea that immediately determines eyesight deficiency. The proposed system uses an adaptive filter bank with weighted mel frequency cepstral coefficients for…
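    A hypothetical sketch of the front-end idea named here: build a Mel filter bank, apply per-band weights (the adaptive weighting would be tuned or learned in the paper; here it is a uniform placeholder), and derive cepstral coefficients. The file name and parameter values are illustrative.

```python
# Weighted Mel filter bank -> log energies -> DCT -> cepstral coefficients.
import numpy as np
import librosa
import scipy.fftpack

def weighted_mfcc(path, sr=16000, n_fft=512, n_mels=26, n_mfcc=13, band_weights=None):
    y, _ = librosa.load(path, sr=sr)
    power = np.abs(librosa.stft(y, n_fft=n_fft)) ** 2            # power spectrogram
    fbank = librosa.filters.mel(sr=sr, n_fft=n_fft, n_mels=n_mels)
    if band_weights is None:
        band_weights = np.ones(n_mels)                           # placeholder weighting
    mel_energies = band_weights[:, None] * (fbank @ power)       # weighted filter bank
    log_mel = np.log(mel_energies + 1e-10)
    return scipy.fftpack.dct(log_mel, axis=0, norm="ortho")[:n_mfcc]

mfcc = weighted_mfcc("spoken_letter.wav")                        # placeholder file name
print(mfcc.shape)                                                # (13, frames)
```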

  • Open Access

    ARTICLE

    Noise Reduction in Industry Based on Virtual Instrumentation

    Radek Martinek1, Rene Jaros1, Jan Baros1, Lukas Danys1, Aleksandra Kawala-Sterniuk2, Jan Nedoma3,*, Zdenek Machacek1, Jiri Koziorek1

    CMC-Computers, Materials & Continua, Vol.69, No.1, pp. 1073-1096, 2021, DOI:10.32604/cmc.2021.017568

    Abstract This paper discusses the reduction of background noise in an industrial environment to extend human-machine interaction. In the Industry 4.0 era, the mass deployment of voice control (speech recognition) in various industrial applications is possible, especially in relation to augmented reality (such as hands-free control via voice commands). As Industry 4.0 relies heavily on radiofrequency technologies, some brief insight into this problem is provided, including the Internet of Things (IoT) and 5G deployment. This study was carried out in cooperation with the industrial partner Brose CZ spol. s.r.o., where sound recordings were made to produce a dataset. The experimental environment comprised…

Displaying results 11-20 of 29.