Search Results (32)
  • Open Access

    ARTICLE

    Deep Learning-Based Approach for Arabic Visual Speech Recognition

    Nadia H. Alsulami, Amani T. Jamal, Lamiaa A. Elrefaei

    CMC-Computers, Materials & Continua, Vol.71, No.1, pp. 85-108, 2022, DOI:10.32604/cmc.2022.019450 - 03 November 2021

    Abstract Lip-reading technologies are progressing rapidly following the breakthrough of deep learning. They play a vital role in many applications, such as human-machine communication and security. In this paper, we propose an effective lip-reading model for Arabic visual speech recognition based on deep learning algorithms. The collected Arabic visual datasets contain 2400 records of Arabic digits and 960 records of Arabic phrases from 24 native speakers. The primary purpose is to provide a high-performance model by enhancing the preprocessing phase. Firstly, we extract keyframes from…

  • Open Access

    ARTICLE

    Speech Recognition-Based Automated Visual Acuity Testing with Adaptive Mel Filter Bank

    Shibli Nisar, Muhammad Asghar Khan, Fahad Algarni, Abdul Wakeel, M. Irfan Uddin, Insaf Ullah

    CMC-Computers, Materials & Continua, Vol.70, No.2, pp. 2991-3004, 2022, DOI:10.32604/cmc.2022.020376 - 27 September 2021

    Abstract Vision loss is one of the most commonly reported disabilities. It can be diagnosed by an ophthalmologist, who examines the patient’s visual system. This procedure, however, usually requires an appointment with an ophthalmologist, which is both time-consuming and expensive. Other issues that can arise include a lack of appropriate equipment and trained practitioners, especially in rural areas. Centered on a cognitively motivated attribute extraction and speech recognition approach, this paper proposes a novel idea that immediately determines the eyesight deficiency. The proposed system uses an adaptive filter bank with weighted…

  • Open Access

    ARTICLE

    Noise Reduction in Industry Based on Virtual Instrumentation

    Radek Martinek, Rene Jaros, Jan Baros, Lukas Danys, Aleksandra Kawala-Sterniuk, Jan Nedoma, Zdenek Machacek, Jiri Koziorek

    CMC-Computers, Materials & Continua, Vol.69, No.1, pp. 1073-1096, 2021, DOI:10.32604/cmc.2021.017568 - 04 June 2021

    Abstract This paper discusses the reduction of background noise in an industrial environment to extend human-machine interaction. In the Industry 4.0 era, the mass development of voice control (speech recognition) in various industrial applications is possible, especially as related to augmented reality (such as hands-free control via voice commands). As Industry 4.0 relies heavily on radiofrequency technologies, some brief insight into this problem is provided, including the Internet of Things (IoT) and 5G deployment. This study was carried out in cooperation with the industrial partner Brose CZ spol. s.r.o., where sound recordings were made to produce a dataset.…

  • Open Access

    ARTICLE

    HLR-Net: A Hybrid Lip-Reading Model Based on Deep Convolutional Neural Networks

    Amany M. Sarhan, Nada M. Elshennawy, Dina M. Ibrahim

    CMC-Computers, Materials & Continua, Vol.68, No.2, pp. 1531-1549, 2021, DOI:10.32604/cmc.2021.016509 - 13 April 2021

    Abstract Lip reading is typically regarded as visually interpreting the speaker’s lip movements during speech, that is, decoding text from the movement of the speaker’s mouth. This paper proposes a lip-reading model that helps deaf people and persons with hearing problems to understand a speaker by capturing a video of the speaker and feeding it into the proposed model to obtain the corresponding subtitles. Using deep learning technologies makes it easier for users to extract a large number of different features, which can then be converted to probabilities of letters to obtain accurate results.

  • Open Access

    ARTICLE

    Oral English Speech Recognition Based on Enhanced Temporal Convolutional Network

    Hao Wu, Arun Kumar Sangaiah

    Intelligent Automation & Soft Computing, Vol.28, No.1, pp. 121-132, 2021, DOI:10.32604/iasc.2021.016457 - 17 March 2021

    Abstract In oral English teaching in China, teachers usually improve students’ pronunciation based on their subjective judgment. Even for the same student, a teacher may give different suggestions at different times. Students’ oral pronunciation features can be obtained from the reconstructed acoustic and natural language features of speech audio, but the task is complicated by the embedding of multimodal sentences. To solve this problem, this paper proposes an English speech recognition method based on an enhanced temporal convolutional network. Firstly, a suitable UNet network model is designed to extract the noise of the speech signal and achieve the purpose of…

  • Open Access

    ARTICLE

    Intelligent Speech Communication Using Double Humanoid Robots

    Li-Hong Juang, Yi-Hua Zhao

    Intelligent Automation & Soft Computing, Vol.26, No.2, pp. 291-301, 2020, DOI:10.31209/2020.100000164

    Abstract Speech recognition is one of the most convenient ways for human beings to exchange information. In this research, we want to make robots understand human language and communicate with each other through human language, and to realize man-machine and humanoid-robot interaction. Therefore, this research mainly studies NAO robots’ speech recognition and humanoid communication between double humanoid robots. This paper introduces the future direction and application prospects of speech recognition as well as its basic methods and background knowledge. This research also proposes the application of the…

  • Open Access

    ARTICLE

    Noise Cancellation Based on Voice Activity Detection Using Spectral Variation for Speech Recognition in Smart Home Devices

    Jeong-Sik Park, Seok-Hoon Kim

    Intelligent Automation & Soft Computing, Vol.26, No.1, pp. 149-159, 2020, DOI:10.31209/2019.100000136

    Abstract Various types of smart home devices offer human-machine interaction by speech recognition as a main function. Speech recognition systems may be vulnerable to rapidly changing noises in home environments. This study proposes an efficient noise cancellation approach that eliminates the noises directly on the devices in real time. Firstly, we propose an advanced voice activity detection (VAD) technique to efficiently detect speech and non-speech regions on the basis of the spectral properties of speech signals. The VAD is then employed to enhance the conventional spectral subtraction method by steadily estimating noise signals in non-speech regions.
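
The idea of updating a noise estimate only in non-speech frames and then subtracting it can be sketched as follows. This is a minimal illustration, not the authors' method: the function name, the parameters (`init_noise_frames`, `ratio`, `alpha`, `floor`), and the simple energy-ratio VAD are all assumptions made here, standing in for the spectral-variation VAD the abstract describes. Input frames are assumed to be precomputed magnitude spectra.

```python
def spectral_subtraction(frames, init_noise_frames=2, ratio=2.0,
                         alpha=0.9, floor=0.01):
    """Subtract a running noise estimate from magnitude-spectrum frames.

    frames: list of equal-length lists of non-negative magnitudes.
    Returns (cleaned_frames, speech_flags).
    All parameter names and defaults are hypothetical.
    """
    n_bins = len(frames[0])
    # Initial noise estimate: per-bin mean of the first few frames,
    # which are assumed (not verified) to contain no speech.
    noise = [sum(f[k] for f in frames[:init_noise_frames]) / init_noise_frames
             for k in range(n_bins)]

    cleaned, flags = [], []
    for frame in frames:
        # Stand-in VAD: a frame counts as speech when its total magnitude
        # exceeds `ratio` times the total of the current noise estimate.
        is_speech = sum(frame) > ratio * sum(noise)
        if not is_speech:
            # Update the noise estimate only in non-speech frames.
            noise = [alpha * n + (1.0 - alpha) * m
                     for n, m in zip(noise, frame)]
        # Magnitude subtraction with a spectral floor to avoid negatives.
        cleaned.append([max(m - n, floor * m) for m, n in zip(frame, noise)])
        flags.append(is_speech)
    return cleaned, flags
```

With `frames = [[1.0, 1.0], [1.0, 1.0], [5.0, 1.0], [1.0, 1.0]]`, only the third frame is flagged as speech, and its first bin keeps roughly 4.0 after the noise level of about 1.0 is removed.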

  • Open Access

    ARTICLE

    Implementation of a Biometric Interface in Voice Controlled Wheelchairs

    Lamia Bouafif, Noureddine Ellouze

    Sound & Vibration, Vol.54, No.1, pp. 1-15, 2020, DOI:10.32604/sv.2020.08665 - 01 March 2020

    Abstract In order to assist physically handicapped persons in their movements, we developed an embedded isolated-word automatic speech recognition (ASR) system applied to voice control of smart wheelchairs. Although several kinds of electric wheelchairs exist on the market, the problem remains that the device must be controlled manually via a joystick, which limits their use, especially by people with severe disabilities. Thus, a significant number of disabled people cannot use a standard electric wheelchair or can drive it only with difficulty. The proposed solution is to use the voice to control and…

  • Open Access

    ARTICLE

    Modified Viterbi Scoring for HMM‐Based Speech Recognition

    Jihyuck Jo, Han‐Gyu Kim, In‐Cheol Park, Bang Chul Jung, Hoyoung Yoo

    Intelligent Automation & Soft Computing, Vol.25, No.2, pp. 351-358, 2019, DOI:10.31209/2019.100000096

    Abstract A modified Viterbi scoring procedure based on Dijkstra’s shortest-path algorithm is presented in this paper. In HMM-based speech recognition systems, Viterbi scoring plays a significant role in finding the best matching model, but its computational complexity is linearly proportional to the number of reference models and their states. This complexity is therefore a serious obstacle to implementing a high-speed speech recognition system. In the proposed method, Viterbi scoring is translated into a minimum-path search, and the shortest-path algorithm is exploited to decrease the computational complexity while preventing the recognition accuracy from deteriorating.
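
The translation of Viterbi scoring into a minimum-path search can be illustrated with a generic sketch. It is not the paper's modified scoring procedure: it only shows the standard observation that, with edge costs set to negative log probabilities (which are non-negative), the most likely state sequence is the cheapest path through the HMM trellis, so Dijkstra's algorithm applies. The function name and the dictionary-based HMM layout are assumptions chosen for this example.

```python
import heapq
import math


def viterbi_shortest_path(obs, states, start_p, trans_p, emit_p):
    """Most likely state path found by Dijkstra's algorithm on the trellis.

    Nodes are (time, state) pairs; each edge costs -log(transition * emission).
    Returns (path, log_probability). The data layout is hypothetical.
    """
    def cost(p):
        # Zero-probability edges get infinite cost and are never pushed.
        return -math.log(p) if p > 0 else math.inf

    heap = []
    for s in states:
        c = cost(start_p[s]) + cost(emit_p[s][obs[0]])
        if c < math.inf:
            heapq.heappush(heap, (c, 0, s, (s,)))

    settled = set()
    last = len(obs) - 1
    while heap:
        c, t, s, path = heapq.heappop(heap)
        if (t, s) in settled:
            continue
        settled.add((t, s))
        if t == last:
            # The first final-time node popped is the cheapest overall,
            # because all edge costs are non-negative.
            return list(path), -c
        for nxt in states:
            if (t + 1, nxt) in settled:
                continue
            step = cost(trans_p[s][nxt]) + cost(emit_p[nxt][obs[t + 1]])
            if step < math.inf:
                heapq.heappush(heap, (c + step, t + 1, nxt, path + (nxt,)))
    return [], -math.inf
```

Unlike the full dynamic-programming sweep, this search can stop as soon as a final-time node is settled, so trellis nodes on clearly expensive paths may never be expanded. Whether that saves work in practice depends on the model, and the sketch makes no claim about the savings reported in the paper.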

  • Open Access

    ARTICLE

    Tibetan Multi-Dialect Speech Recognition Using Latent Regression Bayesian Network and End-To-End Mode

    Yue Zhao, Jianjian Yue, Wei Song, Xiaona Xu, Xiali Li, Licheng Wu, Qiang Ji

    Journal on Internet of Things, Vol.1, No.1, pp. 17-23, 2019, DOI:10.32604/jiot.2019.05866

    Abstract We propose a method that uses a latent regression Bayesian network (LRBN) to extract shared speech features as the input of an end-to-end speech recognition model. The structure of the LRBN is compact and its parameter learning is fast. Compared with a Convolutional Neural Network, it has a simpler, more understandable structure and fewer parameters to learn. Experimental results show the advantage of the hybrid LRBN/Bidirectional Long Short-Term Memory-Connectionist Temporal Classification architecture for Tibetan multi-dialect speech recognition, and demonstrate that the LRBN helps to differentiate among multiple language speech sets.

Displaying results 21-30 of 32 (page 3).