
Search Results (29)
  • Open Access

    ARTICLE

    HLR-Net: A Hybrid Lip-Reading Model Based on Deep Convolutional Neural Networks

    Amany M. Sarhan1, Nada M. Elshennawy1, Dina M. Ibrahim1,2,*

    CMC-Computers, Materials & Continua, Vol.68, No.2, pp. 1531-1549, 2021, DOI:10.32604/cmc.2021.016509

    Abstract

    Lip reading is typically regarded as the visual interpretation of a speaker’s lip movements while speaking, i.e., the task of decoding text from the speaker’s mouth movements. This paper proposes a lip-reading model that helps deaf people and persons with hearing problems understand a speaker by capturing a video of the speaker and feeding it into the model to obtain the corresponding subtitles. Deep learning technologies make it easy to extract a large number of different features, which can then be converted into letter probabilities to obtain accurate results. Recently proposed methods for…

  • Open Access

    ARTICLE

    Oral English Speech Recognition Based on Enhanced Temporal Convolutional Network

    Hao Wu1,*, Arun Kumar Sangaiah2

    Intelligent Automation & Soft Computing, Vol.28, No.1, pp. 121-132, 2021, DOI:10.32604/iasc.2021.016457

    Abstract In oral English teaching in China, teachers usually correct students’ pronunciation based on subjective judgment, and even the same student may receive different suggestions at different times. Students’ oral pronunciation features can be obtained from the reconstructed acoustic and natural-language features of speech audio, but the task is complicated by the embedding of multimodal sentences. To solve this problem, this paper proposes an English speech recognition method based on an enhanced temporal convolutional network. First, a suitable UNet model is designed to remove noise from the speech signal and thereby achieve speech enhancement. Second, a network…
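The enhanced temporal convolutional network mentioned above builds on causal dilated convolutions. As an illustrative sketch only (the kernel, dilation, and zero-padding choices are assumptions, not the authors' configuration), a causal dilated 1-D convolution can be written as:

```python
import numpy as np

def causal_dilated_conv1d(x, kernel, dilation=1):
    """Causal dilated 1-D convolution: the output at time t depends only on
    inputs at times t, t-d, t-2d, ... (d = dilation), never on the future."""
    k = len(kernel)
    # Left-pad with zeros so the output has the same length as the input.
    pad = (k - 1) * dilation
    xp = np.concatenate([np.zeros(pad), x])
    y = np.zeros(len(x))
    for t in range(len(x)):
        # Each tap looks backwards in time at spacing `dilation`.
        y[t] = sum(kernel[i] * xp[pad + t - i * dilation] for i in range(k))
    return y
```

With dilations doubling per layer (1, 2, 4, ...), the receptive field grows exponentially in depth while each output still depends only on past samples.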

  • Open Access

    ARTICLE

    A Phoneme-Based Approach for Eliminating the Out-of-Vocabulary Problem in Turkish Speech Recognition Using a Hidden Markov Model

    Erdem Yavuz1,*, Vedat Topuz2

    Computer Systems Science and Engineering, Vol.33, No.6, pp. 429-445, 2018, DOI:10.32604/csse.2018.33.429

    Abstract Since Turkish is a morphologically productive language, it is almost impossible for a word-based recognition system to model the language completely. Because a word-based system has difficulty recognizing words it has not been introduced to, the recognition success rate drops considerably owing to out-of-vocabulary words. In this study, a speaker-dependent, phoneme-based word recognition system has been designed and implemented for Turkish to overcome this problem. An algorithm for finding phoneme boundaries has been devised in order to segment a word into its phonemes. After the segmentation of words into…
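A phoneme-based recognizer of the kind described above must map a (possibly misrecognized) phoneme sequence to a vocabulary word. A minimal sketch of such a lexicon lookup via Levenshtein distance follows; the toy lexicon and phoneme symbols are made up for illustration, not taken from the paper:

```python
def edit_distance(a, b):
    """Levenshtein distance between two phoneme sequences."""
    m, n = len(a), len(b)
    d = [[0] * (n + 1) for _ in range(m + 1)]
    for i in range(m + 1):
        d[i][0] = i
    for j in range(n + 1):
        d[0][j] = j
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = 0 if a[i - 1] == b[j - 1] else 1
            d[i][j] = min(d[i - 1][j] + 1,        # deletion
                          d[i][j - 1] + 1,        # insertion
                          d[i - 1][j - 1] + cost) # substitution
    return d[m][n]

def nearest_word(phonemes, lexicon):
    """Return the lexicon word whose pronunciation is closest to the
    recognized phoneme sequence."""
    return min(lexicon, key=lambda w: edit_distance(phonemes, lexicon[w]))
```

Even when one phoneme is misrecognized, the closest pronunciation in the lexicon usually still wins, which is what makes the approach robust to segmentation noise.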

  • Open Access

    ARTICLE

    Intelligent Speech Communication Using Double Humanoid Robots

    Li-Hong Juang1,*, Yi-Hua Zhao2

    Intelligent Automation & Soft Computing, Vol.26, No.2, pp. 291-301, 2020, DOI:10.31209/2020.100000164

    Abstract Speech recognition is one of the most convenient ways for human beings to exchange information. In this research, we want to make robots understand human language and communicate through it, realizing man-machine and humanoid-robot interaction. Therefore, this research mainly studies NAO robots’ speech recognition and humanoid communication between double humanoid robots. This paper introduces the future directions and application prospects of speech recognition, as well as the basic methods and knowledge of the speech recognition field. This research also proposes the application of the most advanced method: establishment of the…

  • Open Access

    ARTICLE

    Noise Cancellation Based on Voice Activity Detection Using Spectral Variation for Speech Recognition in Smart Home Devices

    Jeong-Sik Park1, Seok-Hoon Kim2,*

    Intelligent Automation & Soft Computing, Vol.26, No.1, pp. 149-159, 2020, DOI:10.31209/2019.100000136

    Abstract Various types of smart home devices rely on speech recognition as a primary means of human-machine interaction. Speech recognition systems, however, can be vulnerable to the rapidly changing noises of home environments. This study proposes an efficient noise cancellation approach that eliminates noise directly on the device in real time. First, we propose an advanced voice activity detection (VAD) technique that efficiently detects speech and non-speech regions on the basis of the spectral properties of speech signals. The VAD is then employed to enhance the conventional spectral subtraction method by steadily estimating noise signals in non-speech regions. In several experiments, our approach…
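A minimal sketch of VAD-guided spectral subtraction, assuming a simple frame-energy VAD and a running-mean noise estimate (the paper's VAD is based on spectral variation and is more elaborate):

```python
import numpy as np

def spectral_subtract(frames, vad_threshold):
    """Spectral subtraction with a toy energy-based VAD.
    `frames`: 2-D array (n_frames, frame_len) of windowed time-domain frames.
    Frames whose mean energy falls below `vad_threshold` are treated as
    noise-only and used to update a running noise-magnitude estimate."""
    spectra = np.fft.rfft(frames, axis=1)
    mag, phase = np.abs(spectra), np.angle(spectra)
    noise = np.zeros(mag.shape[1])
    n_noise = 0
    out = np.empty_like(mag)
    for i, frame in enumerate(frames):
        if np.mean(frame ** 2) < vad_threshold:   # non-speech region
            n_noise += 1
            noise += (mag[i] - noise) / n_noise   # running-mean noise update
        # Subtract the noise estimate, flooring the magnitude at zero.
        out[i] = np.maximum(mag[i] - noise, 0.0)
    return np.fft.irfft(out * np.exp(1j * phase), n=frames.shape[1], axis=1)
```

Updating the noise estimate only in non-speech regions is what lets the method track slowly changing background noise without eating into the speech itself.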

  • Open Access

    ARTICLE

    Modified Viterbi Scoring for HMM-Based Speech Recognition

    Jihyuck Jo1, Han-Gyu Kim2, In-Cheol Park1, Bang Chul Jung3, Hoyoung Yoo3

    Intelligent Automation & Soft Computing, Vol.25, No.2, pp. 351-358, 2019, DOI:10.31209/2019.100000096

    Abstract A modified Viterbi scoring procedure based on Dijkstra’s shortest-path algorithm is presented in this paper. In HMM-based speech recognition systems, Viterbi scoring plays a significant role in finding the best-matching model, but its computational complexity is linearly proportional to the number of reference models and their states. This complexity is therefore a serious obstacle to implementing a high-speed speech recognition system. In the proposed method, Viterbi scoring is recast as the search for a minimum-cost path, and the shortest-path algorithm is exploited to decrease the computational complexity while preventing the recognition accuracy from deteriorating. In addition, a two-phase comparison…
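The recasting of Viterbi scoring as a shortest-path search can be sketched as follows: negating log-probabilities yields non-negative edge weights over the trellis, so Dijkstra's algorithm applies. The trellis construction below is illustrative, not the authors' exact procedure:

```python
import heapq
import math

def viterbi_shortest_path(obs_logprobs, trans_probs):
    """Viterbi scoring as a shortest-path search over the trellis.
    Nodes are (time, state); edge weight = -log(transition) - log(emission),
    which is non-negative, so Dijkstra's algorithm is valid.
    `obs_logprobs[t][s]`: log P(observation at t | state s).
    `trans_probs[s][s2]`: P(state s2 | state s).
    Returns the best total log-probability over all state paths."""
    T, S = len(obs_logprobs), len(trans_probs)
    # Start from any state at t=0; initial cost = -emission log-prob.
    heap = [(-obs_logprobs[0][s], 0, s) for s in range(S)]
    heapq.heapify(heap)
    settled = set()
    while heap:
        cost, t, s = heapq.heappop(heap)
        if (t, s) in settled:
            continue
        settled.add((t, s))
        if t == T - 1:
            return -cost  # first final-time node popped is optimal
        for s2 in range(S):
            if trans_probs[s][s2] > 0:
                w = -math.log(trans_probs[s][s2]) - obs_logprobs[t + 1][s2]
                heapq.heappush(heap, (cost + w, t + 1, s2))
    return -math.inf
```

Because all edge weights are non-negative, the first final-time node popped from the priority queue already carries the optimal score, so the search can stop without expanding the full trellis.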

  • Open Access

    ARTICLE

    Implementation of a Biometric Interface in Voice Controlled Wheelchairs

    Lamia Bouafif1, Noureddine Ellouze2,*

    Sound & Vibration, Vol.54, No.1, pp. 1-15, 2020, DOI:10.32604/sv.2020.08665

    Abstract To assist physically handicapped persons in their movements, we developed an embedded isolated-word speech recognition (ASR) system for the voice control of smart wheelchairs. Although several kinds of electric wheelchairs exist on the industrial market, they must still be controlled manually by hand via a joystick, which limits their use, especially by people with severe disabilities. As a result, a significant number of disabled people cannot use a standard electric wheelchair or can drive one only with difficulty. The proposed solution is to use the voice to control and drive the wheelchair instead…
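Embedded isolated-word recognizers of this kind are often built on template matching. As a sketch under stated assumptions (scalar per-frame features and made-up command templates; the paper's system may work differently), dynamic time warping (DTW) against stored templates looks like:

```python
import math

def dtw_distance(a, b):
    """Dynamic time warping distance between two feature sequences,
    using absolute per-frame difference as the local cost."""
    m, n = len(a), len(b)
    D = [[math.inf] * (n + 1) for _ in range(m + 1)]
    D[0][0] = 0.0
    for i in range(1, m + 1):
        for j in range(1, n + 1):
            cost = abs(a[i - 1] - b[j - 1])
            # Allow match, insertion, or deletion steps along the warp path.
            D[i][j] = cost + min(D[i - 1][j], D[i][j - 1], D[i - 1][j - 1])
    return D[m][n]

def recognize(utterance, templates):
    """Pick the command word whose stored template is closest under DTW."""
    return min(templates, key=lambda w: dtw_distance(utterance, templates[w]))
```

DTW absorbs differences in speaking rate, which is why a small set of per-word templates suffices for a fixed command vocabulary such as wheelchair directions.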

  • Open Access

    ARTICLE

    Tibetan Multi-Dialect Speech Recognition Using Latent Regression Bayesian Network and End-To-End Mode

    Yue Zhao1, Jianjian Yue1, Wei Song1,*, Xiaona Xu1, Xiali Li1, Licheng Wu1, Qiang Ji2

    Journal on Internet of Things, Vol.1, No.1, pp. 17-23, 2019, DOI:10.32604/jiot.2019.05866

    Abstract We propose a method that uses a latent regression Bayesian network (LRBN) to extract shared speech features as the input of an end-to-end speech recognition model. The structure of the LRBN is compact, and its parameter learning is fast. Compared with a convolutional neural network, it has a simpler, more interpretable structure and fewer parameters to learn. Experimental results show the advantage of the hybrid LRBN/Bidirectional Long Short-Term Memory-Connectionist Temporal Classification architecture for Tibetan multi-dialect speech recognition, and demonstrate that the LRBN helps differentiate among multiple language speech sets.
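The Connectionist Temporal Classification output in the hybrid architecture above is commonly decoded greedily: take the per-frame argmax, collapse consecutive repeats, then drop blanks. A minimal sketch, in which the label indices and blank symbol are illustrative assumptions rather than the paper's inventory:

```python
def ctc_greedy_decode(frame_probs, blank=0):
    """Greedy CTC decoding: argmax label per frame, collapse consecutive
    repeats, then remove the blank symbol."""
    path = [max(range(len(p)), key=p.__getitem__) for p in frame_probs]
    out, prev = [], None
    for lab in path:
        if lab != prev and lab != blank:
            out.append(lab)
        prev = lab
    return out
```

The blank symbol is what lets CTC emit the same label twice in a row: a blank between two identical argmax runs keeps them from being collapsed into one.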

  • Open Access

    ARTICLE

    Tibetan Multi-Dialect Speech and Dialect Identity Recognition

    Yue Zhao1, Jianjian Yue1, Wei Song1,*, Xiaona Xu1, Xiali Li1, Licheng Wu1, Qiang Ji2

    CMC-Computers, Materials & Continua, Vol.60, No.3, pp. 1223-1235, 2019, DOI:10.32604/cmc.2019.05636

    Abstract The Tibetan language has very limited resources for conventional automatic speech recognition: sufficient data, sub-word units, lexicons, and word inventories are lacking for some dialects. Moreover, speech content recognition and dialect classification have been treated as two independent tasks and modeled separately in most prior work, although the two tasks are highly correlated. In this paper, we present a multi-task WaveNet model that performs Tibetan multi-dialect speech recognition and dialect identification simultaneously. It avoids processing the pronunciation dictionary and word segmentation for new dialects while allowing speech recognition and dialect identification to be trained in a single…

Displaying results 21-29 of 29 (page 3 of 3).