Visual Lip-Reading for Quranic Arabic Alphabets and Words Using Deep Learning

Nada Aljohani; Emad Jaha

doi:10.32604/csse.2023.037113

Open Access

ARTICLE

Nada Faisal Aljohani*, Emad Sami Jaha

Department of Computer Science, Faculty of Computing and Information Technology, King Abdulaziz University, Jeddah, 21589, Saudi Arabia

* Corresponding Author: Nada Faisal Aljohani. Email: email

Computer Systems Science and Engineering 2023, 46(3), 3037-3058. https://doi.org/10.32604/csse.2023.037113

Received 24 October 2022; Accepted 21 December 2022; Issue published 03 April 2023

Abstract

Continuing advances in deep learning have paved the way for several challenging ideas. One such idea is visual lip-reading, which has recently drawn considerable research interest. Lip-reading, often referred to as visual speech recognition, is the ability to understand and predict spoken speech based solely on lip movements, without using sound. Research on visual speech recognition for the Arabic language is scarce in general and absent in Quranic research; this study aims to fill that gap. This paper introduces a new publicly available Arabic lip-reading dataset containing 10490 videos captured from multiple viewpoints and comprising data samples at the letter level (i.e., single letters of the alphabet and Quranic disjoined letters) and at the word level, based on the content and context of the book Al-Qaida Al-Noorania. This research uses visual speech recognition to recognize spoken Arabic letters, Quranic disjoined letters, and Quranic words, mainly phonetically, as they are recited in the Holy Quran according to the Quranic study aid entitled Al-Qaida Al-Noorania. This study could further validate the correctness of pronunciation and, subsequently, assist people in correctly reciting the Quran. Furthermore, a detailed description of the created dataset and its construction methodology is provided. The new dataset is used to train a pre-trained deep learning CNN model through transfer learning for lip-reading, achieving accuracies of 83.3%, 80.5%, and 77.5% on words, disjoined letters, and single letters, respectively, and an extended analysis of the results is provided. Finally, the experimental outcomes, different research aspects, and dataset collection consistency and challenges are discussed, and the paper concludes with several promising directions for future work.

Keywords

Visual speech recognition; lip-reading; deep learning; Quranic Arabic dataset; Tajwid

Cite This Article

APA Style

Aljohani, N.F., Jaha, E.S. (2023). Visual lip-reading for Quranic Arabic alphabets and words using deep learning. Computer Systems Science and Engineering, 46(3), 3037-3058. https://doi.org/10.32604/csse.2023.037113

Vancouver Style

Aljohani NF, Jaha ES. Visual lip-reading for Quranic Arabic alphabets and words using deep learning. Comput Syst Sci Eng. 2023;46(3):3037-3058. https://doi.org/10.32604/csse.2023.037113

IEEE Style

N.F. Aljohani and E.S. Jaha, "Visual Lip-Reading for Quranic Arabic Alphabets and Words Using Deep Learning," Comput. Syst. Sci. Eng., vol. 46, no. 3, pp. 3037-3058, 2023. https://doi.org/10.32604/csse.2023.037113

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
