Oral English Speech Recognition Based on Enhanced Temporal Convolutional Network

Hao Wu; Arun Sangaiah

doi:10.32604/iasc.2021.016457

Open Access

ARTICLE

Hao Wu^1,*, Arun Kumar Sangaiah^2

1 Hunan Radio and TV University, Changsha, 410004, China
2 School of Computing Science and Engineering, Vellore Institute of Technology, Tamil Nadu, 632014, India

* Corresponding Author: Hao Wu. Email: email

Intelligent Automation & Soft Computing 2021, 28(1), 121-132. https://doi.org/10.32604/iasc.2021.016457

Received 02 January 2021; Accepted 02 February 2021; Issue published 17 March 2021

Abstract

In oral English teaching in China, teachers usually correct students’ pronunciation based on subjective judgment, and a teacher may give the same student different suggestions at different times. Students’ oral pronunciation features can be obtained from the reconstructed acoustic and natural language features of speech audio, but the task is complicated by the embedding of multimodal sentences. To solve this problem, this paper proposes an English speech recognition method based on an enhanced temporal convolutional network. First, a suitable UNet network model is designed to extract the noise from the speech signal and thereby achieve speech enhancement. Second, a network model with stable parameters is obtained by pre-training, which helps to distinguish spoken speech signals. Third, a temporal convolutional network with residual connections is designed to infer the meaning of the pronunciation. Finally, the speech is graded according to the difference between the output value and the real result, and intelligent guidance can then be given based on the details of each student’s oral pronunciation. The experimental results show that the model obtained after training is improved while its file size is kept under control. Test results on the LibriSpeech ASR corpus demonstrate the effectiveness and advantages of this approach.
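
The abstract's third step, a temporal convolutional network with residual connections, can be illustrated with a minimal single-channel sketch in plain Python. This is not the authors' architecture: the abstract gives no layer sizes, kernel widths, or normalisation details, so the function names (`causal_dilated_conv`, `residual_tcn_block`, `tcn`), the ReLU activation, and the doubling dilation schedule are all assumptions chosen only to show the general idea of a causal dilated convolution wrapped in a skip connection.

```python
from typing import List, Optional, Sequence


def causal_dilated_conv(x: Sequence[float], kernel: Sequence[float],
                        dilation: int) -> List[float]:
    """1-D causal convolution: out[t] uses only x[t], x[t-d], x[t-2d], ..."""
    out = []
    for t in range(len(x)):
        acc = 0.0
        for k, w in enumerate(kernel):
            idx = t - k * dilation
            if idx >= 0:  # positions before the start act as zero padding
                acc += w * x[idx]
        out.append(acc)
    return out


def relu(x: Sequence[float]) -> List[float]:
    """Element-wise rectified linear unit."""
    return [max(0.0, v) for v in x]


def residual_tcn_block(x: Sequence[float], kernel: Sequence[float],
                       dilation: int) -> List[float]:
    """Residual block: y = x + ReLU(causal_dilated_conv(x))."""
    h = relu(causal_dilated_conv(x, kernel, dilation))
    return [xi + hi for xi, hi in zip(x, h)]


def tcn(x: Sequence[float], kernels: Sequence[Sequence[float]],
        dilations: Optional[Sequence[int]] = None) -> List[float]:
    """Stack residual blocks; by default dilations double: 1, 2, 4, ..."""
    if dilations is None:
        dilations = [2 ** i for i in range(len(kernels))]
    y = list(x)
    for kernel, d in zip(kernels, dilations):
        y = residual_tcn_block(y, kernel, d)
    return y


def receptive_field(kernel_sizes: Sequence[int],
                    dilations: Sequence[int]) -> int:
    """Number of input steps that can influence one output step."""
    return 1 + sum((k - 1) * d for k, d in zip(kernel_sizes, dilations))


if __name__ == "__main__":
    signal = [1.0, 2.0, 3.0, 4.0]          # hypothetical toy feature sequence
    weights = [[1.0, 1.0], [1.0, 1.0]]     # hypothetical, untrained kernels
    print(tcn(signal, weights))
    print(receptive_field([2, 2], [1, 2]))
```

Because each block only looks backwards in time, changing a later input sample never alters an earlier output, and the skip connection lets the input pass through unchanged wherever the convolution branch outputs zero. A real model would use many channels and learned weights, typically via a deep learning framework.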

Keywords

Temporal convolutional network; college English teaching; speech recognition; teaching model

Cite This Article

APA Style

Wu, H., & Sangaiah, A. K. (2021). Oral English Speech Recognition Based on Enhanced Temporal Convolutional Network. Intelligent Automation & Soft Computing, 28(1), 121–132. https://doi.org/10.32604/iasc.2021.016457

Vancouver Style

Wu H, Sangaiah AK. Oral English Speech Recognition Based on Enhanced Temporal Convolutional Network. Intell Automat Soft Comput. 2021;28(1):121–132. https://doi.org/10.32604/iasc.2021.016457

IEEE Style

H. Wu and A. K. Sangaiah, “Oral English Speech Recognition Based on Enhanced Temporal Convolutional Network,” Intell. Automat. Soft Comput., vol. 28, no. 1, pp. 121–132, 2021. https://doi.org/10.32604/iasc.2021.016457

Copyright © 2021 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
