Open Access



Oral English Speech Recognition Based on Enhanced Temporal Convolutional Network

Hao Wu1,*, Arun Kumar Sangaiah2

1 Hunan Radio and TV University, Changsha, 410004, China
2 School of Computing Science and Engineering, Vellore Institute of Technology, Tamil Nadu, 632014, India

* Corresponding Author: Hao Wu.

Intelligent Automation & Soft Computing 2021, 28(1), 121-132.


In oral English teaching in China, teachers usually correct students’ pronunciation based on subjective judgment, and the same teacher may give the same student different suggestions at different times. Students’ pronunciation features can be obtained from the reconstructed acoustic and natural-language features of speech audio, but the task is complicated by the embedding of multimodal sentences. To address this problem, this paper proposes an English speech recognition method based on an enhanced temporal convolutional network. First, a UNet model is designed to extract the noise from the speech signal and thereby enhance the speech. Second, a network model with stable parameters is obtained by pre-training, which helps to distinguish spoken speech signals. Third, a temporal convolutional network with a residual connection is designed to infer the meaning of the pronunciation. Finally, the speech is graded according to the difference between the output value and the real result; based on the details of each student’s pronunciation, intelligent guidance can then be provided. The experimental results show that the trained model is improved while its file size is kept under control. Test results on the LibriSpeech ASR corpus demonstrate the effectiveness and advantage of this approach.
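As a rough illustration of the third step, the sketch below shows one common way to build a temporal convolution block with a residual connection: two dilated causal 1-D convolutions whose output is added back to the input. It is a generic NumPy sketch, not the architecture from the paper. The function names (`causal_dilated_conv1d`, `residual_tcn_block`), the kernel size, the dilation, and the use of ReLU are all assumptions made for illustration.

```python
import numpy as np


def causal_dilated_conv1d(x, w, dilation):
    """Dilated causal 1-D convolution (hypothetical helper).

    x: (T, C_in) sequence of feature frames.
    w: (K, C_in, C_out) kernel.
    Returns (T, C_out); output frame t depends only on frames <= t.
    """
    T, c_in = x.shape
    K, _, c_out = w.shape
    pad = (K - 1) * dilation
    # Left-pad with zeros so no output frame can see future input frames.
    xp = np.concatenate([np.zeros((pad, c_in)), x], axis=0)
    y = np.zeros((T, c_out))
    for k in range(K):
        # Tap k reads the input (K - 1 - k) * dilation frames in the past.
        start = k * dilation
        y += xp[start:start + T] @ w[k]
    return y


def residual_tcn_block(x, w1, w2, dilation):
    """Two causal convolutions plus a skip connection (assumed layout).

    Requires C_in == C_out so that the input can be added to the output.
    """
    h = np.maximum(causal_dilated_conv1d(x, w1, dilation), 0.0)  # ReLU
    h = causal_dilated_conv1d(h, w2, dilation)
    return np.maximum(x + h, 0.0)  # residual connection, then ReLU


# Usage: 16 frames with 4 channels each, kernel size 3, dilation 2.
rng = np.random.default_rng(0)
frames = rng.standard_normal((16, 4))
w1 = rng.standard_normal((3, 4, 4)) * 0.1
w2 = rng.standard_normal((3, 4, 4)) * 0.1
out = residual_tcn_block(frames, w1, w2, dilation=2)
```

Stacking such blocks with growing dilation widens the span of past frames each output can see, while the skip connection keeps the input signal available to deeper layers.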


Cite This Article

APA Style
Wu, H., & Sangaiah, A. K. (2021). Oral English speech recognition based on enhanced temporal convolutional network. Intelligent Automation & Soft Computing, 28(1), 121-132.
Vancouver Style
Wu H, Sangaiah AK. Oral English speech recognition based on enhanced temporal convolutional network. Intell Automat Soft Comput. 2021;28(1):121-132.
IEEE Style
H. Wu and A. K. Sangaiah, "Oral English Speech Recognition Based on Enhanced Temporal Convolutional Network," Intell. Automat. Soft Comput., vol. 28, no. 1, pp. 121-132, 2021.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.