Open Access

ARTICLE

DLBT: Deep Learning-Based Transformer to Generate Pseudo-Code from Source Code

Walaa Gad1,*, Anas Alokla1, Waleed Nazih2, Mustafa Aref1, Abdel-badeeh Salem1
1 Faculty of Computers and Information Sciences, Ain Shams University, Abassia, Cairo, 11566, Egypt
2 College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al Kharj, 11942, Saudi Arabia
* Corresponding Author: Walaa Gad.
(This article belongs to this Special Issue: Emerging Applications of Artificial Intelligence, Machine learning and Data Science)

Computers, Materials & Continua 2022, 70(2), 3117-3132. https://doi.org/10.32604/cmc.2022.019884

Received 29 April 2021; Accepted 21 June 2021; Issue published 27 September 2021

Abstract

Understanding the content of source code is very difficult when it is written in an unfamiliar programming language. Pseudo-code explains and describes the content of the code without using the syntax or technicalities of a programming language. However, writing pseudo-code for each code instruction is laborious. Recently, neural machine translation has been used to generate textual descriptions for source code. In this paper, a novel deep learning-based transformer (DLBT) model is proposed for automatic pseudo-code generation from source code. The proposed model uses deep learning based on Neural Machine Translation (NMT) to act as a language translator. The DLBT is built on the transformer, an encoder-decoder architecture, and has three major components: tokenizer and embeddings, transformer, and post-processing. Each code line is tokenized into a dense vector. The transformer then captures the relatedness between the source code and the matching pseudo-code without the need for a Recurrent Neural Network (RNN). In the post-processing step, the generated pseudo-code is optimized. The proposed model is assessed on a real Python dataset containing more than 18,800 lines of source code. The experiments show promising performance compared with other machine translation methods such as RNNs. The proposed DLBT achieves 47.32% accuracy and a BLEU score of 68.49.
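The first pipeline stage described above (tokenizer and embeddings) can be sketched in plain Python. This is a minimal, illustrative sketch, not the paper's implementation: the regular expression, vocabulary scheme, and function names are assumptions, and the integer ids produced here stand in for the dense embedding vectors that the transformer encoder would consume.

```python
import re

# Hypothetical token pattern: identifiers, numbers, two-character
# comparison operators, then any single punctuation character.
TOKEN_RE = re.compile(r"[A-Za-z_]\w*|\d+|==|!=|<=|>=|[^\s\w]")

def tokenize(code_line):
    """Split one line of source code into lexical tokens."""
    return TOKEN_RE.findall(code_line)

def build_vocab(lines):
    """Assign an integer id to every token seen in the training lines."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for line in lines:
        for tok in tokenize(line):
            vocab.setdefault(tok, len(vocab))
    return vocab

def encode(code_line, vocab):
    # Map tokens to ids; an embedding layer in the transformer would
    # then turn each id into a dense vector.
    return [vocab.get(t, vocab["<unk>"]) for t in tokenize(code_line)]

lines = ["if x % 5 == 0:", "x = x + 1"]
vocab = build_vocab(lines)
print(tokenize("if x % 5 == 0:"))   # ['if', 'x', '%', '5', '==', '0', ':']
print(encode("x = x + 1", vocab))   # [3, 9, 3, 10, 11]
```

In a full system, the id sequences for source lines and their pseudo-code would be fed to a transformer encoder-decoder, and the post-processing step would detokenize and clean the decoder's output.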

Keywords

Natural language processing; long short-term memory; neural machine translation; pseudo-code generation; deep learning-based transformer

Cite This Article

W. Gad, A. Alokla, W. Nazih, M. Aref and A. Salem, "DLBT: deep learning-based transformer to generate pseudo-code from source code," Computers, Materials & Continua, vol. 70, no. 2, pp. 3117–3132, 2022.

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.