DLBT: Deep Learning-Based Transformer to Generate Pseudo-Code from Source Code

Walaa Gad; Anas Alokla; Waleed Nazih; Mustafa Aref; Abdel-badeeh Salem

doi:10.32604/cmc.2022.019884

Open Access

ARTICLE


Walaa Gad¹,*, Anas Alokla¹, Waleed Nazih², Mustafa Aref¹, Abdel-badeeh Salem¹

1 Faculty of Computers and Information Sciences, Ain Shams University, Abassia, Cairo, 11566, Egypt
2 College of Computer Engineering and Sciences, Prince Sattam Bin Abdulaziz University, Al Kharj, 11942, Saudi Arabia

* Corresponding Author: Walaa Gad. Email: email .e.g

(This article belongs to the Special Issue: Emerging Applications of Artificial Intelligence, Machine learning and Data Science)

Computers, Materials & Continua 2022, 70(2), 3117-3132. https://doi.org/10.32604/cmc.2022.019884

Received 29 April 2021; Accepted 21 June 2021; Issue published 27 September 2021

Abstract

Understanding source code and its regular expressions is very difficult when they are written in an unfamiliar language. Pseudo-code explains and describes the content of the code without relying on the syntax or technologies of a particular programming language. However, writing pseudo-code for each code instruction is laborious. Recently, neural machine translation has been used to generate textual descriptions of source code. In this paper, a novel deep learning-based transformer (DLBT) model is proposed for automatic pseudo-code generation from source code. The proposed model uses deep learning based on Neural Machine Translation (NMT) to work as a language translator. The DLBT is based on the transformer, which is an encoder-decoder structure. It has three major components: tokenizer and embeddings, transformer, and post-processing. Each code line is tokenized and mapped to a dense vector. The transformer then captures the relatedness between the source code and the matching pseudo-code without the need for a Recurrent Neural Network (RNN). In the post-processing step, the generated pseudo-code is optimized. The proposed model is assessed using a real Python dataset, which contains more than 18,800 lines of source code written in Python. The experiments show promising performance compared with other machine translation methods such as the Recurrent Neural Network (RNN). The proposed DLBT records 47.32 and 68.49 on the accuracy and BLEU performance measures, respectively.
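As a rough illustration of the first component named in the abstract (tokenizing each code line before it is embedded), the sketch below splits a line of Python into lexical tokens and maps them to integer ids that an embedding layer could consume. The abstract does not specify the tokenizer the authors used, so this is only an assumption: it relies on Python's standard `tokenize` module, and the helper names `tokenize_code_line`, `build_vocab`, and `encode` are hypothetical.

```python
import io
import tokenize

# Token types that carry no content for a single-line input.
_SKIP = {tokenize.NEWLINE, tokenize.NL, tokenize.ENDMARKER,
         tokenize.INDENT, tokenize.DEDENT}


def tokenize_code_line(line):
    """Split one line of Python source into a list of token strings.

    Assumption: the line is lexically complete. A line with an unclosed
    bracket or string raises an error in the standard tokenizer.
    """
    reader = io.StringIO(line.strip() + "\n").readline
    return [tok.string for tok in tokenize.generate_tokens(reader)
            if tok.type not in _SKIP]


def build_vocab(token_lists):
    """Assign an integer id to every distinct token (hypothetical helper)."""
    vocab = {"<pad>": 0, "<unk>": 1}
    for tokens in token_lists:
        for token in tokens:
            vocab.setdefault(token, len(vocab))
    return vocab


def encode(tokens, vocab):
    """Map tokens to ids; unseen tokens fall back to the <unk> id."""
    return [vocab.get(token, vocab["<unk>"]) for token in tokens]


tokens = tokenize_code_line("x = foo(1, 2)")
vocab = build_vocab([tokens])
ids = encode(tokens, vocab)
```

In a full model, these integer ids would index rows of a learned embedding matrix to produce the dense vectors passed to the transformer encoder; that step is omitted here because it depends on the chosen deep learning framework.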

Keywords

Natural language processing; long short-term memory; neural machine translation; pseudo-code generation; deep learning-based transformer

Cite This Article

W. Gad, A. Alokla, W. Nazih, M. Aref and A. Salem, "DLBT: Deep learning-based transformer to generate pseudo-code from source code," Computers, Materials & Continua, vol. 70, no. 2, pp. 3117–3132, 2022. https://doi.org/10.32604/cmc.2022.019884



This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
