NewBee: Context-Free Grammar (CFG) of a New Programming Language for Novice Programmers
1 Department of Computer Science, Bahria University Lahore, 54000, Pakistan
2 Department of Law, Science and Technology, University of Bologna, 40126, Italy
3 Department of Multidisciplinary Engineering, Texas A & M University, College Station, 77843, USA
* Corresponding Author: Muhammad Aasim Qureshi. Email:
Intelligent Automation & Soft Computing 2023, 37(1), 439-453. https://doi.org/10.32604/iasc.2023.036102
Received 17 September 2022; Accepted 23 November 2022; Issue published 29 April 2023
Abstract: Learning to program and using programming languages are essential aspects of computer science education. Students use programming languages to write their programs. These programs, whether written by students or practitioners, make computers useful by performing the tasks their users need; without them, a computer may be seen as a pointless machine. Because programs are written in specific programming languages, enormous effort has gone into designing and developing such languages. However, each programming language has its own domain, nuances, syntax and semantics, with specific pros and cons. These language-specific details, including syntax and semantics, are significant hurdles for novice programmers. Instructors of introductory programming courses likewise find these language specificities to be the biggest hurdle in students’ learning, because more attention goes to syntax than to logic development and the actual implementation of the program. Considering the conceptual difficulty of programming languages and novice students’ struggles with language syntax, this paper describes the design and development of a Context-Free Grammar (CFG) of a programming language for novices, newcomers and students who do not have computer science as their major. Because its syntax is close to everyday conversation, this paper hypothesizes that the language will be easy for novice programmers to use and understand. The language was designed systematically by identifying themes from various existing programming languages (e.g., C, Python). Additionally, this paper surveyed computer science experts from industry and academia, who self-reported their satisfaction with the newly designed language. The results indicate that 93% of the experts reported satisfaction with NewBee for novice, newcomer and non-Computer Science (CS) major students.
A structured system of communication between humans, between computers, or between the two (i.e., human to human, computer to computer, and human to computer or computer to human) is known as a language. In communications and computer science, two types of languages predominate: natural and formal languages. Natural language is the mode of communication between humans and takes different forms, e.g., spoken, written and symbolic (i.e., signs, gestures). Formal languages, in contrast, are languages derived from natural languages and governed by a specific set of rules. In computer science, formal languages serve as the basis for defining the grammar and syntactical aspects of computer programming languages.
Computer Programming Languages (CPL), the languages needed to write software programs, are formal languages that provide an interface for computer-to-human, computer-to-computer, or human-to-computer interactions. Besides providing this interface, a CPL expresses computing-specific logic according to the language’s rules and syntax.
In computing, programming languages are a central component. These languages are designed to bridge the gap between the machine and the humans who want to perform specific tasks on it. Today, learning to program is an essential skill, as reflected by various international policy documents for pre-college and undergraduate students. Given the importance of computing and computational thinking, the ability to understand computer languages is fundamental. Knowing how to program allows students to get the maximum benefit from computers, as it helps in problem-solving, computational thinking and simulations. Through programming languages, one can direct computers to perform specific tasks. Programming knowledge is the fundamental and central component of computing, computational thinking, and computer science education. Over time, many programming languages have emerged, highlighting varying aspects of programming mechanisms, including low-level, structured, object-oriented and high-level programming languages. However, choosing which language to use is critical, as every language has its own specifications, limitations and affordances.
The existence of so many programming languages raises another critical question: why build another language? Furthermore, what fundamental difference does the new language offer that makes it a viable tool for writing programs compared with the others? The literature suggests that every new language overcomes weaknesses of existing ones or meets new, specific goals. For example, some languages have exceptionally rigid syntactic structures, while others enforce indentation. It becomes a nightmare for new developers, novice programmers and non-CS major students to remember such superficial punctuation and constraints, particularly when they work on relatively large and complex projects. The literature suggests that the right way to learn programming is to shift students’ attention to logic rather than the sentence structure of the code. For example, a Java developer spends energy on brackets and blocks, while a Python coder worries about indentation. Similarly, C and C++ developers must watch for the semi-colon that terminates each instruction. Furthermore, programmers must sometimes consider data types while writing code, which leads to fatal errors if handled incorrectly.
Given these language nuances, present programming languages can be grouped into two broad classes, i.e., firmly bound and approximately bound languages. Firmly bound languages are those in which developers must attend to every detail, such as proper blocks, indentation, semi-colons and different keywords. C, C++ and Java are examples of firmly bound languages. Approximately bound languages place relatively less stress on such structural aspects. However, developers still have to follow some constraints at the sentence-structure level in these languages; examples include PHP and Python.
This study aims to introduce a context-free grammar of a programming language for new developers, novice programmers and non-CS major students. This paper hypothesizes that, through its features, the language “NewBee” will help students learn programming in a more relaxed way. The premise of NewBee is to shift students’ focus from syntax to logic building. Consequently, students will spend their effort on removing logical errors rather than syntactical errors.
The rest of the paper is organized as follows. Section 2 focuses on previous related research on the grammar of different programming languages. Section 3 discusses the approach adopted to design the language, including the steps involved. Section 4 focuses on the grammar of the language. Section 5 lists the limitations of the language. Section 6 gives some sample programs. Finally, Section 7 concludes the paper.
A program is a set of instructions given to a computing machine to perform specific tasks and achieve a predefined target. Programming languages are divided into two types: Domain-Specific Languages (DSLs) and General-Purpose Languages (GPLs). A DSL deals with a specific class of problems that belong to a specific domain, such as databases or web applications. Examples of these languages are HyperText Markup Language (HTML), Cascading Style Sheets (CSS) and Structured Query Language (SQL).
GPLs are used for building software in various application domains. C, C++, C#, Python and Java are examples of General-Purpose Languages.
DSLs can be implemented as an External Domain-Specific Language (EDSL) or an Internal Domain-Specific Language (IDSL). An IDSL is a domain-specific language that is embedded into a general-purpose language and is constrained by the grammar of the host language.
A few studies of domain-specific languages have been carried out in the past. For example, Van Deursen et al. listed the practices for the attainment approach at the aspect level, for semantics only. Similarly, Oliveira and colleagues surveyed domain-specific languages without giving attainment details. Furthermore, the authors of [22,23] focused on the composability aspects of languages and provided details on how they can cover different language workbenches.
Prior research studies have used various strategies to propose new languages. For example, one study presented a new language to evaluate DSL attainment approaches based on the unified state machine. The authors claimed that no single approach is valid for all scenarios. Similarly, prior literature studies [24–27] have used systematic mapping of existing languages. For example, the authors of one survey examined the methods and techniques of existing DSLs. They evaluated the domains and tools used in the creation of DSLs. The authors concluded that external DSLs for various domains were gaining a lot of traction and described them as invaluable and relevant across domains.
Various literature studies [25,26,28] conducted their examination based on systematic mapping and captured the DSL field’s research space and trends over a given period. The authors concluded that DSLs would be the primary programming language for the foreseeable future. Furthermore, one of the unresolved issues in the DSL field is how to make DSL development easier for domain experts. Additionally, the authors examined current approaches to resolve the issue mentioned above through the survey.
Similarly, in the Language Workbench Challenge 2013, studies [29,30] proposed a feature model for language workbenches and categorized them using the model. The authors presented a uniform challenge (i.e., a DSL for surveys) implemented by ten workbenches. The study examined the properties of various workbenches in several ways. It showed that no single language workbench could deliver all the required features. The authors are more concerned with the available features of the workbench than with the methods employed to obtain them.
Motivation: Although these studies used various methods to examine the methods and techniques of existing DSLs, they highlighted unresolved issues. These unresolved issues, together with heavy grammatical rules, can be very confusing for novice programmers, which provides the motivation for this study.
When novice programmers, newcomers to computing majors and non-CS major students start learning logic development using a programming language, they face many issues and difficulties. Many of these issues relate to the intrinsically hard nature of programming. Others are associated with the complexities of logic building, syntax and sentence structure. The most hazardous part of writing code for a complicated program is not always the logic; often it is the syntax or sentence structure. A single missing semi-colon or bracket in complicated code is one of a programmer’s worst nightmares, while in some cases data type parsing issues are a major concern. This study presents a context-free grammar for a programming language for novice programmers, newcomers to CS majors and students with non-CS major backgrounds. This language supports these learners in logic building and provides an easy way to comprehend the language’s syntax.
The programming language, i.e., NewBee, addresses the sentence structure level issues raised by the programmers. It makes the syntax simple and allows novice students to concentrate on the logic. The significant goal is to develop a language for which new students don’t need to struggle to remember or memorize the syntax. Along with ease, a few language constraints are added to the language. The premise of these constraints is rooted in the need and level of novice programmers. This paper used a systematic approach to accomplish such a degree of facilitation. The methodology of this paper handpicked language rules that concentrate on straightforwardness for the programmers. This paper chose these rules from previous languages and combined them into one language. We believe that the language built from straightforward rules and emphasizing less structure will provide the necessary ease to novice programmers for learning and implementation in the NewBee. The methodology of the proposed language is shown in Fig. 1.
For instance, in the proposed language, novice programmers don’t have to end a statement with a semi-colon, as is required in C++ or Java. Similarly, NewBee is not whitespace-sensitive, as Python is. Additionally, the proposed language relieves programmers of the need to use brackets to begin a block. Instead, a block begins with a label, an idea borrowed from low-level computing constructs.
In this article, a new language is proposed to give syntax-level ease to novices, newcomers and non-CS major students. Keeping the syntax close to students’ daily language should make the syntax requirements less bothersome and cause fewer syntax errors. The following section explains the sentence structure of the proposed language and introduces its syntax.
To identify the ideal features, this paper conducted a short survey with a sample of instructors, students and industry developers (N_Instructors = 7, N_Students = 23, N_Developers = 11). The survey results indicate that the following features of NewBee are necessary components for novices, newcomers and students who do not have computer science as their major:
1) Program Execution Sequence
2) Statement Termination
3) Decision-making Structure
4) Loop Structures
5) Functions
6) Type Specifier
7) Operators
8) Comments
9) Reserved Words
Following are the key features of the proposed language.
The program execution sequence in the NewBee language begins with the keyword “Start of Program”, which lets the compiler know that the program has started. In the same way, the program ends with “End of Program”. Anything written after “End of Program” is not executed, so a developer can write notes or explanations there.
Start of Program
<Your program code goes here>
End of Program
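The execution sequence described above can be sketched in Python as follows. This is only an illustration, not the authors’ compiler: the function name extract_program_body is hypothetical, and the sketch assumes that the two markers each sit on their own line, that they are matched case-insensitively, and that text before “Start of Program” is ignored.

```python
def extract_program_body(source):
    """Return the non-empty lines between 'Start of Program' and 'End of Program'.

    Assumption: each marker occupies its own line and is case-insensitive.
    Anything after 'End of Program' is treated as free-form notes and dropped.
    """
    lines = [line.strip() for line in source.splitlines()]
    lowered = [line.casefold() for line in lines]
    try:
        start = lowered.index("start of program")
        end = lowered.index("end of program", start + 1)
    except ValueError:
        raise ValueError(
            "program needs both 'Start of Program' and 'End of Program'"
        ) from None
    return [line for line in lines[start + 1:end] if line]
```

Calling extract_program_body on a program followed by trailing notes returns only the statements between the two markers.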
A statement has no visible terminator; it simply ends at the end of the line.
a = 10
If then structure:
Decision-Making Statement in NewBee will start with the keyword “If” and the block termination will be with “End of If”. The language provides braces-free syntax so there will be no brackets for decision-making structure.
End of If
If then Else structure:
If-Else statements are also block statements and are written in the same way: they start with the keyword “If”, followed by the then-part statements. The else part starts with “Else” and ends at “End of Else”.
If a > b
Display “a is larger”
Else
Display “b is larger”
End of Else
The language provides three different types of looping structures:
i. “from till” loop structure
ii. “while” loop structure
iii. “do while” loop structure
The language provides a “from-till” loop, which starts from the given value and iterates as long as the condition is true. It has two versions: one auto-increments the loop counter by 1, and the other increments it by the value given after the keyword “with”. The structure is terminated with the keyword “End of Loop”.
From i = 0 till i<=5
Display “Hello World”
End of Loop
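The behaviour of the “from-till” loop can be approximated in Python as below. This is a sketch of the semantics as described in the text, not the authors’ implementation: the helper name from_till is hypothetical, and it assumes the condition is checked before each iteration and the counter is incremented after it.

```python
from typing import Callable, Iterator


def from_till(start: int, condition: Callable[[int], bool], step: int = 1) -> Iterator[int]:
    """Yield counter values, starting at `start`, while `condition` holds.

    `step` defaults to 1 (the auto-increment version); passing another
    value mimics the version that uses the keyword "with".
    """
    counter = start
    while condition(counter):
        yield counter
        counter += step


# "From i = 0 till i<=5" runs the body for i = 0, 1, 2, 3, 4, 5.
for i in from_till(0, lambda i: i <= 5):
    print("Hello World")
```

With step=2 the same call would visit 0, 2 and 4, which mirrors writing “with 2” after the condition.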
The language provides the “While” keyword as the start of the loop and the block will terminate with the keyword “End of Loop”.
a = 0
End of Loop
While a < 5
End of Loop
All functions are written at the end of the main program. Each function starts with the keyword “Function” and ends at “End of Function”.
Start of Program
add (10 20)
Function add (a b)
End of Function
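The block structures described so far each close with a dedicated keyword, which makes it straightforward to check that blocks are properly paired. The following Python sketch illustrates one way to do so with a stack. It is an assumption-laden illustration rather than the authors’ parser: the name check_blocks and the CLOSERS table are hypothetical, keywords are assumed to start their line, and the “do while” loop is left out because its exact syntax is not shown clearly in the text.

```python
# Closing keyword expected for each block-opening keyword (assumed subset).
CLOSERS = {
    "if": "end of if",
    "from": "end of loop",
    "while": "end of loop",
    "function": "end of function",
}


def check_blocks(lines):
    """Return None if every block is closed correctly, else an error message."""
    expected = []  # stack of closing keywords still awaited
    for number, raw in enumerate(lines, start=1):
        line = " ".join(raw.split()).casefold()  # normalise spacing and case
        if not line or line in ("start of program", "end of program"):
            continue
        if line == "else":
            if not expected or expected[-1] != "end of if":
                return f"line {number}: 'Else' without an open 'If'"
            expected[-1] = "end of else"  # an If-Else block closes with End of Else
        elif line.startswith("end of "):
            if not expected or expected[-1] != line:
                return f"line {number}: unexpected '{raw.strip()}'"
            expected.pop()
        elif line.split()[0] in CLOSERS:
            expected.append(CLOSERS[line.split()[0]])
    if expected:
        return f"end of input: missing '{expected[-1]}'"
    return None
```

Because each opener has exactly one admissible closer, a mismatch such as a “While” block closed with “End of If” is reported at the offending line.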
This language is free of data type declarations, i.e., there is no need to write any data type keyword to declare an identifier. The data type is determined automatically at the time of assignment: if the value is an integer, the identifier becomes an integer; if it is a float, the identifier becomes a float.
a = 1
b = 2.5
c = abc
d = a
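Type determination at assignment time, as in the four assignments above, could be sketched in Python as follows. The names infer_value and run_assignments are hypothetical, and the sketch makes assumptions the text does not state: a right-hand side that names an existing identifier copies that identifier’s value, and any other non-numeric token is treated as text.

```python
import re

INTEGER = re.compile(r"[+-]?\d+")
FLOAT = re.compile(r"[+-]?\d+\.\d+")


def infer_value(token, variables):
    """Return the value of `token`, with its type inferred from its shape."""
    token = token.strip()
    if token in variables:          # e.g. "d = a" copies the value of a
        return variables[token]
    if INTEGER.fullmatch(token):    # e.g. "1" becomes an integer
        return int(token)
    if FLOAT.fullmatch(token):      # e.g. "2.5" becomes a float
        return float(token)
    return token                    # assumption: anything else is text


def run_assignments(lines):
    """Execute simple 'name = value' lines and return the variable table."""
    variables = {}
    for line in lines:
        name, _, value = line.partition("=")
        variables[name.strip()] = infer_value(value, variables)
    return variables
```

Running run_assignments on the four lines above yields an integer for a and d, a float for b, and a text value for c.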
From i = 0 till i<=5 with x
Display “Hello World”
End of Loop
Operators being used in NewBee can be seen in Table 1.
The language supports single and multi-line comments.
Single Line Comments:
Single line comments are written with the double slash (//) sign.
//this is a single line comment
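Removing a single-line comment before a statement is processed might look like the Python sketch below. The function name strip_line_comment is hypothetical. The text does not say whether a comment may follow a statement on the same line, so the sketch assumes it may, and it also assumes that a double slash inside a quoted string is not a comment.

```python
def strip_line_comment(line):
    """Return `line` without a trailing // comment (assumed behaviour)."""
    in_string = False
    for index, char in enumerate(line):
        if char in '“"' and not in_string:
            in_string = True            # opening quote (curly or straight)
        elif char in '”"' and in_string:
            in_string = False           # closing quote
        elif not in_string and line.startswith("//", index):
            return line[:index].rstrip()
    return line.rstrip()
```

A line that consists only of a comment is reduced to an empty string, which a later stage can simply skip.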
A multiline comment starts with a single forward slash (/) and ends with a single forward slash (/).
/ this is a multiline comment; it has to
start and end with a forward slash/
Table 2 provides the list of reserved words/strings. All reserved words are case-insensitive, so that students need not worry about the case of keywords and do not face such errors. Owing to page limitations, the CFG does not reflect this case-insensitivity for all words; only “If” is shown as case-insensitive.
The grammar of the language is given below. Words written in bold capitals are terminals; the others are non-terminals.
To keep the language simple and easy to understand for newbies, a few features are restricted, as detailed below:
• It is a procedural language and doesn’t support Object-Oriented Programming (OOP) features like Inheritance, Encapsulation, and Polymorphism
• Only letters (lower case and upper case) and “_” are allowed in identifiers
• An identifier must not have the same name as a keyword (the compiler can enforce this restriction)
• Passing a function call as a function argument is not allowed
• Every building block, such as loops and if-else structures, starts and ends with keywords that are close to common sense and English
• Language does not support Arrays and Switch statements
• Language does not support any external libraries.
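The two identifier restrictions in the list above can be expressed compactly. The Python sketch below is illustrative only: the name is_valid_identifier is hypothetical, and KNOWN_KEYWORDS is a partial set collected from the single-word keywords that appear in this paper’s examples, not the full reserved-word list of Table 2.

```python
import re

# Letters (either case) and underscore only; digits are not allowed.
IDENTIFIER = re.compile(r"[A-Za-z_]+")

# Partial, assumed keyword set taken from the examples in the text.
KNOWN_KEYWORDS = {
    "if", "else", "from", "till", "with",
    "while", "function", "display", "and",
}


def is_valid_identifier(name):
    """Check the character rule and reject (case-insensitive) keywords."""
    if not IDENTIFIER.fullmatch(name):
        return False
    return name.casefold() not in KNOWN_KEYWORDS
```

Under these assumptions, names such as flag or loop_counter are accepted, while value1 (contains a digit) and While (a keyword in any case) are rejected.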
Some sample programs are given below:
Start of program
Display “Enter value:”
Display “enter the value:”
Display “Result”: ”
End of program
Enter value: 5
Enter the value: 5
Start of program
fact = 1
Display “Enter Number: ”
From a=1 till a<=num
End of Loop
Display “Factorial of Given Number is =”
End of program
Enter Number: 3
The Factorial of the Given Number is = 6
Start of program
Display “Enter a positive integer: ”
flag = 0
From i = 2 till i < n And flag = 0 with 1
If n%i = 0
flag = 1
End of If
End of Loop
If n = 1
Display “1 is neither prime nor composite.”
If flag = 0
Display “given number is a prime.”
Display “given number is composite.”
End of If
End of program
Enter a positive integer: 7
the given number is a prime.
This study has presented a grammar for NewBee, a language for beginners, novice programmers and non-CS major students. The language proposes a simple sentence structure and syntax close to English to make it easy for beginners and novice programmers to comprehend. The language was proposed using a systematic approach comprising feature identification, a context-free grammar and sample programs. Additionally, this paper conducted a short survey to examine the satisfaction of experts. In this survey, 17 experts (with a minimum of 5 years of industrial experience or teaching experience in multiple languages) were given forms containing programs that illustrate the syntax, and their satisfaction was recorded on a 10-point Likert scale (1 to 10), where one indicated the lowest satisfaction and ten the highest. The average satisfaction was 93% among the different users of the language (keeping in mind that this language is for novices, newcomers and students who do not have computer science as their major).
This study can be extended further by incorporating features from more languages, both new and old, to make programming even simpler for programmers.
Acknowledgement: The authors would like to express their most profound gratitude to Mr Rana Muhammad Ijaz and Mr Sikandar Hayyat for their valuable time and efforts in helping us.
Funding Statement: This material is based upon the work supported by the startup fund provided to Dr. Saira Anwar by Texas A&M University, College Station, USA. Any opinions, findings, conclusion, or recommendations expressed in this material do not necessarily reflect those of Texas A&M University.
Conflicts of Interest: The authors declare that they have no conflicts of interest to report regarding the present study.
- A. L. Guzman, “What is human-machine communication, anyway,” in Human-machine Communication Rethinking: Communication Technology and Ourselves, Peter Lang, Book, New York, USA, vol. 1, pp. 1–28, 2018.
- M. Soeken, T. Haener and M. Roetteler, “Programming quantum computers using design automation,” in 2018 Design, Automation & Test in Europe Conf. & Exhibition (DATE), Dresden, Germany, pp. 137–146, 2018.
- D. Johnson and M. Ketel, “IoT: Application protocols and security,” International Journal of Computer Network & Information Security, vol. 11, no. 4, pp. 1–8, 2019.
- K. Vinall and E. A. Hellmich, “Down the rabbit hole: Machine translation, metaphor and instructor identity and agency,” Second Language Research & Practice, vol. 2, no. 1, pp. 99–118, 2021.
- N. G. S. S. L. States, “Next generation science standards: For states, by states,” Washington, DC, USA, Book, 2013.
- S. Olson, “Grand Challenges for Engineering: Imperatives, Prospects and Priorities: Summary of a Forum,” National Academies Press, Washington, DC, USA, 201
- A. Juškevičiene, G. Stupuriene and T. Jevsikova, “Computational thinking development through physical computing activities in STEAM education,” Computer Applications in Engineering Education, vol. 29, no. 1, pp. 175–190, 2021.
- D. Proctor, “The social production of internet space: Affordance, programming and virtuality,” Communication Theory, vol. 31, no. 4, pp. 593–612, 2021.
- R. Pereira, M. Couto, F. Ribeiro, R. Rua, J. Cunha et al., “Ranking programming languages by energy efficiency,” Science of Computer Programming, vol. 205, pp. 102609–102639, 2021.
- O. Grljević and Z. Bošnjak, “Sentiment analysis of customer data,” Strategic Management, vol. 23, no. 3, pp. 38–49, 2018.
- F. Del Bonifro, M. Gabbrielli, A. Lategano and S. Zacchiroli, “Image-based many-language programming language identification,” PeerJ Computer Science, vol. 7, pp. e631–655, 2021.
- A. M. Abubakar and A. A. Mustapha, “Newton’s method cubic equation of state C++ source code for iterative volume computation,” International Journal of Recent Engineering Science, vol. 8, no. 3, pp. 12–22, 2021.
- J. -S. Lee, Y. -W. Su and C. -C. Shen, “A comparative study of wireless protocols: Bluetooth, UWB, ZigBee and Wi-Fi,” in IECON 2007–33rd Annual Conf. of the IEEE Industrial Electronics Society, Taipei, Taiwan, pp. 46–51, 2007.
- J. Peterson, “Speaking ability progress of language learners in online and face-to-face courses,” Foreign Language Annals, vol. 54, no. 1, pp. 27–49, 2021.
- S. G. Kochan, “Programming in C Third Edition,” Book, Developer's Library, Indianapolis, Indiana, 2021.
- X. Chen, D. Song and Y. Tian, “Latent execution for neural program synthesis beyond domain-specific languages,” Advances in Neural Information Processing Systems, vol. 34, pp. 1–13, 2021.
- D. Pollak, V. Layka and A. Sacco, “DSL and Parser Combinator,” in Beginning Scala 3, Berkeley, California: Springer, pp. 237–245, 2022.
- S. Höppner, T. Kehrer and M. Tichy, “Contrasting dedicated model transformation languages versus general purpose languages: A historical perspective on ATL versus java based on complexity and size,” Software and Systems Modelling, vol. 21, pp. 1–33, 2021.
- K. Faldu, A. Sheth, P. Kikani and H. Akbari, “KI-BERT: Infusing knowledge context for better language and domain understanding,” arXiv Prepr. arXiv2104.08145, vol. 2, pp. 1–10, 2021.
- R. Liu, M. Gao, S. Ye and J. Zhang, “IGScript: An interaction grammar for scientific data presentation,” in Proc. of the 2021 CHI Conf. on Human Factors in Computing Systems, Yokohama, Japan, pp. 1–13, 2021.
- A. Van Deursen, P. Klint and J. Visser, “Domain-specific languages: An annotated bibliography,” ACM Sigplan Notices, vol. 35, no. 6, pp. 26–36, 2000.
- S. Erdweg, P. G. Giarrusso and T. Rendel, “Language composition untangled,” in Proc. of the Twelfth Workshop on Language Descriptions, Tools and Applications, New York, USA, pp. 1–8, 2012.
- N. Vasudevan and L. Tratt, “Comparative study of DSL tools,” Electronic Notes Theoretical Computer Science, vol. 264, no. 5, pp. 103–121, 2011.
- L. M. do Nascimento, D. L. Viana, P. A. S. Neto, D. A. Martins, V. C. Garcia et al., “A systematic mapping study on domain-specific languages,” in the Seventh Int. Conf. on Software Engineering Advances (ICSEA 2012), Lisbon, Portugal, pp. 179–187, 2012.
- M. Mernik, “Domain-specific languages: A systematic mapping study,” in Int. Conf. on Current Trends in Theory and Practice of Informatics, Limassol, Cyprus, pp. 464–472, 2017.
- T. Kosar, S. Bohra and M. Mernik, “Domain-specific languages: A systematic mapping study,” Information and Software Technology, vol. 71, pp. 77–91, 2016.
- J. Tanha, Y. Abdi, N. Samadi, N. Razzaghi and M. Asadpour, “Boosting methods for multi-class imbalanced data classification: An experimental review,” Journal of Big Data, vol. 7, no. 1, pp. 1–47, 2020.
- M. Mernik, J. Heering and A. M. Sloane, “When and how to develop domain-specific languages,” ACM Computing Surveys, vol. 37, no. 4, pp. 316–344, 2005.
- S. Erdweg, T. V. D. Storm, M. Volter, R. Bosman, W. R. Cook et al., “The state of the art in language workbenches,” in Int. Conf. on Software Language Engineering, Indianapolis, USA, pp. 197–217, 2013.
- S. Erdweg, T. V. D. Storm, M. Volter, R. Bosman, W. R. Cook et al., “Evaluating and comparing language workbenches: Existing results and benchmarks for the future,” Computer Languages, Systems & Structure, vol. 44, pp. 24–47, 2015.
- P. N. Johnson-Laird, M. Bucciarelli, R. Mackiewicz and S. S. Khemlani, “Recursion in programs, thought, and language,” Psychonomic Bulletin & Review, vol. 29, pp. 430–454, 2022.
- S. Olson, “Grand Challenges for Engineering,” Washington, D.C.: National Academies Press, Book, 2016.
- J. Hartmann, J. Huppertz, C. Schamp and M. Heitmann, “Comparing automated text classification methods,” International Journal of Research in Marketing, vol. 36, no. 1, pp. 20–38, 2019.
- H. M. Gualandi, “The Pallene Programming Language,” Ph. D. Dissertation. Pontifícia Universidade Católica do Rio de Janeiro, 2020.