An Enhanced Automatic Arabic Essay Scoring System Based on Machine Learning Algorithms

Nourmeen Lotfy; Abdulaziz Shehab; Mohammed Elhoseny; Ahmed Abu-Elfetouh

doi:10.32604/cmc.2023.039185

Open Access

ARTICLE

Nourmeen Lotfy¹, Abdulaziz Shehab¹,²,*, Mohammed Elhoseny¹,³, Ahmed Abu-Elfetouh¹

1 Department of Information Systems, Faculty of Computers and Information Science, Mansoura University, Mansoura, 35516, Egypt
2 Department of Information Systems, College of Computer and Information Sciences, Jouf University, Sakaka, Saudi Arabia
3 College of Computing and Informatics, University of Sharjah, Sharjah, United Arab Emirates

* Corresponding Author: Abdulaziz Shehab.

(This article belongs to the Special Issue: Cognitive Computing and Systems in Education and Research)

Computers, Materials & Continua 2023, 77(1), 1227-1249. https://doi.org/10.32604/cmc.2023.039185

Received 13 January 2023; Accepted 07 June 2023; Issue published 31 October 2023

Abstract

Despite extensive efforts to improve intelligent educational tools for smart learning environments, automatic Arabic essay scoring remains a major research challenge, and the writing style of the Arabic language complicates the problem further. This study designs, implements, and evaluates an automatic Arabic essay scoring system. The proposed system first pre-processes the dataset of student answers and model answers using data cleaning and natural language processing tasks. It then comprises two main components: the grading engine and the adaptive fusion engine. The grading engine applies string-based and corpus-based similarity algorithms separately. The adaptive fusion engine then prepares the students’ scores for delivery to different feature selection algorithms, such as Recursive Feature Elimination and Boruta. Machine learning algorithms such as Decision Tree, Random Forest, AdaBoost, Lasso, Bagging, and K-Nearest Neighbor are then employed to improve the proposed system’s efficiency. In the grading engine, the experimental results showed that the Extracting DIStributionally similar words using CO-occurrences (DISCO) similarity measure achieved the best correlation values. In the adaptive fusion engine, the Random Forest algorithm outperformed all other machine learning algorithms when the original dataset was split 80%–20%. It achieved 91.30%, 94.20%, 0.023, 0.106, and 0.153 for Pearson’s Correlation Coefficient, Willmott’s Index of Agreement, Mean Square Error, Mean Absolute Error, and Root Mean Square Error, respectively.
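The five evaluation metrics named in the abstract can be illustrated with a minimal sketch. This is not the authors' code: the function names and the sample scores below are hypothetical, and the formulas are the standard textbook definitions (Willmott's index is taken in its original form, d = 1 − Σ(o − p)² / Σ(|p − ō| + |o − ō|)², where o are the human scores, p the predicted scores, and ō the mean human score).

```python
import math

def pearson(obs, pred):
    """Pearson's correlation coefficient between human and predicted scores."""
    n = len(obs)
    mean_o = sum(obs) / n
    mean_p = sum(pred) / n
    cov = sum((o - mean_o) * (p - mean_p) for o, p in zip(obs, pred))
    std_o = math.sqrt(sum((o - mean_o) ** 2 for o in obs))
    std_p = math.sqrt(sum((p - mean_p) ** 2 for p in pred))
    return cov / (std_o * std_p)

def willmott_index(obs, pred):
    """Willmott's index of agreement (original form); 1.0 means perfect agreement."""
    mean_o = sum(obs) / len(obs)
    num = sum((o - p) ** 2 for o, p in zip(obs, pred))
    den = sum((abs(p - mean_o) + abs(o - mean_o)) ** 2 for o, p in zip(obs, pred))
    return 1.0 - num / den

def error_metrics(obs, pred):
    """Mean square error, mean absolute error, and root mean square error."""
    n = len(obs)
    mse = sum((o - p) ** 2 for o, p in zip(obs, pred)) / n
    mae = sum(abs(o - p) for o, p in zip(obs, pred)) / n
    return {"mse": mse, "mae": mae, "rmse": math.sqrt(mse)}

if __name__ == "__main__":
    # Hypothetical scores for illustration only; not data from the study.
    human_scores = [1.0, 2.0, 3.0]
    system_scores = [2.0, 3.0, 4.0]
    print(pearson(human_scores, system_scores))
    print(willmott_index(human_scores, system_scores))
    print(error_metrics(human_scores, system_scores))
```

Because RMSE is the square root of MSE, the two values reported in the abstract can be checked against each other: the square root of 0.023 is about 0.152, which agrees with the reported 0.153 once rounding of the MSE is taken into account.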

Keywords

Arabic; corpus-based similarity; correlation; machine learning; string-based similarity; text similarity

Cite This Article

APA Style

Lotfy, N., Shehab, A., Elhoseny, M., & Abu-Elfetouh, A. (2023). An enhanced automatic Arabic essay scoring system based on machine learning algorithms. Computers, Materials & Continua, 77(1), 1227-1249. https://doi.org/10.32604/cmc.2023.039185

Vancouver Style

Lotfy N, Shehab A, Elhoseny M, Abu-Elfetouh A. An enhanced automatic Arabic essay scoring system based on machine learning algorithms. Comput Mater Contin. 2023;77(1):1227-1249. https://doi.org/10.32604/cmc.2023.039185

IEEE Style

N. Lotfy, A. Shehab, M. Elhoseny, and A. Abu-Elfetouh, "An Enhanced Automatic Arabic Essay Scoring System Based on Machine Learning Algorithms," Comput. Mater. Contin., vol. 77, no. 1, pp. 1227-1249, 2023. https://doi.org/10.32604/cmc.2023.039185

This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
