Open Access

ARTICLE

Improving Performance Prediction on Education Data with Noise and Class Imbalance

Akram M. Radwan (a, b), Zehra Cataltepe (a, c)

a Computer Engineering Department, Istanbul Technical University, Istanbul, Turkey;
b Department of Information Technology, University College of Applied Sciences, Gaza, Palestine;
c tazi.io Machine Learning Solutions, Istanbul, Turkey

* Corresponding Author: Akram M. Radwan

Intelligent Automation & Soft Computing 2018, 24(4), 777-783. https://doi.org/10.1080/10798587.2017.1337673

Abstract

This paper applies machine learning techniques to predict students’ performance on two real-world educational data-sets. The first data-set is used to predict the response of students with autism while they learn a specific task, whereas the second is used to predict students’ failure at a secondary school. Both data-sets suffer from two major problems that can impair the ability of classification models to predict the correct label: class imbalance and class noise. A series of experiments has been carried out to improve the quality of the training data, and hence the prediction results. We propose two noise filter methods that eliminate noisy majority-class instances located inside the borderline area. Our methods combine the SMOTE over-sampling technique with a thresholding technique to balance the training data and choose the best boundary between classes. We then apply a noise detection approach to identify the noisy instances. We use the two data-sets to assess the efficacy of class-imbalance approaches as well as of both proposed methods. Results for different classifiers show that AUC scores improved significantly when the two proposed methods were combined with existing class-imbalance techniques.
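
The pipeline outlined in the abstract (over-sample the minority class, choose a decision threshold, then look for suspicious majority-class instances near the boundary) can be illustrated with a small, dependency-free sketch. It is not the authors' implementation: the abstract does not say how the threshold is selected or how noise is detected, so the geometric-mean criterion in `best_threshold`, the `margin` rule in `borderline_majority`, and all function and parameter names are assumptions made purely for illustration. Only the interpolation step follows the general SMOTE idea.

```python
import math
import random


def smote(minority, n_new, k=3, seed=0):
    """Create n_new synthetic minority points by interpolating between a
    minority point and one of its k nearest minority neighbours."""
    rng = random.Random(seed)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # k nearest minority neighbours of `base`, excluding `base` itself
        neighbours = sorted(
            (p for p in minority if p is not base),
            key=lambda p: math.dist(base, p),
        )[:k]
        other = rng.choice(neighbours)
        gap = rng.random()  # position along the segment base -> other
        synthetic.append(tuple(b + gap * (o - b) for b, o in zip(base, other)))
    return synthetic


def best_threshold(scores, labels):
    """Pick the score cut-off that maximises the geometric mean of
    sensitivity and specificity (an assumed selection criterion).
    Assumes both classes are present; label 1 is the minority class."""
    pos = sum(labels)
    neg = len(labels) - pos
    best_t, best_g = None, -1.0
    for t in sorted(set(scores)):
        tp = sum(1 for s, y in zip(scores, labels) if s >= t and y == 1)
        tn = sum(1 for s, y in zip(scores, labels) if s < t and y == 0)
        g = math.sqrt((tp / pos) * (tn / neg))
        if g > best_g:
            best_t, best_g = t, g
    return best_t, best_g


def borderline_majority(scores, labels, threshold, margin):
    """Indices of majority (label 0) instances whose score lies within
    `margin` of the threshold: candidates for a noise filter to inspect
    (an assumed stand-in for the borderline area)."""
    return [
        i
        for i, (s, y) in enumerate(zip(scores, labels))
        if y == 0 and abs(s - threshold) <= margin
    ]
```

In use, `scores` would be the predicted minority-class probabilities of any classifier trained on the balanced data. The candidates returned by `borderline_majority` would then be passed to a separate noise detection step rather than removed outright.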

Cite This Article

APA Style
Radwan, A. M., & Cataltepe, Z. (2018). Improving performance prediction on education data with noise and class imbalance. Intelligent Automation & Soft Computing, 24(4), 777-783. https://doi.org/10.1080/10798587.2017.1337673
Vancouver Style
Radwan AM, Cataltepe Z. Improving performance prediction on education data with noise and class imbalance. Intell Automat Soft Comput. 2018;24(4):777-783. https://doi.org/10.1080/10798587.2017.1337673
IEEE Style
A.M. Radwan and Z. Cataltepe, "Improving Performance Prediction on Education Data with Noise and Class Imbalance," Intell. Automat. Soft Comput., vol. 24, no. 4, pp. 777-783, 2018. https://doi.org/10.1080/10798587.2017.1337673



This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.