Open Access

ARTICLE

Neighbor Displacement-Based Enhanced Synthetic Oversampling for Multiclass Imbalanced Data

I Made Putrama1,2,*, Péter Martinek1

1 Department of Electronics Technology, Faculty of Electrical Engineering and Informatics, Budapest University of Technology and Economics, Budapest, 1111, Hungary
2 Department of Informatics, Faculty of Engineering and Vocational, Universitas Pendidikan Ganesha, Singaraja, 81116, Indonesia

* Corresponding Author: I Made Putrama. Email: email

Computers, Materials & Continua 2025, 83(3), 5699-5727. https://doi.org/10.32604/cmc.2025.063465

Abstract

Imbalanced multiclass datasets pose challenges for machine learning algorithms. They often contain minority classes that are important for accurate predictions. However, when these minority samples are sparsely distributed and overlap with data points from other classes, they introduce noise. As a result, existing resampling methods may fail to preserve the original data patterns, further degrading data quality and reducing model performance. This paper introduces Neighbor Displacement-based Enhanced Synthetic Oversampling (NDESO), a hybrid method that integrates a data displacement strategy with a resampling technique to achieve data balance. It first computes the average distance of noisy data points to their neighbors and shifts those points toward the center, then applies random oversampling. Extensive evaluations compare 14 alternatives using nine classifiers on synthetic data and 20 real-world datasets with varying imbalance ratios. The evaluation is structured into two test groups. The first examines the effects of k-neighbor variations and distance metrics, compares the resampled data distributions against those of the alternatives, and determines the most suitable oversampling technique for data balancing. The second assesses the overall performance of the NDESO algorithm, focusing on G-mean and statistical significance. The results show that the method is robust to a wide range of variations in these parameters and achieves an average G-mean score of 0.90, which is among the highest. It also attains the lowest mean rank of 2.88, indicating statistically significant improvements over existing approaches. These results underscore its potential for handling data imbalance effectively in practical scenarios.
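
The displace-then-oversample idea described in the abstract can be illustrated with a minimal Python sketch. This is not the authors' NDESO implementation: the abstract does not define how noisy points are detected, which center is meant, or how far points are moved. The sketch therefore assumes that a point is noisy when most of its k nearest neighbors carry a different label, that the center is the centroid of the point's own class, and that the step length is the mean distance to the k neighbors (capped at the distance to the centroid). The function name `displace_and_oversample` and its parameters are hypothetical.

```python
import math
import random
from collections import Counter


def displace_and_oversample(X, y, k=5, seed=0):
    """Illustrative sketch only; NOT the published NDESO algorithm."""
    rng = random.Random(seed)
    n = len(X)
    k = min(k, n - 1)

    # ASSUMPTION: "center" means the centroid of the point's own class.
    centroids = {}
    for label in set(y):
        members = [X[i] for i in range(n) if y[i] == label]
        centroids[label] = [sum(col) / len(members) for col in zip(*members)]

    X_new = [list(p) for p in X]  # work on copies; the input is not mutated
    for i in range(n):
        # k nearest neighbors of point i (Euclidean), excluding the point itself.
        neighbors = sorted(
            (math.dist(X[i], X[j]), j) for j in range(n) if j != i
        )[:k]
        if not neighbors:
            continue
        # ASSUMPTION: a point is "noisy" if most neighbors have another label.
        foreign = sum(1 for _, j in neighbors if y[j] != y[i])
        if foreign * 2 <= len(neighbors):
            continue
        center = centroids[y[i]]
        gap = math.dist(X[i], center)
        if gap == 0:
            continue
        # ASSUMPTION: step = mean neighbor distance, capped at the centroid gap.
        step = min(sum(d for d, _ in neighbors) / len(neighbors), gap)
        X_new[i] = [a + (c - a) / gap * step for a, c in zip(X[i], center)]

    # Random oversampling: duplicate randomly chosen samples of each smaller
    # class until every class is as large as the largest one.
    counts = Counter(y)
    target = max(counts.values())
    X_out, y_out = X_new[:], list(y)
    for label, cnt in counts.items():
        pool = [i for i in range(n) if y[i] == label]
        for _ in range(target - cnt):
            X_out.append(list(X_new[rng.choice(pool)]))
            y_out.append(label)
    return X_out, y_out
```

Calling `displace_and_oversample(X, y, k=5)` on a list of feature vectors `X` and labels `y` returns the displaced points followed by the duplicated minority samples, so the first `len(X)` rows keep the input order. Displacing before duplicating means the copies are drawn from the adjusted positions rather than from the overlapping ones.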

Keywords

Neighbor; displacement; synthetic; oversampling; multiclass; imbalanced data

Cite This Article

APA Style
Putrama, I.M., Martinek, P. (2025). Neighbor Displacement-Based Enhanced Synthetic Oversampling for Multiclass Imbalanced Data. Computers, Materials & Continua, 83(3), 5699–5727. https://doi.org/10.32604/cmc.2025.063465
Vancouver Style
Putrama IM, Martinek P. Neighbor Displacement-Based Enhanced Synthetic Oversampling for Multiclass Imbalanced Data. Comput Mater Contin. 2025;83(3):5699–5727. https://doi.org/10.32604/cmc.2025.063465
IEEE Style
I. M. Putrama and P. Martinek, “Neighbor Displacement-Based Enhanced Synthetic Oversampling for Multiclass Imbalanced Data,” Comput. Mater. Contin., vol. 83, no. 3, pp. 5699–5727, 2025. https://doi.org/10.32604/cmc.2025.063465



Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.