Neighbor Displacement-Based Enhanced Synthetic Oversampling for Multiclass Imbalanced Data

doi:10.32604/cmc.2025.063465

Open Access

ARTICLE

I Made Putrama^1,2,*, Péter Martinek^1

1 Department of Electronics Technology, Faculty of Electrical Engineering and Informatics, Budapest University of Technology and Economics, Budapest, 1111, Hungary
2 Department of Informatics, Faculty of Engineering and Vocational, Universitas Pendidikan Ganesha, Singaraja, 81116, Indonesia

* Corresponding Author: I Made Putrama.

Computers, Materials & Continua 2025, 83(3), 5699-5727. https://doi.org/10.32604/cmc.2025.063465

Received 15 January 2025; Accepted 09 April 2025; Issue published 19 May 2025

Abstract

Imbalanced multiclass datasets pose challenges for machine learning algorithms. They often contain minority classes that are important for accurate predictions. However, when the data is sparsely distributed and overlaps with data points from other classes, it introduces noise. As a result, existing resampling methods may fail to preserve the original data patterns, further degrading data quality and reducing model performance. This paper introduces Neighbor Displacement-based Enhanced Synthetic Oversampling (NDESO), a hybrid method that integrates a data displacement strategy with a resampling technique to achieve data balance. It first computes the average distance of noisy data points to their neighbors and adjusts their positions toward the center, and then applies random oversampling. Extensive evaluations compare 14 alternatives on nine classifiers across synthetic and 20 real-world datasets with varying imbalance ratios. The evaluation is structured into two test groups. The first examines the effects of k-neighbor variations and distance metrics, then compares the resampled data distributions against those of the alternatives, and finally determines the most suitable oversampling technique for data balancing. The second assesses the overall performance of the NDESO algorithm, focusing on G-mean and statistical significance. The results demonstrate that the method is robust to a wide range of variations in these parameters and achieves an average G-mean score of 0.90, which is among the highest. Additionally, it attains the lowest mean rank of 2.88, indicating statistically significant improvements over existing approaches. This advantage underscores its potential for effectively handling data imbalance in practical scenarios.
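
The two stages described in the abstract (displacing noisy points, then random oversampling) can be illustrated with a minimal NumPy sketch. It is not the authors' implementation: the abstract does not define "noisy", "the center", or how the average neighbor distance is used, so the sketch assumes that a point is noisy when more than half of its k nearest neighbors carry a different label, that the center is the centroid of the point's own class, and that the point moves toward that centroid by its average neighbor distance. The function names `displace_noisy_points` and `random_oversample` are hypothetical.

```python
import numpy as np

def displace_noisy_points(X, y, k=3):
    """Move assumed-noisy points toward their own class centroid."""
    X = np.asarray(X, dtype=float)
    y = np.asarray(y)
    k = min(k, len(X) - 1)
    if k < 1:
        return X.copy()
    # Pairwise Euclidean distances; the diagonal is set to infinity
    # so that a point is never counted as its own neighbor.
    dist = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    np.fill_diagonal(dist, np.inf)
    nn = np.argsort(dist, axis=1)[:, :k]
    centroids = {c: X[y == c].mean(axis=0) for c in np.unique(y)}
    X_new = X.copy()
    for i in range(len(X)):
        # Assumption: "noisy" means most neighbors belong to other classes.
        if np.mean(y[nn[i]] != y[i]) <= 0.5:
            continue
        avg_dist = dist[i, nn[i]].mean()
        direction = centroids[y[i]] - X[i]
        gap = np.linalg.norm(direction)
        if gap > 0:
            # Assumption: step length is the average neighbor distance,
            # capped so the point never overshoots the centroid.
            X_new[i] = X[i] + direction / gap * min(avg_dist, gap)
    return X_new

def random_oversample(X, y, seed=0):
    """Duplicate random samples until every class matches the largest one."""
    rng = np.random.default_rng(seed)
    classes, counts = np.unique(y, return_counts=True)
    target = counts.max()
    X_parts, y_parts = [X], [y]
    for c, n in zip(classes, counts):
        if n < target:
            idx = rng.choice(np.flatnonzero(y == c), size=target - n, replace=True)
            X_parts.append(X[idx])
            y_parts.append(y[idx])
    return np.vstack(X_parts), np.concatenate(y_parts)

# Toy data: six majority points (class 0), three clustered minority
# points (class 1), and one minority point lying inside the class 0 region.
X = np.array([[0, 0], [0, 1], [1, 0], [1, 1], [0.5, 0.5], [0.2, 0.8],
              [5, 5], [5, 6], [6, 5], [0.5, 0.4]], dtype=float)
y = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1])

X_moved = displace_noisy_points(X, y, k=3)
X_bal, y_bal = random_oversample(X_moved, y, seed=0)
```

In this toy example only the last point satisfies the assumed noise rule, so it is the only one displaced; oversampling then duplicates minority samples until both classes have the same size. The brute-force distance matrix keeps the sketch short but scales quadratically with the number of samples, so a neighbor index would be preferable on larger data.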

Keywords

Neighbor; displacement; synthetic; oversampling; multiclass; imbalanced data

Cite This Article

APA Style

Putrama, I.M., Martinek, P. (2025). Neighbor Displacement-Based Enhanced Synthetic Oversampling for Multiclass Imbalanced Data. Computers, Materials & Continua, 83(3), 5699–5727. https://doi.org/10.32604/cmc.2025.063465

Vancouver Style

Putrama IM, Martinek P. Neighbor Displacement-Based Enhanced Synthetic Oversampling for Multiclass Imbalanced Data. Comput Mater Contin. 2025;83(3):5699–5727. https://doi.org/10.32604/cmc.2025.063465

IEEE Style

I. M. Putrama and P. Martinek, “Neighbor Displacement-Based Enhanced Synthetic Oversampling for Multiclass Imbalanced Data,” Comput. Mater. Contin., vol. 83, no. 3, pp. 5699–5727, 2025. https://doi.org/10.32604/cmc.2025.063465

Copyright © 2025 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
