Open Access

ARTICLE

RE-SMOTE: A Novel Imbalanced Sampling Method Based on SMOTE with Radius Estimation

by Dazhi E¹, Jiale Liu², Ming Zhang¹,*, Huiyuan Jiang², Keming Mao²

¹ Shenyang Fire Science and Technology Research Institute, Ministry of Emergency Management of the People’s Republic of China, Shenyang, 110034, China
² College of Software, Northeastern University, Shenyang, 110006, China

* Corresponding Author: Ming Zhang.

Computers, Materials & Continua 2024, 81(3), 3853-3880. https://doi.org/10.32604/cmc.2024.057538

Abstract

Class imbalance is a distinctive feature of many datasets, and how to balance such datasets has become a hot topic in machine learning. The Synthetic Minority Oversampling Technique (SMOTE) is a classical method for this problem. Although SMOTE has been studied extensively, the problem of synthetic sample singularity remains. To address both class imbalance and the limited diversity of generated samples, this paper proposes radius estimation-SMOTE (RE-SMOTE), a hybrid resampling method for binary imbalanced datasets that improves on two oversampling methods: parameter-free SMOTE (PF-SMOTE) and SMOTE-Weighted Edited Nearest Neighbor (SMOTE-WENN). First, minority class samples are divided into safe and boundary categories. Boundary minority samples are regenerated through linear interpolation with their nearest majority class samples. In contrast, new samples for safe minority points are generated randomly within a circular range centered on each original safe minority sample, with a radius determined by the distance to its nearest majority class sample. The generated samples are then cleaned with Weighted Edited Nearest Neighbor (WENN) and a relative density method, which remove low-quality samples. Relative density is calculated as the ratio of majority to minority samples among a sample's reverse k-nearest neighbors. To verify the effectiveness and robustness of the proposed method, we conducted a comprehensive experimental study on 40 datasets selected from real applications. The experimental results show that RE-SMOTE outperforms other state-of-the-art methods. Code is available at: (accessed on 30 September 2024).
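The two generation rules described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the `max_gap` cap on the interpolation factor, and the choice to sample uniformly inside the ball are all assumptions made for illustration, and the safe/boundary split and the WENN and relative density cleaning steps are omitted.

```python
import numpy as np

def nearest_majority(x, X_maj):
    """Return the majority sample closest to x and its Euclidean distance."""
    dists = np.linalg.norm(X_maj - x, axis=1)
    i = int(np.argmin(dists))
    return X_maj[i], float(dists[i])

def synth_boundary(x, X_maj, rng, max_gap=0.5):
    """Boundary rule: interpolate linearly between x and its nearest majority sample.

    max_gap is a hypothetical cap (not from the paper) that keeps the new
    point on the minority side of the segment.
    """
    nearest, _ = nearest_majority(x, X_maj)
    gap = rng.uniform(0.0, max_gap)
    return x + gap * (nearest - x)

def synth_safe(x, X_maj, rng):
    """Safe rule: draw a point inside the ball centered on x whose radius is
    the distance from x to its nearest majority sample.

    Uniform sampling inside the ball is an assumption; the abstract only
    says the sample is generated randomly within that range.
    """
    _, radius = nearest_majority(x, X_maj)
    dim = x.shape[0]
    direction = rng.normal(size=dim)
    direction /= np.linalg.norm(direction)          # random unit direction
    r = radius * rng.uniform() ** (1.0 / dim)       # radius scaled for uniform density
    return x + r * direction

# Toy data: one minority point and two majority points in 2-D.
rng = np.random.default_rng(0)
X_maj = np.array([[3.0, 0.0], [0.0, 4.0]])
x_min = np.array([0.0, 0.0])

new_safe = synth_safe(x_min, X_maj, rng)
new_boundary = synth_boundary(x_min, X_maj, rng)
```

In this toy setup the nearest majority sample is at distance 3, so `new_safe` falls within that radius of `x_min`, while `new_boundary` lies on the segment from `x_min` toward `[3.0, 0.0]`.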

Copyright © 2024 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.