Open Access
ARTICLE
Enhancing Employee Turnover Prediction: An Advanced Feature Engineering Analysis with CatBoost
1 Department of Artificial Intelligence Convergence, Chonnam National University, Gwangju, 61186, Republic of Korea
2 Department of Data Science, Gazipur Digital University, Gazipur, 1750, Bangladesh
3 Department of Computer Science and Engineering, Gopalganj Science and Technology University, Gopalganj, 8100, Bangladesh
* Corresponding Authors: Kwanghoon Choi. Email: ; Kyungbaek Kim. Email:
# These authors contributed equally to this work
Computer Systems Science and Engineering 2025, 49, 455-479. https://doi.org/10.32604/csse.2025.069213
Received 17 June 2025; Accepted 18 July 2025; Issue published 19 August 2025
Abstract
Employee turnover presents considerable challenges for organizations, leading to increased recruitment costs and disruptions in ongoing operations. High voluntary attrition rates can result in substantial financial losses, making it essential for Human Resource (HR) departments to prioritize turnover reduction. In this context, Artificial Intelligence (AI) has emerged as a vital tool for strengthening business strategies and people management. This paper introduces two new representative features and applies three types of feature engineering to enhance the analysis of employee turnover in the IBM HR Analytics dataset. Several Machine Learning (ML) techniques are then employed to analyze employee turnover: Support Vector Machine (SVM), Random Forest (RF), Logistic Regression (LR), Extreme Gradient Boosting (XGBoost), and, in particular, Categorical Boosting (CatBoost), a gradient boosting algorithm optimized for categorical data. The proposed feature engineering process enables CatBoost to improve model accuracy and robustness while capturing complex patterns within employee data. Experimental results demonstrate the effectiveness of the proposed methodology, which achieves a highest accuracy of 90.14% and an F1-score of 0.88 on the IBM dataset. To further assess the capability of our prediction system, we also used an extended dataset, on which it achieves a best accuracy of 98.10% and an F1-score of 0.98. These results indicate the effectiveness of the proposed methodology and highlight the impact of feature engineering on predictive performance. Moreover, by pinpointing the top ten factors influencing attrition, including “Monthly Income”, “Over Time”, “Total Satisfaction”, and others, this research equips HR departments with insights to implement targeted retention strategies, such as enhancing compensation or job satisfaction, to retain key talent before they consider leaving.
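As an illustration of the kind of engineered feature the abstract mentions, the following minimal sketch derives a "Total Satisfaction" value by summing several satisfaction ratings per employee. The column names follow the public IBM HR Analytics dataset, but the choice of columns, the summation rule, the helper name `add_total_satisfaction`, and the sample records are assumptions made for illustration; the abstract does not state how the authors define this feature.

```python
# Hypothetical sketch: the aggregation rule below is an assumption,
# not the definition used in the paper.
SATISFACTION_COLUMNS = (
    "EnvironmentSatisfaction",
    "JobSatisfaction",
    "RelationshipSatisfaction",
    "WorkLifeBalance",
)


def add_total_satisfaction(record: dict) -> dict:
    """Return a copy of `record` with a summed TotalSatisfaction feature."""
    enriched = dict(record)  # copy so the input record is left unchanged
    enriched["TotalSatisfaction"] = sum(record[col] for col in SATISFACTION_COLUMNS)
    return enriched


# Made-up example records (ratings on the dataset's 1-4 scale).
employees = [
    {"EnvironmentSatisfaction": 2, "JobSatisfaction": 4,
     "RelationshipSatisfaction": 1, "WorkLifeBalance": 1, "OverTime": "Yes"},
    {"EnvironmentSatisfaction": 3, "JobSatisfaction": 2,
     "RelationshipSatisfaction": 4, "WorkLifeBalance": 3, "OverTime": "No"},
]

enriched = [add_total_satisfaction(e) for e in employees]
for row in enriched:
    print(row["OverTime"], row["TotalSatisfaction"])
```

A derived numeric column of this kind could then be supplied to a classifier alongside the original features, with categorical fields such as `OverTime` left as categories for a model like CatBoost to handle.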
Copyright © 2025 The Author(s). Published by Tech Science Press. This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

