Open Access
ARTICLE
Diabetes Prediction Using ADASYN-Based Data Augmentation and CNN-BiGRU Deep Learning Model
1 School of Electronics and Information Engineering, Hebei University of Technology, Tianjin, 300401, China
2 School of Electronics and Communication Engineering, Guangzhou University, Guangzhou, 510006, China
* Corresponding Author: Kewen Xia.
Computers, Materials & Continua 2025, 84(1), 811-826. https://doi.org/10.32604/cmc.2025.063686
Received 21 January 2025; Accepted 13 March 2025; Issue published 09 June 2025
Abstract
The rising prevalence of diabetes underscores the urgent need for precise and efficient diagnostic tools that support early intervention and treatment. However, existing datasets suffer from significant class imbalance and limited sample diversity, which hinder accurate prediction and classification of diabetes. To address these issues, this study proposes a diabetes prediction framework that combines Adaptive Synthetic Sampling (ADASYN) for data augmentation with a hybrid Convolutional Neural Network-Bidirectional Gated Recurrent Unit (CNN-BiGRU) model for classification. ADASYN generates synthetic yet representative samples, mitigating class imbalance and increasing the diversity of the dataset. This augmentation step is critical for the robustness and generalizability of the predictive model, particularly when minority-class samples are underrepresented. The CNN-BiGRU architecture leverages the complementary strengths of the CNN in extracting spatial features and the BiGRU in capturing sequential dependencies, making it well suited to the complex patterns inherent in medical data. The proposed framework achieved a training accuracy of 98.74% and a test accuracy of 97.78% on the augmented dataset. These results support the efficacy of the integrated approach in addressing class imbalance and dataset heterogeneity while improving diagnostic precision for diabetes prediction. The study provides a scalable methodology with promising implications for diagnostic accuracy in medical applications, particularly in resource-constrained and data-limited environments.
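To make the augmentation step concrete, the following is a minimal sketch of the general ADASYN idea in plain Python. It is not the authors' implementation. The function name `adasyn_sketch`, its parameters (`k`, `beta`, `seed`), and the fallback to uniform weights are assumptions introduced here for illustration. The sketch weights each minority sample by the share of majority samples among its nearest neighbours, then interpolates towards nearby minority samples.

```python
import math
import random


def adasyn_sketch(minority, majority, k=3, beta=1.0, seed=0):
    """Illustrative ADASYN-style oversampling (hypothetical helper).

    minority, majority: lists of feature vectors (lists of floats).
    Returns a list of synthetic minority feature vectors.
    """
    rng = random.Random(seed)
    # Number of synthetic samples needed for the requested balance level.
    total = int((len(majority) - len(minority)) * beta)
    if total <= 0 or len(minority) < 2:
        return []

    labelled = [(p, 0) for p in majority] + [(p, 1) for p in minority]

    # Hardness ratio: share of majority points among the k nearest
    # neighbours of each minority point (the point itself is excluded).
    ratios = []
    for x in minority:
        neighbours = sorted(
            (q for q in labelled if q[0] is not x),
            key=lambda q: math.dist(x, q[0]),
        )[:k]
        ratios.append(sum(1 for _, label in neighbours if label == 0) / k)

    norm = sum(ratios)
    # Assumption: use uniform weights if no minority point has any
    # majority neighbour, so the normalisation never divides by zero.
    if norm > 0:
        weights = [r / norm for r in ratios]
    else:
        weights = [1 / len(minority)] * len(minority)

    synthetic = []
    for x, w in zip(minority, weights):
        # Candidate partners: the k nearest other minority points.
        partners = sorted(
            (m for m in minority if m is not x),
            key=lambda m: math.dist(x, m),
        )[:k]
        # Harder points (larger w) receive more synthetic samples.
        for _ in range(round(w * total)):
            z = rng.choice(partners)
            lam = rng.random()
            # Linear interpolation between x and the chosen partner.
            synthetic.append([a + lam * (b - a) for a, b in zip(x, z)])
    return synthetic
```

Because the per-sample counts are rounded, the number of generated samples may differ slightly from the target. In practice a maintained library implementation would normally be preferred over a hand-written sketch such as this one.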
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.