TY  - EJOU
AU  - Nguyen, Anh Thi Diem
AU  - Vo, Tham
AU  - Hoang, Vinh Truong

TI  - Hybrid-RL: An Incremental Deep Clustering Framework with Reinforcement Learning for Adaptive Customer Segmentation
T2  - Computers, Materials & Continua

PY  - 
VL  - 
IS  - 
SN  - 1546-2226

AB  - Keeping customers engaged remains a major challenge in appointment-based services, where user behavior continuously shifts due to seasonal, market, and social factors. These dynamic changes often cause concept drift, rendering traditional deep clustering models unreliable because they assume stable data distributions. Most existing approaches handle representation learning, parameter optimization, and model updating as separate components, limiting their adaptability in real-world streaming environments. This study proposes Hybrid-RL, a novel adaptive clustering framework that unifies incremental deep representation learning, multi-head reinforcement learning for joint hyperparameter optimization (number of clusters, latent dimension, and clustering method), incremental model updating, bandit-based decision making, surrogate-model explainable artificial intelligence (XAI), and continuous Gini-based fairness monitoring within a single closed-loop pipeline. The framework updates incrementally via autoencoder fine-tuning and MiniBatchKMeans partial_fit without requiring full retraining, enabling efficient adaptation to evolving customer behavior. Experiments conducted on real proprietary appointment data (10,212 records collected from 2021 to 2025) with natural concept drift demonstrate that Hybrid-RL achieves superior clustering quality, recording a Silhouette score of 0.7542, Davies–Bouldin Index (DBI) of 0.3150, and Calinski–Harabasz (CH) index of 1810.34, while maintaining an ultra-low inference time of 0.0001 s per sample. The model significantly outperforms 13 baseline methods. Under controlled synthetic drift, Hybrid-RL exhibits only a 6.1% drop in Silhouette score, confirming strong robustness. Additional validation on the public UCI Online Retail dataset further confirms the framework’s generalizability. Fairness analysis reports an average Gini coefficient of 0.49 across clusters, indicating balanced action distribution.
KW  - Incremental clustering
KW  - reinforcement learning
KW  - adaptive segmentation
KW  - concept drift
KW  - explainable AI
KW  - customer retention optimization

DO  - 10.32604/cmc.2026.082845