TY - EJOU
AU - Dung, Duong Tien
AU - Nam, Ha Hai
AU - Giang, Nguyen Long
AU - Lan, Luong Thi Hong
TI - An Active Safe Semi-Supervised Fuzzy Clustering with Pairwise Constraints Based on Cluster Boundary
T2 - Computers, Materials & Continua
PY - 2025
VL - 85
IS - 3
SN - 1546-2226
AB - Semi-supervised clustering techniques improve clustering accuracy by using a limited amount of labeled data as prior knowledge to guide the clustering. Semi-supervised fuzzy clustering (SSFC) methods follow this approach, but they remain highly susceptible to inappropriate or mislabeled prior knowledge, especially in noisy or overlapping datasets where cluster boundaries are ambiguous. An effective clustering algorithm must therefore exploit labeled data while ensuring that the prior knowledge is safe to use. Existing solutions, such as the Trusted Safe Semi-Supervised Fuzzy Clustering Method (TS3FCM), struggle with random centroid initialization, fixed neighbor radius formulas, and handling outliers or noise at cluster overlaps. This paper proposes a new framework, Active Safe Semi-Supervised Fuzzy Clustering with Pairwise Constraints Based on Cluster Boundary (AS3FCPC), that addresses these problems by combining pairwise constraints and active learning. AS3FCPC uses active learning to query only the most informative data instances close to the cluster boundaries, and uses pairwise constraints to enforce the cluster structure, which makes the system more accurate and robust. Extensive test results on diverse datasets, including challenging noisy and overlapping scenarios, demonstrate that AS3FCPC consistently achieves superior performance compared to state-of-the-art methods like TS3FCM and other baselines. This improvement underscores AS3FCPC’s potential for reliable and accurate semi-supervised fuzzy clustering in complex, real-world applications, particularly by effectively managing mislabeled data and ambiguous cluster boundaries.
KW - Active learning; safe semi-supervised fuzzy clustering; confidence weight; boundary identification; pairwise constraints
DO - 10.32604/cmc.2025.069636