Open Access
ARTICLE
kProtoClust: Towards Adaptive k-Prototype Clustering without Known k
1 School of Information Engineering, Xuchang University, Xuchang, 461000, China
2 Henan Province Engineering Technology Research Center of Big Data Security and Applications, Xuchang, 461000, China
3 College of Computer Science and Technology, Guizhou University, Guiyang, 550025, China
4 Here Data Technology, Shenzhen, 518000, China
* Corresponding Author: Yuan Ping. Email:
Computers, Materials & Continua 2025, 82(3), 4949-4976. https://doi.org/10.32604/cmc.2025.057693
Received 25 August 2024; Accepted 16 December 2024; Issue published 06 March 2025
Abstract
For optimal k-prototype discovery, k-means-like algorithms inspire the collection of central samples, yet unstable seed sample selection, the assumption of a circle-like pattern, and the unknown K remain challenges, particularly for data patterns that are not known in advance. We propose an adaptive k-prototype clustering method (kProtoClust) that starts cluster exploration from a rough division into K clusters and then looks for evidence for splitting and merging. From the perspective of support vector data description, support vectors and outliers are not appropriate candidates for prototypes representing a group of data samples, whereas inner samples become the first candidates for reducing seed instability. Unlike traditional sample representation, we extend sample selection by encouraging fictitious samples that emphasize the representativeness of patterns. To escape the circle-like pattern limitation, we introduce a convex decomposition-based one-cluster-multiple-prototypes strategy in which convex hulls of varying sizes serve as prototypes, and accurate connection analysis makes it possible to support arbitrary cluster shapes. Inspired by geometry, the three presented strategies allow kProtoClust to bypass the dependence on K by analyzing the global and local position relationships among data samples. Experimental results on twelve datasets with irregular cluster shapes or high dimensionality suggest that kProtoClust handles arbitrary cluster shapes with high accuracy even without prior knowledge of K.
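The idea of preferring inner samples and fictitious samples as prototype candidates can be illustrated with a minimal sketch. This is not the kProtoClust procedure: as a loudly stated assumption, a simple distance-to-centroid ranking stands in for the support vector data description step, and the names `inner_samples`, `fictitious_prototype`, and `keep_fraction` are hypothetical and chosen only for illustration.

```python
import math

def centroid(points):
    """Coordinate-wise mean of a list of equal-length tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))

def inner_samples(points, keep_fraction=0.5):
    """Keep the samples closest to the centroid.

    Assumption: this ranking is a crude stand-in for SVDD, where boundary
    samples (support vectors) and outliers would be excluded instead.
    """
    c = centroid(points)
    ranked = sorted(points, key=lambda p: math.dist(p, c))
    keep = max(1, int(len(ranked) * keep_fraction))
    return ranked[:keep]

def fictitious_prototype(points, keep_fraction=0.5):
    """Mean of the inner samples; it need not coincide with any real sample."""
    return centroid(inner_samples(points, keep_fraction))

# One group of samples with a single far-away outlier.
cluster = [(0.0, 0.0), (1.0, 0.0), (0.0, 1.0), (1.0, 1.0), (0.5, 0.5), (10.0, 10.0)]
inner = inner_samples(cluster)
proto = fictitious_prototype(cluster)
print("inner samples:", inner)
print("fictitious prototype:", proto)
```

In this toy example the outlier drags the plain centroid away from the dense region, while the prototype built from inner samples stays close to it, which is the intuition behind excluding boundary samples and outliers from the candidate set.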
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.