P-ROCK: A Sustainable Clustering Algorithm for Large Categorical Datasets

Ayman Altameem; Ramesh Poonia; Ankit Kumar; Linesh Raja; Abdul Khader

doi:10.32604/iasc.2023.027579

Open Access

ARTICLE

Ayman Altameem¹, Ramesh Chandra Poonia², Ankit Kumar³, Linesh Raja⁴, Abdul Khader Jilani Saudagar⁵,*

1 Department of Computer Science and Engineering, College of Applied Studies and Community Services, King Saud University, Riyadh, 11533, Saudi Arabia
2 Department of Computer Science, CHRIST (Deemed to be University), Bangalore, 560029, India
3 Department of Computer Engineering and Applications, GLA University, Mathura, UP, India
4 Department of Computer Application, Manipal University Jaipur, Rajasthan, 303007, India
5 Information Systems Department, Imam Mohammad Ibn Saud Islamic University (IMSIU), Riyadh, 11432, Saudi Arabia

* Corresponding Author: Abdul Khader Jilani Saudagar. Email: email

Intelligent Automation & Soft Computing 2023, 35(1), 553-566. https://doi.org/10.32604/iasc.2023.027579

Received 20 January 2022; Accepted 02 March 2022; Issue published 06 June 2022

Abstract

Data clustering is crucial to data processing and analytics, and it helps address the challenge of evaluating and extracting information from big data. Both numerical and categorical data can be clustered, but existing clustering methods favor numerical data and largely ignore categorical data. Until recently, the only way to cluster categorical data was to convert it to a numeric representation and then apply existing numeric clustering methods. However, these algorithms could not exploit the categorical nature of the data. Subsequently, extensions of traditional methods to handle categorical data were suggested, and several new clustering methods have also been proposed in recent years. ROCK is an adaptable and straightforward algorithm that clusters data by calculating the similarity between data points. This paper modifies the algorithm by creating a parameterized version that takes specific algorithm parameters as input and outputs satisfactory cluster structures. The modified algorithm is called the parameterized ROCK algorithm (P-ROCK). The proposed modification makes the original algorithm more flexible by using user-defined parameters. A detailed hypothesis was developed and later validated with experimental results on real-world datasets using the proposed P-ROCK algorithm. A comparison with the original ROCK algorithm is also provided. Experimental results show that the proposed algorithm is on par with the original ROCK algorithm, with an accuracy of 97.9%. The proposed P-ROCK algorithm also improves runtime and is more flexible and scalable.
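The abstract does not spell out how ROCK measures similarity. As background only, the following minimal Python sketch illustrates the neighbor-and-link idea commonly associated with ROCK: two categorical records are neighbors when their Jaccard similarity reaches a threshold, and the link count of a pair is the number of neighbors they share. The function names, the threshold name `theta`, the choice not to count a record as its own neighbor, and the toy records are all assumptions made for illustration; this is not the P-ROCK implementation and does not reproduce the paper's parameters.

```python
from itertools import combinations


def jaccard(a, b):
    """Jaccard similarity between two records treated as sets of categorical values."""
    a, b = set(a), set(b)
    union = a | b
    return len(a & b) / len(union) if union else 1.0


def neighbor_sets(records, theta):
    """For each record, collect the indices of records with similarity >= theta.

    Assumption: a record is not counted as its own neighbor.
    """
    nbrs = [set() for _ in records]
    for i, j in combinations(range(len(records)), 2):
        if jaccard(records[i], records[j]) >= theta:
            nbrs[i].add(j)
            nbrs[j].add(i)
    return nbrs


def link_counts(nbrs):
    """Link count of a pair = number of neighbors the two records have in common."""
    links = {}
    for i, j in combinations(range(len(nbrs)), 2):
        common = len(nbrs[i] & nbrs[j])
        if common:
            links[(i, j)] = common
    return links


# Hypothetical toy records (sets of categorical values), for illustration only.
records = [{"a", "b", "c"}, {"a", "b", "d"}, {"a", "c", "d"}, {"x", "y"}]
nbrs = neighbor_sets(records, theta=0.5)
links = link_counts(nbrs)
```

In this toy example the first three records are mutual neighbors and the fourth is isolated, so only pairs among the first three receive a nonzero link count. A user-supplied threshold such as `theta` is one example of the kind of input a parameterized variant might expose, though the specific parameters of P-ROCK are not stated in the abstract.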

Keywords

ROCK; K-means algorithm; clustering approaches; unsupervised learning; K-histogram

Cite This Article

APA Style

Altameem, A., Poonia, R.C., Kumar, A., Raja, L., Saudagar, A.K.J. (2023). P-ROCK: A Sustainable Clustering Algorithm for Large Categorical Datasets. Intelligent Automation & Soft Computing, 35(1), 553–566. https://doi.org/10.32604/iasc.2023.027579

Vancouver Style

Altameem A, Poonia RC, Kumar A, Raja L, Saudagar AKJ. P-ROCK: A Sustainable Clustering Algorithm for Large Categorical Datasets. Intell Automat Soft Comput. 2023;35(1):553–566. https://doi.org/10.32604/iasc.2023.027579

IEEE Style

A. Altameem, R. C. Poonia, A. Kumar, L. Raja, and A. K. J. Saudagar, “P-ROCK: A Sustainable Clustering Algorithm for Large Categorical Datasets,” Intell. Automat. Soft Comput., vol. 35, no. 1, pp. 553–566, 2023. https://doi.org/10.32604/iasc.2023.027579

Copyright © 2023 The Author(s). Published by Tech Science Press.
This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
