Open Access

ARTICLE

Clustering Gene Expression Data Through Modified Agglomerative M-CURE Hierarchical Algorithm

E. Kavitha1,*, R. Tamilarasan2, N. Poonguzhali3, M. K. Jayanthi Kannan4

1 A Constituent College of Anna University, University College of Engineering, Villupuram, 605103, India
2 A Constituent College of Anna University, University College of Engineering, Pattukkottai, 614701, India
3 Department of Computer Science and Engineering, Manakula Vinayagar Institute of Technology, Puducherry, 605107, India
4 Department of Computer Science Engineering, Faculty of Engineering and Technology, JAIN (Deemed-To-Be University), Bangalore, 562112, India

* Corresponding Author: E. Kavitha. Email: email

Computer Systems Science and Engineering 2022, 41(3), 1027-141. https://doi.org/10.32604/csse.2022.020634

Abstract

Gene expression is the process by which the information in a gene is used to synthesize a functional gene product. Genes encode the proteins that in turn dictate the function of the cell. Clustering is typically the first step in a gene expression study. This is because biological networks are highly complex, and the sheer number of genes makes the data harder to comprehend and interpret; the data themselves also exhibit vagueness, noise, and imprecision. For a biological system to function, its essential cellular molecules, including RNA, DNA, metabolites, and proteins, must interact with their surroundings. Clustering methods help expose the structures and patterns in the original data to support further decisions. Traditional clustering techniques include hierarchical, model-based, partitioning, density-based, grid-based, and soft clustering methods. Although many of these methods cluster reliably, they fail to handle very large volumes of gene expression data. Working with gene expression data also raises statistical issues, along with the choice of the right method and of the dissimilarity matrix. In this work, we propose a modified clustering-using-representatives algorithm (M-CURE), which is more robust to outliers than K-means clustering and can also find clusters of varying size.
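
To illustrate why representative-based clustering can be less sensitive to outliers than a centroid-only method such as K-means, the following is a minimal sketch of the representative-selection step of the standard CURE algorithm. It is not the M-CURE modification proposed in the article, whose details the abstract does not give. The function names and the values of `c` (number of representatives) and `alpha` (shrink factor) are assumptions chosen for illustration.

```python
import math


def centroid(points):
    """Mean of a list of equal-length coordinate tuples."""
    n = len(points)
    return tuple(sum(p[d] for p in points) / n for d in range(len(points[0])))


def representatives(points, c=4, alpha=0.3):
    """Pick up to c well-scattered points, then shrink each toward the centroid.

    c and alpha are illustrative defaults, not values from the article.
    """
    mean = centroid(points)
    remaining = list(points)
    chosen = []
    while remaining and len(chosen) < c:
        if not chosen:
            # First pick: the point farthest from the cluster centroid.
            best = max(remaining, key=lambda p: math.dist(p, mean))
        else:
            # Later picks: the point farthest from its nearest chosen point.
            best = max(
                remaining,
                key=lambda p: min(math.dist(p, q) for q in chosen),
            )
        remaining.remove(best)
        chosen.append(best)
    # Shrinking pulls representatives inward, damping the pull of outliers.
    return [tuple(x + alpha * (m - x) for x, m in zip(p, mean)) for p in chosen]


def cluster_distance(reps_a, reps_b):
    """Distance between two clusters: closest pair of representatives."""
    return min(math.dist(a, b) for a in reps_a for b in reps_b)
```

In an agglomerative loop, the two clusters with the smallest `cluster_distance` would be merged and their representatives recomputed. Because each cluster is described by several scattered points rather than a single centroid, clusters of different sizes and non-spherical shapes can be captured.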


Cite This Article

E. Kavitha, R. Tamilarasan, N. Poonguzhali and M. K. Jayanthi Kannan, "Clustering gene expression data through modified agglomerative M-CURE hierarchical algorithm," Computer Systems Science and Engineering, vol. 41, no. 3, pp. 1027–141, 2022. https://doi.org/10.32604/csse.2022.020634



This work is licensed under a Creative Commons Attribution 4.0 International License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.