Centroid Based Celestial Clustering Algorithm: A Novel Unsupervised Learning Method for Haemogram Data Clustering

Bibliographic Details
Published in	IEEE Transactions on Emerging Topics in Computational Intelligence Vol. 7; no. 3; pp. 942-956
Main Authors	K. B., Shibu Kumar; Samuel, Philip
Format	Journal Article
Language	English
Published	Piscataway: The Institute of Electrical and Electronics Engineers, Inc. (IEEE), 01.06.2023
Subjects	Accuracy; Algorithms; Automation; Blood; Centroid based celestial clustering; Centroids; Clustering; Clustering algorithms; Data analysis; Disease; disease prediction; Diseases; Force; haemogram data clustering; Machine learning; nature inspired methods; Optimization; Parameter identification; Prediction algorithms; Unsupervised learning
ISSN	2471-285X
DOI	10.1109/TETCI.2022.3211004

Summary:	Clustering accuracy is the most important parameter in automated disease identification. There have long been attempts to automate disease prediction from haemogram data. However, blood test results contain several components, and detecting a disease often requires a combination of these component results. This makes disease identification hard and necessitates data analysis techniques. Because new diseases arise from time to time, unsupervised learning is a useful approach to prediction, and the corresponding data analysis technique is clustering. A simple, efficient, centroid-based clustering algorithm in wide use is k-means, and its simplicity and efficiency make it a natural choice for most clustering applications. However, k-means depends heavily on the selection of initial cluster centers, and a bad choice can trap it in a local optimum, sacrificing accuracy. It is also non-deterministic. This paper proposes a novel, nature-inspired clustering method, named Centroid Based Celestial Clustering, which overcomes these issues. Our method is deterministic and converges to global optima on spherical datasets. We experimentally evaluate our algorithm for execution speed and cluster quality against well-known clustering algorithms, using statistical evaluation metrics such as silhouette width, adjusted Rand index and Dunn index. We use the method to predict diseases identifiable from blood tests, and our experiments show that the prediction accuracy is very promising.
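
The summary's point about k-means depending on its initial cluster centers can be illustrated with a small sketch. This is not the paper's Centroid Based Celestial Clustering method, which this record does not describe in detail. It is a plain Lloyd-style k-means written with NumPy, and the function name `lloyd_kmeans`, the toy 2-D data and both sets of starting centers are hypothetical choices made only for illustration.

```python
import numpy as np


def lloyd_kmeans(points, init_centers, max_iter=100):
    """Plain Lloyd-style k-means from caller-supplied initial centers.

    Hypothetical helper for illustration only; it is NOT the algorithm
    proposed in the paper. Returns (centers, labels, inertia), where
    inertia is the sum of squared distances to the assigned center.
    """
    centers = np.asarray(init_centers, dtype=float).copy()
    for _ in range(max_iter):
        # Squared distance from every point to every center: shape (n, k).
        d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        new_centers = centers.copy()
        for j in range(len(centers)):
            members = points[labels == j]
            if len(members) > 0:  # an empty cluster keeps its old center
                new_centers[j] = members.mean(axis=0)
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    # Recompute assignments and inertia for the final centers.
    d2 = ((points[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
    labels = d2.argmin(axis=1)
    inertia = float(d2[np.arange(len(points)), labels].sum())
    return centers, labels, inertia


# Toy data (assumed): three tight groups of four points each,
# centered at x = 0, 10 and 20 on the horizontal axis.
offsets = np.array([[-0.5, -0.5], [-0.5, 0.5], [0.5, -0.5], [0.5, 0.5]])
points = np.vstack([offsets + [cx, 0.0] for cx in (0.0, 10.0, 20.0)])

# One starting center per group: every group gets its own cluster.
good_init = [[0.0, 0.0], [10.0, 0.0], [20.0, 0.0]]
# Two starting centers inside the first group, one between the other two:
# the iteration stops at once in a much worse local optimum.
bad_init = [[0.0, 0.5], [0.0, -0.5], [15.0, 0.0]]

_, _, good_inertia = lloyd_kmeans(points, good_init)
_, _, bad_inertia = lloyd_kmeans(points, bad_init)

print("inertia with good initial centers:", good_inertia)  # 6.0
print("inertia with bad initial centers:", bad_inertia)    # 205.0
```

Both runs use the same data and the same update rule; only the starting centers differ, yet the second run splits one group in two and merges the other two. When the starting centers are drawn at random, this same sensitivity is what makes the result vary from run to run.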