An Improved K-means Clustering Algorithm for Multi-dimensional Multi-cluster Data Using Meta-heuristics
Publisher
Institute of Electrical and Electronics Engineers Inc.
Abstract
K-means is the most widely used clustering algorithm. It is an unsupervised technique that requires an initial guess of the centroids before it can begin, and the quality of the final clusters depends heavily on that guess. Because finding the optimal clustering is NP-hard, the initialization needs careful consideration and optimization. In this work, a meta-heuristic approach based on a genetic algorithm is proposed to optimize centroid initialization. The proposed method combines tournament selection, probability-based mutation, and elitism to search for the optimal centroids for the clusters of a given dataset. The method was tested on nine diverse datasets using the Davies-Bouldin index, and it outperformed both the standard k-means and mini-batch k-means algorithms on all of them.
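The abstract gives no implementation details, so the following is only a minimal sketch of how a genetic algorithm with tournament selection, probability-based mutation, and elitism might search for initial centroids. It is not the authors' code. The function names, the single-point crossover, the choice to mutate by swapping in a random data point, and all parameter defaults (`pop_size`, `generations`, `rate`, tournament `size`) are assumptions made for illustration. Fitness is the Davies-Bouldin index of the partition induced by a candidate set of centroids (lower is better), and the sketch assumes `k >= 2`.

```python
import math
import random

def assign(points, centroids):
    """Label each point with the index of its nearest centroid."""
    return [min(range(len(centroids)), key=lambda j: math.dist(p, centroids[j]))
            for p in points]

def davies_bouldin(points, centroids):
    """Davies-Bouldin index of the partition induced by `centroids`."""
    k = len(centroids)
    labels = assign(points, centroids)
    clusters = [[p for p, l in zip(points, labels) if l == j] for j in range(k)]
    if any(not c for c in clusters):
        return float("inf")  # penalise candidates that leave a cluster empty
    means = [tuple(sum(x) / len(c) for x in zip(*c)) for c in clusters]
    scatter = [sum(math.dist(p, m) for p in c) / len(c)
               for c, m in zip(clusters, means)]
    total = 0.0
    for i in range(k):
        worst = 0.0
        for j in range(k):
            if j == i:
                continue
            gap = math.dist(means[i], means[j])
            if gap == 0.0:
                return float("inf")  # coincident cluster means: degenerate
            worst = max(worst, (scatter[i] + scatter[j]) / gap)
        total += worst
    return total / k

def tournament(population, scores, rng, size=3):
    """Tournament selection: best of `size` randomly drawn individuals."""
    contenders = rng.sample(range(len(population)), size)
    return population[min(contenders, key=lambda i: scores[i])]

def mutate(individual, points, rng, rate):
    """Replace each centroid by a random data point with probability `rate`."""
    return [rng.choice(points) if rng.random() < rate else c for c in individual]

def ga_init_centroids(points, k, pop_size=20, generations=30, rate=0.1, seed=0):
    """Return (centroids, score) for the best candidate found; assumes k >= 2."""
    rng = random.Random(seed)
    # Each individual is a list of k centroids drawn from the data.
    population = [rng.sample(points, k) for _ in range(pop_size)]
    for _ in range(generations):
        scores = [davies_bouldin(points, ind) for ind in population]
        elite = population[min(range(pop_size), key=lambda i: scores[i])]
        children = [elite]  # elitism: the best individual survives unchanged
        while len(children) < pop_size:
            a = tournament(population, scores, rng)
            b = tournament(population, scores, rng)
            cut = rng.randrange(1, k)  # single-point crossover (assumed operator)
            children.append(mutate(a[:cut] + b[cut:], points, rng, rate))
        population = children
    scores = [davies_bouldin(points, ind) for ind in population]
    best = min(range(pop_size), key=lambda i: scores[i])
    return population[best], scores[best]
```

The returned centroids would then be passed to a k-means implementation as its starting point, for example through the `init` argument of scikit-learn's `KMeans`. Because the elite individual is copied forward unchanged, the best score in the population never gets worse from one generation to the next.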
Citation
Ashraf, F. B., Matin, A., Shafi, M. S. R., & Islam, M. U. (2021, December). An improved k-means clustering algorithm for multi-dimensional multi-cluster data using meta-heuristics. In 2021 24th International Conference on Computer and Information Technology (ICCIT) (pp. 1-6). IEEE.