Design and Performance Evaluation of Efficient Clustering Algorithms for Big Data Applications
Published Online: Feb 05, 2025
Received: Sep 22, 2024
Accepted: Dec 31, 2024
DOI: https://doi.org/10.2478/amns-2025-0056
© 2025 Ping Dai et al., published by Sciendo
This work is licensed under the Creative Commons Attribution 4.0 International License.
In recent years, the rapid development of big data, cloud computing, Internet of Things (IoT) technology, and artificial intelligence algorithms has provided data analysis and management support for many fields. Against this background, this article designs efficient clustering algorithms for big data applications. It first proposes a k-means clustering algorithm based on dimensionality reduction, which combines information entropy-based kernel principal component analysis (KPCA) with k-means: attributes that carry little information are removed, and KPCA is then applied to the remaining attributes to reduce the dimensionality of the data. It then proposes a weighted k-means clustering algorithm with optimized initial cluster centers, which accounts for the differing influence of individual sample attributes on the clustering result. Finally, the article evaluates the performance of the proposed algorithms and applies them in an empirical case study. In the evaluation experiments, as the dataset size increases, the processing efficiency of the proposed algorithm increases exponentially, and its advantage over the other algorithms becomes more pronounced.
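The pipeline described in the abstract (entropy-based attribute filtering, kernel PCA, then weighted k-means) can be sketched roughly as follows. This is a minimal NumPy illustration under stated assumptions, not the authors' implementation. The histogram-based entropy estimate, the 0.5-bit threshold, the RBF kernel and its `gamma`, the farthest-point initialization (a stand-in for the paper's optimized initial centers), and the variance-share attribute weights are all hypothetical choices, and the function names are illustrative only.

```python
import numpy as np

def attribute_entropy(X, bins=10):
    """Shannon entropy (in bits) of each column, estimated from a histogram."""
    entropies = []
    for col in X.T:
        counts, _ = np.histogram(col, bins=bins)
        p = counts[counts > 0] / counts.sum()
        entropies.append(-np.sum(p * np.log2(p)))
    return np.array(entropies)

def rbf_kernel_pca(X, n_components=2, gamma=0.1):
    """Project X onto the leading components of a centered RBF kernel matrix."""
    sq = np.sum(X ** 2, axis=1)
    d2 = np.maximum(sq[:, None] + sq[None, :] - 2.0 * X @ X.T, 0.0)
    K = np.exp(-gamma * d2)
    n = K.shape[0]
    one = np.full((n, n), 1.0 / n)
    Kc = K - one @ K - K @ one + one @ K @ one   # center in feature space
    vals, vecs = np.linalg.eigh(Kc)              # eigenvalues in ascending order
    idx = np.argsort(vals)[::-1][:n_components]  # keep the largest ones
    vals = np.maximum(vals[idx], 0.0)
    return vecs[:, idx] * np.sqrt(vals)

def weighted_kmeans(X, k, weights, n_iter=100, seed=0):
    """k-means under an attribute-weighted squared Euclidean distance."""
    rng = np.random.default_rng(seed)
    # Assumption: farthest-point initialization, used here only as a
    # placeholder for the paper's optimized initial cluster centers.
    centers = [X[rng.integers(len(X))]]
    for _ in range(1, k):
        d = np.min([np.sum(weights * (X - c) ** 2, axis=1) for c in centers], axis=0)
        centers.append(X[np.argmax(d)])
    centers = np.array(centers)
    for _ in range(n_iter):
        d = np.sum(weights * (X[:, None, :] - centers[None, :, :]) ** 2, axis=2)
        labels = np.argmin(d, axis=1)
        new_centers = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(k)
        ])
        if np.allclose(new_centers, centers):
            break
        centers = new_centers
    return labels, centers

# Toy data (hypothetical): two well-separated blobs plus one constant attribute.
rng = np.random.default_rng(42)
blob_a = rng.normal(0.0, 0.3, size=(60, 3))
blob_b = rng.normal(6.0, 0.3, size=(60, 3))
X = np.hstack([np.vstack([blob_a, blob_b]), np.full((120, 1), 1.0)])

# Step 1: drop attributes whose estimated entropy is below the threshold.
kept = attribute_entropy(X) >= 0.5
X_info = X[:, kept]

# Step 2: reduce dimensionality with kernel PCA.
Z = rbf_kernel_pca(X_info, n_components=2, gamma=0.1)

# Step 3: weighted k-means; weights are each component's share of the variance.
weights = Z.var(axis=0) / Z.var(axis=0).sum()
labels, centers = weighted_kmeans(Z, k=2, weights=weights)
```

On this toy input the constant fourth attribute is filtered out before the kernel matrix is built. The full n-by-n kernel matrix used here would not scale to large datasets, so a big data setting would call for an approximate or distributed variant, which this sketch does not attempt.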