High-dimensional text clustering by dimensionality reduction and improved density peak
Publisher
Hindawi and Wiley
Abstract
This study addresses the clustering of high-dimensional text data. K-means handles high-dimensional data poorly, requires the number of clusters to be specified in advance, and selects its initial centers at random. We propose a Stacked-Random Projection dimensionality reduction framework and DPC-K-means, an enhanced K-means algorithm built on an improved density peaks algorithm. The improved density peaks algorithm determines the number of clusters and the initial cluster centers for K-means. The proposed algorithm is validated on seven text datasets. Experimental results show that it is suitable for clustering text data and corrects these defects of K-means.
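The pipeline outlined in the abstract (reduce dimensionality, pick initial centers by density peaks, then run K-means) can be illustrated with a minimal sketch. It is not the authors' implementation and rests on several assumptions. A single Gaussian random projection stands in for the Stacked-Random Projection framework. The center selection uses the classic density peaks score (local density times distance to the nearest denser point), not the improved variant. The number of centers `k` and the kernel width `d_c` are passed in by hand, whereas the improved algorithm determines the number of clusters itself. All function and parameter names are hypothetical.

```python
import math
import random


def random_projection(X, out_dim, seed=0):
    """Map each row of X to out_dim dimensions with a Gaussian random matrix."""
    rng = random.Random(seed)
    in_dim = len(X[0])
    scale = 1.0 / math.sqrt(out_dim)  # keeps squared distances unbiased in expectation
    R = [[rng.gauss(0.0, 1.0) * scale for _ in range(out_dim)]
         for _ in range(in_dim)]
    return [[sum(x[i] * R[i][j] for i in range(in_dim)) for j in range(out_dim)]
            for x in X]


def dist(a, b):
    """Euclidean distance between two equal-length vectors."""
    return math.sqrt(sum((p - q) ** 2 for p, q in zip(a, b)))


def density_peak_centers(X, d_c, k):
    """Indices of the k points with the largest rho * delta (classic density peaks).

    Assumption: k is supplied by the caller; this sketch does not infer it.
    """
    n = len(X)
    D = [[dist(X[i], X[j]) for j in range(n)] for i in range(n)]
    # rho: local density under a Gaussian kernel of width d_c
    rho = [sum(math.exp(-(D[i][j] / d_c) ** 2) for j in range(n) if j != i)
           for i in range(n)]
    # delta: distance to the nearest point of higher density;
    # the densest point gets its largest distance to any other point
    delta = []
    for i in range(n):
        higher = [D[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(D[i]))
    gamma = [rho[i] * delta[i] for i in range(n)]
    return sorted(range(n), key=lambda i: gamma[i], reverse=True)[:k]


def kmeans(X, centers, iters=50):
    """Lloyd's K-means started from the given centers; returns (labels, centers)."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(len(centers)), key=lambda c: dist(x, centers[c]))
                      for x in X]
        if new_labels == labels:  # assignments stable: converged
            break
        labels = new_labels
        for c in range(len(centers)):
            members = [x for x, lab in zip(X, labels) if lab == c]
            if members:  # leave a center in place if it lost all its points
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def dpc_kmeans(X, out_dim, d_c, k, seed=0):
    """Project, seed K-means with density peaks, cluster; returns (labels, centers)."""
    Z = random_projection(X, out_dim, seed)
    peaks = density_peak_centers(Z, d_c, k)
    return kmeans(Z, [Z[i] for i in peaks])
```

Given a list of equal-length numeric vectors `X` (for example, TF-IDF rows), a call such as `dpc_kmeans(X, out_dim=10, d_c=5.0, k=3)` returns one cluster label per row and the final centers in the projected space. Because the initial centers come from the density peaks step and not from random sampling, repeated runs with the same `seed` give the same clustering.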
Citation
Wireless Communications and Mobile Computing. 2020, vol. 2020, art. no. 8881112.