High-dimensional text clustering by dimensionality reduction and improved density peak
Publisher
Hindawi and Wiley
Abstract
This study addresses the clustering of high-dimensional text data. K-means cannot effectively process high-dimensional data, requires the number of clusters to be specified in advance, and selects its initial centers at random. We propose a Stacked-Random Projection dimensionality reduction framework and an enhanced K-means algorithm, DPC-K-means, built on an improved density peaks algorithm. The improved density peaks algorithm determines both the number of clusters and the initial cluster centers for K-means. The proposed algorithm is validated on seven text datasets. The experimental results show that, by correcting these defects of K-means, the algorithm is well suited to clustering text data.
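The pipeline outlined in the abstract (reduce the dimensionality, let density peaks pick the number of clusters and the initial centers, then run K-means) can be illustrated with a minimal sketch. It is not the authors' implementation, and it rests on several assumptions: a single Gaussian random projection stands in for the Stacked-Random Projection framework, the standard density peaks quantities (local density rho, distance to the nearest denser point delta) stand in for the improved variant, and the rule that picks the number of clusters (the largest ratio drop in the sorted rho * delta scores) is a hypothetical choice made only for illustration. All function and parameter names are invented for this sketch.

```python
import math
import random


def random_projection(X, k, seed=0):
    """Project rows of X to k dimensions with a Gaussian random matrix.
    Assumption: one plain projection, not the paper's stacked framework."""
    rng = random.Random(seed)
    D = len(X[0])
    R = [[rng.gauss(0.0, 1.0) / math.sqrt(k) for _ in range(k)] for _ in range(D)]
    return [[sum(x[i] * R[i][j] for i in range(D)) for j in range(k)] for x in X]


def density_peak_centers(X, percentile=2.0, max_k=None):
    """Pick initial centers with standard density peaks scores.
    Assumption: the number of centers is set by the largest ratio drop
    in the sorted gamma = rho * delta values (illustrative rule only)."""
    n = len(X)
    dist = [[math.dist(a, b) for b in X] for a in X]
    pairs = sorted(dist[i][j] for i in range(n) for j in range(i + 1, n))
    # Cutoff distance: a low percentile of all pairwise distances.
    dc = max(pairs[min(len(pairs) - 1, int(len(pairs) * percentile / 100.0))], 1e-12)
    # rho: Gaussian-kernel local density of each point.
    rho = [sum(math.exp(-(dist[i][j] / dc) ** 2) for j in range(n) if j != i)
           for i in range(n)]
    # delta: distance to the nearest point with strictly higher density.
    delta = []
    for i in range(n):
        higher = [dist[i][j] for j in range(n) if rho[j] > rho[i]]
        delta.append(min(higher) if higher else max(dist[i]))
    gamma = [r * d for r, d in zip(rho, delta)]
    order = sorted(range(n), key=lambda i: gamma[i], reverse=True)
    max_k = min(max_k or max(2, int(math.sqrt(n))), n - 1)
    best_k, best_ratio = 1, 0.0
    for k in range(1, max_k + 1):
        ratio = gamma[order[k - 1]] / max(gamma[order[k]], 1e-12)
        if ratio > best_ratio:
            best_k, best_ratio = k, ratio
    return [list(X[i]) for i in order[:best_k]]


def kmeans(X, centers, iters=50):
    """Lloyd's algorithm started from the given centers (no random init)."""
    centers = [list(c) for c in centers]
    labels = None
    for _ in range(iters):
        new_labels = [min(range(len(centers)), key=lambda c: math.dist(x, centers[c]))
                      for x in X]
        if new_labels == labels:
            break  # assignments stopped changing
        labels = new_labels
        for c in range(len(centers)):
            members = [x for x, lab in zip(X, labels) if lab == c]
            if members:
                centers[c] = [sum(col) / len(members) for col in zip(*members)]
    return labels, centers


def dpc_kmeans(X, k_dims, seed=0):
    """Reduce dimensionality, seed K-means from density peaks, then cluster."""
    Z = random_projection(X, k_dims, seed)
    init = density_peak_centers(Z)
    return kmeans(Z, init)
```

Calling `dpc_kmeans(X, 8)` on a list of equal-length numeric vectors (for example, TF-IDF rows) returns one cluster label per row together with the final centers in the reduced space. Because the initial centers come from the density peaks step, repeated runs with the same projection seed give the same result, which is the practical benefit over randomly initialized K-means that the abstract points to.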
Citation
Wireless Communications and Mobile Computing. 2020, vol. 2020, art. no. 8881112.