High-dimensional text clustering by dimensionality reduction and improved density peak

Publisher

Hindawi and Wiley

Abstract

This study addresses the clustering of high-dimensional text data. K-means is ill-suited to high-dimensional data, requires the number of clusters to be specified in advance, and selects its initial centers at random. We propose a Stacked-Random Projection dimensionality reduction framework and DPC-K-means, an enhanced K-means algorithm built on an improved density peaks algorithm that determines both the number of clusters and the initial cluster centers for K-means. The proposed algorithm is validated on seven text datasets. Experimental results show that it is suitable for clustering text data and corrects these defects of K-means.
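
The abstract does not give the details of the Stacked-Random Projection framework or of the improved density peaks algorithm, so the following is only a minimal sketch of the general pipeline it describes, under stated assumptions: plain Gaussian random projections applied in sequence stand in for the stacked framework, and the classic density peaks quantities (local density rho and distance delta to the nearest denser point) stand in for the improved variant. The function names, the `dims`, `dc_percentile`, and `n_sigma` parameters, and the rule that selects centers by thresholding gamma = rho * delta are all hypothetical choices made for illustration, not the authors' method.

```python
import numpy as np

def stacked_random_projection(X, dims, seed=0):
    """Apply Gaussian random projections in sequence, one per size in dims (assumed scheme)."""
    rng = np.random.default_rng(seed)
    for d in dims:
        # Entries ~ N(0, 1/d) so squared norms are preserved in expectation.
        R = rng.normal(0.0, 1.0 / np.sqrt(d), size=(X.shape[1], d))
        X = X @ R
    return X

def density_peak_centers(X, dc_percentile=2.0, n_sigma=1.0):
    """Classic density peaks; returns indices of points chosen as initial centers."""
    n = len(X)
    D = np.linalg.norm(X[:, None, :] - X[None, :, :], axis=2)
    # Cutoff distance taken as a low percentile of all pairwise distances.
    dc = np.percentile(D[np.triu_indices(n, k=1)], dc_percentile)
    # Gaussian-kernel local density, excluding each point's own contribution.
    rho = np.exp(-(D / dc) ** 2).sum(axis=1) - 1.0
    order = np.argsort(-rho)
    delta = np.zeros(n)
    delta[order[0]] = D[order[0]].max()  # densest point: use its largest distance
    for i in range(1, n):
        idx = order[i]
        delta[idx] = D[idx, order[:i]].min()  # distance to nearest denser point
    gamma = rho * delta
    # Assumed heuristic: centers are points whose gamma is an outlier.
    return np.flatnonzero(gamma > gamma.mean() + n_sigma * gamma.std())

def kmeans_from_centers(X, centers, n_iter=100):
    """Lloyd's K-means started from the given centers instead of random ones."""
    centers = centers.copy()
    for _ in range(n_iter):
        dist = np.linalg.norm(X[:, None, :] - centers[None, :, :], axis=2)
        labels = dist.argmin(axis=1)
        new = np.array([
            X[labels == j].mean(axis=0) if np.any(labels == j) else centers[j]
            for j in range(len(centers))
        ])
        if np.allclose(new, centers):
            break
        centers = new
    return labels, centers
```

In this sketch, a document-term matrix `X` would first be reduced with `stacked_random_projection(X, dims=(50, 10))`; the number of indices returned by `density_peak_centers` then fixes the number of clusters, and the corresponding rows seed `kmeans_from_centers`. The full pairwise distance matrix limits this version to modest sample sizes.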

Citation

Wireless Communications and Mobile Computing. 2020, vol. 2020, art. no. 8881112.