Entropy-weighted medoid shift: An automated clustering algorithm for high-dimensional data

Publisher

Elsevier

Abstract

Unveiling the intrinsic structure within high-dimensional data presents a significant challenge, particularly when clusters manifest themselves in lower-dimensional subspaces rather than in the full feature space. This complexity is prevalent in real-world datasets, such as text documents and images, which often contain numerous noisy or sparse features. Traditional clustering methods often overlook these latent subspace structures. This paper introduces a novel subspace-based clustering algorithm designed explicitly to address this challenge. Building upon the robust medoid shift framework, we integrate a dimensionality reduction scheme that dynamically projects data onto evolving subspaces determined through entropy-constrained optimization. This approach effectively filters irrelevant information and identifies underlying clusters, optimizing subspace representation while avoiding trivial solutions. Unlike existing methods, our algorithm ensures convergence without necessitating stopping criteria, thereby enabling efficient processing of large datasets. We validate the efficacy of our approach through extensive experiments on synthetic and real-world datasets, demonstrating substantial performance enhancements over state-of-the-art techniques. By explicitly uncovering the underlying subspace structures, our method opens new avenues for effective high-dimensional data clustering and offers valuable insights into complex data environments.
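
The abstract does not state the update rules, so the following is only an illustrative sketch of the general idea and not the paper's algorithm. It assumes a plain medoid shift step with a Gaussian kernel, alternated with a softmax-style feature weighting of the kind that results from adding an entropy term to a weighted-dispersion objective. The function names, the `bandwidth` and `gamma` parameters, and the fixed number of rounds are all hypothetical choices made for this example.

```python
import numpy as np


def entropy_weights(dispersion, gamma=1.0):
    """Feature weights w_j proportional to exp(-D_j / gamma).

    Assumption: this is the closed-form minimiser of
    sum_j w_j * D_j + gamma * sum_j w_j * log(w_j) with sum_j w_j = 1.
    It is a generic entropy-regularised weighting, not taken from the paper.
    """
    z = -(dispersion - dispersion.min()) / gamma  # shift for numerical stability
    w = np.exp(z)
    return w / w.sum()


def weighted_medoid_shift(X, bandwidth=1.0, gamma=1.0, n_rounds=5):
    """Alternate a medoid shift step with a feature-weight update (sketch)."""
    n, d = X.shape
    w = np.full(d, 1.0 / d)      # start with uniform feature weights
    labels = np.arange(n)
    for _ in range(n_rounds):
        # Weighted squared distances between all pairs of samples, shape (n, n).
        diff2 = (X[:, None, :] - X[None, :, :]) ** 2
        D = diff2 @ w
        # Gaussian kernel weights around each sample.
        K = np.exp(-D / (2.0 * bandwidth ** 2))
        # Medoid shift step: sample i moves to the sample y that minimises
        # sum_k K[i, k] * D[k, y]; candidates are restricted to the data.
        cost = K @ D
        nxt = cost.argmin(axis=1)
        # Follow the pointers until they stop changing (bounded by n hops).
        labels = nxt.copy()
        for _ in range(n):
            new = nxt[labels]
            if np.array_equal(new, labels):
                break
            labels = new
        # Per-feature dispersion around the assigned medoids drives the weights.
        dispersion = ((X - X[labels]) ** 2).sum(axis=0)
        w = entropy_weights(dispersion, gamma)
    return labels, w
```

For example, `labels, w = weighted_medoid_shift(X)` on an `(n, d)` array returns, for each sample, the index of the medoid it was assigned to, together with one weight per feature. Features with low dispersion around the medoids receive larger weights, which is one simple way to down-weight noisy dimensions.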

Subject(s)

medoid shift, data clustering, unsupervised learning, high-dimensional data

Citation

Applied Soft Computing. 2025, vol. 169, art. no. 112347.