A Fuzzy Approach for Topological Data Analysis

Abstract

Geometry and topology are playing an increasingly prominent role in data analysis, giving rise to Topological Data Analysis (TDA), a promising research area in modern computer science. In recent years, the Mapper algorithm, a prominent representative of TDA, has matured: it now has a stable theoretical foundation, practical applications, and diverse, intuitive, user-friendly implementations. From a theoretical perspective, the Mapper algorithm is still a fuzzy clustering algorithm, with a visualization capability that extracts a summary of the shape of the data. However, its outcomes remain very sensitive to the choice of parameters, including the resolution and the filter function. There is therefore a need to reduce this dependence on parameters significantly, and the characteristics of fuzzy clustering offer a way to do so. The clustering ability of the Mapper algorithm can, in turn, be strengthened by well-known techniques. Combining the two is therefore expected to help solve problems encountered in many fields. The main research goal of this thesis is to approach TDA through fuzzy theory and to establish the interrelationships between them in terms of clustering. Specifically, the Mapper algorithm represents TDA, and the Fuzzy $C$-Means (FCM) algorithm represents fuzzy theory. They are combined to build on the advantages of each and to overcome the disadvantages of each. On the one hand, the FCM algorithm helps the Mapper algorithm simplify the choice of parameters needed to obtain the most informative representation, and makes it more efficient at data clustering. On the other hand, the FCM algorithm gains the Mapper algorithm's ability to simplify and visualize data for qualitative analysis.
This thesis pursues the following aims: (1) summarizing the theoretical foundations and practical applications of the Mapper algorithm as they appear in the literature, including its improved versions and various implementations; (2) optimizing the cover choice of the Mapper algorithm by using the FCM algorithm to divide the filter range automatically into irregular intervals with a random overlapping percentage; (3) constructing a novel data mining method that exhibits the same clustering ability as the FCM algorithm and reveals meaningful relationships by visualizing the global shape of the data, as supplied by the Mapper algorithm.
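
As a rough illustration of aim (2), the sketch below runs a plain fuzzy c-means on one-dimensional filter values and turns the resulting memberships into overlapping intervals. It is not the procedure developed in the thesis: the function names `fcm_1d` and `fuzzy_cover`, the membership `threshold`, and the rule "an interval spans all points whose membership in a cluster reaches the threshold" are assumptions made only for this example.

```python
import numpy as np

def fcm_1d(values, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Plain fuzzy c-means on 1-D data; returns (centers, memberships)."""
    rng = np.random.default_rng(seed)
    x = np.asarray(values, dtype=float)
    # Random initial memberships, each row normalised to sum to 1.
    u = rng.random((x.size, n_clusters))
    u /= u.sum(axis=1, keepdims=True)
    for _ in range(n_iter):
        w = u ** m
        # Each center is the membership-weighted mean of the data.
        centers = (w * x[:, None]).sum(axis=0) / w.sum(axis=0)
        # Distances to each center; the epsilon avoids division by zero.
        d = np.abs(x[:, None] - centers[None, :]) + 1e-12
        inv = d ** (-2.0 / (m - 1.0))
        new_u = inv / inv.sum(axis=1, keepdims=True)
        converged = np.abs(new_u - u).max() < tol
        u = new_u
        if converged:
            break
    return centers, u

def fuzzy_cover(values, n_clusters, threshold=0.2, **kwargs):
    """Hypothetical cover rule: one interval per cluster, spanning every
    filter value whose membership in that cluster is >= threshold."""
    x = np.asarray(values, dtype=float)
    centers, u = fcm_1d(x, n_clusters, **kwargs)
    intervals = []
    for j in np.argsort(centers):  # left-to-right along the filter range
        members = x[u[:, j] >= threshold]
        if members.size:  # skip a cluster that no point reaches
            # min/max takes the hull of the selected points.
            intervals.append((float(members.min()), float(members.max())))
    return intervals

# Example: evenly spread filter values on [0, 1], three intervals.
filter_values = np.linspace(0.0, 1.0, 101)
for lo, hi in fuzzy_cover(filter_values, n_clusters=3):
    print(f"[{lo:.2f}, {hi:.2f}]")
```

In this sketch the interval lengths follow the data rather than a fixed resolution, and the amount of overlap is controlled by the single `threshold` value: lowering it widens every interval. A Mapper pipeline would then cluster the points falling in each interval and connect clusters that share points.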

Description

Subject(s)

Topological Data Analysis, Data Shape, Fuzzy Clustering, Mapper Algorithm, Fuzzy Mapper Algorithm, Shape Fuzzy $C$-Means Algorithm

Citation