Shlukování na základě hustoty pro velká data (Density-Based Clustering for Big Data)
Publisher
Vysoká škola báňská - Technická univerzita Ostrava
Abstract
This diploma thesis focuses on clustering, with particular attention to density-based cluster
analysis for big data. It opens with the theory of clustering, chiefly density-based cluster
analysis and the DBSCAN algorithm. A significant part of the first half of the thesis covers
data structures for efficient data storage and querying. In the second part, we propose our
own version of DBSCAN that uses a k-d tree as its data structure and parallelizes some of
DBSCAN's steps. We then measure the impact of parallelizing the DBSCAN algorithm and compare
brute-force data querying with k-d tree querying. In the final part, we propose possible
enhancements and functionality for further improvement.
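
To make the comparison named in the abstract concrete, the following is a minimal sequential sketch of DBSCAN in which the neighbourhood query is pluggable, so the same clustering loop can run over either a brute-force scan or a k-d tree. It is not the thesis's implementation: the function names (`brute_force_query`, `build_kdtree`, `kdtree_query`, `dbscan`), the tuple-based tree layout, and the sample data are all assumptions made for illustration, and the parallel steps mentioned in the abstract are not shown.

```python
import math
from collections import deque

NOISE = -1  # label for points that belong to no cluster (illustrative choice)


def brute_force_query(points, center, eps):
    """Linear scan: indices of all points within eps of center."""
    return [i for i, p in enumerate(points) if math.dist(p, center) <= eps]


def build_kdtree(points, indices=None, depth=0):
    """Build a k-d tree as nested tuples (point_index, axis, left, right)."""
    if indices is None:
        indices = list(range(len(points)))
    if not indices:
        return None
    axis = depth % len(points[0])  # cycle through the dimensions
    indices = sorted(indices, key=lambda i: points[i][axis])
    mid = len(indices) // 2  # median point becomes the splitting node
    return (indices[mid], axis,
            build_kdtree(points, indices[:mid], depth + 1),
            build_kdtree(points, indices[mid + 1:], depth + 1))


def kdtree_query(points, node, center, eps, found=None):
    """Range query: indices of all points within eps of center."""
    if found is None:
        found = []
    if node is None:
        return found
    idx, axis, left, right = node
    if math.dist(points[idx], center) <= eps:
        found.append(idx)
    diff = center[axis] - points[idx][axis]
    # Descend into a subtree only if the eps-ball can reach its half-space.
    if diff <= eps:
        kdtree_query(points, left, center, eps, found)
    if diff >= -eps:
        kdtree_query(points, right, center, eps, found)
    return found


def dbscan(points, eps, min_pts, region_query):
    """Sequential DBSCAN; region_query(center, eps) returns neighbour indices."""
    labels = [None] * len(points)  # None means "not visited yet"
    cluster = -1
    for i in range(len(points)):
        if labels[i] is not None:
            continue
        neighbors = region_query(points[i], eps)
        if len(neighbors) < min_pts:
            labels[i] = NOISE  # may later be relabelled as a border point
            continue
        cluster += 1
        labels[i] = cluster
        queue = deque(neighbors)
        while queue:
            j = queue.popleft()
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: reachable but not core
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_neighbors = region_query(points[j], eps)
            if len(j_neighbors) >= min_pts:  # j is a core point, keep expanding
                queue.extend(j_neighbors)
    return labels


# Hypothetical sample data: two tight groups and one isolated point.
sample = [(0.0, 0.0), (0.0, 1.0), (1.0, 0.0), (1.0, 1.0),
          (10.0, 10.0), (10.0, 11.0), (11.0, 10.0), (11.0, 11.0),
          (50.0, 50.0)]
tree = build_kdtree(sample)
labels_brute = dbscan(sample, 1.5, 3, lambda c, e: brute_force_query(sample, c, e))
labels_tree = dbscan(sample, 1.5, 3, lambda c, e: kdtree_query(sample, tree, c, e))
```

Because only the query function changes, both runs should produce the same labels. The brute-force scan costs one pass over all points per query, whereas the k-d tree can skip subtrees that lie entirely outside the search radius.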
Subject(s)
clustering, DBSCAN, data structure, k-d tree, parallelization, OpenMP