Reduction of disk accesses in multidimensional data structures
Authors
Chovanec, Peter
Publisher
Vysoká škola báňská - Technická univerzita Ostrava
Location
ÚK/Sklad diplomových prací
Signature
201600084
Abstract
Multidimensional access methods have become very popular in recent years. They support the basic operations (insert, delete, update, and point query), and they often support other query types such as the multidimensional range query and similarity queries. Multidimensional access methods can be classified as tree access methods and grid access methods. Grid access methods are highly dependent on the distribution of the data; they have an extremely bad worst case for the space overhead and for the time complexity of the insert, update, and delete operations. Tree access methods therefore dominate. Although tree access methods outperform grid access methods for those operations, their query processing has been shown to be inefficient in many cases.
In this thesis, we focus on processing the multidimensional range query without the need for a sequential scan through the complete data collection. However, when the depth-first range query algorithm of a data structure is applied, the nodes of the data structure are accessed randomly. This is a problem especially when the nodes are read from secondary storage. Moreover, other issues of tree access methods appear as the dimensionality of the space increases; as a result, many leaf nodes matched by the algorithm do not include any relevant data. When we reduce the number of nodes accessed during range query processing, we reduce the query processing time regardless of whether the nodes are stored in main memory or in secondary storage.
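To make the node-access problem concrete, the following is a minimal illustrative sketch, not code from the thesis. It assumes a simplified two-dimensional R-tree in which `Node`, `intersects`, `contains`, and `range_query` are hypothetical names introduced here. It counts how many nodes a depth-first range query visits and how many visited leaves contribute no result.

```python
from dataclasses import dataclass, field

# Hypothetical, simplified 2-D R-tree node (illustration only).
@dataclass
class Node:
    mbr: tuple                                     # (xmin, ymin, xmax, ymax)
    children: list = field(default_factory=list)   # inner node: child nodes
    points: list = field(default_factory=list)     # leaf node: (x, y) points

def intersects(a, b):
    """True if rectangles a and b overlap (borders included)."""
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def contains(rect, p):
    """True if point p lies inside rect (borders included)."""
    return rect[0] <= p[0] <= rect[2] and rect[1] <= p[1] <= rect[3]

def range_query(node, query, stats):
    """Depth-first range query that records node accesses in stats."""
    stats["accessed"] += 1
    if not node.children:                          # leaf node
        hits = [p for p in node.points if contains(query, p)]
        if not hits:
            stats["empty_leaves"] += 1             # visited, but nothing relevant
        return hits
    result = []
    for child in node.children:
        if intersects(child.mbr, query):           # descend only into overlapping MBRs
            result.extend(range_query(child, query, stats))
    return result

# Two leaves whose bounding rectangles both overlap the query rectangle.
leaf_a = Node(mbr=(0, 0, 4, 4), points=[(0, 0), (4, 4)])
leaf_b = Node(mbr=(5, 0, 9, 4), points=[(5, 1), (9, 4)])
root = Node(mbr=(0, 0, 9, 4), children=[leaf_a, leaf_b])

stats = {"accessed": 0, "empty_leaves": 0}
result = range_query(root, (3, 0, 6, 2), stats)
print(result, stats)
```

In this toy tree the bounding rectangle of `leaf_a` overlaps the query although none of its points fall inside it, so the leaf is read without contributing to the result. This is the kind of access the abstract describes as wasted.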
This thesis describes three techniques that reduce the number of nodes accessed while a range query is processed. The first is an optimization of disk accesses by prefetch techniques. The second is an optimization of multiple range query processing, i.e. processing a sequence of range queries in one tree traversal. The third enables more efficient processing of a special kind of range query, the narrow range query, using signatures. Since the R-tree is the most common multidimensional data structure, the presented techniques are applied primarily to the R-tree.
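The idea behind the second technique can be illustrated with a small sketch. It is an assumption-laden toy, not the algorithm from the thesis: the simplified two-dimensional `Node` class and the function `batch_range_query` are hypothetical names introduced here. A batch of queries is pushed down the tree together, so each node is read at most once per batch rather than once per query.

```python
from dataclasses import dataclass, field

# Hypothetical, simplified 2-D R-tree node (illustration only).
@dataclass
class Node:
    mbr: tuple                                     # (xmin, ymin, xmax, ymax)
    children: list = field(default_factory=list)   # inner node: child nodes
    points: list = field(default_factory=list)     # leaf node: (x, y) points

def intersects(a, b):
    return a[0] <= b[2] and b[0] <= a[2] and a[1] <= b[3] and b[1] <= a[3]

def contains(rect, p):
    return rect[0] <= p[0] <= rect[2] and rect[1] <= p[1] <= rect[3]

def batch_range_query(node, queries, results, stats):
    """Process all (query_id, rect) pairs in one depth-first traversal."""
    stats["accessed"] += 1                         # one access serves every query
    if not node.children:                          # leaf node
        for qid, rect in queries:
            results[qid].extend(p for p in node.points if contains(rect, p))
        return
    for child in node.children:
        # Pass down only the queries that overlap this child's MBR.
        relevant = [(qid, r) for qid, r in queries if intersects(child.mbr, r)]
        if relevant:
            batch_range_query(child, relevant, results, stats)

leaf_a = Node(mbr=(0, 0, 4, 4), points=[(0, 0), (4, 4)])
leaf_b = Node(mbr=(5, 0, 9, 4), points=[(5, 1), (9, 4)])
root = Node(mbr=(0, 0, 9, 4), children=[leaf_a, leaf_b])

queries = [(0, (0, 0, 1, 1)), (1, (3, 3, 5, 4)), (2, (8, 3, 9, 4))]

# One traversal for the whole batch.
batch_stats = {"accessed": 0}
batch_results = {qid: [] for qid, _ in queries}
batch_range_query(root, queries, batch_results, batch_stats)

# Baseline: one traversal per query.
single_stats = {"accessed": 0}
single_results = {qid: [] for qid, _ in queries}
for q in queries:
    batch_range_query(root, [q], single_results, single_stats)

print(batch_stats["accessed"], single_stats["accessed"])
```

Both variants return the same result sets; only the number of node accesses differs. The sketch ignores prefetching and signatures, which are separate techniques in the thesis.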
Description
Import 02/11/2016
Import 18/04/2016
Subject(s)
multidimensional access methods, range query processing, R-tree, prefetch techniques, multiple range queries, signatures