Structural XML Query Processing
Publisher
Vysoká škola báňská - Technická univerzita Ostrava
Location
ÚK/Sklad diplomových prací
Signature
202200006
Abstract
This thesis deals with the processing of structural XML queries, which specify predicates on XML nodes and the structural relationships that must hold between them. Such queries are often modeled as a twig pattern query (TPQ). Many TPQ types have been proposed; this work considers a TPQ model extended with a specification of output and non-output query nodes, since it complies with XQuery semantics and, in many cases, leads to more efficient query processing. In general, there are two approaches to processing TPQs: holistic twig joins and binary joins. Holistic twig joins were developed as a generalization of binary joins and have been considered the state-of-the-art TPQ processing method. This work improves both approaches. For holistic twig joins, we introduce a cost-based optimization that makes it possible to combine various index data structures during the processing of a TPQ; we also propose a cost-based optimization framework that selects an appropriate index data structure for each query node. For binary joins, we show that these algorithms, when used in a fully-pipelined plan (i.e., a plan in which no join operation waits for the complete result of the previous operation), can often outperform holistic twig joins even without any cost-based optimizer, especially for TPQs with a higher ratio of non-output query nodes and for queries with low selectivity. We also prove that, for a certain class of TPQs, the fully-pipelined plan has linear time and I/O complexity with respect to the size of the input and output, as well as linear space complexity with respect to the XML document depth (i.e., the same complexity as holistic twig joins). Thorough experiments demonstrate the advantages of the proposed improvements.
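To make the idea of a fully-pipelined binary join plan concrete, the following is a minimal sketch and not the algorithms from the thesis. It assumes a textbook interval labeling in which every XML node carries a (start, end) pair and an ancestor's interval encloses those of its descendants, and it swaps in a classic stack-based structural semi-join. The names `Node` and `descendants_with_ancestor` are hypothetical. Each join is a Python generator, so a join consumes the previous join's results one at a time instead of waiting for the complete result, and a non-output query node is handled as a semi-join that only filters the stream.

```python
from typing import Iterable, Iterator, List, NamedTuple


class Node(NamedTuple):
    """Hypothetical interval label: an ancestor's (start, end) encloses its descendants'."""
    start: int
    end: int
    tag: str


def descendants_with_ancestor(
    ancestors: Iterable[Node], descendants: Iterable[Node]
) -> Iterator[Node]:
    """Lazily yield every descendant enclosed by at least one ancestor.

    Both inputs must be sorted by `start` (document order). The stack only
    ever holds nodes nested inside one another, so its size is bounded by
    the document depth.
    """
    anc_iter = iter(ancestors)
    anc = next(anc_iter, None)
    stack: List[Node] = []
    for desc in descendants:
        # Push every ancestor candidate that opens before this descendant.
        while anc is not None and anc.start < desc.start:
            while stack and stack[-1].end < anc.start:
                stack.pop()  # closed before the new candidate opens
            stack.append(anc)
            anc = next(anc_iter, None)
        # Drop candidates that closed before this descendant opens.
        while stack and stack[-1].end < desc.start:
            stack.pop()
        if stack:  # some candidate is still open, so it encloses desc
            yield desc


# Labels for <a><b><c/></b></a> <c/> <a><c/></a>, in document order.
a_nodes = [Node(1, 6, "a"), Node(9, 12, "a")]
b_nodes = [Node(2, 5, "b")]
c_nodes = [Node(3, 4, "c"), Node(7, 8, "c"), Node(10, 11, "c")]

# Query //a//c with `a` as a non-output node: one semi-join.
a_c = list(descendants_with_ancestor(a_nodes, c_nodes))

# Query //a//b//c with only `c` as output: two chained generators, no
# intermediate result is materialized between the joins.
a_b_c = list(
    descendants_with_ancestor(descendants_with_ancestor(a_nodes, b_nodes), c_nodes)
)
```

Chaining works here because each join emits its results in document order, which is exactly the input order the next join expects.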
Subject(s)
XML query processing, twig pattern query, query plan, binary join, holistic twig join, XQuery, XPath, XML, native XML database management system, cost-based optimization