Multidimensional term indexing for efficient processing of complex queries
Date issued
2004
Authors
Krátký, Michal
Skopal, Tomáš
Snášel, Václav
Publisher
Akademie věd České republiky. Ústav teorie informace a automatizace
Location
In the ÚK collection
Abstract
Information Retrieval (IR) deals with the storage and retrieval of documents within huge collections of text. In IR models, the semantics of a document is usually characterized by a set of terms. A need common to various IR models is efficient term retrieval, provided by a term index. Existing approaches to term indexing, e.g. the inverted list, efficiently support only simple queries that ask for the occurrence of a term. In practice, we would like to exploit more sophisticated querying mechanisms, in particular queries based on regular expressions. In this article we propose a multidimensional approach to term indexing that provides efficient term retrieval and supports regular expression queries. Since terms usually differ in length, we also introduce an improvement based on a new data structure, called the BUB-forest, which provides even more efficient term retrieval.
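As a rough illustration of the general idea in the abstract, the sketch below treats each term as a point with one coordinate per character and turns a simple pattern into a per-dimension range (a query box). Everything here is an assumption made for illustration: the abstract does not specify the encoding, the class and function names (TermIndex, term_to_point, pattern_to_box) are hypothetical, only a single-character wildcard "?" is supported rather than full regular expressions, and a plain linear scan stands in for a real multidimensional data structure. Grouping terms by length only loosely mirrors the motivation given for the BUB-forest and is not a reconstruction of it.

```python
from collections import defaultdict


def term_to_point(term):
    """Map a term to a point: one integer coordinate per character (assumed encoding)."""
    return tuple(ord(ch) for ch in term)


def pattern_to_box(pattern, lo=0, hi=0x10FFFF):
    """Turn a fixed-length pattern into one (low, high) range per dimension.

    '?' matches any single character; any other character must match exactly.
    """
    return [(lo, hi) if ch == "?" else (ord(ch), ord(ch)) for ch in pattern]


class TermIndex:
    """Toy index: one bucket of points per term length (hypothetical design)."""

    def __init__(self):
        # length -> list of (point, term); each bucket has a fixed dimensionality
        self.by_length = defaultdict(list)

    def add(self, term):
        self.by_length[len(term)].append((term_to_point(term), term))

    def query(self, pattern):
        """Return, in sorted order, all terms whose point lies inside the query box."""
        box = pattern_to_box(pattern)
        hits = []
        # Linear scan of one bucket; a real index would run a range query instead.
        for point, term in self.by_length.get(len(pattern), []):
            if all(low <= c <= high for c, (low, high) in zip(point, box)):
                hits.append(term)
        return sorted(hits)


index = TermIndex()
for t in ["cat", "cut", "cot", "coat", "dog"]:
    index.add(t)

matches = index.query("c?t")
print(matches)
```

Because terms of different lengths map to points of different dimensionality, the sketch keeps a separate bucket per length, so a pattern of length three is only ever compared against three-character terms.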
Subject(s)
term indexing, complex queries, multidimensional data structures, BUB-forest
Citation
Kybernetika. 2004, vol. 40, no. 3, p. 381-396.