Comprehensive characterization of an open source document search engine
Publisher
Association for Computing Machinery
Abstract
This work performs a thorough characterization and analysis of the open source Lucene search library. The article describes in detail the architecture, functionality, and micro-architectural behavior of the search engine, and investigates prominent online document search research issues. In particular, we study how intra-server index partitioning affects response time and throughput, explore the potential use of low-power servers for document search, and examine the sources of performance degradation and the causes of tail latencies. Some of our main conclusions are the following: (a) intra-server index partitioning can reduce tail latencies, but with diminishing benefits as incoming query traffic increases; (b) low-power servers, given enough partitioning, can provide the same average and tail response times as conventional high-performance servers; (c) index search is a CPU-intensive, cache-friendly application; and (d) C-states are the main culprits for performance degradation in document search.
Subject(s)
document search, index partitioning, parallel index search, parallelism, characterization, real hardware, measurement, evaluation, performance, experimentation
Citation
ACM Transactions on Architecture and Code Optimization. 2019, vol. 16, issue 2, art. no. 19.