
Compression, indexing, and retrieval for massive string data. (English) Zbl 1286.68118
Amir, Amihood (ed.) et al., Combinatorial pattern matching. 21st annual symposium, CPM 2010, New York, NY, USA, June 21–23, 2010. Proceedings. Berlin: Springer (ISBN 978-3-642-13508-8/pbk). Lecture Notes in Computer Science 6129, 260-274 (2010).
Summary: The field of compressed data structures seeks to achieve fast search times while using a compressed representation, ideally one that requires less space than the original input data. The challenge is to construct a compressed representation that provides the same functionality and speed as traditional data structures. In this invited presentation, we discuss some breakthroughs in compressed data structures over the course of the last decade that have significantly reduced the space requirements for fast text and document indexing. One interesting consequence is that, for the first time, we can construct data structures for text indexing that are competitive in time and space with the well-known technique of inverted indexes, but that provide more general search capabilities. Several challenges remain, and we focus in this presentation on two in particular: building I/O-efficient search structures when the input data are so massive that external memory must be used, and incorporating notions of relevance in the reporting of query answers.
For the entire collection see [Zbl 1192.68005].

68P30 Coding and information theory (compaction, compression, models of communication, encoding schemes, etc.) (aspects in computer science)
68P05 Data structures