an:07286734
Zbl 07286734
Hooshmand, Sahar; Abedin, Paniz; Külekci, M. Oğuzhan; Thankachan, Sharma V.
Non-overlapping indexing -- cache obliviously
EN
Navarro, Gonzalo (ed.) et al., 29th annual symposium on combinatorial pattern matching, CPM 2018, July 2--4, 2018, Qingdao, China. Wadern: Schloss Dagstuhl -- Leibniz Zentrum für Informatik (ISBN 978-3-95977-074-3). LIPIcs -- Leibniz International Proceedings in Informatics 105, Article 8, 9 p. (2018).
2018
a
68W32
suffix trees; cache oblivious; data structure; string algorithms
Summary: The non-overlapping indexing problem is defined as follows: pre-process a given text \(\mathsf{T}[1,n]\) of length \(n\) into a data structure such that whenever a pattern \(P[1,p]\) comes as an input, we can efficiently report the largest set of non-overlapping occurrences of \(P\) in \(\mathsf{T}\). The best known solution is by \textit{H. Cohen} and \textit{E. Porat} [Lect. Notes Comput. Sci. 5878, 1044--1053 (2009; Zbl 1273.68097)]. Their index size is \(O(n)\) words and query time is optimal \(O(p+\mathsf{nocc})\), where \(\mathsf{nocc}\) is the output size. We study this problem in the cache-oblivious model and present a new data structure of size \(O(n\log n)\) words. It can answer queries in optimal \(O(\frac{p}{B}+\log_B n+\frac{\mathsf{nocc}}{B})\) I/Os, where \(B\) is the block size.
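To make the query output concrete, the following is a minimal sketch of what a non-overlapping query reports, computed naively without any index. It is not the data structure of the paper or of Cohen and Porat: it scans the text directly, so its running time depends on \(n\) rather than on \(p\) and \(\mathsf{nocc}\). The function name `non_overlapping_occurrences` is hypothetical. It relies on the standard fact that, since all occurrences have the same length \(p\), greedily taking the leftmost occurrence and then the leftmost occurrence starting at least \(p\) positions later yields a largest non-overlapping set.

```python
def non_overlapping_occurrences(text: str, pattern: str) -> list[int]:
    """Return 1-based start positions of a largest set of
    non-overlapping occurrences of `pattern` in `text`.

    Naive baseline for illustration only (hypothetical helper,
    not the indexed solution described in the summary).
    """
    if not pattern:
        return []
    p = len(pattern)
    chosen = []
    # Leftmost occurrence overall (0-based), or -1 if there is none.
    start = text.find(pattern)
    while start != -1:
        chosen.append(start + 1)  # report in the 1-based convention T[1,n]
        # Next candidate must begin at or after the end of the chosen one.
        start = text.find(pattern, start + p)
    return chosen
```

For example, with text `aaaaa` and pattern `aa` there are four occurrences (starting at positions 1, 2, 3, 4), but at most two of them can be pairwise non-overlapping.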
For the entire collection see [Zbl 1390.68025].
Zbl 1273.68097