# zbMATH — the first resource for mathematics

Succinct non-overlapping indexing. (English) Zbl 1432.68089
Cicalese, Ferdinando (ed.) et al., Combinatorial pattern matching. 26th annual symposium, CPM 2015, Ischia Island, Italy, June 29 – July 1, 2015. Proceedings. Cham: Springer. Lect. Notes Comput. Sci. 9133, 185-195 (2015).
Summary: Given a text $$\mathsf {T}$$ having $$n$$ characters, we consider the non-overlapping indexing problem defined as follows: pre-process $$\mathsf {T}$$ into a data-structure, such that whenever a pattern $$P$$ comes as input, we can report a maximal set of non-overlapping occurrences of $$P$$ in $$\mathsf {T}$$. The best known solution for this problem takes linear space, in which a suffix tree of $$\mathsf {T}$$ is augmented with $$O(n)$$-word data structures. A query $$P$$ can be answered in optimal $$O(|P|+\mathsf {nocc})$$ time, where $$\mathsf {nocc}$$ is the output size [H. Cohen and E. Porat, Lect. Notes Comput. Sci. 5878, 1044–1053 (2009; Zbl 1273.68097)]. We present the following new result: let $$\mathsf {CSA}$$ (not necessarily a compressed suffix array) be an index of $$\mathsf {T}$$ that can compute (i) the suffix range of $$P$$ in $$\mathsf {search}(P)$$ time, and (ii) a suffix array or an inverse suffix array value in $$\mathsf t_\mathsf{SA}$$ time; then by using $$\mathsf {CSA}$$ alone, we can answer a query $$P$$ in $$O(\mathsf {search}(P)+ \mathsf {nocc}\cdot \mathsf t_\mathsf{SA})$$ time. Additionally, we present an improved result for a generalized version of this problem called range non-overlapping indexing.
For the entire collection see [Zbl 1314.68012].

##### MSC:
 68P05 Data structures 68P15 Database theory 68W32 Algorithms on strings
Full Text: