# zbMATH — the first resource for mathematics

Space-efficient dictionaries for parameterized and order-preserving pattern matching. (English) Zbl 1380.68472
Grossi, Roberto (ed.) et al., 27th annual symposium on combinatorial pattern matching, CPM 2016, Tel Aviv, Israel, June 27–29, 2016. Proceedings. Wadern: Schloss Dagstuhl – Leibniz Zentrum für Informatik (ISBN 978-3-95977-012-5). LIPIcs – Leibniz International Proceedings in Informatics 54, Article 2, 12 p. (2016).
Summary: Let $$S$$ and $$S'$$ be two strings of the same length, over a totally-ordered alphabet. We consider the following two variants of string matching.
Parameterized Matching: The characters of $$S$$ and $$S'$$ are partitioned into static characters and parameterized characters. The strings are parameterized match iff the static characters match exactly and there exists a one-to-one function which renames the parameterized characters in $$S$$ to those in $$S'$$.
Order-Preserving Matching: The strings are order-preserving match iff for any two integers $$i,j\in[1,|S|],\;S[i]\prec S[j]\Leftrightarrow S'[i]\prec S'[j]$$, where $$\prec$$ denotes the precedence order of the alphabet.
Let $${\mathcal P}$$ be a collection of $$d$$ patterns $$\{P_1,P_2,\dots,P_d\}$$ of total length $$n$$ characters, which are chosen from a totally-ordered alphabet $$\Sigma$$. Given a text $$T$$, also over $$\Sigma$$, we consider the dictionary indexing problem under the above definitions of string matching. Specifically, the task is to index $${\mathcal P}$$, such that we can report all positions $$j$$ (called occurences) where at least one of the patterns $$P_i\in{\mathcal P}$$ is a parameterized-match (resp. order-preserving match) with the same-length substring of $$T$$ starting at $$j$$. Previous best-known indexes occupy $$O(n\log n)$$ bits and can report all occ occurences in $${\mathcal O}(|T|\log|\Sigma|+\mathrm{occ})$$ time. We present space-efficient indexes that occupy $${\mathcal O}(n\log|\Sigma|+ d\log n)$$ bits and reports all occ positions in $${\mathcal O}(|T|(\log|\Sigma|+ \log_{|\Sigma|}n)+\mathrm{occ})$$ time for parameterized matching and in $${\mathcal O}(|T|\log n+\mathrm{occ})$$ time for order-preserving matching.
For the entire collection see [Zbl 1351.68018].

##### MSC:
 68W32 Algorithms on strings
Full Text: