×

zbMATH — the first resource for mathematics

Structural properties of the string statistics problem. (English) Zbl 0593.68047
Summary: A suitably weighted index tree such as a B-tree or a suffix tree can be easily adapted to store, for a given string x and for all substrings w of x, the number of distinct instances of w along x. The storage needed is seen to be linear in the length of x; moreover, the whole statistics can itself be derived in linear time, off-line on a RAM. If the substring w has nontrivial periods, however, the number of distinct instances might differ from that of distinct nonoverlapping occurrences along x. It is shown here that O(n log n) storage units - n standing for the length of x - are sufficient to organize this second kind of statistics, in such a way that the maximum number of nonoverlapping instances for arbitrary w along x can be retrieved in a number of character comparisons not exceeding the length of w.

MSC:
68P10 Searching and sorting
68T99 Artificial intelligence
68Q25 Analysis of algorithms and problem complexity
68R10 Graph theory (including graph drawing) in computer science
PDF BibTeX XML Cite
Full Text: DOI
References:
[1] Apostolico, A., On context constrained squares and repetitions in a string, RAIRO J. theoret. inform., 18, No. 2, 147-159, (1984) · Zbl 0543.68067
[2] Aho, A.V.; Hopcrofr, J.E.; Ullman, J.D., The design and analysis of computer algorithms, (1974), Addison-Wesley Reading, Mass
[3] Apostolico, A.; Giancarlo, R., The boyer-Moore-galil string searching strategy revisited, SIAM J. comput., 15, No. 1, (1986) · Zbl 0589.68047
[4] Apostolico, A., The myriad virtues of subword trees, () · Zbl 0572.68067
[5] Apostolico, A.; Preparata, F.P., Optimal off-line detection of repetitions in a string, Theoret. comput. sci., 22, 297-315, (1983) · Zbl 0497.68052
[6] Boyer, R.S.; Moore, J.S., A fast string searching algorithm, Comm. ACM, 20, 323-350, (1977) · Zbl 1219.68165
[7] Crochemore, M., Recherche linĂ©aire d’un carrel dans un mot, C. R. acad. sci. Paris ser. I, 2966, 781-784, (1983) · Zbl 0522.68074
[8] Crochemore, M., An optimal algorithm for computing the repetitions in a word, Ipl, 12, No. 5, 244-250, (1981) · Zbl 0467.68075
[9] Galil, Z., On improving the worst case running time of the boyer-Moore string matching algorithm, (), 241-250
[10] Knuth, D.E.; Morris, J.H.; Pratt, V.R., Fast pattern matching in strings, SIAM J. comput., 6, 323-350, (1977) · Zbl 0372.68005
[11] Lentin, A.; Schutzemberger, M.P., A combinatorial problem in the theory of free monoids, (), 128-144
[12] Lorentz, R.J.; Main, M.G., Linear time recognition of square-free strings, ()
[13] Lothaire, M., Combinatorics on words, (1983), Addison-Wesley Reading, Mass · Zbl 0514.20045
[14] Lyndon, R.C.; Schutzemberger, M.P., The equation aM = bncp in a free group, Michigan math. J., 9, 289-298, (1962) · Zbl 0106.02204
[15] Mccreight, E.M., A space economical suffix tree construction algorithm, J. ACM, 23, 262-272, (1976) · Zbl 0329.68042
[16] Main, M.; Lorentz, R., An O(n log n) algorithm for finding all repetitions in a string, J. algorithms, 5, 422-432, (1984) · Zbl 0547.68083
[17] Majster, M.E.; Reiser, A., Efficient on-line construction and correction of position trees, SIAM J. comput., 9, 785-807, (1980) · Zbl 0447.68072
[18] Weiner, P., Linear pattern matching algorithms, ()
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.