×

zbMATH — the first resource for mathematics

Space-efficient frameworks for top-\(k\) string retrieval. (English) Zbl 1295.68230

MSC:
68W32 Algorithms on strings
68P05 Data structures
68P20 Information storage and retrieval of data
68Q17 Computational difficulty of problems (lower bounds, completeness, difficulty of approximation, etc.)
PDF BibTeX XML Cite
Full Text: DOI
References:
[1] Alok Aggarwal and Jeffrey Scott Vitter. 1988. The input/output complexity of sorting and related problems. Commun. ACM 31, 9, 1116–1127. · doi:10.1145/48529.48535
[2] D. Belazzougui and G. Navarro. 2011. Improved compressed indexes for full-text document retrieval. In Proceedings of the International Symposium on String Processing and Information Retrieval. 386–397. · Zbl 05965196 · doi:10.1007/978-3-642-24583-1_38
[3] Djamal Belazzougui, Gonzalo Navarro, and Daniel Valenzuela. 2013. Improved compressed indexes for full-text document retrieval. J. Discr. Algor. 18, 3–13. · Zbl 1268.68075 · doi:10.1016/j.jda.2012.07.005
[4] M. A. Bender and M. Farach-Colton. 2000. The LCA problem revisited. In Proceedings of the Latin American Symposium on Theoretical Informatics. 88–94. · Zbl 0959.68133
[5] Manuel Blum, Robert W. Floyd, Vaughan R. Pratt, Ronald L. Rivest, and Robert Endre Tarjan. 1973. Time bounds for selection. J. Comput. System Sci. 7, 4, 448–461. · Zbl 0278.68033 · doi:10.1016/S0022-0000(73)80033-9
[6] M. R. Brown and R. E. Tarjan. 1979. A fast merging algorithms. J. ACM 26, 2, 211–226. · Zbl 0395.68055 · doi:10.1145/322123.322127
[7] B. Chazelle. 1988. A functional approach to data structures and its use in multidimensional searching. SIAM J. Comput. 17, 3, 427–462. · Zbl 0679.68074 · doi:10.1137/0217026
[8] Yu Feng Chien, Wing Kai Hon, Rahul Shah, Sharma V. Thankachan, and Jeffrey Scott Vitter. 2013. Geometric BWT: Compressed text indexing via sparse suffixes and range searching. Algorithmica. To appear. · Zbl 1314.68115 · doi:10.1007/s00453-013-9792-1
[9] D. R. Clark. 1996. Compact Pat Trees. Ph.D. Dissertation, University of Waterloo.
[10] H. Cohen and E. Porat. 2010. Fast set intersection and two-patterns matching. Theoret. Comput. Sci. 411, 40–42, 3795–3800. · Zbl 1207.68270 · doi:10.1016/j.tcs.2010.06.002
[11] Richard Cole, Lee-Ad Gottlieb, and Moshe Lewenstein. 2004. Dictionary matching and indexing with errors and don’t cares. In Proceedings of the Symposium on Theory of Computing. 91–100. · Zbl 1192.68818
[12] J. S. Culpepper, G. Navarro, S. J. Puglisi, and A. Turpin. 2010. Top-kranked document search in general text databases. In Proceedings of the European Symposium on Algorithms. 194–205. · Zbl 1287.68035
[13] J. Shane Culpepper, Matthias Petri, and Falk Scholer. 2012. Efficient in-memory top-kdocument retrieval. In Proceedings of the SIGIR Conference on Research and Development in Information Retrieval. 225–234.
[14] Martin Farach. 1997. Optimal suffix tree construction with large alphabets. In Proceedings of the Symposium on Foundations of Computer Science. 137–143. · doi:10.1109/SFCS.1997.646102
[15] P. Ferragina and G. Manzini. 2005. Indexing compressed text. J. ACM 52, 4, 552–581. · Zbl 1323.68261 · doi:10.1145/1082036.1082039
[16] P. Ferragina, G. Manzini, V. Mäkinen, and G. Navarro. 2007. Compressed representations of sequences and full-text indexes. ACM Trans. Algor. 3, 2. · Zbl 1321.68263
[17] Johannes Fischer, Travis Gagie, Tsvi Kopelowitz, Moshe Lewenstein, Veli Mäkinen, Leena Salmela, and Niko Välimäki. 2012. Forbidden patterns. In Proceedings of the Latin American Symposium on Theoretical Informatics. 327–337.
[18] J. Fischer and V. Heun. 2007. A new succinct representation of RMQ-information and improvements in the enhanced suffix array. In Proceedings of the Symposium on Combinatorics, Algorithms, Probabilistic and Experimental Methodologies. 459–470. · Zbl 1176.68058 · doi:10.1007/978-3-540-74450-4_41
[19] Johannes Fischer and Volker Heun. 2011. Space-efficient preprocessing schemes for range minimum queries on static arrays. SIAM J. Comput. 40, 2, 465–492. · Zbl 1222.05024 · doi:10.1137/090779759
[20] G. N. Frederickson. 1993. An optimal algorithm for selection in a min-heap. Inf. Comput. 104, 2, 197–214. · Zbl 0818.68065 · doi:10.1006/inco.1993.1030
[21] Michael L. Fredman, János Komlós, and Endre Szemerédi. 1984. Storing a sparse table withO(1) worst case access time. J. ACM 31, 3, 538–544.
[22] Matteo Frigo, Charles E. Leiserson, Harald Prokop, and Sridhar Ramachandran. 1999. Cache-oblivious algorithms. In Proceedings of the Symposium on Foundations of Computer Science. 285–298. · Zbl 1295.68236
[23] Travis Gagie, Kalle Karhu, Gonzalo Navarro, Simon J. Puglisi, and Jouni Sirén. 2013. Document listing on repetitive collections. In Proceedings of the Symposium on Combinatorial Pattern Matching. 107–119. · Zbl 1381.68076
[24] T. Gagie, G. Navarro, and S. J. Puglisi. 2010. Colored range queries and document retrieval. In Proceedings of the International Symposium on String Processing and Information Retrieval. 67–81. · Zbl 05803982 · doi:10.1007/978-3-642-16321-0_7
[25] Travis Gagie, Gonzalo Navarro, and Simon J. Puglisi. 2012. New algorithms on wavelet trees and applications to information retrieval. Theoret. Comput. Sci. 426, 25–41. · Zbl 1243.68161 · doi:10.1016/j.tcs.2011.12.002
[26] Travis Gagie, Simon J. Puglisi, and Andrew Turpin. 2009. Range quantile queries: Another virtue of wavelet trees. In Proceedings of the International Symposium on String Processing and Information Retrieval. 1–6. · Zbl 05609194 · doi:10.1007/978-3-642-03784-9_1
[27] Alexander Golynski, J. Ian Munro, and S. Srinivasa Rao. 2006. Rank/select operations on large alphabets: A tool for text indexing. In Proceedings of the Symposium on Discrete Algorithms. 368–373. · Zbl 1192.68800
[28] R. Grossi, A. Gupta, and J. S. Vitter. 2003. High-order entropy-compressed text indexes. In Proceedings of the Symposium on Discrete Algorithms. 841–850. · Zbl 1092.68584
[29] R. Grossi and J. S. Vitter. 2005. Compressed suffix arrays and suffix trees with applications to text indexing and string matching. SIAM J. Comput. 35, 2, 378–407. · Zbl 1092.68115 · doi:10.1137/S0097539702402354
[30] Wing-Kai Hon, Manish Patil, Rahul Shah, Sharma V. Thankachan, and Jeffrey Scott Vitter. 2013a. Indexes for document retrieval with relevance. In Space-Efficient Data Structures, Streams, and Algorithms. 351–362. · Zbl 1394.68127 · doi:10.1007/978-3-642-40273-9_22
[31] Wing Kai Hon, Manish Patil, Rahul Shah, and Shih Bin Wu. 2010a. Efficient index for retrieving top-kmost frequent documents. J. Disc. Algor. 8, 4, 402–417. · Zbl 1215.68095 · doi:10.1016/j.jda.2010.08.003
[32] Wing Kai Hon, Rahul Shah, and Sharma V. Thankachan. 2012. Towards an optimal space-and-query-time index for top-kdocument retrieval. In Proceedings of the Symposium on Combinatorial Pattern Matching. 173–184. · Zbl 1358.68092 · doi:10.1007/978-3-642-31265-6_14
[33] Wing Kai Hon, Rahul Shah, Sharma V. Thankachan, and Jeffrey Scott Vitter. 2010b. String retrieval for multi-pattern queries. In Proceedings of the International Symposium on String Processing and Information Retrieval. 55–66. · Zbl 05803981 · doi:10.1007/978-3-642-16321-0_6
[34] Wing Kai Hon, Rahul Shah, Sharma V. Thankachan, and Jeffrey Scott Vitter. 2012c. On position restricted substring searching in succinct space. J. Disc. Algor. 17, 109–114. · Zbl 1267.68102 · doi:10.1016/j.jda.2012.09.002
[35] Wing-Kai Hon, Rahul Shah, Sharma V. Thankachan, and Jeffrey Scott Vitter. 2012c. Document listing for queries with excluded pattern. In CPM, 185–195. · Zbl 1358.68093
[36] Wing Kai Hon, Rahul Shah, and Jeffrey Scott Vitter. 2009. Space-efficient framework for top-kstring retrieval problems. In Proceedings of the Symposium on Foundations of Computer Science. 713–722. · Zbl 1292.68182
[37] Wing Kai Hon, Sharma V. Thankachan, Rahul Shah, and Jeffrey Scott Vitter. 2013. Faster compressed top-kdocument retrieval. In Proceedings of the Data Compression Conference. 341–350. · Zbl 1259.68259
[38] Bo-June (Paul) Hsu and Giuseppe Ottaviano. 2013. Space-efficient data structures for top-kcompletion. In Proceedings of the International Conference on World Wide Web. 583–594.
[39] M. Karpinski and Y. Nekrich. 2011. Top-Kcolor queries for document retrieval. In Proceedings of the Symposium on Discrete Algorithms. 401–411. · Zbl 1373.68197 · doi:10.1137/1.9781611973082.32
[40] D. E. Knuth, J. H. Morris, and V. B. Pratt. 1977. Fast pattern matching in strings. SIAM J. Comput. 6, 2, 323–350. · Zbl 0372.68005 · doi:10.1137/0206024
[41] Roberto Konow and Gonzalo Navarro. 2013. Faster compact top-kdocument retrieval. In Proceedings of the Data Compression Conference. 351–360.
[42] U. Manber and G. Myers. 1993. Suffix arrays: A new method for on-line string searches. SIAM J. Comput. 22, 5, 935–948. · Zbl 0784.68027 · doi:10.1137/0222058
[43] Y. Matias, S. Muthukrishnan, S. C. Sahinalp, and J. Ziv. 1998. Augmenting suffix trees, with applications. In Proceedings of the European Symposium on Algorithms. 67–78.
[44] E. M. McCreight. 1976. A space-economical suffix tree construction algorithm. J. ACM 23, 2, 262–272. · Zbl 0329.68042 · doi:10.1145/321941.321946
[45] J. I. Munro, V. Raman, and S. S. Rao. 2001. Space efficient suffix trees. J. Algor. 39, 2, 205–222. · Zbl 0977.68069 · doi:10.1006/jagm.2000.1151
[46] S. Muthukrishnan. 2002. Efficient algorithms for document retrieval problems. In Proceedings of the Symposium on Discrete Algorithms. 657–666. · Zbl 1093.68588
[47] Gonzalo Navarro. 2013. Spaces, trees and colors: The algorithmic landscape of document retrieval on sequences. CoRR abs/1304.6023. · Zbl 1305.68078
[48] G. Navarro and V. Mäkinen. 2007. Compressed full-text indexes. Comput. Surveys 39, 1.
[49] Gonzalo Navarro and Yakov Nekrich. 2012a. Sorted range reporting. In Proceedings of the 13th Scandinavian Symposium and Workshops on Algorithm Theory (SWAT). 271–282. · Zbl 1347.68343
[50] G. Navarro and Y. Nekrich. 2012b. Top-kdocument retrieval in optimal time and linear space. In Proceedings of the Symposium on Discrete Algorithms. 1066–1077. · doi:10.1137/1.9781611973099.84
[51] G. Navarro, S. J. Puglisi, and D. Valenzuela. 2011. Practical compressed document retrieval. In Proceedings of the Symposium on Experimental Algorithms. 193–205. · Zbl 05906096 · doi:10.1007/978-3-642-20662-7_17
[52] Gonzalo Navarro and Sharma V. Thankachan. 2013. Faster top-kdocument retrieval in optimal space. In Proceedings of the International Symposium on String Processing and Information Retrieval, To appear. · Zbl 1406.68022
[53] L. Page, S. Brin, R. Motwani, and T. Winograd. 1999. The PageRank citation ranking: Bringing order to the web. Technical Report 1999-66, Stanford InfoLab.
[54] Rasmus Pagh. 2001. Low redundancy in static dictionaries with constant query time. SIAM J. Comput. 31, 2, 353–363. · Zbl 0994.68050 · doi:10.1137/S0097539700369909
[55] M. Patil, S. V. Thankachan, R. Shah, W. K. Hon, J. S. Vitter, and S. Chandrasekaran. 2011. Inverted indexes for phrases and strings. In Proceedings of the SIGIR Conference on Research and Development in Information Retrieval. 555–564.
[56] Mihai Patrascu. 2008. Succincter. In Proceedings of the Symposium on Foundations of Computer Science. 305–313.
[57] R. Raman, V. Raman, and S. S. Rao. 2007. Succinct indexable dictionaries with applications to encodingk-ary trees, prefix sums and multisets. ACM Trans. Algor. 3, 4. · Zbl 1093.68582 · doi:10.1145/1290672.1290680
[58] K. Sadakane. 2007a. Compressed suffix trees with full functionality. Theory of Computing Systems, 589–607. · Zbl 1148.68015 · doi:10.1007/s00224-006-1198-x
[59] K. Sadakane. 2007b. Succinct data structures for flexible text retrieval systems. J. Disc. Algor. 5, 1, 12–22. · Zbl 1137.68360 · doi:10.1016/j.jda.2006.03.011
[60] K. Sadakane and G. Navarro. 2010. Fully-functional succinct trees. In Proceedings of the Symposium on Discrete Algorithms. 134–149. · Zbl 1288.05046 · doi:10.1137/1.9781611973075.13
[61] R. Shah and M. Farach-Colton. 2002. Undiscretized dynamic programming: Faster algorithms for facility location and related problems on trees. In Proceedings of the Symposium on Discrete Algorithms. 108–115. · Zbl 1093.68628
[62] Rahul Shah, Cheng Sheng, Sharma V. Thankachan, and Jeffrey Scott Vitter. 2013. Top-kdocument retrieval in external memory. In Proceedings of the European Symposium on Algorithms. 803–814. · Zbl 1394.68129
[63] Dekel Tsur. 2013. Top-kdocument retrieval in optimal space. Inf. Process. Lett. 113, 12, 440–443. · Zbl 1371.68071 · doi:10.1016/j.ipl.2013.03.012
[64] N. Välimäki and V. Mäkinen. 2007. Space-efficient algorithms for document retrieval. In Proceedings of the Symposium on Combinatorial Pattern Matching. 205–215.
[65] Jeffrey Scott Vitter. 2008. Algorithms and data structures for external memory. Found. Trends Theoret. Comput. Sci. 2, 4, 305–474. · Zbl 1244.68007 · doi:10.1561/0400000014
[66] P. Weiner. 1973. Linear pattern matching algorithms. In Proceedings of the Symposium on Switching and Automata Theory. 1–11. · doi:10.1109/SWAT.1973.13
[67] Dan E. Willard. 1983. Log-logarithmic worst-case range queries are possible in space Θ(N). Inf. Process. Lett. 17, 2, 81–84. · Zbl 0509.68106 · doi:10.1016/0020-0190(83)90075-3
[68] I. Witten, A. Moffat, and T. Bell. 1999. Managing Gigabytes: Compressing and Indexing Documents and Images. Morgan-Kaufmann, San Francisco, CA. · Zbl 0821.68051
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.