×

zbMATH — the first resource for mathematics

Time-optimal top-\(k\) document retrieval. (English) Zbl 1359.68053

MSC:
68P05 Data structures
68W32 Algorithms on strings
PDF BibTeX Cite
Full Text: DOI
References:
[1] N. Alon and M. Naor, Derandomization, witnesses for Boolean matrix multiplication and construction of perfect hash functions, Algorithmica, 16 (1996), pp. 434–449. · Zbl 0857.68055
[2] S. Alstrup, T. Husfeldt, and T. Rauhe, Marked ancestor problems, in Proceedings of the 39th IEEE Annual Symposium on Foundations of Computer Science (FOCS), IEEE Computer Society, Los Alamitos, CA, 1998, pp. 534–544.
[3] A. Amir, M. Farach, Z. Galil, R. Giancarlo, and K. Park, Dynamic dictionary matching, J. Comput. System Sci., 49 (1994), pp. 208–222. · Zbl 0942.68783
[4] A. Amir, M. Farach, R. M. Idury, J. A. Lapoutre, and A. A. Schaffer, Improved dynamic dictionary matching, Inform. and Comput., 119 (1995), pp. 258–282. · Zbl 0832.68033
[5] A. Andersson and M. Thorup, Dynamic ordered sets with exponential search trees, J. ACM, 54 (2007), 13. · Zbl 1292.68038
[6] V. Anh and A. Moffat, Pruned query evaluation using pre-computed impacts, in Proceedings of the 29th Annual International ACM SIGIR Conference on Research and Development in Information Retrieval (SIGIR), ACM, New York, 2006, pp. 372–379.
[7] A. Apostolico, The myriad virtues of subword trees, in Combinatorial Algorithms on Words, NATO ISI Ser. Comput. Syst. Sci. 12, Springer, Berlin, 1985, pp. 85–96. · Zbl 0572.68067
[8] L. Arge and J. S. Vitter, Optimal external memory interval management, SIAM J. Comput., 32 (2003), pp. 1488–1508. · Zbl 1030.68027
[9] R. Baeza-Yates and B. Ribeiro-Neto, Modern Information Retrieval, 2nd ed., Addison-Wesley, New York, 2011.
[10] D. Belazzougui, P. Boldi, R. Pagh, and S. Vigna, Fast prefix search in little space, with applications, in Proceedings of the 18th Annual European Symposium on Algorithms (ESA), Pt. 1, Lecture Notes in Comput. Sci. 6346, Springer, Berlin, 2010, pp. 427–438. · Zbl 1287.68188
[11] D. Belazzougui and G. Navarro, Alphabet-independent compressed text indexing, ACM Trans. Algorithms, 10 (2014), 23. · Zbl 1325.68307
[12] D. Belazzougui, G. Navarro, and D. Valenzuela, Improved compressed indexes for full-text document retrieval, J. Discrete Algorithms, 18 (2013), pp. 3–13. · Zbl 1268.68075
[13] M. Bender, R. Cole, E. Demaine, M. Farach-Colton, and J. Zito, Two simplified algorithms for maintaining order in a list, in Proceedings of the 10th Annual European Symposium on Algorithms (ESA), Lecture Notes in Comput. Sci. 2461, Springer, Berlin, 2002, pp. 152–164. · Zbl 1019.68527
[14] M. Bender and M. Farach-Colton, The LCA problem revisited, in Proceedings of the 4th Latin American Theoretical Informatics Symposium (LATIN), Lecture Notes in Comput. Sci. 1776, Springer, Berlin, 2000, pp. 88–94. · Zbl 0959.68133
[15] A. Blumer, J. Blumer, D. Haussler, R. McConnell, and A. Ehrenfeucht, Complete inverted files for efficient text retrieval and analysis, J. ACM, 34 (1987), pp. 578–595. · Zbl 0554.68058
[16] G. Brodal, R. Fagerberg, M. Greve, and A. López-Ortiz, Online sorted range reporting, in Proceedings of the 20th International Symposium on Algorithms and Computation (ISAAC), Lecture Notes in Comput. Sci. 5878, Springer, Berlin, 2009, pp. 173–182. · Zbl 1272.68113
[17] T. Chan, K. G. Larsen, and M. Pǎtraşcu, Orthogonal range searching on the RAM, revisited, in Proceedings of the 27th ACM Symposium on Computational Geometry (SoCG), ACM, New York 2011, pp. 1–10. · Zbl 1283.68139
[18] B. Chazelle, A functional approach to data structures and its use in multidimensional searching, SIAM J. Comput., 17 (1988), pp. 427–462. · Zbl 0679.68074
[19] R. Cole and R. Hariharan, Dynamic LCA queries on trees, SIAM J. Comput., 34 (2005), pp. 894–923. · Zbl 1075.68019
[20] M. Crochemore and W. Rytter, Jewels of Stringology, World Scientific, Singapore, 2002. · Zbl 1078.68151
[21] S. Culpepper, G. Navarro, S. Puglisi, and A. Turpin, Top-\(k\) ranked document search in general text databases, in Proceedings of the 18th Annual European Symposium on Algorithms (ESA), Lecture Notes in Comput. Sci. 6347, Springer, Berlin, 2010, pp. 194–205. · Zbl 1287.68035
[22] P. Dietz and R. Raman, Persistence, randomization and parallelization: On some combinatorial games and their applications (abstract), in Proceedings of the 3rd Workshop on Algorithms and Data Structures (WADS), Lecture Notes in Comput. Sci. 709, Springer, Berlin, 1993, pp. 289–301.
[23] P. Dietz and D. Sleator, Two algorithms for maintaining order in a list, in Proceedings of the 19th Annual ACM Symposium on Theory of Computing (STOC), ACM, New York, 1987, pp. 365–372.
[24] M. Farach, Optimal suffix tree construction with large alphabets, in Proceedings of the 38th Annual Symposium on Foundations of Computer Science (FOCS), IEEE Computer Society, Los Alamitos, CA, 1997, pp. 137–143.
[25] P. Ferragina, G. Manzini, V. Mäkinen, and G. Navarro, Compressed representations of sequences and full-text indexes, ACM Trans, Algorithms, 3 (2007), 20. · Zbl 1321.68263
[26] J. Fischer and P. Gawrychowski, Alphabet-dependent string searching with wexponential search trees, in Proceedings of the 26th Annual Symposium on Combinatorial Pattern Matching (CPM), Lecture Notes in Comput. Sci. 9133, Springer, Cham, Switzerland, 2015, pp. 160–171. · Zbl 1432.68087
[27] J. Fischer and V. Heun, Space-efficient preprocessing schemes for range minimum queries on static arrays, SIAM J. Comput., 40 (2011), pp. 465–492. · Zbl 1222.05024
[28] R. Fleischer, A simple balanced search tree with O(1) worst-case update time, Internat. J. Found. Comput. Sci., 7 (1996), pp. 137–149. · Zbl 0852.68020
[29] M. Fredman, J. Komlós, and E. Szemerédi, Storing a sparse table with O(1) worst case access time, J. ACM, 31 (1984), pp. 538–544. · Zbl 0629.68068
[30] H. N. Gabow, J. L. Bentley, and R. E. Tarjan, Scaling and related techniques for geometry problems, in Proceedings of the 16th ACM Symposium on Theory of Computing (STOC), ACM, New York, 1984, pp. 135–143.
[31] T. Gagie, J. Kärkkäinen, G. Navarro, and S. J. Puglisi, Colored range queries and document retrieval, Theoret. Comput. Sci., 483 (2013), pp. 36–50. · Zbl 1292.68045
[32] T. Gagie, G. Navarro, and S. J. Puglisi, New algorithms on wavelet trees and applications to information retrieval, Theoret. Comput. Sci., 426–427 (2012), pp. 25–41. · Zbl 1243.68161
[33] S. Gog and G. Navarro, Improved single-term top-\(k\) document retrieval, in Proceedings of the 17th Workshop on Algorithm Engineering and Experiments (ALENEX), SIAM, Philadelphia, 2015, pp. 24–32.
[34] R. Grossi, A. Gupta, and J. Vitter, High-order entropy-compressed text indexes, in Proceedings of the 14th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), SIAM, Philadelphia, 2003, pp. 841–850. · Zbl 1092.68584
[35] R. Grossi and J. S. Vitter, Compressed suffix arrays and suffix trees with applications to text indexing and string matching, SIAM J. Comput., 35 (2005), pp. 378–407. · Zbl 1092.68115
[36] D. Gusfield, Algorithms on Strings, Trees and Sequences: Computer Science and Computational Biology, Cambridge University Press, Cambridge, 1997. · Zbl 0934.68103
[37] W.-K. Hon, M. Patil, R. Shah, and S.-Bin Wu, Efficient index for retrieving top-\(k\) most frequent documents, J. Discrete Algorithms, 8 (2010), pp. 402–417. · Zbl 1215.68095
[38] W.-K. Hon, R. Shah, and S. Thankachan, Towards an optimal space-and-query-time index for top-\(k\) document retrieval, in Proceedings of the 23rd Annual Symposium on Combinatorial Pattern Matching (CPM), Lecture Notes in Comput. Sci. 7354, Springer, Berlin, 2012, pp. 173–184. · Zbl 1358.68092
[39] W.-K. Hon, R. Shah, S. Thankachan, and J. Vitter, Faster compressed top-\(k\) document retrieval, in Proceedings of the 23rd Data Compression Conference (DCC), IEEE, Piscataway, NJ, 2013, pp. 341–350.
[40] W.-K. Hon, R. Shah, S. V. Thankachan, and J. S. Vitter, Space-efficient frameworks for top-\(k\) string retrieval, J. ACM, 61 (2014), 9. · Zbl 1295.68230
[41] W.-K. Hon, R. Shah, and J. Vitter, Space-efficient framework for top-\(k\) string retrieval problems, in Proceedings of the 50th IEEE Annual Symposium on Foundations of Computer Science (FOCS), IEEE Computer Society, Los Alamitos, CA, 2009, pp. 713–722. · Zbl 1292.68182
[42] M. Karpinski and Y. Nekrich, Top-\(k\) color queries for document retrieval, in Proceedings of the 22nd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), SIAM, Philadelphia, 2011, pp. 401–411. · Zbl 1373.68197
[43] D. Knuth, The Art of Computer Programming, Vol. 3: Sorting and Searching, Addison-Wesley, Reading, MA, 1973. · Zbl 0302.68010
[44] R. Konow and G. Navarro, Faster compact top-\(k\) document retrieval, in Proceedings of the 23rd Data Compression Conference (DCC), IEEE, Piscataway, NJ, 2013, pp. 351–360.
[45] K. G. Larsen, J. I. Munro, J. S. Nielsen, and S. V. Thankachan, On hardness of several string indexing problems, in Proceedings of the 25th Annual Symposium on Combinatorial Pattern Matching (CPM), Lecture Notes in Comput. Sci. 8486, Springer, Cham, Switzerland, 2014, pp. 242–251. · Zbl 1407.68229
[46] V. Mäkinen and G. Navarro, Rank and select revisited and extended, Theoret. Comput. Sci. 387 (2007), pp. 332–347. · Zbl 1144.68023
[47] Y. Matias, S. Muthukrishnan, S. C. Sahinalp, and J. Ziv, Augmenting suffix trees, with applications, in Proceedings of the 6th European Symposium on Algorithms (ESA), Lecture Notes in Comput. Sci. 1461, Springer, Berlin, 1998, pp. 67–78.
[48] E. McCreight, A space-economical suffix tree construction algorithm, J. ACM, 23 (1976), pp. 262–272. · Zbl 0329.68042
[49] E. M. McCreight, Priority search trees, SIAM J. Comput., 14 (1985), pp. 257–276. · Zbl 0564.68050
[50] I. Munro, Tables, in Proceedings of the 16th Conference on Foundations of Software Technology and Theoretical Computer Science (FSTTCS), Lecture Notes in Comput. Sci. 1180, Springer, Berlin, 1996, pp. 37–42.
[51] S. Muthukrishnan, Efficient algorithms for document retrieval problems, in Proceedings of the 13th Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), ACM, New York, 2002, pp. 657–666. · Zbl 1093.68588
[52] G. Navarro, Spaces, trees and colors: The algorithmic landscape of document retrieval on sequences, ACM Comput. Surv., 46 (2014), 52. · Zbl 1305.68078
[53] G. Navarro and Y. Nekrich, Top-\(k\) document retrieval in optimal time and linear space, in Proceedings of the 23rd Annual ACM-SIAM Symposium on Discrete Algorithms (SODA), ACM, New York, 2012, pp. 1066–1078. · Zbl 1422.68063
[54] G. Navarro, Y. Nekrich, and L. Russo, Space-efficient data-analysis queries on grids, Theoret. Comput. Sci., 482 (2013), pp. 60–72. · Zbl 1291.68155
[55] G. Navarro, S. J. Puglisi, and D. Valenzuela, General document retrieval in compact space, ACM J. Exp. Algorithmics, 19 (2014), 3. · Zbl 1347.68103
[56] G. Navarro and S. V. Thankachan, New space/time tradeoffs for top-\(k\) document retrieval on sequences, Theoret. Comput. Sci., 542 (2014), pp. 83–97. · Zbl 1317.68049
[57] Y. Nekrich, A linear space data structure for orthogonal range reporting and emptiness queries, Internat. J. Comput. Geom. Appl., 19 (2009), pp. 1–15. · Zbl 1177.68060
[58] M. H. Overmars, The Design of Dynamic Data Structures, Lecture Notes in Comput. Sci. 156, Springer, Berlin, 1983. · Zbl 0545.68009
[59] M. H. Overmars and J. van Leeuwen, Worst-case optimal insertion and deletion methods for decomposable searching problems, Inform. Process. Lett., 12 (1981), pp. 168–173. · Zbl 0459.68026
[60] M. Persin, J. Zobel, and R. Sacks-Davis, Filtered document retrieval with frequency-sorted indexes, J. Amer. Soc. Inform. Sci., 47 (1996), pp. 749–764.
[61] L. Russo, G. Navarro, and A. Oliveira, Fully-compressed suffix trees, ACM Trans. Algorithms, 7 (2011), 53. · Zbl 1295.68103
[62] M. Ružić, Constructing efficient dictionaries in close to sorting time, in Proceedings of the 35th International Colloquium on Automata, Languages and Programming (ICALP), Lecture Notes in Comput. Sci. 5125, Springer, Berlin, 2008, pp. 84–95. · Zbl 1152.68429
[63] K. Sadakane, Compressed suffix trees with full functionality, Theory Comput. Syst., 41 (2007), pp. 589–607. · Zbl 1148.68015
[64] K. Sadakane, Succinct data structures for flexible text retrieval systems, J. Discrete Algorithms, 5 (2007), pp. 12–22. · Zbl 1137.68360
[65] R. Shah, C. Sheng, S. V. Thankachan, and J. Vitter, Top-\(k\) document retrieval in external memory, in Proceedings of the 21st Annual European Symposium on Algorithms (ESA), Lecture Notes in Comput. Sci. 8125, Springer, Berlin, 2013, pp. 803–814. · Zbl 1394.68129
[66] W. Szpankowski, A generalized suffix tree and its (un)expected asymptotic behaviors, SIAM J. Comput., 22 (1993), pp. 1176–1198. · Zbl 0799.68050
[67] M. Thorup, Integer priority queues with decrease key in constant time and the single source shortest paths problem, J. Comput. System Sci., 69 (2004), pp. 330–353. · Zbl 1084.68030
[68] D. Tsur, Top-\(k\) document retrieval in optimal space, Inform. Process. Lett., 113 (2013), pp. 440–443. · Zbl 1371.68071
[69] E. Ukkonen, On-line construction of suffix trees, Algorithmica, 14 (1995), pp. 249–260. · Zbl 0831.68027
[70] N. Välimäki and V. Mäkinen, Space-efficient algorithms for document retrieval, in Proceedings of the 18th Annual Symposium on Combinatorial Pattern Matching (CPM), Lecture Notes in Comput. Sci. 4580, Springer, Berlin, 2007, pp. 205–215. · Zbl 1138.68401
[71] P. Weiner, Linear pattern matching algorithm, in Proceedings of the 14th Annual IEEE Symposium on Switching and Automata Theory, IEEE Computer Society, Northridge, CA, 1973, pp. 1–11.
[72] J. Zobel and A. Moffat, Inverted files for text search engines, ACM Comput. Surv., 38 (2006), 6.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.