zbMATH — the first resource for mathematics

Weighted random sampling with a reservoir. (English) Zbl 1184.68620
Summary: A new algorithm for drawing a weighted random sample of size $$m$$ from a population of n weighted items, where $$m\leqslant n$$, is presented. The algorithm can generate a weighted random sample in one-pass over unknown populations.

MSC:
 68W20 Randomized algorithms 68W10 Parallel algorithms in computer science
Full Text:
References:
 [1] B. Babcock, S. Babu, M. Datar, R. Motwani, J. Widom, Models and issues in data stream systems, in: ACM PODS, 2002, pp. 1-16 [2] Chaudhuri, S.; Motwani, R.; Narasayya, V., On random sampling over joins, (), 263-274 [3] Devroye, L., Non-uniform random variate generation, (1986), Springer-Verlag Berlin · Zbl 0593.65005 [4] Karger, D.R., Random sampling in cut, flow, and network design problems, (), 648-657 · Zbl 1344.68275 [5] Karger, D.R.; Levine, M.S., Random sampling in residual graphs, (), 63-66 · Zbl 1192.90230 [6] Knuth, D.E., Seminumerical algorithms, The art of computer programming, vol. 2, (1981), Addison-Wesley · Zbl 0477.65002 [7] Li, K.-H., Reservoir-sampling algorithms of time complexity $$\operatorname{o}(n(1 + \log(n / n)))$$, ACM trans. math. software, 20, 4, 481-493, (1994) · Zbl 0889.65147 [8] J.-H. Lin, J.S. Vitter, ε-approximations with minimum packing constraint violation, in: 24th ACM STOC, 1992, pp. 771-782 [9] Muthukrishnan, S., Data streams: algorithms and applications, Foundations trends theoret. comput. sci., 1, 2, (2005) · Zbl 1128.68025 [10] F. Olken, Random sampling from databases, PhD thesis, Department of Computer Science, University of California at Berkeley, 1993 [11] Rajan, V.; Ghosh, R.K.; Gupta, P., An efficient parallel algorithm for random sampling, Inform. process. lett., 30, 265-268, (1989) · Zbl 0665.68033 [12] Vitter, J.S., Faster methods for random sampling, Comm. ACM, 27, 7, 703-718, (1984) · Zbl 0595.65008 [13] Vitter, J.S., Random sampling with a reservoir, ACM trans. math. software, 11, 1, 37-57, (1985) · Zbl 0562.68028 [14] Wong, C.K.; Easton, M.C., An efficient method for weighted sampling without replacement, SIAM J. comput., 9, 1, 111-113, (1980) · Zbl 0447.68040
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.