# zbMATH — the first resource for mathematics

Random sampling with a reservoir. (English) Zbl 0562.68028
Fast algorithms are introduced for selecting a random sample of n records without replacement from a pool of N records, where the value of N is unknown beforehand. The main result of the paper is the design and analysis of algorithm Z; it does the sampling using constant space and in $$O(n(1+\log (N/n)))$$ expected time, which is optimum, up to a constant factor. Several optimizations are studied that collectively improve the speed of the naive version of the algorithm by an order of magnitude. An efficient Pascal-like implementation is given that incorporates these modifications and that is suitable for general use. Theoretical and empirical results indicate the algorithm Z outperforms current methods by a significant margin.

##### MSC:
 68Q25 Analysis of algorithms and problem complexity 65C10 Random number generation in numerical analysis 65C99 Probabilistic methods, stochastic differential equations 62-04 Software, source code, etc. for problems pertaining to statistics
Full Text: