zbMATH — the first resource for mathematics

When small data beats big data. (English) Zbl 06892184
Summary: Small data is sometimes preferable to big data. A high-quality small sample can produce better inferences than a low-quality large sample. Data carries acquisition, computation and privacy costs, which must be balanced against its benefits. Statistical inference works well on small data but less well on large data. Aggregating into small datasets is sometimes better than working with large individual-level data. Small data is a better starting point for teaching statistics.
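
The claim that a high-quality small sample can beat a low-quality large one can be illustrated with a short simulation. The sketch below is not taken from the paper: the Gaussian population, the selection probabilities (0.6 and 0.3), the sample size of 400 and the function name `compare_samples` are all assumptions chosen for illustration. It compares the error of a sample mean from a large, selection-biased sample with that from a small simple random sample.

```python
import random
import statistics


def compare_samples(seed=0, population_size=100_000, small_n=400):
    """Compare a large biased sample with a small simple random sample.

    All numeric settings here are illustrative assumptions, not values
    from the summarised paper.
    """
    rng = random.Random(seed)

    # Hypothetical population: Gaussian values with mean 50 and sd 10.
    population = [rng.gauss(50.0, 10.0) for _ in range(population_size)]
    true_mean = statistics.fmean(population)

    # Low-quality large sample: units above 50 are twice as likely to be
    # recorded as units at or below 50, so the sample mean is biased upward.
    large_biased = [
        x for x in population
        if rng.random() < (0.6 if x > 50.0 else 0.3)
    ]

    # High-quality small sample: simple random sample without replacement.
    small_random = rng.sample(population, small_n)

    return {
        "large_n": len(large_biased),
        "small_n": len(small_random),
        "large_error": abs(statistics.fmean(large_biased) - true_mean),
        "small_error": abs(statistics.fmean(small_random) - true_mean),
    }


if __name__ == "__main__":
    print(compare_samples())
```

Under these assumptions the bias of the large sample does not shrink as more data is collected, whereas the sampling error of the small random sample is already well below that bias at a few hundred observations.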

MSC:
62A01 Foundations and philosophical topics in statistics
62-07 Data analysis (statistics) (MSC2010)
Keywords:
big data; small data