
Sample-size calculation for tests of homogeneity. (English. French summary) Zbl 1357.62084

Summary: Mixture models are widely used to explain excessive variation in observations that is not captured by standard parametric models, and they lead to suggestive latent structures. The hypothetical latent structure often needs critical examination based on experimental data. It is therefore important to know the sample size needed to ensure a reasonable chance of success. We investigate this issue for the EM-test and the \(C(\alpha)\) test. They are shown to be asymptotically equivalent and have simple limiting distributions under two sets of local alternatives for commonly used mixture models. We obtain a simple sample-size formula and an associated simulation-based calibration procedure, and we demonstrate via data examples and simulation studies that they provide useful guidance for several common mixture models.

MSC:

62F03 Parametric hypothesis testing
62H30 Classification and discrimination; cluster analysis (statistical aspects)
62-07 Data analysis (statistics) (MSC2010)
