New statistics to test log-linear modeling hypothesis with no distributional specifications and clusters with homogeneous correlation. (English) Zbl 1435.62203

Summary: Traditionally, the Dirichlet-multinomial distribution has been recognized as a key model for contingency tables generated by cluster sampling schemes. There are, however, other distributions appropriate for these contingency tables. This paper introduces new statistics for testing log-linear modeling hypotheses without specifying the underlying distribution, when the individuals within each cluster are possibly homogeneously correlated. An estimator of the intracluster correlation coefficient, valid for unequal cluster sizes, plays a crucial role in the construction of the goodness-of-fit test statistics.

MSC:

62H17 Contingency tables
62G10 Nonparametric hypothesis testing
62H30 Classification and discrimination; cluster analysis (statistical aspects)
