Akaike’s information criterion in generalized estimating equations. (English) Zbl 1210.62099

Summary: Correlated response data are common in biomedical studies. Regression analysis based on the generalized estimating equations (GEE) is an increasingly important method for such data. However, there seem to be few model-selection criteria available in GEE. The well-known Akaike Information Criterion (AIC) cannot be directly applied since AIC is based on maximum likelihood estimation while GEE is nonlikelihood based. We propose a modification to AIC, where the likelihood is replaced by the quasi-likelihood and a proper adjustment is made for the penalty term. Its performance is investigated through simulation studies. For illustration, the method is applied to a real data set.
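
For orientation, the proposed criterion can be sketched next to AIC as follows. The summary itself gives no formula, so the notation here is an assumption rather than a quotation: \(Q\) is the quasi-likelihood evaluated under the independence working model, \(\hat\beta(R)\) is the GEE estimate under working correlation \(R\), \(\hat\Omega_I\) is the model-based information matrix under independence, \(\hat V_R\) is the robust (sandwich) covariance estimate, and \(p\) is the number of regression parameters.

```latex
% Sketch only; the symbols are assumed notation, not taken from the summary.
\begin{align*}
  \mathrm{AIC}    &= -2 \log L(\hat\beta) + 2p, \\
  \mathrm{QIC}(R) &= -2\, Q\bigl(\hat\beta(R); I\bigr)
                     + 2 \operatorname{tr}\bigl(\hat\Omega_I \hat V_R\bigr).
\end{align*}
```

Read this way, the log-likelihood is replaced by the quasi-likelihood, and the penalty \(2p\) by a trace term that is expected to be close to \(2p\) when the working model is adequate.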


62J12 Generalized linear models (logistic models)
62G08 Nonparametric regression and quantile regression
62B10 Statistical aspects of information-theoretic topics
65C60 Computational problems in statistics (MSC2010)
62P10 Applications of statistics to biology and medical sciences; meta analysis

