Model-checking techniques based on cumulative residuals.

*(English)* Zbl 1209.62168

Summary: Residuals have long been used for graphical and numerical examinations of the adequacy of regression models. Conventional residual analysis based on the plots of raw residuals or their smoothed curves is highly subjective, whereas most numerical goodness-of-fit tests provide little information about the nature of model misspecification. In this paper, we develop objective and informative model-checking techniques by taking the cumulative sums of residuals over certain coordinates (e.g., covariates or fitted values) or by considering some related aggregates of residuals, such as moving sums and moving averages. For a variety of statistical models and data structures, including generalized linear models with independent or dependent observations, the distributions of these stochastic processes under the assumed model can be approximated by the distributions of certain zero-mean Gaussian processes whose realizations can be easily generated by computer simulation. Each observed process can then be compared, both graphically and numerically, with a number of realizations from the Gaussian process. Such comparisons enable one to assess objectively whether a trend seen in a residual plot reflects model misspecification or natural variation. The proposed techniques are particularly useful in checking the functional form of a covariate and the link function. Illustrations with several medical studies are provided.
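The idea in the summary can be illustrated with a minimal sketch for the simplest case, an ordinary least-squares fit of `y` on an intercept and one covariate `x`. The function name `cumres_check`, its parameters, and the use of the supremum of the absolute process as the numerical summary are assumptions made for this illustration, not details taken from the record. The null realizations are generated by multiplying the residuals by independent standard normal variables and projecting out the fitted model, which is one common way to approximate the zero-mean Gaussian process for a linear model; the paper's construction for generalized linear models and dependent data is more general.

```python
import numpy as np


def cumres_check(x, y, n_sim=1000, seed=0):
    """Cumulative-residual check of the functional form of x in y ~ 1 + x.

    Hypothetical helper for illustration only (OLS, identity link,
    independent observations, no ties in x assumed).
    """
    rng = np.random.default_rng(seed)
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    n = len(y)

    # Fit the assumed model by least squares and form raw residuals.
    X = np.column_stack([np.ones(n), x])
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta

    # Observed process: scaled cumulative sum of residuals ordered by x.
    order = np.argsort(x)
    observed = np.cumsum(resid[order]) / np.sqrt(n)

    # Hat matrix H = X (X'X)^{-1} X'; subtracting the projection accounts
    # for the fact that the regression coefficients were estimated.
    H = X @ np.linalg.solve(X.T @ X, X.T)

    # Null realizations: perturb residuals with N(0, 1) multipliers,
    # apply (I - H) to each perturbed vector, then accumulate over x.
    G = rng.standard_normal((n_sim, n))
    perturbed = G * resid
    adjusted = perturbed - perturbed @ H  # H is symmetric
    simulated = np.cumsum(adjusted[:, order], axis=1) / np.sqrt(n)

    # Numerical comparison: supremum of the absolute process.
    sup_obs = np.max(np.abs(observed))
    sup_sim = np.max(np.abs(simulated), axis=1)
    p_value = float(np.mean(sup_sim >= sup_obs))
    return observed, simulated, p_value


# Usage on synthetic data where the true relationship is quadratic,
# so a model that is linear in x is misspecified.
rng = np.random.default_rng(1)
x_demo = rng.uniform(-2.0, 2.0, size=200)
y_demo = x_demo**2 + 0.5 * rng.standard_normal(200)
obs_demo, sim_demo, p_demo = cumres_check(x_demo, y_demo)
print("sup-type p-value:", p_demo)
```

For a graphical check, `observed` can be plotted against the sorted covariate values together with a handful of rows of `simulated`; an observed curve that wanders well outside the simulated ones suggests that the functional form of the covariate is wrong rather than that the pattern is natural variation.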

##### MSC:

62J20 | Diagnostics, and linear inference and regression |

62J12 | Generalized linear models (logistic models) |

62M99 | Inference from stochastic processes |

65C60 | Computational problems in statistics (MSC2010) |

62P10 | Applications of statistics to biology and medical sciences; meta analysis |

##### Keywords:

generalized linear models; goodness of fit; link function; longitudinal data; marginal models; model misspecification; regression diagnostics; residual plots; transformation

##### References:

[1] | Atkinson, Plots, Transformations and Regression (1985) |

[2] | Barlow, Residuals for relative risk regression, Biometrika 75 pp 65– (1988) · Zbl 0632.62102 |

[3] | Cleveland, Robust locally weighted regression and smoothing scatterplots, Journal of the American Statistical Association 74 pp 829– (1979) · Zbl 0423.62029 |

[4] | Cook, Applied Regression Including Computing and Graphics (1999) |

[5] | Cox, Regression models and life-tables, Journal of the Royal Statistical Society, Series B 34 pp 187– (1972) · Zbl 0243.62041 |

[6] | Fahrmeir, Consistency and asymptotic normality of the maximum likelihood estimator in generalized linear models, The Annals of Statistics 13 pp 342– (1985) · Zbl 0594.62058 |

[7] | Fischl, The safety and efficacy of zidovudine (AZT) in the treatment of subjects with mildly symptomatic human immunodeficiency virus type 1 (HIV) infection, Annals of Internal Medicine 112 pp 727– (1990) |

[8] | Liang, Longitudinal data analysis using generalized linear models, Biometrika 73 pp 13– (1986) · Zbl 0595.62110 |

[9] | Lin, Checking the Cox model with cumulative sums of martingale-based residuals, Biometrika 80 pp 557– (1993) · Zbl 0788.62094 |

[10] | Lin, Semi-parametric regression for the mean and rate functions of recurrent events, Journal of the Royal Statistical Society, Series B 62 pp 711– (2000) · Zbl 1074.62510 |

[11] | McCullagh, Generalized Linear Models (1989) · Zbl 0588.62104 |

[12] | Montgomery, Introduction to Statistical Quality Control (1997) |

[13] | Neter, Applied Linear Statistical Models (1996) |

[14] | Pollard, Empirical Processes: Theory and Applications (1990) |

[15] | Schoenfeld, Chi-squared goodness-of-fit tests for the proportional hazards regression model, Biometrika 67 pp 145– (1980) · Zbl 0446.62039 |

[16] | Selvin, Practical Biostatistical Methods (1995) |

[17] | Spiekerman, Checking the marginal Cox model for correlated failure time data, Biometrika 83 pp 143– (1996) · Zbl 0865.62080 |

[18] | Stute, Nonparametric model checks for regression, The Annals of Statistics 25 pp 613– (1997) · Zbl 0926.62035 |

[19] | Su, A lack-of-fit test for the mean function in a generalized linear model, Journal of the American Statistical Association 86 pp 420– (1991) |

[20] | Therneau, Martingale-based residuals for survival models, Biometrika 77 pp 147– (1990) · Zbl 0692.62082 |

[21] | Tsiatis, A note on a goodness-of-fit test for the logistic regression without replication, Biometrika 67 pp 250– (1980) · Zbl 0424.62030 |

[22] | White, Maximum likelihood estimation of misspecified models, Econometrica 50 pp 1– (1982) · Zbl 0478.62088 |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.