Fast calibrations of the forward search for testing multiple outliers in regression. (English) Zbl 1301.62069

Summary: The paper considers the problem of testing for multiple outliers in a regression model and provides fast approximations to the null distribution of the minimum deletion residual, which is used as the test statistic. Since direct simulation for each combination of number of observations and number of parameters is too time-consuming, methods using simple normal samples are described for approximating the pointwise distribution of the test statistic. One approximation is based on adjustments to the results of simple simulations. The other uses properties of order statistics from folded \(t\) distributions to reach significance levels beyond those available by simulation. Analyses of data with beta errors and of transformed data on survival times demonstrate the usefulness of including our bounds in graphical methods.
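As a rough illustration of the idea of pointwise bounds built from order statistics of folded \(t\) variables, the sketch below estimates quantiles of the k-th smallest of n independent |t| draws by plain Monte Carlo. It is not the paper's calibration: it ignores how the forward search selects its subset and how the variance is estimated from that subset. The function name and the parameters `n`, `k`, `df`, `probs`, `n_sim`, and `seed` are all hypothetical choices made for this example.

```python
import random


def folded_t_order_stat_quantiles(n, k, df, probs, n_sim=2000, seed=0):
    """Monte Carlo quantiles of the k-th smallest of n i.i.d. |t_df| draws.

    Illustrative assumption only: independent folded-t variables stand in
    for the absolute deletion residuals.
    """
    rng = random.Random(seed)
    draws = []
    for _ in range(n_sim):
        sample = []
        for _ in range(n):
            z = rng.gauss(0.0, 1.0)
            # Chi-square with df degrees of freedom = Gamma(shape=df/2, scale=2)
            chi2 = rng.gammavariate(df / 2.0, 2.0)
            sample.append(abs(z / (chi2 / df) ** 0.5))
        sample.sort()
        draws.append(sample[k - 1])  # k-th order statistic (1-based)
    draws.sort()
    # Simple empirical quantile: index int(p * n_sim), capped at the last draw
    return [draws[min(n_sim - 1, int(p * n_sim))] for p in probs]


# Lower bound, median, and upper bound for one hypothetical configuration
envelope = folded_t_order_stat_quantiles(n=50, k=40, df=10, probs=[0.01, 0.5, 0.99])
print(envelope)
```

Repeating the call over a range of `k` would give a pointwise envelope against which an observed sequence of statistics could be plotted. A simulation of this kind is limited to significance levels of roughly 1/`n_sim`, which is the kind of restriction that an analytical order-statistic argument avoids.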


62J05 Linear regression; mixed models
62P10 Applications of statistics to biology and medical sciences; meta analysis



