zbMATH — the first resource for mathematics

A comparative study of several smoothing methods in density estimation. (English) Zbl 0937.62518
Summary: The theory of bandwidth choice in density estimation is developing very fast. Several methods (with many varieties and subvarieties) have recently been proposed as alternatives to least squares cross-validation, the standard for years. This paper includes (a) a critical, up-to-date review of the main methods currently available; the discussion provides some new insights on the important problem of estimating the minimization criteria and on the choice of pilot bandwidths in bootstrap-based methods; (b) an extensive simulation study of ten selected bandwidths; (c) a final discussion with some recommendations for practitioners. The conclusions are not easily summarized in a few words, because different cases have to be considered and important nuances must be pointed out. However, we could mention that the classical cross-validation bandwidths show, generally speaking, a relatively poor behavior (this is especially clear for the pseudo-likelihood method). On the other hand, although no selector appears to be uniformly better, the plug-in selector (in a version similar to that proposed by S. J. Sheather and M. C. Jones [J. R. Stat. Soc., Ser. B 53, 683-690 (1991; Zbl 0800.62219)]) and the (smoothed) bootstrap-based selectors show a fairly satisfactory performance, which suggests that they could be the new standard methods for the problem of smoothing in density estimation. Interesting results are also obtained for a new type of bandwidth based on the number of inflection points.
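For readers unfamiliar with the baseline the summary refers to, the following is a minimal sketch of least squares cross-validation for a Gaussian kernel density estimate. It is an illustration only, not the code used in the paper: the function names (`gaussian_pdf`, `lscv_score`, `lscv_bandwidth`), the Gaussian kernel, the grid search, and the simulated sample are all assumptions made here. The score is the usual criterion, the integral of the squared estimate minus twice the average leave-one-out estimate at the data points, which has a closed form for the Gaussian kernel.

```python
import numpy as np


def gaussian_pdf(u, sigma):
    """Density of N(0, sigma^2) evaluated at u."""
    return np.exp(-0.5 * (u / sigma) ** 2) / (sigma * np.sqrt(2.0 * np.pi))


def lscv_score(data, h):
    """Least squares cross-validation score for bandwidth h (Gaussian kernel)."""
    x = np.asarray(data, dtype=float)
    n = x.size
    diffs = x[:, None] - x[None, :]  # all pairwise differences X_i - X_j
    # Integral of the squared estimate: the convolution of two Gaussian
    # kernels with scale h is a Gaussian with scale sqrt(2) * h.
    integral_sq = gaussian_pdf(diffs, np.sqrt(2.0) * h).sum() / n**2
    # Average leave-one-out estimate at the data points (diagonal removed).
    k = gaussian_pdf(diffs, h)
    loo = (k.sum() - np.trace(k)) / (n * (n - 1))
    return integral_sq - 2.0 * loo


def lscv_bandwidth(data, grid):
    """Return the grid value minimizing the score, plus all scores."""
    scores = np.array([lscv_score(data, h) for h in grid])
    return float(grid[int(np.argmin(scores))]), scores


# Hypothetical usage on simulated standard normal data.
rng = np.random.default_rng(0)
sample = rng.normal(size=200)
grid = np.linspace(0.05, 1.5, 60)
h_cv, scores = lscv_bandwidth(sample, grid)
```

A grid search is used here only for simplicity. The score can have several local minima, so a plain numerical optimizer started from a single point may return a different bandwidth.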

MSC:
62-XX Statistics
References:
[1] Bowman, A. W.: An alternative method of cross-validation for the smoothing of density estimates. Biometrika 71, 353-360 (1984)
[2] Bowman, A. W.: A comparative study of some kernel-based nonparametric density estimators. J. statist. Comput. simul. 21, 313-327 (1985) · Zbl 0565.62026
[3] Broniatowski, M.: Convergence L1 presque sûre de l’estimateur de la densité obtenu par validation croisée. CR acad. Sc. Paris 10, 487-490 (1986) · Zbl 0595.62036
[4] Broniatowski, M.; Deheuvels, P.; Devroye, L.: On the relationship between stability of extreme order statistics and convergence of the maximum likelihood kernel density estimate. Ann. statist. 17, 1070-1086 (1989) · Zbl 0701.62045
[5] Cao-Abad, R.: Aplicaciones y nuevos resultados del método bootstrap en la estimación no paramétrica de curvas. Ph. D. Dissertation (1990)
[6] Chiu, S. T.: Bandwidth selection for kernel density estimation. Ann. statist. 19, 1883-1905 (1991) · Zbl 0749.62022
[7] Chow, Y. S.; Geman, S.; Wu, L. D.: Consistent cross-validated density estimation. Ann. statist. 11, 25-38 (1983) · Zbl 0509.62033
[8] Cuevas, A.; González-Manteiga, W.: Data-driven smoothing based on convexity properties. Nonparametric functional estimation and related topics, 225-240 (1991) · Zbl 0806.62024
[9] Devroye, L.: A course in density estimation. (1987) · Zbl 0617.62043
[10] Devroye, L.: The double kernel method in density estimation. Ann. inst. Henri Poincaré 25, 533-580 (1989) · Zbl 0701.62044
[11] Devroye, L.; Györfi, L.: Nonparametric density estimation: the L1-view. (1985) · Zbl 0546.62015
[12] Faraway, J. J.; Jhun, M.: Bootstrap choice of bandwidth for density estimation. J. amer. Statist. assoc. 85, 1119-1122 (1990)
[13] Feluch, W.; Koronacki, J.: A note on modified cross-validation in density estimation. Comp. statist. Data anal. 13, 143-151 (1992) · Zbl 0850.62343
[14] Habbema, J. D. F.; Hermans, J.; Den Broeck, K. Van: A stepwise discriminant analysis program using density estimation. Compstat 1974: Proceedings in computational statistics, 101-110 (1974)
[15] Hall, P.: Large-sample optimality of least-squares cross-validation in density estimation. Ann. statist. 11, 1156-1174 (1983) · Zbl 0599.62051
[16] Hall, P.: On Kullback–Leibler loss and density estimation. Ann. statist. 15, 1491-1519 (1987) · Zbl 0678.62045
[17] Hall, P.: Using the bootstrap to estimate mean squared error and select smoothing parameter in nonparametric problems. J. multiv. Anal. 32, 177-203 (1990) · Zbl 0722.62030
[18] Hall, P.; Johnstone, I.: Empirical functionals and efficient smoothing parameter selection. J.R. statist. Soc. B 54, 475-530 (1992) · Zbl 0786.62050
[19] Hall, P.; Marron, J. S.: On the amount of noise inherent in bandwidth selection for a kernel density estimator. Ann. statist. 15, 163-181 (1987) · Zbl 0667.62022
[20] Hall, P.; Marron, J. S.: Extent to which least-squares cross-validation minimises integrated square error in nonparametric density estimation. Probab. th. Rel. fields 74, 567-581 (1987) · Zbl 0588.62052
[21] Hall, P.; Marron, J. S.: Estimation of integrated square density derivatives. Statist. probab. Lett. 6, 109-115 (1987) · Zbl 0628.62029
[22] Hall, P.; Marron, J. S.: Lower bounds for bandwidth selection in density estimation. Probab. th. Rel. fields 90, 143-173 (1991) · Zbl 0742.62041
[23] Hall, P.; Marron, J. S.: Local minima in cross-validation functions. J.R. statist. Soc. B 53, 245-252 (1991) · Zbl 0800.62216
[24] Hall, P.; Marron, J. S.; Park, B.: Smoothed cross-validation. Probab. th. Rel. fields, 1-20 (1992) · Zbl 0742.62042
[25] Hall, P.; Sheather, S. J.; Jones, M. C.; Marron, J. S.: On optimal data-based bandwidth selection in kernel density estimation. Biometrika 78, 263-269 (1991) · Zbl 0733.62045
[26] Hall, P.; Wand, M.: Minimizing L1-distance in nonparametric density estimation. J. multiv. Anal. 26, 59-88 (1988) · Zbl 0673.62030
[27] Jones, M. C.; Kappenman, R. F.: On a class of kernel density estimate bandwidth selectors. (1990) · Zbl 0767.62035
[28] Jones, M. C.; Marron, J. S.; Park, B. U.: A simple root n bandwidth selector. Ann. statist. 19, 1919-1932 (1991) · Zbl 0745.62033
[29] Jones, M. C.; Marron, J. S.; Sheather, S. J.: Progress in data-based selection for kernel density estimation. (1992) · Zbl 0897.62037
[30] Jones, M. C.; Sheather, S. J.: Using non-stochastic terms to advantage in kernel-based estimation of integrated squared density derivatives. Statist. probab. Lett. 11, 511-514 (1991) · Zbl 0724.62040
[31] Léger, C.; Romano, J. P.: Bootstrap choice of tuning parameters. Ann. inst. Statist. math. 42, 709-735 (1990) · Zbl 0722.62032
[32] Mammen, E.: A short note on optimal bandwidth selection for kernel estimators. Statist. probab. Lett. 9, 23-25 (1990) · Zbl 0687.62025
[33] Mammen, E.: On qualitative smoothness of kernel density estimates. Manuscript (1990) · Zbl 0811.62046
[34] Mammen, E.; Marron, J. S.; Fisher, N. I.: Some asymptotics for multimodality tests based on kernel density estimates. Probab. th. Rel. fields 91, 115-132 (1992) · Zbl 0745.62048
[35] Marron, J. S.: An asymptotically efficient solution to the bandwidth problem of kernel density estimation. Ann. statist. 13, 1011-1023 (1985) · Zbl 0585.62073
[36] Marron, J. S.: A comparison of cross-validation techniques in density estimation. Ann. statist. 15, 152-162 (1987) · Zbl 0619.62032
[37] Marron, J. S.: Bootstrap bandwidth selection. Exploring the limits of bootstrap, 249-262 (1992) · Zbl 0838.62029
[38] Marron, J. S.: Root n bandwidth selection. Nonparametric functional estimation and related topics, 251-260 (1991)
[39] Park, B. U.; Marron, J. S.: Comparison of data-driven bandwidth selectors. J. amer. Statist. assoc. 85, 66-72 (1990)
[40] Park, B. U.; Turlach, B. A.: Practical performance of several data driven bandwidth selectors. (1992) · Zbl 0775.62100
[41] Parzen, E.: On estimation of a probability density function and mode. Ann. math. Statist. 33, 1065-1076 (1962) · Zbl 0116.11302
[42] Quintela, A.: Cálculo del parámetro de suavización en la estimación no paramétrica de curvas con datos dependientes. Ph. D. Dissertation (1992)
[43] Rosenblatt, M.: Remarks on some nonparametric estimates of a density function. Ann. math. Statist. 27, 832-837 (1956) · Zbl 0073.14602
[44] Scott, D. W.; Terrell, G. R.: Biased and unbiased cross-validation in density estimation. J. amer. Statist. assoc. 82, 1131-1146 (1987) · Zbl 0648.62037
[45] Sheather, S. J.; Jones, M. C.: A reliable data-based bandwidth selection method for kernel density estimation. J. roy. Statist. soc. Ser. B 53, 683-690 (1991) · Zbl 0800.62219
[46] Silverman, B. W.: Using kernel density estimates to investigate multimodality. J. roy. Statist. soc. B 43, 97-99 (1981)
[47] Silverman, B. W.: Weak and strong uniform consistency of the kernel estimate of a density and its derivatives. Ann. statist. 6, 177-184 (1978) · Zbl 0376.62024
[48] Silverman, B. W.: Some properties of a test for multimodality based on kernel density estimates. Probability, statistics and analysis, 248-259 (1983) · Zbl 0504.62036
[49] Silverman, B. W.: Density estimation for statistics and data analysis. (1986) · Zbl 0617.62042
[50] Stone, C. J.: An asymptotically optimal window selection rule for kernel density estimates. Ann. statist. 12, 1285-1297 (1984) · Zbl 0599.62052
[51] Stute, W.: Modified cross-validation in density estimation. J. statist. Plann. inf. 30, 293-305 (1992) · Zbl 0756.62019
[52] Tapia, R. A.; Thompson, J. R.: Nonparametric probability density estimation. (1978) · Zbl 0449.62029
[53] Taylor, C. C.: Bootstrap choice of the smoothing parameter in kernel density estimation. Biometrika 76, 705-712 (1989) · Zbl 0678.62042
[54] Wand, M. P.; Marron, J. S.; Ruppert, D.: Transformations in density estimation. J. amer. Statist. assoc. 86, 343-361 (1991) · Zbl 0742.62046
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.