
Prediction error estimation under Bregman divergence for non-parametric regression and classification. (English) Zbl 1195.62043

A nonparametric model is considered in which the conditional distribution of the response \(Y\) given the covariate \(X\) belongs to an exponential family with canonical parameter \(\vartheta(X)\), where \(\vartheta\) is an unknown smooth function. The problem is to estimate \(\vartheta(\cdot)\). (When \(Y\) is discrete, this setting covers some classification problems.) The author proposes a new algorithm for computing an approximate cross-validation (CV) criterion when a Bregman loss function is used. The algorithm is computationally cheaper than naive leave-one-out algorithms. The asymptotic behavior of the resulting criterion is investigated. Applications to bandwidth selection for nonparametric local logistic regression are discussed, and results of simulations and real data examples are presented.
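For orientation, the following is a minimal sketch of the two ingredients named in the review: a Bregman loss and the naive leave-one-out CV criterion that an approximate criterion is meant to replace. It is not the author's algorithm. The divergence is written in the standard convex-generator form \(D_\phi(y,\mu)=\phi(y)-\phi(\mu)-\phi'(\mu)(y-\mu)\); the paper's own notation may differ. A Nadaraya-Watson smoother with a Gaussian kernel stands in for local logistic regression, and all function names are hypothetical.

```python
import numpy as np

def bregman_divergence(phi, grad_phi, y, mu):
    """D_phi(y, mu) = phi(y) - phi(mu) - phi'(mu) * (y - mu), phi convex."""
    y = np.asarray(y, dtype=float)
    mu = np.asarray(mu, dtype=float)
    return phi(y) - phi(mu) - grad_phi(mu) * (y - mu)

# Generator phi(t) = t^2 gives the squared-error loss (y - mu)^2.
def square(t):
    return np.asarray(t, dtype=float) ** 2

def square_grad(t):
    return 2.0 * np.asarray(t, dtype=float)

def _xlogx(t):
    # t * log(t) with the convention 0 * log(0) = 0.
    t = np.asarray(t, dtype=float)
    return np.where(t > 0, t * np.log(np.where(t > 0, t, 1.0)), 0.0)

# Generator phi(t) = t log t + (1 - t) log(1 - t) gives, for y in {0, 1},
# the Bernoulli negative log-likelihood (half the deviance).
def neg_entropy(t):
    t = np.asarray(t, dtype=float)
    return _xlogx(t) + _xlogx(1.0 - t)

def neg_entropy_grad(t):
    t = np.asarray(t, dtype=float)
    return np.log(t / (1.0 - t))  # requires 0 < t < 1

def naive_loo_cv(x, y, h):
    """Naive leave-one-out CV under the Bernoulli Bregman loss.

    Assumption: a Nadaraya-Watson smoother with Gaussian kernel and
    bandwidth h is used as a simple stand-in for local logistic regression.
    Assumes h is large enough that every point has a neighbour with
    non-negligible kernel weight.
    """
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    K = np.exp(-0.5 * ((x[:, None] - x[None, :]) / h) ** 2)
    np.fill_diagonal(K, 0.0)              # drop observation i when predicting at x_i
    mu = K @ y / K.sum(axis=1)            # leave-one-out fitted probabilities
    mu = np.clip(mu, 1e-6, 1.0 - 1e-6)    # keep the logit finite
    return bregman_divergence(neg_entropy, neg_entropy_grad, y, mu).mean()

def select_bandwidth(x, y, grid):
    """Return the bandwidth in `grid` with the smallest naive CV score."""
    scores = [naive_loo_cv(x, y, h) for h in grid]
    return grid[int(np.argmin(scores))]
```

For the stand-in smoother the leave-one-out fit is available in closed form, so the naive criterion is cheap here. For an iteratively fitted estimator such as local logistic regression, each left-out point would require a separate refit at every bandwidth, and that is the cost an approximate CV criterion is designed to avoid.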

MSC:

62G08 Nonparametric regression and quantile regression
62J12 Generalized linear models (logistic models)
62H30 Classification and discrimination; cluster analysis (statistical aspects)
65C60 Computational problems in statistics (MSC2010)
62G20 Asymptotic properties of nonparametric inference

Software:

ElemStatLearn; Excel
