zbMATH — the first resource for mathematics

Optimal bandwidth selection in nonparametric regression function estimation. (English) Zbl 0594.62043
Let \((X,Y),(X_1,Y_1),(X_2,Y_2),\dots\) be i.i.d. \((d+1)\)-dimensional random vectors with \(Y\) real-valued. Consider the problem of estimating the regression function \[ m(x)=E(Y\mid X=x) \] using \((X_1,Y_1),\dots,(X_n,Y_n)\). In this paper, kernel estimators with a data-driven bandwidth are investigated. The bandwidth selection rule \(\hat h\) chooses \(h\) to minimize \[ CV(h)=n^{-1}\sum_{j=1}^{n}(Y_j-\hat m_j(X_j))^2 W(X_j), \] where \[ \hat m_j(x)=(n-1)^{-1}\sum_{i\neq j}h^{-d}K((x-X_i)/h)\,Y_i/\hat f_j(x),\quad \hat f_j(x)=(n-1)^{-1}\sum_{i\neq j}h^{-d}K((x-X_i)/h), \] and \(W(x)\) is a weight function. Here \(\hat m_j\) and \(\hat f_j\) are the leave-one-out kernel estimators of the regression function and of the marginal density of \(X\), computed without the \(j\)-th observation. The authors establish asymptotic optimality for this bandwidth selection rule, which can be interpreted in terms of cross-validation. They settle an open problem of C. J. Stone [Ann. Stat. 10, 1040-1053 (1982; Zbl 0511.62048)] regarding the optimal rate uniformly over smoothness classes and show that these selection rules are important in exploratory data analysis.
Reviewer: A.K.Basu
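
As a rough illustration of the selection rule described in the review, the following Python sketch evaluates \(CV(h)\) on a grid of candidate bandwidths for \(d=1\) and returns the minimizer. The Gaussian kernel, the grid, the indicator weight function, the synthetic data, and all function names are assumptions made for illustration; none of them is taken from the paper.

```python
import math
import random


def gaussian_kernel(u):
    """Standard normal density, used as the kernel K (an assumed choice)."""
    return math.exp(-0.5 * u * u) / math.sqrt(2.0 * math.pi)


def loo_estimate(xs, ys, j, h):
    """Leave-one-out kernel regression estimate at xs[j], for d = 1.

    The common factor (n-1)^{-1} h^{-d} cancels between numerator and
    density estimate, so it is omitted. Returns None if the leave-one-out
    density estimate is zero (no other point carries kernel weight).
    """
    num = 0.0
    den = 0.0
    for i, (xi, yi) in enumerate(zip(xs, ys)):
        if i == j:
            continue  # drop the j-th observation
        w = gaussian_kernel((xs[j] - xi) / h)
        num += w * yi
        den += w
    return num / den if den > 0.0 else None


def cv_score(xs, ys, h, weight=lambda x: 1.0):
    """CV(h) = n^{-1} * sum_j (Y_j - m_j(X_j))^2 * W(X_j)."""
    total = 0.0
    for j in range(len(xs)):
        m_j = loo_estimate(xs, ys, j, h)
        if m_j is None:
            return math.inf  # treat an undefined estimate as unusable
        total += (ys[j] - m_j) ** 2 * weight(xs[j])
    return total / len(xs)


def select_bandwidth(xs, ys, grid, weight=lambda x: 1.0):
    """Return the bandwidth in `grid` with the smallest CV score."""
    return min(grid, key=lambda h: cv_score(xs, ys, h, weight))


if __name__ == "__main__":
    # Synthetic data (hypothetical): m(x) = sin(2*pi*x) plus Gaussian noise.
    random.seed(0)
    xs = [random.random() for _ in range(200)]
    ys = [math.sin(2.0 * math.pi * x) + random.gauss(0.0, 0.3) for x in xs]
    grid = [0.01 * k for k in range(1, 31)]
    # Assumed weight: indicator of an interior interval, to damp boundary effects.
    interior = lambda x: 1.0 if 0.1 <= x <= 0.9 else 0.0
    print("selected bandwidth:", select_bandwidth(xs, ys, grid, interior))
```

The grid search costs \(O(n^2)\) kernel evaluations per candidate bandwidth, which is adequate for a sketch but not for large samples.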

62G05 Nonparametric estimation
62G20 Asymptotic properties of nonparametric inference