zbMATH — the first resource for mathematics

Rates of convergence of estimates, Kolmogorov’s entropy and the dimensionality reduction principle in regression. (English) Zbl 0909.62063
Let \((X_1,Y_1),\dots, (X_n,Y_n)\) be a random sample of \(n\) independent pairs, copies of \((X,Y)\). The random vector \(X\) takes values in \(I=[0,1]^d\). Conditionally on \(X_1= x_1,\dots, X_n=x_n\), the random variables \(Y_1,\dots, Y_n\) are independent, each having a density \(f(y\mid x_i, \theta(x_i))\), \(i=1,\dots, n\), of known form. The unknown function \(\theta\) is an element of \(\Theta_{q,d}\), the space of \(q\)-smooth real-valued functions on \(I\). As in Y. G. Yatracos [Ann. Stat. 17, No. 4, 1597-1607 (1989; Zbl 0694.62018)], the statistical interpretation of \(\theta\) is not specified; it may be, for example, a mean or a median.
In the present paper \(L_1\)-optimal estimates \(\widetilde{\theta}_n\) of \(\theta\) are constructed for models of the following two types, with or without interactions:
I. The additive supermodel,
\[ \theta(x)= \sum_{j=1}^K \theta_{1j}(b_j^Tx)+ \sum_{j=1}^L \psi_j(x_{m_1},\dots, x_{m_{r_j}}); \]
II. The multiplicative supermodel,
\[ \theta(x)= \prod_{j=1}^K \theta_{1j} (b_j^Tx)\cdot \prod_{j=1}^L \psi_j(x_{m_1},\dots, x_{m_{r_j}}). \]
Here the \(b_j\) are unit vectors in \(\mathbb{R}^d\). The parameter \(r=\max_{1\leq j\leq L}r_j\) is called the dimension of the model. For supermodels without interactions the dimension is \(r=1\).
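As an illustration (these special cases are spelled out here for orientation and are not quoted from the paper), two familiar models arise from the additive supermodel I. Taking \(K=0\), \(L=d\) and every \(r_j=1\) gives a purely additive model, while taking \(L=0\) gives a projection pursuit type model; neither involves interactions, so \(r=1\) in both cases:
\[ \theta(x)= \sum_{j=1}^d \psi_j(x_j), \qquad \theta(x)= \sum_{j=1}^K \theta_{1j}(b_j^Tx). \]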
Y. G. Yatracos [Ann. Stat. 13, 768-774 (1985; Zbl 0576.62057)] constructed \(L_1\)-estimates of a probability measure under the assumption of i.i.d. observations and related the \(L_1\)-rate of convergence of the estimates to Kolmogorov’s entropy of the parameter space. G. G. Roussas and Y. G. Yatracos [D. Pollard et al. (eds.), Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics, 337-344 (1997; Zbl 0892.62023)] provided \(L_1\)-estimates of a probability measure on the basis of observations from a \(\varphi\)-mixing sequence of random variables. All these methods, as well as the method of the present paper, are close relatives of U. Grenander’s method of sieves [“Abstract inference.” (1981; Zbl 0505.62069)].
The rates of convergence obtained for \(\widetilde{\theta}_n\) depend on Kolmogorov’s entropy of the assumed model and confirm C. J. Stone’s [Ann. Stat. 13, 689-705 (1985; Zbl 0605.62065)] heuristic dimensionality reduction principle, according to which the optimal rate of convergence is \(n^{-q/(2q+r)}\). The proof is based on the inequalities of W. Hoeffding [J. Am. Stat. Assoc. 58, 13-30 (1963; Zbl 0127.10602)]. Rates of convergence are also obtained for the error in estimating the derivatives of a regression type function.
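To see what the dimensionality reduction principle buys, the following minimal sketch compares the exponent \(q/(2q+r)\) with the exponent \(q/(2q+d)\) of the full-dimensional rate familiar from nonparametric regression. The function names and the values \(q=2\), \(d=10\) are illustrative assumptions, not taken from the paper, and all constants in the rates are ignored.

```python
def rate_exponent(q: float, dim: int) -> float:
    """Return a = q / (2q + dim), so that the rate is n ** (-a).

    `dim` is the model dimension r under the reduction principle,
    or the ambient dimension d for an unstructured q-smooth function.
    """
    if q <= 0 or dim < 1:
        raise ValueError("need q > 0 and dim >= 1")
    return q / (2 * q + dim)


def sample_size_for_error(eps: float, q: float, dim: int) -> float:
    """Solve n ** (-a) = eps for n, ignoring multiplicative constants."""
    if not 0 < eps < 1:
        raise ValueError("need 0 < eps < 1")
    return eps ** (-1.0 / rate_exponent(q, dim))


# Illustrative (assumed) values: smoothness q = 2, ambient dimension d = 10,
# and a supermodel without interactions, so r = 1.
q, d, r = 2, 10, 1
reduced = rate_exponent(q, r)   # exponent under the reduction principle
full = rate_exponent(q, d)      # exponent without any structural assumption
```

With these values the exponent rises from \(1/7\) to \(2/5\); ignoring constants, reaching an error of order \(0.1\) takes on the order of \(10^{7}\) observations in the unstructured case but only a few hundred under the supermodel.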

MSC:
62J02 General nonlinear regression
62G20 Asymptotic properties of nonparametric inference
62G05 Nonparametric estimation
62G30 Order statistics; empirical distribution functions
References:
[1] Barron, A., Birgé, L. and Massart, P. (1997). Risk bounds for model selection via penalization. Probab. Theory Related Fields. · Zbl 0946.62036
[2] Barron, A. and Cover, T. (1991). Minimum complexity density estimation. IEEE Trans. Inform. Theory 37 1034-1054. · Zbl 0743.62003
[3] Beran, R. J. (1977). Minimum Hellinger distance estimates for parametric models. Ann. Statist. 5 445-463. · Zbl 0381.62028
[4] Birgé, L. (1983). Approximation dans les espaces métriques et théorie de l’estimation. Z. Wahrsch. Verw. Gebiete 65 181-237. · Zbl 0506.62026
[5] Chaudhuri, P. (1991). Nonparametric estimates of regression quantiles and their local Bahadur representation. Ann. Statist. 19 760-777. · Zbl 0728.62042
[6] Chen, H. (1991). Estimation of a projection-pursuit regression model. Ann. Statist. 19 1142-1157. · Zbl 0736.62055
[7] Devroye, L. P. (1987). A Course in Density Estimation. Birkhäuser, Boston. · Zbl 0617.62043
[8] Devroye, L. P. and Györfi, L. (1985). Nonparametric Density Estimation: The L1-View. Wiley, New York.
[9] Donoho, D. L., Johnstone, I. M., Kerkyacharian, G. and Picard, D. (1995). Wavelet shrinkage: asymptopia? (with discussion). J. Roy. Statist. Soc. Ser. B 57 301-369. Donoho, D. L. and Liu, R. C. (1988a). The “automatic” robustness of minimum distance functionals. Ann. Statist. 16 552-586. Donoho, D. L. and Liu, R. C. (1988b). Pathologies of some minimum distance estimators. Ann. Statist. 16 587-608.
[10] Fisher, R. A. (1922). On the mathematical foundations of theoretical statistics. Philos. Trans. Roy. Soc. A 222 309-368. · JFM 48.1280.02
[11] Fisher, R. A. (1925). Theory of statistical estimation. Proc. Cambridge Philos. Soc. 22 700-725. · JFM 51.0385.01
[12] Friedman, J. H. and Stuetzle, W. (1981). Projection pursuit regression. J. Amer. Statist. Assoc. 76 817-823.
[13] Grenander, U. (1981). Abstract Inference. Wiley, New York. · Zbl 0505.62069
[14] Hoeffding, W. (1963). Probability inequalities for sums of bounded random variables. J. Amer. Statist. Assoc. 58 13-31. · Zbl 0127.10602
[15] Huber, P. J. (1985). Projection pursuit. Ann. Statist. 13 435-475. · Zbl 0595.62059
[16] Kolmogorov, A. N. and Tikhomirov, V. M. (1959). ε-entropy and ε-capacity of sets in function spaces. Uspekhi Mat. Nauk 14 3-86. (In Russian.) [Published in English in (1961) Amer. Math. Soc. Transl. (2) 17 277-364.] · Zbl 0133.06703
[17] Le Cam, L. M. (1973). Convergence of estimates under dimensionality restrictions. Ann. Statist. 1 38-53. · Zbl 0255.62006
[18] Le Cam, L. M. (1986). Asymptotic Methods in Statistical Decision Theory. Springer, New York. · Zbl 0605.62002
[19] Le Cam, L. M. and Yang, G. L. (1990). Asymptotics in Statistics: Some Basic Concepts. Springer, New York.
[20] Millar, P. W. (1982). Robust estimation via minimum distance methods. Z. Wahrsch. Verw. Gebiete 55 73-89. · Zbl 0461.62036
[21] Nicoleris, T. (1994). Selected topics in estimation. Ph.D. dissertation, Univ. Montréal.
[22] Roussas, G. G. and Yatracos, Y. G. (1996). Minimum distance regression-type estimates with rates under weak dependence. Ann. Inst. Statist. Math. 48 267-281. · Zbl 0859.62080
[23] Roussas, G. G. and Yatracos, Y. G. (1997). Minimum distance estimates with rates under mixing. In Festschrift for Lucien Le Cam: Research Papers in Probability and Statistics (D. Pollard, E. Torgersen and G. L. Yang, eds.) 337-345. Springer, New York. · Zbl 0892.62023
[24] Stone, C. J. (1982). Optimal global rates of convergence in nonparametric regression. Ann. Statist. 10 1040-1053. · Zbl 0511.62048
[25] Stone, C. J. (1985). Additive regression and other nonparametric models. Ann. Statist. 13 689-705. · Zbl 0605.62065
[27] Stone, C. J. (1994). The use of polynomial splines and their tensor product in multivariate function estimation. Ann. Statist. 22 118-184. · Zbl 0827.62038
[28] Truong, Y. K. (1989). Asymptotic properties of kernel estimators based on local medians. Ann. Statist. 17 606-617. · Zbl 0675.62031
[29] Truong, Y. K. and Stone, C. J. (1994). Semi-parametric time series regression. J. Time Ser. Anal. 15 405-428. · Zbl 0815.62024
[30] Wolfowitz, J. (1957). The minimum distance method. Ann. Math. Statist. 28 75-88. · Zbl 0086.35403
[31] Yatracos, Y. G. (1985). Rates of convergence of minimum distance estimators and Kolmogorov’s entropy. Ann. Statist. 13 768-774. · Zbl 0576.62057
[32] Yatracos, Y. G. (1988). A lower bound on the error in nonparametric regression type problems. Ann. Statist. 16 1180-1187. Yatracos, Y. G. (1989a). A regression type problem. Ann. Statist. 17 1597-1607. Yatracos, Y. G. (1989b). On the estimation of the derivatives of a function via the derivatives of an estimate. J. Multivariate Anal. 28 172-175. · Zbl 0651.62028
[33] Yatracos, Y. G. (1992). L1-optimal estimates for a regression type function in R^d. J. Multivariate Anal. 40 213-221. · Zbl 0744.62064
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.