Nonparametric estimation of regression functions with both categorical and continuous data. (English) Zbl 1337.62062
Summary: In this paper we propose a method for nonparametric regression which admits continuous and categorical data in a natural manner using the method of kernels. A data-driven method of bandwidth selection is proposed, and we establish the asymptotic normality of the estimator. We also establish the rate of convergence of the cross-validated smoothing parameters to their benchmark optimal smoothing parameters. Simulations suggest that the new estimator performs much better than the conventional nonparametric estimator in the presence of mixed data. An empirical application to a widely used and publicly available dynamic panel of patent data demonstrates that the out-of-sample squared prediction error of our proposed estimator is only 14–20% of that obtained by some popular parametric approaches which have been used to model this data set.
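
The estimator described in the summary can be illustrated with a minimal sketch. It is not the authors' implementation. It assumes a local-constant (Nadaraya-Watson) estimator with a product kernel: a Gaussian kernel for each continuous regressor and, for each categorical regressor, a kernel that gives weight 1 to a matching category and weight lam in [0, 1] to a mismatch. The summary does not specify these kernel choices, and the small grid search is a stand-in for the paper's data-driven bandwidth selection. All function names (mixed_kernel_weights, nw_predict, loo_cv_score) are hypothetical.

```python
import numpy as np

def mixed_kernel_weights(xc, xd, xc0, xd0, h, lam):
    """Product-kernel weights of all sample points relative to (xc0, xd0).

    xc: (n, p) continuous regressors, xd: (n, q) categorical regressors,
    h: (p,) bandwidths > 0, lam: (q,) smoothing parameters in [0, 1].
    Normalising constants are omitted because they cancel in the ratio below.
    """
    u = (xc - xc0) / h
    k_cont = np.exp(-0.5 * u**2).prod(axis=1)            # Gaussian kernel (assumed)
    k_disc = np.where(xd == xd0, 1.0, lam).prod(axis=1)  # 1 on match, lam on mismatch
    return k_cont * k_disc

def nw_predict(y, xc, xd, xc0, xd0, h, lam):
    """Local-constant estimate of E[y | xc0, xd0]."""
    w = mixed_kernel_weights(xc, xd, xc0, xd0, h, lam)
    return w @ y / w.sum()

def loo_cv_score(y, xc, xd, h, lam):
    """Mean squared leave-one-out prediction error for given (h, lam)."""
    n = len(y)
    total = 0.0
    for i in range(n):
        w = mixed_kernel_weights(xc, xd, xc[i], xd[i], h, lam)
        w[i] = 0.0                      # leave observation i out
        denom = w.sum()
        if denom <= 0.0:                # no usable neighbours for this point
            return np.inf
        total += (y[i] - w @ y / denom) ** 2
    return total / n

# Synthetic illustration (made-up data, not the patent panel from the paper).
rng = np.random.default_rng(0)
n = 80
xc = rng.uniform(-2.0, 2.0, size=(n, 1))
xd = rng.integers(0, 2, size=(n, 1))
y = np.sin(xc[:, 0]) + 0.5 * xd[:, 0] + 0.1 * rng.standard_normal(n)

# Coarse grid search over (h, lam); a numerical optimiser could replace it.
candidates = [
    (loo_cv_score(y, xc, xd, np.array([h]), np.array([lam])), h, lam)
    for h in (0.1, 0.2, 0.4, 0.8)
    for lam in (0.0, 0.25, 0.5, 1.0)
]
best_score, best_h, best_lam = min(candidates)
print(f"CV score {best_score:.4f} at h={best_h}, lam={best_lam}")
```

Setting lam = 0 reproduces the conventional approach of splitting the sample into cells by category, while lam = 1 ignores the categorical regressor entirely. Intermediate values borrow information across categories, which is the sense in which categorical data are smoothed in this sketch.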

62G05 Nonparametric estimation
62G08 Nonparametric regression and quantile regression