zbMATH — the first resource for mathematics

Robust and efficient variable selection for semiparametric partially linear varying coefficient model based on modal regression. (English) Zbl 1281.62109
Summary: Semiparametric partially linear varying coefficient models (SPLVCMs) are frequently used in statistical modeling. When the covariates in both the parametric and nonparametric parts of an SPLVCM are high-dimensional, sparse modeling is often considered in practice. We propose a new estimation and variable selection procedure based on modal regression, in which the nonparametric functions are approximated by a B-spline basis. The main merit of the proposed variable selection procedure is that it achieves both robustness and efficiency by introducing an additional tuning parameter, the bandwidth h. Its oracle property is established for both the parametric and nonparametric parts. Moreover, we give a data-driven bandwidth selection method and propose an EM-type algorithm for computing the estimator. A Monte Carlo simulation study and a real data example are used to examine the finite-sample performance of the proposed method. Both the simulation results and the real data analysis indicate that the proposed method performs well.
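To illustrate how an EM-type algorithm for modal regression can trade robustness against efficiency through the bandwidth h, here is a minimal sketch for a plain straight-line model. It is not the procedure of the paper: it omits the B-spline approximation, the varying coefficients, and the penalty used for variable selection. The Gaussian kernel, the function names (`gaussian_kernel`, `weighted_line_fit`, `modal_line_fit`), the toy data, and the value h = 0.5 are all assumptions made for this example.

```python
import math

def gaussian_kernel(u, h):
    """Gaussian kernel with bandwidth h, evaluated at residual u."""
    return math.exp(-0.5 * (u / h) ** 2) / (h * math.sqrt(2.0 * math.pi))

def weighted_line_fit(x, y, w):
    """Weighted least squares for y ~ a + b*x; returns (a, b)."""
    sw = sum(w)
    mx = sum(wi * xi for wi, xi in zip(w, x)) / sw
    my = sum(wi * yi for wi, yi in zip(w, y)) / sw
    sxx = sum(wi * (xi - mx) ** 2 for wi, xi in zip(w, x))
    sxy = sum(wi * (xi - mx) * (yi - my) for wi, xi, yi in zip(w, x, y))
    b = sxy / sxx
    return my - b * mx, b

def modal_line_fit(x, y, h, n_iter=200, tol=1e-10):
    """EM-type iteration that maximises sum_i K_h(y_i - a - b*x_i)."""
    # Start from ordinary least squares (all weights equal).
    a, b = weighted_line_fit(x, y, [1.0] * len(x))
    for _ in range(n_iter):
        # E-step: kernel weights; points with large residuals get tiny weight.
        w = [gaussian_kernel(yi - a - b * xi, h) for xi, yi in zip(x, y)]
        # M-step: weighted least squares with the current weights.
        a_new, b_new = weighted_line_fit(x, y, w)
        converged = abs(a_new - a) + abs(b_new - b) < tol
        a, b = a_new, b_new
        if converged:
            break
    return a, b

# Toy data (hypothetical): true line y = 1 + 2x with small deterministic
# perturbations, plus three gross outliers at the largest x values.
x = [i / 10.0 for i in range(30)]
y = [1.0 + 2.0 * xi + 0.05 * math.sin(i) for i, xi in enumerate(x)]
for i in (27, 28, 29):
    y[i] += 15.0

a_ols, b_ols = weighted_line_fit(x, y, [1.0] * len(x))
a_mod, b_mod = modal_line_fit(x, y, h=0.5)
print("OLS slope:  ", round(b_ols, 3))
print("Modal slope:", round(b_mod, 3))
```

In this sketch a small h downweights observations with large residuals, which gives robustness to the outliers, while a very large h makes all weights nearly equal, so the fit approaches ordinary least squares. A data-driven choice of h, as mentioned in the summary, is not implemented here.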

62G08 Nonparametric regression and quantile regression
62G35 Nonparametric robustness
65D07 Numerical computation using splines
62F10 Point estimation
62F07 Statistical ranking and selection procedures
65C05 Monte Carlo methods