# zbMATH — the first resource for mathematics

Robust and efficient variable selection for semiparametric partially linear varying coefficient model based on modal regression. (English) Zbl 1281.62109
Summary: Semiparametric partially linear varying coefficient models (SPLVCMs) are frequently used in statistical modeling. When the covariates in both the parametric and nonparametric parts of an SPLVCM are high-dimensional, sparse modeling is often considered in practice. We propose a new estimation and variable selection procedure based on modal regression, in which the nonparametric functions are approximated by a B-spline basis. The main merit of the proposed variable selection procedure is that it achieves both robustness and efficiency by introducing an additional tuning parameter (the bandwidth h). Its oracle property is also established for both the parametric and nonparametric parts. Moreover, we give a data-driven bandwidth selection method and an EM-type algorithm for the proposed method. A Monte Carlo simulation study and a real data example are used to examine the finite-sample performance of the proposed method. Both the simulation results and the real data analysis confirm that the newly proposed method works very well.
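The summary mentions an EM-type algorithm and a bandwidth h but gives no details. As a rough illustration of the general idea only, the sketch below fits a plain linear modal regression by maximizing a Gaussian-kernel objective with an EM-type iteration: the E-step computes kernel weights from the current residuals and the M-step solves a weighted least-squares problem. It is not the paper's procedure: the B-spline approximation of the varying coefficients, the sparsity penalty, and the data-driven bandwidth selection are all omitted, and the function name `modal_linear_regression`, the fixed bandwidth, and the simulated data are assumptions made for this example.

```python
import numpy as np


def modal_linear_regression(X, y, h, n_iter=200, tol=1e-8):
    """EM-type iteration for linear modal regression (illustrative sketch).

    Maximizes sum_i phi_h(y_i - x_i' beta) with a Gaussian kernel phi_h,
    where h is the bandwidth. Hypothetical helper, not the paper's code.
    """
    # Start from ordinary least squares.
    beta = np.linalg.lstsq(X, y, rcond=None)[0]
    for _ in range(n_iter):
        # E-step: kernel weights from current residuals.
        resid = y - X @ beta
        w = np.exp(-0.5 * (resid / h) ** 2)
        w /= w.sum()
        # M-step: weighted least squares with those weights.
        XtW = X.T * w
        beta_new = np.linalg.solve(XtW @ X, XtW @ y)
        converged = np.max(np.abs(beta_new - beta)) < tol
        beta = beta_new
        if converged:
            break
    return beta


# Simulated data (assumed for illustration): a line plus 10% gross outliers.
rng = np.random.default_rng(0)
n = 200
x = rng.uniform(0.0, 1.0, n)
X = np.column_stack([np.ones(n), x])
beta_true = np.array([1.0, 2.0])
y = X @ beta_true + 0.1 * rng.standard_normal(n)
y[:20] += 20.0  # contaminate the first 20 responses

beta_ols = np.linalg.lstsq(X, y, rcond=None)[0]
beta_modal = modal_linear_regression(X, y, h=1.0)
print("OLS:  ", beta_ols)
print("modal:", beta_modal)
```

In this sketch a small h downweights observations far from the current fit, which is where the robustness comes from, while a very large h makes all weights nearly equal and the iteration approaches ordinary least squares. This is the robustness and efficiency trade-off that the bandwidth controls.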

##### MSC:
62G08 Nonparametric regression and quantile regression
62G35 Nonparametric robustness
65D07 Numerical computation using splines
62F10 Point estimation
62F07 Statistical ranking and selection procedures
65C05 Monte Carlo methods
##### Keywords:
B-splines; oracle property; robustness; efficiency
##### References:
Cai, Z., Xiao, Z. (2012). Semiparametric quantile regression estimation in dynamic models with partially varying coefficients. Journal of Econometrics, 167, 413-425. · Zbl 1441.62623
Cai, Z., Fan, J., Li, R. (2000). Efficient estimation and inference for varying-coefficient models. Journal of the American Statistical Association, 95, 888-902. · Zbl 0999.62052
Candes, E., Tao, T. (2007). The Dantzig selector: statistical estimation when $$p$$ is much larger than $$n$$. The Annals of Statistics, 35, 2313-2351. · Zbl 1139.62019
Cheng, M., Zhang, W., Chen, L. (2009). Statistical estimation in generalized multiparameter likelihood models. Journal of the American Statistical Association, 104, 1179-1191. · Zbl 1388.62160
Fairfield, K., Fletcher, R. (2002). Vitamins for chronic disease prevention in adults: scientific review. The Journal of the American Medical Association, 287, 3116-3126.
Fan, J., Gijbels, I. (1996). Local polynomial modelling and its application. New York: Chapman and Hall. · Zbl 0873.62037
Fan, J., Huang, T. (2005). Profile likelihood inferences on semiparametric varying-coefficient partially linear models. Bernoulli, 11, 1031-1057. · Zbl 1098.62077
Fan, J., Li, R. (2001). Variable selection via nonconcave penalized likelihood and its oracle properties. Journal of the American Statistical Association, 96, 1348-1360. · Zbl 1073.62547
Fan, J., Zhang, W. (1999). Statistical estimation in varying coefficient models. The Annals of Statistics, 27, 1491-1518. · Zbl 0977.62039
Fan, J., Zhang, W. (2000). Simultaneous confidence bands and hypotheses testing in varying-coefficient models. Scandinavian Journal of Statistics, 27, 715-731. · Zbl 0962.62032
Hastie, T., Tibshirani, R. (1993). Varying-coefficient model. Journal of the Royal Statistical Society, Series B, 55, 757-796. · Zbl 0796.62060
Huang, J., Wu, C., Zhou, L. (2002). Varying-coefficient models and basis function approximation for the analysis of repeated measurements. Biometrika, 89, 111-128. · Zbl 0998.62024
Kai, B., Li, R., Zou, H. (2011). New efficient estimation and variable selection methods for semiparametric varying-coefficient partially linear models. The Annals of Statistics, 39, 305-332. · Zbl 1209.62074
Lam, C., Fan, J. (2008). Profile-kernel likelihood inference with diverging number of parameters. The Annals of Statistics, 36, 2232-2260. · Zbl 1274.62289
Lee, M. (1989). Mode regression. Journal of Econometrics, 42, 337-349. · Zbl 0692.62092
Leng, C. (2009). A simple approach for varying-coefficient model selection. Journal of Statistical Planning and Inference, 139, 2138-2146. · Zbl 1160.62067
Li, J., Palta, M. (2009). Bandwidth selection through cross-validation for semi-parametric varying-coefficient partially linear models. Journal of Statistical Computation and Simulation, 79, 1277-1286. · Zbl 1178.62039
Li, J., Zhang, W. (2011). A semiparametric threshold model for censored longitudinal data analysis. Journal of the American Statistical Association, 106, 685-696. · Zbl 1232.62065
Li, J., Ray, S., Lindsay, B. (2007). A nonparametric statistical approach to clustering via mode identification. Journal of Machine Learning Research, 8, 1687-1723. · Zbl 1222.62076
Li, Q., Huang, C., Li, D., Fu, T. (2002). Semiparametric smooth coefficient models. Journal of Business and Economic Statistics, 3, 412-422.
Li, R., Liang, H. (2008). Variable selection in semiparametric regression modeling. The Annals of Statistics, 36, 261-286. · Zbl 1132.62027
Lin, Z., Yuan, Y. (2012). Variable selection for generalized varying coefficient partially linear models with diverging number of parameters. Acta Mathematicae Applicatae Sinica, English Series, 28, 237-246. · Zbl 1360.62183
Lu, Y. (2008). Generalized partially linear varying-coefficient models. Journal of Statistical Planning and Inference, 138, 901-914. · Zbl 1130.62036
Nierenberg, D., Stukel, T., Baron, J., Dain, B., Greenberg, E. (1989). Determinants of plasma levels of beta-carotene and retinol. American Journal of Epidemiology, 130, 511-521.
Schumaker, L. (1981). Spline functions: basic theory. New York: Wiley.
Scott, D. (1992). Multivariate density estimation: theory, practice and visualization. New York: Wiley. · Zbl 0850.62006
Stone, C. (1982). Optimal global rates of convergence for nonparametric regression. The Annals of Statistics, 10, 1040-1053. · Zbl 0511.62048
Tang, Y., Wang, H., Zhu, Z., Song, X. (2012). A unified variable selection approach for varying coefficient models. Statistica Sinica, 22, 601-628. · Zbl 1238.62021
Tibshirani, R. (1996). Regression shrinkage and selection via the LASSO. Journal of the Royal Statistical Society, Series B, 58, 267-288. · Zbl 0850.62538
Wang, H., Zhu, Z., Zhou, J. (2009). Quantile regression in partially linear varying coefficient models. The Annals of Statistics, 37, 3841-3866. · Zbl 1191.62077
Wang, L., Li, H., Huang, J. (2008). Variable selection in nonparametric varying-coefficient models for analysis of repeated measurements. Journal of the American Statistical Association, 103, 1556-1569. · Zbl 1286.62034
Xia, Y., Zhang, W., Tong, H. (2004). Efficient estimation for semivarying-coefficient models. Biometrika, 91, 661-681. · Zbl 1108.62019
Xie, H., Huang, J. (2009). SCAD-penalized regression in high-dimensional partially linear models. The Annals of Statistics, 37, 673-696. · Zbl 1162.62037
Yao, W., Li, L. (2011). A new regression model: modal linear regression. Technical report, Kansas State University, Manhattan. http://www-personal.ksu.edu/ wxyao/ · Zbl 1309.62119
Yao, W., Lindsay, B., Li, R. (2012). Local modal regression. Journal of Nonparametric Statistics, 24, 647-663. · Zbl 1254.62059
Zhang, C. (2010). Nearly unbiased variable selection under minimax concave penalty. The Annals of Statistics, 38, 894-942. · Zbl 1183.62120
Zhang, W., Lee, S., Song, X. (2002). Local polynomial fitting in semivarying coefficient model. Journal of Multivariate Analysis, 82, 166-188. · Zbl 0995.62038
Zhao, P., Xue, L. (2009). Variable selection for semiparametric varying coefficient partially linear models. Statistics and Probability Letters, 79, 2148-2157. · Zbl 1171.62026
Zou, H. (2006). The adaptive LASSO and its oracle properties. Journal of the American Statistical Association, 101, 1418-1429. · Zbl 1171.62326
Zou, H., Hastie, T. (2005). Regularization and variable selection via the elastic net. Journal of the Royal Statistical Society, Series B, 67, 301-320. · Zbl 1069.62054
Zou, H., Li, R. (2008). One-step sparse estimates in nonconcave penalized likelihood models (with discussion). The Annals of Statistics, 36, 1509-1533. · Zbl 1142.62027
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.