zbMATH — the first resource for mathematics

Coordinate-independent sparse sufficient dimension reduction and variable selection. (English) Zbl 1204.62107
Summary: Sufficient dimension reduction (SDR) in regression replaces the original predictors with a minimal set of their linear combinations, without loss of information, and is very helpful when the number of predictors is large. Standard SDR methods suffer in that the estimated linear combinations usually involve all of the original predictors, which makes them difficult to interpret. We propose a unified method, coordinate-independent sparse estimation (CISE), that simultaneously achieves sparse sufficient dimension reduction and efficiently screens out irrelevant and redundant variables. CISE is subspace oriented in the sense that it couples a coordinate-independent penalty term with a broad class of model-based and model-free SDR approaches. This yields a Grassmann manifold optimization problem, for which a fast algorithm is proposed. Under mild conditions, it can be shown via manifold theory and techniques that CISE performs asymptotically as well as if the true irrelevant predictors were known, a property referred to as the oracle property. Simulation studies and a real-data example demonstrate the effectiveness and efficiency of the proposed approach.
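The key idea of a coordinate-independent penalty can be illustrated with a minimal sketch (not the authors' implementation): a row-wise group norm on a basis matrix of the reduction subspace depends only on the subspace itself, not on the particular basis chosen, and a zero row excludes that predictor from every direction simultaneously. The function name and setup below are illustrative assumptions.

```python
import numpy as np

def coordinate_independent_penalty(B):
    """Row-wise group penalty on a p x d basis matrix B.

    Summing the Euclidean norms of the rows of B penalizes whole
    predictors: a row of zeros removes that predictor from every
    linear combination spanning the subspace.
    """
    return float(np.sum(np.linalg.norm(B, axis=1)))

# Coordinate independence: rotating the basis within the subspace
# (B -> B @ O for orthogonal O) leaves each row norm, and hence the
# penalty, unchanged, so the penalty is a function of span(B) only.
rng = np.random.default_rng(0)
B = rng.standard_normal((6, 2))          # basis of a 2-dim subspace of R^6
O, _ = np.linalg.qr(rng.standard_normal((2, 2)))  # random 2x2 orthogonal matrix
print(np.isclose(coordinate_independent_penalty(B),
                 coordinate_independent_penalty(B @ O)))
```

This invariance is what makes the penalty "subspace oriented": it can be attached to any SDR objective defined on the Grassmann manifold without depending on an arbitrary choice of coordinates.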

MSC:
62H99 Multivariate analysis
62J07 Ridge regression; shrinkage estimators (Lasso)
15A18 Eigenvalues, singular values, and eigenvectors
65C60 Computational problems in statistics (MSC2010)
62H12 Estimation in multivariate analysis