
Statistical significance in high-dimensional linear models. (English) Zbl 1273.62173
Summary: We propose a method for constructing \(p\)-values for general hypotheses in a high-dimensional linear model. The hypotheses can be local, testing a single regression parameter, or more global, involving several or even all parameters. Furthermore, when considering many hypotheses, we show how to adjust for multiple testing while taking the dependence among the \(p\)-values into account.
Our technique is based on ridge estimation with an additional correction term that accounts for the substantial projection bias arising in high dimensions. We prove strong error control for our \(p\)-values and provide sufficient conditions for detection: for the former, we make no assumption on the size of the true underlying regression coefficients, while for the latter, our procedure might not be optimal in terms of power. We demonstrate the method on simulated examples and in a real data application.
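As a rough illustration of the idea summarized above, a minimal sketch in Python might look as follows. This is not the paper's exact procedure: the plug-in variance formula, the Lasso-based \(\ell_1\)-norm proxy for the bias bound, and the Bonferroni-style multiplicity adjustment are all simplifications assumed here (the paper derives sharper constants and exploits the dependence among the \(p\)-values, in the spirit of Westfall-Young).

```python
import numpy as np
from numpy.linalg import inv, svd
from scipy.stats import norm
from sklearn.linear_model import LassoCV

def ridge_pvalues(X, y, lam=1.0):
    """Toy bias-corrected ridge p-values for H_0j: beta_j = 0 (p >= 2)."""
    n, p = X.shape
    Sigma_hat = X.T @ X / n
    A = inv(Sigma_hat + lam * np.eye(p))
    theta = A @ (X.T @ y) / n              # ridge estimator

    # For p > n, ridge targets P beta, where P projects onto the row
    # space of X; this is the "projection bias" the summary refers to.
    _, _, Vt = svd(X, full_matrices=False)
    P = Vt.T @ Vt                          # p x p, rank <= min(n, p)

    # Plug-in standard errors (the paper derives a sharper formula).
    resid = y - X @ theta
    sigma2 = resid @ resid / n
    cov = (sigma2 / n) * (A @ Sigma_hat @ A)
    se = np.sqrt(np.diag(cov))

    # Bound the bias |sum_{k!=j} P_jk beta_k| by max_{k!=j} |P_jk| * ||beta||_1,
    # using an initial Lasso fit as an l1-norm proxy (an assumption of
    # this sketch, not the paper's bound).
    l1 = np.abs(LassoCV(cv=5).fit(X, y).coef_).sum()
    off_max = np.array([np.max(np.abs(np.delete(P[j], j))) for j in range(p)])
    bias = off_max * l1

    # Shrink the statistic by the bias bound, then read off normal tails.
    z = np.maximum(np.abs(theta) - bias, 0.0) / np.maximum(se, 1e-12)
    pvals = 2 * norm.sf(z)
    # Crude Bonferroni adjustment; the paper's adjustment accounts for
    # dependence among the p-values and is less conservative.
    return pvals, np.minimum(pvals * p, 1.0)
```

On simulated data with, say, \(n = 100\), \(p = 500\) and a few nonzero coefficients, `ridge_pvalues(X, y)` returns raw and adjusted \(p\)-values for every coordinate; under the null the bias shrinkage keeps the statistics conservative, which mirrors the strong error control claimed in the summary at the possible cost of power.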

MSC:
62J07 Ridge regression; shrinkage estimators (Lasso)
62J05 Linear regression; mixed models
62H15 Hypothesis testing in multivariate analysis
62F05 Asymptotic properties of parametric tests
65C60 Computational problems in statistics (MSC2010)
Software:
boost; ElemStatLearn