Oracle inequalities for weighted group Lasso in high-dimensional misspecified Cox models. (English) Zbl 1503.62080

Summary: We study the nonasymptotic properties of a general norm-penalized estimator, which includes the Lasso, weighted Lasso, and group Lasso as special cases, for sparse high-dimensional misspecified Cox models with time-dependent covariates. Under suitable conditions on the true regression coefficients and random covariates, we provide oracle inequalities for the prediction and estimation errors based on the group sparsity of the true coefficient vector. The nonasymptotic oracle inequalities show that the penalized estimator gives a good sparse approximation of the true model and makes it possible to select a few meaningful structure variables from the set of features.
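
The relation between the general penalty and its special cases can be illustrated with a minimal sketch. It is not taken from the paper: the function names, the averaging by n, and the simplifications (time-fixed covariates, no tied event times) are assumptions made for illustration only, whereas the paper treats time-dependent covariates. With singleton groups the penalty reduces to the weighted Lasso, and with unit weights as well it reduces to the ordinary Lasso.

```python
import math


def weighted_group_lasso_penalty(beta, groups, weights, lam):
    """Return lam * sum_g weights[g] * ||beta[groups[g]]||_2.

    groups is a list of index lists that partition the coefficients;
    weights holds one nonnegative weight per group (hypothetical names).
    """
    total = 0.0
    for idx, w in zip(groups, weights):
        total += w * math.sqrt(sum(beta[j] ** 2 for j in idx))
    return lam * total


def neg_log_partial_likelihood(beta, X, time, event):
    """Cox negative log partial likelihood, divided by the sample size.

    Simplifying assumptions (not from the paper): covariates are fixed
    in time and no two event times are tied.
    """
    n = len(time)
    eta = [sum(b * x for b, x in zip(beta, row)) for row in X]
    loss = 0.0
    for i in range(n):
        if event[i]:
            # Risk set: subjects still under observation at time[i].
            risk = [eta[k] for k in range(n) if time[k] >= time[i]]
            m = max(risk)  # log-sum-exp shift for numerical stability
            log_sum = m + math.log(sum(math.exp(r - m) for r in risk))
            loss -= eta[i] - log_sum
    return loss / n


def penalized_objective(beta, X, time, event, groups, weights, lam):
    """Objective whose minimizer would be the penalized estimator."""
    return (neg_log_partial_likelihood(beta, X, time, event)
            + weighted_group_lasso_penalty(beta, groups, weights, lam))


beta = [3.0, 4.0, 0.0, -2.0]
# Group Lasso: coefficients 0 and 1 form one group.
group_pen = weighted_group_lasso_penalty(beta, [[0, 1], [2], [3]], [1, 1, 1], 0.5)
# Lasso as a special case: singleton groups, unit weights.
lasso_pen = weighted_group_lasso_penalty(beta, [[0], [1], [2], [3]], [1, 1, 1, 1], 0.5)
```

Minimizing `penalized_objective` over `beta` is left out on purpose; any solver choice would be a further assumption beyond what the summary states.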

MSC:

62N02 Estimation in survival analysis and censored data
62J07 Ridge regression; shrinkage estimators (Lasso)
62G20 Asymptotic properties of nonparametric inference
62J05 Linear regression; mixed models
62G08 Nonparametric regression and quantile regression

Software:

KEGG

References:

[1] Andersen, P. K.; Borgan, O.; Gill, R. D.; Keiding, N., Statistical Models Based on Counting Processes (1993), Berlin: Springer · Zbl 0769.62061
[2] Andersen, P. K.; Gill, R. D., Cox’s regression model for counting processes: a large sample study, Ann. Stat., 10, 4, 1100-1120 (1982) · Zbl 0526.62026
[3] Bartlett, P. L.; Mendelson, S.; Neeman, J., L1-regularized linear regression: persistence and oracle inequalities, Probab. Theory Relat. Fields, 154, 1, 193-224 (2012) · Zbl 1395.62207
[4] Bickel, P. J.; Ritov, Y. A.; Tsybakov, A. B., Simultaneous analysis of lasso and Dantzig selector, Ann. Stat., 37, 1705-1732 (2009) · Zbl 1173.62022
[5] Blazere, M.; Loubes, J. M.; Gamboa, F., Oracle inequalities for a group lasso procedure applied to generalized linear models in high dimension, IEEE Trans. Inf. Theory, 60, 4, 2303-2318 (2014) · Zbl 1360.94123
[6] Cox, D. R., Regression models and life-tables, J. R. Stat. Soc., Ser. B, Methodol., 34, 187-220 (1972) · Zbl 0243.62041
[7] Cox, D. R., Partial likelihood, Biometrika, 62, 269-276 (1975) · Zbl 0312.62002
[8] Dvoretzky, A.; Kiefer, J.; Wolfowitz, J., Asymptotic minimax character of the sample distribution function and of the classical multinomial estimator, Ann. Math. Stat., 27, 3, 642-669 (1956) · Zbl 0073.14603
[9] Fan, J.; Li, R., Variable selection for Cox’s proportional hazards model and frailty model, Ann. Stat., 30, 74-99 (2002) · Zbl 1012.62106
[10] Greenshtein, E.; Ritov, Y. A., Persistence in high-dimensional linear predictor selection and the virtue of overparametrization, Bernoulli, 10, 6, 971-988 (2004) · Zbl 1055.62078
[11] Honda, T.; Härdle, W. K., Variable selection in Cox regression models with varying coefficients, J. Stat. Plan. Inference, 148, 67-81 (2014) · Zbl 1432.62338
[12] Huang, H.; Gao, Y.; Zhang, H.; Li, B., Weighted lasso estimates for sparse logistic regression: non-asymptotic properties with measurement error, Acta Math. Sci. (2021, in press); arXiv preprint, arXiv:2006.06136
[13] Huang, J.; Sun, T.; Ying, Z.; Yu, Y.; Zhang, C. H., Oracle inequalities for the lasso in the Cox model, Ann. Stat., 41, 3, 1142-1165 (2013) · Zbl 1292.62135
[14] Kanehisa, M.; Goto, S., KEGG: Kyoto encyclopedia of genes and genomes, Nucleic Acids Res., 28, 1, 27-30 (2000)
[15] Knight, K.; Fu, W., Asymptotics for lasso-type estimators, Ann. Stat., 28, 1356-1378 (2000) · Zbl 1105.62357
[16] Kong, S.; Nan, B., Non-asymptotic oracle inequalities for the high-dimensional Cox regression via lasso, Stat. Sin., 24, 1, 25-42 (2014) · Zbl 1416.62404
[17] Lemler, S., Oracle inequalities for the lasso in the high-dimensional Aalen multiplicative intensity model, Ann. Inst. Henri Poincaré Probab. Stat., 52, 2, 981-1008 (2016) · Zbl 1342.62158
[18] Lounici, K.; Pontil, M.; Van De Geer, S.; Tsybakov, A. B., Oracle inequalities and optimal inference under group sparsity, Ann. Stat., 39, 4, 2164-2204 (2011) · Zbl 1306.62156
[19] Massart, P., The tight constant in the Dvoretzky-Kiefer-Wolfowitz inequality, Ann. Probab., 18, 1269-1283 (1990) · Zbl 0713.62021
[20] Rosenwald, A.; Wright, G.; Chan, W. C.; Connors, J. M.; Campo, E.; Fisher, R. I.; Giltnane, J. M., The use of molecular profiling to predict survival after chemotherapy for diffuse large-B-cell lymphoma, N. Engl. J. Med., 346, 25, 1937-1947 (2002)
[21] Struthers, C. A.; Kalbfleisch, J. D., Misspecified proportional hazard models, Biometrika, 73, 2, 363-369 (1986) · Zbl 0606.62108
[22] Talagrand, M., Sharper bounds for Gaussian and empirical processes, Ann. Probab., 22, 28-76 (1994) · Zbl 0798.60051
[23] Tibshirani, R., Regression shrinkage and selection via the lasso, J. R. Stat. Soc., Ser. B, Methodol., 58, 267-288 (1996) · Zbl 0850.62538
[24] Tibshirani, R., The lasso method for variable selection in the Cox model, Stat. Med., 16, 4, 385-395 (1997)
[25] van der Vaart, A. W.; Wellner, J. A., Weak Convergence and Empirical Processes: With Applications to Statistics (1996), Berlin: Springer · Zbl 0862.60002
[26] Wainwright, M. J., High-Dimensional Statistics: A Non-asymptotic Viewpoint (2019), Cambridge: Cambridge University Press · Zbl 1457.62011
[27] Wang, S.; Nan, B.; Zhu, N.; Zhu, J., Hierarchically penalized Cox regression with grouped variables, Biometrika, 96, 2, 307-322 (2009) · Zbl 1163.62089
[28] Yan, J.; Huang, J., Model selection for Cox models with time-varying coefficients, Biometrics, 68, 2, 419-428 (2012) · Zbl 1251.62052
[29] Yuan, M.; Lin, Y., Model selection and estimation in regression with grouped variables, J. R. Stat. Soc., Ser. B, Stat. Methodol., 68, 1, 49-67 (2006) · Zbl 1141.62030
[30] Zhang, D. X., Tail bounds for the suprema of empirical processes over unbounded classes of functions, Acta Math. Sin., 22, 339-345 (2006) · Zbl 1112.62043
[31] Zhang, H.; Chen, S. X., Concentration inequalities for statistical inference, arXiv preprint, arXiv:2011.02258
[32] Zhang, H.; Jia, J., Elastic-net regularized high-dimensional negative binomial regression: consistency and weak signals detection, Stat. Sin. (2021)
[33] Zhang, H.; Wu, X., Compound Poisson point processes, concentration and oracle inequalities, J. Inequal. Appl., 2019, 1 (2019) · Zbl 1499.60155
[34] Zhang, H. H.; Lu, W., Adaptive lasso for Cox’s proportional hazards model, Biometrika, 94, 3, 691-703 (2007) · Zbl 1135.62083
[35] Zhao, H.; Wu, Q.; Li, G.; Sun, J., Simultaneous estimation and variable selection for interval-censored data with broken adaptive ridge regression, J. Am. Stat. Assoc., 115, 204-216 (2020) · Zbl 1437.62283
[36] Zhou, S.; Zhou, J.; Zhang, B., High-dimensional generalized linear models incorporating graphical structure among predictors, Electron. J. Stat., 13, 2, 3161-3194 (2019) · Zbl 1431.62322
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.