An improved variable selection procedure for adaptive Lasso in high-dimensional survival analysis. (English) Zbl 1429.62535

Summary: Motivated by high-dimensional genomic studies, we develop an improved procedure for adaptive Lasso in high-dimensional survival analysis. The proposed procedure reduces false discoveries while maintaining the false negative proportion, thereby improving on existing adaptive Lasso procedures. Its implementation is straightforward, and it is flexible enough to accommodate large-scale problems where traditional procedures are impractical. To quantify the uncertainty of variable selection and control the family-wise error rate, a testing algorithm based on multiple sample splitting is developed. The practical utility of the proposed procedure is examined through simulation studies. The methods developed are then applied to a multiple myeloma data set.

MSC:

62P10 Applications of statistics to biology and medical sciences; meta analysis
92D10 Genetics and epigenetics
62J07 Ridge regression; shrinkage estimators (Lasso)
62N02 Estimation in survival analysis and censored data

Software:

ElemStatLearn

References:

[1] Alexander DH, Lange K (2011) Stability selection for genome-wide association. Genet Epidemiol 35(7):722-728 · doi:10.1002/gepi.20623
[2] Bataille R, Grenier J, Sany J (1984) Beta-2-microglobulin in myeloma: optimal use for staging, prognosis, and treatment-a prospective study of 160 patients. Blood 63(2):468-476
[3] Bühlmann P, van de Geer S (2011) Statistics for high-dimensional data: methods, theory and applications. Springer, Berlin · Zbl 1273.62015 · doi:10.1007/978-3-642-20192-9
[4] Chapman MA, Lawrence MS, Keats JJ, Cibulskis K, Sougnez C, Schinzel AC, Golub TR (2011) Initial genome sequencing and analysis of multiple myeloma. Nature 471(7339):467-472 · doi:10.1038/nature09837
[5] Di Luccio E (2015) Inhibition of nuclear receptor binding SET domain 2/multiple myeloma SET domain by LEM-06 implication for epigenetic cancer therapies. J Cancer Prev 20(2):113-120 · doi:10.15430/JCP.2015.20.2.113
[6] Fan J, Li R (2001) Variable selection via nonconcave penalized likelihood and its oracle properties. J Am Stat Assoc 96(456):1348-1360 · Zbl 1073.62547 · doi:10.1198/016214501753382273
[7] Fan J, Li R (2002) Variable selection for Cox’s proportional hazards model and frailty model. Ann Stat 30(1):74-99 · Zbl 1012.62106 · doi:10.1214/aos/1015362185
[8] Goeman JJ (2010) L1 penalized estimation in the Cox proportional hazards model. Biom J 52(1):70-84 · Zbl 1207.62185
[9] Gui J, Li H (2005) Penalized Cox regression analysis in the high-dimensional and low-sample size settings with application to microarray gene expression data. Bioinformatics 21(13):3001-3008 · doi:10.1093/bioinformatics/bti422
[10] Hastie T, Tibshirani R, Friedman J (2009) The elements of statistical learning: data mining, inference, and prediction. Springer, New York · Zbl 1273.62005 · doi:10.1007/978-0-387-84858-7
[11] Heagerty PJ, Zheng Y (2005) Survival model predictive accuracy and ROC curves. Biometrics 61(1):92-105 · Zbl 1077.62077 · doi:10.1111/j.0006-341X.2005.030814.x
[12] Kyle RA, Rajkumar SV (2008) Multiple myeloma. Blood 111(6):2962-2972 · doi:10.1182/blood-2007-10-078022
[13] MAQC Consortium (2010) The MAQC-II project: a comprehensive study of common practices for the development and validation of microarray-based predictive models. Nat Biotechnol 28(8):827-838 · doi:10.1038/nbt.1665
[14] Meinshausen N, Meier L, Bühlmann P (2009) P-values for high-dimensional regression. J Am Stat Assoc 104(488):1671-1681 · Zbl 1205.62089 · doi:10.1198/jasa.2009.tm08647
[15] Shaughnessy JD, Zhan F, Burington BE, Huang Y, Colla S, Hanamura I, Stewart JP, Kordsmeier B, Randolph C, Williams DR, Xiao Y, Xu H, Epstein J, Anaissie E, Krishna SG, Cottler-Fox M, Hollmig K, Mohiuddin A, Pineda-Roman M, Tricot G, van Rhee F, Sawyer J, Alsayed Y, Walker R, Zangari M, Crowley J, Barlogie B (2007) A validated gene expression model of high-risk multiple myeloma is defined by deregulated expression of genes mapping to chromosome 1. Blood 109(6):2276-2284 · doi:10.1182/blood-2006-07-038430
[16] Simon N, Friedman J, Hastie T, Tibshirani R (2011) Regularization paths for Cox’s proportional hazards model via coordinate descent. J Stat Softw 39(5):1-13 · doi:10.18637/jss.v039.i05
[17] Song LL, Ponomareva L, Shen H, Duan X, Alimirah F, Choubey D (2010) Interferon-inducible IFI16, a negative regulator of cell growth, down-regulates expression of human telomerase reverse transcriptase (hTERT) gene. PLOS ONE 5(1):e8569 · doi:10.1371/journal.pone.0008569
[18] Sun S, Hood M, Scott L, Peng Q, Mukherjee S, Tung J, Zhou X (2017) Differential expression analysis for RNAseq using Poisson mixed models. Nucleic Acids Res 45(11):e106 · doi:10.1093/nar/gkx204
[19] Tibshirani R (1996) Regression shrinkage and selection via the lasso. J R Stat Soc Ser B (Methodol) 58(1):267-288 · Zbl 0850.62538
[20] Tibshirani R (1997) The lasso method for variable selection in the Cox model. Stat Med 16(4):385-395 · doi:10.1002/(SICI)1097-0258(19970228)16:4<385::AID-SIM380>3.0.CO;2-3
[21] Uno H, Cai T, Pencina MJ, D'Agostino RB, Wei LJ (2011) On the C-statistics for evaluating overall adequacy of risk prediction procedures with censored survival data. Stat Med 30(10):1105-1117
[22] Zhang H, Lu W (2007) Adaptive Lasso for Cox’s proportional hazards model. Biometrika 94(3):691-703 · Zbl 1135.62083 · doi:10.1093/biomet/asm037
[23] Zhao DS, Li Y (2014) Score test variable screening. Biometrics 70(4):862-871 · Zbl 1393.62116 · doi:10.1111/biom.12209
[24] Zhou SH, van de Geer S, Bühlmann P (2009) Adaptive Lasso for high dimensional regression and Gaussian graphical modeling. arXiv:0903.2515
[25] Zou H, Hastie T (2005) Regression shrinkage and selection via the elastic net with application to microarrays. J R Stat Soc Ser B (Methodol) 67(2):301-320 · Zbl 1069.62054 · doi:10.1111/j.1467-9868.2005.00503.x
[26] Zou H, Li R (2008) One-step sparse estimates in nonconcave penalized likelihood models. Ann Stat 36(4):1509-1533 · Zbl 1142.62027 · doi:10.1214/009053607000000802
[27] Zou H, Zhang HH (2009) On the adaptive elastic-net with a diverging number of parameters. Ann Stat 37(4):1733-1751 · Zbl 1168.62064 · doi:10.1214/08-AOS625
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases these data have been complemented or enhanced by data from zbMATH Open. The list attempts to reflect the references in the original paper as accurately as possible without claiming completeness or a perfect matching.