
A fast unified algorithm for solving group-lasso penalized learning problems. (English) Zbl 1331.62343

Summary: This paper concerns a class of group-lasso learning problems where the objective function is the sum of an empirical loss and the group-lasso penalty. For a class of loss functions satisfying a quadratic majorization condition, we derive a unified algorithm called groupwise-majorization-descent (GMD) for efficiently computing the solution paths of the corresponding group-lasso penalized learning problem. GMD allows for general design matrices, without requiring the predictors to be group-wise orthonormal. As illustrative examples, we develop concrete algorithms for solving the group-lasso penalized least squares problem and several group-lasso penalized large margin classifiers. These group-lasso models have been implemented in the R package gglasso, publicly available from the Comprehensive R Archive Network (CRAN) at http://cran.r-project.org/web/packages/gglasso. On simulated and real data, gglasso consistently outperforms the existing software for computing the group-lasso that implements either the classical groupwise descent algorithm or Nesterov's method.
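To make the setup concrete, here is a schematic of the problem and of the majorization step described above; the notation (group weights \(w_k\), majorizing matrix \(H\)) is a sketch following standard group-lasso conventions rather than a verbatim transcription of the paper. With the predictors partitioned into \(K\) groups and \(\beta^{(k)}\) denoting the coefficient sub-vector of group \(k\), the objective is

\[ \min_{\beta}\; L(\beta \mid \text{data}) + \lambda \sum_{k=1}^{K} w_k \|\beta^{(k)}\|_2 . \]

The quadratic majorization condition asks for a data-dependent matrix \(H\) such that, for all \(\beta\) and \(\beta^*\),

\[ L(\beta) \le L(\beta^*) + (\beta - \beta^*)^{\top} \nabla L(\beta^*) + \tfrac{1}{2} (\beta - \beta^*)^{\top} H (\beta - \beta^*) . \]

GMD cycles through the groups and minimizes this quadratic surrogate exactly in each group, which reduces to a closed-form groupwise soft-thresholding update and requires no groupwise orthonormality of the design.

A minimal usage sketch in R follows; the bardet example data, the group assignment, and the loss names are taken from the gglasso package documentation, and the exact call signatures should be checked against the installed version.

```r
# Minimal sketch: group-lasso penalized least squares with gglasso.
library(gglasso)

data(bardet)                 # example data shipped with the package
group <- rep(1:20, each = 5) # 100 predictors in 20 groups of 5

# Whole solution path; loss = "ls" gives penalized least squares,
# while "logit", "sqsvm", and "hsvm" give large margin classifiers.
fit <- gglasso(x = bardet$x, y = bardet$y, group = group, loss = "ls")
plot(fit)                    # coefficient paths along the lambda sequence

# Cross-validate to choose lambda, then extract coefficients there.
cv <- cv.gglasso(x = bardet$x, y = bardet$y, group = group,
                 loss = "ls", pred.loss = "L2", nfolds = 5)
coef(cv, s = "lambda.min")
```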

MSC:

62J05 Linear regression; mixed models
62J07 Ridge regression; shrinkage estimators (Lasso)
62-04 Software, source code, etc. for problems pertaining to statistics

Software:

SLEP; gglasso; glmnet; R; CRAN

References:

[1] Alon, U., Barkai, N., Notterman, D., Gish, K., Ybarra, S., Mack, D., Levine, A.: Broad patterns of gene expression revealed by clustering analysis of tumor and normal colon tissues probed by oligonucleotide arrays. Proc. Nat. Acad. Sci. 96(12), 6745-6750 (1999) · doi:10.1073/pnas.96.12.6745
[2] Daubechies, I., Defrise, M., De Mol, C.: An iterative thresholding algorithm for linear inverse problems with a sparsity constraint. Commun. Pure Appl. Math. 57, 1413-1457 (2004) · Zbl 1077.65055 · doi:10.1002/cpa.20042
[3] Efron, B., Hastie, T., Johnstone, I., Tibshirani, R.: Least angle regression. Ann. Stat. 32(2), 407-451 (2004) · Zbl 1091.62054 · doi:10.1214/009053604000000067
[4] Friedman, J., Hastie, T., Tibshirani, R.: Regularized paths for generalized linear models via coordinate descent. J. Stat. Softw. 33, 1-22 (2010) · doi:10.18637/jss.v033.i01
[5] Fu, W.: Penalized regressions: the bridge versus the lasso. J. Comput. Gr. Stat. 7(3), 397-416 (1998)
[6] Genkin, A., Lewis, D., Madigan, D.: Large-scale Bayesian logistic regression for text categorization. Technometrics 49(3), 291-304 (2007) · doi:10.1198/004017007000000245
[7] Gorman, R., Sejnowski, T.: Analysis of hidden units in a layered network trained to classify sonar targets. Neural Netw. 1(1), 75-89 (1988) · doi:10.1016/0893-6080(88)90023-8
[8] Graham, K., de Las Morenas, A., Tripathi, A., King, C., Kavanah, M., Mendez, J., Stone, M., Slama, J., Miller, M., Antoine, G., et al.: Gene expression in histologically normal epithelium from breast cancer patients and from cancer-free prophylactic mastectomy patients shares a similar profile. Br. J. Cancer 102(8), 1284-1293 (2010) · doi:10.1038/sj.bjc.6605576
[9] Hunter, D., Lange, K.: A tutorial on MM algorithms. Am. Stat. 58(1), 30-37 (2004) · doi:10.1198/0003130042836
[10] Lange, K., Hunter, D., Yang, I.: Optimization transfer using surrogate objective functions (with discussion). J. Comput. Gr. Stat. 9, 1-20 (2000)
[11] Liu, J., Ji, S., Ye, J.: SLEP: Sparse Learning with Efficient Projections. Arizona State University. URL: http://www.public.asu.edu/jye02/Software/SLEP (2009)
[12] Meier, L., van de Geer, S., Bühlmann, P.: The group lasso for logistic regression. J. R. Stat. Soc. Ser. B 70, 53-71 (2008) · Zbl 1400.62276 · doi:10.1111/j.1467-9868.2007.00627.x
[13] Nesterov, Y.: Introductory Lectures on Convex Optimization: A Basic Course. Kluwer Academic Publishers, Boston (2004) · Zbl 1086.90045
[14] Nesterov, Y.: Gradient methods for minimizing composite objective function. Technical report, Center for Operations Research and Econometrics (CORE), Catholic University of Louvain (UCL) (2007)
[15] Osborne, M., Presnell, B., Turlach, B.: A new approach to variable selection in least squares problems. IMA J. Numer. Anal. 20(3), 389-403 (2000) · Zbl 0962.65036 · doi:10.1093/imanum/20.3.389
[16] Quinlan, J.: Combining instance-based and model-based learning. In: Proceedings of the Tenth International Conference on Machine Learning, pp. 236-243 (1993)
[17] Sæbø, S., Almøy, T., Aarøe, J., Aastveit, A.: ST-PLS: a multi-directional nearest shrunken centroid type classifier via PLS. J. Chemom. 22(1), 54-62 (2008) · doi:10.1002/cem.1101
[18] Scheetz, T., Kim, K., Swiderski, R., Philp, A., Braun, T., Knudtson, K., Dorrance, A., DiBona, G., Huang, J., Casavant, T., et al.: Regulation of gene expression in the mammalian eye and its relevance to eye disease. Proc. Nat. Acad. Sci. 103(39), 14429-14434 (2006) · doi:10.1073/pnas.0602562103
[19] Segal, M., Dahlquist, K., Conklin, B.: Regression approaches for microarray data analysis. J. Comput. Biol. 10(6), 961-980 (2003) · doi:10.1089/106652703322756177
[20] Singh, D., Febbo, P., Ross, K., Jackson, D., Manola, J., Ladd, C., Tamayo, P., Renshaw, A., D’Amico, A., Richie, J., et al.: Gene expression correlates of clinical prostate cancer behavior. Cancer Cell 1(2), 203-209 (2002) · doi:10.1016/S1535-6108(02)00030-2
[21] Tibshirani, R.: Regression shrinkage and selection via the lasso. J. R. Stat. Soc. Ser. B 58, 267-288 (1996) · Zbl 0850.62538
[22] Tibshirani, R., Bien, J., Friedman, J., Hastie, T., Simon, N., Taylor, J., Tibshirani, R.: Strong rules for discarding predictors in lasso-type problems. J. R. Stat. Soc. Ser. B 74, 245-266 (2012) · Zbl 1411.62213 · doi:10.1111/j.1467-9868.2011.01004.x
[23] Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109(3), 475-494 (2001) · Zbl 1006.65062 · doi:10.1023/A:1017501703105
[24] Vogt, J., Roth, V.: A complete analysis of the \(l_{1,p}\) group-lasso. In: Proceedings of the 29th International Conference on Machine Learning (ICML-12), ICML 2012, Omnipress, pp. 185-192 (2012)
[25] Wang, H., Leng, C.: A note on adaptive group lasso. Comput. Stat. Data Anal. 52, 5277-5286 (2008) · Zbl 1452.62524 · doi:10.1016/j.csda.2008.05.006
[26] Wang, L., Zhu, J., Zou, H.: Hybrid huberized support vector machines for microarray classification and gene selection. Bioinformatics 24, 412-419 (2008) · doi:10.1093/bioinformatics/btm579
[27] Wu, T., Lange, K.: Coordinate descent algorithms for lasso penalized regression. Ann. Appl. Stat. 2, 224-244 (2008) · Zbl 1137.62045 · doi:10.1214/07-AOAS147
[28] Wu, T., Lange, K.: The MM alternative to EM. Stat. Sci. 25(4), 492-505 (2010) · Zbl 1329.62106 · doi:10.1214/08-STS264
[29] Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. Ser. B 68, 49-67 (2006) · Zbl 1141.62030 · doi:10.1111/j.1467-9868.2005.00532.x
[30] Zhang, T.: Statistical behavior and consistency of classification methods based on convex risk minimization. Ann. Stat. 32, 56-85 (2004) · Zbl 1105.62323 · doi:10.1214/aos/1079120130
[31] Zou, H.: The adaptive lasso and its oracle properties. J. Am. Stat. Assoc. 101, 1418-1429 (2006) · Zbl 1171.62326 · doi:10.1198/016214506000000735