zbMATH — the first resource for mathematics

Efficient block-coordinate descent algorithms for the group Lasso. (English) Zbl 1275.90059
Summary: We present two algorithms to solve the group Lasso problem [M. Yuan and Y. Lin, J. R. Stat. Soc., Ser. B, Stat. Methodol. 68, No. 1, 49–67 (2006; Zbl 1141.62030)]. First, we propose a general version of the block coordinate descent (BCD) algorithm for the group Lasso that employs an efficient approach for optimizing each subproblem exactly. We show that it exhibits excellent performance when the groups are of moderate size. For groups of large size, we propose an extension of ISTA/FISTA SIAM [A. Beck and M. Teboulle, SIAM J. Imaging Sci. 2, No. 1, 183–202 (2009; Zbl 1175.94009)] based on variable step-lengths that can be viewed as a simplified version of BCD. By combining the two approaches we obtain an implementation that is very competitive and often outperforms other state-of-the-art approaches for this problem. We show how these methods fit into the globally convergent general block coordinate gradient descent framework [P. Tseng and S. Yun, Math. Program. 117, No. 1–2 (B), 387–423 (2009; Zbl 1166.90016)]. We also show that the proposed approach is more efficient in practice than the one implemented by Tseng and Yun [loc. cit.]. In addition, we apply our algorithms to the multiple measurement vector (MMV) recovery problem, which can be viewed as a special case of the group Lasso problem, and compare their performance to other methods in this particular instance.

90C25 Convex programming
Full Text: DOI
[1] Bach, F.: Consistency of the group Lasso and multiple kernel learning. J. Mach. Learn. Res. 9, 1179–1225 (2008) · Zbl 1225.68147
[2] Beck, A., Teboulle, M.: A fast iterative shrinkage-thresholding algorithm for linear inverse problems. SIAM J. Imag. Sci. 2(1), 183–202 (2009) · Zbl 1175.94009 · doi:10.1137/080716542
[3] van den Berg, E., Friedlander, M.: Joint-sparse recovery from multiple measurements. arXiv 904 (2009)
[4] van den Berg, E., Friedlander, M.: Sparse Optimization With Least-squares Constraints. Tech. rep., Technical Report TR-2010-02, Department of Computer Science, University of British Columbia, Columbia (2010) · Zbl 1242.49061
[5] van den Berg, E., Schmidt, M., Friedlander, M., Murphy, K.: Group sparsity via linear-time projection. Tech. rep., Technical Report TR-2008-09, Department of Computer Science, University of British Columbia, Columbia (2008)
[6] Candès, E., Romberg, J., Tao, T.: Robust uncertainty principles: exact signal reconstruction from highly incomplete frequency information. Inform Theory IEEE Trans 52(2), 489–509 (2006) · Zbl 1231.94017 · doi:10.1109/TIT.2005.862083
[7] Chen, J., Huo, X.: Theoretical results on sparse representations of multiple-measurement vectors. IEEE Trans. Signal Process. 54, 12 (2006) · Zbl 1375.94051
[8] Dolan, E., Moré, J.: Benchmarking optimization software with performance profiles. Math. Program. 91(2), 201–213 (2002) · Zbl 1049.90004 · doi:10.1007/s101070100263
[9] Donoho, D.: Compressed sensing, information theory. IEEE Trans. 52(4), 1289–1306 (2006) · Zbl 1288.94016
[10] Friedman, J., Hastie, T., Tibshirani, R.: A note on the group lasso and a sparse group lasso. preprint, Leipzig (2010)
[11] Jacob, L., Obozinski, G., Vert, J.: Group Lasso with overlap and graph Lasso. In: Proceedings of the 26th Annual International Conference on Machine Learning, ACM, New York, pp. 433–440 (2009)
[12] Kim, D., Sra, S., Dhillon, I.: A scalable trust-region algorithm with application to mixed-norm regression. vol. 1. In: Internetional Conference Machine Learning (ICML), Atlanta (2010) · Zbl 1220.93085
[13] Kim, S., Xing, E.: Tree-guided group lasso for multi-task regression with structured sparsity. In: Proceedings of the 27th Annual International Conference on, Machine Learning, New York (2010)
[14] Liu, J., Ji, S., Ye, J.: Multi-task feature learning via efficient l 2, 1-norm minimization. In: Proceedings of the Twenty-Fifth Conference on Uncertainty in Artificial Intelligence, AUAI Press, Corvallis, pp. 339–348 (2009)
[15] Liu, J., Ji, S., Ye, J.: SLEP: Sparse Learning with Efficient Projections. Arizona State University, Arizona (2009)
[16] Ma, S., Song, X., Huang, J.: Supervised group Lasso with applications to microarray data analysis. BMC bioinformatics 8(1), 60 (2007) · doi:10.1186/1471-2105-8-60
[17] Meier, L., Van De Geer, S., Buhlmann, P.: The group lasso for logistic regression. J. Royal Stat. Soc. Ser. B (Stat. Methodol.) 70(1), 53–71 (2008) · Zbl 1400.62276 · doi:10.1111/j.1467-9868.2007.00627.x
[18] Moré, J., Sorensen, D.: Computing a trust region step. SIAM J. Sci. Statist. Comput. 4, 553 (1983) · Zbl 0551.65042 · doi:10.1137/0904038
[19] Nesterov, Y.: Efficiency of coordinate descent methods on huge-scale optimization problems. CORE Discussion Papers, Belgique (2010) · Zbl 1257.90073
[20] Nocedal, J., Wright, S.: Numerical optimization. Springer verlag, New York (1999) · Zbl 0930.65067
[21] Rakotomamonjy, A.: Surveying and comparing simultaneous sparse approximation (or group-lasso) algorithms. Sig. Process. 91(7), 1505–1526 (2011) · Zbl 1213.94044 · doi:10.1016/j.sigpro.2011.01.012
[22] Richtárik, P., Takáč, M.: Iteration complexity of randomized block-coordinate descent methods for minimizing a composite function. Arxiv, preprint arXiv:1107.2848 (2011)
[23] Roth, V., Fischer, B.: The group-lasso for generalized linear models: uniqueness of solutions and efficient algorithms. In: Proceedings of the 25th international conference on Machine learning, ACM, Bellevue, pp. 848–855 (2008)
[24] Subramanian, A., Tamayo, P., Mootha, V., Mukherjee, S., Ebert, B., Gillette, M., Paulovich, A., Pomeroy, S., Golub, T., Lander, E., et al.: Gene set enrichment analysis: a knowledge-based approach for interpreting genome-wide expression profiles. Proc. Natl Acad. Sci. U.S.A. 102(43), 15,545 (2005) · doi:10.1073/pnas.0506580102
[25] Sun, L., Liu, J., Chen, J., Ye, J.: Efficient Recovery of Jointly Sparse Vectors. NIPS, Canada, (2009)
[26] R, Tibshirani: Regression shrinkage and selection via the lasso. J. Royal Statist. Soc. Ser. B (Methodol.) 58(1), 267–288 (1966) · Zbl 0850.62538
[27] Tseng, P.: Convergence of a block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109(3), 475–494 (2001) · Zbl 1006.65062 · doi:10.1023/A:1017501703105
[28] Tseng, P., Yun, S.: A coordinate gradient descent method for nonsmooth separable minimization. Math. Program. 117(1), 387–423 (2009) · Zbl 1166.90016 · doi:10.1007/s10107-007-0170-0
[29] Van De Vijver, M., He, Y., van’t Veer, L., Dai, H., Hart, A., Voskuil, D., Schreiber, G., Peterse, J., Roberts, C., Marton, M., et al.: A gene-expression signature as a predictor of survival in breast cancer. N. Engl. J. Med. 347(25), 1999 (2002) · doi:10.1056/NEJMoa021967
[30] Vandenberghe, L.: Gradient methods for nonsmooth problems. EE236C course notes (2008)
[31] Wright, S., Nowak, R., Figueiredo, M.: Sparse reconstruction by separable approximation. IEEE Trans. Signal Process. 57(7), 2479–2493 (2009) · Zbl 1391.94442 · doi:10.1109/TSP.2009.2016892
[32] Yang, H., Xu, Z., King, I., Lyu, M.: Online learning for group lasso. In: 27th Intl Conf. on Machine Learning (ICML2010). Citeseer (2010)
[33] Yuan, M., Lin, Y.: Model selection and estimation in regression with grouped variables. J. Royal Statist. Soc. Ser. B (Statist. Methodol.) 68(1), 49–67 (2006) · Zbl 1141.62030 · doi:10.1111/j.1467-9868.2005.00532.x
[34] Zou, H., Hastie, T.: Regularization and variable selection via the elastic net. J. Royal Statist. Soc. Ser. B (Statist. Methodol.) 67(2), 301–320 (2005) · Zbl 1069.62054 · doi:10.1111/j.1467-9868.2005.00503.x
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.