×

Activity identification and local linear convergence of forward-backward-type methods. (English) Zbl 1357.49064

Summary: In this paper, we consider a class of Forward-Backward (FB) splitting methods that includes several variants (e.g., inertial schemes, FISTA) for minimizing the sum of two proper convex and lower semicontinuous functions, one of which has a Lipschitz continuous gradient, and the other is partly smooth relative to a smooth active manifold \(\mathcal M\). We propose a unified framework, under which we show that this class of FB-type algorithms (i) correctly identifies the active manifold in a finite number of iterations (finite activity identification), and (ii) then enters a local linear convergence regime, which we characterize precisely in terms of the structure of the underlying active manifold. We also establish and explain why FISTA (with convergent sequences) locally oscillates and can be locally slower than FB. These results may have numerous applications including in signal/image processing, sparse recovery and machine learning. Indeed, the obtained results explain the typical behaviour that has been observed numerically for many problems in these fields such as the Lasso, the group Lasso, the fused Lasso and the nuclear norm minimization to name only a few.

MSC:

49J52 Nonsmooth analysis
65K05 Numerical mathematical programming methods
65K10 Numerical optimization and variational techniques
90C25 Convex programming
90C31 Sensitivity, stability, parametric optimization

Software:

ParNes
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] P.-A. Absil, R. Mahony, and R. Sepulchre, {\it Optimization Algorithms on Matrix Manifolds}, Princeton University Press, Princeton, NJ, 2008. · Zbl 1147.65043
[2] P.-A. Absil, R. Mahony, and J. Trumpf, {\it An extrinsic look at the Riemannian Hessian}, in Geometric Science of Information, Lecture Notes in Comput. Sci. 8085, Springer, Heidelberg, 2013, pp. 361-368, . · Zbl 1323.53014
[3] A. Agarwal, S. Negahban, and M. J. Wainwright, {\it Fast global convergence of gradient methods for high-dimensional statistical recovery}, Ann. Statist., 40 (2012), pp. 2452-2482, . · Zbl 1373.62244
[4] F. Alvarez, {\it On the minimizing property of a second order dissipative system in Hilbert spaces}, SIAM J. Control Optim., 38 (2000), pp. 1102-1119, . · Zbl 0954.34053
[5] F. Alvarez and H. Attouch, {\it An inertial proximal method for maximal monotone operators via discretization of a nonlinear oscillator with damping}, Set-Valued Anal., 9 (2001), pp. 3-11, . · Zbl 0991.65056
[6] H. Attouch, Z. Chbani, J. Peypouquet, and P. Redont, {\it Fast convergence of inertial dynamics and algorithms with asymptotic vanishing damping}, Math. Program., to appear, . · Zbl 1375.49028
[7] H. Attouch and J. Peypouquet, {\it The rate of convergence of Nesterov’s accelerated Forward-Backward method is actually \(o(k^{-2})\)}, preprint, arXiv:1510.08740, 2015, . · Zbl 1346.49048
[8] H. Attouch, J. Peypouquet, and P. Redont, {\it A dynamical approach to an inertial forward-backward algorithm for convex minimization}, SIAM J. Optim., 24 (2014), pp. 232-256, . · Zbl 1295.90044
[9] H. Attouch, J. Peypouquet, and P. Redont, {\it Fast convergence of an inertial gradient-like dynamics with vanishing viscosity}, preprint, arXiv:1507.04782, 2015, . · Zbl 1395.34068
[10] J.-F. Aujol and Ch. Dossal, {\it Stability of over-relaxations for the forward-backward algorithm, application to FISTA}, SIAM J. Optim., 25 (2015), pp. 2408-2433, . · Zbl 1357.49123
[11] H. H. Bauschke and P. L. Combettes, {\it Convex Analysis and Monotone Operator Theory in Hilbert Spaces}, Springer, New York, 2011. · Zbl 1218.47001
[12] A. Beck and M. Teboulle, {\it Fast gradient-based algorithms for constrained total variation image denoising and deblurring problems}, IEEE Trans. Image Process., 18 (2009), pp. 2419-2434, . · Zbl 1371.94049
[13] A. Beck and M. Teboulle, {\it A fast iterative shrinkage-thresholding algorithm for linear inverse problems}, SIAM J. Imaging Sci., 2 (2009), pp. 183-202, . · Zbl 1175.94009
[14] J. Bolte, A. Daniilidis, and A. Lewis, {\it The Łojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems}, SIAM J. Optim, 17 (2006), pp. 1205-1223, . · Zbl 1129.26012
[15] K. Bredies and D. A. Lorenz, {\it Linear convergence of iterative soft-thresholding}, J. Fourier Anal. Appl., 14 (2008), pp. 813-837, . · Zbl 1175.65061
[16] R. S. Burachik and A. N. Iusem, {\it Set-valued Mappings and Enlargements of Monotone Operators}, Springer Optim. Appl. 8, Springer, New York, 2008. · Zbl 1154.49001
[17] R. S. Burachik and B. F. Svaiter, {\it \(ε\)-enlargements of maximal monotone operators in banach spaces}, Set-Valued Anal., 7 (1999), pp. 117-132, . · Zbl 0948.47050
[18] A. Chambolle and Ch. Dossal, {\it On the convergence of the iterates of the “fast iterative shrinkage/thresholding algorithm”}, J. Optim. Theory Appl., 166 (2015), pp. 968-982, . · Zbl 1371.65047
[19] I. Chavel, {\it Riemannian Geometry: A Modern Introduction}, 2nd ed., Cambridge Stud. Adv. Math. 98, Cambridge University Press, Cambridge, UK, 2006. · Zbl 1099.53001
[20] P. L. Combettes, {\it Quasi-Fejérian analysis of some optimization algorithms}, Stud. Comput. Math. 8, North-Holland, Amsterdam, 2001, pp. 115-152, . · Zbl 0992.65065
[21] P. L. Combettes and B. C. Vu͂, {\it Variable metric forward-backward splitting with applications to monotone inclusions in duality}, Optimization, 63 (2014), pp. 1289-1318, . · Zbl 1309.90109
[22] A. Daniilidis, D. Drusvyatskiy, and A. S. Lewis, {\it Orthogonal invariance and identifiability}, SIAM J. Matrix Anal. Appl., 35 (2014), pp. 580-598, . · Zbl 1339.15007
[23] A. Daniilidis, W. Hare, and J. Malick, {\it Geometrical interpretation of the predictor-corrector type algorithms in structured optimization problems}, Optimization, 55 (2006), pp. 482-503, . · Zbl 1108.49027
[24] T. Goldstein, B. O’Donoghue, S. Setzer, and R. Baraniuk, {\it Fast alternating direction optimization methods}, SIAM J. Imaging Sci., 7 (2014), pp. 1588-1623, . · Zbl 1314.49019
[25] M. Gu, L.-H. Lim, and C. J. Wu, {\it ParNes: a rapidly convergent algorithm for accurate recovery of sparse and approximately sparse signals}, Numer. Algorithms, 64 (2012), pp. 321-347, . · Zbl 1284.65055
[26] E. Hale, W. Yin, and Y. Zhang, {\it Fixed-point continuation for \(ℓ_1\)-minimization: Methodology and convergence}, SIAM J. Optim., 19 (2008), pp. 1107-1130, . · Zbl 1180.65076
[27] W. L. Hare, {\it Identifying active manifolds in regularization problems}, in Fixed-Point Algorithms for Inverse Problems in Science and Engineering, H. H. Bauschke, R. S., Burachik, P. L. Combettes, V. Elser, D. R. Luke, and H. Wolkowicz, eds., Springer Optim. Appl. 49, Springer, New York, 2011, pp. 261-271, .
[28] W. L. Hare and A. S. Lewis, {\it Identifying active constraints via partial smoothness and prox-regularity}, J. Convex Anal., 11 (2004), pp. 251-266. · Zbl 1178.90314
[29] W. L. Hare and A. S. Lewis, {\it Identifying active manifolds}, Algorithmic Oper. Res., 2 (2007), pp. 75-82, . · Zbl 1206.49017
[30] J.-B. Hiriart-Urruty and C. Lemaréchal, {\it Convex Analysis And Minimization Algorithms}, Vols. I and II, Springer-Verlag, Berlin, 2001.
[31] K. Hou, Z. Zhou, A. M.-C. So, and Z. Luo, {\it On the linear convergence of the proximal gradient method for trace norm regularization}, in Proceeding of the 26th International Conference on Neural Information Processing Systems, 2013, pp. 710-718.
[32] P. R. Johnstone and P. Moulin, {\it Local and global convergence of an inertial version of forward-backward splitting}, preprint arXiv:1502.02281, 2015, .
[33] J. M. Lee, {\it Introduction to Smooth Manifolds}, Grad. Texts in Math. 218, Springer-Verlag, New York, 2003.
[34] C. Lemaréchal, F. Oustry, and C. Sagastizábal, {\it The U-Lagrangian of a convex function}, Trans. Amer. Math. Soc., 352 (2000), pp. 711-729, . · Zbl 0939.49014
[35] A. S. Lewis, {\it Active sets, nonsmoothness, and sensitivity}, SIAM J. Optim., 13 (2003), pp. 702-725, . · Zbl 1055.90072
[36] A. S. Lewis and S. Zhang, {\it Partial smoothness, tilt stability, and generalized Hessians}, SIAM J. Optim., 23 (2013), pp. 74-94, . · Zbl 1267.90147
[37] J. Liang, J. Fadili, and G. Peyré, {\it Local linear convergence of Forward-Backward under partial smoothness}, in Advances in Neural Information Processing Systems, 2014, pp. 1970-1978.
[38] P. L. Lions and B. Mercier, {\it Splitting algorithms for the sum of two nonlinear operators}, SIAM J. Numer. Anal., 16 (1979), pp. 964-979, . · Zbl 0426.65050
[39] D. A. Lorenz and T. Pock, {\it An inertial forward-backward algorithm for monotone inclusions}, preprint, arXiv:1403.3522, 2014, . · Zbl 1327.47063
[40] S. A. Miller and J. Malick, {\it Newton methods for nonsmooth convex minimization: connections among-Lagrangian, Riemannian Newton and SQP methods}, Math. Program., 104 (2005), pp. 609-633, . · Zbl 1124.90021
[41] B. Mordukhovich, {\it Sensitivity analysis in nonsmooth optimization}, Theoretical Aspects of Industrial Design (D. A. Field and V. Komkov, eds.), SIAM, Philadelphia, 1992, pp. 32-46. · Zbl 0769.90075
[42] A. Moudafi and M. Oliny, {\it Convergence of a splitting inertial proximal method for monotone operators}, J. Comput. Appl. Math., 155 (2003), pp. 447-454, . · Zbl 1027.65077
[43] Yu. E. Nesterov, {\it A method for solving the convex programming problem with convergence rate \(O(1/k^2)\)}, Dokl. Akad. Nauk SSSR, 269 (1983), pp. 543-547.
[44] Y. Nesterov, {\it Introductory Lectures on Convex Optimization: A Basic Course}, Appl. Optim. 87, Kluwer Academic Publishers, Boston, 2004. · Zbl 1086.90045
[45] Y. Nesterov, {\it Gradient methods for minimizing composite objective function}, Math. Program., 140 (2013), pp. 125-161, . · Zbl 1287.90067
[46] B. O’Donoghue and E. Candes, {\it Adaptive restart for accelerated gradient schemes}, Found. Comput. Math., 15 (2015), pp. 715-732, . · Zbl 1320.90061
[47] Z. Opial, {\it Weak convergence of the sequence of successive approximations for nonexpansive mappings}, Bull. Amer. Math. Soc., 73 (1967), pp. 591-597, . · Zbl 0179.19902
[48] R. A. Poliquin and R. T. Rockafellar, {\it Tilt stability of a local minimum}, SIAM J. Optim., 8 (1998), pp. 287-299, . · Zbl 0918.49016
[49] B. T. Polyak, {\it Some methods of speeding up the convergence of iterative methods}, Ž. Vyčhisl. Mat. i Mat. Fiz., 4 (1964), pp. 791-803, .
[50] B. T. Polyak, {\it Introduction to Optimization}, Optimization Software, New York, 1987.
[51] R. T. Rockafellar and R. Wets, {\it Variational Analysis}, Grundlehren Math. Wiss. 317, Springer-Verlag, Berlin, 1998. · Zbl 0888.49001
[52] S. T. Smith, {\it Optimization techniques on Riemannian manifolds}, Fields Inst. Commun. 3, AMS, Providence, RI, 1994, pp. 113-136. · Zbl 0816.49032
[53] S. Tao, D. Boley, and S. Zhang, {\it Local linear convergence of ISTA and FISTA on the LASSO problem}, SIAM J. Optim., 26 (2016), pp. 313-336, . · Zbl 1358.90101
[54] P. Tseng and S. Yun, {\it A coordinate gradient descent method for nonsmooth separable minimization}, Math. Progam., 117 (2009), pp. 387-423, . · Zbl 1166.90016
[55] S. Vaiter, M. Golbabaee, J. Fadili, and G. Peyré, {\it Model selection with low complexity priors}, Inf. Inference, 4 (2015), pp. 230-287, . · Zbl 1386.94040
[56] S. Vaiter, G. Peyré, and J. Fadili, {\it Model consistency of partly smooth regularizers}, preprint, arXiv:1405.1004, 2014, . · Zbl 1464.62345
[57] S. J. Wright, {\it Identifiable surfaces in constrained optimization}, SIAM J. Control Optim., 31 (1993), pp. 1063-1079, . · Zbl 0804.90105
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.