×

zbMATH — the first resource for mathematics

A coordinate gradient descent method for nonsmooth separable minimization. (English) Zbl 1166.90016
The authors presented a block coordinate gradient descent method for minimizing the sum of a smooth function and a convex separable function. The method may be viewed as a hybrid of gradient-projection and coordinate descent methods, or as a block coordinate version of descent methods in [J. V. Burke, Math. Program. 33, 260–279 (1985; Zbl 0581.90084)] and [M. Fukushima and H. Mine, Int. J. Syst. Sci. 12, 989–1000 (1981; Zbl 0467.65028)]. The global convergence is proved and, under a local Lipschitzian error bound assumption, linear convergence of this method. The local Lipschitzian error bound holds under assumptions analogous to those for constrained smooth optimization, e.g. the convex function is polyhedral and the smooth function is (nonconvex) quadratic or is the composition of a strongly convex function with a linear mapping. The numerical results are presented to verify the practical efficiency of the method.

MSC:
90C30 Nonlinear programming
90C55 Methods of successive quadratic programming type
65K05 Numerical mathematical programming methods
49M27 Decomposition methods
49M37 Numerical methods based on nonlinear programming
90C06 Large-scale problems in mathematical programming
90C25 Convex programming
90C26 Nonconvex programming, global optimization
PDF BibTeX XML Cite
Full Text: DOI
References:
[1] Auslender A. (1978). Minimisation de fonctions localement lipschitziennes: applications à la programmation mi-convexe, mi-différentiable. In: Mangasarian, O.L., Meyer, R.R. and Robinson, S.M. (eds) Nonlinear Programming, vol 3., pp 429–460. Academic, New York · Zbl 0461.90063
[2] Balakrishnan, S.: Private communication (2006)
[3] Bertsekas D.P. (1982). Constrained Optimization and Lagrange Multiplier Methods. Academic, New York · Zbl 0572.90067
[4] Bertsekas D.P. (1999). Nonlinear Programming, 2nd edn. Athena Scientific, Belmont · Zbl 1015.90077
[5] Bradley P.S., Fayyad U.M. and Mangasarian O.L. (1999). Mathematical programming for data mining: formulations and challenges. INFORMS J. Comput. 11: 217–238 · Zbl 0973.90096 · doi:10.1287/ijoc.11.3.217
[6] Burke J.V. (1985). Descent methods for composite nondifferentiable optimization problems. Math. Program. 33: 260–279 · Zbl 0581.90084 · doi:10.1007/BF01584377
[7] Chen S., Donoho D. and Saunders M. (1999). Atomic decomposition by basis pursuit. SIAM J. Sci. Comput. 20: 33–61 · Zbl 0919.94002 · doi:10.1137/S1064827596304010
[8] Censor Y. and Zenios S.A. (1997). Parallel Optimization: Theory, Algorithms and Applications. Oxford University Press, New York · Zbl 0945.90064
[9] Conn A.R., Gould N.I.M. and Toint Ph.L. (2000). Trust-Region Methods. SIAM, Philadelphia · Zbl 0958.65071
[10] Donoho D.L. and Johnstone I.M. (1994). Ideal spatial adaptation by wavelet shrinkage. Biometrika 81: 425–455 · Zbl 0815.62019 · doi:10.1093/biomet/81.3.425
[11] Donoho D.L. and Johnstone I.M. (1995). Adapting to unknown smoothness via wavelet shrinkage. J. Am. Stat. Assoc. 90: 1200–1224 · Zbl 0869.62024 · doi:10.2307/2291512
[12] Drucker, H., Burges, C.J.C., Kaufman, L., Smola, A., Vapnik, V.: Support vector regression machines. In: Mozer, M.C., Jordan, M.I., Petsche, T. (eds.) Advances in Neural Information Processing Systems 9. MIT Press, Cambridge (1997)
[13] Facchinei F., Fischer A. and Kanzow C. (1998). On the accurate identification of active constraints. SIAM J. Optim. 9: 14–32 · Zbl 0960.90080 · doi:10.1137/S1052623496305882
[14] Facchinei F. and Pang J.-S. (2003). Finite-Dimensional Variational Inequalities and Complementarity Problems, Vols. I and II. Springer, New York · Zbl 1062.90002
[15] Ferris M.C. and Mangasarian O.L. (1994). Parallel variable distribution. SIAM J. Optim. 4: 815–832 · Zbl 0820.90098 · doi:10.1137/0804047
[16] Fletcher R. (1982). A model algorithm for composite nondifferentiable optimization problems. Math. Program. Study 17: 67–76 · Zbl 0478.90063
[17] Fletcher R. (1987). Practical Methods of Optimization, 2nd edn. Wiley, Chichester · Zbl 0905.65002
[18] Fletcher R. (1994). An overview of unconstrained optimization. In: Spedicato, E. (eds) Algorithms for Continuous Optimization., pp 109–143. Kluwer, Dordrecht · Zbl 0828.90123
[19] Fukushima M. (1990). A successive quadratic programming method for a class of constrained nonsmooth optimization problems. Math. Program. 49: 231–251 · Zbl 0735.90062 · doi:10.1007/BF01588789
[20] Fukushima M. (1998). Parallel variable transformation in unconstrained optimization. SIAM J. Optim. 8: 658–672 · Zbl 0913.90243 · doi:10.1137/S1052623496309879
[21] Fukushima M. and Mine H. (1981). A generalized proximal point algorithm for certain non-convex minimization problems. Int. J. Syst. Sci. 12: 989–1000 · Zbl 0467.65028 · doi:10.1080/00207728108963798
[22] Gould N.I.M., Orban D. and Toint Ph.L. (2003). CUTEr, a constrained and unconstrained testing environment, revisited. ACM Trans. Math. Softw. 29: 373–394 · Zbl 1068.90526 · doi:10.1145/962437.962439
[23] Grippo L. and Sciandrone M. (2000). On the convergence of the block nonlinear Gauss–Seidel method under convex constraints. Oper. Res. Lett. 26: 127–136 · Zbl 0955.90128 · doi:10.1016/S0167-6377(99)00074-7
[24] Kelley C.T. (1999). Iterative Methods for Optimization. SIAM, Philadelphia · Zbl 0934.90082
[25] Kiwiel K.C. (1986). A method for minimizing the sum of a convex function and a continuously differentiable function. J. Optim. Theory Appl. 48: 437–449 · Zbl 0562.90073 · doi:10.1007/BF00940570
[26] Luo Z.-Q. and Tseng P. (1992). Error bounds and the convergence analysis of matrix splitting algorithms for the affine variational inequality problem. SIAM J. Optim. 2: 43–54 · Zbl 0777.49010 · doi:10.1137/0802004
[27] Luo Z.-Q. and Tseng P. (1992). On the linear convergence of descent methods for convex essentially smooth minimization. SIAM J. Control Optim. 30: 408–425 · Zbl 0756.90084 · doi:10.1137/0330025
[28] Luo Z.-Q. and Tseng P. (1993). On the convergence rate of dual ascent methods for linearly constrained convex minimization. Math. Oper. Res. 18: 846–867 · Zbl 0804.90103 · doi:10.1287/moor.18.4.846
[29] Luo Z.-Q. and Tseng P. (1993). Error bounds and convergence analysis of feasible descent methods: a general approach. Ann. Oper. Res. 46: 157–178 · Zbl 0793.90076 · doi:10.1007/BF02096261
[30] Mangasarian O.L. (1984). Sparsity-preserving SOR algorithms for separable quadratic and linear programming. Comput. Oper. Res. 11: 105–112 · Zbl 0606.90102 · doi:10.1016/0305-0548(69)90004-5
[31] Mangasarian O.L. (1995). Parallel gradient distribution in unconstrained optimization. SIAM J. Control Optim. 33: 1916–1925 · Zbl 0843.90111 · doi:10.1137/S0363012993250220
[32] Mangasarian O.L. and De Leone R. (1988). Parallel gradient projection successive overrelaxation for symmetric linear complementarity problems and linear programs. Ann. Oper. Res. 14: 41–59 · Zbl 0739.90068 · doi:10.1007/BF02186473
[33] Mangasarian O.L. and Musicant D.R. (1999). Successive overrelaxation for support vector machines. IEEE Trans. Neural Netw. 10: 1032–1037 · doi:10.1109/72.788643
[34] Mangasarian O.L. and Musicant D.R. (2002). Large scale kernel regression via linear programming. Mach. Learn. 46: 255–269 · Zbl 0998.68106 · doi:10.1023/A:1012422931930
[35] Meier, L., van de Geer, S., Bühlmann, P.: The group Lasso for logistic regression. Report Seminar für Statistik, ETH Zürich, Zürich · Zbl 1400.62276
[36] Mine H. and Fukushima M. (1981). A minimization method for the sum of a convex function and a continuously differentiable function. J. Optim. Theory Appl. 33: 9–23 · Zbl 0422.90070 · doi:10.1007/BF00935173
[37] Moré J.J., Garbow B.S. and Hillstrom K.E. (1981). Testing unconstrained optimization software. ACM Trans. Math. Softw. 7: 17–41 · Zbl 0454.65049 · doi:10.1145/355934.355936
[38] Moré J.J. and Toraldo G. (1991). On the solution of large quadratic programming problems with bound constraints. SIAM J. Optim. 1: 93–113 · Zbl 0752.90053 · doi:10.1137/0801008
[39] Murtagh, B.A., Saunders, M.A.: MINOS 5.5 user’s guide. Report SOL 83-20R. Department of Operations Research, Stanford University, Stanford (1998)
[40] Nocedal J. (1980). Updating quasi-Newton matrices with limited storage. Math. Comp. 35: 773–782 · Zbl 0464.65037 · doi:10.1090/S0025-5718-1980-0572855-7
[41] Nocedal J. and Wright S.J. (1999). Numerical Optimization. Springer, New York · Zbl 0930.65067
[42] Ortega J.M. and Rheinboldt W.C. (2000). Iterative Solution of Nonlinear Equations in Several Variables. Reprinted by SIAM, Philadelphia · Zbl 0949.65053
[43] Powell M.J.D. (1973). On search directions for minimization algorithms. Math. Program. 4: 193–201 · Zbl 0258.90043 · doi:10.1007/BF01584660
[44] Robinson S.M. (1981). Some continuity properties of polyhedral multifunctions. Math. Program. Study 14: 206–214 · Zbl 0449.90090
[45] Robinson S.M. (1999). Linear convergence of \(\epsilon\)-subgradient descent methods for a class of convex functions. Math. Program. 86: 41–50 · Zbl 1029.90056 · doi:10.1007/s101070050078
[46] Robinson, S.M.: Calmness and Lipschitz continuity for multifunctions. Report, Department of Industrial Engineering, University of Wisconsin, Madison (2006)
[47] Rockafellar R.T. (1970). Convex Analysis. Princeton University Press, Princeton · Zbl 0193.18401
[48] Rockafellar R.T. and Wets R.J.-B. (1998). Variational Analysis. Springer, New York · Zbl 0888.49001
[49] Sardy S., Bruce A. and Tseng P. (2000). Block coordinate relaxation methods for nonparametric wavelet denoising. J. Comput. Graph. Stat. 9: 361–379 · doi:10.2307/1390659
[50] Sardy S., Bruce A. and Tseng P. (2001). Robust wavelet denoising. IEEE Trans. Signal Proc. 49: 1146–1152 · doi:10.1109/78.923297
[51] Sardy S. and Tseng P. (2004). AMlet, RAMlet and GAMlet: automatic nonlinear fitting of additive models, robust and generalized, with wavelets. J. Comput. Graph. Stat. 13: 283–309 · doi:10.1198/1061860043434
[52] Sardy S. and Tseng P. (2004). On the statistical analysis of smoothing by maximizing dirty Markov random field posterior distributions. J. Am. Stat. Assoc. 99: 191–204 · Zbl 1089.62518 · doi:10.1198/016214504000000188
[53] Tseng P. (1991). On the rate of convergence of a partially asynchronous gradient projection algorithm. SIAM J. Optim. 1: 603–619 · Zbl 0754.90055 · doi:10.1137/0801036
[54] Tseng P. (1993). Dual coordinate ascent methods for non-strictly convex minimization. Math. Program. 59: 231–247 · Zbl 0782.90073 · doi:10.1007/BF01581245
[55] Tseng P. (2001). Convergence of block coordinate descent method for nondifferentiable minimization. J. Optim. Theory Appl. 109: 473–492 · Zbl 1006.65062 · doi:10.1023/A:1017501703105
[56] Tseng, P., Yun, S.: A coordinate gradient descent method for nonsmooth separable minimization. Report, Department of Mathematics, University of Washington, Seattle; June 2006; revised February 2007. http://www.math.washington.edu/\(\sim\)tseng/papers.html · Zbl 1166.90016
[57] Vapnik, V., Golowich, S.E., Smola, A.: Support vector method for function approximation, regression estimation, and signal processing. In: Mozer, M.C., Jordan, M.I., and Petsche, T., (eds.) Advances in Neural Information Processing Systems, vol. 9. MIT Press, Cambridge (1997)
[58] Yuan L. and Lin Y. (2006). Model selection and estimation in regression with grouped variables. J. R. Stat. Soc. 68: 49–67 · Zbl 1141.62030 · doi:10.1111/j.1467-9868.2005.00532.x
[59] Zhu C., Byrd R.H. and Nocedal J. (1997). L-BFGS-B: Algorithm 778: L-BFGS-B, FORTRAN routines for large scale bound constrained optimization. ACM Trans. Math. Softw. 23: 550–560 · Zbl 0912.65057 · doi:10.1145/279232.279236
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.