
A primal-dual algorithm with line search for general convex-concave saddle point problems. (English) Zbl 1507.65106

Summary: In this paper, we propose a primal-dual algorithm with a novel momentum term built from the partial gradients of the coupling function; it can be viewed as a generalization of the method proposed by A. Chambolle and T. Pock [Math. Program. 159, No. 1–2 (A), 253–287 (2016; Zbl 1350.49035)] for solving saddle point problems defined by a convex-concave function \(\mathcal{L}(x,y)=f(x)+\Phi(x,y)-h(y)\) with a general coupling term \(\Phi(x,y)\) that is not assumed to be bilinear. Assuming \(\nabla_x\Phi(\cdot,y)\) is Lipschitz for any fixed \(y\) and \(\nabla_y\Phi(\cdot,\cdot)\) is Lipschitz, we show that the iterate sequence converges to a saddle point and, for any \((x,y)\), we derive error bounds in terms of \(\mathcal{L}(\bar{x}_k,y)-\mathcal{L}(x,\bar{y}_k)\) for the ergodic sequence \(\{\bar{x}_k,\bar{y}_k\}\); in particular, we show an \(\mathcal{O}(1/k)\) rate when the problem is merely convex in \(x\). Furthermore, assuming \(\Phi(x,\cdot)\) is linear for each fixed \(x\) and \(f\) is strongly convex, we obtain an ergodic convergence rate of \(\mathcal{O}(1/k^2)\); to our knowledge, no other single-loop method in the related literature achieves the same rate when \(\Phi\) is not bilinear. Finally, we propose a backtracking technique that does not require knowledge of the Lipschitz constants yet ensures the same convergence results. We also consider convex optimization problems with nonlinear functional constraints, and we show that, by using the backtracking scheme, the optimal convergence rate can be achieved even when the dual domain is unbounded. We tested our method against other state-of-the-art first-order algorithms for solving quadratically constrained quadratic programs (QCQPs): in the first set of experiments, we considered QCQPs with synthetic data, and in the second set, we focused on QCQPs with real data originating from a variant of the linear regression problem with fairness constraints arising in machine learning.
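The type of iteration described in the summary can be sketched as follows: a proximal ascent step in \(y\) with a momentum (extrapolation) term built from the partial gradient \(\nabla_y\Phi\), followed by a proximal descent step in \(x\). This is a simplified illustration on a toy problem, not the paper's exact scheme; the function names, the fixed step sizes `tau`/`sigma`, and the constant momentum parameter `theta` are illustrative assumptions, and the paper's backtracking line search is omitted.

```python
def primal_dual_momentum(grad_x, grad_y, prox_f, prox_h, x0, y0,
                         tau=0.1, sigma=0.1, theta=1.0, iters=300):
    """Sketch of a primal-dual iteration for L(x,y) = f(x) + Phi(x,y) - h(y),
    with momentum built from the partial gradient of Phi in y.
    prox_f(v, t) and prox_h(v, t) are the proximal maps of t*f and t*h."""
    x, y = x0, y0
    q_prev = grad_y(x, y)  # previous partial gradient of Phi in y
    for _ in range(iters):
        q = grad_y(x, y)
        # dual ascent step, extrapolating along the change in grad_y(Phi)
        y = prox_h(y + sigma * (q + theta * (q - q_prev)), sigma)
        # primal descent step using the fresh dual iterate
        x = prox_f(x - tau * grad_x(x, y), tau)
        q_prev = q
    return x, y


# Toy instance: f(x) = x^2/2, h(y) = y^2/2, Phi(x,y) = a*x*y (bilinear for
# simplicity here, although the paper's setting is more general).
# The unique saddle point is (0, 0).
a = 0.5
grad_x = lambda x, y: a * y        # partial gradient of Phi in x
grad_y = lambda x, y: a * x        # partial gradient of Phi in y
prox_quad = lambda v, t: v / (1.0 + t)  # prox of t*(.)^2/2

x_star, y_star = primal_dual_momentum(grad_x, grad_y, prox_quad, prox_quad,
                                      x0=1.0, y0=-1.0)
```

With strongly convex \(f\) and \(h\) and small enough steps, the iterates contract toward the saddle point, so `x_star` and `y_star` end up near zero.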

MSC:

65K10 Numerical optimization and variational techniques
49M29 Numerical methods involving duality
65Y20 Complexity and performance of numerical algorithms
90C25 Convex programming

Citations:

Zbl 1350.49035

Software:

Mosek

References:

[1] MOSEK ApS, The MOSEK Optimization Toolbox for MATLAB 8.1.0.83, http://docs.mosek.com/8.1/toolbox/index.html, 2020.
[2] K. J. Arrow, L. Hurwicz, and H. Uzawa, Studies in Linear and Non-linear Programming, with contributions by H. B. Chenery, S. M. Johnson, S. Karlin, T. Marschak, R. M. Solow, Stanford Mathematical Studies in the Social Sciences II, Stanford University Press, 1958. · Zbl 0091.16002
[3] N. S. Aybat and E. Y. Hamedani, A distributed ADMM-like method for resource sharing over time-varying networks, SIAM J. Optim., 29 (2019), pp. 3036-3068, https://doi.org/10.1137/17M1151973. · Zbl 1427.90214
[4] N. S. Aybat, Z. Wang, T. Lin, and S. Ma, Distributed linearized alternating direction method of multipliers for composite convex consensus optimization, IEEE Trans. Automat. Control, 63 (2018), pp. 5-20. · Zbl 1390.90420
[5] C. Bhattacharyya, P. K. Shivaswamy, and A. J. Smola, A second order cone programming formulation for classifying missing data, in Advances in Neural Information Processing Systems 17 (NIPS 2004), MIT Press, 2005, pp. 153-160.
[6] D. Boob, Q. Deng, and G. Lan, Stochastic First-Order Methods for Convex and Nonconvex Functional Constrained Optimization, preprint, https://arxiv.org/abs/1908.02734, 2019.
[7] S. Boyd and L. Vandenberghe, Convex Optimization, Cambridge University Press, 2004. · Zbl 1058.90049
[8] A. Chambolle and T. Pock, A first-order primal-dual algorithm for convex problems with applications to imaging, J. Math. Imaging Vis., 40 (2011), pp. 120-145. · Zbl 1255.68217
[9] A. Chambolle and T. Pock, On the ergodic convergence rates of a first-order primal-dual algorithm, Math. Program., 159 (2016), pp. 253-287. · Zbl 1350.49035
[10] Y. Chen, G. Lan, and Y. Ouyang, Optimal primal-dual methods for a class of saddle point problems, SIAM J. Optim., 24 (2014), pp. 1779-1814, https://doi.org/10.1137/130919362. · Zbl 1329.90090
[11] L. Condat, A primal-dual splitting method for convex optimization involving Lipschitzian, proximable and linear composite terms, J. Optim. Theory Appl., 158 (2013), pp. 460-479. · Zbl 1272.90110
[12] C. Dang and G. Lan, Randomized First-Order Methods for Saddle Point Optimization, preprint, https://arxiv.org/abs/1409.8625, 2014.
[13] S. S. Du and W. Hu, Linear Convergence of the Primal-Dual Gradient Method for Convex-Concave Saddle Point Problems without Strong Convexity, preprint, https://arxiv.org/abs/1802.01504, 2018.
[14] M. Gönen and E. Alpaydın, Multiple kernel learning algorithms, J. Mach. Learn. Res., 12 (2011), pp. 2211-2268. · Zbl 1280.68167
[15] E. Y. Hamedani and N. S. Aybat, A Primal-Dual Algorithm for General Convex-Concave Saddle Point Problems, preprint, https://arxiv.org/abs/1803.01401v1, 2018.
[16] E. Y. Hamedani and N. S. Aybat, A Decentralized Primal-Dual Method for Constrained Minimization of a Strongly Convex Function, preprint, https://arxiv.org/abs/1908.11835, 2019.
[17] N. He, A. Juditsky, and A. Nemirovski, Mirror Prox algorithm for multi-term composite minimization and semi-separable problems, Comput. Optim. Appl., 61 (2015), pp. 275-319. · Zbl 1321.65105
[18] Y. He and R. D. C. Monteiro, An accelerated HPE-type algorithm for a class of composite convex-concave saddle-point problems, SIAM J. Optim., 26 (2016), pp. 29-56, https://doi.org/10.1137/14096757X. · Zbl 1329.90179
[19] J.-B. Hiriart-Urruty and C. Lemaréchal, Fundamentals of Convex Analysis, Springer, 2001. · Zbl 0998.49001
[20] A. Jalilzadeh, E. Yazdandoost Hamedani, N. Aybat, and U. Shanbhag, A Doubly-Randomized Block-Coordinate Primal-Dual Method for Large-Scale Saddle Point Problems, preprint, https://arxiv.org/abs/1907.03886, 2019.
[21] A. Juditsky and A. Nemirovski, First order methods for nonsmooth convex large-scale optimization II: Utilizing problem’s structure, Optim. Mach. Learn., 30 (2011), pp. 149-183.
[22] O. Kolossoski and R. Monteiro, An accelerated non-Euclidean hybrid proximal extragradient-type algorithm for convex-concave saddle-point problems, Optim. Methods Softw., 32 (2017), pp. 1244-1272. · Zbl 1375.90236
[23] J. Komiyama, A. Takeda, J. Honda, and H. Shimao, Nonconvex optimization for regression with fairness constraints, in Proceedings of the 35th International Conference on Machine Learning, Stockholm, Sweden, Proc. Mach. Learn. Res. 80, 2018, pp. 2737-2746.
[24] G. R. Lanckriet, N. Cristianini, P. Bartlett, L. E. Ghaoui, and M. I. Jordan, Learning the kernel matrix with semidefinite programming, J. Mach. Learn. Res., 5 (2004), pp. 27-72. · Zbl 1222.68241
[25] Q. Lin, S. Nadarajah, and N. Soheili, A Level-Set Method for Convex Optimization with a Feasible Solution Path, Technical report, http://www.optimization-online.org/DB_HTML/2017/10/6274.html, 2017. · Zbl 1411.90268
[26] Y. Malitsky, Projected reflected gradient methods for monotone variational inequalities, SIAM J. Optim., 25 (2015), pp. 502-520, https://doi.org/10.1137/14097238X. · Zbl 1314.47099
[27] Y. Malitsky, Proximal extrapolated gradient methods for variational inequalities, Optim. Methods Softw., 33 (2018), pp. 140-164. · Zbl 1483.65107
[28] Y. Malitsky and T. Pock, A first-order primal-dual algorithm with linesearch, SIAM J. Optim., 28 (2018), pp. 411-432, https://doi.org/10.1137/16M1092015. · Zbl 1390.49033
[29] Y. Malitsky and M. K. Tam, A forward-backward splitting method for monotone inclusions without cocoercivity, SIAM J. Optim. 30 (2020), pp. 1451-1472, https://doi.org/10.1137/18M1207260. · Zbl 1445.47041
[30] A. Nemirovski, Prox-method with rate of convergence \(O(1/t)\) for variational inequalities with Lipschitz continuous monotone operators and smooth convex-concave saddle point problems, SIAM J. Optim., 15 (2004), pp. 229-251, https://doi.org/10.1137/S1052623403425629. · Zbl 1106.90059
[31] Y. Nesterov and A. Nemirovskii, Interior-Point Polynomial Algorithms in Convex Programming, Stud. Appl. Numer. Math. 13, SIAM, 1994, https://doi.org/10.1137/1.9781611970791. · Zbl 0824.90112
[32] Y. Ouyang and Y. Xu, Lower complexity bounds of first-order methods for convex-concave bilinear saddle-point problems, Math. Program., 185 (2021), pp. 1-35. · Zbl 1458.90516
[33] P. Palaniappan and F. Bach, Stochastic variance reduction methods for saddle-point problems, in Advances in Neural Information Processing Systems 29 (NIPS 2016), Barcelona, Spain, 2016, pp. 1416-1424.
[34] H. Robbins and D. Siegmund, A convergence theorem for non negative almost supermartingales and some applications, in Optimizing Methods in Statistics (Proc. Sympos., Ohio State Univ., Columbus, Ohio, 1971), Academic Press, 1971, pp. 233-257. · Zbl 0286.60025
[35] R. T. Rockafellar, Convex Analysis, Princeton University Press, 1997. · Zbl 0932.90001
[36] R. Shefi and M. Teboulle, Rate of convergence analysis of decomposition methods based on the proximal method of multipliers for convex minimization, SIAM J. Optim., 24 (2014), pp. 269-297, https://doi.org/10.1137/130910774. · Zbl 1291.90176
[37] P. K. Shivaswamy and T. Jebara, Ellipsoidal kernel machines, in Proceedings of the Eleventh International Conference on Artificial Intelligence and Statistics, Proc. Mach. Learn. Res. 2, 2007, pp. 484-491.
[38] P. Tseng, A modified forward-backward splitting method for maximal monotone mappings, SIAM J. Control Optim., 38 (2000), pp. 431-446, https://doi.org/10.1137/S0363012998338806. · Zbl 0997.90062
[39] P. Tseng, On Accelerated Proximal Gradient Methods for Convex-Concave Optimization, Tech. report, http://www.mit.edu/~dimitrib/PTseng/papers/apgm.pdf, 2008.
[40] B. C. Vũ, A splitting algorithm for dual monotone inclusions involving cocoercive operators, Adv. Comput. Math., 38 (2013), pp. 667-681. · Zbl 1284.47045
[41] J. Wang and L. Xiao, Exploiting Strong Convexity from Data with Primal-Dual First-Order Algorithms, preprint, https://arxiv.org/abs/1703.02624, 2017.
[42] E. P. Xing, M. I. Jordan, A. Y. Ng, and S. J. Russell, Distance metric learning with application to clustering, with side-information, in Advances in Neural Information Processing Systems 15 (NIPS 2002), 2003, pp. 521-528.
[43] Y. Xu, First-Order Methods for Constrained Convex Programming Based on Linearized Augmented Lagrangian Function, preprint, https://arxiv.org/abs/1711.08020, 2017.
[44] E. Yazdandoost Hamedani, A. Jalilzadeh, N. Aybat, and U. Shanbhag, Iteration Complexity of Randomized Primal-Dual Methods for Convex-Concave Saddle Point Problems, preprint, https://arxiv.org/abs/1806.04118, 2018.
[45] H. Yu and M. J. Neely, A Primal-Dual Parallel Method with \(O(1/\epsilon)\) Convergence for Constrained Composite Convex Programs, preprint, https://arxiv.org/abs/1708.00322, 2017.
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.