Stochastic regularized Newton methods for nonlinear equations. (English) Zbl 1519.90141

Summary: In this paper, we study stochastic regularized Newton methods to find zeros of nonlinear equations whose exact function values are normally expensive to compute, while approximations can be easily accessed via calls to stochastic oracles. To handle the potential singularity of Jacobian approximations, we compute a regularized Newton step at each iteration. We then take a unit step if it is accepted by an inexact line search condition, and a preset step otherwise. We investigate the global convergence properties and the convergence rate of the proposed algorithm with high probability. We also propose a stochastic regularized Newton method incorporating a variance reduction technique and establish the corresponding sample complexities, in terms of the total number of stochastic oracle calls, to find an approximate solution. Finally, we report some numerical results and demonstrate the promising performance of the two proposed algorithms.
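The iteration described in the summary (regularized Newton step, unit step under a line search test, preset fallback step) can be sketched as follows. This is a minimal deterministic illustration, not the authors' stochastic algorithm: the regularization parameter `mu = ||F(x)||`, the sufficient-decrease constant `sigma`, and the preset step `beta` are illustrative choices in the spirit of Levenberg-Marquardt-type regularization.

```python
import numpy as np

def regularized_newton(F, J, x0, sigma=1e-2, beta=0.5, max_iter=100, tol=1e-8):
    """Sketch of a regularized Newton iteration for F(x) = 0.

    Solves (J^T J + mu I) d = -J^T F with mu = ||F(x)||, so the
    linear system stays well-posed even if J is (nearly) singular.
    The unit step is taken if it passes a simple sufficient-decrease
    test; otherwise a preset shorter step beta * d is taken.
    """
    x = np.asarray(x0, dtype=float)
    for _ in range(max_iter):
        f, j = F(x), J(x)
        norm_f = np.linalg.norm(f)
        if norm_f < tol:
            break
        mu = norm_f  # regularization vanishes as the residual does
        d = np.linalg.solve(j.T @ j + mu * np.eye(len(x)), -j.T @ f)
        if np.linalg.norm(F(x + d)) <= (1.0 - sigma) * norm_f:
            x = x + d            # unit step accepted by the decrease test
        else:
            x = x + beta * d     # preset fallback step
    return x
```

In the stochastic setting of the paper, `F` and `J` would be replaced by stochastic-oracle estimates; tying `mu` to the residual norm lets the method recover fast local convergence once the residual is small.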

MSC:

90C15 Stochastic programming
49M37 Numerical methods based on nonlinear programming
65K10 Numerical optimization and variational techniques
90C30 Nonlinear programming

Software:

GQTPAR; RCV1

References:

[1] Aliprantis, C.D., Border, K.C.: Infinite Dimensional Analysis: A Hitchhiker's Guide, 3rd ed. Springer, Berlin (2006) · Zbl 1156.46001
[2] Allen-Zhu, Z., Hazan, E.: Variance reduction for faster non-convex optimization. In: ICML (2016)
[3] Behling, R., Iusem, A.N.: The effect of calmness on the solution set of systems of nonlinear equations. Math. Program. 137(1), 155-165 (2013) · Zbl 1266.65079
[4] Bianchi, P., Ergodic convergence of a stochastic proximal point algorithm, SIAM J. Optim., 26, 4, 2235-2260 (2016) · Zbl 1355.90062 · doi:10.1137/15M1017909
[5] Causality workbench team (2008). A marketing dataset. http://www.causality.inf.ethz.ch/data/CINA.html
[6] Chen, S.; Pang, L-P; Guo, F-F; Xia, Z-Q, Stochastic methods based on Newton method to the stochastic variational inequality problem with constraint conditions, Math. Comput. Model., 55, 3, 779-784 (2012) · Zbl 1255.90085 · doi:10.1016/j.mcm.2011.09.003
[7] Dvurechensky, P.; Gasnikov, A., Stochastic intermediate gradient method for convex problems with stochastic inexact oracle, J. Optim. Theory App., 171, 1, 121-145 (2016) · Zbl 1351.90150 · doi:10.1007/s10957-016-0999-6
[8] Fan, J., Accelerating the modified Levenberg-Marquardt method for nonlinear equations, Math. Comput., 83, 287, 1173-1187 (2013) · Zbl 1296.65077 · doi:10.1090/S0025-5718-2013-02752-4
[9] Guyon, I., Gunn, S., Ben-Hur, A., Dror, G.: Result analysis of the NIPS 2003 feature selection challenge. In: Advances in NIPS (pp. 545-552) (2005)
[10] Iusem, AN; Jofré, A.; Oliveira, RI; Thompson, P., Extragradient method with variance reduction for stochastic variational inequalities, SIAM J. Optim., 27, 2, 686-724 (2017) · Zbl 1365.65179 · doi:10.1137/15M1031953
[11] Iusem, AN; Jofré, A.; Oliveira, RI; Thompson, P., Variance-based extragradient methods with line search for stochastic variational inequalities, SIAM J. Optim., 29, 1, 175-206 (2019) · Zbl 1415.65145 · doi:10.1137/17M1144799
[12] Iusem, AN; Jofré, A.; Thompson, P., Incremental constraint projection methods for monotone stochastic variational inequalities, Math. Oper. Res., 44, 1, 236-263 (2018) · Zbl 1477.65101
[13] Jiang, H.; Xu, H., Stochastic approximation approaches to the stochastic variational inequality problem, IEEE Trans. Automat. Contr., 53, 6, 1462-1475 (2008) · Zbl 1367.90072 · doi:10.1109/TAC.2008.925853
[14] Kloeden, PE; Platen, E., Numerical Solution of Stochastic Differential Equations (1992), Berlin: Springer, Berlin · Zbl 0752.60043 · doi:10.1007/978-3-662-12616-5
[15] Lan, G., First-order and Stochastic Optimization Methods for Machine Learning (2020), Berlin: Springer, Berlin · Zbl 1442.68003 · doi:10.1007/978-3-030-39568-1
[16] Lee, JD; Lin, Q.; Ma, T.; Yang, T., Distributed stochastic variance reduced gradient methods by sampling extra data with replacement, J. Mach. Learn. Res., 18, 122, 1-43 (2017) · Zbl 1435.68380
[17] Lewis, DD; Yang, Y.; Rose, TG; Li, F., RCV1: A new benchmark collection for text categorization research, J. Mach. Learn. Res., 5, 361-397 (2004)
[18] Li, X., Zhao, T., Arora, R., Liu, H., Haupt, J.: Stochastic variance reduced optimization for nonconvex sparse learning. ICML 48 (pp. 917-925) (2016)
[19] Lord, G.; Malham, SJA; Wiese, A., Efficient strong integrators for linear stochastic systems, SIAM J. Numer. Anal., 46, 6, 2892-2919 (2008) · Zbl 1179.60046 · doi:10.1137/060656486
[20] Moré, J.J., Sorensen, D.C.: Computing a trust region step. SIAM J. Sci. Stat. Comput., 4(3), 553-572, doi:10.1137/0904038 (1983) · Zbl 0551.65042
[21] Mukherjee, I., Canini, K., Frongillo, R., Singer, Y.: Parallel boosting with momentum. In: Joint European Conference on Machine Learning and Knowledge Discovery in Databases (pp. 17-32). Springer, Berlin, Heidelberg (2013, September)
[22] Nemirovski, A.; Juditsky, A.; Lan, G.; Shapiro, A., Robust stochastic approximation approach to stochastic programming, SIAM J. Optim., 19, 4, 1574-1609 (2009) · Zbl 1189.90109 · doi:10.1137/070704277
[23] Nguyen, L.M., Liu, J., Scheinberg, K., Takáč, M.: SARAH: A novel method for machine learning problems using stochastic recursive gradient. In: ICML, 2613-2621 (2017, July)
[24] Nocedal, J.; Wright, S., Numerical Optimization (2006), Berlin: Springer, Berlin · Zbl 1104.65059
[25] Di Nunno, G.; Zhang, T., Approximations of stochastic partial differential equations, Ann. Appl. Probab., 26, 3, 1443-1466 (2016) · Zbl 1345.60053 · doi:10.1214/15-AAP1122
[26] Øksendal, BK, Stochastic differential equations: an introduction with applications, J. Am. Stat. Assoc., 82, 399, 948 (1987) · doi:10.2307/2288814
[27] Papini, M., Binaghi, D., Canonaco, G., Pirotta, M., Restelli, M.: Stochastic variance-reduced policy gradient. In: ICML (Vol. 80, pp. 4026-4035) (2018)
[28] Paquette, C.; Scheinberg, K., A stochastic line search method with convergence rate analysis, SIAM J. Optim., 30, 1, 349-376 (2020) · Zbl 1431.90153 · doi:10.1137/18M1216250
[29] Reddi, S.J., Sra, S., Poczos, B., Smola, A.J.: Proximal stochastic methods for nonsmooth nonconvex finite-sum optimization. In: NeurIPS (pp. 1145-1153) (2016)
[30] Ross, SM, Introduction to Stochastic Dynamic Programming (1983), Academic Press · Zbl 0567.90065
[31] Shapiro, A., Dentcheva, D., Ruszczyński, A.: Lectures on stochastic programming: modeling and theory. MOS-SIAM Series on Optimization (2009) · Zbl 1183.90005
[32] Tang, J.; Ma, C., A smoothing Newton method for solving a class of stochastic linear complementarity problems, Nonlinear Anal. Real World Appl., 12, 6, 3585-3601 (2011) · Zbl 1231.65113 · doi:10.1016/j.nonrwa.2011.06.017
[33] Tran-Dinh, Q., Pham N.H., Nguyen, L.: Stochastic Gauss-Newton Algorithms for Nonconvex Compositional Optimization. In: ICML (2020)
[34] Tropp, JA, User-friendly tail bounds for sums of random matrices, Found. Comput. Math., 12, 4, 389-434 (2012) · Zbl 1259.60008 · doi:10.1007/s10208-011-9099-z
[35] Ueda, K.; Yamashita, N., On a global complexity bound of the Levenberg-Marquardt method, J. Optim. Theory Appl., 147, 3, 443-453 (2010) · Zbl 1203.90152 · doi:10.1007/s10957-010-9731-0
[36] Vaswani, S., Mishkin, A., Laradji, I., Schmidt, M., Gidel, G., Lacoste-Julien, S.: Painless stochastic gradient: interpolation, line search, and convergence rates. arXiv preprint arXiv:1905.09997 (2019)
[37] Wang, Z., Zhou, Y., Liang, Y., Lan, G.: Stochastic variance-reduced cubic regularization for nonconvex optimization. Int. Conf. Artif. Intell. Stat., 2731-2740 (2019)
[38] Ypma, TJ, Historical development of the Newton-Raphson method, SIAM Rev., 37, 4, 531-551 (1995) · Zbl 0842.01005 · doi:10.1137/1037125
[39] Zhang, J., Xiao, L., Zhang, S.: Adaptive stochastic variance reduction for subsampled Newton method with cubic regularization. Optimization and Control, arXiv (2018)
[40] Zhang, J.; Xiao, L., A stochastic composite gradient method with incremental variance reduction, Adv. NeurIPS, 32, 9078-9088 (2019)
[41] Zhang, J., Xiao, L.: Stochastic Variance-reduced Prox-linear Algorithms for Nonconvex Composite Optimization. Math. Program., 1-43 (2021)
[42] Zhao, R.; Fan, J., Global complexity bound of the Levenberg-Marquardt method, Optim. Methods Softw., 31, 4, 805-814 (2016) · Zbl 1406.65037 · doi:10.1080/10556788.2016.1179737
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases the data have been complemented or enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible, without claiming completeness or a perfect matching.