×

Convergence and convergence rate of stochastic gradient search in the case of multiple and non-isolated extrema. (English) Zbl 1357.62274

Given a continuously differentiable function \(f=f (\theta)\), \(\theta \in \mathbb{R}^n\), the problem is to determine a solution \(\hat{\theta}\) of the equation \(\nabla f (\theta)=0\). Having an estimate of the gradient \(\nabla f (\theta)\), for the iterative computation of a zero of the gradient, a standard stochastic gradient procedure is considered. Under certain assumptions on the additive noise term and the function \(f\), results on the almost sure convergence and convergence rate are provided. The results are illustrated by means of examples from principle component analysis and maximum likelihood estimation.

MSC:

62L20 Stochastic approximation
90C15 Stochastic programming
32B20 Semi-analytic sets, subanalytic sets, and generalizations

Software:

ICALAB
PDFBibTeX XMLCite
Full Text: DOI arXiv

References:

[1] Absil, P.-A.; Kurdyka, K., On the stable equilibrium points of gradient systems, Systems Control Lett., 55, 573-577 (2006) · Zbl 1129.34320
[2] Absil, P.-A.; Mahony, R.; Andrews, B., Convergence of the iterates of descent methods for analytic cost functions, SIAM J. Optim., 16, 531-547 (2005) · Zbl 1092.90036
[3] Benveniste, A.; Metivier, M.; Priouret, P., Adaptive Algorithms and Stochastic Approximations (1990), Springer-Verlag
[4] Bertsekas, D. P., Nonlinear Programming (1999), Athena Scientific · Zbl 1015.90077
[5] Bertsekas, D. P.; Tsitsiklis, J. N., Gradient convergence in gradient methods with errors, SIAM J. Optim., 10, 627-642 (2000) · Zbl 1049.90130
[6] Bierstone, E.; Milman, P. D., Semianalytic and subanalytic sets, Inst. Hautes Études Sci. Publ. Math., 67, 5-42 (1988) · Zbl 0674.32002
[7] Bolte, J.; Daniilidis, A.; Lewis, A., The Lojasiewicz inequality for nonsmooth subanalytic functions with applications to subgradient dynamical systems, SIAM J. Optim., 17, 1205-1223 (2006) · Zbl 1129.26012
[8] Borkar, V. S., Stochastic Approximation: Dynamical Systems Viewpoint (2008), Cambridge University Press · Zbl 1181.62119
[9] Borkar, V. S.; Meyn, S. P., The ODE method for convergence of stochastic approximation and reinforcement learning, SIAM J. Control Optim., 38, 447-469 (2000) · Zbl 0990.62071
[10] Chen, H.-F., Stochastic Approximation and its Application (2002), Kluwer
[11] Cichocki, A.; Amari, S., Adaptive Blind Signal and Image Processing: Learning Algorithms and Applications (2002), Wiley
[12] Delmas, J.-P., Subspace tracking for signal processing, (Adali, T.; Haykin, S., Adaptive Signal Processing: Next Generation Solutions (2010), Wiley)
[13] Delmas, J.-P.; Cardoso, J.-F., Asymptotic distributions associated to Oja’s learning equation for neural networks, IEEE Trans. Neural Netw., 9, 1246-1257 (1998)
[14] Gu, M. G.; Zhu, H.-T., Maximum likelihood estimation for spatial models by Markov chain Monte Carlo stochastic approximation, J. R. Stat. Soc. Ser. B, 63, 339-355 (2001) · Zbl 0979.62060
[15] Hall, P.; Heyde, C. C., Martingale Limit Theory and its Applications (1980), Prentice Hall · Zbl 0462.60045
[16] Khalil, H. K., Nonlinear Systems (2002), Prentice Hall
[17] Krantz, S. G.; Parks, H. R., A Primer of Real Analytic Functions (2002), Birikhäuser · Zbl 1015.26030
[18] Kurdyka, K., On gradients of functions definable in o-minimal structures, Ann. Inst. Fourier (Grenoble), 48, 769-783 (1998) · Zbl 0934.32009
[19] Kushner, H. J.; Yin, G. G., Stochastic Approximation and Recursive Algorithms and Applications (2003), Springer-Verlag · Zbl 1026.62084
[20] Ljung, L.; Söderström, T., Theory and Practice of Recursive Identification (1983), MIT Press · Zbl 0548.93075
[21] Lojasiewicz, S., Sur le problème de la division, Studia Math., 18, 87-136 (1959) · Zbl 0115.10203
[22] Lojasiewicz, S., Sur la géométrie semi- et sous-analytique, Ann. Inst. Fourier (Grenoble), 43, 1575-1595 (1993) · Zbl 0803.32002
[23] Metivier, M.; Priouret, P., Applications of a Kushner-Clark lemma to general classes of stochastic algorithms, IEEE Trans. Inform. Theory, 30, 140-151 (1984) · Zbl 0546.62056
[24] Nevel’son, M. B.; Has’minskii, R. Z., Stochastic Approximation and Recursive Estimation (1973), American Mathematical Society · Zbl 0355.62075
[25] Pflug, G. Ch., Optimization of Stochastic Models: The Interface Between Simulation and Optimization (1996), Kluwer · Zbl 0909.90220
[26] Polyak, B. T., Introduction to Optimization (1987), Optimization Software
[27] Polyak, B. T.; Tsypkin, Y. Z., Criterion algorithms of stochastic optimization, Autom. Remote Control, 45, 766-774 (1984) · Zbl 0553.90075
[28] Spall, J. C., Introduction to Stochastic Search and Optimization (2003), Wiley · Zbl 1088.90002
[29] Tadić, V. B., On the Almost sure rate of convergence of linear stochastic approximation, IEEE Trans. Inform. Theory, 50, 401-409 (2004) · Zbl 1288.60037
[30] Tadić, V. B., Analyticity, convergence and convergence rate of recursive maximum likelihood estimation in hidden Markov models, IEEE Trans. Inform. Theory, 56, 6406-6432 (2010) · Zbl 1366.62166
[32] Yang, B., Projection approximation subspace tracking, IEEE Trans. Signal Process., 43, 95-107 (1995)
[33] Younes, L., Estimation and annealing for Gibbsian fields, Ann. Inst. H. Poincaré Probab. Statist., 24, 269-294 (1988) · Zbl 0651.62091
This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. In some cases that data have been complemented/enhanced by data from zbMATH Open. This attempts to reflect the references listed in the original paper as accurately as possible without claiming completeness or a perfect matching.