
Optimal synchronization control of multiple Euler-Lagrange systems via event-triggered reinforcement learning. (English) Zbl 1478.93425

Summary: In this paper, an event-triggered reinforcement learning-based method is developed for model-based optimal synchronization control of multiple Euler-Lagrange systems (MELSs) under a directed graph. The event-triggered optimal control strategy is derived by establishing the Hamilton-Jacobi-Bellman (HJB) equation, and the triggering condition is then proposed. An event-triggered policy iteration (PI) algorithm is then borrowed from reinforcement learning to find the optimal solution. One neural network is used to represent the value function in order to find the analytical solution of the event-triggered HJB equation; its weights are updated aperiodically. It is proved that both the synchronization error and the weight estimation error are uniformly ultimately bounded (UUB). Zeno behavior is also excluded. Finally, an example with multiple 2-DOF prototype manipulators is presented to validate the effectiveness of the proposed method.
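The event-triggering idea mentioned in the summary can be illustrated with a minimal sketch. This is not the paper's algorithm: the scalar linear plant, the fixed feedback gain `k`, the relative threshold `sigma`, and the function name `simulate` are all assumptions chosen only to show how a controller can hold its last sampled state and update it aperiodically, when a triggering condition is violated.

```python
# Illustrative sketch only (assumed scalar plant, gain, and threshold;
# not the triggering condition derived in the paper).
def simulate(a=1.0, b=1.0, k=3.0, sigma=0.2, x0=1.0, dt=1e-3, t_end=5.0):
    """Event-triggered state feedback for dx/dt = a*x + b*u, forward Euler."""
    steps = int(round(t_end / dt))
    x = x0
    x_hat = x0   # last sampled state held by the controller
    events = 0   # number of control updates after the initial sample
    for _ in range(steps):
        # Triggering condition: resample once the gap between the held
        # state and the true state exceeds a fraction of the true state.
        if abs(x_hat - x) > sigma * abs(x):
            x_hat = x
            events += 1
        u = -k * x_hat              # control uses the held sample only
        x += dt * (a * x + b * u)   # forward-Euler plant update
    return x, events, steps


x_final, n_events, n_steps = simulate()
print(f"final state: {x_final:.2e}, control updates: {n_events} of {n_steps} steps")
```

With these assumed values the control is recomputed at only a small fraction of the integration steps while the state still decays toward zero. Consecutive updates are separated by a strictly positive time in this sketch, which is the property that excluding Zeno behavior is meant to guarantee.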

MSC:

93C65 Discrete event control/observation systems
49L20 Dynamic programming in optimal control and differential games
93A16 Multi-agent systems
35F21 Hamilton-Jacobi equations
