Qin, Chunbin; Zhang, Huaguang; Luo, Yanhong; Wang, Binrui
Finite horizon optimal control of non-linear discrete-time switched systems using adaptive dynamic programming with \(\epsilon\)-error bound. (English) Zbl 1291.49021
Int. J. Syst. Sci., Princ. Appl. Syst. Integr. 45, No. 8, 1683-1693 (2014).

Summary: In this paper, we aim to solve the finite-horizon optimal control problem for a class of non-linear discrete-time switched systems using an Adaptive Dynamic Programming (ADP) algorithm. A new \(\epsilon\)-optimal control scheme based on the iterative ADP algorithm is presented, under which the value function converges iteratively, within finite time, to the greatest lower bound of all value function indices up to an error determined by \(\epsilon\). Two neural networks are used as parametric structures to implement the iterative ADP algorithm with \(\epsilon\)-error bound; they approximate the value function and the control policy, respectively, and from them the optimal control policy is obtained. Finally, a simulation example is included to illustrate the applicability of the proposed method.

Cited in 7 Documents

MSC:
49M30 Other numerical methods in calculus of variations (MSC2010)
49L20 Dynamic programming in optimal control and differential games
90C39 Dynamic programming
93C55 Discrete-time control/observation systems
93C10 Nonlinear systems in control theory

Keywords: non-linear switched system; adaptive dynamic programming; optimal control; finite horizon; neural network

\textit{C. Qin} et al., Int. J. Syst. Sci., Princ. Appl. Syst. Integr. 45, No.
8, 1683--1693 (2014; Zbl 1291.49021)
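
The iterative scheme described in the summary can be illustrated with a small tabular sketch. This is not the authors' implementation: the paper uses two neural networks as approximators, whereas the code below uses a lookup table over a finite state grid. The two-mode scalar system in `step`, the quadratic `stage_cost`, the control set, and the stopping rule (stop once successive value functions differ by at most `eps`) are all assumptions chosen for illustration. Starting from a zero value function, the i-th iterate is the optimal cost over i steps, so the iteration count returned at termination can be read as a horizon length in this toy setting.

```python
# Illustrative sketch only: tabular value iteration for a toy switched system.
# The dynamics, cost, grid, and stopping rule are hypothetical assumptions,
# not taken from the paper.

X_MAX = 5
STATES = list(range(-X_MAX, X_MAX + 1))   # finite integer state grid
MODES = (0, 1)                            # indices of the two subsystems
CONTROLS = (-1, 0, 1)                     # admissible control values


def sign(x):
    """Return -1, 0, or 1 according to the sign of x."""
    return (x > 0) - (x < 0)


def step(x, mode, u):
    """Hypothetical switched dynamics: mode 0 integrates, mode 1 contracts."""
    if mode == 0:
        nxt = x + u
    else:
        nxt = sign(x) * (abs(x) // 2) + u   # non-linear (integer halving)
    return max(-X_MAX, min(X_MAX, nxt))     # keep the state on the grid


def stage_cost(x, u):
    """Quadratic stage cost, assumed for illustration."""
    return x * x + u * u


def epsilon_value_iteration(eps, max_iter=1000):
    """Iterate V_{i+1}(x) = min over (mode, u) of cost + V_i(next state).

    Starts from V_0 = 0 and stops when the largest change between
    successive iterates is at most eps.
    """
    value = {x: 0 for x in STATES}
    policy = {}
    for i in range(1, max_iter + 1):
        new_value = {}
        for x in STATES:
            # Minimise jointly over the switching signal and the control.
            best = min(
                (stage_cost(x, u) + value[step(x, m, u)], m, u)
                for m in MODES
                for u in CONTROLS
            )
            new_value[x] = best[0]
            policy[x] = (best[1], best[2])
        gap = max(abs(new_value[x] - value[x]) for x in STATES)
        value = new_value
        if gap <= eps:
            return value, policy, i
    raise RuntimeError("stopping criterion not met within max_iter iterations")


value, policy, n_iter = epsilon_value_iteration(eps=1e-6)
print("iterations:", n_iter)
print("value at x = 2:", value[2], "policy at x = 2:", policy[2])
```

Minimising over the pair (mode, control) in a single step treats the switching signal as part of the decision, which is one common way to pose switched optimal control; whether it matches the paper's formulation in detail cannot be confirmed from the summary alone.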