Finite horizon optimal control of non-linear discrete-time switched systems using adaptive dynamic programming with \(\epsilon\)-error bound. (English) Zbl 1291.49021

Summary: This paper solves the finite-horizon optimal control problem for a class of non-linear discrete-time switched systems using an Adaptive Dynamic Programming (ADP) algorithm. A new \(\epsilon\)-optimal control scheme based on the iterative ADP algorithm is presented; it makes the value function converge iteratively, within finite time, to the greatest lower bound of all value function indices up to an error determined by \(\epsilon\). Two neural networks serve as parametric structures to implement the iterative ADP algorithm with \(\epsilon\)-error bound: one approximates the value function and the other the control policy, from which the optimal control policy is obtained. Finally, a simulation example illustrates the applicability of the proposed method.
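To make the idea of an iteration with an \(\epsilon\)-error bound concrete, the following is a minimal sketch and not the authors' method. It replaces the two neural networks with a lookup table, and it assumes a toy two-mode switched system on integer states, a quadratic stage cost, and a stopping rule based on the largest change between successive value functions. The summary specifies none of these, so `step`, `epsilon_value_iteration`, `X_MAX`, `CONTROLS`, and `MODES` are all hypothetical names and choices.

```python
# Tabular value iteration for a toy switched system (illustrative assumption,
# not the neural-network implementation described in the summary).

X_MAX = 4                                  # assumed bound on the integer state
STATES = range(-X_MAX, X_MAX + 1)
CONTROLS = (-1, 0, 1)                      # assumed finite control set
MODES = (0, 1)                             # assumed two subsystems


def step(x, u, mode):
    """Hypothetical switched dynamics; mode 1 is non-linear (truncation)."""
    nxt = x + u if mode == 0 else int(x / 2) + u
    return max(-X_MAX, min(X_MAX, nxt))    # keep the state on the grid


def epsilon_value_iteration(eps, max_iters=1000):
    """Iterate V_{i+1}(x) = min over (mode, u) of x^2 + u^2 + V_i(next state).

    Starts from V_0 = 0 and stops once successive value functions differ
    by at most eps in every state (an assumed stopping criterion).
    """
    V = {x: 0.0 for x in STATES}
    for i in range(1, max_iters + 1):
        V_new, policy = {}, {}
        for x in STATES:
            # Each candidate is (cost, mode, control); min picks the cheapest.
            cost, mode, u = min(
                (x * x + u * u + V[step(x, u, m)], m, u)
                for m in MODES
                for u in CONTROLS
            )
            V_new[x] = cost
            policy[x] = (mode, u)          # switching signal and control input
        gap = max(abs(V_new[x] - V[x]) for x in STATES)
        V = V_new
        if gap <= eps:
            return V, policy, i
    raise RuntimeError("no convergence within max_iters")


V, policy, n_iter = epsilon_value_iteration(eps=1e-6)
```

In this sketch the iteration index plays the role of the horizon length: starting from a zero value function, `V` after `i` sweeps is the optimal `i`-step cost of the toy problem, and the returned `policy` pairs each state with a mode and a control input.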

MSC:

49M30 Other numerical methods in calculus of variations (MSC2010)
49L20 Dynamic programming in optimal control and differential games
90C39 Dynamic programming
93C55 Discrete-time control/observation systems
93C10 Nonlinear systems in control theory
