Robust reinforcement learning control with static and dynamic stability. (English) Zbl 0994.93054

A robust controller for reinforcement learning is designed. A stability analysis of a neural network controller implemented in parallel with the robust controller is performed. The results are demonstrated and analysed for two control tasks.

MSC:

93D21 Adaptive or robust stabilization
92B20 Neural networks for/in biological studies, artificial life and related topics
