zbMATH — the first resource for mathematics

On terminating Markov decision processes with a risk-averse objective function. (English) Zbl 0995.93075
This paper studies terminating, risk-sensitive Markov decision processes with finitely many states plus an additional absorbing, cost-free state. The termination problem thus amounts to finding stochastic shortest paths under a risk-averse objective. By introducing two dynamic programming operators, the author obtains the following results: (i) the existence and a characterization of an optimal policy; (ii) convergence properties of value iteration and policy iteration. The results are illustrated with two computational examples.
Reviewer: M.Nisio (Osaka)
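To make the setting of the review concrete, the following is a minimal sketch of value iteration for a terminating Markov decision process under an exponential (risk-averse) cost criterion. It is an illustration under stated assumptions, not the paper's own operators or examples: the multiplicative update, the function name risk_sensitive_value_iteration, the risk parameter gamma, the restriction to stage costs that depend only on state and action, and the toy data P and g are all hypothetical choices made here.

```python
import math


def risk_sensitive_value_iteration(P, g, gamma, tol=1e-12, max_iter=10_000):
    """Value iteration for an exponential-cost terminating MDP (illustrative).

    P[i][u] maps next states to probabilities; g[i][u] is the stage cost.
    States 0..n-1 are non-terminal; state n is absorbing and cost-free.
    V[i] approximates the minimal E[exp(gamma * total cost)] from state i.
    """
    n = len(P)
    V = [1.0] * (n + 1)  # V[n] = exp(gamma * 0) = 1 at the terminal state

    def q(i, u, values):
        # Multiplicative Bellman term: exp(gamma * cost) * E[value of next state]
        expected = sum(p * values[j] for j, p in P[i][u].items())
        return math.exp(gamma * g[i][u]) * expected

    for _ in range(max_iter):
        new_V = [min(q(i, u, V) for u in range(len(P[i]))) for i in range(n)]
        new_V.append(1.0)  # the terminal value is never updated
        done = max(abs(a - b) for a, b in zip(new_V, V)) < tol
        V = new_V
        if done:
            break

    # Greedy policy with respect to the final value estimate
    policy = [min(range(len(P[i])), key=lambda u: q(i, u, V)) for i in range(n)]
    return V, policy


# Hypothetical toy instance: one non-terminal state (0), terminal state (1).
# Action 0: cost 2, terminates with certainty.
# Action 1: cost 1, terminates with probability 0.5, otherwise stays in state 0.
P = [[{1: 1.0}, {0: 0.5, 1: 0.5}]]
g = [[2.0, 1.0]]
gamma = 0.5

V, policy = risk_sensitive_value_iteration(P, g, gamma)
certainty_equivalent = math.log(V[0]) / gamma
```

In this toy instance both actions have expected total cost 2, so a risk-neutral criterion would be indifferent between them, whereas the exponential criterion with gamma = 0.5 selects the deterministic action 0. The quantity log(V[0]) / gamma converts the exponential cost back to cost units.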

93E20 Optimal stochastic control
90C40 Markov and semi-Markov decision processes
49L20 Dynamic programming in optimal control and differential games
49J55 Existence of optimal solutions to problems involving randomness