
Continuous-time Markov decision processes with unbounded transition and discounted-reward rates. (English) Zbl 1191.90091

Summary: We study continuous-time Markov decision processes in Polish spaces. The optimality criterion is the expected discounted reward, which is to be maximized. The transition rates may be unbounded, and the reward rates may have neither upper nor lower bounds. We give conditions on the controlled system’s primitive data under which the transition functions of possibly non-homogeneous continuous-time Markov processes are regular; the proof uses Feller’s construction approach to such transition functions. Then, under continuity and compactness conditions, we prove the existence of optimal stationary policies by using the technique of extended infinitesimal operators associated with these transition functions, and we also provide a recursive way to compute (or at least to approximate) the optimal reward values. The conditions in this paper differ from those used in the previous literature, and they are illustrated with an example.
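
To give a feel for what a recursive computation of discounted optimal values can look like, here is a minimal sketch for a toy model with finitely many states and actions. It is not the paper's construction: the paper works in Polish spaces with unbounded rates, whereas this sketch assumes bounded rates and uses a standard uniformization-style iteration whose fixed point satisfies the discounted optimality equation alpha*V(i) = max_a { r(i,a) + sum_j q(j|i,a) V(j) }. The function name, the data layout, and the numerical values are all hypothetical.

```python
from math import inf

def discounted_value_iteration(q, r, alpha, tol=1e-10, max_iter=100_000):
    """Approximate the optimal discounted values of a finite toy model.

    q[i][a][j] : transition rate from state i to j under action a
                 (each row sums to zero, q[i][a][i] <= 0)  -- assumed layout
    r[i][a]    : reward rate in state i under action a     -- assumed layout
    alpha      : discount rate, must be > 0
    """
    n = len(q)
    # m[i] bounds the exit rate of state i over all actions (kept positive).
    m = [max(max(-q[i][a][i] for a in range(len(q[i]))), 1e-12) for i in range(n)]
    u = [0.0] * n
    for _ in range(max_iter):
        new_u = []
        for i in range(n):
            best = -inf
            for a in range(len(q[i])):
                drift = sum(q[i][a][j] * u[j] for j in range(n))
                # One step of the recursion; at a fixed point this reduces to
                # alpha * u[i] = r[i][a] + sum_j q[i][a][j] * u[j] for the best a.
                best = max(best, (r[i][a] + m[i] * u[i] + drift) / (alpha + m[i]))
            new_u.append(best)
        if max(abs(x - y) for x, y in zip(new_u, u)) < tol:
            return new_u
        u = new_u
    return u

# Hypothetical two-state, two-action example.
q_rates = [
    [[-1.0, 1.0], [-3.0, 3.0]],   # state 0: actions 0 and 1
    [[2.0, -2.0], [0.5, -0.5]],   # state 1: actions 0 and 1
]
reward = [[1.0, 0.5], [3.0, 2.5]]
alpha = 0.1
values = discounted_value_iteration(q_rates, reward, alpha)
```

On a finite model each step is a contraction with factor at most max_i m[i] / (alpha + m[i]) < 1, which is why the loop converges. With unbounded rates no such uniform factor exists, which is one reason the setting of the paper needs different arguments.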

MSC:

90C40 Markov and semi-Markov decision processes
