
Stochastic dynamic programming with non-linear discounting. (English) Zbl 1478.90139

Summary: In this paper, we study a Markov decision process with a non-linear discount function and a Borel state space. We define a recursive discounted utility, which resembles the non-additive utility functions considered in a number of models in economics. Non-additivity here follows from the non-linearity of the discount function. Our study is complementary to the work of A. Jaśkiewicz et al. [Math. Oper. Res. 38, No. 1, 108–121 (2013; Zbl 1291.90290)], where non-linear discounting is also used in a stochastic setting, but the expectation of utilities aggregated over the space of all histories of the process is applied, leading to a non-stationary dynamic programming model. Our aim is to prove that, in the recursive discounted utility case, the Bellman equation has a solution and an optimal stationary policy exists for the problem over the infinite time horizon. Our approach covers two cases: (a) when the one-stage utility is bounded on both sides by a weight function multiplied by positive and negative constants, respectively, and (b) when the one-stage utility is unbounded from below.
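A schematic of the setting described above, in notation that is assumed rather than taken from the paper (the symbols $u$, $\delta$, $q$, and the concavity-type conditions on $\delta$ are illustrative, following the generalized-discounting literature cited below): with one-stage utility $u(x,a)$, transition law $q(\cdot\mid x,a)$, and a non-linear discount function $\delta:\mathbb{R}\to\mathbb{R}$ with $\delta(0)=0$, the recursive discounted utility replaces the usual multiplication by a constant factor $\beta$ with an application of $\delta$, and the Bellman equation takes the form

\[
V(x) \;=\; \sup_{a\in A(x)} \left\{\, u(x,a) \;+\; \delta\!\left( \int_X V(y)\, q(dy \mid x,a) \right) \right\}, \qquad x \in X .
\]

The classical additive model is recovered when $\delta(t)=\beta t$ for some $\beta\in(0,1)$; for non-linear $\delta$ the aggregated utility is no longer additive across stages, which is the sense of "non-additivity" in the summary.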

MSC:

90C40 Markov and semi-Markov decision processes
60J05 Discrete-time Markov processes on general state spaces
90C39 Dynamic programming
91B32 Resource and cost allocation (including fair division, apportionment, etc.)
91B62 Economic growth models

Citations:

Zbl 1291.90290
Full Text: DOI arXiv

References:

[1] Balbus, Ł., On recursive utilities with non-affine aggregator and conditional certainty equivalent, Econ. Theory, 70, 551-577 (2019) · Zbl 1450.91013 · doi:10.1007/s00199-019-01221-8
[2] Bäuerle, N.; Jaśkiewicz, A., Stochastic optimal growth model with risk-sensitive preferences, J. Econ. Theory, 173, 181-200 (2018) · Zbl 1400.91321 · doi:10.1016/j.jet.2017.11.005
[3] Bäuerle, N.; Rieder, U., Markov Decision Processes with Applications to Finance (2011), Berlin: Springer, Berlin · Zbl 1236.90004 · doi:10.1007/978-3-642-18324-9
[4] Becker, RA; Boyd, JH III, Capital Theory. Equilibrium Analysis and Recursive Utility (1997), New York: Blackwell Publishers, New York
[5] Berge, C., Topological Spaces (1963), New York: MacMillan, New York · Zbl 0114.38602
[6] Bertsekas, DP, Monotone mappings with application in dynamic programming, SIAM J. Control Optim., 15, 438-464 (1977) · Zbl 0357.90051 · doi:10.1137/0315031
[7] Bertsekas, DP; Shreve, SE, Stochastic Optimal Control: The Discrete Time Case (1978), New York: Academic Press, New York · Zbl 0471.93002
[8] Blackwell, D., Discounted dynamic programming, Ann. Math. Stat., 36, 226-235 (1965) · Zbl 0133.42805 · doi:10.1214/aoms/1177700285
[9] Boyd, JH III, Recursive utility and the Ramsey problem, J. Econ. Theory, 50, 326-345 (1990) · Zbl 0716.90011 · doi:10.1016/0022-0531(90)90006-6
[10] Brown, LD; Purves, R., Measurable selections of extrema, Ann. Stat., 1, 902-912 (1973) · Zbl 0265.28003
[11] Carbon Pricing Dashboard. The World Bank. https://carbonpricingdashboard.worldbank.org/map_data
[12] Denardo, EV, Contraction mappings in the theory underlying dynamic programming, SIAM Rev., 9, 165-177 (1967) · Zbl 0154.45101 · doi:10.1137/1009030
[13] Dugundji, J.; Granas, A., Fixed Point Theory (2003), New York: Springer, New York · Zbl 1025.47002
[14] Durán, J., Discounting long run average growth in stochastic dynamic programs, Econ. Theory, 22, 395-413 (2003) · Zbl 1033.90144 · doi:10.1007/s00199-002-0316-5
[15] Feinberg, EA; Shwartz, A., Handbook of Markov Decision Processes: Theory and Methods (2002), Dordrecht: Kluwer Academic Publishers, Dordrecht · Zbl 0979.90001
[16] Hernández-Lerma, O.; Lasserre, JB, Discrete-Time Markov Control Processes: Basic Optimality Criteria (1996), New York: Springer, New York · doi:10.1007/978-1-4612-0729-0
[17] Hernández-Lerma, O.; Lasserre, JB, Further Topics on Discrete-Time Markov Control Processes (1999), New York: Springer, New York · Zbl 0928.93002 · doi:10.1007/978-1-4612-0561-6
[18] Hinderer, K., Foundations of Non-Stationary Dynamic Programming with Discrete Time Parameter Lecture Notes in Operations Research (1970), New York: Springer, New York · Zbl 0202.18401 · doi:10.1007/978-3-642-46229-0
[19] Howard, RA, Dynamic Programming and Markov Processes (1960), Cambridge, MA: The Technology Press of MIT, Cambridge, MA · Zbl 0091.16001
[20] Jaśkiewicz, A.; Nowak, AS, Stochastic games with unbounded payoffs: applications to robust control in economics, Dyn. Games Appl., 1, 253-279 (2011) · Zbl 1263.91008 · doi:10.1007/s13235-011-0013-8
[21] Jaśkiewicz, A.; Matkowski, J.; Nowak, AS, Persistently optimal policies in stochastic dynamic programming with generalized discounting, Math. Oper. Res., 38, 108-121 (2013) · Zbl 1291.90290 · doi:10.1287/moor.1120.0561
[22] Jaśkiewicz, A.; Matkowski, J.; Nowak, AS, On variable discounting in dynamic programming: applications to resource extraction and other economic models, Ann. Oper. Res., 220, 263-278 (2014) · Zbl 1309.90119 · doi:10.1007/s10479-011-0931-2
[23] Jaśkiewicz, A.; Matkowski, J.; Nowak, AS, Generalised discounting in dynamic programming with unbounded returns, Oper. Res. Lett., 42, 231-233 (2014) · Zbl 1408.90305 · doi:10.1016/j.orl.2014.03.004
[24] Kechris, AS, Classical Descriptive Set Theory (1995), New York: Springer, New York · Zbl 0819.04002 · doi:10.1007/978-1-4612-4190-4
[25] Koopmans, TC, Stationary ordinal utility and impatience, Econometrica, 28, 287-309 (1960) · Zbl 0149.38401 · doi:10.2307/1907722
[26] Matkowski, J., Integral solutions of functional equations, Dissertationes Math., 127, 1-68 (1975) · Zbl 0318.39005
[27] Miao, J., Economic Dynamics in Discrete Time (2014), Cambridge: MIT Press, Cambridge · Zbl 1369.91001
[28] Ozaki, H.; Streufert, PA, Dynamic programming for non-additive stochastic objectives, J. Math. Econ., 25, 391-442 (1996) · Zbl 0870.90038 · doi:10.1016/0304-4068(95)00737-7
[29] Samuelson, P., A note on measurement of utility, Rev. Econ. Stud., 4, 155-161 (1937) · doi:10.2307/2967612
[30] Schäl, M., Conditions for optimality in dynamic programming and for the limit of n-stage optimal policies to be optimal, Z. Wahrscheinlichkeitstheorie Verwand. Gebiete, 32, 179-196 (1975) · Zbl 0316.90080 · doi:10.1007/BF00532612
[31] Stokey, NL; Lucas, RE Jr; Prescott, E., Recursive Methods in Economic Dynamics (1989), Cambridge, MA: Harvard University Press, Cambridge, MA · Zbl 0774.90018 · doi:10.2307/j.ctvjnrt76
[32] van der Wal, J., Stochastic Dynamic Programming (1981), Amsterdam: Mathematical Centre Tracts, Amsterdam · Zbl 0462.90055
[33] Weil, P., Nonexpected utility in macroeconomics, Quart. J. Econ., 105, 29-42 (1990) · doi:10.2307/2937817
[34] Wessels, J., Markov programming by successive approximations with respect to weighted supremum norms, J. Math. Anal. Appl., 58, 326-335 (1977) · Zbl 0354.90087 · doi:10.1016/0022-247X(77)90210-4