Computing rank dependent utility in graphical models for sequential decision problems.

*(English)*Zbl 1225.68256Summary: This paper is devoted to automated sequential decision in AI. More precisely, we focus here on the rank dependent utility (RDU) model. This model is able to encompass rational decision behaviors that the expected utility model cannot accommodate. However, the non-linearity of RDU makes it difficult to compute an RDU-optimal strategy in sequential decision problems. This has considerably slowed the use of RDU in operational contexts. In this paper, we are interested in providing new algorithmic solutions to compute an RDU-optimal strategy in graphical models. Specifically, we present algorithms for solving decision tree models and influence diagram models of sequential decision problems. For decision tree models, we propose a mixed integer programming formulation that is valid for a subclass of RDU models (corresponding to risk seeking behaviors). This formulation reduces to a linear program when mixed strategies are considered. In the general case (i.e., when there is no particular assumption on the parameters of RDU), we propose a branch and bound procedure to compute an RDU-optimal strategy among the pure ones. After highlighting the difficulties induced by the use of RDU in influence diagram models, we show how this latter procedure can be extended to optimize RDU in an influence diagram. Finally, we provide empirical evaluations of all the presented algorithms.

##### MSC:

68T42 | Agent technology and artificial intelligence |

68T20 | Problem solving in the context of artificial intelligence (heuristics, search strategies, etc.) |

##### Keywords:

algorithmic decision theory; rank dependent utility; decision trees; influence diagrams; planning under uncertainty
PDF
BibTeX
XML
Cite

\textit{G. Jeantet} and \textit{O. Spanjaard}, Artif. Intell. 175, No. 7--8, 1366--1389 (2011; Zbl 1225.68256)

Full Text:
DOI

**OpenURL**

##### References:

[1] | Abdellaoui, M., Parameter-free elicitation of utilities and probabilities weighting functions, Management science, 46, 11, 1497-1512, (2000) · Zbl 1232.91114 |

[2] | Allais, M., Le comportement de lʼhomme rationnel devant le risque: critique des postulats de lʼécole américaine, Econometrica, 21, 503-546, (1953) · Zbl 0050.36801 |

[3] | Allais, M., An outline of my main contributions to economic science, The American economic review, 87, 6, 3-12, (1997) |

[4] | Arrow, K.J., The theory of risk aversion, (1965), Ynjo Jahnsonin Saatio Helsinki · Zbl 0215.58602 |

[5] | Bertelé, U.; Brioschi, F., Nonserial dynamic programming, (1972), Academic Press · Zbl 0187.17801 |

[6] | Blythe, J., Decision-theoretic planning, AI magazine, 20, 1-15, (1999) |

[7] | Boutilier, C.; Dean, T.; Hanks, S., Decision-theoretic planning: structural assumptions and computational leverage, Journal of AI research, 11, 1-94, (1999) · Zbl 0918.68110 |

[8] | Camerer, C.; Ho, T., Non-linear weighting of probabilities and violations of the betweenness axiom, Journal of risk and uncertainty, 8, 167-196, (1994) · Zbl 0925.90038 |

[9] | Chateauneuf, A., Comonotonicity axioms and rank-dependent expected utility theory for arbitrary consequences, Journal of mathematical economics, 32, 21-45, (1999) · Zbl 0962.91023 |

[10] | Damiani, E.; De Capitani di Vimercati, S.; Samarati, P.; Viviani, M., A WOWA-based aggregation technique on trust values connected to metadata, Electronic notes in theoretical computer science, 157, 131-142, (2006) |

[11] | T. Dean, L. Kaelbling, J. Kirman, A. Nicholson, Planning with deadlines in stochastic domains, in: Proc. of the 11th AAAI, 1993, pp. 574-579. |

[12] | Dubois, D.; Prade, H.; Sabbadin, R., Decision-theoretic foundations of qualitative possibility theory, European journal of operational research, 128, 3, 459-478, (2001) · Zbl 0982.90028 |

[13] | Gonzales, R.; Wu, G., On the shape of the probability weighting function, Cognitive psychology, 38, 129-166, (1999) |

[14] | Hammond, P., Consequentialism and the independence axiom, (), 503-515 |

[15] | Hammond, P., Consequentialist foundations for expected utility, Theory and decision, 25, 1, 25-78, (1988) · Zbl 0645.90002 |

[16] | Handa, J., Risk, probabilities and a new theory of cardinal utility, Journal of political economics, 85, 97-122, (1977) |

[17] | Howard, R.; Matheson, J., Risk-sensitive Markov decision processes, Management science, 18, 7, 356-369, (1972) · Zbl 0238.90007 |

[18] | Howard, R.; Matheson, J., Influence diagrams, (), Decision analysis, 2, 3, 127-143, (2005), Reprinted: |

[19] | Jaffray, J.-Y.; Nielsen, T., An operational approach to rational decision making based on rank dependent utility, European journal of operational research, 169, 1, 226-246, (2006) |

[20] | G. Jeantet, O. Spanjaard, Rank-dependent probability weighting in sequential decision problems under uncertainty, in: Proc. of the International Conference on Automated Planning and Scheduling (ICAPS), 2008, pp. 148-155. |

[21] | F. Jensen, F.V. Jensen, S.L. Dittmer, From influence diagrams to junction trees, in: Proc. of the Tenth Annual Conference on Uncertainty in Artificial Intelligence (UAI), 1994, pp. 367-373. |

[22] | Kaebling, L.; Littman, M.; Cassandra, A., Planning and acting in partially observable stochastic domains, Artificial intelligence, 101, 99-134, (1999) · Zbl 0908.68165 |

[23] | Kahneman, D.; Tversky, A., Prospect theory: an analysis of decision under risk, Econometrica, 47, 263-291, (1979) · Zbl 0411.90012 |

[24] | Karmarkar, U., Subjectively weighted utility and the allais paradox, Organisational behavior and human performance, 24, 1, 67-72, (1979) |

[25] | S. Koenig, R.G. Simmons, Risk-sensitive planning with probabilistic decision graphs, in: Proc. of the International Conference on Principles of Knowledge Representation and Reasoning (KR), 1994, PP. 363-373. |

[26] | Y. Liu, S. Koenig, Existence and finiteness conditions for risk-sensitive planning: Results and conjectures, in: Proc. of the Twentieth Annual Conference on Uncertainty in Artificial Intelligence (UAI), 2005, pp. 354-363. |

[27] | Y. Liu, S. Koenig, Risk-sensitive planning with one-switch utility functions: Value iteration, in: Proc. of the Twentieth National Conference on Artificial Intelligence, AAAI, 2005, pp. 993-999. |

[28] | Y. Liu, S. Koenig, An exact algorithm for solving MDPs under risk-sensitive planning objectives with one-switch utility functions, in: Proc. of the International Joint Conference on Autonomous Agents and Multiagent Systems (AAMAS), 2008, pp. 453-460. |

[29] | Luce, R.; Raiffa, H., Games and decisions, (1957), John Wiley New York · Zbl 0084.15704 |

[30] | Machina, M.J., Dynamic consistency and non-expected utility models of choice under uncertainty, Journal of economic literature, 27, 4, 1622-1668, (1989) |

[31] | McClennen, E., Rationality and dynamic choice: foundational explorations, (1990), Cambridge University Press |

[32] | Morin, T., Monotonicity and the principle of optimality, Journal of mathematical analysis and applications, 86, 665-674, (1982) · Zbl 0511.49018 |

[33] | Nielsen, T.D.; Jensen, F.V., Learning a decision makerʼs utility function from (possibly) inconsistent behavior, Artificial intelligence journal, 160, 53-78, (2004) · Zbl 1086.91019 |

[34] | W. Ogryczak, T. Sliwinski, On decision support under risk by the WOWA optimization, in: ECSQARU 2007, pp. 779-790. · Zbl 1148.68531 |

[35] | Ogryczak, W., WOWA enhancement of the preference modeling in the reference point method, (), 38-49 · Zbl 1178.68634 |

[36] | Ogryczak, W.; Sliwinski, T., On efficient WOWA optimization for decision support under risk, International journal of approximate reasoning, 50, 915-928, (2009) · Zbl 1191.68696 |

[37] | Perea, F.; Puerto, J., Dynamic programming analysis of the TV game “who wants to be a millionaire?, European journal of operational research, 183, 805-811, (2007) · Zbl 1179.90343 |

[38] | P. Perny, O. Spanjaard, L.-X. Storme, State space search for risk-averse agents, in: 20th International Joint Conference on Artificial Intelligence, 2007, pp. 2353-2358. |

[39] | Pratt, J., Risk aversion in the small and in the large, Econometrica, 32, 1, 122-136, (1964) · Zbl 0132.13906 |

[40] | Quiggin, J., Generalized expected utility theory: the rank-dependent model, (1993), Kluwer |

[41] | Raiffa, H., Decision analysis: introductory lectures on choices under uncertainty, (1968), Addison-Wesley · Zbl 0181.21802 |

[42] | Rotshild, M.; Stiglitz, J., Increasing risk I: A definition, Journal of economic theory, 2, 3, 225-243, (1970) |

[43] | Savage, L., The foundations of statistics, (1954), Wiley · Zbl 0055.12604 |

[44] | Shachter, R., Evaluating influence diagrams, Operations research, 34, 871-882, (1986) |

[45] | V. Torra, Weighted OWA operators for synthesis of information in: Proc. of the Fifth IEEE International Conference on Fuzzy Systems (IEEE-FUZZʼ96), 1996, pp. 966-971. |

[46] | Torra, V., The weighted OWA operator, International journal of intelligent systems, 12, 153-166, (1997) · Zbl 0867.68089 |

[47] | Torra, V., The WOWA operator and the interpolation function \(W^\ast\): Chen and ottoʼs interpolation method revisited, Fuzzy sets and systems, 113, 3, 389-396, (2000) · Zbl 1147.03313 |

[48] | Tversky, A.; Kahneman, D., Advances in prospect theory: cumulative representation of uncertainty, Journal of risk and uncertainty, 5, 297-323, (1992) · Zbl 0775.90106 |

[49] | von Neumann, J.; Morgenstern, O., Theory of games and economic behaviour, (1947), Princeton University Press |

This reference list is based on information provided by the publisher or from digital mathematics libraries. Its items are heuristically matched to zbMATH identifiers and may contain data conversion errors. It attempts to reflect the references listed in the original paper as accurately as possible without claiming the completeness or perfect precision of the matching.