
Article

Keywords:
possibly transient Markov chains; discounted approach; first return time; uniqueness of solutions to the multiplicative Poisson equation
Summary:
This work concerns a discrete-time Markov chain with time-invariant transition mechanism and denumerable state space, which is endowed with a nonnegative cost function with finite support. The performance of the chain is measured by the (long-run) risk-sensitive average cost and, assuming that the state space is communicating, the existence of a solution to the risk-sensitive Poisson equation is established, a result that holds even for transient chains. Also, a sufficient criterion ensuring that the functional part of a solution is uniquely determined up to an additive constant is provided, and an example is given to show that the uniqueness result may fail when that criterion is not satisfied.
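For intuition only, the risk-sensitive (multiplicative) Poisson equation is commonly written as e^{λ(g + h(x))} = e^{λC(x)} Σ_y p_{xy} e^{λh(y)}, where g is the risk-sensitive average cost and h is the functional part of the solution. The sketch below solves it by power iteration in a much narrower setting than the one treated in the article: a finite, irreducible, aperiodic chain. The function names, the risk-sensitivity parameter `lam`, the three-state matrix `P`, and the cost vector `C` are all illustrative assumptions and are not taken from the paper.

```python
import math

def solve_poisson_finite(P, C, lam, iters=2000):
    """Power iteration on Q[x][y] = exp(lam * C[x]) * P[x][y].

    Assumption (not from the article): the chain is finite, irreducible
    and aperiodic, so Q is primitive and the iteration converges to its
    Perron eigenvalue rho and a positive eigenvector v.
    """
    n = len(P)
    Q = [[math.exp(lam * C[x]) * P[x][y] for y in range(n)] for x in range(n)]
    v = [1.0] * n              # kept normalized so that v[0] == 1, i.e. h(0) == 0
    rho = 1.0
    for _ in range(iters):
        w = [sum(Q[x][y] * v[y] for y in range(n)) for x in range(n)]
        rho = w[0]             # eigenvalue estimate, valid because v[0] == 1
        v = [wi / rho for wi in w]
    g = math.log(rho) / lam                # risk-sensitive average cost
    h = [math.log(vi) / lam for vi in v]   # functional part, with h[0] == 0
    return g, h

def residual(P, C, lam, g, h):
    """Largest absolute violation of the multiplicative Poisson equation."""
    n = len(P)
    return max(
        abs(math.exp(lam * (g + h[x]))
            - math.exp(lam * C[x]) * sum(P[x][y] * math.exp(lam * h[y]) for y in range(n)))
        for x in range(n)
    )

# Hypothetical three-state chain; the cost is supported on state 0 only.
P = [[0.5, 0.5, 0.0],
     [0.25, 0.5, 0.25],
     [0.0, 0.5, 0.5]]
C = [1.0, 0.0, 0.0]
lam = 0.5

g, h = solve_poisson_finite(P, C, lam)
print("g =", g)
print("h =", h)
print("residual =", residual(P, C, lam, g, h))
```

Fixing h(0) = 0 picks one representative from the family h + constant, which is the sense in which uniqueness "up to an additive constant" is usually understood. The finite-state argument does not carry over directly to denumerable or transient chains, which is the setting the summary addresses.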
References:
[1] A. Arapostathis, V. K. Borkar, E. Fernández-Gaucherand, M. K. Ghosh, and S. I. Marcus: Discrete-time controlled Markov processes with average cost criteria: a survey. SIAM J. Control Optim. 31 (1993), 282–334. MR 1205981
[2] A. Brau-Rojas, R. Cavazos-Cadena, and E. Fernández-Gaucherand: Controlled Markov chains with a risk-sensitive criteria: some counterexamples. In: Proc. 37th IEEE Conference on Decision and Control, Tampa 1998, pp. 1853–1858.
[3] R. Cavazos-Cadena and E. Fernández-Gaucherand: Controlled Markov chains with risk-sensitive criteria: average cost, optimality equations and optimal solutions. Math. Methods Oper. Res. 43 (1999), 121–139. MR 1687362
[4] R. Cavazos-Cadena and E. Fernández-Gaucherand: Risk-sensitive control in communicating average Markov decision chains. In: Modelling Uncertainty: An Examination of Stochastic Theory, Methods and Applications (M. Dror, P. L’Ecuyer, and F. Szidarovszky, eds.), Kluwer, Boston 2002, pp. 525–544.
[5] R. Cavazos-Cadena and D. Hernández-Hernández: Solution to the risk-sensitive average cost optimality equation in communicating Markov decision chains with finite state space: An alternative approach. Math. Methods Oper. Res. 56 (2003), 473–479. MR 1953028
[6] R. Cavazos-Cadena: Solution to the risk-sensitive average cost optimality equation in a class of Markov decision processes with finite state space. Math. Methods Oper. Res. 57 (2003), 263–285. MR 1973378
[7] R. Cavazos-Cadena and D. Hernández-Hernández: A characterization of the optimal risk-sensitive average cost in finite controlled Markov chains. Ann. Appl. Probab. 15 (2005), 175–212. MR 2115041
[8] R. Cavazos-Cadena and D. Hernández-Hernández: Necessary and sufficient conditions for a solution to the risk-sensitive Poisson equation on a finite state space. Systems Control Lett. 58 (2009), 254–258. MR 2510639
[9] G. B. Di Masi and L. Stettner: Risk-sensitive control of discrete time Markov processes with infinite horizon. SIAM J. Control Optim. 38 (1999), 61–78. MR 1740607
[10] W. H. Fleming and W. M. McEneaney: Risk-sensitive control on an infinite horizon. SIAM J. Control Optim. 33 (1995), 1881–1915. MR 1358100
[11] D. Hernández-Hernández and S. I. Marcus: Risk-sensitive control of Markov processes in countable state space. Systems Control Lett. 29 (1996), 147–155. MR 1422212
[12] O. Hernández-Lerma: Adaptive Markov Control Processes. Springer, New York 1988. MR 0995463
[13] R. A. Howard and J. E. Matheson: Risk-sensitive Markov decision processes. Management Sci. 18 (1972), 356–369. MR 0292497
[14] D. H. Jacobson: Optimal stochastic linear systems with exponential performance criteria and their relation to stochastic differential games. IEEE Trans. Automat. Control 18 (1973), 124–131. MR 0441523
[15] S. C. Jaquette: Markov decision processes with a new optimality criterion: discrete time. Ann. Statist. 1 (1973), 496–505. MR 0378839
[16] S. C. Jaquette: A utility criterion for Markov decision processes. Management Sci. 23 (1976), 43–49. MR 0439037 | Zbl 0337.90053
[17] A. Jaśkiewicz: Average optimality for risk sensitive control with general state space. Ann. Appl. Probab. 17 (2007), 654–675. MR 2308338
[18] M. Loève: Probability Theory I. Springer, New York 1980. MR 0651017
[19] M. L. Puterman: Markov Decision Processes. Wiley, New York 1994. MR 1270015 | Zbl 1184.90170
[20] E. Seneta: Nonnegative Matrices. Springer, New York 1980.
[21] K. Sladký: Growth rates and average optimality in risk-sensitive Markov decision chains. Kybernetika 44 (2008), 205–226. MR 2428220
[22] K. Sladký and R. Montes-de-Oca: Risk-sensitive average optimality in Markov decision chains. In: Oper. Res. Proc. 2007 (Selected Papers of the Internat. Conference on Operations Research 2007, Saarbruecken, J. Kalcsics and S. Nickel, eds.), Springer-Verlag, Berlin – Heidelberg 2008, pp. 69–74.