[1] D. Cruz-Suárez and R. Montes-de-Oca:
Uniform convergence of the value iteration policies for discounted Markov decision processes. Bol. Soc. Mat. Mexicana 12 (2006), 133–148.
MR 2301750
[2] D. Cruz-Suárez, R. Montes-de-Oca, and F. Salem-Silva:
Conditions for the uniqueness of optimal policies of discounted Markov decision processes. Math. Methods Oper. Res. 60 (2004), 415–436.
MR 2106092
[3] D. Cruz-Suárez, R. Montes-de-Oca, and F. Salem-Silva:
Uniform approximations of discounted Markov decision processes to optimal policies. In: Proc. Prague Stochastics 2006 (M. Hušková and M. Janžura, eds.), MATFYZPRESS, Prague 2006, pp. 278–287.
[4] O. Hernández-Lerma:
Adaptive Markov Control Processes. Springer-Verlag, New York 1989.
MR 0995463
[5] O. Hernández-Lerma and J. B. Lasserre:
Discrete-Time Markov Control Processes: Basic Optimality Criteria. Springer-Verlag, New York 1996.
MR 1363487
[6] O. Hernández-Lerma and J. B. Lasserre:
Further Topics on Discrete-Time Markov Control Processes. Springer-Verlag, New York 1999.
MR 1697198
[7] M. L. Puterman:
Markov Decision Processes. Discrete Stochastic Dynamic Programming. Wiley, New York 1994.
MR 1270015 | Zbl 1184.90170
[8] R. Ritt and L. Sennott:
Optimal stationary policies in general state space Markov decision chains with finite action sets. Math. Oper. Res. 17 (1992), 901–909.
MR 1196400
[9] N. L. Stokey and R. E. Lucas:
Recursive Methods in Economic Dynamics. Harvard University Press, USA 1989.
MR 1105087