1.8 Summary

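This chapter introduced the basic concepts of reinforcement learning, which will be used extensively throughout the rest of the book. We first illustrated these concepts with intuitive grid-world examples and then introduced them formally within the framework of Markov decision processes. For more information about Markov decision processes, readers may consult [1, 2].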

[1] M. Pinsky and S. Karlin, An introduction to stochastic modeling (3rd Edition). Academic Press, 1998.

[2] M. L. Puterman, Markov decision processes: Discrete stochastic dynamic programming. John Wiley & Sons, 2014.
