Boyd, Stephen, and Lieven Vandenberghe. 2004. Convex Optimization. Cambridge University Press, Cambridge. https://doi.org/10.1017/CBO9780511804441.
François-Lavet, Vincent, Peter Henderson, Riashat Islam, Marc G. Bellemare, and Joelle Pineau. 2018. “An Introduction to Deep Reinforcement Learning.” Foundations and Trends® in Machine Learning 11 (3-4): 219–354. https://doi.org/10.1561/2200000071.
Kochenderfer, Mykel J., and Tim A. Wheeler. 2019. Algorithms for Optimization. The MIT Press.
Mnih, Volodymyr, Koray Kavukcuoglu, David Silver, Alex Graves, Ioannis Antonoglou, Daan Wierstra, and Martin Riedmiller. 2013. “Playing Atari with Deep Reinforcement Learning.” http://arxiv.org/abs/1312.5602.
Nesterov, Yurii. 2004. Introductory Lectures on Convex Optimization. Vol. 87. Applied Optimization. Kluwer Academic Publishers, Boston, MA. https://doi.org/10.1007/978-1-4419-8853-9.
Puterman, Martin L. 1994. Markov Decision Processes: Discrete Stochastic Dynamic Programming. Wiley Series in Probability and Mathematical Statistics: Applied Probability and Statistics. John Wiley & Sons, Inc., New York.
Sundaram, Rangarajan K. 1996. A First Course in Optimization Theory. Cambridge University Press, Cambridge. https://doi.org/10.1017/CBO9780511804526.
Watkins, Christopher J. C. H., and Peter Dayan. 1992. “Q-Learning.” Machine Learning 8 (3): 279–92.
Page built: 2021-03-04 using R version 4.0.3 (2020-10-10)