Deep Reinforcement Learning
Axel Perschmann
Supervisor: Ahmed Abdulkadir
Seminar: Current Works in Computer Vision
Research Group: Pattern Recognition and Image Processing
Albert-Ludwigs-Universität Freiburg
7 July 2016
Reinforcement Learning Asynchronous Reinforcement Learning Experiments Conclusion
Axel Perschmann Deep Reinforcement Learning Presentation 2 / 33
Image: de.slideshare.net/ckmarkohchang/language-understanding-for-textbased-games-using-deep-reinforcement-learning
[Equation residue: two sums of the form $\sum_{k=0}^{\infty}$; the summands were not recoverable.]
1 Q(s, a) only feasible for small environments.
2 Unknown environment:
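The first point can be made concrete with a minimal sketch, assuming a tabular Q-learning agent on a tiny chain environment. The environment (`step`), the constants, and the function name `q_learning` are invented for illustration and do not come from the slides. The table holds one number per (state, action) pair, so its size grows with |S| * |A|, and the update uses only sampled transitions rather than a model of the environment.

```python
import random

N_STATES = 5          # hypothetical chain: states 0..4, state 4 is terminal
ACTIONS = (0, 1)      # 0 = left, 1 = right
GAMMA, ALPHA, EPSILON = 0.9, 0.5, 0.1   # assumed hyperparameters


def step(state, action):
    """Toy environment invented for this sketch; the agent only sees samples."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, seed=0):
    rng = random.Random(seed)
    # One table entry per (state, action) pair: |S| * |A| numbers in total.
    q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            if rng.random() < EPSILON:
                action = rng.choice(ACTIONS)          # explore
            else:
                best = max(q[state])                  # exploit, random tie-break
                action = rng.choice([a for a in ACTIONS if q[state][a] == best])
            next_state, reward, done = step(state, action)
            # One-step Q-learning target built from the sampled transition only.
            target = reward if done else reward + GAMMA * max(q[next_state])
            q[state][action] += ALPHA * (target - q[state][action])
            state = next_state
    return q


if __name__ == "__main__":
    table = q_learning()
    print("table entries:", len(table) * len(ACTIONS))
    for s, row in enumerate(table):
        print(s, [round(v, 3) for v in row])
```

Here the table has only 10 entries. The same representation needs one row per distinct state, which is why it stops being practical once states are, for example, raw images.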
Source: Reinforcement Learning Lecture, ue01.pdf
Source: Reinforcement Learning Lecture, ue07.pdf
DQN was trained on a single Nvidia K40 GPU; the proposed methods were trained on 16 CPU cores.
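To illustrate what training on several CPU cores can look like, here is a rough sketch in which several actor-learner threads each interact with their own copy of an environment and write one-step Q-learning updates into a shared table without synchronisation. Everything in it is an assumption made for illustration: the toy chain environment, the tabular representation in place of a neural network, the per-worker exploration rates, and the names `worker` and `train_async`. CPython threads also do not run Python code in parallel, so this shows only the update pattern, not the speedup.

```python
import random
import threading

N_STATES, ACTIONS = 5, (0, 1)   # hypothetical chain; 0 = left, 1 = right
GAMMA, ALPHA = 0.9, 0.5         # assumed hyperparameters


def step(state, action):
    """Toy stand-in for an environment; each worker steps its own copy."""
    next_state = max(0, state - 1) if action == 0 else state + 1
    done = next_state == N_STATES - 1
    return next_state, (1.0 if done else 0.0), done


def worker(shared_q, episodes, epsilon, seed):
    """One actor-learner: acts independently, writes to the shared table."""
    rng = random.Random(seed)
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            row = list(shared_q[state])   # local snapshot; others may write meanwhile
            if rng.random() < epsilon:
                action = rng.choice(ACTIONS)
            else:
                best = max(row)
                action = rng.choice([a for a in ACTIONS if row[a] == best])
            next_state, reward, done = step(state, action)
            target = reward if done else reward + GAMMA * max(shared_q[next_state])
            # Applied immediately, without waiting for or locking out other workers.
            old = shared_q[state][action]
            shared_q[state][action] = old + ALPHA * (target - old)
            state = next_state


def train_async(n_workers=4, episodes=200):
    shared_q = [[0.0 for _ in ACTIONS] for _ in range(N_STATES)]
    threads = [
        # Each worker gets its own seed and exploration rate (an assumption).
        threading.Thread(target=worker, args=(shared_q, episodes, 0.1 * (i + 1), i))
        for i in range(n_workers)
    ]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return shared_q


if __name__ == "__main__":
    for s, row in enumerate(train_async()):
        print(s, [round(v, 3) for v in row])
```

Because the workers explore differently and are at different points in their episodes, the updates arriving at the shared table at any moment come from different parts of the state space. The exact values can vary slightly between runs, since the thread interleaving is not deterministic.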
time required to reach a fixed reference score over seven Atari games
The A3C LSTM variant additionally used 256 LSTM cells after the final hidden layer.
Bellemare, M. G., Naddaf, Y., Veness, J., and Bowling, M. (2013). The arcade learning environment: An evaluation platform for general agents. Journal of Artificial Intelligence Research, 47:253–279.
Chang, M. (2015). Language understanding for text-based games using deep reinforcement learning. de.slideshare.net/ckmarkohchang/language-understanding-for-textbased-games-using-deep-reinforcement-learning.
Mnih, V., Badia, A. P., Mirza, M., Graves, A., Lillicrap, T. P., Harley, T., Silver, D., and Kavukcuoglu, K. (2016). Asynchronous methods for deep reinforcement learning. CoRR, abs/1602.01783.
Mnih, V., Kavukcuoglu, K., Silver, D., Rusu, A. A., Veness, J., Bellemare, M. G., Graves, A., Riedmiller, M., Fidjeland, A. K., Ostrovski, G., Petersen, S., Beattie, C., Sadik, A., Antonoglou, I., King, H., Kumaran, D., Wierstra, D., Legg, S., and Hassabis, D. (2015). Human-level control through deep reinforcement learning. Nature, 518(7540):529–533.
Sutton, R. (2015). Introduction to reinforcement learning. http://slideplayer.com/slide/7966867.
Todorov, E., Erez, T., and Tassa, Y. (2012). Mujoco: A physics engine for model-based control (under review), 2011a. url http://www.cs.washington.edu/homes/todorov/papers/mujoco.pdf.
Wymann, Espié, Guionneau, Dimitrakakis, Coulom, and Sumner (2014). TORCS, The Open Racing Car Simulator. http://www.torcs.org.